<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Projects | MODE Collaboration</title><link>https://mode-demo.github.io/project/</link><atom:link href="https://mode-demo.github.io/project/index.xml" rel="self" type="application/rss+xml"/><description>Projects</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 04 Oct 2023 00:00:00 +0000</lastBuildDate><image><url>https://mode-demo.github.io/media/icon_hu_ebbff252c19052d0.png</url><title>Projects</title><link>https://mode-demo.github.io/project/</link></image><item><title>Data-Driven Decision-Making Algorithms</title><link>https://mode-demo.github.io/project/algorithms/</link><pubDate>Wed, 04 Oct 2023 00:00:00 +0000</pubDate><guid>https://mode-demo.github.io/project/algorithms/</guid><description>
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
A main research direction of the AIR-DREAM Lab is to develop high-performance, robust, generalizable, and real-world deployable data-driven decision-making algorithms. We are specifically interested in offline policy learning methods, such as offline reinforcement learning (RL), offline imitation learning (IL), and offline planning, which enable simulation-free, low-cost solutions to many real-world problems.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px;"&gt;Our current research focus includes:&lt;/h3&gt;
&lt;!-- Card-style layout --&gt;
&lt;div style="display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 24px; margin-top: 24px;"&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #00bcd4;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Sample-efficient / high-generalization offline RL / IL / planning algorithms&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #4caf50;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Foundation models for decision-making&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #ff9800;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Safe offline RL algorithms&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid rgb(255, 204, 0);"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Hybrid RL that combines offline and online policy learning&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #9c27b0;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Offline policy learning under imperfect reward&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid rgb(215, 58, 205);"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Feedback-efficient RLHF&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;</description></item><item><title>Learning-based Methods for Robotics &amp; Autonomous Driving</title><link>https://mode-demo.github.io/project/robotics/</link><pubDate>Tue, 03 Oct 2023 00:00:00 +0000</pubDate><guid>https://mode-demo.github.io/project/robotics/</guid><description>
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
We focus on developing robotic control and autonomous driving policy learning methods that can learn directly from real-world data, bypassing or alleviating the sim-to-real gap while achieving robust and generalizable performance.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px;"&gt;Our current research focus includes:&lt;/h3&gt;
&lt;!-- Card-style layout --&gt;
&lt;div style="display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 24px; margin-top: 24px;"&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #00bcd4;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Offline RL / IL / planning methods for autonomous driving and robotic control&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #4caf50;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Offline policy optimization for safety-critical scenarios&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #ff9800;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Foundation models for robotic control&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #9c27b0;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Sim-to-real adaptation&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div align="center" style="font-family: Helvetica, sans-serif; margin-bottom: 1em; margin-top: 60px;"&gt;
&lt;h1 style="color: #00bcd4; text-transform: uppercase; font-size: 40px; margin: 0;"&gt;Latest Achievement&lt;/h1&gt;
&lt;div class="card"&gt;
&lt;h3 style="color: #121212; font-size: 24px; font-weight: bold; margin: 0.3em 0 1em;"&gt;
&lt;a href="../../publication/zheng-2025-xvla/" style="color:rgb(212, 191, 55);"&gt;X-VLA has won First Place in the AGIBOT World Challenge (Manipulation track) @ IROS 2025!&lt;/a&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;div class="card"&gt;
&lt;h3 style="color: #121212; font-size: 24px; font-weight: bold; margin: 0.3em 0 1em;"&gt;
&lt;a href="../../publication/zheng-2025-diffusion/" style="color:rgb(13, 181, 227);"&gt;Diffusion-Planner: Diffusion-Based Planning for Autonomous Driving with Flexible Guidance&lt;/a&gt;&lt;/h3&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;style&gt;
.card {
background: white;
border-radius: 12px;
padding: 5px;
box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05);
transition: transform 0.3s ease;
border: none;
}
/* Hover effect */
.card:hover {
transform: scale(1.05); /* scale up */
box-shadow: 0 10px 25px rgba(0, 0, 0, 0.15); /* stronger shadow */
}
&lt;/style&gt;</description></item><item><title>Data-Driven Methods for Sustainable Industrial and AIoT Systems</title><link>https://mode-demo.github.io/project/aiot/</link><pubDate>Mon, 02 Oct 2023 00:00:00 +0000</pubDate><guid>https://mode-demo.github.io/project/aiot/</guid><description>
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;!-- &lt;p style="font-size: 18px;"&gt;
Conventional industrial systems and emerging systems such as data centers, 5G communication networks consume enormous amount of energy and non-renewable resources.
We focus on developing advanced data-driven AI methods to optimize real-world complex industrial and AIoT systems.
Helping the related industries to improve operation efficiency, save energy, reduce emission, and ultimately achieving the goal of green and sustainable development.
&lt;/p&gt; --&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
Conventional industrial systems and emerging systems such as data centers and 5G communication networks consume enormous amounts of energy and non-renewable resources.
We focus on developing advanced data-driven AI methods to optimize real-world complex industrial and AIoT systems,
helping the related industries improve operational efficiency, save energy, reduce emissions, and ultimately achieve green and sustainable development.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px;"&gt;Our current research focus includes:&lt;/h3&gt;
&lt;!-- Card-style layout --&gt;
&lt;div style="display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 24px; margin-top: 24px;"&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #00bcd4;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Simulator-Free Optimization&lt;/h4&gt;
&lt;p style="margin: 0; font-size: 16px; color: #555;"&gt;Data-driven control optimization for complex industrial systems&lt;/p&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #4caf50;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Data Center Efficiency&lt;/h4&gt;
&lt;p style="margin: 0; font-size: 16px; color: #555;"&gt;Energy saving optimization for data centers&lt;/p&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #ff9800;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;5G Beamforming&lt;/h4&gt;
&lt;p style="margin: 0; font-size: 16px; color: #555;"&gt;Massive MIMO Beamforming optimization for 5G&lt;/p&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #9c27b0;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Hybrid RL&lt;/h4&gt;
&lt;p style="margin: 0; font-size: 16px; color: #555;"&gt;Engineering policy integrated hybrid reinforcement learning&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;div align="center" style="font-family: Helvetica, sans-serif; margin-bottom: 1em; margin-top: 60px;"&gt;
&lt;h2 style="color: #00bcd4; text-transform: uppercase; font-size: 40px; margin: 0;"&gt;Latest Achievement&lt;/h2&gt;
&lt;h1 style="color: #222; font-size: 28px; font-weight: bold; margin: 0.3em 0 1em;"&gt;Data Center Cooling System Optimization&lt;/h1&gt;
&lt;/div&gt;
&lt;!-- &lt;div align="center"&gt;
&lt;iframe
src="https://player.bilibili.com/player.html?bvid=BV1ADMcz2EYf&amp;autoplay=1&amp;loop=1"
allowfullscreen
style="width: 100%; max-width: 960px; aspect-ratio: 16/9; border: 0; border-radius: 20px; box-shadow: 0 4px 20px rgba(0,0,0,0.1);"&gt;
&lt;/iframe&gt;
&lt;/div&gt; --&gt;
&lt;div align="center" style="
position: relative;
overflow: hidden;
border-radius: 20px;
box-shadow: 0 4px 20px rgba(0,0,0,0.1);
display: inline-block;
background: #fff; /* keep the background color consistent */
line-height: 0; /* remove line-height gaps */
font-size: 0; /* remove font-size gaps */
width: 100%;
max-width: 960px;
"&gt;
&lt;iframe
src="https://player.bilibili.com/player.html?bvid=BV1ADMcz2EYf&amp;autoplay=1&amp;loop=1"
allowfullscreen
style="
display: block;
width: 100%;
height: auto;
aspect-ratio: 16/9;
border: 0;
border-radius: 20px;
background: #fff;
transform: translateZ(0);
vertical-align: bottom; /* remove the bottom gap */
"&gt;
&lt;/iframe&gt;
&lt;!-- Edge overlay to keep the border clean --&gt;
&lt;div style="
position: absolute;
top: 0;
left: 0;
right: 0;
height: 1px;
background: #fff;
z-index: 10;
pointer-events: none;
"&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;</description></item><item><title>Tools &amp; Libraries</title><link>https://mode-demo.github.io/project/libs/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://mode-demo.github.io/project/libs/</guid><description>&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
We provide open-source code implementations for most of our research; please check our papers for the related code. In addition, we aim to develop easy-to-use and comprehensive algorithm libraries and tools to accelerate the real-world deployment of advanced data-driven decision-making methods.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px; text-align: center;"&gt;Data-Drivien Decision-Making Libraries / Tools&lt;/h3&gt;
&lt;p&gt;
&lt;figure &gt;
&lt;div class="d-flex justify-content-center"&gt;
&lt;div class="w-100" &gt;&lt;img alt="screen reader text" srcset="
/project/libs/d2c-logo_hu_5d40481b3d148996.webp 400w,
/project/libs/d2c-logo_hu_55cf71f6467108d1.webp 760w,
/project/libs/d2c-logo_hu_169fc8daa277fe2a.webp 1200w"
src="https://mode-demo.github.io/project/libs/d2c-logo_hu_5d40481b3d148996.webp"
width="339"
height="123"
loading="lazy" data-zoomable /&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
&lt;a href="https://github.com/AIR-DI/D2C"&gt;Data-Driven Control Lib (D2C)&lt;/a&gt; is a library for data-driven decision-making &amp; control based on state-of-the-art offline reinforcement learning (RL), offline imitation learning (IL), and offline planning algorithms. It is a platform for solving various decision-making &amp; control problems in real-world scenarios. D2C is designed to offer fast and convenient algorithm performance development and testing, as well as providing easy-to-use toolchains to accelerate the real-world deployment of SOTA data-driven decision-making methods.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;The current supported offline RL/IL algorithms include (more to come):&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2106.06860.pdf" target="_blank" rel="noopener"&gt;Twin Delayed DDPG with Behavior Cloning (TD3+BC)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2205.11027.pdf" target="_blank" rel="noopener"&gt;Distance-Sensitive Offline Reinforcement Learning (DOGE)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2206.13464.pdf" target="_blank" rel="noopener"&gt;Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning (H2O)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2303.15810" target="_blank" rel="noopener"&gt;Sparse Q-learning (SQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2210.08323" target="_blank" rel="noopener"&gt;Policy-guided Offline RL (POR)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2110.06169.pdf" target="_blank" rel="noopener"&gt;Offline Reinforcement Learning with Implicit Q-Learning (IQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2207.00244" target="_blank" rel="noopener"&gt;Discriminator-Guided Model-Based Offline Imitation Learning (DMIL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.cse.unsw.edu.au/~claude/papers/MI15.pdf" target="_blank" rel="noopener"&gt;Behavior Cloning (BC)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
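To make the list concrete: the simplest entry above, behavior cloning, reduces offline policy learning to supervised regression from states to the actions recorded in the dataset. Below is a minimal self-contained NumPy sketch with a synthetic dataset and a linear policy; it is an illustration only, not D2C's actual API, and the expert weights are made-up numbers.

```python
import numpy as np

# Synthetic offline dataset: states and the actions an expert took in them.
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 4))              # 500 transitions, 4-dim states
W_expert = np.array([[1.0, -0.5, 0.2, 0.0],
                     [0.3, 0.8, -1.0, 0.5]]).T  # hidden expert policy, shape (4, 2)
actions = states @ W_expert                     # 2-dim expert actions

# Behavior cloning with a linear policy: minimize ||states @ W - actions||^2.
# np.linalg.lstsq returns the least-squares solution in closed form.
W_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The cloned policy should imitate the expert on held-out states.
test_states = rng.normal(size=(100, 4))
mse = np.mean((test_states @ W_bc - test_states @ W_expert) ** 2)
print(f"held-out imitation MSE: {mse:.2e}")
```

With a noise-free linear expert, the least-squares fit recovers the expert policy essentially exactly; real offline datasets are noisy and off-support, which is precisely what the offline RL algorithms listed above are designed to handle.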
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;Features:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;D2C includes a large collection of offline RL and IL algorithms: model-free and model-based offline RL/IL algorithms, as well as planning methods.&lt;/li&gt;
&lt;li&gt;D2C is highly modular and extensible. You can easily build custom algorithms and conduct experiments with it.&lt;/li&gt;
&lt;li&gt;D2C automates the development process in real-world control applications. It simplifies the steps of problem definition/mathematical formulation, policy training, policy evaluation and model deployment.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;Library Information:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;The library is available at &lt;a href="https://github.com/AIR-DI/D2C" target="_blank" rel="noopener"&gt;https://github.com/AIR-DI/D2C&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The tutorials and API documentation are hosted on &lt;a href="https://air-d2c.readthedocs.io/" target="_blank" rel="noopener"&gt;air-d2c.readthedocs.io&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px; text-align: center;"&gt;Online RL Library&lt;/h3&gt;
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
&lt;a href="https://github.com/imoneoi/onerl"&gt;OneRL&lt;/a&gt;: Event-driven fully distributed reinforcement learning framework proposed in &lt;a href="https://arxiv.org/abs/2110.11573"&gt;"A Versatile and Efficient Reinforcement Learning Approach for Autonomous Driving"&lt;/a&gt; that can facilitate highly efficient policy learning in RL-based tasks.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;Features:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Super-fast RL training (15-30 min for MuJoCo &amp;amp; Atari on a single machine)&lt;/li&gt;
&lt;li&gt;State-of-the-art performance&lt;/li&gt;
&lt;li&gt;Scheduled and pipelined sample collection&lt;/li&gt;
&lt;li&gt;Completely lock-free execution&lt;/li&gt;
&lt;li&gt;Fully distributed architecture&lt;/li&gt;
&lt;li&gt;Full profiling &amp;amp; overhead identification tools&lt;/li&gt;
&lt;li&gt;Online visualization &amp;amp; rendering&lt;/li&gt;
&lt;li&gt;Multi-GPU parallel training&lt;/li&gt;
&lt;li&gt;Export of trained policies to ONNX for faster inference &amp;amp; deployment&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;/div&gt;</description></item></channel></rss>