<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Libs | MODE Collaboration</title><link>https://mode-demo.github.io/tags/libs/</link><atom:link href="https://mode-demo.github.io/tags/libs/index.xml" rel="self" type="application/rss+xml"/><description>Libs</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Sun, 01 Oct 2023 00:00:00 +0000</lastBuildDate><image><url>https://mode-demo.github.io/media/icon_hu_ebbff252c19052d0.png</url><title>Libs</title><link>https://mode-demo.github.io/tags/libs/</link></image><item><title>Tools &amp; Libraries</title><link>https://mode-demo.github.io/project/libs/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://mode-demo.github.io/project/libs/</guid><description>&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
We provide open-source implementations for most of our research; please check our papers for the accompanying code. In addition, we aim to develop easy-to-use, comprehensive algorithm libraries and tools to accelerate the real-world deployment of advanced data-driven decision-making methods.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px; text-align: center;"&gt;Data-Drivien Decision-Making Libraries / Tools&lt;/h3&gt;
&lt;p&gt;
&lt;figure &gt;
&lt;div class="d-flex justify-content-center"&gt;
&lt;div class="w-100" &gt;&lt;img alt="screen reader text" srcset="
/project/libs/d2c-logo_hu_5d40481b3d148996.webp 400w,
/project/libs/d2c-logo_hu_55cf71f6467108d1.webp 760w,
/project/libs/d2c-logo_hu_169fc8daa277fe2a.webp 1200w"
src="https://mode-demo.github.io/project/libs/d2c-logo_hu_5d40481b3d148996.webp"
width="339"
height="123"
loading="lazy" data-zoomable /&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
&lt;a href="https://github.com/AIR-DI/D2C"&gt;Data-Driven Control Lib (D2C)&lt;/a&gt; is a library for data-driven decision-making &amp; control based on state-of-the-art offline reinforcement learning (RL), offline imitation learning (IL), and offline planning algorithms. It is a platform for solving various decision-making &amp; control problems in real-world scenarios. D2C is designed to offer fast and convenient algorithm performance development and testing, as well as providing easy-to-use toolchains to accelerate the real-world deployment of SOTA data-driven decision-making methods.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;The current supported offline RL/IL algorithms include (more to come):&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2106.06860.pdf" target="_blank" rel="noopener"&gt;Twin Delayed DDPG with Behavior Cloning (TD3+BC)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2205.11027.pdf" target="_blank" rel="noopener"&gt;Distance-Sensitive Offline Reinforcement Learning (DOGE)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2206.13464.pdf" target="_blank" rel="noopener"&gt;Dynamics-Aware Hybrid Offline-and-Online Reinforcement Learning (H2O)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2303.15810" target="_blank" rel="noopener"&gt;Sparse Q-learning (SQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2210.08323" target="_blank" rel="noopener"&gt;Policy-guided Offline RL (POR)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/2110.06169.pdf" target="_blank" rel="noopener"&gt;Offline Reinforcement Learning with Implicit Q-Learning (IQL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/2207.00244" target="_blank" rel="noopener"&gt;Discriminator-Guided Model-Based Offline Imitation Learning (DMIL)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.cse.unsw.edu.au/~claude/papers/MI15.pdf" target="_blank" rel="noopener"&gt;Behavior Cloning (BC)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
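&lt;p&gt;As a concrete example of what these methods add on top of standard RL, the sketch below shows the TD3+BC actor update from the first paper in the list: the actor maximizes the critic's Q-value while a behavior-cloning term keeps the policy close to the dataset actions. This is a minimal re-implementation of the paper's objective for illustration, not code taken from D2C.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
import torch.nn.functional as F

def td3_bc_actor_loss(actor, critic, state, action, alpha=2.5):
    """TD3+BC actor objective (Fujimoto and Gu, 2021): maximize
    Q(s, pi(s)) plus a behavior-cloning (BC) regularizer that keeps
    pi(s) near the logged dataset action."""
    pi = actor(state)                      # policy actions for the batch
    q = critic(state, pi)                  # critic's value of those actions
    lam = alpha / q.abs().mean().detach()  # normalizer: alpha trades off RL vs. BC
    return -lam * q.mean() + F.mse_loss(pi, action)
&lt;/code&gt;&lt;/pre&gt;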
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;Features:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;D2C includes a large collection of offline RL and IL algorithms, both model-free and model-based, as well as planning methods.&lt;/li&gt;
&lt;li&gt;D2C is highly modular and extensible. You can easily build custom algorithms and conduct experiments with it.&lt;/li&gt;
&lt;li&gt;D2C automates the development process in real-world control applications, simplifying problem definition/mathematical formulation, policy training, policy evaluation, and model deployment (a generic training-loop sketch follows this list).&lt;/li&gt;
&lt;/ul&gt;
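&lt;p&gt;As a rough illustration of this workflow, the sketch below shows a generic offline training loop: the agent learns purely from a fixed logged dataset, with no environment interaction during training. All names here are illustrative assumptions for exposition, not D2C's actual API; see the documentation linked below for the real interface.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Generic offline-RL training loop (illustrative only; NOT D2C's API).
import numpy as np

def train_offline(dataset, agent, batch_size=256, steps=100_000):
    """Sample transition batches from a fixed dataset of logged
    experience and update the agent; no environment is queried."""
    n = len(dataset["obs"])
    for _ in range(steps):
        idx = np.random.randint(0, n, size=batch_size)
        batch = {k: v[idx] for k, v in dataset.items()}
        agent.update(batch)  # e.g. a TD3+BC-style update as sketched above
    return agent
&lt;/code&gt;&lt;/pre&gt;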
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;Library Information:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;The library is available at &lt;a href="https://github.com/AIR-DI/D2C" target="_blank" rel="noopener"&gt;https://github.com/AIR-DI/D2C&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;The tutorials and API documentation are hosted on &lt;a href="https://air-d2c.readthedocs.io/" target="_blank" rel="noopener"&gt;air-d2c.readthedocs.io&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px; text-align: center;"&gt;Online RL Library&lt;/h3&gt;
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
&lt;a href="https://github.com/imoneoi/onerl"&gt;OneRL&lt;/a&gt;: Event-driven fully distributed reinforcement learning framework proposed in &lt;a href="https://arxiv.org/abs/2110.11573"&gt;"A Versatile and Efficient Reinforcement Learning Approach for Autonomous Driving"&lt;/a&gt; that can facilitate highly efficient policy learning in RL-based tasks.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
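&lt;p&gt;The sketch below illustrates the general event-driven pattern behind such a framework: environment workers emit each transition as soon as it exists, and the learner consumes them at its own pace, so neither side blocks on the other. It is a simplified illustration only (it uses an ordinary locking queue, unlike OneRL's lock-free design) and is not OneRL's actual API.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Event-driven actor/learner decoupling (illustrative only; NOT OneRL's API).
import multiprocessing as mp
import random

def env_worker(queue, n_steps=1000):
    """Emit each transition the moment it is produced, instead of
    handing over full synchronous batches."""
    for t in range(n_steps):
        transition = (t, random.random())  # stand-in for (obs, action, reward, ...)
        queue.put(transition)

def learner(queue, n_updates=100):
    """Consume transitions at the learner's own pace; workers never wait."""
    for _ in range(n_updates):
        transition = queue.get()           # blocks only when the queue is empty
        # ... gradient update on the transition would go here ...

if __name__ == "__main__":
    q = mp.Queue(maxsize=1024)
    workers = [mp.Process(target=env_worker, args=(q,)) for _ in range(4)]
    for w in workers:
        w.start()
    learner(q)
    for w in workers:
        w.terminate()
&lt;/code&gt;&lt;/pre&gt;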
&lt;h3 style="margin-top: 24px; color:rgb(94, 120, 225); font-size: 20px;"&gt;Features:&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;Super fast RL training! (15~30 min for MuJoCo &amp;amp; Atari on a single machine)&lt;/li&gt;
&lt;li&gt;State-of-the-art performance&lt;/li&gt;
&lt;li&gt;Scheduled and pipelined sample collection&lt;/li&gt;
&lt;li&gt;Completely lock-free execution&lt;/li&gt;
&lt;li&gt;Fully distributed architecture&lt;/li&gt;
&lt;li&gt;Full profiling &amp;amp; overhead identification tools&lt;/li&gt;
&lt;li&gt;Online visualization &amp;amp; rendering&lt;/li&gt;
&lt;li&gt;Supports multi-GPU parallel training&lt;/li&gt;
&lt;li&gt;Supports exporting trained policies to ONNX for faster inference &amp;amp; deployment (see the sketch after this list)&lt;/li&gt;
&lt;/ul&gt;
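&lt;p&gt;For the last feature above, exporting a trained PyTorch policy to ONNX looks roughly like the following sketch. It uses the standard torch.onnx.export call rather than any OneRL-specific utility, and the network and observation shape are placeholder assumptions.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import torch
import torch.nn as nn

# Placeholder policy network; substitute the actual trained actor.
policy = nn.Sequential(nn.Linear(17, 256), nn.ReLU(), nn.Linear(256, 6))
policy.eval()

dummy_obs = torch.randn(1, 17)  # example observation batch of shape (1, 17)
torch.onnx.export(
    policy, dummy_obs, "policy.onnx",
    input_names=["obs"], output_names=["action"],
    dynamic_axes={"obs": {0: "batch"}},  # allow variable batch size at inference
)
&lt;/code&gt;&lt;/pre&gt;</description></item></channel></rss>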