<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Algorithms | MODE Collaboration</title><link>https://mode-demo.github.io/tags/algorithms/</link><atom:link href="https://mode-demo.github.io/tags/algorithms/index.xml" rel="self" type="application/rss+xml"/><description>Algorithms</description><generator>Hugo Blox Builder (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Wed, 04 Oct 2023 00:00:00 +0000</lastBuildDate><image><url>https://mode-demo.github.io/media/icon_hu_ebbff252c19052d0.png</url><title>Algorithms</title><link>https://mode-demo.github.io/tags/algorithms/</link></image><item><title>Data-Driven Decision-Making Algorithms</title><link>https://mode-demo.github.io/project/algorithms/</link><pubDate>Wed, 04 Oct 2023 00:00:00 +0000</pubDate><guid>https://mode-demo.github.io/project/algorithms/</guid><description>
&lt;div style="font-family: Helvetica, sans-serif; max-width: 960px; margin: 0 auto; padding: 20px; line-height: 1.6; color: #333;"&gt;
&lt;div style="
padding: 2px;
border-radius: 12px;
background: linear-gradient(135deg, #e0f2fe, #ecfdf5);
box-shadow: 0 4px 12px rgba(0,0,0,0.05);
"&gt;
&lt;div style="
background: white;
border-radius: 10px;
padding: 20px;
"&gt;
&lt;p style="
font-size: 18px;
line-height: 1.7;
color: #1e293b;
margin: 0;
"&gt;
A main research direction of the AIR-DREAM Lab is developing high-performance, robust, generalizable, and real-world-deployable data-driven decision-making algorithms. We are particularly interested in offline policy learning methods, such as offline reinforcement learning (RL), offline imitation learning (IL), and offline planning, which enable simulation-free, low-cost solutions to many real-world problems.
&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;h3 style="margin-top: 24px; color: #00bcd4; font-size: 24px;"&gt;Our current research focus includes:&lt;/h3&gt;
&lt;!-- Card-style layout --&gt;
&lt;div style="display: grid; grid-template-columns: repeat(auto-fill, minmax(280px, 1fr)); gap: 24px; margin-top: 24px;"&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #00bcd4;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Sample-efficient / high-generalization offline RL / IL / planning algorithms&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #4caf50;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Foundation models for decision-making&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #ff9800;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Safe offline RL algorithms&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid rgb(255, 204, 0);"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Hybrid RL that combines offline and online policy learning&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid #9c27b0;"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Offline policy learning under imperfect reward&lt;/h4&gt;
&lt;/div&gt;
&lt;div style="background: white; border-radius: 12px; padding: 24px; box-shadow: 0 5px 15px rgba(0, 0, 0, 0.05); transition: transform 0.3s ease; border-left: 4px solid rgb(215, 58, 205);"&gt;
&lt;h4 style="margin-top: 0; margin-bottom: 12px; color: #222; font-size: 18px;"&gt;Feedback-efficient RLHF&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;</description></item></channel></rss>