晓雷 2025-08-01 04:31:58 +08:00
parent 62afd16c2c
commit 642ee93605
2 changed files with 48 additions and 27 deletions

@@ -20,11 +20,12 @@ like how you enjoy real-world milk tea (cheers).
**AReaL Highlights**
- ⚡ <span style="color: red; font-weight: bold;">**\[NEW\] AReaLite:**</span> Our new release AReaLite is
a **light-weight** and **AI-centric** codebase that prioritizes better development experiences for AI
researchers. As a result, AReaLite delivers most AReaL functionalities and
maintains its high performance with much fewer lines of code, supporting
users to build their own **agentic** and **RLVR** training workflows with minimal effort.
- ⚡ <span style="color: red; font-weight: bold;">**\[NEW\] AReaLite:**</span> Our new
release AReaLite is a **light-weight** and **AI-centric** codebase that prioritizes
better development experiences for AI researchers. As a result, AReaLite delivers most
AReaL functionalities and maintains its high performance with much fewer lines of
code, supporting users to build their own **agentic** and **RLVR** training workflows
with minimal effort.
- 🔥 **Asynchronous RL**: With algorithm-system co-design, AReaL supports fully
asynchronous RL for **the fastest training**! Experimental support for multi-turn
agentic RL is also provided.
@@ -37,12 +38,12 @@ like how you enjoy real-world milk tea (cheers).
## News
**\[2025/07/31\] (AReaLite)** We introduce AReaLite, a **light-weight** version of
AReaL designed specifically for AI researchers and rapid prototyping. AReaLite features
an **AI-centric** API design that prioritizes ease of use and algorithm development,
while inherently supporting fully asynchronous **agentic RL**. With 80% fewer lines of
code, AReaLite maintains 90% of AReaL's high performance and core functionality. Check
out [our AReaLite design doc](/arealite/README.md) and
**\[2025/07/31\] (AReaLite)** We introduce AReaLite, a **light-weight** version of AReaL
designed specifically for AI researchers and rapid prototyping. AReaLite features an
**AI-centric** API design that prioritizes ease of use and algorithm development, while
inherently supporting fully asynchronous **agentic RL**. With 80% fewer lines of code,
AReaLite maintains 90% of AReaL's high performance and core functionality. Check out
[our AReaLite design doc](/arealite/README.md) and
[the quickstart guide](/docs/tutorial/quickstart.md) to begin your journey with
**AReaLite**!
@@ -65,13 +66,16 @@ SOTA 7B and 32B models on math reasoning. Check our
New highlights in AReaLite:
- Instead of the *system-centric* architecture in old AReaL, AReaLite follows an **AI-centric** API design that aims to provide the following key features:
* **Light-weight**, **focused** algorithm and training workflow customization.
* **Easy to scale up** without knowing system and infrastructure details.
* **Adaptable and pluggable:** Smooth to integrate with other modern AI applications.
- Instead of the *system-centric* architecture in old AReaL, AReaLite follows an
**AI-centric** API design that aims to provide the following key features:
- **Light-weight**, **focused** algorithm and training workflow customization.
- **Easy to scale up** without knowing system and infrastructure details.
- **Adaptable and pluggable:** Smooth to integrate with other modern AI applications.
These features make AReaLite easy for AI researchers to adopt, understand, and develop
effectively and efficiently. To learn more about the design principles of AReaL, please read the [AReaLite design doc](/arealite/README.md)!
effectively and efficiently. To learn more about the design principles of AReaL,
please read the [AReaLite design doc](/arealite/README.md)!
- A much more *light-weight* codebase compared to old AReaL codebase with only **20%** #
lines of code, with a detailed [code walkthrough](/docs/arealite/gsm8k_grpo.md) on an
@@ -139,7 +143,14 @@ Then you should be able to run experiments with your old settings on AReaLite!
## AReaLite vs legacy AReaL
AReaLite is an initiative to fully refactor AReaL, addressing historical issues such as redundant code and unnecessary system-level abstractions. Currently, AReaLite provides a lightweight codebase that enables fast prototyping for new RL training workflows and algorithms on a relatively small scale. For large-scale experiments (1K+ GPUs), we recommend using the battle-tested legacy AReaL to ensure stability. In the future, we will continue developing AReaLite by expanding its APIs, migrating legacy features, introducing new functionality, and validating the system through large-scale experiments.
AReaLite is an initiative to fully refactor AReaL, addressing historical issues such as
redundant code and unnecessary system-level abstractions. Currently, AReaLite provides a
lightweight codebase that enables fast prototyping for new RL training workflows and
algorithms on a relatively small scale. For large-scale experiments (1K+ GPUs), we
recommend using the battle-tested legacy AReaL to ensure stability. In the future, we
will continue developing AReaLite by expanding its APIs, migrating legacy features,
introducing new functionality, and validating the system through large-scale
experiments.
## Resources

@@ -9,15 +9,24 @@ immediately get started with AReaLite!
AReaL presents several challenges that make it difficult for AI researchers to adopt,
understand, and develop with effectively. The primary issue stems from its
*system-centric* rather than *AI-centric* architecture and API design. An *AI-centric* design aims to provide three
key features:
- **Light-weight & focused customization:** Users can implement their algorithms and training workflows with minimal and concentrated code, often in just a few files or even a single file.
- **Easy to scale up:** Experiments can be scaled up seamlessly without requiring knowledge of underlying system or infrastructure details.
- **Adaptable and pluggable:** Users are free to integrate the system with code or APIs from other AI libraries, or plug APIs from the system into other frameworks.
We believe that AReaL, as well as other existing RL systems, falls short in fully delivering these features.
For example, the RL training workflow in AReaL is built around multiple *workers* executing *model
function calls* in a *DAG* (Directed Acyclic Graph). To customize a training workflow, researchers first need to understand these system-level concepts. Then they are forced to find code to modify, which is scattered around in the codebase. It is also nearly impossible to exploit packages like `datasets` since it is not compatible with the workers.
This gap is the core motivation behind AReaLite: rebuilding AReaL with an AI-centric architecture and APIs.
*system-centric* rather than *AI-centric* architecture and API design. An *AI-centric*
design aims to provide three key features:
- **Light-weight & focused customization:** Users can implement their algorithms and
training workflows with minimal and concentrated code, often in just a few files or
even a single file.
- **Easy to scale up:** Experiments can be scaled up seamlessly without requiring
knowledge of underlying system or infrastructure details.
- **Adaptable and pluggable:** Users are free to integrate the system with code or APIs
from other AI libraries, or plug APIs from the system into other frameworks. We
believe that AReaL, as well as other existing RL systems, falls short in fully
delivering these features. For example, the RL training workflow in AReaL is built
around multiple *workers* executing *model function calls* in a *DAG* (Directed
Acyclic Graph). To customize a training workflow, researchers first need to understand
these system-level concepts. Then they are forced to find code to modify, which is
scattered around in the codebase. It is also nearly impossible to exploit packages
like `datasets` since it is not compatible with the workers. This gap is the core
motivation behind AReaLite: rebuilding AReaL with an AI-centric architecture and APIs.
Beyond architectural concerns, AReaL suffers from accumulated technical debt. The
codebase contains substantial legacy code inherited from previous projects that no
@@ -41,7 +50,8 @@ backends for efficient large-scale training.
## Design Principles
To achieve *AI-centric* and *light-weight* while maintaining the efficiency, our design is guided by seven core principles:
To achieve *AI-centric* and *light-weight* while maintaining the efficiency, our design
is guided by seven core principles:
1. **Native asynchronous RL training support** — Built from the ground up for
disentangled generation and training