nuzant 2025-08-01 04:22:53 +08:00
parent a8da72a95b
commit 62afd16c2c
2 changed files with 30 additions and 19 deletions

@ -20,11 +20,11 @@ like how you enjoy real-world milk tea (cheers).
**AReaL Highlights**
- ⚡ **\[NEW\] Light-weight & AI-centric:** Our new release **AReaLite** follows an
**AI-centric** design that prioritizes better development experiences for AI
researchers. As a result, **AReaLite** delivers most AReaL functionalities and
maintains its high performance with a much more light-weight codebase, supporting
users to build their own **agentic** and **RLVR** training workflows with less effort.
- ⚡ <span style="color: red; font-weight: bold;">**\[NEW\] AReaLite:**</span> Our new release AReaLite is
a **light-weight** and **AI-centric** codebase that prioritizes better development experiences for AI
researchers. As a result, AReaLite delivers most AReaL functionalities and
maintains its high performance with far fewer lines of code, helping
users build their own **agentic** and **RLVR** training workflows with minimal effort.
- 🔥 **Asynchronous RL**: With algorithm-system co-design, AReaL supports fully
asynchronous RL for **the fastest training**! Experimental support for multi-turn
agentic RL is also provided.
@ -37,7 +37,7 @@ like how you enjoy real-world milk tea (cheers).
## News
**\[2025/07/31\] (AReaLite)** We introduce **AReaLite**, a **light-weight** version of
**\[2025/07/31\] (AReaLite)** We introduce AReaLite, a **light-weight** version of
AReaL designed specifically for AI researchers and rapid prototyping. AReaLite features
an **AI-centric** API design that prioritizes ease of use and algorithm development,
while inherently supporting fully asynchronous **agentic RL**. With 80% fewer lines of
@ -61,14 +61,17 @@ SOTA 7B and 32B models on math reasoning. Check our
**\[2025/02/24\] (v0.1)** Our initial release includes reproducible results for 1.5B and
7B LRMs. Check our [v0.1 technical blog](/blog/AReaL_v0_1.md).
## Release Highlights
## AReaLite Release Highlights
New highlights in AReaLite:
- Follows an *AI-centric* API design instead of the *system-centric* architecture in old
AReaL, which makes it easier for AI researchers to adopt, understand, and develop
effectively and efficiently. To learn more about the design principles of AReaL,
please read the [AReaLite design doc](/arealite/README.md)!
- Instead of the *system-centric* architecture in old AReaL, AReaLite follows an **AI-centric** API design that aims to provide the following key features:
* **Light-weight**, **focused** algorithm and training workflow customization.
* **Easy to scale up** without knowing system and infrastructure details.
* **Adaptable and pluggable:** Integrates smoothly with other modern AI applications.
These features make AReaLite easy for AI researchers to adopt, understand, and develop with
effectively and efficiently. To learn more about the design principles of AReaL, please read the [AReaLite design doc](/arealite/README.md)!
- A much more *light-weight* codebase than the old AReaL codebase, with only **20%** of the
lines of code, with a detailed [code walkthrough](/docs/arealite/gsm8k_grpo.md) on an
@ -121,7 +124,7 @@ python3 -m arealite.launcher.local examples/arealite/eval.py --config examples/a
For a more detailed guide on how to run experiments in AReaLite, please check out
[our quickstart guide](/docs/tutorial/quickstart.md)!
## Switching from AReaL to AReaLite
## Switching from legacy AReaL to AReaLite
We also provide a convenient script to convert your AReaL YAML config into an AReaLite
config with a single command. First, locate your AReaL config, either modified
@ -134,6 +137,10 @@ python3 examples/arealite/convert_config.py -f <config_path> -o <output_path>
Then you should be able to run experiments with your old settings on AReaLite!
## AReaLite vs legacy AReaL
AReaLite is an initiative to fully refactor AReaL, addressing historical issues such as redundant code and unnecessary system-level abstractions. Currently, AReaLite provides a lightweight codebase that enables fast prototyping for new RL training workflows and algorithms on a relatively small scale. For large-scale experiments (1K+ GPUs), we recommend using the battle-tested legacy AReaL to ensure stability. In the future, we will continue developing AReaLite by expanding its APIs, migrating legacy features, introducing new functionality, and validating the system through large-scale experiments.
## Resources
- [Documentation](https://inclusionai.github.io/AReaL/)


@ -9,11 +9,15 @@ immediately get started with AReaLite!
AReaL presents several challenges that make it difficult for AI researchers to adopt,
understand, and develop with effectively. The primary issue stems from its
*system-centric* rather than *AI-centric* architecture. The reinforcement learning
algorithm workflow is built around multiple *workers* executing consecutive *model
function calls* — concepts that are unfamiliar to most AI researchers. This forces users
to first master these system-level abstractions before they can implement workflows and
algorithms for their specific research needs.
*system-centric* rather than *AI-centric* architecture and API design. An *AI-centric* design aims to provide three
key features:
- **Light-weight & focused customization:** Users can implement their algorithms and training workflows with minimal and concentrated code, often in just a few files or even a single file.
- **Easy to scale up:** Experiments can be scaled up seamlessly without requiring knowledge of underlying system or infrastructure details.
- **Adaptable and pluggable:** Users are free to integrate the system with code or APIs from other AI libraries, or plug APIs from the system into other frameworks.
We believe that AReaL, as well as other existing RL systems, falls short in fully delivering these features.
For example, the RL training workflow in AReaL is built around multiple *workers* executing *model
function calls* in a *DAG* (Directed Acyclic Graph). To customize a training workflow, researchers first need to understand these system-level concepts. They must then locate the code to modify, which is scattered across the codebase. It is also nearly impossible to use packages like `datasets`, since they are not compatible with the workers.
This gap is the core motivation behind AReaLite: rebuilding AReaL with an AI-centric architecture and APIs.
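
To make the "model function calls in a DAG" idea concrete, here is a minimal, generic sketch using Python's standard-library `graphlib`. It is not AReaL code: the node names (`generate`, `compute_reward`, `ref_logprobs`, `train_step`) and the `execution_order` helper are hypothetical and only illustrate how such a graph fixes the order in which calls may run.

```python
from graphlib import TopologicalSorter

# Hypothetical model function calls (not AReaL's real ones).
# Each key maps to the set of calls it depends on.
dag = {
    "generate": set(),
    "compute_reward": {"generate"},
    "ref_logprobs": {"generate"},
    "train_step": {"compute_reward", "ref_logprobs"},
}


def execution_order(graph):
    """Return one valid order in which the calls could be executed."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dag)
print(order)
```

Even in this toy form, changing the workflow means editing the graph rather than writing ordinary sequential training code, which is the indirection the paragraph above describes.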
Beyond architectural concerns, AReaL suffers from accumulated technical debt. The
codebase contains substantial legacy code inherited from previous projects that no
@ -37,11 +41,11 @@ backends for efficient large-scale training.
## Design Principles
Our design is guided by seven core principles:
To be *AI-centric* and *light-weight* while maintaining efficiency, our design is guided by seven core principles:
1. **Native asynchronous RL training support** — Built from the ground up for
disentangled generation and training
1. **AI-centric design** — Minimize exposure to system concepts like "PlacementGroup"
1. **System-less design** — Minimize exposure to system concepts like "PlacementGroup"
1. **PyTorch-centric approach** — Use raw PyTorch types without unnecessary abstractions
1. **Transparent algorithm orchestration** — Make the flow of operations clear and
understandable