mirror of https://github.com/inclusionAI/AReaL
revise
parent a8da72a95b
commit 62afd16c2c
README.md: 31 changed lines
@@ -20,11 +20,11 @@ like how you enjoy real-world milk tea (cheers).

 **AReaL Highlights**

-- ⚡ **\[NEW\] Light-weight & AI-centric:** Our new release **AReaLite** follows an
-**AI-centric** design that prioritizes better development experiences for AI
-researchers. As a result, **AReaLite** delivers most AReaL functionalities and
-maintains its high performance with a much more light-weight codebase, supporting
-users to build their own **agentic** and **RLVR** training workflows with less effort.
+- ⚡ <span style="color: red; font-weight: bold;">**\[NEW\] AReaLite:**</span> Our new release AReaLite is
+a **light-weight** and **AI-centric** codebase that prioritizes better development experiences for AI
+researchers. As a result, AReaLite delivers most AReaL functionalities and
+maintains its high performance with far fewer lines of code, helping
+users build their own **agentic** and **RLVR** training workflows with minimal effort.
 - 🔥 **Asynchronous RL**: With algorithm-system co-design, AReaL supports fully
 asynchronous RL for **the fastest training**! Experimental support for multi-turn
 agentic RL is also provided.
@@ -37,7 +37,7 @@ like how you enjoy real-world milk tea (cheers).

 ## News

-**\[2025/07/31\] (AReaLite)** We introduce **AReaLite**, a **light-weight** version of
+**\[2025/07/31\] (AReaLite)** We introduce AReaLite, a **light-weight** version of
 AReaL designed specifically for AI researchers and rapid prototyping. AReaLite features
 an **AI-centric** API design that prioritizes ease of use and algorithm development,
 while inherently supporting fully asynchronous **agentic RL**. With 80% fewer lines of
@@ -61,14 +61,17 @@ SOTA 7B and 32B models on math reasoning. Check our
 **\[2025/02/24\] (v0.1)** Our initial release includes reproducible results for 1.5B and
 7B LRMs. Check our [v0.1 technical blog](/blog/AReaL_v0_1.md).

-## Release Highlights
+## AReaLite Release Highlights

 New highlights in AReaLite:

-- Follows an *AI-centric* API design instead of the *system-centric* architecture in old
-AReaL, which make it easier for AI researchers to adopt, understand, and develop
-effectively and efficiently. To learn more about the design principles of AReaL,
-please read the [AReaLite design doc](/arealite/README.md)!
+- Instead of the *system-centric* architecture in old AReaL, AReaLite follows an **AI-centric** API design that aims to provide the following key features:
+* **Light-weight**, **focused** algorithm and training workflow customization.
+* **Easy to scale up** without knowing system and infrastructure details.
+* **Adaptable and pluggable:** Integrates smoothly with other modern AI applications.
+
+These features make AReaLite easy for AI researchers to adopt, understand, and develop
+effectively and efficiently. To learn more about the design principles of AReaL, please read the [AReaLite design doc](/arealite/README.md)!

 - A much more *light-weight* codebase compared to old AReaL codebase with only **20%** #
 lines of code, with a detailed [code walkthrough](/docs/arealite/gsm8k_grpo.md) on an
@@ -121,7 +124,7 @@ python3 -m arealite.launcher.local examples/arealite/eval.py --config examples/a
 For more detailed guide on how to run experiments in AReaLite, please check out
 [our quickstart guide](/docs/tutorial/quickstart.md)!

-## Switching from AReaL to AReaLite
+## Switching from legacy AReaL to AReaLite

 We also provide a convenient script to convert your AReaL YAML config into AReaLite
 config in one command line. First you need to locate your AReaL config either modified
@@ -134,6 +137,10 @@ python3 examples/arealite/convert_config.py -f <config_path> -o <output_path>

 Then you should be able to run experiments with your old settings on AReaLite!

+## AReaLite vs legacy AReaL
+
+AReaLite is an initiative to fully refactor AReaL, addressing historical issues such as redundant code and unnecessary system-level abstractions. Currently, AReaLite provides a lightweight codebase that enables fast prototyping for new RL training workflows and algorithms on a relatively small scale. For large-scale experiments (1K+ GPUs), we recommend using the battle-tested legacy AReaL to ensure stability. In the future, we will continue developing AReaLite by expanding its APIs, migrating legacy features, introducing new functionality, and validating the system through large-scale experiments.
+
 ## Resources

 - [Documentation](https://inclusionai.github.io/AReaL/)
@@ -9,11 +9,15 @@ immediately get started with AReaLite!

 AReaL presents several challenges that make it difficult for AI researchers to adopt,
 understand, and develop with effectively. The primary issue stems from its
-*system-centric* rather than *AI-centric* architecture. The reinforcement learning
-algorithm workflow is built around multiple *workers* executing consecutive *model
-function calls* — concepts that are unfamiliar to most AI researchers. This forces users
-to first master these system-level abstractions before they can implement workflows and
-algorithms for their specific research needs.
+*system-centric* rather than *AI-centric* architecture and API design. An *AI-centric* design aims to provide three
+key features:
+- **Light-weight & focused customization:** Users can implement their algorithms and training workflows with minimal and concentrated code, often in just a few files or even a single file.
+- **Easy to scale up:** Experiments can be scaled up seamlessly without requiring knowledge of underlying system or infrastructure details.
+- **Adaptable and pluggable:** Users are free to integrate the system with code or APIs from other AI libraries, or plug APIs from the system into other frameworks.
+We believe that AReaL, as well as other existing RL systems, falls short of fully delivering these features.
+For example, the RL training workflow in AReaL is built around multiple *workers* executing *model
+function calls* in a *DAG* (Directed Acyclic Graph). To customize a training workflow, researchers first need to understand these system-level concepts. Then they must find the code to modify, which is scattered across the codebase. It is also nearly impossible to use packages like `datasets`, since they are not compatible with the workers.
+This gap is the core motivation behind AReaLite: rebuilding AReaL with an AI-centric architecture and APIs.

 Beyond architectural concerns, AReaL suffers from accumulated technical debt. The
 codebase contains substantial legacy code inherited from previous projects that no
@@ -37,11 +41,11 @@ backends for efficient large-scale training.

 ## Design Principles

-Our design is guided by seven core principles:
+To be *AI-centric* and *light-weight* while maintaining efficiency, our design is guided by seven core principles:

 1. **Native asynchronous RL training support** — Built from the ground up for
 disentangled generation and training
-1. **AI-centric design** — Minimize exposure to system concepts like "PlacementGroup"
+1. **System-less design** — Minimize exposure to system concepts like "PlacementGroup"
 1. **PyTorch-centric approach** — Use raw PyTorch types without unnecessary abstractions
 1. **Transparent algorithm orchestration** — Make the flow of operations clear and
 understandable
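
The first principle in the hunk above, generation disentangled from training, can be pictured as a producer/consumer pair. The sketch below is only an illustration built on the Python standard library: `generate`, `train`, `run`, and the `state` dict are hypothetical names chosen for this example and are not AReaLite's actual API.

```python
import queue
import threading


def generate(out_q, state, n_samples):
    # Hypothetical stand-in for an inference engine: each sample is tagged
    # with the policy version that was current when it was produced.
    for i in range(n_samples):
        out_q.put({"id": i, "version": state["version"]})
    out_q.put(None)  # sentinel: generation is finished


def train(in_q, state, batch_size):
    # Hypothetical stand-in for a trainer: consume samples as they arrive
    # and bump the policy version after every batch.
    batch, consumed = [], 0
    while True:
        item = in_q.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            consumed += len(batch)
            state["version"] += 1  # "publish" updated weights
            batch = []
    if batch:  # train on the final partial batch, if any
        consumed += len(batch)
        state["version"] += 1
    return consumed


def run(n_samples=10, batch_size=4):
    state = {"version": 0}
    q = queue.Queue(maxsize=8)  # bounded buffer between the two sides
    producer = threading.Thread(target=generate, args=(q, state, n_samples))
    producer.start()  # generation runs concurrently with training
    consumed = train(q, state, batch_size)
    producer.join()
    return consumed, state["version"]
```

Because the generator never waits for a training step to finish, a sample may carry an older policy version than the one being trained. That staleness is what an asynchronous RL algorithm has to tolerate, and it is why the README speaks of algorithm-system co-design.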