Commit Graph

194 Commits

Author SHA1 Message Date
Wei Fu 40b0070f97
[Fix] Add a title for the v0.3 blog (#186) 2025-07-21 10:54:41 +08:00
nuzant 71c47c5f17
fix single node bug (#185) 2025-07-18 10:54:49 +08:00
nuzant 0d45f43285
[Bug] Fix a bug in ray training utils. (#184)
* fix ray schedule

* .

* .

* remove debug info
2025-07-17 16:30:10 +08:00
Richard.Bian ef8b340ea7
nit: fix typo in blog v0.2 release (#182)
Nitpick fix for the title in the blog md file AReaL_v0_2.md
2025-07-17 16:29:34 +08:00
Wei Fu b9dbd4a2c1
Update to persistent wechat QR code. (#159) 2025-07-09 10:50:19 +08:00
xssstory 17ea7fe94d
fix math reward verifier (#156)
* PullRequest: 293 fix get_param_realloc_path

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/293

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix get_param_realloc_path

* PullRequest: 297 bugfix: reward is always -5

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/297

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* bugfix: reward is always -5

* PullRequest: 321 fix checkpoint save dir

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/321

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix checkpoint save dir

* PullRequest: 328 [Doc] update installation

Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/328

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* [Doc] update installation

* PullRequest: 329 bugfix: math verifier blocks the async training

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/329

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* bugfix: math verifier blocks the async training

* format

---------

Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com>
Co-authored-by: garrett4wade <fuwth17@gmail.com>
2025-07-07 15:49:13 +08:00
Wei Fu 5b7c83b5d9
update qrcode (#153) 2025-07-07 12:28:21 +08:00
Wei Fu 0ff8c59435
[Fix] Merge error fixes. (#152)
* PullRequest: 293 fix get_param_realloc_path

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/293

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix get_param_realloc_path

* PullRequest: 297 bugfix: reward is always -5

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/297

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* bugfix: reward is always -5

* PullRequest: 321 fix checkpoint save dir

Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/321

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix checkpoint save dir

* PullRequest: 328 [Doc] update installation

Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/328

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* [Doc] update installation

---------

Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com>
2025-07-07 10:30:27 +08:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟 e0aee03109
Fix several syntax warnings related to invalid escape sequences (#144)
by using raw strings or properly escaping the backslashes.

```log
AReaL/realhf/impl/dataset/math_parser.py:292: SyntaxWarning: invalid escape sequence '\%'
  string = string.replace("\%", "")
AReaL/realhf/impl/dataset/math_parser.py:402: SyntaxWarning: invalid escape sequence '\d'
  pattern = "-?\d*\.?\d+"
AReaL/realhf/impl/model/parallelism/tensor_parallel/modules.py:1125: SyntaxWarning: invalid escape sequence '\s'
```
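
A minimal sketch of the two fixes the message names (raw strings, or properly escaped backslashes), assuming only standard-library Python. The helper names below are hypothetical and are not taken from `math_parser.py`; only the two string literals mirror the ones shown in the log.

```python
import re
import warnings

# Raw string: the backslashes reach the regex engine untouched,
# so the compiler sees no invalid escape sequence.
NUMBER_PATTERN = r"-?\d*\.?\d+"


def strip_escaped_percent(text: str) -> str:
    """Remove backslash-percent pairs; the backslash is escaped explicitly."""
    return text.replace("\\%", "")


def find_numbers(text: str) -> list[str]:
    """Return every (optionally signed, optionally decimal) number in text."""
    return re.findall(NUMBER_PATTERN, text)


def compiles_cleanly(source: str) -> bool:
    """Return True if compiling `source` emits no warnings at all."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<snippet>", "exec")
    return not caught
```

`compiles_cleanly` is only a convenience for checking a snippet: it reports whether the compiler warned, which is how an unescaped `"\d"` can be told apart from the raw-string form.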

Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-07-01 10:12:39 +08:00
Wei Fu 254614df87
Update wechat QR code (#135) 2025-06-25 16:29:19 +08:00
Wei Fu adeb8eb13f
[Fix] Fix yaml configurations for v0.2 experiments. (#129)
* .

* fix
2025-06-24 13:48:02 +08:00
GurrenLagann97 e3005d57f6
add a preprocessing script for code training data and update readme (#126)
* add a preprocessing script for code training data and update readme

* fix eval doc

---------

Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
2025-06-24 09:44:15 +08:00
Wei Fu 3642cce2fc
[FIX] Fix the flash-attn version to 2.7.3 (#114)
* .

* downgrade to 2.7.3

* .

* .

* update ci

* update ci

* update ci

* update CI

* update to tsinghua mirror

* update ci

* .

* .

* .
2025-06-23 15:46:17 +08:00
Wei Fu 1ec1399f19
PullRequest: 252 [Feature] Fix constants initialization. (#122)
Merge branch fw/gh/fix-init-constants of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/252?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* remove LOG_ROOT
* remove MODEL_SAVE_PATH
* remove PARAM_REALLOC_PATH, DATASET_CACHE
* prepare for testing
* prepare for testing
* ready for run
* local run
* tests mainly pass
* format
* amend cluster.py
* fix
2025-06-23 12:52:49 +08:00
Wei Fu b63eea9d07
Update WeChat QRCode (#111) 2025-06-16 19:52:21 +08:00
xichengpro bb14f022dc
Support using SwanLab for experiment tracking (#98)
* Support using SwanLab for experiment tracking

* docs: improve WandB and SwanLab integration documentation
- Added official links for better user reference
- Used backticks to quote commands and parameters
- Unified mode settings to use "online" / "cloud" convention
- Merged WandB and SwanLab descriptions into a single concise statement
- Added note on using `swanlab.mode="local"` when server connection is unavailable

* refactor: update default value of api_key

* fix: correct help description from WandB to SwanLab in SwanLabConfig

* refactor: merge log_swanlab_tensorboard and log_wandb_tensorboard into log_swanlab_wandb_tensorboard

 - Unified logging logic for SwanLab, WandB, and TensorBoard to reduce code duplication

* chore: update swanlab version in dependency config files

 - Updated SwanLab version in pyproject.toml
 - Updated SwanLab version in requirements.txt

* refactor: enhance SwanLab config handling for logging purposes
- Config now uses provided arguments first
- Falls back to reading from config.yaml if no input is given

* docs: add note on using `swanlab.mode="local"` when server connection is unavailable

* refactor: merge _LATEST_WANDB_STEP and _LATEST_SWANLAB_STEP into _LATEST_LOG_STEP

* Format code with black and isort

* chore: update swanlab version in dependency config files
- Updated SwanLab version in requirements.txt

* refactor: rename swanlab_wandb_data to log_data

---------

Co-authored-by: dubingnan <dubingnan@360.cn>
2025-06-16 19:51:31 +08:00
GurrenLagann97 f2f4b67bcd
fix readme in evaluation (#103)
Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
2025-06-10 17:56:08 +08:00
Wei Fu e7eda16311
fix wandb mode typo (#100) 2025-06-10 14:01:24 +08:00
Wei Fu 86625c56e4
[Doc] Add WeChat QR code (#99)
* update qrcode

* update qrcode
2025-06-10 13:57:20 +08:00
Wei Fu b3f5392f44
[Bug] Fix the dependency of a virtual environment for sympy==1.12 (#92)
* change to math local eval

* .

* update docker image tag
2025-06-08 21:11:35 +08:00
Wei Fu c7d6ccc18e
[Bug] Suppress the SGLang patch error in Ray workers (#93) 2025-06-08 20:50:56 +08:00
Wei Fu 54daceb6dc
[Feature] Add multi-turn agent example (#72)
* update benchmark script

* .

* add benchmark docs

* PullRequest: 178 multi turn math agent training

Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* multi turn math agent training
* training data logging and clean math multi-turn exp
* fix
* .

* format

* .

---------

Co-authored-by: 步偶 <sam.gjx@antgroup.com>
2025-06-08 20:36:52 +08:00
Wei Fu 448bb05a3d
Fix formatting (#90) 2025-06-06 21:42:12 +08:00
Ligeng Zhu de134b4a7a
[Feature] Switch dataset path / model path to HF location to ease community usage (#82)
* Update .gitignore and modify dataset paths in scripts for improved file management and compatibility with Hugging Face datasets. Additionally, refactor dataset loading functions to utilize load_hf_or_local_file for better flexibility.

* Remove sglang subproject and update dataset path format in load_hf_or_local_file function for compatibility with Hugging Face datasets.

* Refactor imports in grader.py and parser.py to include sympy for improved functionality.
2025-06-06 21:38:06 +08:00
Wei Fu b768e5ce3c
update readme (#78) 2025-06-04 12:02:15 +08:00
samjia2000 ae4b03d95c
[Doc] update blog to discuss staleness (#76)
* update blog to discuss staleness

* update

* update

---------

Co-authored-by: gjx <sam.gjx@antgroup.com>
2025-06-04 11:48:30 +08:00
Wei Fu 15024d8d32
fix (#77) 2025-06-04 11:27:09 +08:00
GurrenLagann97 67dc056dd9
add hf links in readme and blog (#75)
* add hf links in readme and blog

* fix

* add model link in table

* add model link in table

---------

Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
2025-06-04 10:32:18 +08:00
Wei Fu d56df5102e
[Doc] Update README. (#74)
* update benchmark script

* .

* add benchmark docs

* add v0.3.0 configs

* .

* PullRequest: 178 multi turn math agent training

Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* multi turn math agent training
* training data logging and clean math multi-turn exp
* fix
* .

* fix

* change readme

* fix typo

* revert multiturn

---------

Co-authored-by: 步偶 <sam.gjx@antgroup.com>
2025-06-04 00:43:19 +08:00
Wei Fu 2d4d937d10
[Doc] Add doc for reproducing released models (#73)
* update benchmark script

* .

* add benchmark docs

* add v0.3.0 configs

* .

* PullRequest: 178 multi turn math agent training

Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* multi turn math agent training
* training data logging and clean math multi-turn exp
* fix
* .

* fix

* add docs and config

* format

* revert multi-turn agent

* add config

---------

Co-authored-by: 步偶 <sam.gjx@antgroup.com>
2025-06-03 20:33:48 +08:00
Wei Fu ad6e5bd3fa
[Doc] Add verl benchmark scripts (#71)
* update benchmark script

* .

* add benchmark docs

* .
2025-06-03 18:10:32 +08:00
Wei Fu 326dfe3462
[Feature] Update Dockerfile to v0.3.0 (#70)
* update benchmark script

* .

* add benchmark docs

* update dockerfile
2025-06-03 11:57:56 +08:00
Wei Fu 2e5728a2c1
[Doc] Mark the equivalence between zero-staleness and synchronous PPO (#69) 2025-06-02 21:44:35 +08:00
Wei Fu fabe59aad1
add doc (#68) 2025-06-02 21:16:36 +08:00
Wei Fu ab20f940f9
add async doc (#67) 2025-06-02 15:59:50 +08:00
GurrenLagann97 ac7a9c5a89
[Doc] add dataset customization docs (#66)
* add dataset customization docs

* fix dataset customization docs and change orders

* fix dataset customization docs

---------

Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
Co-authored-by: Wei Fu <36355462+garrett4wade@users.noreply.github.com>
2025-06-02 12:31:34 +08:00
Wei Fu 11a34dfb51
[Doc] Add doc for reproducing benchmark results (#65)
* update benchmark script

* .

* add benchmark docs
2025-06-01 20:50:40 +08:00
Wei Fu ce4d7354bf
[Doc] Fix documentation for using Docker containers and customized agents (#64)
* test env setup

* .

* fix a missing cherry-pick

* .

* .

* .

* update docker instruction

* fix
2025-06-01 16:33:29 +08:00
Wei Fu afe5a2c880
[Fix] Fix tutorial async_ppo script and doc structure (#63)
* test env setup

* .

* fix a missing cherry-pick
2025-06-01 15:46:46 +08:00
Wei Fu 4fab3ac769
[Doc & Fix] Simplify the environment setup procedure (#62)
* PullRequest: 176 [FIX] clear sensitive info

Merge branch fw/fix-sensitive-info of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/176

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* .
* .

* .

* .

* test env setup

* fix

* allow cached model

* .

* revise docs

* change docs

* format docs

* update readme
2025-06-01 14:57:21 +08:00
Wei Fu d87c898d36
[Feature] Add link to documentation in README (#61) 2025-05-30 19:58:58 +08:00
Wei Fu 473eeb2db0
[Feature] Create docs and examples for multi-turn agent RL (#60)
* PullRequest: 168 Add Codeforces tests and fix other test issues

Merge branch areal-eval-0.3 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/168

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix eval and add codeforces elo calc
* fix codeforces test
* fix qwen3 prompt
* change annotations to eng
* add code verify files

* PullRequest: 173 [FIX] format code and fix a recover error in rollout worker

Merge branch fw/fix-rollout-recover of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/173

Reviewed-by: 温差 <xushusheng.xss@antgroup.com>


* format code and fix a recover error in rollout worker

* PullRequest: 171 Update the evaluation docs

Merge branch eval-doc of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/171?tab=diff

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* update eval doc
* complete eval doc
* complete eval doc
* fix ood info
* add data obtaining guide
* fix supported datasets

* PullRequest: 174 decouple max_behav_imp_weight and c_clip & track entropy, positve_seq_len and negative_seq_len

Merge branch xss/max_behav_imp_weight of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/174

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* decouple max_behav_imp_weight and c_clip
* rename log: positve_* -> correct_*, negative_* -> incorrect_*
* rename hyper-parameter: max_behav_imp_weight -> behav_imp_weight_cap

* PullRequest: 175 [Fix] Fix the "event loop is already running" error in ray scripts

Merge branch fw/fix-ray-asyncio of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/175

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* format code and fix a recover error in rollout worker
* .

* add docs

* .

---------

Co-authored-by: 乘鹭 <hechuyi.hcy@antgroup.com>
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
2025-05-30 16:26:07 +08:00
Wei Fu c0200f10d0
[Feature] Support behavior importance weight capping and update evaluation scripts (#59)
* PullRequest: 168 Add Codeforces tests and fix other test issues

Merge branch areal-eval-0.3 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/168

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix eval and add codeforces elo calc
* fix codeforces test
* fix qwen3 prompt
* change annotations to eng
* add code verify files

* PullRequest: 173 [FIX] format code and fix a recover error in rollout worker

Merge branch fw/fix-rollout-recover of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/173

Reviewed-by: 温差 <xushusheng.xss@antgroup.com>


* format code and fix a recover error in rollout worker

* PullRequest: 171 Update the evaluation docs

Merge branch eval-doc of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/171?tab=diff

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* update eval doc
* complete eval doc
* complete eval doc
* fix ood info
* add data obtaining guide
* fix supported datasets

* PullRequest: 174 decouple max_behav_imp_weight and c_clip & track entropy, positve_seq_len and negative_seq_len

Merge branch xss/max_behav_imp_weight of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/174

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* decouple max_behav_imp_weight and c_clip
* rename log: positve_* -> correct_*, negative_* -> incorrect_*
* rename hyper-parameter: max_behav_imp_weight -> behav_imp_weight_cap

* PullRequest: 175 [Fix] Fix the "event loop is already running" error in ray scripts

Merge branch fw/fix-ray-asyncio of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/175

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* format code and fix a recover error in rollout worker
* .

---------

Co-authored-by: 乘鹭 <hechuyi.hcy@antgroup.com>
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
2025-05-30 10:29:21 +08:00
Wei Fu 0815bb6494
[CI] Try to fix doc CI (#58)
* .

* .

* .

* .

* .

* .

* .

* .

* .

* .

* .
2025-05-29 12:45:10 +08:00
Wei Fu b3375607c6
[CI] Fix doc CI again (#57)
* .

* .

* .

* .

* .

* .
2025-05-29 11:22:27 +08:00
Wei Fu 409fa9842d
[CI] Fix doc CI (#56)
* .

* .

* .
2025-05-29 11:09:01 +08:00
Wei Fu b240a9cf80
. (#55) 2025-05-28 20:29:32 +08:00
Wei Fu d13b517cbf
add doc ci (#54) 2025-05-28 20:01:10 +08:00
Wei Fu 7826fdbb87
[Feature] Amend yaml configurations for Ray experiments (#53)
* feat: one buffer for each task

* feat: support "one buffer for each task" for async

* make kv_cache_dtype configurable

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>

* style: use plural form

fix: use _seed_from_key to set different seeds for data loaders
fix: call load_data for one buffer each time

* PullRequest: 125 Support running async experiments in the 2407 image.

Merge branch fw/async2407 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/125

Signed-off-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* fix: handle multiple datasets in recover indices
fix: `isinstance(self.__datasets, PullerStreamDataset)`
feat: use the "spec" request to obtain the number of datasets
fix: revert rollout worker

* fix: revert async_rl_exp.py

* fix flag for list (cuda_graph_bs)

* format

* [FIX] fix async task reward [sglang bf16-> fp16]

* fix: define `self.__datasets` in advance

* PullRequest: 130 [Refactor] Remove deprecated search related code

Merge branch mzy/remove-search of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/130

Signed-off-by: 博惟 <bowei.fw@antgroup.com>


* remove search related

* PullRequest: 131 [Refactor] Change terminology "model parallel" into "tensor parallel" to align with megatron.

Merge branch mzy/mp-to-tp of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/131?tab=comment

Signed-off-by: 博惟 <bowei.fw@antgroup.com>


* change mp to tp
* .
* .

* PullRequest: 142 Fix an error for megatron backend destroy

Merge branch fw/fix-meagatron-destroy of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/142

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* PullRequest: 143 Fix the port conflict issue of generation servers

Merge branch fw/fix-gen-port of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/143?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* somehow fix the port issue
* add clearance period
* .
* .

* PullRequest: 145 Add code environment

Merge branch fw/code-env of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/145?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* add code env
* somehow fix the port issue
* fix

* PullRequest: 144 Add decoupled PPO loss

Merge branch fw/decoupled-ppo-loss of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/144?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix ppo step logging, nan in stats tracker, and add decoupled loss
* .
* somehow fix the port issue
* fix typo

* PullRequest: 146 Merge SLURM logs and save experiment configs in yaml format.

Merge branch fw/better-logging of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/146

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* merge all slurm logs into one
* write config to yaml

* PullRequest: 141 Merge changes during NeurIPS submission

Merge branch fw/async-dev of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/141

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* .
* .
* .
* .
* .
* .
* .
* .
* update script
* .
* .
* .
* .
* [ADD] add least req scheduling
* fix test genreq
* .
* .
* fix stats tracker nan
* .
* .
* .
* .
* .
* .
* .
* upper clip decoupled objective
* add throughput exp script
* .
* remove behav upper clip param
* .
* .
* .
* plot curve
* update thpt script
* .
* master worker raise error when exiting
* update script
* add gen throughput logging
* .
* .
* add decoupled wandb data
* .
* fix port issue and add no training option
* .
* enlarge ttl
* remove gserver manager await staled
* update weights in groups
* .
* .
* .
* add port clearance period
* .
* .
* .
* add plot script
* add sft throughput eval
* .
* log tokens in null interface
* ablation experiments and interruptible generation
* plotting scripts / run scripts / data results
* .
* remove scripts
* add port test
* remove force_sync_reward
* revert some changes
* .
* revert
* revert fix
* fix
* revert
* fix typo

* support qwen3 training

* PullRequest: 147 Support interruption in SGLang and fix a KeyError in gather-scatter communication

Merge branch fw/sglang046-with-abort-request of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/147?tab=diff

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix ppo step logging, nan in stats tracker, and add decoupled loss
* .
* somehow fix the port issue
* initial commit
* add interrupt request
* fix data transfer issue
* max concurrent rollouts defaults to train batch size
* merge main
* add patch
* fix patch typo
* revert sglang
* fix typo
* fix minor typo
* .
* pip show editable sglang path

* PullRequest: 149 fix: code faas max_retries

Merge branch xss/fix_code_verifier of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/149

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix: code faas max_retries

* PullRequest: 150 [Bug Fix] Fix key errors in `_run_scatter` in data transfer

Merge branch mzy/fix-scatter-groups of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/150

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix scatter groups key error

* fix test

* .

* PullRequest: 151 Fix Qwen3 import error when using transformers with a lower version

Merge branch fw/fix-qwen3 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/151

Reviewed-by: 温差 <xushusheng.xss@antgroup.com>


* merge all slurm logs into one
* write config to yaml
* .

* PullRequest: 152 Support sglang0.4.6 and fix master_worker import error

Merge branch adopt_sglang046 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/152

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* Support sglang0.4.6 and fix master_worker import error
* remove disable_mla option

* PullRequest: 155 [FIX] reduce port conflicts

Merge branch sxj/reduce_port_conflict of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/155

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* [FIX] reduce port conflicts

* PullRequest: 153 Fix stuck and recover issues for async experiments

Merge branch fw/stable-async of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/153

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix sample cnt stuck
* fix recover
* code cleanup
* merge all slurm logs into one
* write config to yaml
* .
* .
* .
* revert birth time change
* .
* enlarge sock connect timeout

* PullRequest: 158 [Fix] Fix the error where "accepted" is not defined

Merge branch fw/fix-rollout-accepted of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/158

Reviewed-by: 温差 <xushusheng.xss@antgroup.com>


* .

* PullRequest: 154 Fix unit tests and simplify package installation

Merge branch fw/v0.3.0-tests of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/154?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix some tests
* fix tests except for experiments
* fix tests
* fix tests
* .
* .

* PullRequest: 159 [fix] Enlarge the default aiohttp connection timeout and fix a recover error in model worker

Merge branch fw/stable-async of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/159

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix sample cnt stuck
* fix recover
* code cleanup
* merge all slurm logs into one
* write config to yaml
* .
* .
* .
* revert birth time change
* .
* enlarge sock connect timeout
* .

* PullRequest: 160 set sock_connect as rollout_request_timeout in partial_rollout.py

Merge branch xss/rollout_timeout of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/160

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* set sock_connect as rollout_request_timeout in partial_rollout.py

* PullRequest: 161 Prioritize rollouts that are submitted earlier rather than arrived earlier

Merge branch fw/birth-time of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/161

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* blocking push

* PullRequest: 163 [bugfix] Fix synchronized training when birth time is absent

Merge branch fw/fix-sync-birthtime of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/163

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* PullRequest: 164 [Refactor] Move cluster spec into CLI args

Merge branch fw/refactor-cluster-spec of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/164?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* set cluster spec path in args
* .
* fix
* add default cluster spec

* PullRequest: 165 Normally exit all workers after experiment completion

Merge branch fw/exit-all-workers of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/165

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* .

* PullRequest: 167 [Feature] Use chunked logits computation to alleviate SGLang OOM

Merge branch fw/patch-sglang-oom of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/167

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* PullRequest: 166 [Feature] Support single-script experiment launch with Ray

Merge branch fw/turbolaunch of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/166?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* add training script without ray name resolve
* add ray name resolve
* ray worker
* run
* run async
* local run
* set cluster spec path in args
* .
* .
* fix
* .
* .
* .
* .
* .
* update config
* .
* minor renaming

* PullRequest: 169 [Doc] Add v0.3.0 docs based on jupyter-book

Merge branch fw/doc of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/169

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* add docs
* refine doc
* refine doc

* PullRequest: 170 [Feature] Amend configs for ray scripts

Merge branch fw/ray-configs of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/170

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

---------

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Co-authored-by: wanghuaijie.whj <wanghuaijie.whj@antgroup.com>
Co-authored-by: Tiwei Bie <tiwei.btw@antgroup.com>
Co-authored-by: kira.gw <kira.gw@antgroup.com>
Co-authored-by: shenxujie.sxj <shenxujie.sxj@antgroup.com>
Co-authored-by: 晓雷 <meizhiyu.mzy@antgroup.com>
Co-authored-by: sam.gjx <sam.gjx@antgroup.com>
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
Co-authored-by: 履渊 <yuhong.gyh@antgroup.com>
2025-05-28 19:32:42 +08:00
Wei Fu cf46993a30
[Feature & Doc & Bug Fix] Add docs, simplified ray-based scripts, and fix issues to stabilize asynchronous experiments (#52)
* feat: one buffer for each task

* feat: support "one buffer for each task" for async

* make kv_cache_dtype configurable

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>

* style: use plural form

fix: use _seed_from_key to set different seeds for data loaders
fix: call load_data for one buffer each time

* PullRequest: 125 Support running async experiments in the 2407 image.

Merge branch fw/async2407 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/125

Signed-off-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* fix: handle multiple datasets in recover indices
fix: `isinstance(self.__datasets, PullerStreamDataset)`
feat: use the "spec" request to obtain the number of datasets
fix: revert rollout worker

* fix: revert async_rl_exp.py

* fix flag for list (cuda_graph_bs)

* format

* [FIX] fix async task reward [sglang bf16-> fp16]

* fix: define `self.__datasets` in advance

* PullRequest: 130 [Refactor] Remove deprecated search related code

Merge branch mzy/remove-search of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/130

Signed-off-by: 博惟 <bowei.fw@antgroup.com>


* remove search related

* PullRequest: 131 [Refactor] Change terminology "model parallel" into "tensor parallel" to align with megatron.

Merge branch mzy/mp-to-tp of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/131?tab=comment

Signed-off-by: 博惟 <bowei.fw@antgroup.com>


* change mp to tp
* .
* .

* PullRequest: 142 Fix an error for megatron backend destroy

Merge branch fw/fix-meagatron-destroy of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/142

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* PullRequest: 143 Fix the port conflict issue of generation servers

Merge branch fw/fix-gen-port of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/143?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* somehow fix the port issue
* add clearance period
* .
* .

* PullRequest: 145 Add code environment

Merge branch fw/code-env of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/145?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* add code env
* somehow fix the port issue
* fix

* PullRequest: 144 Add decoupled PPO loss

Merge branch fw/decoupled-ppo-loss of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/144?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix ppo step logging, nan in stats tracker, and add decoupled loss
* .
* somehow fix the port issue
* fix typo

* PullRequest: 146 Merge SLURM logs and save experiment configs in yaml format.

Merge branch fw/better-logging of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/146

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* merge all slurm logs into one
* write config to yaml

* PullRequest: 141 Merge changes during NeurIPS submission

Merge branch fw/async-dev of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/141

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* .
* .
* .
* .
* .
* .
* .
* .
* update script
* .
* .
* .
* .
* [ADD] add least req scheduling
* fix test genreq
* .
* .
* fix stats tracker nan
* .
* .
* .
* .
* .
* .
* .
* upper clip decoupled objective
* add throughput exp script
* .
* remove behav upper clip param
* .
* .
* .
* plot curve
* update thpt script
* .
* master worker raises error when exiting
* update script
* add gen throughput logging
* .
* .
* add decoupled wandb data
* .
* fix port issue and add no training option
* .
* enlarge ttl
* remove gserver manager await staled
* update weights in groups
* .
* .
* .
* add port clearance period
* .
* .
* .
* add plot script
* add sft throughput eval
* .
* log tokens in null interface
* ablation experiments and interruptible generation
* plotting scripts / run scripts / data results
* .
* remove scripts
* add port test
* remove force_sync_reward
* revert some changes
* .
* revert
* revert fix
* fix
* revert
* fix typo

* support qwen3 training

* PullRequest: 147 Support interruption in SGLang and fix a KeyError in gather-scatter communication

Merge branch fw/sglang046-with-abort-request of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/147?tab=diff

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix ppo step logging, nan in stats tracker, and add decoupled loss
* .
* somehow fix the port issue
* initial commit
* add interrupt request
* fix data transfer issue
* max concurrent rollouts defaults to train batch size
* merge main
* add patch
* fix patch typo
* revert sglang
* fix typo
* fix minor typo
* .
* pip show editable sglang path

* PullRequest: 149 fix: code faas max_retries

Merge branch xss/fix_code_verifier of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/149

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix: code faas max_retries

* PullRequest: 150 [Bug Fix] Fix key errors in `_run_scatter` in data transfer

Merge branch mzy/fix-scatter-groups of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/150

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* fix scatter groups key error

* fix test

* .

* PullRequest: 151 Fix Qwen3 import error when using transformers with a lower version

Merge branch fw/fix-qwen3 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/151

Reviewed-by: 温差 <xushusheng.xss@antgroup.com>


* merge all slurm logs into one
* write config to yaml
* .

* PullRequest: 152 Support sglang0.4.6 and fix master_worker import error

Merge branch adopt_sglang046 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/152

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* Support sglang0.4.6 and fix master_worker import error
* remove disable_mla option

* PullRequest: 155 [FIX] reduce port conflicts

Merge branch sxj/reduce_port_conflict of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/155

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* [FIX] reduce port conflicts

* PullRequest: 153 Fix stuck and recover issues for async experiments

Merge branch fw/stable-async of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/153

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix sample cnt stuck
* fix recover
* code cleanup
* merge all slurm logs into one
* write config to yaml
* .
* .
* .
* revert birth time change
* .
* enlarge sock connect timeout

* PullRequest: 158 [Fix] Fix the error where "accepted" is not defined

Merge branch fw/fix-rollout-accepted of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/158

Reviewed-by: 温差 <xushusheng.xss@antgroup.com>


* .

* PullRequest: 154 Fix unit tests and simplify package installation

Merge branch fw/v0.3.0-tests of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/154?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix some tests
* fix tests except for experiments
* fix tests
* fix tests
* .
* .

* PullRequest: 159 [fix] Enlarge the default aiohttp connection timeout and fix a recover error in model worker

Merge branch fw/stable-async of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/159

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* fix sample cnt stuck
* fix recover
* code cleanup
* merge all slurm logs into one
* write config to yaml
* .
* .
* .
* revert birth time change
* .
* enlarge sock connect timeout
* .

* PullRequest: 160 set sock_connect as rollout_request_timeout in partial_rollout.py

Merge branch xss/rollout_timeout of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/160

Reviewed-by: 博惟 <bowei.fw@antgroup.com>


* set sock_connect as rollout_request_timeout in partial_rollout.py

* PullRequest: 161 Prioritize rollouts that are submitted earlier rather than arrived earlier

Merge branch fw/birth-time of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/161

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* blocking push

* PullRequest: 163 [bugfix] Fix synchronized training when birth time is absent

Merge branch fw/fix-sync-birthtime of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/163

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* PullRequest: 164 [Refactor] Move cluster spec into CLI args

Merge branch fw/refactor-cluster-spec of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/164?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* set cluster spec path in args
* .
* fix
* add default cluster spec

* PullRequest: 165 Normally exit all workers after experiment completion

Merge branch fw/exit-all-workers of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/165

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .
* .

* PullRequest: 167 [Feature] Use chunked logits computation to alleviate SGLang OOM

Merge branch fw/patch-sglang-oom of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/167

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* .

* PullRequest: 166 [Feature] Support single-script experiment launch with Ray

Merge branch fw/turbolaunch of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/166?tab=comment

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* add training script without ray name resolve
* add ray name resolve
* ray worker
* run
* run async
* local run
* set cluster spec path in args
* .
* .
* fix
* .
* .
* .
* .
* .
* update config
* .
* minor renaming

* PullRequest: 169 [Doc] Add v0.3.0 docs based on jupyter-book

Merge branch fw/doc of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/169

Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>


* add docs
* refine doc
* refine doc

---------

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
Co-authored-by: wanghuaijie.whj <wanghuaijie.whj@antgroup.com>
Co-authored-by: Tiwei Bie <tiwei.btw@antgroup.com>
Co-authored-by: kira.gw <kira.gw@antgroup.com>
Co-authored-by: shenxujie.sxj <shenxujie.sxj@antgroup.com>
Co-authored-by: 晓雷 <meizhiyu.mzy@antgroup.com>
Co-authored-by: sam.gjx <sam.gjx@antgroup.com>
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
Co-authored-by: 履渊 <yuhong.gyh@antgroup.com>
2025-05-28 19:18:05 +08:00