晓雷
00455c0db2
.
2025-07-18 10:17:11 +08:00
meizhiyu.mzy
51f7df6310
.
2025-07-18 10:11:17 +08:00
nuzant
0d45f43285
[Bug] Fix a bug in ray training utils. ( #184 )
...
* fix ray schedule
* .
* .
* remove debug info
2025-07-17 16:30:10 +08:00
Richard.Bian
ef8b340ea7
nit: fix typo in blog v0.2 release ( #182 )
...
Nitpick fix for the title in the blog md file AReaL_v0_2.md
2025-07-17 16:29:34 +08:00
meizhiyu.mzy
a0fd720012
remove debug info
2025-07-17 16:18:34 +08:00
meizhiyu.mzy
ad57e6c741
.
2025-07-17 16:17:51 +08:00
meizhiyu.mzy
0660154a8e
.
2025-07-17 15:49:01 +08:00
晓雷
f9253d6d74
fix ray schedule
2025-07-17 15:44:41 +08:00
Wei Fu
b9dbd4a2c1
Update to persistent wechat QR code. ( #159 )
2025-07-09 10:50:19 +08:00
xssstory
17ea7fe94d
fix math reward verifier ( #156 )
...
* PullRequest: 293 fix get_param_realloc_path
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/293
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* fix get_param_realloc_path
* PullRequest: 297 bugfix: reward is always -5
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/297
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* bugfix: reward is always -5
* PullRequest: 321 fix checkpoint save dir
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/321
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* fix checkpoint save dir
* PullRequest: 328 [Doc] update installation
Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/328
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* [Doc] update installation
* PullRequest: 329 bugfix: math verifier blocks the async training
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/329
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* bugfix: math verifier blocks the async training
* format
---------
Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com>
Co-authored-by: garrett4wade <fuwth17@gmail.com>
2025-07-07 15:49:13 +08:00
Wei Fu
5b7c83b5d9
update qrcode ( #153 )
2025-07-07 12:28:21 +08:00
Wei Fu
0ff8c59435
[Fix] Merge error fixes. ( #152 )
...
* PullRequest: 293 fix get_param_realloc_path
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/293
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* fix get_param_realloc_path
* PullRequest: 297 bugfix: reward is always -5
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/297
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* bugfix: reward is always -5
* PullRequest: 321 fix checkpoint save dir
Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/321
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* fix checkpoint save dir
* PullRequest: 328 [Doc] update installation
Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/328
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* [Doc] update installation
---------
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com>
2025-07-07 10:30:27 +08:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
e0aee03109
Fix several syntax warnings related to invalid escape sequences ( #144 )
...
by using raw strings or properly escaping the backslashes.
```log
AReaL/realhf/impl/dataset/math_parser.py:292: SyntaxWarning: invalid escape sequence '\%'
string = string.replace("\%", "")
AReaL/realhf/impl/dataset/math_parser.py:402: SyntaxWarning: invalid escape sequence '\d'
pattern = "-?\d*\.?\d+"
AReaL/realhf/impl/model/parallelism/tensor_parallel/modules.py:1125: SyntaxWarning: invalid escape sequence '\s'
```
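The two remedies named above can be sketched as follows. This is a minimal illustration rather than the actual patch: the function names `strip_percent` and `find_numbers` are hypothetical, and only the two string literals are taken from the warnings quoted in the log.

```python
import re

# Hypothetical helper names; only the two literals come from the warnings above.

def strip_percent(string: str) -> str:
    # Remedy 1: escape the backslash so the literal is exactly "\" followed by "%".
    # Writing "\%" instead relies on an unrecognised escape and triggers the warning.
    return string.replace("\\%", "")

# Remedy 2: a raw string passes the backslashes to the regex engine unchanged.
NUMBER_PATTERN = r"-?\d*\.?\d+"

def find_numbers(text: str) -> list:
    # Return every integer or decimal number found in the text, as strings.
    return re.findall(NUMBER_PATTERN, text)
```

Both forms yield the same runtime string as the original literals; they only stop relying on Python silently keeping an unrecognised escape sequence.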
Signed-off-by: Hollow Man <hollowman@opensuse.org>
2025-07-01 10:12:39 +08:00
Wei Fu
254614df87
Update wechat QR code ( #135 )
2025-06-25 16:29:19 +08:00
Wei Fu
adeb8eb13f
[Fix] Fix yaml configurations for v0.2 experiments. ( #129 )
...
* .
* fix
2025-06-24 13:48:02 +08:00
GurrenLagann97
e3005d57f6
add a preprocessing script for code training data and update readme ( #126 )
...
* add a preprocessing script for code training data and update readme
* add a preprocessing script for code training data and update readme
* add a preprocessing script for code training data and update readme
* fix eval doc
---------
Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
2025-06-24 09:44:15 +08:00
Wei Fu
3642cce2fc
[FIX] Fix the flash-attn version to 2.7.3 ( #114 )
...
* .
* downgrade to 2.7.3
* .
* .
* update ci
* update ci
* update ci
* update CI
* update to tsinghua mirror
* update ci
* .
* .
* .
2025-06-23 15:46:17 +08:00
Wei Fu
1ec1399f19
PullRequest: 252 [Feature] Fix constants initialization. ( #122 )
...
Merge branch fw/gh/fix-init-constants of git@code.alipay.com:inclusionAI/AReaL.git into gh
https://code.alipay.com/inclusionAI/AReaL/pull_requests/252?tab=comment
Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>
* remove LOG_ROOT
* remove MODEL_SAVE_PATH
* remove PARAM_REALLOC_PATH, DATASET_CACHE
* prepare for testing
* prepare for testing
* ready for run
* local run
* tests mainly pass
* format
* amend cluster.py
* fix
2025-06-23 12:52:49 +08:00
Wei Fu
b63eea9d07
Update WeChat QRCode ( #111 )
2025-06-16 19:52:21 +08:00
xichengpro
bb14f022dc
Support using SwanLab for experiment tracking ( #98 )
...
* Support using SwanLab for experiment tracking
* docs: improve WandB and SwanLab integration documentation
- Added official links for better user reference
- Used backticks to quote commands and parameters
- Unified mode settings to use "online" / "cloud" convention
- Merged WandB and SwanLab descriptions into a single concise statement
- Added note on using `swanlab.mode="local"` when server connection is unavailable
* refactor: update default value of api_key
* fix: correct help description from WandB to SwanLab in SwanLabConfig
* refactor: merge log_swanlab_tensorboard and log_wandb_tensorboard into log_swanlab_wandb_tensorboard
- Unified logging logic for SwanLab, WandB, and TensorBoard to reduce code duplication
* chore: update swanlab version in dependency config files
- Updated SwanLab version in pyproject.toml
- Updated SwanLab version in requirements.txt
* refactor: enhance SwanLab config handling for logging purposes
- Config now uses provided arguments first
- Falls back to reading from config.yaml if no input is given
* docs: add note on using `swanlab.mode="local"` when server connection is unavailable
* refactor: merge _LATEST_WANDB_STEP and _LATEST_SWANLAB_STEP into _LATEST_LOG_STEP
* Format code with black and isort
* chore: update swanlab version in dependency config files
- Updated SwanLab version in requirements.txt
* refactor: rename swanlab_wandb_data to log_data
---------
Co-authored-by: dubingnan <dubingnan@360.cn>
2025-06-16 19:51:31 +08:00
GurrenLagann97
f2f4b67bcd
fix readme in evaluation ( #103 )
...
Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
2025-06-10 17:56:08 +08:00
Wei Fu
e7eda16311
fix wandb mode typo ( #100 )
2025-06-10 14:01:24 +08:00
Wei Fu
86625c56e4
[Doc] Add WeChat QR code ( #99 )
...
* update qrcode
* update qrcode
2025-06-10 13:57:20 +08:00
Wei Fu
b3f5392f44
[Bug] Fix the dependency of a virtual environment for sympy==1.12 ( #92 )
...
* change to math local eval
* .
* update docker image tag
2025-06-08 21:11:35 +08:00
Wei Fu
c7d6ccc18e
[Bug] Suppress the SGLang patch error in Ray workers ( #93 )
2025-06-08 20:50:56 +08:00
Wei Fu
54daceb6dc
[Feature] Add multi-turn agent example ( #72 )
...
* update benchmark script
* .
* add benchmark docs
* PullRequest: 178 multi turn math agent training
Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* multi turn math agent training
* training data logging and clean math multi-turn exp
* fix
* .
* format
* .
---------
Co-authored-by: 步偶 <sam.gjx@antgroup.com>
2025-06-08 20:36:52 +08:00
Wei Fu
448bb05a3d
Fix formatting ( #90 )
2025-06-06 21:42:12 +08:00
Ligeng Zhu
de134b4a7a
[Feature] Switch dataset path / model path to HF location to ease community usage ( #82 )
...
* Update .gitignore and modify dataset paths in scripts for improved file management and compatibility with Hugging Face datasets. Additionally, refactor dataset loading functions to utilize load_hf_or_local_file for better flexibility.
* Remove sglang subproject and update dataset path format in load_hf_or_local_file function for compatibility with Hugging Face datasets.
* Refactor imports in grader.py and parser.py to include sympy for improved functionality.
2025-06-06 21:38:06 +08:00
Wei Fu
b768e5ce3c
update readme ( #78 )
2025-06-04 12:02:15 +08:00
samjia2000
ae4b03d95c
[Doc] update blog to discuss staleness ( #76 )
...
* update blog to discuss staleness
* update
* update
---------
Co-authored-by: gjx <sam.gjx@antgroup.com>
2025-06-04 11:48:30 +08:00
Wei Fu
15024d8d32
fix ( #77 )
2025-06-04 11:27:09 +08:00
GurrenLagann97
67dc056dd9
add hf links in readme and blog ( #75 )
...
* add hf links in readme and blog
* fix
* add model link in table
* add model link in table
---------
Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
2025-06-04 10:32:18 +08:00
Wei Fu
d56df5102e
[Doc] Update README. ( #74 )
...
* update benchmark script
* .
* add benchmark docs
* add v0.3.0 configs
* .
* PullRequest: 178 multi turn math agent training
Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* multi turn math agent training
* training data logging and clean math multi-turn exp
* fix
* .
* fix
* change readme
* fix typo
* revert multiturn
---------
Co-authored-by: 步偶 <sam.gjx@antgroup.com>
2025-06-04 00:43:19 +08:00
Wei Fu
2d4d937d10
[Doc] Add doc for reproducing released models ( #73 )
...
* update benchmark script
* .
* add benchmark docs
* add v0.3.0 configs
* .
* PullRequest: 178 multi turn math agent training
Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* multi turn math agent training
* training data logging and clean math multi-turn exp
* fix
* .
* fix
* add docs and config
* format
* revert multi-turn agent
* add config
---------
Co-authored-by: 步偶 <sam.gjx@antgroup.com>
2025-06-03 20:33:48 +08:00
Wei Fu
ad6e5bd3fa
[Doc] Add verl benchmark scripts ( #71 )
...
* update benchmark script
* .
* add benchmark docs
* .
2025-06-03 18:10:32 +08:00
Wei Fu
326dfe3462
[Feature] Update Dockerfile to v0.3.0 ( #70 )
...
* update benchmark script
* .
* add benchmark docs
* update dockerfile
2025-06-03 11:57:56 +08:00
Wei Fu
2e5728a2c1
[Doc] Mark the equivalence between zero-staleness and synchronous PPO ( #69 )
2025-06-02 21:44:35 +08:00
Wei Fu
fabe59aad1
add doc ( #68 )
2025-06-02 21:16:36 +08:00
Wei Fu
ab20f940f9
add async doc ( #67 )
2025-06-02 15:59:50 +08:00
GurrenLagann97
ac7a9c5a89
[Doc] add dataset customization docs ( #66 )
...
* add dataset customization docs
* fix dataset customization docs and change orders
* fix dataset customization docs
---------
Co-authored-by: hcy <hechuyi.hcy@antgroup.com>
Co-authored-by: Wei Fu <36355462+garrett4wade@users.noreply.github.com>
2025-06-02 12:31:34 +08:00
Wei Fu
11a34dfb51
[Doc] Add doc for reproducing benchmark results ( #65 )
...
* update benchmark script
* .
* add benchmark docs
2025-06-01 20:50:40 +08:00
Wei Fu
ce4d7354bf
[Doc] Fix documentation for using Docker containers and customized agents ( #64 )
...
* test env setup
* .
* fix a missing cherry-pick
* .
* .
* .
* update docker instruction
* fix
2025-06-01 16:33:29 +08:00
Wei Fu
afe5a2c880
[Fix] Fix tutorial async_ppo script and doc structure ( #63 )
...
* test env setup
* .
* fix a missing cherry-pick
2025-06-01 15:46:46 +08:00
Wei Fu
4fab3ac769
[Doc & Fix] Simplify the environment setup procedure ( #62 )
...
* PullRequest: 176 [FIX] clear sensitive info
Merge branch fw/fix-sensitive-info of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/176
Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>
* .
* .
* .
* .
* .
* test env setup
* fix
* allow cached model
* .
* revise docs
* change docs
* format docs
* update readme
2025-06-01 14:57:21 +08:00
Wei Fu
d87c898d36
[Feature] Add link to documentation in README ( #61 )
2025-05-30 19:58:58 +08:00
Wei Fu
473eeb2db0
[Feature] Create docs and examples for multi-turn agent RL ( #60 )
...
* PullRequest: 168 Add Codeforces tests and fix other test issues
Merge branch areal-eval-0.3 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/168
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* fix eval and add codeforces elo calc
* fix codeforce test
* fix qwen3 prompt
* change annotations to eng
* add code verify files
* PullRequest: 173 [FIX] format code and fix a recover error in rollout worker
Merge branch fw/fix-rollout-recover of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/173
Reviewed-by: 温差 <xushusheng.xss@antgroup.com>
* format code and fix a recover error in rollout worker
* PullRequest: 171 Update evaluation docs
Merge branch eval-doc of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/171?tab=diff
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* update eval doc
* complete eval doc
* complete eval doc
* fix ood info
* add data obtaining guide
* fix supported datasets
* PullRequest: 174 decouple max_behav_imp_weight and c_clip & track entropy, positve_seq_len and negative_seq_len
Merge branch xss/max_behav_imp_weight of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/174
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* decouple max_behav_imp_weight and c_clip
* rename log: positve_* -> correct_*, negative_* -> incorrect_*
* rename hyper-parameter: max_behav_imp_weight -> behav_imp_weight_cap
* PullRequest: 175 [Fix] Fix the "event loop is already running" error in ray scripts
Merge branch fw/fix-ray-asyncio of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/175
Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>
* format code and fix a recover error in rollout worker
* .
* add docs
* .
---------
Co-authored-by: 乘鹭 <hechuyi.hcy@antgroup.com>
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
2025-05-30 16:26:07 +08:00
Wei Fu
c0200f10d0
[Feature] Support behavior importance weight capping and update evaluation scripts ( #59 )
...
* PullRequest: 168 Add Codeforces tests and fix other test issues
Merge branch areal-eval-0.3 of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/168
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* fix eval and add codeforces elo calc
* fix codeforce test
* fix qwen3 prompt
* change annotations to eng
* add code verify files
* PullRequest: 173 [FIX] format code and fix a recover error in rollout worker
Merge branch fw/fix-rollout-recover of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/173
Reviewed-by: 温差 <xushusheng.xss@antgroup.com>
* format code and fix a recover error in rollout worker
* PullRequest: 171 Update evaluation docs
Merge branch eval-doc of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/171?tab=diff
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* update eval doc
* complete eval doc
* complete eval doc
* fix ood info
* add data obtaining guide
* fix supported datasets
* PullRequest: 174 decouple max_behav_imp_weight and c_clip & track entropy, positve_seq_len and negative_seq_len
Merge branch xss/max_behav_imp_weight of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/174
Reviewed-by: 博惟 <bowei.fw@antgroup.com>
* decouple max_behav_imp_weight and c_clip
* rename log: positve_* -> correct_*, negative_* -> incorrect_*
* rename hyper-parameter: max_behav_imp_weight -> behav_imp_weight_cap
* PullRequest: 175 [Fix] Fix the "event loop is already running" error in ray scripts
Merge branch fw/fix-ray-asyncio of git@code.alipay.com:inclusionAI/AReaL.git into main
https://code.alipay.com/inclusionAI/AReaL/pull_requests/175
Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com>
* format code and fix a recover error in rollout worker
* .
---------
Co-authored-by: 乘鹭 <hechuyi.hcy@antgroup.com>
Co-authored-by: 温差 <xushusheng.xss@antgroup.com>
2025-05-30 10:29:21 +08:00
Wei Fu
0815bb6494
[CI] Try to fix doc CI ( #58 )
...
* .
* .
* .
* .
* .
* .
* .
* .
* .
* .
* .
2025-05-29 12:45:10 +08:00
Wei Fu
b3375607c6
[CI] Fix doc CI again ( #57 )
...
* .
* .
* .
* .
* .
* .
2025-05-29 11:22:27 +08:00
Wei Fu
409fa9842d
[CI] Fix doc CI ( #56 )
...
* .
* .
* .
2025-05-29 11:09:01 +08:00