AReaL

Commit Graph

Author	SHA1	Message	Date
晓雷	00455c0db2	.	2025-07-18 10:17:11 +08:00
meizhiyu.mzy	51f7df6310	.	2025-07-18 10:11:17 +08:00
nuzant	0d45f43285	[Bug] Fix a bug in ray training utils. (#184 ) * fix ray schedule * . * . * remove debug info	2025-07-17 16:30:10 +08:00
Richard.Bian	ef8b340ea7	nit: fix typo in blog v0.2 release (#182 ) Nitpick fix for the title in the blog md file AReaL_v0_2.md	2025-07-17 16:29:34 +08:00
meizhiyu.mzy	a0fd720012	remove debug info	2025-07-17 16:18:34 +08:00
meizhiyu.mzy	ad57e6c741	.	2025-07-17 16:17:51 +08:00
meizhiyu.mzy	0660154a8e	.	2025-07-17 15:49:01 +08:00
晓雷	f9253d6d74	fix ray schedule	2025-07-17 15:44:41 +08:00
Wei Fu	b9dbd4a2c1	Update to persistent wechat QR code. (#159 )	2025-07-09 10:50:19 +08:00
xssstory	17ea7fe94d	fix math reward verifier (#156 ) * PullRequest: 293 fix get_param_realloc_path Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/293 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix get_param_realloc_path * PullRequest: 297 bugfix: reward is always -5 Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/297 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: reward is always -5 * PullRequest: 321 fix checkpoint save dir Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/321 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix checkpoint save dir * PullRequest: 328 [Doc] update installation Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/328 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * [Doc] update installation * PullRequest: 329 bugfix: math verifier blocks the async training Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/329 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: math verifier block the async training * format --------- Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com> Co-authored-by: garrett4wade <fuwth17@gmail.com>	2025-07-07 15:49:13 +08:00
Wei Fu	5b7c83b5d9	update qrcode (#153 )	2025-07-07 12:28:21 +08:00
Wei Fu	0ff8c59435	[Fix] Merge error fixes. (#152 ) * PullRequest: 293 fix get_param_realloc_path Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/293 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix get_param_realloc_path * PullRequest: 297 bugfix: reward is always -5 Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/297 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: reward is always -5 * PullRequest: 321 fix checkpoint save dir Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/321 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix checkpoint save dir * PullRequest: 328 [Doc] update installation Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/328 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * [Doc] update installation --------- Co-authored-by: 温差 <xushusheng.xss@antgroup.com> Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com>	2025-07-07 10:30:27 +08:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟	e0aee03109	Fix several syntax warning related to invalid escape sequence (#144 ) by using raw strings or properly escaping the backslashes. ```log AReaL/realhf/impl/dataset/math_parser.py:292: SyntaxWarning: invalid escape sequence '\%' string = string.replace("\%", "") AReaL/realhf/impl/dataset/math_parser.py:402: SyntaxWarning: invalid escape sequence '\d' pattern = "-?\d*\.?\d+" AReaL/realhf/impl/model/parallelism/tensor_parallel/modules.py:1125: SyntaxWarning: invalid escape sequence '\s' ``` Signed-off-by: Hollow Man <hollowman@opensuse.org>	2025-07-01 10:12:39 +08:00
Wei Fu	254614df87	Update wechat QR code (#135 )	2025-06-25 16:29:19 +08:00
Wei Fu	adeb8eb13f	[Fix] Fix yaml configurations for v0.2 experiments. (#129 ) * . * fix	2025-06-24 13:48:02 +08:00
GurrenLagann97	e3005d57f6	add a preprocessing script for code training data and update readme (#126 ) * add a preprocessing script for code training data and update readme * add a preprocessing script for code training data and update readme * add a preprocessing script for code training data and update readme * fix eval doc --------- Co-authored-by: hcy <hechuyi.hcy@antgroup.com>	2025-06-24 09:44:15 +08:00
Wei Fu	3642cce2fc	[FIX] Fix the flash-attn version to 2.7.3 (#114 ) * . * degenerate to 2.7.3 * . * . * update ci * update ci * update ci * update CI * update to tsinghua mirror * update ci * . * . * .	2025-06-23 15:46:17 +08:00
Wei Fu	1ec1399f19	PullRequest: 252 [Feature] Fix constants initialization. (#122 ) Merge branch fw/gh/fix-init-constants of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/252?tab=comment Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com> * remove LOG_ROOT * remove MODEL_SAVE_PATH * remove PARAM_REALLOC_PATH, DATASET_CACHE * prepare for testing * prepare for testing * ready for run * local run * tests mainly pass * format * amend cluster.py * fix	2025-06-23 12:52:49 +08:00
Wei Fu	b63eea9d07	Update WeChat QRCode (#111 )	2025-06-16 19:52:21 +08:00
xichengpro	bb14f022dc	Support using SwanLab for experiment tracking (#98 ) * Support using SwanLab for experiment tracking * docs: improve WandB and SwanLab integration documentation - Added official links for better user reference - Used backticks to quote commands and parameters - Unified mode settings to use "online" / "cloud" convention - Merged WandB and SwanLab descriptions into a single concise statement - Added note on using `swanlab.mode="local"` when server connection is unavailable * refactor: update default value of api_key * fix: correct help description from WandB to SwanLab in SwanLabConfig * refactor: merge log_swanlab_tensorboard and log_wandb_tensorboard into log_swanlab_wandb_tensorboard - Unified logging logic for SwanLab, WandB, and TensorBoard to reduce code duplication * chore: update swanlab version in dependency config files - Updated SwanLab version in pyproject.toml - Updated SwanLab version in requirements.txt * refactor: enhance SwanLab config handling for logging purposes - Config now uses provided arguments first - Falls back to reading from config.yaml if no input is given * docs: add note on using when server connection is unavailable * refactor: merge _LATEST_WANDB_STEP and _LATEST_SWANLAB_STEP into _LATEST_LOG_STEP * Format code with black and isort * chore: update swanlab version in dependency config files - Updated SwanLab version in requirements.txt * refactor: rename swanlab_wandb_data to log_data --------- Co-authored-by: dubingnan <dubingnan@360.cn>	2025-06-16 19:51:31 +08:00
GurrenLagann97	f2f4b67bcd	fix readme in evaluation (#103 ) Co-authored-by: hcy <hechuyi.hcy@antgroup.com>	2025-06-10 17:56:08 +08:00
Wei Fu	e7eda16311	fix wandb mode typo (#100 )	2025-06-10 14:01:24 +08:00
Wei Fu	86625c56e4	[Doc] Add WeChat QR code (#99 ) * update qrcode * update qrcode	2025-06-10 13:57:20 +08:00
Wei Fu	b3f5392f44	[Bug] Fix the dependency of a virtual environment for sympy==1.12 (#92 ) * change to math local eval * . * update docker image tag	2025-06-08 21:11:35 +08:00
Wei Fu	c7d6ccc18e	[Bug] Suppress the SGLang patch error in Ray workers (#93 )	2025-06-08 20:50:56 +08:00
Wei Fu	54daceb6dc	[Feature] Add multi-turn agent example (#72 ) * update benchmark script * . * add benchmark docs * PullRequest: 178 multi turn math agent training Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * multi turn math agent training * training data logging and clean math multi-turn exp * fix * . * format * . --------- Co-authored-by: 步偶 <sam.gjx@antgroup.com>	2025-06-08 20:36:52 +08:00
Wei Fu	448bb05a3d	Fix formatting (#90 )	2025-06-06 21:42:12 +08:00
Ligeng Zhu	de134b4a7a	[Feature] Switch dataset path / model path to HF location to ease community usage (#82 ) * Update .gitignore and modify dataset paths in scripts for improved file management and compatibility with Hugging Face datasets. Additionally, refactor dataset loading functions to utilize load_hf_or_local_file for better flexibility. * Remove sglang subproject and update dataset path format in load_hf_or_local_file function for compatibility with Hugging Face datasets. * Refactor imports in grader.py and parser.py to include sympy for improved functionality.	2025-06-06 21:38:06 +08:00
Wei Fu	b768e5ce3c	update readme (#78 )	2025-06-04 12:02:15 +08:00
samjia2000	ae4b03d95c	[Doc] update blog to discuss staleness (#76 ) * update blog to discuss staleness * update * update --------- Co-authored-by: gjx <sam.gjx@antgroup.com>	2025-06-04 11:48:30 +08:00
Wei Fu	15024d8d32	fix (#77 )	2025-06-04 11:27:09 +08:00
GurrenLagann97	67dc056dd9	add hf links in readme and blog (#75 ) * add hf links in readme and blog * fix * add model link in table * add model link in table --------- Co-authored-by: hcy <hechuyi.hcy@antgroup.com>	2025-06-04 10:32:18 +08:00
Wei Fu	d56df5102e	[Doc] Update README. (#74 ) * update benchmark script * . * add benchmark docs * add v0.3.0 configs * . * PullRequest: 178 multi turn math agent training Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * multi turn math agent training * training data logging and clean math multi-turn exp * fix * . * fix * change readme * fix typo * revert multiturn --------- Co-authored-by: 步偶 <sam.gjx@antgroup.com>	2025-06-04 00:43:19 +08:00
Wei Fu	2d4d937d10	[Doc] Add doc for reproducing released models (#73 ) * update benchmark script * . * add benchmark docs * add v0.3.0 configs * . * PullRequest: 178 multi turn math agent training Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * multi turn math agent training * training data logging and clean math multi-turn exp * fix * . * fix * add docs and config * format * revert multi-turn agent * add config --------- Co-authored-by: 步偶 <sam.gjx@antgroup.com>	2025-06-03 20:33:48 +08:00
Wei Fu	ad6e5bd3fa	[Doc] Add verl benchmark scripts (#71 ) * update benchmark script * . * add benchmark docs * .	2025-06-03 18:10:32 +08:00
Wei Fu	326dfe3462	[Feature] Update Dockerfile to v0.3.0 (#70 ) * update benchmark script * . * add benchmark docs * update dockerfile	2025-06-03 11:57:56 +08:00
Wei Fu	2e5728a2c1	[Doc] Mark the equivalent between zero-staleness and synchronous PPO (#69 )	2025-06-02 21:44:35 +08:00
Wei Fu	fabe59aad1	add doc (#68 )	2025-06-02 21:16:36 +08:00
Wei Fu	ab20f940f9	add async doc (#67 )	2025-06-02 15:59:50 +08:00
GurrenLagann97	ac7a9c5a89	[Doc] add dataset customization docs (#66 ) * add dataset customization docs * fix dataset customization docs and change orders * fix dataset customization docs --------- Co-authored-by: hcy <hechuyi.hcy@antgroup.com> Co-authored-by: Wei Fu <36355462+garrett4wade@users.noreply.github.com>	2025-06-02 12:31:34 +08:00
Wei Fu	11a34dfb51	[Doc] Add doc for reproducing benchmark results (#65 ) * update benchmark script * . * add benchmark docs	2025-06-01 20:50:40 +08:00
Wei Fu	ce4d7354bf	[Doc] Fix documentation for using Docker containers and customized agents (#64 ) * test env setup * . * fix a missing cherry-pick * . * . * . * update docker instrcution * fix	2025-06-01 16:33:29 +08:00
Wei Fu	afe5a2c880	[Fix] Fix tutorial async_ppo script and doc structure (#63 ) * test env setup * . * fix a missing cherry-pick	2025-06-01 15:46:46 +08:00
Wei Fu	4fab3ac769	[Doc & Fix] Simplify the environment setup procedure (#62 ) * PullRequest: 176 [FIX] clear sensitive info Merge branch fw/fix-sensitive-info of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/176 Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com> * . * . * . * . * . * test env setup * fix * allow cached model * . * revise docs * change docs * format docs * update readme	2025-06-01 14:57:21 +08:00
Wei Fu	d87c898d36	[Feature] Add link to documentation in README (#61 )	2025-05-30 19:58:58 +08:00
Wei Fu	473eeb2db0	[Feature] Create docs and examples for multi-turn agent RL (#60 ) * PullRequest: 168 Add Codeforces tests and fix other test issues Merge branch areal-eval-0.3 of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/168 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix eval and add codeforces elo calc * fix codeforce test * fix qwen3 prompt * change annotations to eng * add code verify files * PullRequest: 173 [FIX} format code and fix a recover error in rollout worker Merge branch fw/fix-rollout-recover of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/173 Reviewed-by: 温差 <xushusheng.xss@antgroup.com> * format code and fix a recover error in rollout worker * PullRequest: 171 Update evaluation docs Merge branch eval-doc of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/171?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * update eval doc * complete eval doc * complete eval doc * fix ood info * add data obtaining guide * fix supported datasets * PullRequest: 174 decouple max_behav_imp_weight and c_clip & track entropy, positve_seq_len and negative_seq_len Merge branch xss/max_behav_imp_weight of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/174 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * decouple max_behav_imp_weight and c_clip * rename log: positve_* -> correct_, negative_ -> incorrect_* * rename hyper-parameter: max_behav_imp_weight -> behav_imp_weight_cap * PullRequest: 175 [Fix] Fix the "event loop is already running" error in ray scripts Merge branch fw/fix-ray-asyncio of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/175 Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com> * format code and fix a recover error in rollout worker * . * add docs * . --------- Co-authored-by: 乘鹭 <hechuyi.hcy@antgroup.com> Co-authored-by: 温差 <xushusheng.xss@antgroup.com>	2025-05-30 16:26:07 +08:00
Wei Fu	c0200f10d0	[Feature] Support behavior importance weight capping and update evaluation scripts (#59 ) * PullRequest: 168 Add Codeforces tests and fix other test issues Merge branch areal-eval-0.3 of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/168 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix eval and add codeforces elo calc * fix codeforce test * fix qwen3 prompt * change annotations to eng * add code verify files * PullRequest: 173 [FIX} format code and fix a recover error in rollout worker Merge branch fw/fix-rollout-recover of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/173 Reviewed-by: 温差 <xushusheng.xss@antgroup.com> * format code and fix a recover error in rollout worker * PullRequest: 171 Update evaluation docs Merge branch eval-doc of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/171?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * update eval doc * complete eval doc * complete eval doc * fix ood info * add data obtaining guide * fix supported datasets * PullRequest: 174 decouple max_behav_imp_weight and c_clip & track entropy, positve_seq_len and negative_seq_len Merge branch xss/max_behav_imp_weight of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/174 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * decouple max_behav_imp_weight and c_clip * rename log: positve_* -> correct_, negative_ -> incorrect_* * rename hyper-parameter: max_behav_imp_weight -> behav_imp_weight_cap * PullRequest: 175 [Fix] Fix the "event loop is already running" error in ray scripts Merge branch fw/fix-ray-asyncio of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/175 Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com> * format code and fix a recover error in rollout worker * . --------- Co-authored-by: 乘鹭 <hechuyi.hcy@antgroup.com> Co-authored-by: 温差 <xushusheng.xss@antgroup.com>	2025-05-30 10:29:21 +08:00
Wei Fu	0815bb6494	[CI] Try to fix doc CI (#58 ) * . * . * . * . * . * . * . * . * . * . * .	2025-05-29 12:45:10 +08:00
Wei Fu	b3375607c6	[CI] Fix doc CI again (#57 ) * . * . * . * . * . * .	2025-05-29 11:22:27 +08:00
Wei Fu	409fa9842d	[CI] Fix doc CI (#56 ) * . * . * .	2025-05-29 11:09:01 +08:00

198 Commits, All Branches