AReaL

Commit Graph

Author	SHA1	Message	Date
Wei Fu	b9dbd4a2c1	Update to persistent wechat QR code. (#159 )	2025-07-09 10:50:19 +08:00
博惟	57b9b945ab	add readme	2025-07-07 17:46:46 +08:00
garrett4wade	cf0db6a43c	format	2025-07-07 17:02:32 +08:00
garrett4wade	7ab6755379	Merge branch 'lite' of https://github.com/inclusionAI/AReaL into lite	2025-07-07 17:02:21 +08:00
博惟	e251abbbe8	add remote sglang engine	2025-07-07 17:01:35 +08:00
博惟	3b2f43a295	checkout prev impl	2025-07-07 15:53:34 +08:00
博惟	95c315e0b8	checkout previous implementations	2025-07-07 15:52:44 +08:00
博惟	3a0f1e558c	checkout previous impl	2025-07-07 15:51:32 +08:00
xssstory	17ea7fe94d	fix math reward verifier (#156 ) * PullRequest: 293 fix get_param_realloc_path Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/293 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix get_param_realloc_path * PullRequest: 297 bugfix: reward is always -5 Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/297 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: reward is always -5 * PullRequest: 321 fix checkpoint save dir Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/321 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix checkpoint save dir * PullRequest: 328 [Doc] update installation Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/328 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * [Doc] update installation * PullRequest: 329 bugfix: math verifier blocks the async training Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/329 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: math verifier block the async training * format --------- Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com> Co-authored-by: garrett4wade <fuwth17@gmail.com>	2025-07-07 15:49:13 +08:00
博惟	6710d5f275	Merge branch 'lite' of https://code.alipay.com/inclusionAI/AReaL into lite	2025-07-07 15:35:02 +08:00
garrett4wade	b6e19dbf60	format	2025-07-07 14:02:59 +08:00
博惟	28c9479981	add tests template	2025-07-07 13:51:23 +08:00
博惟	1006be8a01	add directory structure	2025-07-07 13:49:54 +08:00
博惟	1dfe91c470	add api	2025-07-07 13:47:41 +08:00
Wei Fu	5b7c83b5d9	update qrcode (#153 )	2025-07-07 12:28:21 +08:00
Wei Fu	0ff8c59435	[Fix] Merge error fixes. (#152 ) * PullRequest: 293 fix get_param_realloc_path Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/293 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix get_param_realloc_path * PullRequest: 297 bugfix: reward is always -5 Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/297 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: reward is always -5 * PullRequest: 321 fix checkpoint save dir Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/321 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix checkpoint save dir * PullRequest: 328 [Doc] update installation Merge branch sxj/doc of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/328 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * [Doc] update installation --------- Co-authored-by: 温差 <xushusheng.xss@antgroup.com> Co-authored-by: 冰临 <shenxujie.sxj@antgroup.com>	2025-07-07 10:30:27 +08:00
温差	ed6bcd66a7	PullRequest: 321 fix checkpoint save dir Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/321 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix checkpoint save dir	2025-07-03 10:03:33 +08:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟	e0aee03109	Fix several syntax warning related to invalid escape sequence (#144 ) by using raw strings or properly escaping the backslashes. ```log AReaL/realhf/impl/dataset/math_parser.py:292: SyntaxWarning: invalid escape sequence '\%' string = string.replace("\%", "") AReaL/realhf/impl/dataset/math_parser.py:402: SyntaxWarning: invalid escape sequence '\d' pattern = "-?\d*\.?\d+" AReaL/realhf/impl/model/parallelism/tensor_parallel/modules.py:1125: SyntaxWarning: invalid escape sequence '\s' ``` Signed-off-by: Hollow Man <hollowman@opensuse.org>	2025-07-01 10:12:39 +08:00
温差	623f7c7407	PullRequest: 297 bugfix: reward is always -5 Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/297 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * bugfix: reward is always -5	2025-06-27 13:06:37 +08:00
温差	a5cabddcea	PullRequest: 293 fix get_param_realloc_path Merge branch xss/debug of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/293 Reviewed-by: 博惟 <bowei.fw@antgroup.com> * fix get_param_realloc_path	2025-06-26 21:26:56 +08:00
Wei Fu	254614df87	Update wechat QR code (#135 )	2025-06-25 16:29:19 +08:00
Wei Fu	adeb8eb13f	[Fix] Fix yaml configurations for v0.2 experiments. (#129 ) * . * fix	2025-06-24 13:48:02 +08:00
GurrenLagann97	e3005d57f6	add a preprocessing script for code training data and update readme (#126 ) * add a preprocessing script for code training data and update readme * add a preprocessing script for code training data and update readme * add a preprocessing script for code training data and update readme * fix eval doc --------- Co-authored-by: hcy <hechuyi.hcy@antgroup.com>	2025-06-24 09:44:15 +08:00
Wei Fu	3642cce2fc	[FIX] Fix the flash-attn version to 2.7.3 (#114 ) * . * degenerate to 2.7.3 * . * . * update ci * update ci * update ci * update CI * update to tsinghua mirror * update ci * . * . * .	2025-06-23 15:46:17 +08:00
Wei Fu	1ec1399f19	PullRequest: 252 [Feature] Fix constants initialization. (#122 ) Merge branch fw/gh/fix-init-constants of git@code.alipay.com:inclusionAI/AReaL.git into gh https://code.alipay.com/inclusionAI/AReaL/pull_requests/252?tab=comment Reviewed-by: 晓雷 <meizhiyu.mzy@antgroup.com> * remove LOG_ROOT * remove MODEL_SAVE_PATH * remove PARAM_REALLOC_PATH, DATASET_CACHE * prepare for testing * prepare for testing * ready for run * local run * tests mainly pass * format * amend cluster.py * fix	2025-06-23 12:52:49 +08:00
Wei Fu	b63eea9d07	Update WeChat QRCode (#111 )	2025-06-16 19:52:21 +08:00
xichengpro	bb14f022dc	Support using SwanLab for experiment tracking (#98 ) * Support using SwanLab for experiment tracking * docs: improve WandB and SwanLab integration documentation - Added official links for better user reference - Used backticks to quote commands and parameters - Unified mode settings to use "online" / "cloud" convention - Merged WandB and SwanLab descriptions into a single concise statement - Added note on using `swanlab.mode="local"` when server connection is unavailable * refactor: update default value of api_key * fix: correct help description from WandB to SwanLab in SwanLabConfig * refactor: merge log_swanlab_tensorboard and log_wandb_tensorboard into log_swanlab_wandb_tensorboard - Unified logging logic for SwanLab, WandB, and TensorBoard to reduce code duplication * chore: update swanlab version in dependency config files - Updated SwanLab version in pyproject.toml - Updated SwanLab version in requirements.txt * refactor: enhance SwanLab config handling for logging purposes - Config now uses provided arguments first - Falls back to reading from config.yaml if no input is given * docs: add note on using when server connection is unavailable * refactor: merge _LATEST_WANDB_STEP and _LATEST_SWANLAB_STEP into _LATEST_LOG_STEP * Format code with black and isort * chore: update swanlab version in dependency config files - Updated SwanLab version in requirements.txt * refactor: rename swanlab_wandb_data to log_data --------- Co-authored-by: dubingnan <dubingnan@360.cn>	2025-06-16 19:51:31 +08:00
GurrenLagann97	f2f4b67bcd	fix readme in evaluation (#103 ) Co-authored-by: hcy <hechuyi.hcy@antgroup.com>	2025-06-10 17:56:08 +08:00
Wei Fu	e7eda16311	fix wandb mode typo (#100 )	2025-06-10 14:01:24 +08:00
Wei Fu	86625c56e4	[Doc] Add WeChat QR code (#99 ) * update qrcode * update qrcode	2025-06-10 13:57:20 +08:00
Wei Fu	b3f5392f44	[Bug] Fix the dependency of a virtual environment for sympy==1.12 (#92 ) * change to math local eval * . * update docker image tag	2025-06-08 21:11:35 +08:00
Wei Fu	c7d6ccc18e	[Bug] Suppress the SGLang patch error in Ray workers (#93 )	2025-06-08 20:50:56 +08:00
Wei Fu	54daceb6dc	[Feature] Add multi-turn agent example (#72 ) * update benchmark script * . * add benchmark docs * PullRequest: 178 multi turn math agent training Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * multi turn math agent training * training data logging and clean math multi-turn exp * fix * . * format * . --------- Co-authored-by: 步偶 <sam.gjx@antgroup.com>	2025-06-08 20:36:52 +08:00
Wei Fu	448bb05a3d	Fix formatting (#90 )	2025-06-06 21:42:12 +08:00
Ligeng Zhu	de134b4a7a	[Feature] Switch dataset path / model path to HF location to ease community usage (#82 ) * Update .gitignore and modify dataset paths in scripts for improved file management and compatibility with Hugging Face datasets. Additionally, refactor dataset loading functions to utilize load_hf_or_local_file for better flexibility. * Remove sglang subproject and update dataset path format in load_hf_or_local_file function for compatibility with Hugging Face datasets. * Refactor imports in grader.py and parser.py to include sympy for improved functionality.	2025-06-06 21:38:06 +08:00
Wei Fu	b768e5ce3c	update readme (#78 )	2025-06-04 12:02:15 +08:00
samjia2000	ae4b03d95c	[Doc] update blog to discuss staleness (#76 ) * update blog to discuss staleness * update * update --------- Co-authored-by: gjx <sam.gjx@antgroup.com>	2025-06-04 11:48:30 +08:00
Wei Fu	15024d8d32	fix (#77 )	2025-06-04 11:27:09 +08:00
GurrenLagann97	67dc056dd9	add hf links in readme and blog (#75 ) * add hf links in readme and blog * fix * add model link in table * add model link in table --------- Co-authored-by: hcy <hechuyi.hcy@antgroup.com>	2025-06-04 10:32:18 +08:00
Wei Fu	d56df5102e	[Doc] Update README. (#74 ) * update benchmark script * . * add benchmark docs * add v0.3.0 configs * . * PullRequest: 178 multi turn math agent training Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * multi turn math agent training * training data logging and clean math multi-turn exp * fix * . * fix * change readme * fix typo * revert multiturn --------- Co-authored-by: 步偶 <sam.gjx@antgroup.com>	2025-06-04 00:43:19 +08:00
Wei Fu	2d4d937d10	[Doc] Add doc for reproducing released models (#73 ) * update benchmark script * . * add benchmark docs * add v0.3.0 configs * . * PullRequest: 178 multi turn math agent training Merge branch gjx/multi-turn-math of git@code.alipay.com:inclusionAI/AReaL.git into main https://code.alipay.com/inclusionAI/AReaL/pull_requests/178?tab=diff Reviewed-by: 博惟 <bowei.fw@antgroup.com> * multi turn math agent training * training data logging and clean math multi-turn exp * fix * . * fix * add docs and config * format * revert multi-turn agent * add config --------- Co-authored-by: 步偶 <sam.gjx@antgroup.com>	2025-06-03 20:33:48 +08:00
Wei Fu	ad6e5bd3fa	[Doc] Add verl benchmark scripts (#71 ) * update benchmark script * . * add benchmark docs * .	2025-06-03 18:10:32 +08:00
Wei Fu	326dfe3462	[Feature] Update Dockerfile to v0.3.0 (#70 ) * update benchmark script * . * add benchmark docs * update dockerfile	2025-06-03 11:57:56 +08:00
Wei Fu	2e5728a2c1	[Doc] Mark the equivalent between zero-staleness and synchronous PPO (#69 )	2025-06-02 21:44:35 +08:00
Wei Fu	fabe59aad1	add doc (#68 )	2025-06-02 21:16:36 +08:00
Wei Fu	ab20f940f9	add async doc (#67 )	2025-06-02 15:59:50 +08:00
GurrenLagann97	ac7a9c5a89	[Doc] add dataset customization docs (#66 ) * add dataset customization docs * fix dataset customization docs and change orders * fix dataset customization docs --------- Co-authored-by: hcy <hechuyi.hcy@antgroup.com> Co-authored-by: Wei Fu <36355462+garrett4wade@users.noreply.github.com>	2025-06-02 12:31:34 +08:00
Wei Fu	11a34dfb51	[Doc] Add doc for reproducing benchmark results (#65 ) * update benchmark script * . * add benchmark docs	2025-06-01 20:50:40 +08:00
Wei Fu	ce4d7354bf	[Doc] Fix documentation for using Docker containers and customized agents (#64 ) * test env setup * . * fix a missing cherry-pick * . * . * . * update docker instrcution * fix	2025-06-01 16:33:29 +08:00
Wei Fu	afe5a2c880	[Fix] Fix tutorial async_ppo script and doc structure (#63 ) * test env setup * . * fix a missing cherry-pick	2025-06-01 15:46:46 +08:00

255 Commits, All Branches
