BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//PyTorch - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-ORIGINAL-URL:https://pytorch.org
X-WR-CALDESC:Events for PyTorch
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/Los_Angeles
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20240310T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20241103T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20250309T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20251102T020000
END:STANDARD
BEGIN:DAYLIGHT
TZOFFSETFROM:-0800
TZOFFSETTO:-0700
TZNAME:PDT
DTSTART:20260308T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0700
TZOFFSETTO:-0800
TZNAME:PST
DTSTART:20261101T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/Los_Angeles:20250806T110000
DTEND;TZID=America/Los_Angeles:20250806T120000
DTSTAMP:20260427T013032
CREATED:20250707T201254Z
LAST-MODIFIED:20250811T204034Z
UID:10000040-1754478000-1754481600@pytorch.org
SUMMARY:verl: Flexible and Scalable Reinforcement Learning Library for LLM Reasoning and Tool-Calling
DESCRIPTION:Speaker: Haibin Lin\n\nverl is a flexible and efficient framework for building end-to-end reinforcement learning pipelines for LLMs. It provides a user-friendly hybrid-controller programming model\, supporting various algorithms such as PPO/GRPO/DAPO with effortless scaling. Recent trends in reasoning models bring new challenges to RL infrastructure\, such as efficient tool calling\, multi-turn interactions\, and the capability to scale up to giant MoE models like DeepSeek. To lower the barrier to RL for advanced reasoning and tool calling\, we improve verl with (1) efficient request-level async multi-turn rollout and tool calling\, (2) integration with expert parallelism for large-scale MoE models\, and (3) an async system architecture for off-policy / async RL algorithms and flexible device placement.\n\nHaibin Lin works on LLM infrastructure at ByteDance Seed\, focusing on optimizing training performance for LLMs and multimodal understanding and generation models on large-scale clusters\, from pre-training to post-training. Before joining ByteDance\, he worked on Apache MXNet (training\, inference\, runtime\, and recipes like gluon-nlp).
URL:https://pytorch.org/event/verl-flexible-and-scalable-reinforcement-learning-library-for-llm-reasoning-and-tool-calling/
CATEGORIES:PyTorch-hosted
ATTACH;FMTTYPE=image/png:https://pytorch.org/wp-content/uploads/2025/07/Haibin-Lin.png
END:VEVENT
END:VCALENDAR