CUDAGraph
- class torch.cuda.CUDAGraph[source]
Wrapper around a CUDA graph.
Warning
This API is in beta and may change in future releases.
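For orientation, a minimal capture-and-replay sketch using the torch.cuda.graph context manager (the shapes and the workload are illustrative):

```python
import torch

g = torch.cuda.CUDAGraph()
static_input = torch.zeros(8, device="cuda")

# Warm up on a side stream before capturing (standard CUDA graphs practice).
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    static_output = static_input * 2
torch.cuda.current_stream().wait_stream(s)

# torch.cuda.graph calls capture_begin/capture_end on this CUDAGraph internally.
with torch.cuda.graph(g):
    static_output = static_input * 2

# Refill the static input, then replay the captured work.
static_input.copy_(torch.arange(8, dtype=torch.float32, device="cuda"))
g.replay()
torch.cuda.synchronize()
```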
- capture_begin(pool=None, capture_error_mode='global')[source]
Begin capturing CUDA work on the current stream.
Typically, you shouldn't call capture_begin yourself. Use graph or make_graphed_callables(), which call capture_begin internally.
- Parameters
pool (optional) – Token (returned by graph_pool_handle() or other_Graph_instance.pool()) that hints this graph may share memory with the indicated pool. See Graph memory management. A short sketch follows this entry.
capture_error_mode (str, optional) – Specifies the cudaStreamCaptureMode for the graph capture stream. Can be "global", "thread_local", or "relaxed". During CUDA graph capture, some actions, such as cudaMalloc, may be unsafe. "global" will error on actions in other threads, "thread_local" will only error on actions in the current thread, and "relaxed" will not error on these actions. Do NOT change this setting unless you're familiar with cudaStreamCaptureMode.
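As a sketch of the pool hint, two captures can point at the same token from graph_pool_handle(); here the pool argument is passed through torch.cuda.graph, which forwards it to capture_begin (warmup is omitted for brevity):

```python
import torch

pool = torch.cuda.graph_pool_handle()
x = torch.zeros(4, device="cuda")

g1 = torch.cuda.CUDAGraph()
g2 = torch.cuda.CUDAGraph()

# Both captures hint that they may share the same memory pool.
with torch.cuda.graph(g1, pool=pool):
    y1 = x + 1
with torch.cuda.graph(g2, pool=pool):
    y2 = x * 2
```

Per the Graph memory management notes, graphs that share a pool are generally expected to be replayed in the same order they were captured.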
- capture_end()[source]
End CUDA graph capture on the current stream.
After capture_end, replay may be called on this instance.
Typically, you shouldn't call capture_end yourself. Use graph or make_graphed_callables(), which call capture_end internally.
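For completeness, a hedged sketch of the low-level path these docs recommend against (torch.cuda.graph wraps this pattern); capture must run on a non-default stream:

```python
import torch

g = torch.cuda.CUDAGraph()
static_input = torch.zeros(8, device="cuda")

# Warm up on a side stream, then make sure all prior work has finished.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    static_output = static_input * 2
torch.cuda.current_stream().wait_stream(s)
torch.cuda.synchronize()

# Capture on the side stream; the default stream cannot be captured.
with torch.cuda.stream(s):
    g.capture_begin()
    static_output = static_input * 2
    g.capture_end()

# After capture_end, replay may be called.
g.replay()
```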
- debug_dump(debug_path)[source]
Call a debugging function to dump the graph if debugging is enabled via CUDAGraph.enable_debug_mode().
- Parameters
debug_path (required) – Path to dump the graph to.
- enable_debug_mode()[source]
Enable debugging mode for CUDAGraph.debug_dump.
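A sketch of the debug workflow, assuming debug mode must be enabled before capture and that the dump is a DOT-format description of the graph (the filename is illustrative):

```python
import torch

g = torch.cuda.CUDAGraph()
x = torch.zeros(4, device="cuda")

# Enable debug mode before capturing (assumed prerequisite for debug_dump).
g.enable_debug_mode()

with torch.cuda.graph(g):
    y = x + 1

# Dump the captured graph to the given path (assumed DOT output).
g.debug_dump("cuda_graph.dot")
```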
- pool()[source]
Return an opaque token representing the id of this graph’s memory pool.
This id can optionally be passed to another graph's capture_begin, which hints that the other graph may share the same memory pool.
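A brief sketch of passing this token to a second capture, again via torch.cuda.graph, which forwards pool to capture_begin:

```python
import torch

x = torch.zeros(4, device="cuda")

g1 = torch.cuda.CUDAGraph()
with torch.cuda.graph(g1):
    y1 = x + 1

# Hint that g2 may share g1's memory pool.
g2 = torch.cuda.CUDAGraph()
with torch.cuda.graph(g2, pool=g1.pool()):
    y2 = y1 * 2
```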
- replay()[source]
Replay the CUDA work captured by this graph.
- reset()[source]
Delete the graph currently held by this instance.