CUDAGraph
- class torch.cuda.CUDAGraph[source]
Wrapper around a CUDA graph.
Warning
This API is in beta and may change in future releases.
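For orientation, a minimal capture-and-replay sketch using the torch.cuda.graph context manager (the shapes and the workload are illustrative):

```python
import torch

g = torch.cuda.CUDAGraph()
static_input = torch.zeros(8, device="cuda")

# Warm up on a side stream before capturing (standard CUDA graphs practice).
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    static_output = static_input * 2
torch.cuda.current_stream().wait_stream(s)

# torch.cuda.graph calls capture_begin/capture_end on this CUDAGraph internally.
with torch.cuda.graph(g):
    static_output = static_input * 2

# Refill the static input, then replay the captured work.
static_input.copy_(torch.arange(8, dtype=torch.float32, device="cuda"))
g.replay()
torch.cuda.synchronize()
```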
- capture_begin(pool=None, capture_error_mode='global')[source]
Begin capturing CUDA work on the current stream.
Typically, you shouldn't call capture_begin yourself. Use graph or make_graphed_callables(), which call capture_begin internally.
- Parameters
pool (optional) – Token (returned by graph_pool_handle() or other_Graph_instance.pool()) that hints this graph may share memory with the indicated pool. See Graph memory management. A short sketch follows this entry.
capture_error_mode (str, optional) – Specifies the cudaStreamCaptureMode for the graph capture stream. Can be "global", "thread_local", or "relaxed". During CUDA graph capture, some actions, such as cudaMalloc, may be unsafe. "global" will error on actions in other threads, "thread_local" will only error on actions in the current thread, and "relaxed" will not error on these actions. Do NOT change this setting unless you're familiar with cudaStreamCaptureMode.
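As a sketch of the pool hint, two captures can point at the same token from graph_pool_handle(); here the pool argument is passed through torch.cuda.graph, which forwards it to capture_begin (warmup is omitted for brevity):

```python
import torch

pool = torch.cuda.graph_pool_handle()
x = torch.zeros(4, device="cuda")

g1 = torch.cuda.CUDAGraph()
g2 = torch.cuda.CUDAGraph()

# Both captures hint that they may share the same memory pool.
with torch.cuda.graph(g1, pool=pool):
    y1 = x + 1
with torch.cuda.graph(g2, pool=pool):
    y2 = x * 2
```

Per the Graph memory management notes, graphs that share a pool are generally expected to be replayed in the same order they were captured.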
- capture_end()[source]
End CUDA graph capture on the current stream.
After capture_end, replay may be called on this instance.
Typically, you shouldn't call capture_end yourself. Use graph or make_graphed_callables(), which call capture_end internally.
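For completeness, a hedged sketch of the low-level path these docs recommend against (torch.cuda.graph wraps this pattern); capture must run on a non-default stream:

```python
import torch

g = torch.cuda.CUDAGraph()
static_input = torch.zeros(8, device="cuda")

# Warm up on a side stream, then make sure all prior work has finished.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    static_output = static_input * 2
torch.cuda.current_stream().wait_stream(s)
torch.cuda.synchronize()

# Capture on the side stream; the default stream cannot be captured.
with torch.cuda.stream(s):
    g.capture_begin()
    static_output = static_input * 2
    g.capture_end()

# After capture_end, replay may be called.
g.replay()
```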
- debug_dump(debug_path)[source]
Call a debugging function to dump the graph if debugging is enabled via CUDAGraph.enable_debug_mode().
- Parameters
debug_path (required) – Path to dump the graph to.
- enable_debug_mode()[source]
Enable debugging mode for CUDAGraph.debug_dump.
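A sketch of the debug workflow, assuming debug mode must be enabled before capture and that the dump is a DOT-format description of the graph (the filename is illustrative):

```python
import torch

g = torch.cuda.CUDAGraph()
x = torch.zeros(4, device="cuda")

# Enable debug mode before capturing (assumed prerequisite for debug_dump).
g.enable_debug_mode()

with torch.cuda.graph(g):
    y = x + 1

# Dump the captured graph to the given path (assumed DOT output).
g.debug_dump("cuda_graph.dot")
```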
- pool()[source]
Return an opaque token representing the id of this graph’s memory pool.
This id can optionally be passed to another graph's capture_begin, which hints that the other graph may share the same memory pool.
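A brief sketch of passing this token to a second capture, again via torch.cuda.graph, which forwards pool to capture_begin:

```python
import torch

x = torch.zeros(4, device="cuda")

g1 = torch.cuda.CUDAGraph()
with torch.cuda.graph(g1):
    y1 = x + 1

# Hint that g2 may share g1's memory pool.
g2 = torch.cuda.CUDAGraph()
with torch.cuda.graph(g2, pool=g1.pool()):
    y2 = y1 * 2
```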
- replay()[source]
Replay the CUDA work captured by this graph.
- reset()[source]
Delete the graph currently held by this instance.