graph

class torch.cuda.graph(cuda_graph, pool=None, stream=None, capture_error_mode='global')[source]

Context-manager that captures CUDA work into a torch.cuda.CUDAGraph object for later replay.

See CUDA Graphs for a general introduction, detailed use, and constraints.
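
A minimal capture-and-replay sketch is shown below. The shapes, the warmup loop, and the static_input/static_output names are illustrative; see CUDA Graphs for the full constraints on what can be captured.

    import torch

    g = torch.cuda.CUDAGraph()

    # Placeholder input whose memory the graph reads on every replay.
    static_input = torch.empty(5, device="cuda")

    # Warm up the workload on a side stream before capture; some kernels
    # initialize state lazily and cannot do so during capture.
    s = torch.cuda.Stream()
    s.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(s):
        for _ in range(3):
            static_output = static_input * 2
    torch.cuda.current_stream().wait_stream(s)

    # Capture: the context manager makes an internal side stream current and
    # records the launched kernels into g instead of running them eagerly.
    with torch.cuda.graph(g):
        static_output = static_input * 2

    # Replay: refill the captured input memory, then re-run the recorded work.
    static_input.copy_(torch.full((5,), 3.0, device="cuda"))
    g.replay()
    print(static_output)  # tensor of 6s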

Parameters
  • cuda_graph (torch.cuda.CUDAGraph) – Graph object used for capture.

  • pool (optional) – Opaque token (returned by a call to graph_pool_handle() or other_Graph_instance.pool()) hinting that this graph’s capture may share memory from the specified pool. See Graph memory management.

  • stream (torch.cuda.Stream, optional) – If supplied, this stream is set as the current stream in the context. If not supplied, graph sets its own internal side stream as the current stream in the context.

  • capture_error_mode (str, optional) – Specifies the cudaStreamCaptureMode for the graph capture stream. Can be “global”, “thread_local”, or “relaxed”. During CUDA graph capture, some actions, such as cudaMalloc, may be unsafe. “global” will error on such actions in other threads, “thread_local” will only error on actions in the current thread, and “relaxed” will not error on these actions. Do NOT change this setting unless you are familiar with cudaStreamCaptureMode. A sketch showing how these optional arguments are passed together follows this list.
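
A hedged sketch of how the optional arguments are passed is given here; the names s, handle, static_x, and static_y are illustrative, and the warmup step from the sketch above is omitted for brevity.

    import torch

    g = torch.cuda.CUDAGraph()
    s = torch.cuda.Stream()                    # optional explicit capture stream
    handle = torch.cuda.graph_pool_handle()    # optional opaque pool token

    static_x = torch.randn(32, device="cuda")

    # Only cuda_graph is required; the other arguments default as in the
    # signature above. capture_error_mode is left at its default here.
    with torch.cuda.graph(g, pool=handle, stream=s, capture_error_mode="global"):
        static_y = static_x.relu()

    g.replay()  # re-runs the captured kernels on static_x's current contents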

Note

For effective memory sharing, if you pass a pool used by a previous capture and the previous capture used an explicit stream argument, you should pass the same stream argument to this capture.
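
For example, a sketch under the assumption that both workloads are otherwise capture-safe (g1, g2, s, and the tensor names are illustrative):

    import torch

    s = torch.cuda.Stream()
    g1 = torch.cuda.CUDAGraph()
    g2 = torch.cuda.CUDAGraph()

    static_x = torch.randn(16, device="cuda")

    # First capture uses an explicit stream argument.
    with torch.cuda.graph(g1, stream=s):
        static_y = static_x * 2

    # The second capture hints that it may share g1's memory pool. Because g1
    # was captured with an explicit stream, pass that same stream here too.
    with torch.cuda.graph(g2, pool=g1.pool(), stream=s):
        static_z = static_y + 1

    g1.replay()
    g2.replay()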

Warning

This API is in beta and may change in future releases.
