Utils
This module contains TorchX utility components that are ready to use out of the box. These
components simply execute well-known binaries (e.g. cp) and are meant to be used as
tutorial material or as glue operations between meaningful stages in a workflow.
- torchx.components.utils.echo(msg: str = 'hello world', image: str = 'ghcr.io/pytorch/torchx:0.7.0', num_replicas: int = 1) -> AppDef
Echoes a message to stdout (calls echo).
- Parameters:
msg – message to echo
image – image to use
num_replicas – number of replicas to run
- torchx.components.utils.touch(file: str, image: str = 'ghcr.io/pytorch/torchx:0.7.0') -> AppDef
Touches a file (calls touch)
- Parameters:
file – file to create
image – the image to use
- torchx.components.utils.sh(*args: str, image: str = 'ghcr.io/pytorch/torchx:0.7.0', num_replicas: int = 1, cpu: int = 1, gpu: int = 0, memMB: int = 1024, h: Optional[str] = None, env: Optional[Dict[str, str]] = None, max_retries: int = 0, mounts: Optional[List[str]] = None) -> AppDef
Runs the provided command via sh. Currently sh does not support environment variable substitution.
- Parameters:
args – bash arguments
image – image to use
num_replicas – number of replicas to run
cpu – number of cpus per replica
gpu – number of gpus per replica
memMB – cpu memory in MB per replica
h – a registered named resource (if specified takes precedence over cpu, gpu, memMB)
env – environment variables to be passed to the run (e.g. ENV1=v1,ENV2=v2,ENV3=v3)
max_retries – the number of scheduler retries allowed
mounts – mounts to mount into the worker environment/container (ex. type=<bind/volume>,src=/host,dst=/job[,readonly]). See scheduler documentation for more info.
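The string forms shown for env and mounts above can be illustrated with a small parser. This is only a sketch of how such strings break down into fields; parse_env and parse_mount are hypothetical helpers written for this example and are not TorchX's own parsing code.

```python
from typing import Dict


def parse_env(spec: str) -> Dict[str, str]:
    """Split 'K1=v1,K2=v2' into a dict (hypothetical helper, not a TorchX API)."""
    env: Dict[str, str] = {}
    for pair in spec.split(","):
        if not pair:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        env[key] = value
    return env


def parse_mount(spec: str) -> Dict[str, object]:
    """Split 'type=bind,src=/host,dst=/job[,readonly]' into its fields.

    Hypothetical helper; the exact set of supported keys depends on the scheduler.
    """
    mount: Dict[str, object] = {"readonly": False}
    for field in spec.split(","):
        if field == "readonly":
            mount["readonly"] = True  # bare flag, no '=' part
            continue
        key, sep, value = field.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value or 'readonly', got {field!r}")
        mount[key] = value
    return mount


env = parse_env("ENV1=v1,ENV2=v2,ENV3=v3")
mount = parse_mount("type=bind,src=/host,dst=/job,readonly")
```

Here env holds three key/value pairs and mount records the type, src, and dst fields plus a readonly flag. Which mount types and options are actually accepted is scheduler-specific, so the scheduler documentation remains the authority.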
- torchx.components.utils.copy(src: str, dst: str, image: str = 'ghcr.io/pytorch/torchx:0.7.0') -> AppDef
copy copies the file from src to dst. src and dst can be any valid fsspec url.
This does not support recursive copies or directories.
- Parameters:
src – the source fsspec file location
dst – the destination fsspec file location
image – the image that contains the copy app
- torchx.components.utils.python(*args: str, m: Optional[str] = None, c: Optional[str] = None, script: Optional[str] = None, image: str = 'ghcr.io/pytorch/torchx:0.7.0', name: str = 'torchx_utils_python', cpu: int = 1, gpu: int = 0, memMB: int = 1024, h: Optional[str] = None, num_replicas: int = 1) -> AppDef
Runs python with the specified module, command, or script on the specified image and host. Use -- to separate component args and program args (e.g. torchx run utils.python --m foo.main -- --args to --main).
Note: (cpu, gpu, memMB) parameters are mutually exclusive with h (named resource), where h takes precedence if specified for setting resource requirements. See registering named resources.
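The split between component args and program args, and the way m, c, and script might map onto a python command line, can be sketched as follows. Both functions are hypothetical illustrations rather than TorchX internals, and the rule that exactly one of m, c, or script must be given is an assumption made for this example.

```python
from typing import List, Optional, Sequence, Tuple


def split_on_separator(tokens: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split CLI tokens at the first '--' into (component args, program args)."""
    tokens = list(tokens)
    if "--" in tokens:
        idx = tokens.index("--")
        return tokens[:idx], tokens[idx + 1:]
    return tokens, []


def build_python_argv(
    *args: str,
    m: Optional[str] = None,
    c: Optional[str] = None,
    script: Optional[str] = None,
) -> List[str]:
    """Assemble a python command line (hypothetical helper, not a TorchX API)."""
    selected = [opt for opt in (m, c, script) if opt is not None]
    if len(selected) != 1:
        # Assumption for this sketch: exactly one program source is required.
        raise ValueError("specify exactly one of m, c, or script")
    if m is not None:
        return ["python", "-m", m, *args]
    if c is not None:
        # The parameter docs state that args are ignored with -c.
        return ["python", "-c", c]
    return ["python", str(script), *args]


component_args, program_args = split_on_separator(
    ["--m", "foo.main", "--", "--args", "to", "--main"]
)
argv = build_python_argv(*program_args, m="foo.main")
```

With the example command from the description, everything before -- configures the component, and everything after it is forwarded to foo.main unchanged.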
- Parameters:
args – arguments passed to the program in sys.argv[1:] (ignored with -c)
m – run library module as a script
c – program passed as string (may error if scheduler has a length limit on args)
script – .py script to run
image – image to run on
name – name of the job
cpu – number of cpus per replica
gpu – number of gpus per replica
memMB – cpu memory in MB per replica
h – a registered named resource (if specified takes precedence over cpu, gpu, memMB)
num_replicas – number of copies to run (each on its own container)
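The precedence of h over cpu, gpu, and memMB can be pictured with a minimal sketch. The Resource dataclass, the NAMED_RESOURCES registry, and the "gpu_small" entry are all hypothetical stand-ins for this example; real named resources are registered through TorchX's own mechanism.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a per-replica resource request."""
    cpu: int
    gpu: int
    memMB: int


# Hypothetical registry; the name and its values are made up for illustration.
NAMED_RESOURCES: Dict[str, Resource] = {
    "gpu_small": Resource(cpu=8, gpu=1, memMB=32768),
}


def resolve_resource(
    cpu: int = 1, gpu: int = 0, memMB: int = 1024, h: Optional[str] = None
) -> Resource:
    """Return the named resource if h is given, else the explicit values."""
    if h is not None:
        if h not in NAMED_RESOURCES:
            raise KeyError(f"unknown named resource: {h!r}")
        return NAMED_RESOURCES[h]  # h wins; cpu, gpu, memMB are disregarded
    return Resource(cpu=cpu, gpu=gpu, memMB=memMB)


explicit = resolve_resource(cpu=2, memMB=4096)
named = resolve_resource(cpu=2, memMB=4096, h="gpu_small")
```

In this sketch, passing h discards the explicit cpu and memMB values entirely, which mirrors the documented rule that a named resource takes precedence when specified.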
- Returns:
- torchx.components.utils.booth(x1: float, x2: float, trial_idx: int = 0, tracker_base: str = '/tmp/torchx-util-booth', image: str = 'ghcr.io/pytorch/torchx:0.7.0') -> AppDef
Evaluates the booth function, f(x1, x2) = (x1 + 2*x2 - 7)^2 + (2*x1 + x2 - 5)^2. Output result is accessible via FsspecResultTracker(outdir)[trial_idx]
- Parameters:
x1 – x1
x2 – x2
trial_idx – ignore if not running hpo
tracker_base – URI of the tracker’s base output directory (e.g. s3://foo/bar)
image – the image that contains the booth app
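For reference, the formula above can be evaluated locally in plain Python. This is a direct transcription of the expression, not the code of the booth app itself, and it does not involve the result tracker.

```python
def booth(x1: float, x2: float) -> float:
    """Booth function: (x1 + 2*x2 - 7)^2 + (2*x1 + x2 - 5)^2."""
    return (x1 + 2 * x2 - 7) ** 2 + (2 * x1 + x2 - 5) ** 2


at_minimum = booth(1, 3)  # both squared terms vanish at (1, 3)
at_origin = booth(0, 0)   # (-7)^2 + (-5)^2
```

The function reaches its global minimum of 0 at (x1, x2) = (1, 3), which makes it a convenient sanity check when wiring up a hyperparameter search.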
- torchx.components.utils.binary(*args: str, entrypoint: str, name: str = 'torchx_utils_binary', num_replicas: int = 1, cpu: int = 1, gpu: int = 0, memMB: int = 1024, h: Optional[str] = None) -> AppDef
Test component
- Parameters:
args – arguments passed to the program
name – name of the job
num_replicas – number of copies to run (each on its own container)
cpu – number of cpus per replica
gpu – number of gpus per replica
memMB – cpu memory in MB per replica
h – a registered named resource (if specified takes precedence over cpu, gpu, memMB)
- Returns: