.torchxconfig

Status: Beta

You can store the scheduler run configs (run cfg) for your project in a .torchxconfig file. Currently this file is read and honored only when running a component from the CLI.

CLI Usage

Scheduler Config

  1. cd into the directory where you want the .torchxconfig file to be created. The CLI only picks up .torchxconfig files from the current working directory (CWD), so choose a directory from which you typically run torchx. Typically this is the root of your project directory.

  2. Generate the config file by running

    $ torchx configure -s <comma,delimited,scheduler,names>
    
    # -- or for all registered schedulers --
    $ torchx configure
    
  3. If you specified -s local_cwd,kubernetes, you should see a .torchxconfig file as shown below:

    $ cat .torchxconfig
    [local_cwd]
    
    [kubernetes]
    queue = #FIXME:(str) Volcano queue to schedule job in
    
  4. .torchxconfig is in INI format and the section names map to the scheduler names. Each section contains the run configs for the scheduler as $key = $value pairs. Some schedulers may have empty sections; this means that the scheduler defines sensible defaults for all its run configs, so no run configs are required at runtime. To override a default, add it to the scheduler's section. TIP: To see all the run options for a scheduler, use torchx runopts <scheduler_name>.

  5. The sections with FIXME placeholders are run configs that are required by the scheduler. Replace these with the values that apply to you.

  6. IMPORTANT: If you are happy with the scheduler-provided default for a particular run config, do not redundantly specify it in .torchxconfig with the same default value. The scheduler may change the default value at a later date, which would leave you with a stale default.

  7. Now you can run your component without having to specify the scheduler run configs each time. Just make sure the directory you are running torchx cli from actually has .torchxconfig!

    $ ls .torchxconfig
    .torchxconfig
    
    $ torchx run -s local_cwd ./my_component.py:train
    
  8. In addition, you can load a config file other than .torchxconfig at runtime by setting the environment variable TORCHXCONFIG to the path of that file. Doing so also disables the hierarchical loading of configs from multiple directories that otherwise applies.
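The INI layout described in the steps above can be inspected with Python's standard configparser module. The following is a minimal sketch rather than TorchX's own loader: the helper name find_fixmes is hypothetical, and it assumes that required run configs are marked by a value starting with #FIXME, as in the sample output shown in step 3.

```python
import configparser

# Sample file content, copied from the example output in step 3.
SAMPLE = """\
[local_cwd]

[kubernetes]
queue = #FIXME:(str) Volcano queue to schedule job in
"""


def find_fixmes(text: str) -> dict[str, list[str]]:
    """Map each scheduler section to the keys that still hold a FIXME placeholder."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        section: [key for key, value in parser.items(section) if value.startswith("#FIXME")]
        for section in parser.sections()
    }


result = find_fixmes(SAMPLE)
print(result)
```

A check like this could be run before submitting a job to confirm that no required run config was left unfilled.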

Component Config

You can specify component defaults by adding a section prefixed with component:.

[component:dist.ddp]
j=2x8
cpu=4

Now when you run the dist.ddp component those configs are automatically picked up.

$ torchx run -s local_cwd dist.ddp
... runs with -j 2x8 --cpu 4

CLI Subcommand Config

The default arguments for the torchx subcommands can be overridden. Any --foo FOO argument can be set via the corresponding [cli:<cmd>] settings block.

For the run command, you can additionally set component to specify the default component to run.

[cli:run]
component=dist.ddp
scheduler=local_docker
workspace=file://some_workspace

Programmatic Usage

Unlike the CLI, the .torchxconfig file is not picked up automatically from CWD if you are programmatically running your component with torchx.runner.Runner. You have to manually specify the directory containing .torchxconfig.

Below is an example:

from torchx.runner import get_runner
from torchx.runner.config import apply
import torchx.specs as specs

def my_component(a: int) -> specs.AppDef:
   # <... component body omitted for brevity ...>
   pass

scheduler = "local_cwd"
cfg = {"log_dir": "/these/take/outmost/precedence"}

apply(scheduler, cfg, dirs=["/home/bob"])  # looks for /home/bob/.torchxconfig
get_runner().run(my_component(1), scheduler, cfg)

You may also specify multiple directories (in order of precedence), which is useful when you want to keep personal config overrides on top of a project-defined default.

Config API Functions

torchx.runner.config.apply(scheduler: str, cfg: Dict[str, Optional[Union[str, int, float, bool, List[str]]]], dirs: Optional[List[str]] = None) -> None

Loads .torchxconfig INI files from the specified directories in order of precedence and applies the run configs for the scheduler onto the given cfg.

If no dirs is specified, then it looks for .torchxconfig in the current working directory. If a specified directory does not have .torchxconfig then it is ignored.

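Note that the configs already present in the given cfg take precedence over the ones in the config file and only new configs are added. The same holds true for the configs loaded in list order: a value loaded from an earlier directory is not overridden by a later one.

For instance, if cfg={"foo":"bar"}, the directories are given in the order dir_1, dir_2, and the config files are: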

# dir_1/.torchxconfig
[local_cwd]
foo = baz
hello = world

# dir_2/.torchxconfig
[local_cwd]
hello = bob

Then after the method call, cfg={"foo":"bar","hello":"world"}.
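The precedence rule can be reproduced with the standard library alone. The sketch below is an illustration of the "only add what is missing" behavior, not the TorchX implementation: apply_sketch is a hypothetical name, it takes config text rather than directories, and it keeps every value as a string instead of converting it to the scheduler's declared run option type.

```python
import configparser


def apply_sketch(scheduler: str, cfg: dict, config_texts: list[str]) -> None:
    """Add run configs from each config text, in list order, without
    overriding keys that are already present in cfg."""
    for text in config_texts:
        parser = configparser.ConfigParser()
        parser.read_string(text)
        if not parser.has_section(scheduler):
            continue  # a file without the scheduler's section contributes nothing
        for key, value in parser.items(scheduler):
            cfg.setdefault(key, value)  # existing keys win


DIR_1 = "[local_cwd]\nfoo = baz\nhello = world\n"
DIR_2 = "[local_cwd]\nhello = bob\n"

cfg = {"foo": "bar"}
apply_sketch("local_cwd", cfg, [DIR_1, DIR_2])
print(cfg)
```

Here foo keeps the caller's value and hello keeps the value from the first file, which matches the outcome described above.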

torchx.runner.config.load(scheduler: str, f: TextIO, cfg: Dict[str, Optional[Union[str, int, float, bool, List[str]]]]) -> None

Loads the section [{scheduler}] from the given config file f (in INI format) into the provided cfg, only adding configs that are NOT currently in the given cfg (i.e. it does not override existing values in cfg). If no section is found, does nothing.

torchx.runner.config.dump(f: TextIO, schedulers: Optional[List[str]] = None, required_only: bool = False) -> None

Dumps a default INI-style config template containing the torchx.specs.runopts for the given scheduler names into the file-like object specified by f. If no schedulers are specified, dumps all known registered schedulers.

Optional runopts are pre-filled with their default values. Required runopts are set with a FIXME: ... placeholder. To only dump required runopts pass required_only=True.

Each scheduler’s runopts are written in the section called [{scheduler_name}].

For example:

[kubernetes]
namespace = default
queue = #FIXME (str)Volcano queue to schedule job in

Raises:

ValueError – if given a scheduler name that is not known

torchx.runner.config.find_configs(dirs: Optional[Iterable[str]] = None) -> List[str]

Finds and returns the filepaths to .torchxconfig files based on the following logic:

  1. If the environment variable TORCHXCONFIG exists, then its value is returned in a single-element list and the directories specified through the dirs parameter are NOT searched.

  2. Otherwise, a .torchxconfig file is looked for in dirs and the filepaths to existing config files are returned. If dirs is not specified or is empty, then this method looks for a .torchxconfig file in CWD (current working dir) and returns the filepath to it if one exists.
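The two lookup rules above can be sketched with the standard library as follows. This is an illustrative approximation, not the TorchX source: find_configs_sketch is a hypothetical name, the override path in the demo is made up, and the sketch does not validate that the path given in the environment variable actually exists.

```python
import os
import tempfile
from pathlib import Path
from typing import Iterable, List, Optional

CONFIG_FILE = ".torchxconfig"
ENV_VAR = "TORCHXCONFIG"


def find_configs_sketch(dirs: Optional[Iterable[str]] = None) -> List[str]:
    # Rule 1: the environment variable wins and dirs is not searched.
    override = os.environ.get(ENV_VAR)
    if override is not None:
        return [override]
    # Rule 2: search dirs, falling back to CWD when dirs is None or empty.
    search_dirs = list(dirs or []) or [os.getcwd()]
    candidates = [Path(d) / CONFIG_FILE for d in search_dirs]
    return [str(path) for path in candidates if path.is_file()]


# Demo: one directory that contains a config file and one that does not.
os.environ.pop(ENV_VAR, None)  # start from a clean environment for the demo
with tempfile.TemporaryDirectory() as with_cfg, tempfile.TemporaryDirectory() as without_cfg:
    expected = Path(with_cfg) / CONFIG_FILE
    expected.write_text("[local_cwd]\n")
    found = find_configs_sketch([without_cfg, with_cfg])

    os.environ[ENV_VAR] = "/some/other/config"  # hypothetical override path
    overridden = find_configs_sketch([with_cfg])
    del os.environ[ENV_VAR]

print(found == [str(expected)], overridden)
```

The directory without a config file is silently skipped, and once the environment variable is set the dirs argument no longer affects the result.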

torchx.runner.config.get_configs(prefix: str, name: str, dirs: Optional[List[str]]) -> Dict[str, str]

Gets all the config values in the section ["{prefix}:{name}"], or an empty map if the section does not exist.

Example:

# for config file:
# [foo:bar]
# baz = 1

get_configs(prefix="foo", name="bar") # returns {"baz": "1"}
get_configs(prefix="foo", name="barr") # returns {}

torchx.runner.config.get_config(prefix: str, name: str, key: str, dirs: Optional[List[str]] = None) -> Optional[str]

Gets the config value for the key in the section ["{prefix}:{name}"], or None if no section or key exists.

Example:

# for config file:
# [foo:bar]
# baz = 1

get_config(prefix="foo", name="bar", key="baz") == "1"
get_config(prefix="foo", name="bar", key="bazz") == None
get_config(prefix="foo", name="barr", key="baz") == None
get_config(prefix="fooo", name="bar", key="baz") == None
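
Because the documented return type is Optional[str], a value read from an INI file comes back as a string. The following standard-library sketch mimics the lookup to make that visible. The name get_config_sketch is hypothetical, and unlike the real function it reads from a text argument instead of searching directories.

```python
import configparser
from typing import Optional

CONFIG_TEXT = "[foo:bar]\nbaz = 1\n"


def get_config_sketch(text: str, prefix: str, name: str, key: str) -> Optional[str]:
    """Return the raw string stored under [prefix:name] key, or None if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = f"{prefix}:{name}"
    if not parser.has_section(section):
        return None
    return parser.get(section, key, fallback=None)


value = get_config_sketch(CONFIG_TEXT, "foo", "bar", "baz")
print(repr(value))
```

Callers that need a number have to convert the result themselves, for example with int(value).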

torchx.runner.config.load_sections(prefix: str, dirs: Optional[List[str]] = None) -> Dict[str, Dict[str, str]]

Loads the content of the sections in the given .torchxconfig file that start with the specified prefix. Returns a map of maps: each key is a section name WITHOUT the prefix, and each value is a map holding the contents of that section. ":" is used as the prefix delimiter.

Example config format for specifying defaults for the builtin component dist.ddp is shown below:

[component:dist.ddp]
j = 1x2
image = ghcr.io/foo:1

# calling `load_sections(prefix="component")` returns
#  {
#    "dist.ddp": {
#       "j":"1x2",
#       "image":"ghcr.io/foo:1",
#     },
#  }
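
The prefix handling can be illustrated with configparser. This is a sketch under stated assumptions, not the TorchX code: load_sections_sketch is a hypothetical name that takes config text directly instead of a list of directories.

```python
import configparser

TEXT = """\
[component:dist.ddp]
j = 1x2
image = ghcr.io/foo:1

[cli:run]
scheduler = local_docker
"""


def load_sections_sketch(text: str, prefix: str) -> dict[str, dict[str, str]]:
    """Collect sections named '<prefix>:<name>' and key them by '<name>'."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    marker = prefix + ":"  # ":" is the prefix delimiter
    return {
        section[len(marker):]: dict(parser.items(section))
        for section in parser.sections()
        if section.startswith(marker)
    }


sections = load_sections_sketch(TEXT, "component")
print(sections)
```

Sections with a different prefix, such as [cli:run] here, are left out of the result.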

The keys in the section must match the parameter names of the component function. The example below shows how to represent the various types that are allowable as component parameter types.

[component:foo.bar]
int = 1
float = 1.2
# bool may be True or False
bool = True
str = foobar
list = a,b,c
map = A=B,C=D
vararg = -a b --c=d e

# to call the component as:
foo.bar(
   "-a", "b", "--c=d", "e",
   int=1,
   float=1.2,
   bool=True,
   str="foobar",
   list=["a", "b", "c"],
   map={"A":"B", "C": "D"})
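
To make the mapping from INI strings to Python values concrete, the sketch below decodes the list, map, bool, and vararg forms shown above. All helper names are hypothetical and the decoding rules are assumptions inferred from the example; TorchX's own parsing may differ in details such as whitespace or quoting.

```python
import shlex


def parse_list(value: str) -> list[str]:
    """'a,b,c' -> ['a', 'b', 'c']"""
    return [item.strip() for item in value.split(",")]


def parse_map(value: str) -> dict[str, str]:
    """'A=B,C=D' -> {'A': 'B', 'C': 'D'}"""
    pairs = (item.split("=", 1) for item in value.split(","))
    return {key.strip(): val.strip() for key, val in pairs}


def parse_bool(value: str) -> bool:
    """'True' -> True, anything else -> False (case-insensitive)."""
    return value.strip().lower() == "true"


def parse_vararg(value: str) -> list[str]:
    """'-a b --c=d e' -> ['-a', 'b', '--c=d', 'e'] using shell-like splitting."""
    return shlex.split(value)


decoded = {
    "list": parse_list("a,b,c"),
    "map": parse_map("A=B,C=D"),
    "bool": parse_bool("True"),
    "vararg": parse_vararg("-a b --c=d e"),
}
print(decoded)
```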
