.. _rendezvous-api:

Rendezvous
==========

.. automodule:: torchelastic.rendezvous

Below is a state diagram describing how rendezvous works.

.. image:: etcd_rdzv_diagram.png


Handler
--------------------

.. currentmodule:: torchelastic.rendezvous

.. autoclass:: RendezvousHandler
   :members:

Exceptions
-------------
.. autoclass:: RendezvousClosedException
.. autoclass:: RendezvousTimeoutException
.. autoclass:: RendezvousNonRetryableError

Implmentations
----------------

Etcd Rendezvous
****************

.. currentmodule:: torchelastic.rendezvous.etcd_rendezvous

.. autoclass:: EtcdRendezvousHandler

.. autoclass:: EtcdRendezvous
   :members:

.. autoclass:: EtcdStore
   :members:

Etcd Server
*************

The ``EtcdServer`` is a convenience class that makes it easy for you to
start and stop an etcd server on a subprocess. This is useful for testing
or single-node (multi-worker) deployments where manually setting up an
etcd server on the side is cumbersome.

.. warning:: For production and multi-node deployments please consider
             properly deploying a highly available etcd server as this is
             the single point of failure for your distributed jobs.

.. currentmodule:: torchelastic.rendezvous.etcd_server

.. autoclass:: EtcdServer