The MaxVit transformer models are based on the MaxViT: Multi-Axis Vision Transformer paper.
The following model builders can be used to instantiate a MaxVit model, with or without pre-trained weights.
All the model builders internally rely on the
torchvision.models.maxvit.MaxVit base class. Please refer to the source code for
more details about this class.
Constructs a maxvit_t architecture from MaxViT: Multi-Axis Vision Transformer.