# Model Overview
This page gives an overview of the Transformer models currently supported by adapter-transformers. The table below shows which adaptation methods and adapter-transformers features each of these model architectures supports.
> **Note:** Each supported model architecture X typically provides a class `XAdapterModel` for usage with `AutoAdapterModel`. Additionally, it is possible to use adapters with the model classes already shipped with HuggingFace Transformers. E.g., for BERT, this means adapter-transformers provides a `BertAdapterModel` class, but you can also use `BertModel`, `BertForSequenceClassification` etc. together with adapters.
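As a minimal sketch of both routes, assuming adapter-transformers is installed (it is distributed as a drop-in fork of HuggingFace Transformers, so imports still use the `transformers` package; the adapter name `my_task` and the `bert-base-uncased` checkpoint are placeholders):

```python
from transformers import AutoAdapterModel, BertModel

# Route 1: the dedicated adapter model class, resolved via AutoAdapterModel
# (for a BERT checkpoint this returns a BertAdapterModel instance).
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("my_task")             # add a new (bottleneck) adapter
model.set_active_adapters("my_task")     # activate it for the forward pass

# Route 2: a model class shipped with Transformers, used together with adapters.
bert = BertModel.from_pretrained("bert-base-uncased")
bert.add_adapter("my_task")
bert.set_active_adapters("my_task")
```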
| Model | (Bottleneck) Adapters | Prefix Tuning | LoRA | Compacter | Adapter Fusion | Invertible Adapters | Parallel block |
|---|---|---|---|---|---|---|---|
| ALBERT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BART | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BEiT | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| BERT-Generation | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BERT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| CLIP | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| DeBERTa | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DeBERTa-v2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DistilBERT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Encoder Decoder | (*) | (*) | (*) | (*) | (*) | (*) | |
| GPT-2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-J | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MBart | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| T5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ViT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| XLM-RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
(*) If the encoder and decoder model classes used are supported.
Missing a model architecture you’d like to use? adapter-transformers can be easily extended to new model architectures as described in Adding Adapters to a Model. Feel free to open an issue requesting support for a new architecture. We very much welcome pull requests adding new model implementations!