Model Overview
This page gives an overview of the Transformer models currently supported by adapter-transformers. The table below further shows which model architectures support which adaptation methods and which features of adapter-transformers.
Note
Each supported model architecture X typically provides a class XAdapterModel for use with AutoAdapterModel. Additionally, it is possible to use adapters with the model classes already shipped with HuggingFace Transformers. For example, for BERT this means that adapter-transformers provides a BertAdapterModel class, but you can also use BertModel, BertForSequenceClassification, etc. together with adapters.
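As a minimal sketch of this, assuming adapter-transformers is installed (it ships these classes under the `transformers` namespace) and using an arbitrary adapter name `"example_adapter"`:

```python
from transformers import AutoAdapterModel, BertForSequenceClassification

# AutoAdapterModel resolves to the matching XAdapterModel class for the
# checkpoint, e.g. BertAdapterModel for a BERT checkpoint.
model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# Add and activate a new (bottleneck) adapter named "example_adapter".
model.add_adapter("example_adapter")
model.train_adapter("example_adapter")        # freeze the base model, train only the adapter
model.set_active_adapters("example_adapter")  # use the adapter in forward passes

# The same adapter methods are also available on the stock model classes.
clf = BertForSequenceClassification.from_pretrained("bert-base-uncased")
clf.add_adapter("example_adapter")
```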
| Model | (Bottleneck) Adapters | Prefix Tuning | LoRA | Compacter | Adapter Fusion | Invertible Adapters | Parallel block |
|---|---|---|---|---|---|---|---|
| ALBERT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BART | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BEIT | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| BERT-Generation | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BERT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| CLIP | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| DeBERTa | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DeBERTa-v2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DistilBERT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Encoder Decoder | (*) | (*) | (*) | (*) | (*) | (*) | |
| GPT-2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| GPT-J | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MBart | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| T5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| ViT | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| XLM-RoBERTa | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
(*) Supported if the encoder and decoder model classes used are themselves supported.
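The adaptation methods listed in the columns are chosen by passing a configuration object when adding an adapter. The following is a minimal sketch, assuming the config classes exposed under `transformers.adapters` in adapter-transformers (e.g. LoRAConfig and PrefixTuningConfig) and a RoBERTa checkpoint; the adapter names are arbitrary:

```python
from transformers import AutoAdapterModel
from transformers.adapters import LoRAConfig, PrefixTuningConfig

model = AutoAdapterModel.from_pretrained("roberta-base")

# LoRA and prefix tuning are added like bottleneck adapters,
# only the config object differs.
model.add_adapter("lora_adapter", config=LoRAConfig(r=8, alpha=16))
model.add_adapter("prefix_adapter", config=PrefixTuningConfig(prefix_length=30))

model.set_active_adapters("lora_adapter")
```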
Missing a model architecture you’d like to use? adapter-transformers can be easily extended to new model architectures as described in Adding Adapters to a Model. Feel free to open an issue requesting support for a new architecture. We very much welcome pull requests adding new model implementations!