Embeddings¶
Since v2.3.0, adapter-transformers supports dynamically adding, loading, and deleting embeddings. This section gives an overview of these features.
Adding and Deleting Embeddings¶
The methods for handling embeddings are similar to those for handling adapters. To add new embeddings, we call add_embeddings. This adds new embeddings for the vocabulary of the tokenizer.
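For example, a minimal sketch of the simplest call (the checkpoint and the AutoModelWithHeads class are illustrative choices; any adapter-transformers model class works the same way):
from transformers import AutoModelWithHeads, AutoTokenizer

model = AutoModelWithHeads.from_pretrained('bert-base-uncased')
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Simplest form: adds a freshly initialized embedding matrix
# sized to the tokenizer's vocabulary.
model.add_embeddings('name', tokenizer)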
In some cases, it might be useful to initialize the embeddings of tokens to those of another embedding module. If a reference_embedding and a reference_tokenizer are provided, all embeddings for tokens present in both vocabularies are initialized to the embeddings provided by the reference_embedding. The new embedding is created and set as the active embedding. If you are unsure which embedding is currently active, the active_embeddings property contains the name of the currently active embedding.
model.add_embeddings('name', tokenizer, reference_embedding='default', reference_tokenizer=reference_tokenizer)
embedding_name = model.active_embeddings
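As a concrete sketch, suppose you extend the vocabulary with a few domain-specific tokens (the token strings and the embedding name 'extended' are illustrative): rows for tokens present in both vocabularies are copied from the "default" embedding, while the added tokens receive fresh weights.
reference_tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

extended_tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
extended_tokenizer.add_tokens(['<domain_term_1>', '<domain_term_2>'])

# Tokens shared with the reference vocabulary are initialized from
# the 'default' embedding; the two added tokens are newly initialized.
model.add_embeddings('extended', extended_tokenizer, reference_embedding='default', reference_tokenizer=reference_tokenizer)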
The original embedding of the transformers model is always available under the name "default". To set it as the active embedding, simply call the set_active_embeddings method with that name.
model.set_active_embeddings("default")
Similarly, all other embeddings can be set as active by passing their name to the set_active_embeddings method.
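For instance, to switch to the embedding added earlier and verify the switch:
model.set_active_embeddings('name')
assert model.active_embeddings == 'name'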
To delete an embedding that is no longer needed, we can call the delete_embeddings method with the name of the embedding we want to delete. However, you cannot delete the default embedding.
model.delete_embeddings('name')
Please note that if the active embedding is deleted, the default embedding is set as the active embedding.
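A short sketch of this fallback behavior, using the illustrative 'extended' embedding from above:
model.set_active_embeddings('extended')
model.delete_embeddings('extended')
# The deleted embedding was active, so 'default' becomes active again.
assert model.active_embeddings == 'default'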
Saving and Loading Embeddings¶
You can save the embeddings by calling save_embeddings('path/to/dir', 'name') and load them with load_embeddings('path/to/dir', 'name').
model.save_embeddings(path, 'name')
model.load_embeddings(path, 'reloaded_name')
The path needs to point to a directory in which the weights of the embedding will be saved.
You can save and load the tokenizer with the embedding by passing the tokenizer to save_embeddings.
model.save_embeddings(path, 'name', tokenizer)
loaded_tokenizer = model.load_embeddings(path, 'name')
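Putting it together, a round-trip sketch (the directory path and the second model instance are illustrative):
import os

path = os.path.join('saved_embeddings', 'name')  # directory for the weights
model.save_embeddings(path, 'name', tokenizer)

# Load the weights and the saved tokenizer into a fresh model instance
# and activate the reloaded embedding.
fresh_model = AutoModelWithHeads.from_pretrained('bert-base-uncased')
loaded_tokenizer = fresh_model.load_embeddings(path, 'reloaded_name')
fresh_model.set_active_embeddings('reloaded_name')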