Text Encoder extracted from mjaliz/vision-text-dual-encoder-v1
This is the text encoder component extracted from the VisionTextDualEncoder model mjaliz/vision-text-dual-encoder-v1.
Model Details
- Model type: XLMRobertaModel
- Source model: mjaliz/vision-text-dual-encoder-v1
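- Includes projection: False (the dual encoder's text projection layer is not included, so the outputs are the encoder's own hidden-size representations)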
Usage
import torch
from transformers import AutoModel, AutoTokenizer

# Load text encoder
model = AutoModel.from_pretrained("mjaliz/siglip-text-encoder")
tokenizer = AutoTokenizer.from_pretrained("mjaliz/siglip-text-encoder")

# Encode text
texts = ["Hello world", "How are you?"]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**inputs)

# Get embeddings (pooler output or mean of last hidden state)
if hasattr(outputs, "pooler_output") and outputs.pooler_output is not None:
    embeddings = outputs.pooler_output
else:
    embeddings = outputs.last_hidden_state.mean(dim=1)
print(embeddings.shape)  # (batch_size, hidden_size)
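The mean-pooling fallback above averages over every token position, padding included. The following is a minimal sketch of a padding-aware alternative, together with a pairwise cosine-similarity comparison of the pooled vectors. The helper names `masked_mean_pool` and `cosine_similarity_matrix` are illustrative and are not part of transformers or of this repository. The tensors are small stand-ins with the same layout as `outputs.last_hidden_state` and `inputs["attention_mask"]`, so the sketch runs without downloading the model.

```python
import torch
import torch.nn.functional as F


def masked_mean_pool(last_hidden_state: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors per sequence, ignoring padding positions."""
    # (batch, seq_len) -> (batch, seq_len, 1) so the mask broadcasts over the hidden dimension.
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    # Clamp guards against division by zero for a row that is all padding.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


def cosine_similarity_matrix(embeddings: torch.Tensor) -> torch.Tensor:
    """Return the (batch, batch) matrix of pairwise cosine similarities."""
    normalized = F.normalize(embeddings, p=2, dim=-1)
    return normalized @ normalized.T


# Stand-in tensors (batch=2, seq_len=3, hidden=4); hypothetical values, not model output.
hidden = torch.tensor([
    [[1.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]],  # last token is padding
    [[0.0, 2.0, 0.0, 0.0], [0.0, 4.0, 0.0, 0.0], [0.0, 6.0, 0.0, 0.0]],
])
mask = torch.tensor([[1, 1, 0], [1, 1, 1]])

pooled = masked_mean_pool(hidden, mask)
similarity = cosine_similarity_matrix(pooled)
print(pooled)
print(similarity)
```

With the real model, the same helper would be called as `masked_mean_pool(outputs.last_hidden_state, inputs["attention_mask"])`. Whether this pooling or the pooler output gives better embeddings for a given task is not stated in this card, so treat the choice as something to evaluate.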
Citation
If you use this model, please cite the original dual encoder model, mjaliz/vision-text-dual-encoder-v1.