Sentence Similarity

Tags: sentence-transformers · PyTorch · Transformers · English · t5 · text-embedding · embeddings · information-retrieval · beir · text-classification · language-model · text-clustering · text-semantic-similarity · text-evaluation · prompt-retrieval · text-reranking · feature-extraction

Datasets: natural_questions · ms_marco · fever · hotpot_qa · mteb

Eval Results (legacy)
Instructions to use hkunlp/instructor-base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

- Libraries
- sentence-transformers

How to use hkunlp/instructor-base with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("hkunlp/instructor-base")
sentences = [
    "That is a happy person",
    "That is a happy dog",
    "That is a very happy person",
    "Today is a sunny day",
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [4, 4]
```

- Transformers

How to use hkunlp/instructor-base with Transformers:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("hkunlp/instructor-base")
model = AutoModel.from_pretrained("hkunlp/instructor-base")
```

- Notebooks
- Google Colab
- Kaggle
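When going through the bare Transformers route above, the model returns token-level hidden states rather than one vector per sentence; a sentence embedding is commonly obtained by attention-mask-weighted mean pooling, and the similarity matrix printed in the sentence-transformers example is the matrix of cosine similarities between those vectors. A minimal NumPy sketch of that computation (random arrays stand in for real model outputs; this is an illustration, not the library's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for model outputs: 4 sentences, 6 tokens each, hidden size 8.
hidden_states = rng.normal(size=(4, 6, 8))
attention_mask = np.array([[1, 1, 1, 1, 0, 0]] * 4)  # 1 = real token, 0 = padding

# Mean pooling over real (non-padding) tokens only.
mask = attention_mask[:, :, None]
embeddings = (hidden_states * mask).sum(axis=1) / mask.sum(axis=1)

# Cosine similarity = dot product of L2-normalized embeddings.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
similarities = normed @ normed.T

print(similarities.shape)  # (4, 4)
```

The diagonal of the resulting matrix is 1.0, since every sentence is maximally similar to itself.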
fix typo (#1)
opened by tianbaoxiexxx
README.md CHANGED

```diff
@@ -11,7 +11,7 @@ tags:
 
 # hkunlp/instructor-base
 We introduce **Instructor**👨‍🏫, an instruction-finetuned text embedding model that can generate text embeddings tailored to any task (e.g., classification, retrieval, clustering, text evaluation, etc.) and domains (e.g., science, finance, etc.) ***by simply providing the task instruction, without any finetuning***. Instructor👨 achieves sota on 70 diverse embedding tasks!
-The model is easy to use with `sentence-transformer` library.
+The model is easy to use with the `sentence-transformer` library.
 
 ## Quick start
 <hr />
@@ -43,7 +43,7 @@ If you want to calculate customized embeddings for specific sentences, you may f
 Represent the `domain` `text_type` for `task_objective`; Input:
 * `domain` is optional, and it specifies the domain of the text, e.g., science, finance, medicine, etc.
 * `text_type` is required, and it specifies the encoding unit, e.g., sentence, document, paragraph, etc.
-* `task_objective` is optional, and it specifies the objective of
+* `task_objective` is optional, and it specifies the objective of embedding, e.g., retrieve a document, classify the sentence, etc.
 
 ## Calculate Sentence similarities
 You can further use the model to compute similarities between two groups of sentences, with **customized embeddings**.
```
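The instruction template corrected in this diff ("Represent the `domain` `text_type` for `task_objective`; Input:") can be sketched as a small formatting helper. The helper name below is hypothetical, not part of the model's API; the resulting string is what the model card pairs with the input text at encoding time:

```python
def build_instruction(text_type, domain=None, task_objective=None):
    """Build an INSTRUCTOR-style instruction string (hypothetical helper).

    Follows the template "Represent the `domain` `text_type` for
    `task_objective`; Input:", where `domain` and `task_objective`
    are optional and `text_type` is required.
    """
    parts = ["Represent the"]
    if domain:
        parts.append(domain)
    parts.append(text_type)
    if task_objective:
        parts.append(f"for {task_objective}")
    return " ".join(parts) + "; Input:"


print(build_instruction("document", domain="Science", task_objective="retrieval"))
# Represent the Science document for retrieval; Input:
```

Omitting the optional fields yields the shortest valid form, e.g. `build_instruction("sentence")` gives `"Represent the sentence; Input:"`.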