Sup-SimCSE-RoBERTa-Large (princeton-nlp)
Introduction
The Sup-SimCSE-RoBERTa-Large model is developed by Princeton NLP for feature extraction via sentence embeddings. It is a supervised SimCSE model based on the RoBERTa-large architecture, designed to improve performance on semantic textual similarity tasks.
Architecture
The model is built upon the RoBERTa-large architecture and fine-tuned with a contrastive learning objective that improves the quality of its sentence embeddings. Those embeddings are extracted as features, making the model well suited to semantic textual similarity and related applications.
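As a rough sketch of what such a contrastive objective looks like (following the supervised SimCSE formulation of Gao et al., 2021, but not the authors' exact implementation; the function name and temperature value below are illustrative):

```
import torch
import torch.nn.functional as F

def supervised_simcse_loss(h, h_pos, h_neg, temperature=0.05):
    """In-batch contrastive loss over (premise, positive, hard negative) embeddings.

    h, h_pos, h_neg: (batch, dim) sentence embeddings for each premise, its
    entailment hypothesis (positive), and its contradiction hypothesis (hard negative).
    """
    # Cosine similarity of every premise against every positive and every hard negative.
    sim_pos = F.cosine_similarity(h.unsqueeze(1), h_pos.unsqueeze(0), dim=-1)  # (B, B)
    sim_neg = F.cosine_similarity(h.unsqueeze(1), h_neg.unsqueeze(0), dim=-1)  # (B, B)
    logits = torch.cat([sim_pos, sim_neg], dim=1) / temperature                # (B, 2B)
    # Each premise should be most similar to its own positive (column i).
    labels = torch.arange(h.size(0), device=h.device)
    return F.cross_entropy(logits, labels)
```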
Training
Training Data
- Unsupervised SimCSE: Trained on 1 million (10^6) sentences randomly sampled from English Wikipedia.
- Supervised SimCSE: Trained on a combination of the MNLI and SNLI datasets, totaling 314k examples; entailment hypotheses serve as positives and contradiction hypotheses as hard negatives (see the sketch below).
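As an illustration of how such NLI triples can be assembled (a minimal sketch using the SNLI dataset from the Hugging Face datasets library, not the authors' preprocessing; label ids follow that dataset's schema):

```
from collections import defaultdict
from datasets import load_dataset

# SNLI label ids: 0 = entailment, 1 = neutral, 2 = contradiction (-1 = unlabeled).
snli = load_dataset("snli", split="train")

by_premise = defaultdict(dict)
for example in snli:
    if example["label"] == 0:
        by_premise[example["premise"]]["positive"] = example["hypothesis"]
    elif example["label"] == 2:
        by_premise[example["premise"]]["hard_negative"] = example["hypothesis"]

# Keep premises that have both an entailment (positive) and a contradiction (hard negative).
triples = [
    (premise, pair["positive"], pair["hard_negative"])
    for premise, pair in by_premise.items()
    if "positive" in pair and "hard_negative" in pair
]
```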
Training Procedure
Details on preprocessing, training speeds, sizes, and times are not specified. The model's evaluation employs a modified version of SentEval, focusing on semantic textual similarity tasks and reporting Spearman's correlation.
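For context, STS-style evaluation typically embeds each sentence pair, scores it by cosine similarity, and reports Spearman's correlation against human judgments. A minimal sketch of that scoring step (not the SentEval code itself; the embeddings are assumed to be precomputed arrays):

```
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(embeddings_a, embeddings_b, gold_scores):
    """embeddings_a, embeddings_b: (n, dim) arrays of sentence embeddings, one row
    per sentence pair; gold_scores: the human similarity judgments for those pairs."""
    # Cosine similarity for each pair of embeddings.
    a = embeddings_a / np.linalg.norm(embeddings_a, axis=1, keepdims=True)
    b = embeddings_b / np.linalg.norm(embeddings_b, axis=1, keepdims=True)
    cosine_scores = (a * b).sum(axis=1)
    # Spearman's rank correlation between model scores and human judgments.
    correlation, _ = spearmanr(cosine_scores, gold_scores)
    return correlation
```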
Environmental Impact
Carbon emission and environmental impact details are not specified. They can, however, be estimated using the Machine Learning Impact calculator of Lacoste et al. (2019).
Guide: Running Locally
To run the model locally, follow these steps:
- Install Transformers: Ensure you have the transformers library installed:

```
pip install transformers
```

- Import and Load Model (a short usage sketch follows these steps):

```
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
model = AutoModel.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
```

- Use Cloud GPUs: For optimal performance, consider using cloud-based GPU services such as AWS EC2, Google Cloud Platform, or Azure.
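Once the tokenizer and model are loaded, a minimal usage sketch for comparing two sentences (the example sentences are illustrative, and the pooler output is used here as the sentence embedding; the SimCSE repository documents the authors' recommended pooling):

```
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
model = AutoModel.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
model.eval()

sentences = ["A man is playing a guitar.", "Someone is playing an instrument."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    # The [CLS] pooler output is taken as the sentence embedding in this sketch.
    embeddings = outputs.pooler_output

similarity = torch.nn.functional.cosine_similarity(
    embeddings[0].unsqueeze(0), embeddings[1].unsqueeze(0)
).item()
print(f"Cosine similarity: {similarity:.4f}")
```

A higher cosine similarity indicates that the model judges the two sentences to be more semantically similar.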
License
The licensing information for the Sup-SimCSE-RoBERTa-Large model is not explicitly provided. Users should check the Hugging Face model card or associated GitHub repository for detailed licensing terms.