Sup-SimCSE-RoBERTa-Large (princeton-nlp)
Introduction
The Sup-SimCSE-RoBERTa-Large model is developed by Princeton NLP for feature extraction via sentence embeddings. It is a supervised SimCSE model based on the RoBERTa-large architecture, designed to improve performance on semantic textual similarity tasks.
Architecture
The model is built upon the RoBERTa-large architecture and fine-tuned with a contrastive learning objective that improves the quality of its sentence embeddings. Those embeddings are extracted as features, making the model well suited to semantic textual similarity and related applications.
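As a rough sketch of what such a contrastive objective looks like (following the supervised SimCSE formulation of Gao et al., 2021, but not the authors' exact implementation; the function name and temperature value below are illustrative):

```
import torch
import torch.nn.functional as F

def supervised_simcse_loss(h, h_pos, h_neg, temperature=0.05):
    """In-batch contrastive loss over (premise, positive, hard negative) embeddings.

    h, h_pos, h_neg: (batch, dim) sentence embeddings for each premise, its
    entailment hypothesis (positive), and its contradiction hypothesis (hard negative).
    """
    # Cosine similarity of every premise against every positive and every hard negative.
    sim_pos = F.cosine_similarity(h.unsqueeze(1), h_pos.unsqueeze(0), dim=-1)  # (B, B)
    sim_neg = F.cosine_similarity(h.unsqueeze(1), h_neg.unsqueeze(0), dim=-1)  # (B, B)
    logits = torch.cat([sim_pos, sim_neg], dim=1) / temperature                # (B, 2B)
    # Each premise should be most similar to its own positive (column i).
    labels = torch.arange(h.size(0), device=h.device)
    return F.cross_entropy(logits, labels)
```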
Training
Training Data
- Unsupervised SimCSE: Trained on 1 million (10^6) sentences randomly sampled from English Wikipedia.
- Supervised SimCSE: Trained on a combination of the MNLI and SNLI datasets, totaling 314k examples; entailment hypotheses serve as positives and contradiction hypotheses as hard negatives (see the sketch below).
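As an illustration of how such NLI triples can be assembled (a minimal sketch using the SNLI dataset from the Hugging Face datasets library, not the authors' preprocessing; label ids follow that dataset's schema):

```
from collections import defaultdict
from datasets import load_dataset

# SNLI label ids: 0 = entailment, 1 = neutral, 2 = contradiction (-1 = unlabeled).
snli = load_dataset("snli", split="train")

by_premise = defaultdict(dict)
for example in snli:
    if example["label"] == 0:
        by_premise[example["premise"]]["positive"] = example["hypothesis"]
    elif example["label"] == 2:
        by_premise[example["premise"]]["hard_negative"] = example["hypothesis"]

# Keep premises that have both an entailment (positive) and a contradiction (hard negative).
triples = [
    (premise, pair["positive"], pair["hard_negative"])
    for premise, pair in by_premise.items()
    if "positive" in pair and "hard_negative" in pair
]
```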
Training Procedure
Details on preprocessing, training speeds, sizes, and times are not specified. The model's evaluation employs a modified version of SentEval, focusing on semantic textual similarity tasks and reporting Spearman's correlation.
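For context, STS-style evaluation typically embeds each sentence pair, scores it by cosine similarity, and reports Spearman's correlation against human judgments. A minimal sketch of that scoring step (not the SentEval code itself; the embeddings are assumed to be precomputed arrays):

```
import numpy as np
from scipy.stats import spearmanr

def sts_spearman(embeddings_a, embeddings_b, gold_scores):
    """embeddings_a, embeddings_b: (n, dim) arrays of sentence embeddings, one row
    per sentence pair; gold_scores: the human similarity judgments for those pairs."""
    # Cosine similarity for each pair of embeddings.
    a = embeddings_a / np.linalg.norm(embeddings_a, axis=1, keepdims=True)
    b = embeddings_b / np.linalg.norm(embeddings_b, axis=1, keepdims=True)
    cosine_scores = (a * b).sum(axis=1)
    # Spearman's rank correlation between model scores and human judgments.
    correlation, _ = spearmanr(cosine_scores, gold_scores)
    return correlation
```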
Environmental Impact
Carbon emission and environmental impact details are not specified. They can, however, be estimated using the Machine Learning Impact calculator of Lacoste et al. (2019).
Guide: Running Locally
To run the model locally, follow these steps:
- Install Transformers: Ensure you have the transformers library installed:

```
pip install transformers
```

- Import and Load Model (a short usage sketch follows these steps):

```
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
model = AutoModel.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
```

- Use Cloud GPUs: For optimal performance, consider using cloud-based GPU services such as AWS EC2, Google Cloud Platform, or Azure.
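Once the tokenizer and model are loaded, a minimal usage sketch for comparing two sentences (the example sentences are illustrative, and the pooler output is used here as the sentence embedding; the SimCSE repository documents the authors' recommended pooling):

```
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
model = AutoModel.from_pretrained("princeton-nlp/sup-simcse-roberta-large")
model.eval()

sentences = ["A man is playing a guitar.", "Someone is playing an instrument."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)
    # The [CLS] pooler output is taken as the sentence embedding in this sketch.
    embeddings = outputs.pooler_output

similarity = torch.nn.functional.cosine_similarity(
    embeddings[0].unsqueeze(0), embeddings[1].unsqueeze(0)
).item()
print(f"Cosine similarity: {similarity:.4f}")
```

A higher cosine similarity indicates that the model judges the two sentences to be more semantically similar.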
License
The licensing information for the Sup-SimCSE-RoBERTa-Large model is not explicitly provided. Users should check the Hugging Face model card or associated GitHub repository for detailed licensing terms.