roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli

ynie

Introduction

The model is a robust, pre-trained RoBERTa-Large Natural Language Inference (NLI) model. It is trained on a combination of well-known NLI datasets, SNLI, MNLI, FEVER-NLI, and ANLI (rounds R1, R2, R3), to strengthen its inference capabilities. The model was developed by Yixin Nie and is released under the MIT license.

Architecture

The model is based on the RoBERTa architecture, an optimized variant of BERT that improves downstream performance by pre-training on more data with larger batches. Checkpoints trained on the same NLI data are also available for other architectures, such as ALBERT, BART, ELECTRA, and XLNet; a sketch of swapping one in is shown below.
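
The alternative checkpoints were released by the same author on the same data mixture. The following is a minimal sketch of loading one of them; the exact repository name used here is an assumption and should be verified on the Hugging Face Hub.

    # Minimal sketch: loading an alternative checkpoint trained on the same
    # SNLI/MNLI/FEVER-NLI/ANLI mixture. The repository name below is an assumed
    # example; verify the exact name on the Hugging Face Hub before use.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    alt_checkpoint = "ynie/albert-xxlarge-v2-snli_mnli_fever_anli_R1_R2_R3-nli"  # assumed name
    tokenizer = AutoTokenizer.from_pretrained(alt_checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(alt_checkpoint)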

Training

The training process involves a diverse set of NLI datasets:

  • SNLI
  • MNLI
  • FEVER-NLI
  • ANLI (Rounds 1, 2, 3)

Together these datasets provide a large and diverse pool of premise–hypothesis pairs, which helps the model reason about natural language robustly across benchmarks. The model is available for both PyTorch and JAX through the Transformers library.
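
The public parts of this training mixture can be inspected with the Hugging Face datasets library. The sketch below assumes the standard Hub dataset names snli, multi_nli, and anli; FEVER-NLI is omitted because it has no single canonical Hub entry.

    # Minimal sketch: loading the public NLI datasets from the training mixture.
    # The Hub names "snli", "multi_nli", and "anli" are the standard ones; the
    # FEVER-NLI conversion used by the author is not reproduced here.
    from datasets import load_dataset

    snli = load_dataset("snli", split="train")
    mnli = load_dataset("multi_nli", split="train")
    anli = load_dataset("anli")  # ANLI rounds appear as train_r1, train_r2, train_r3 splits

    print(len(snli), len(mnli), {split: len(rows) for split, rows in anli.items()})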

Guide: Running Locally

To run the model locally, ensure you have Python installed along with the necessary libraries such as PyTorch and Transformers. Here's a basic setup guide:

  1. Install Libraries:

    pip install torch transformers
    
  2. Setup Script:

    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    import torch
    
    if __name__ == '__main__':
        max_length = 256
        premise = "Two women are embracing while holding to go packages."
        hypothesis = "The men are fighting outside a deli."
    
        # Load the tokenizer and model from the Hugging Face Hub.
        hg_model_hub_name = "ynie/roberta-large-snli_mnli_fever_anli_R1_R2_R3-nli"
        tokenizer = AutoTokenizer.from_pretrained(hg_model_hub_name)
        model = AutoModelForSequenceClassification.from_pretrained(hg_model_hub_name)
    
        # Encode the premise/hypothesis pair as a single input sequence.
        tokenized_input_seq_pair = tokenizer.encode_plus(premise, hypothesis,
                                                         max_length=max_length,
                                                         return_token_type_ids=True,
                                                         truncation=True)
        input_ids = torch.tensor(tokenized_input_seq_pair['input_ids']).long().unsqueeze(0)
        token_type_ids = torch.tensor(tokenized_input_seq_pair['token_type_ids']).long().unsqueeze(0)
        attention_mask = torch.tensor(tokenized_input_seq_pair['attention_mask']).long().unsqueeze(0)
    
        # Run the model and convert logits to class probabilities.
        outputs = model(input_ids,
                        attention_mask=attention_mask,
                        token_type_ids=token_type_ids)
        predicted_probability = torch.softmax(outputs[0], dim=1)[0].tolist()
    
        # Label order for this checkpoint: 0 = entailment, 1 = neutral, 2 = contradiction.
        print("Entailment:", predicted_probability[0])
        print("Neutral:", predicted_probability[1])
        print("Contradiction:", predicted_probability[2])
    
  3. Run the Script: Execute your script in a Python environment.

  4. Use a Cloud GPU (optional): RoBERTa-Large runs fine on CPU for single examples, but for batch inference or lower latency consider a cloud GPU service such as AWS, GCP, or Azure; see the sketch after this list.
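
If a GPU is available, moving the model and the encoded inputs onto it speeds up inference. A minimal sketch, reusing the variables defined in the setup script above:

    # Minimal sketch of GPU inference, reusing model, input_ids, attention_mask and
    # token_type_ids from the setup script above; falls back to CPU if no GPU is found.
    import torch

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    model.eval()

    with torch.no_grad():
        outputs = model(input_ids.to(device),
                        attention_mask=attention_mask.to(device),
                        token_type_ids=token_type_ids.to(device))
    predicted_probability = torch.softmax(outputs[0], dim=1)[0].tolist()
    print("Entailment, Neutral, Contradiction:", predicted_probability)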

License

The model is distributed under the MIT license, which allows for open-source usage with minimal restrictions.
