rubert base cased nli threeway

cointegrated

Introduction

rubert-base-cased-nli-threeway is a model published by cointegrated and based on DeepPavlov's RuBERT. It is fine-tuned for Natural Language Inference (NLI): given a pair of Russian texts, it predicts whether the logical relationship between them is entailment, contradiction, or neutral.

Architecture

The model uses the BERT architecture, specifically RuBERT, a BERT variant pretrained for Russian. A sequence classification head on top of the encoder outputs one score for each of the three NLI classes.
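
To make the three-way output concrete, here is a minimal sketch of how three raw scores (logits) become label probabilities. It uses only the standard library. The logit values are made up, and the label order in ID2LABEL is an assumption; in practice, read the order from model.config.id2label.

```python
import math

# Assumed label order; in practice read it from model.config.id2label.
ID2LABEL = {0: "entailment", 1: "contradiction", 2: "neutral"}

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_probabilities(logits, id2label=ID2LABEL):
    """Map one three-element logit vector to {label: probability}."""
    probs = softmax(logits)
    return {id2label[i]: p for i, p in enumerate(probs)}

# Hypothetical logits for a single premise/hypothesis pair.
scores = label_probabilities([-1.2, 3.4, -0.7])
prediction = max(scores, key=scores.get)
print(prediction, scores)
```

The predicted class is simply the label with the highest probability.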

Training

The model was fine-tuned on NLI datasets translated from English into Russian, including JOCI, MNLI, MPE, SICK, and SNLI. The training objective is to predict entailment, contradiction, or neutral for each text pair. Performance is reported as ROC AUC scores across several datasets.
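
For readers unfamiliar with ROC AUC in a three-class setting, the sketch below computes it for one class against the other two (one-vs-rest). The source does not say exactly how its scores were calculated, so the one-vs-rest setup is an assumption. The predictions and gold labels are invented for illustration, and the function names are hypothetical.

```python
def roc_auc(scores, is_positive):
    """ROC AUC as the probability that a positive outranks a negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def one_vs_rest_auc(probabilities, gold_labels, target):
    """AUC for one class: `target` is positive, the other classes are negative."""
    scores = [p[target] for p in probabilities]
    is_positive = [g == target for g in gold_labels]
    return roc_auc(scores, is_positive)

# Made-up predictions and gold labels, for illustration only.
probabilities = [
    {"entailment": 0.90, "contradiction": 0.05, "neutral": 0.05},
    {"entailment": 0.20, "contradiction": 0.70, "neutral": 0.10},
    {"entailment": 0.60, "contradiction": 0.10, "neutral": 0.30},
    {"entailment": 0.30, "contradiction": 0.30, "neutral": 0.40},
]
gold_labels = ["entailment", "contradiction", "neutral", "entailment"]
entailment_auc = one_vs_rest_auc(probabilities, gold_labels, "entailment")
print(entailment_auc)
```

The pairwise loop is quadratic in the number of examples, which is fine for a demonstration. For real evaluation sets, a library implementation is the better choice.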

Guide: Running Locally

  1. Install Dependencies: Ensure Python is installed, then use pip to install the required libraries (PyTorch is included because the code below imports torch):

    pip install transformers sentencepiece torch --quiet
    
  2. Load the Model and Tokenizer:

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    
    model_checkpoint = 'cointegrated/rubert-base-cased-nli-threeway'
    tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(model_checkpoint)
    if torch.cuda.is_available():
        model.cuda()
    
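  3. Inference: Tokenize the two texts as a pair, run them through the model, and apply softmax to the logits to get one probability per class:

    # Premise: "Socrates is a man, and all men are mortal."
    text1 = 'Сократ - человек, а все люди смертны.'
    # Hypothesis: "Socrates will never die."
    text2 = 'Сократ никогда не умрёт.'
    with torch.inference_mode():
        # Encode the pair together and move the tensors to the model's device
        out = model(**tokenizer(text1, text2, return_tensors='pt').to(model.device))
        # Convert logits to probabilities and take the first (only) row
        proba = torch.softmax(out.logits, -1).cpu().numpy()[0]
    # Map each class index to its label name
    print({v: proba[k] for k, v in model.config.id2label.items()})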
    
  4. GPU Acceleration: The loading code in step 2 moves the model to a GPU when CUDA is available. If you have no local GPU, cloud services such as AWS, Google Cloud, or Microsoft Azure offer GPU instances.
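
Steps 2 and 3 can be wrapped into a helper that scores many text pairs in batches, which is usually where a GPU pays off. This is a sketch, not part of the original model card: the names batched and predict_pairs and the default batch size of 16 are hypothetical, and model and tokenizer are assumed to be the objects loaded in step 2.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def predict_pairs(model, tokenizer, pairs, batch_size=16):
    """Return one {label: probability} dict per (premise, hypothesis) pair."""
    import torch  # imported here so `batched` stays usable without PyTorch

    results = []
    for batch in batched(pairs, batch_size):
        premises = [premise for premise, _ in batch]
        hypotheses = [hypothesis for _, hypothesis in batch]
        # Pad to the longest pair in the batch and truncate overlong inputs.
        inputs = tokenizer(premises, hypotheses, return_tensors="pt",
                           padding=True, truncation=True).to(model.device)
        with torch.inference_mode():
            probs = torch.softmax(model(**inputs).logits, dim=-1).cpu().tolist()
        for row in probs:
            results.append({label: row[idx]
                            for idx, label in model.config.id2label.items()})
    return results
```

With the variables from steps 2 and 3, a call such as predict_pairs(model, tokenizer, [(text1, text2)]) would return a one-element list of label probabilities.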

License

Use of this model is subject to the license terms stated by its creators in the model's Hugging Face repository. Comply with those terms when using or distributing the model.
