DeBERTa-base-MNLI

Microsoft

Introduction

DeBERTa (Decoding-enhanced BERT with Disentangled Attention) is a transformer model developed by Microsoft. It improves on BERT and RoBERTa with a disentangled attention mechanism and an enhanced mask decoder, leading to superior performance on natural language understanding (NLU) tasks. DeBERTa-base-MNLI is the base model fine-tuned on the MNLI (Multi-Genre Natural Language Inference) task.

Architecture

DeBERTa's attention mechanism disentangles the content and position information of each word, representing them with separate vectors and computing attention weights from both. This allows the model to better capture the relationships between words in a sentence, improving its effectiveness on a variety of NLU tasks. Pre-trained on 80GB of data, DeBERTa achieves state-of-the-art results on NLU benchmarks.
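
To make this concrete, below is a minimal, illustrative sketch of disentangled attention scoring. It is a simplification, not Microsoft's implementation: the function name and shapes are chosen for the example, and batching, multiple heads, and masking are omitted. The attention score sums content-to-content, content-to-position, and position-to-content terms, indexed by the clipped relative distance between tokens:

    import torch
    import torch.nn.functional as F

    def disentangled_attention(content, rel_pos_emb, Wq, Wk, Wq_r, Wk_r, k=2):
        """content: (seq, d) token states; rel_pos_emb: (2k+1, d) embeddings
        for relative distances clipped to [-k, k]."""
        seq, d = content.shape
        Qc, Kc = content @ Wq, content @ Wk              # content projections
        Qr, Kr = rel_pos_emb @ Wq_r, rel_pos_emb @ Wk_r  # position projections

        # delta[i, j] = relative distance i - j, clipped and shifted to [0, 2k]
        idx = torch.arange(seq)
        delta = (idx[:, None] - idx[None, :]).clamp(-k, k) + k

        c2c = Qc @ Kc.T                       # content-to-content
        c2p = (Qc @ Kr.T).gather(1, delta)    # content-to-position
        p2c = (Kc @ Qr.T).gather(1, delta).T  # position-to-content

        scores = (c2c + c2p + p2c) / (3 * d) ** 0.5  # scale by sqrt(3d)
        return F.softmax(scores, dim=-1)

    # Toy usage: 6 tokens, d=16, relative distances clipped to [-2, 2]
    d = 16
    attn = disentangled_attention(torch.randn(6, d), torch.randn(5, d),
                                  *(torch.randn(d, d) for _ in range(4)))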

Performance

DeBERTa has been evaluated on several benchmark datasets, including SQuAD 1.1/2.0 and MNLI. On the dev sets, the base version of DeBERTa outperforms comparable models such as RoBERTa-base and XLNet-base:

  • SQuAD 1.1 (F1/EM): 93.1/87.2
  • SQuAD 2.0 (F1/EM): 86.2/83.1
  • MNLI-m (accuracy): 88.8

Guide: Running Locally

To run DeBERTa-base-MNLI locally:

  1. Install Dependencies: Ensure you have Python and the PyTorch library installed. Additionally, install the transformers library from Hugging Face.
    pip install transformers torch
    
  2. Load the Model: Use the following snippet to load the tokenizer and model in your Python environment.
    from transformers import DebertaTokenizer, DebertaForSequenceClassification
    
    tokenizer = DebertaTokenizer.from_pretrained('microsoft/deberta-base-mnli')
    model = DebertaForSequenceClassification.from_pretrained('microsoft/deberta-base-mnli')
    
  3. Inference: Pass the premise and hypothesis as a text pair (so the separator and other special tokens are inserted correctly) and map the predicted class ID to its label; a one-step alternative using the pipeline API is sketched after this list.
    inputs = tokenizer("I love you.", "I like you.", return_tensors="pt")  # premise, hypothesis
    outputs = model(**inputs)
    label_id = outputs.logits.argmax(dim=-1).item()
    print(model.config.id2label[label_id])  # CONTRADICTION, NEUTRAL, or ENTAILMENT
    
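Alternatively, the transformers pipeline API wraps these steps in a couple of lines. A brief sketch using the same checkpoint, with the premise and hypothesis passed as a text pair:

    from transformers import pipeline

    # Text-classification pipeline over the same checkpoint.
    nli = pipeline("text-classification", model="microsoft/deberta-base-mnli")
    print(nli({"text": "I love you.", "text_pair": "I like you."}))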

For more efficient execution, especially with large datasets or models, consider using cloud-based GPUs such as those provided by AWS, Google Cloud Platform, or Microsoft Azure.
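
For example, on a machine with a CUDA-capable GPU, the snippets above can be moved to the device. A short sketch that reuses the model and tokenizer loaded earlier:

    import torch

    # Run inference on the GPU when one is available.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    inputs = tokenizer("I love you.", "I like you.", return_tensors="pt").to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    print(model.config.id2label[logits.argmax(dim=-1).item()])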

License

DeBERTa is released under the MIT License, allowing for wide usage and modification. Ensure compliance with the terms of this license when using or distributing the model.
