
GAIR

RST-WORD-SENSE-DISAMBIGUATION-11B

Introduction

RST-WORD-SENSE-DISAMBIGUATION-11B is part of the RST (reStructured Pre-training) paradigm, which unifies data signals from many sources to enhance language pre-training. RST models demonstrate strong performance across diverse NLP tasks and standardized examinations, notably China's National College Entrance Examination (Gaokao).

Architecture

The model has 11 billion parameters and performs text-to-text generation, distributed for use with the Hugging Face Transformers library and PyTorch. It supports applications such as word sense disambiguation, part-of-speech tagging, and other general information extraction tasks. Because it integrates signals from numerous sources such as WordNet and Wikipedia, it is applicable to a wide range of NLP scenarios.
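
As a rough illustration of this text-to-text interface, the sketch below phrases a word sense disambiguation query in the same TEXT:/QUERY: prompt style used in the guide further down. The wording of the query is an assumption made for illustration; the exact template the model was trained on is defined by the RST data and may differ.

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    
    tokenizer = AutoTokenizer.from_pretrained("XLab/rst-all-11b")
    model = AutoModelForSeq2SeqLM.from_pretrained("XLab/rst-all-11b")
    
    # Hypothetical word-sense-disambiguation prompt; the canonical RST template may differ
    prompt = ('TEXT: He sat on the bank of the river and watched the water flow. '
              'QUERY: What is the meaning of the word "bank" in the above text?')
    inputs = tokenizer.encode(prompt, return_tensors="pt")
    outputs = model.generate(inputs)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))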

Training

The RST model is pre-trained with a data-centric approach that emphasizes structured data storage and access: signals are stored as JSON rather than plain text for efficient caching and retrieval. The pre-training data comprises 29 signal types with over 46 million samples, mined from sources such as Rotten Tomatoes, Wikipedia, and WordNet.
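
To make the data-centric storage concrete, the following sketch shows how one restructured signal might be serialized as JSON. The field names and record layout are illustrative assumptions, not the actual RST schema.

    import json
    
    # Made-up example of a single restructured signal stored as JSON;
    # field names are illustrative, not the actual RST schema.
    sample = {
        "signal": "word_meaning",
        "source": "WordNet",
        "text": "He deposited the check at the bank.",
        "query": 'What is the meaning of the word "bank" in the above text?',
        "answer": "a financial institution that accepts deposits",
    }
    print(json.dumps(sample, indent=2))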

Guide: Running Locally

To run the RST model locally, follow these steps:

  1. Install the Transformers library and PyTorch:

    pip install transformers torch
    
  2. Load the model and tokenizer:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    
    # Downloads the tokenizer and the 11B-parameter checkpoint (the weights are tens of gigabytes)
    tokenizer = AutoTokenizer.from_pretrained("XLab/rst-all-11b")
    model = AutoModelForSeq2SeqLM.from_pretrained("XLab/rst-all-11b")
    
  3. Encode inputs and generate outputs:

    # The prompt pairs an input TEXT with a natural-language QUERY describing the task
    inputs = tokenizer.encode("TEXT: this is the best cast iron skillet you will ever buy. QUERY: Is this review \"positive\" or \"negative\"", return_tensors="pt")
    outputs = model.generate(inputs)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
    

Given the model's 11 billion parameters, consider using cloud GPU instances from providers such as AWS, GCP, or Azure for efficient training and inference.
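
If a local or cloud GPU is available, a memory-conscious variant of the snippet above is sketched below. It assumes the accelerate package is installed (needed for device_map="auto") and loads the weights in half precision, which roughly halves memory use.

    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    
    tokenizer = AutoTokenizer.from_pretrained("XLab/rst-all-11b")
    # Half-precision weights, automatically placed on available devices;
    # requires `pip install accelerate`
    model = AutoModelForSeq2SeqLM.from_pretrained(
        "XLab/rst-all-11b",
        torch_dtype=torch.float16,
        device_map="auto",
    )
    
    inputs = tokenizer.encode(
        "TEXT: this is the best cast iron skillet you will ever buy. "
        "QUERY: Is this review \"positive\" or \"negative\"",
        return_tensors="pt",
    ).to(model.device)
    outputs = model.generate(inputs)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))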

License

This model is licensed under the Academic Free License v3.0 (AFL-3.0).
