GAIR/rst-word-sense-disambiguation-11b
Introduction
RST-WORD-SENSE-DISAMBIGUATION-11B is part of the RST (reStructured Pre-training) paradigm, which unifies data signals from many sources into a single pre-training framework. Models trained this way demonstrate strong performance across diverse NLP tasks and examinations, notably China's National College Entrance Examination (Gaokao).
Architecture
The model has 11 billion parameters and performs text-to-text generation using the Transformers library and PyTorch. It supports word sense disambiguation, part-of-speech tagging, and other general information extraction tasks, and it integrates signals from datasets such as WordNet and Wikipedia, which makes it applicable to a broad range of NLP scenarios.
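As a rough illustration of this text-to-text framing (the prompt wording below is an assumption for exposition, not the documented template for this checkpoint), each task reduces to an input string and a target string:

# Illustrative only: these prompt/target pairs are assumptions showing how
# tasks can be cast as text-to-text generation; consult the model card for
# the exact templates used during restructured pre-training.
wsd_example = {
    "input": 'TEXT: He sat on the bank of the river. '
             'QUERY: What is the sense of the word "bank"?',
    "target": "sloping land beside a body of water",
}

pos_example = {
    "input": 'TEXT: He sat on the bank of the river. '
             'QUERY: What is the part of speech of the word "bank"?',
    "target": "noun",
}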
Training
The RST model is pre-trained using a data-centric approach, emphasizing the importance of structured data storage and access. It leverages JSON data instead of plain text for efficient caching and accessibility. The dataset used comprises 29 signal types with over 46 million samples, sourced from platforms like Rotten Tomatoes, Wikipedia, and WordNet.
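A minimal sketch of what such JSON-based signal storage might look like (the field names and file name here are hypothetical, not the project's actual schema):

# Hypothetical record layout for one restructured training signal; the real
# schema used by RST may differ.
import json

record = {
    "signal_type": "sentiment_classification",    # one of the signal types
    "source": "Rotten Tomatoes",                  # originating data platform
    "input": "TEXT: this is the best cast iron skillet you will ever buy. "
             'QUERY: Is this review "positive" or "negative"',
    "target": "positive",
}

# Storing signals as JSON lines keeps them structured, cacheable, and easy
# to reload without re-parsing raw text.
with open("signals.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")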
Guide: Running Locally
To run the RST model locally, follow these steps:
- Install the Transformers library:

  pip install transformers
- Load the model and tokenizer:

  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

  tokenizer = AutoTokenizer.from_pretrained("XLab/rst-all-11b")
  model = AutoModelForSeq2SeqLM.from_pretrained("XLab/rst-all-11b")
- Encode inputs and generate outputs:

  inputs = tokenizer.encode("TEXT: this is the best cast iron skillet you will ever buy. QUERY: Is this review \"positive\" or \"negative\"", return_tensors="pt")
  outputs = model.generate(inputs)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))
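The query above treats sentiment classification as generation; a word sense disambiguation query can be phrased the same way, reusing the tokenizer and model loaded in the previous step. The prompt wording below is an assumption for illustration rather than the checkpoint's documented template:

# Assumed prompt format for a word sense disambiguation query.
wsd_prompt = ('TEXT: He sat on the bank of the river. '
              'QUERY: What is the sense of the word "bank"?')
inputs = tokenizer.encode(wsd_prompt, return_tensors="pt")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True, clean_up_tokenization_spaces=True))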
Given the model's 11-billion-parameter size, consider running it on cloud GPUs from providers such as AWS, GCP, or Azure for efficient training and inference.
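A minimal sketch of loading the checkpoint in half precision on a GPU to reduce memory use (standard Transformers/PyTorch options; adjust to your hardware):

# Load the 11B checkpoint in float16 on a CUDA GPU to roughly halve memory use.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("XLab/rst-all-11b")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "XLab/rst-all-11b", torch_dtype=torch.float16
).to("cuda")

inputs = tokenizer.encode(
    "TEXT: this is the best cast iron skillet you will ever buy. "
    'QUERY: Is this review "positive" or "negative"',
    return_tensors="pt",
).to("cuda")
outputs = model.generate(inputs)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))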
License
This model is licensed under the Academic Free License v3.0 (AFL-3.0).