BLEURT-TINY-512 (Elron)
Introduction
BLEURT-TINY-512 is a PyTorch implementation of the original BLEURT models, exposed through the Transformers text classification interface. It was developed by Elron Bandel based on the original BLEURT work by Thibault Sellam, Dipanjan Das, and Ankur P. Parikh of Google Research. The model builds on the BERT architecture and is available on Hugging Face, where it can be used for various text classification applications, most notably scoring candidate sentences against references.
Architecture
The BLEURT-TINY-512 model uses a compact BERT encoder with a sequence classification (regression) head, fine-tuned to produce quality scores for natural language generation evaluation. The precise architectural hyperparameters and computational requirements are not fully detailed in the documentation.
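Since the exact hyperparameters are not documented, one way to reason about the model's size is to assume the standard BERT-tiny configuration (2 layers, hidden size 128, ~30k-token vocabulary) implied by the "tiny" naming convention. The sketch below is purely illustrative under that assumption and is not a documented fact about BLEURT-TINY-512:

```python
# Rough parameter-count estimate for a BERT-tiny-style encoder.
# The layer count (2), hidden size (128), vocab size (30522), and max
# position count (512) are ASSUMPTIONS taken from the standard BERT-tiny
# configuration, not documented properties of BLEURT-TINY-512.
def bert_param_estimate(layers=2, hidden=128, vocab=30522, max_pos=512):
    # Token, position, and segment (2 types) embedding tables.
    embeddings = (vocab + max_pos + 2) * hidden
    per_layer = (
        4 * hidden * hidden        # Q, K, V, and attention output projections
        + 2 * hidden * 4 * hidden  # feed-forward up/down projections
    )
    return embeddings + layers * per_layer

print(bert_param_estimate())  # 4365824, i.e. roughly 4.4M weights
```

Biases and layer-norm parameters are omitted, so the true count would be slightly higher; the estimate mainly shows that the embedding table dominates at this scale.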
Training
Training Data
The model was trained on data from the WMT Metrics Shared Task, specifically to-English language pairs from 2017 to 2019. The dataset consists of sentence pairs with human quality ratings: 5,360 records for 2017, 9,492 for 2018, and 147,691 for 2019.
Training Procedure
While specific preprocessing steps and training details are not documented, BLEURT models are trained to learn robust metrics for text generation evaluation. The model has been evaluated on the 2018 and 2019 test sets, which are noted to be noisier.
Guide: Running Locally
To run BLEURT-TINY-512 locally, follow these steps:
- Install Dependencies: Ensure you have Python and PyTorch installed. You will also need the Transformers library.
pip install transformers torch
- Load the Model and Tokenizer:

from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("Elron/bleurt-tiny-512")
model = AutoModelForSequenceClassification.from_pretrained("Elron/bleurt-tiny-512")
model.eval()
- Prepare Inputs and Run Inference:

references = ["hello world", "hello world"]
candidates = ["hi universe", "bye world"]

with torch.no_grad():
    scores = model(**tokenizer(references, candidates, return_tensors='pt'))[0].squeeze()

print(scores)  # Example output: tensor([-0.9414, -0.5678])
- Optional: To leverage cloud resources, consider services such as AWS EC2, Google Cloud Platform, or Azure for access to GPUs.
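A higher (less negative) BLEURT score indicates closer agreement with the reference, so once scores are computed they can be used to rank candidates. The helper below is an illustrative sketch, not part of the BLEURT or Transformers APIs; the score values are taken from the example output above:

```python
# Given parallel lists of candidate sentences and their BLEURT scores,
# return the candidate with the highest score. `pick_best` is a
# hypothetical helper name introduced here for illustration.
def pick_best(candidates, scores):
    best_idx = max(range(len(scores)), key=lambda i: scores[i])
    return candidates[best_idx]

# Scores from the inference example above: "bye world" scores higher
# against the reference "hello world" than "hi universe" does.
print(pick_best(["hi universe", "bye world"], [-0.9414, -0.5678]))  # bye world
```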
License
The license details for BLEURT-TINY-512 are not explicitly mentioned in the documentation. Users should refer to the repository or contact the model authors for more information regarding licensing and usage permissions.