kornosk/bert-political-election2020-twitter-mlm
Introduction
The BERT-POLITICAL-ELECTION2020-TWITTER-MLM model is a pre-trained language model designed for stance detection in the context of the 2020 US Presidential Election. Developed as part of research presented at NAACL 2021, this model builds on BERT-base (uncased) and is trained on a corpus of over 5 million tweets related to the election.
Architecture
The model uses the BERT architecture, specifically the BERT-base variant, which is uncased. It is designed to perform masked language modeling (MLM), a common pre-training task for transformer-based models.
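To make the MLM setup concrete, the sketch below (a rough illustration, not taken from the model card) masks a single token in a hypothetical sentence and reads off the MLM head's top prediction for that position.

    import torch
    from transformers import BertTokenizer, BertForMaskedLM

    # Sketch only: load this model's tokenizer and MLM head
    model_id = "kornosk/bert-political-election2020-twitter-mlm"
    tokenizer = BertTokenizer.from_pretrained(model_id)
    model = BertForMaskedLM.from_pretrained(model_id)
    model.eval()

    # Hypothetical masked input; the MLM head predicts the token behind [MASK]
    text = "The election was held in [MASK]."
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Locate the [MASK] position and take the highest-scoring vocabulary entry
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    predicted_id = logits[0, mask_pos].argmax(dim=-1)
    print(tokenizer.decode(predicted_id))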
Training
The model was pre-trained with a standard MLM objective on a dataset of more than 5 million English tweets about the 2020 US Presidential Election. This pre-training prepares the model to be fine-tuned for downstream tasks such as text classification.
Guide: Running Locally
To run the BERT-POLITICAL-ELECTION2020-TWITTER-MLM model locally, follow these steps:
- Set Up Environment: Ensure you have Python and PyTorch installed. For best performance, use a GPU, either locally or through a cloud provider such as AWS, Google Cloud, or Azure.
- Install Transformers Library:

    pip install transformers torch
- Load the Model:

    from transformers import BertTokenizer, BertForMaskedLM, pipeline
    import torch

    # Choose GPU if available (pipeline expects a device index: 0 = first GPU, -1 = CPU)
    device = 0 if torch.cuda.is_available() else -1

    # Load the pre-trained tokenizer and model
    pretrained_LM_path = "kornosk/bert-political-election2020-twitter-mlm"
    tokenizer = BertTokenizer.from_pretrained(pretrained_LM_path)
    model = BertForMaskedLM.from_pretrained(pretrained_LM_path)

    # Fill-mask example
    example = "Trump is the [MASK] of USA"
    fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer, device=device)
    outputs = fill_mask(example)
    print(outputs)
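The pipeline returns a ranked list of candidate fills, each containing the predicted token, the completed sequence, and a score, so the top entries show which words the election-domain model considers most likely at the masked position.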
- Fine-Tune for Downstream Tasks: Use the model for specific tasks such as text classification by training it on your own labeled dataset (a minimal sketch follows this list).
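As one possible starting point for the fine-tuning step, the sketch below trains the checkpoint for binary stance classification with the Hugging Face Trainer API. The two example tweets, the label scheme, and the hyperparameters are illustrative placeholders, not part of the original model card.

    import torch
    from transformers import (BertTokenizer, BertForSequenceClassification,
                              Trainer, TrainingArguments)

    # Hypothetical labeled stance data: replace with your own tweets and labels
    texts = ["I will vote for him again", "This candidate has to go"]
    labels = [1, 0]  # e.g. 1 = favor, 0 = against

    model_id = "kornosk/bert-political-election2020-twitter-mlm"
    tokenizer = BertTokenizer.from_pretrained(model_id)
    # The classification head is newly initialized; only the encoder weights are pre-trained
    model = BertForSequenceClassification.from_pretrained(model_id, num_labels=2)

    class StanceDataset(torch.utils.data.Dataset):
        def __init__(self, texts, labels):
            self.encodings = tokenizer(texts, truncation=True, padding=True)
            self.labels = labels
        def __len__(self):
            return len(self.labels)
        def __getitem__(self, idx):
            item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
            item["labels"] = torch.tensor(self.labels[idx])
            return item

    args = TrainingArguments(output_dir="stance-model",
                             num_train_epochs=3,
                             per_device_train_batch_size=16)
    trainer = Trainer(model=model, args=args, train_dataset=StanceDataset(texts, labels))
    trainer.train()

After training, trainer.save_model() writes the fine-tuned weights so they can be reloaded with BertForSequenceClassification.from_pretrained for inference.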
License
The model is available under the GPL-3.0 license, which allows for free use, modification, and distribution, provided that any derivative works also adhere to the same license terms.