bert political election2020 twitter mlm

kornosk

Introduction

The BERT-POLITICAL-ELECTION2020-TWITTER-MLM model is a pre-trained language model intended as a starting point for stance detection in the context of the 2020 US Presidential Election. Developed as part of research presented at NAACL 2021, it builds on BERT-base (uncased) and is further pre-trained on a corpus of over 5 million tweets related to the election.

Architecture

The model uses the BERT architecture, specifically the BERT-base variant, which is uncased. It is designed to perform masked language modeling (MLM), a common pre-training task for transformer-based models.
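
To make the MLM task concrete, here is a minimal sketch of how masked training inputs are typically built. It assumes the corruption scheme from the original BERT recipe: about 15% of tokens are selected, and of those 80% become [MASK], 10% become a random token, and 10% stay unchanged. The source does not state which rates this model used. The function name mask_tokens and the whitespace tokenization are illustrative only; BERT itself operates on WordPiece tokens.

```python
import random
from typing import List, Optional, Tuple

MASK_TOKEN = "[MASK]"


def mask_tokens(
    tokens: List[str],
    vocab: List[str],
    mask_prob: float = 0.15,  # assumed rate from the original BERT recipe
    seed: Optional[int] = None,
) -> Tuple[List[str], List[Optional[str]]]:
    """Return (corrupted tokens, labels); labels are None where no prediction is needed."""
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels: List[Optional[str]] = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK_TOKEN         # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


tokens = "trump is the president of usa".split()
corrupted, labels = mask_tokens(tokens, vocab=tokens, seed=0)
print(corrupted)
print(labels)
```

During pre-training, the loss is computed only at positions where the label is not None.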

Training

The model was pre-trained with the standard MLM objective on a dataset of more than 5 million English tweets about the 2020 US Presidential Election. This pre-training prepares the model to be fine-tuned for downstream tasks such as text classification.

Guide: Running Locally

To run the BERT-POLITICAL-ELECTION2020-TWITTER-MLM model locally, follow these steps:

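  1. Set Up Environment: Ensure you have Python and PyTorch installed. A GPU is optional but speeds up inference and fine-tuning; if you do not have one locally, a GPU instance from a cloud provider such as AWS, Google Cloud, or Azure will work.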

  2. Install Transformers Library:

    pip install transformers torch
    
  3. Load the Model:

    from transformers import BertTokenizer, BertForMaskedLM, pipeline
    import torch
    
    # Choose GPU if available, otherwise fall back to CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    
    # Load the tokenizer and the pre-trained MLM weights
    pretrained_LM_path = "kornosk/bert-political-election2020-twitter-mlm"
    tokenizer = BertTokenizer.from_pretrained(pretrained_LM_path)
    model = BertForMaskedLM.from_pretrained(pretrained_LM_path)
    
    # Fill mask example: reuse the loaded model and run it on the chosen device
    example = "Trump is the [MASK] of USA"
    fill_mask = pipeline('fill-mask', model=model, tokenizer=tokenizer, device=device)
    outputs = fill_mask(example)
    # Each prediction is a dict with a score, the predicted token, and the filled-in sequence
    print(outputs)
    
  4. Fine-Tune for Downstream Tasks: To use the model for a specific task such as text classification, add a task-specific head on top of the pre-trained encoder and train it on your own labeled dataset.
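
As a starting point for step 4, the following sketch shows one way to attach a classification head to the pre-trained encoder. The three stance labels and the helper names build_label_maps and load_classifier are hypothetical and chosen for illustration; the source does not specify a label set or a fine-tuning procedure.

```python
from typing import Dict, List, Tuple

MODEL_NAME = "kornosk/bert-political-election2020-twitter-mlm"

# Hypothetical label set for illustration; replace with the labels in your dataset.
STANCE_LABELS: List[str] = ["against", "favor", "neutral"]


def build_label_maps(labels: List[str]) -> Tuple[Dict[int, str], Dict[str, int]]:
    """Build the id<->label mappings stored in the model config."""
    id2label = dict(enumerate(labels))
    label2id = {label: idx for idx, label in id2label.items()}
    return id2label, label2id


def load_classifier(model_name: str = MODEL_NAME, labels: List[str] = STANCE_LABELS):
    """Load the tokenizer and the encoder with a new classification head."""
    # Imported here so the label helpers work without the heavy dependency.
    from transformers import BertForSequenceClassification, BertTokenizer

    id2label, label2id = build_label_maps(labels)
    tokenizer = BertTokenizer.from_pretrained(model_name)
    # The MLM head is dropped and the classification layer is newly initialized.
    model = BertForSequenceClassification.from_pretrained(
        model_name,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_classifier()  # downloads weights on first use
    print(model.config.id2label)
```

Because the classification layer starts from random weights, its predictions are not meaningful until the model has been trained on labeled examples, for instance with the Transformers Trainer class or a plain PyTorch training loop.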

License

The model is available under the GPL-3.0 license, which allows for free use, modification, and distribution, provided that any derivative works also adhere to the same license terms.
