Betelgeuse-bert-base-uncased
Introduction
Betelgeuse-BERT-Base-Uncased is a transformer model based on BERT, pretrained on a large corpus of English data using self-supervised learning methods. It utilizes masked language modeling (MLM) and next sentence prediction (NSP) objectives to learn a bidirectional representation of the text. The model is primarily designed for fine-tuning on various NLP tasks such as sequence classification and question answering.
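Since the card highlights fine-tuning for tasks such as sequence classification, the sketch below shows one hedged way to load the checkpoint with a classification head using the `transformers` Auto classes; the label count and example sentences are illustrative placeholders, not part of the original model card.

```python
# Minimal sketch: loading the checkpoint with a sequence-classification head.
# num_labels and the example texts are illustrative placeholders.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "prithivMLmods/Betelgeuse-bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Tokenize a toy batch and run a forward pass; real fine-tuning would wrap this
# in a training loop (e.g. the Trainer API) with labeled data. Expect a warning
# that the classification head is newly initialized until it is fine-tuned.
inputs = tokenizer(
    ["This movie was great.", "This movie was terrible."],
    padding=True, truncation=True, return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch_size, num_labels)
```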
Architecture
The model is a variant of BERT, available in base and large configurations, with cased and uncased options; the uncased version also strips accent markers. Preprocessing involves lowercasing and tokenizing the text using WordPiece, with a vocabulary size of 30,000. Inputs consist of concatenated sentences wrapped with the special tokens [CLS] and [SEP].
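A small sketch of this preprocessing, assuming the checkpoint ships a standard BERT-style WordPiece tokenizer (the example sentences are placeholders):

```python
# Sketch of the preprocessing described above: lowercasing, WordPiece
# tokenization, and the [CLS] ... [SEP] ... [SEP] layout for sentence pairs.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("prithivMLmods/Betelgeuse-bert-base-uncased")

encoded = tokenizer("Betelgeuse is a red supergiant.", "It sits in Orion.")
# The token sequence starts with [CLS]; each sentence is closed by [SEP].
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
print(tokenizer.vocab_size)  # roughly 30,000 for a bert-base-uncased vocabulary
```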
Training
The model was pretrained on 4 cloud TPUs for one million steps with a batch size of 256, using the Adam optimizer with a learning rate of 1e-4. The sequence length was limited to 128 tokens for most steps, with a portion of training conducted at 512 tokens. Although the training data is characterized as fairly neutral, the model may still produce biased predictions due to biases inherent in that data.
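Because the card flags possible biased predictions, one hedged way to probe for them is to compare fill-mask completions across contrasting prompts; the prompts below are illustrative examples, not drawn from the original card.

```python
# Illustrative bias probe: compare top fill-mask completions for contrasting
# prompts. The prompts are examples only.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="prithivMLmods/Betelgeuse-bert-base-uncased")

for prompt in ("The man worked as a [MASK].", "The woman worked as a [MASK]."):
    top = unmasker(prompt, top_k=5)
    print(prompt, [candidate["token_str"] for candidate in top])
```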
Guide: Running Locally
- Setup Environment: Ensure Python and the `transformers` library are installed.
- Load Model:

```python
from transformers import pipeline

unmasker = pipeline('fill-mask', model='prithivMLmods/Betelgeuse-bert-base-uncased')
```

- Use Model:

```python
result = unmasker("Hello I'm a [MASK] model.")
print(result)
```
- Cloud GPUs: For large-scale operations, consider using cloud-based GPU services such as AWS EC2, Google Cloud, or Azure.
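If a GPU is available, whether locally or on one of the cloud services above, the pipeline can be placed on it. The sketch below assumes a single CUDA device and falls back to CPU otherwise.

```python
# Run the fill-mask pipeline on the first CUDA device when one is available;
# device=0 assumes a single-GPU machine, -1 selects the CPU.
import torch
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1
unmasker = pipeline(
    "fill-mask",
    model="prithivMLmods/Betelgeuse-bert-base-uncased",
    device=device,
)
print(unmasker("Hello I'm a [MASK] model.")[0]["token_str"])
```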
License
This model is licensed under the CreativeML Open RAIL-M license.