BERT Base Uncased (google-bert/bert-base-uncased)
Introduction
BERT Base Uncased is a model pretrained on English-language data with a masked language modeling (MLM) objective. It was introduced in the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" and first released by Google Research. The model is uncased, meaning it does not distinguish between uppercase and lowercase letters ("english" and "English" are treated identically).
Architecture
BERT is a transformer model pretrained in a self-supervised manner on a large corpus of English data. It uses two main objectives during pretraining:
- Masked Language Modeling (MLM): randomly masks 15% of the input tokens and trains the model to predict them from the surrounding context in both directions (a quick demonstration follows this list).
- Next Sentence Prediction (NSP): given two concatenated sentences, predicts whether they followed each other in the original text.
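The MLM objective can be seen directly at inference time through the fill-mask pipeline. A minimal sketch; the exact candidate tokens and scores will vary:

from transformers import pipeline

# The fill-mask pipeline predicts the token hidden behind [MASK],
# which is exactly what the MLM pretraining objective trains for.
unmasker = pipeline('fill-mask', model='bert-base-uncased')
for prediction in unmasker("The capital of France is [MASK]."):
    print(prediction['token_str'], round(prediction['score'], 3))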
BERT models are available in base and large sizes, each in cased and uncased variants. Beyond English, Google also released a Chinese model and multilingual models covering many languages.
Training
BERT was pretrained on the BookCorpus and English Wikipedia datasets. The texts were lowercased and tokenized with WordPiece using a vocabulary of roughly 30,000 entries. Training used sequences of up to 512 tokens and ran on cloud TPUs for one million steps, using the Adam optimizer with a learning rate of 1e-4.
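A small sketch of the WordPiece preprocessing described above; the subword splits shown in the comment are illustrative and depend on the released vocabulary:

from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# The uncased tokenizer lowercases input and splits rare words into subword pieces.
print(tokenizer.vocab_size)  # roughly 30,000 WordPiece entries
print(tokenizer.tokenize("Tokenization handles rare words gracefully"))
# e.g. ['token', '##ization', 'handles', 'rare', 'words', 'gracefully']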
Guide: Running Locally
- Install the Transformers library:
  pip install transformers
- Load the model. For PyTorch:
  from transformers import BertTokenizer, BertModel
  tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
  model = BertModel.from_pretrained('bert-base-uncased')
  For TensorFlow:
  from transformers import BertTokenizer, TFBertModel
  tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
  model = TFBertModel.from_pretrained('bert-base-uncased')
- Use the model (pass return_tensors='tf' instead of 'pt' with the TensorFlow model; a sketch of inspecting the output follows this list):
  text = "Replace me by any text you'd like."
  encoded_input = tokenizer(text, return_tensors='pt')
  output = model(**encoded_input)
- Cloud GPUs: for efficient training and inference, consider cloud GPU services such as AWS, Google Cloud, or Azure.
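The forward pass returns a structured output whose fields can be inspected directly. A minimal PyTorch sketch; the 768-dimension figure assumes the default base configuration:

from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

encoded_input = tokenizer("Replace me by any text you'd like.", return_tensors='pt')
output = model(**encoded_input)

# last_hidden_state: one 768-dim vector per token,
# shape (batch_size, sequence_length, 768).
print(output.last_hidden_state.shape)
# pooler_output: a single 768-dim vector summarizing the sequence.
print(output.pooler_output.shape)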
License
BERT is licensed under the Apache 2.0 License.