bert base uncased

google-bert

Introduction

The BERT Base Model (Uncased) is a model pretrained on English-language data with a masked language modeling (MLM) objective. It was introduced in the paper "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" and first released by Google Research. The model is uncased, meaning it does not distinguish between uppercase and lowercase letters.

Architecture

BERT is a transformer model pretrained in a self-supervised manner on a large corpus of English data. It uses two main objectives during pretraining:

  • Masked Language Modeling (MLM): Randomly masks 15% of the input words and predicts them from the surrounding context (see the fill-mask sketch after this list).
  • Next Sentence Prediction (NSP): Predicts if two sentences are consecutive in the original text.
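
The MLM objective can be tried directly through the fill-mask pipeline in the transformers library; the snippet below is a minimal sketch using this checkpoint, with an illustrative example sentence.

    from transformers import pipeline

    # Load bert-base-uncased behind a fill-mask pipeline; [MASK] is the same
    # special token the model learned to predict during pretraining.
    unmasker = pipeline('fill-mask', model='bert-base-uncased')

    # Each prediction is a dict containing the filled-in token and its score.
    for prediction in unmasker("Hello I'm a [MASK] model."):
        print(prediction['token_str'], round(prediction['score'], 3))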

BERT models are available in base and large sizes, each with cased and uncased variants. Checkpoints were also released for other languages, including a Chinese model and multilingual models.

Training

BERT was pretrained on the BookCorpus and English Wikipedia datasets. Texts were lowercased and tokenized with WordPiece using a vocabulary of 30,000 tokens. Training used sequences of up to 512 tokens, ran on cloud TPUs, and lasted one million steps with the Adam optimizer at a learning rate of 1e-4.
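
The lowercasing and WordPiece behaviour described above can be seen by inspecting the tokenizer; the snippet below is a minimal sketch, and the exact subword splits depend on the learned vocabulary.

    from transformers import BertTokenizer

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

    # Input is lowercased before WordPiece is applied, so "BERT" and "bert"
    # tokenize identically; out-of-vocabulary words are split into
    # subword pieces prefixed with '##'.
    print(tokenizer.tokenize("BERT uses WordPiece tokenization."))

    # The vocabulary holds roughly 30,000 entries.
    print(tokenizer.vocab_size)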

Guide: Running Locally

  1. Install Transformers Library:

    pip install transformers
    
  2. Load the Model: For PyTorch:

    from transformers import BertTokenizer, BertModel
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertModel.from_pretrained('bert-base-uncased')
    

    For TensorFlow:

    from transformers import BertTokenizer, TFBertModel
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = TFBertModel.from_pretrained('bert-base-uncased')
    
  3. Use the Model:

    text = "Replace me by any text you'd like."
    encoded_input = tokenizer(text, return_tensors='pt')
    output = model(**encoded_input)
    
  4. Cloud GPUs: For faster training and inference, consider cloud GPU services such as AWS, Google Cloud, or Azure; a short sketch of inspecting the output and running the PyTorch model on a GPU follows this list.
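
As a minimal sketch continuing from the PyTorch example in step 3 (it reuses model, encoded_input, and output from that step and does not apply to the TensorFlow variant), the encoder output can be inspected and the same call run on a GPU when one is available:

    import torch

    # For bert-base the hidden size is 768, so last_hidden_state has
    # shape (batch_size, sequence_length, 768).
    print(output.last_hidden_state.shape)

    # Move the model and inputs to a GPU if one is available, then rerun inference.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = model.to(device)
    encoded_input = {k: v.to(device) for k, v in encoded_input.items()}
    with torch.no_grad():
        output = model(**encoded_input)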

License

BERT is licensed under the Apache 2.0 License.
