MobileBERT Uncased
google/mobilebert-uncased

Introduction
MobileBERT is a compact version of BERT_LARGE optimized for resource-limited devices. It is equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks, making it well suited to on-device and other compute-constrained environments.
Architecture
MobileBERT keeps the overall architecture of BERT_LARGE but introduces bottleneck structures to shrink the model while preserving performance. The released checkpoint has 24 layers, a bottleneck hidden size of 128, a feed-forward hidden size of 512, 4 attention heads, and a 4-fold bottleneck reduction factor.
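These figures can be cross-checked against the configuration that ships with the checkpoint. The sketch below assumes the Transformers library is installed and uses the attribute names of MobileBertConfig; it only inspects the configuration file and does not download any weights.

  from transformers import MobileBertConfig

  # Fetch only the configuration of the released checkpoint.
  config = MobileBertConfig.from_pretrained("google/mobilebert-uncased")

  print(config.num_hidden_layers)      # transformer layers
  print(config.intra_bottleneck_size)  # hidden size inside each bottlenecked block
  print(config.intermediate_size)      # feed-forward hidden size
  print(config.num_attention_heads)    # attention heads per layer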
Training
MobileBERT is pre-trained in a task-agnostic manner, allowing it to be fine-tuned for various NLP tasks. The pre-trained checkpoint provided is optimized for uncased English text.
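Because the checkpoint is task-agnostic, fine-tuning typically starts by loading the pretrained encoder together with a freshly initialized task head. The sketch below assumes PyTorch and a two-class sequence-classification task; the classification head is randomly initialized and must be trained on downstream data before its predictions are meaningful.

  from transformers import AutoTokenizer, AutoModelForSequenceClassification

  # The checkpoint is uncased, so the tokenizer lower-cases its input.
  tokenizer = AutoTokenizer.from_pretrained("google/mobilebert-uncased")

  # Pretrained MobileBERT encoder plus an untrained 2-label classification head.
  model = AutoModelForSequenceClassification.from_pretrained(
      "google/mobilebert-uncased", num_labels=2
  )

  inputs = tokenizer("MobileBERT runs on resource-limited devices.", return_tensors="pt")
  logits = model(**inputs).logits
  print(logits.shape)  # torch.Size([1, 2])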
Guide: Running Locally
To use MobileBERT with the Hugging Face Transformers library:
- Install Transformers:

  pip install transformers
- Use the model in a Python script:

  from transformers import pipeline

  fill_mask = pipeline(
      "fill-mask",
      model="google/mobilebert-uncased",
      tokenizer="google/mobilebert-uncased",
  )

  print(fill_mask(
      f"HuggingFace is creating a {fill_mask.tokenizer.mask_token} that the community uses to solve NLP tasks."
  ))
- For faster performance, consider a cloud GPU service such as AWS, Google Cloud, or Azure; a GPU variant of the pipeline is sketched after this list.
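A minimal sketch of the same fill-mask pipeline placed on a GPU, assuming a single CUDA device (the device index 0 is an assumption about the machine, not part of the model card):

  from transformers import pipeline

  # device=0 selects the first CUDA GPU; use device=-1 (the default) to stay on CPU.
  fill_mask = pipeline(
      "fill-mask",
      model="google/mobilebert-uncased",
      device=0,
  )

  for prediction in fill_mask("Paris is the [MASK] of France."):
      print(prediction["token_str"], round(prediction["score"], 3))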
License
The MobileBERT model is licensed under the Apache 2.0 License, permitting broad usage and distribution with minimal restrictions.