ALBERT-Base-Chinese-CLUECorpusSmall
Introduction
The ALBERT-Base-Chinese-CLUECorpusSmall model is a Chinese ALBERT model pre-trained with the UER-py toolkit on the CLUECorpusSmall dataset. It is intended for fill-mask (masked language modeling) tasks, supports both PyTorch and TensorFlow, and is available on Hugging Face.
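Because the checkpoint is a masked language model, it can be queried directly with the transformers fill-mask pipeline. A minimal sketch, assuming the Hugging Face repo name above; the example sentence is only illustrative:

```python
from transformers import pipeline

# Build a fill-mask pipeline from the Hugging Face checkpoint.
unmasker = pipeline("fill-mask", model="uer/albert-base-chinese-cluecorpussmall")

# Predict the masked character in an illustrative Chinese sentence.
print(unmasker("中国的首都是[MASK]京。"))
```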
Architecture
The model follows the standard ALBERT configurations, such as ALBERT-Base (L=12/H=768) and ALBERT-Large (L=24/H=1024); this checkpoint is the Base variant. ALBERT reduces BERT's parameter count, through cross-layer parameter sharing and a factorized embedding parameterization, while maintaining comparable performance.
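To confirm which configuration a checkpoint uses, its hyperparameters can be read back from the published config. A short sketch using the standard AlbertConfig attribute names; the expected values of 12 and 768 come from the Base configuration described above:

```python
from transformers import AlbertConfig

# Load the configuration shipped with the checkpoint and report its size.
config = AlbertConfig.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
print(config.num_hidden_layers, config.hidden_size)  # expected: 12 768 for ALBERT-Base
```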
Training
The model is pre-trained using the CLUECorpusSmall dataset. The training process involves two stages:
- Pre-training for 1,000,000 steps with a sequence length of 128.
- Continuing for an additional 250,000 steps with a sequence length of 512.
Training is conducted on Tencent Cloud using UER-py, with the final model converted to Hugging Face's format.
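As a quick sanity check on the converted Hugging Face checkpoint, the maximum sequence length and parameter count can be inspected after loading it. This is a hedged sketch, not part of the official conversion workflow; the expected position-embedding size of 512 comes from the second pre-training stage described above:

```python
from transformers import AlbertConfig, AlbertForMaskedLM

# Load the converted checkpoint and its config from the Hugging Face Hub.
config = AlbertConfig.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
model = AlbertForMaskedLM.from_pretrained("uer/albert-base-chinese-cluecorpussmall")

# The position embeddings should cover the 512-token second training stage.
print(config.max_position_embeddings)

# Rough parameter count as a sanity check on the conversion.
print(sum(p.numel() for p in model.parameters()))
```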
Guide: Running Locally
- Clone the Model: Download the model from the UER-py Modelzoo or Hugging Face.
- Setup Environment: Ensure that the `transformers` library is installed.
- Load Model:
  - For PyTorch:

    ```python
    from transformers import BertTokenizer, AlbertForMaskedLM

    tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
    model = AlbertForMaskedLM.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
    ```

  - For TensorFlow:

    ```python
    from transformers import BertTokenizer, TFAlbertModel

    tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
    model = TFAlbertModel.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
    ```

- Use Model: Use the model for tasks such as fill-mask prediction or feature extraction (see the feature-extraction sketch after this list).
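For feature extraction, the base AlbertModel (without the masked-LM head) returns hidden states that can be used as token or sentence features. A minimal PyTorch sketch; the example sentence is only illustrative:

```python
import torch
from transformers import BertTokenizer, AlbertModel

tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
model = AlbertModel.from_pretrained("uer/albert-base-chinese-cluecorpussmall")

# Tokenize an example sentence and run a forward pass without gradients.
inputs = tokenizer("北京是中国的首都。", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level features: (batch_size, sequence_length, hidden_size).
print(outputs.last_hidden_state.shape)
```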
For optimal performance, consider using cloud GPU services like AWS, Azure, or Google Cloud.
License
The model and associated resources are made available under open-source licenses as outlined in the respective repositories and documentation. Users should review the licenses for compliance and usage terms.