ALBERT-Base-Chinese-CLUECorpusSmall

Introduction

The ALBERT-Base-Chinese-CLUECorpusSmall model is a Chinese ALBERT model pre-trained with the UER-py toolkit. This model is designed for fill-mask tasks and supports both PyTorch and TensorFlow. It is pre-trained using the CLUECorpusSmall dataset and is available on Hugging Face.

Architecture

This checkpoint uses the ALBERT-Base configuration (12 layers, hidden size 768); a companion ALBERT-Large model (24 layers, hidden size 1024) was trained the same way. ALBERT reduces the parameter count of BERT-style models, chiefly through factorized embedding parameterization and cross-layer parameter sharing, while maintaining similar downstream performance.
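The savings from factorized embeddings can be seen with a back-of-the-envelope calculation. The numbers below are illustrative assumptions: the standard Google Chinese vocabulary of 21,128 tokens, the ALBERT-Base hidden size of 768, and ALBERT's default embedding size of 128:

```python
# Rough parameter count for the token-embedding layer only.
# Assumptions: vocab size V=21128 (Google Chinese vocab), hidden size H=768
# (ALBERT-Base), factorized embedding size E=128 (ALBERT default).
V, H, E = 21128, 768, 128

bert_style = V * H             # BERT: one large V x H embedding matrix
albert_style = V * E + E * H   # ALBERT: V x E lookup plus E x H projection

print(f"BERT-style embedding params: {bert_style:,}")
print(f"ALBERT factorized params:    {albert_style:,}")
print(f"Reduction factor:            {bert_style / albert_style:.1f}x")
```

Cross-layer parameter sharing shrinks the encoder stack further, since all 12 transformer layers reuse one set of weights.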

Training

The model is pre-trained using the CLUECorpusSmall dataset. The training process involves two stages:

  1. Pre-training for 1,000,000 steps with a sequence length of 128.
  2. Continuing for an additional 250,000 steps with a sequence length of 512.

Training is conducted on Tencent Cloud using UER-py, with the final model converted to Hugging Face's format.
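The two-stage recipe above can be sketched with UER-py's preprocessing, pre-training, and conversion scripts. The exact flags below (paths, batch sizes, learning rates) are illustrative assumptions, not the authors' verbatim command lines:

```shell
# Stage 1: preprocess the corpus and pre-train for 1,000,000 steps at seq_length 128.
python3 preprocess.py --corpus_path corpora/cluecorpussmall.txt \
                      --vocab_path models/google_zh_vocab.txt \
                      --dataset_path cluecorpussmall_seq128_dataset.pt \
                      --seq_length 128 --processes_num 32 --target albert

python3 pretrain.py --dataset_path cluecorpussmall_seq128_dataset.pt \
                    --vocab_path models/google_zh_vocab.txt \
                    --config_path models/albert/base_config.json \
                    --output_model_path models/albert_base_seq128_model.bin \
                    --total_steps 1000000 --seq_length 128 \
                    --learning_rate 1e-4 --batch_size 32 --target albert

# Stage 2: re-preprocess at seq_length 512 and continue for 250,000 steps,
# initializing from the stage-1 checkpoint.
python3 pretrain.py --dataset_path cluecorpussmall_seq512_dataset.pt \
                    --vocab_path models/google_zh_vocab.txt \
                    --config_path models/albert/base_config.json \
                    --pretrained_model_path models/albert_base_seq128_model.bin-1000000 \
                    --output_model_path models/albert_base_seq512_model.bin \
                    --total_steps 250000 --seq_length 512 \
                    --learning_rate 1e-5 --batch_size 16 --target albert

# Finally, convert the UER-py checkpoint to Hugging Face format.
python3 scripts/convert_albert_from_uer_to_huggingface.py \
        --input_model_path models/albert_base_seq512_model.bin-250000 \
        --output_model_path pytorch_model.bin --layers_num 12
```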

Guide: Running Locally

  1. Clone the Model: Download the model from the UER-py Modelzoo or Hugging Face.
  2. Setup Environment: Ensure that the transformers library is installed.
  3. Load Model:
    • For PyTorch:
      from transformers import BertTokenizer, AlbertForMaskedLM
      tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
      model = AlbertForMaskedLM.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
      
    • For TensorFlow:
      from transformers import BertTokenizer, TFAlbertModel
      tokenizer = BertTokenizer.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
      model = TFAlbertModel.from_pretrained("uer/albert-base-chinese-cluecorpussmall")
      
  4. Use Model: Use the model for tasks such as fill-mask prediction or feature extraction.
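After the steps above, the model can be exercised end to end through the fill-mask pipeline. A minimal sketch (the example sentence is illustrative, and the model is downloaded from Hugging Face on first use):

```python
from transformers import pipeline

# Build a fill-mask pipeline backed by the pre-trained Chinese ALBERT model.
unmasker = pipeline("fill-mask", model="uer/albert-base-chinese-cluecorpussmall")

# "The capital of China is [MASK]jing." -- the model scores candidate characters
# for the masked position; a plausible top prediction is 北 (forming 北京, Beijing).
results = unmasker("中国的首都是[MASK]京。")
for r in results:
    print(r["token_str"], round(r["score"], 4))
```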

For optimal performance, consider using cloud GPU services like AWS, Azure, or Google Cloud.

License

The model and associated resources are made available under open-source licenses as outlined in the respective repositories and documentation. Users should review the licenses for compliance and usage terms.
