chinese_roberta_wwm_large_ext_fix_mlm
Introduction
The CHINESE_ROBERTA_WWM_LARGE_EXT_FIX_MLM model is a Chinese version of BERT with Whole Word Masking (WWM) designed for Masked Language Modeling (MLM) tasks. It builds upon the chinese-roberta-wwm-ext-large model and addresses a specific parameter-initialization issue in that checkpoint.
Architecture
This model follows the RoBERTa architecture tailored for Chinese, with whole word masking applied during pre-training. The weights are trained with PyTorch, and the checkpoint can also be loaded via TensorFlow, JAX, and the Safetensors format.
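The exact architecture hyperparameters are not restated in this card; as a minimal sketch, they can be read directly from the published configuration (the attribute names below are the standard Transformers config fields, not values taken from this document):
from transformers import AutoConfig
# Inspect the checkpoint's configuration; the printed values come from the
# published config file rather than from any claim made in this card.
config = AutoConfig.from_pretrained("genggui001/chinese_roberta_wwm_large_ext_fix_mlm")
print(config.model_type)          # architecture family reported by the checkpoint
print(config.num_hidden_layers)   # encoder depth
print(config.hidden_size)         # hidden dimension
print(config.vocab_size)          # size of the Chinese WordPiece vocabulary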
Training
The model was initialized from the parameters of hfl/chinese-roberta-wwm-ext-large. That checkpoint ships without the MLM head parameters, an issue detailed in this GitHub issue, which this model resolves. During training, only the MLM head parameters are updated while all other parameters are frozen, so the model's learning is focused on masked-token prediction.
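The original training script is not reproduced here; a minimal sketch of this freezing scheme with the standard Transformers/PyTorch API might look like the following. The "cls." prefix for the MLM head follows the usual BertForMaskedLM parameter layout, the learning rate is purely illustrative, and whether the tied decoder/embedding weight was also updated in the original training is not stated in this card:
import torch
from transformers import BertForMaskedLM

# Start from the base checkpoint; its MLM head weights are missing and will
# be freshly initialized (this is the issue the fixed model addresses).
model = BertForMaskedLM.from_pretrained("hfl/chinese-roberta-wwm-ext-large")

for name, param in model.named_parameters():
    # Keep only the MLM head ("cls.*") trainable; freeze the encoder.
    # Note: the decoder weight is tied to the word embeddings, so it is
    # reported under the encoder name and stays frozen in this sketch.
    param.requires_grad = name.startswith("cls.")

# Hand only the trainable (MLM head) parameters to the optimizer.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)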
Guide: Running Locally
To run this model locally:
- Install the Hugging Face Transformers library:
pip install transformers
- Use BERT-related classes to load the model (a short prediction example follows this list):
from transformers import BertTokenizer, BertForMaskedLM
tokenizer = BertTokenizer.from_pretrained('genggui001/chinese_roberta_wwm_large_ext_fix_mlm')
model = BertForMaskedLM.from_pretrained('genggui001/chinese_roberta_wwm_large_ext_fix_mlm')
- For faster performance, use cloud GPUs like those available from AWS, Google Cloud, or Azure.
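Once loaded, the model can be queried for masked-token predictions. The example below is a small usage sketch: the fill-mask pipeline and the input sentence are illustrative choices of mine, not taken from the original card.
from transformers import BertTokenizer, BertForMaskedLM, pipeline

model_id = "genggui001/chinese_roberta_wwm_large_ext_fix_mlm"
tokenizer = BertTokenizer.from_pretrained(model_id)
model = BertForMaskedLM.from_pretrained(model_id)

# Predict the masked character in a short Chinese sentence.
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
for candidate in fill_mask("北京是中国的[MASK]都。"):
    print(candidate["token_str"], round(candidate["score"], 4))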
License
This model is licensed under the Apache-2.0 License, allowing for both personal and commercial use.