hindi-bert by monsoon-nlp
Introduction
The Hindi-BERT model by Monsoon-NLP is a Hindi language model built with Google Research's ELECTRA. It is designed for a range of Hindi NLP tasks and performs comparably to Multilingual BERT on tasks such as news classification and sentiment analysis. The model can be fine-tuned with SimpleTransformers, and resources for further exploration and training are provided.
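Since the introduction mentions SimpleTransformers, here is a minimal fine-tuning sketch. The Hub ID "monsoon-nlp/hindi-bert", the "electra" model type, and the toy sentiment data are assumptions for illustration, not taken from this document:

```python
import pandas as pd
from simpletransformers.classification import ClassificationModel

# Toy two-example dataset; a real run would use a labeled Hindi corpus
train_df = pd.DataFrame(
    [["यह फिल्म बहुत अच्छी थी", 1], ["यह फिल्म बेकार थी", 0]],
    columns=["text", "labels"],
)

# Assumptions: the model is published on the Hub as "monsoon-nlp/hindi-bert"
# and SimpleTransformers' "electra" model type matches its architecture
model = ClassificationModel("electra", "monsoon-nlp/hindi-bert",
                            num_labels=2, use_cuda=False)
model.train_model(train_df)

predictions, raw_outputs = model.predict(["यह किताब शानदार है"])
```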
Architecture
The model is based on the ELECTRA architecture, which makes pretraining more efficient through replaced-token detection: a small generator swaps out some tokens in the input sentence, and the main model (the discriminator) is trained to distinguish original tokens from replacements. This model has been fine-tuned for tasks such as news classification and sentiment analysis in Hindi.
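To make the replaced-token-detection objective concrete, the sketch below runs the discriminator over a Hindi sentence and flags tokens it judges to be replaced. The Hub ID "monsoon-nlp/hindi-bert" and the example sentence are assumptions:

```python
import torch
from transformers import AutoTokenizer, ElectraForPreTraining

# Hub ID "monsoon-nlp/hindi-bert" is an assumption based on the model name
tokenizer = AutoTokenizer.from_pretrained("monsoon-nlp/hindi-bert")
model = ElectraForPreTraining.from_pretrained("monsoon-nlp/hindi-bert")

inputs = tokenizer("मुझे हिंदी में पढ़ना पसंद है", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # one real-vs-replaced score per token

# Positive scores mark tokens the discriminator believes were replaced
flags = (logits > 0).long().squeeze().tolist()
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
print(list(zip(tokens, flags)))
```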
Training
The training process uses a structured dataset directory containing vocabulary files, pretraining records, and model checkpoints. The corpus combines Hindi CommonCrawl data with the latest Hindi Wikipedia dump. The vocabulary can be rebuilt with Hugging Face Tokenizers, allowing the vocabulary size to be tailored to different tasks.
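A minimal sketch of rebuilding such a vocabulary with Hugging Face Tokenizers; the corpus path, vocabulary size, and tokenizer settings are illustrative assumptions:

```python
from tokenizers import BertWordPieceTokenizer

# "hindi_corpus.txt" is a hypothetical plain-text file, one sentence per line
tokenizer = BertWordPieceTokenizer(
    lowercase=False,      # Devanagari has no case
    strip_accents=False,  # keep matras and other combining marks intact
)
tokenizer.train(files=["hindi_corpus.txt"], vocab_size=30000, min_frequency=2)
tokenizer.save_model("./vocab_dir")  # writes vocab.txt for pretraining
```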
Guide: Running Locally
- Setup: Clone the Transformers repository from Hugging Face.

```bash
git clone https://github.com/huggingface/transformers
```

- Convert Checkpoints: Use the provided script to convert ELECTRA checkpoints to a format compatible with Transformers.

```bash
python ./transformers/src/transformers/convert_electra_original_tf_checkpoint_to_pytorch.py \
  --tf_checkpoint_path=./models/checkpointdir \
  --config_file=config.json \
  --pytorch_dump_path=pytorch_model.bin \
  --discriminator_or_generator=discriminator
```
- Load Model: Load the converted model in TensorFlow for further use.

```python
from transformers import TFElectraForPreTraining

# Load the PyTorch weights into the TensorFlow class, then save in TF format
model = TFElectraForPreTraining.from_pretrained("./dir_with_pytorch", from_pt=True)
model.save_pretrained("tf")
```
- Upload Model: Prepare the model directory and use transformers-cli to upload it; a quick verification sketch follows this list.

```bash
transformers-cli upload directory
```
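After uploading, the model can be loaded straight from the Hub as a sanity check. This is a minimal sketch; "your-username/hindi-bert" is a placeholder for whatever repo ID the upload created:

```python
from transformers import AutoModel, AutoTokenizer

# "your-username/hindi-bert" is a placeholder repo ID, not from the document
tokenizer = AutoTokenizer.from_pretrained("your-username/hindi-bert")
model = AutoModel.from_pretrained("your-username/hindi-bert")
```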
For faster training and inference, consider using cloud GPUs such as those available through Google Colab or AWS.
License
The model and its accompanying resources are released under a license specified by the creators. Review the license details in the respective repositories and documentation before using or redistributing the model.