Introduction

Hindi-BERT by Monsoon-NLP is a Hindi language model based on Google Research's ELECTRA pretraining method. It is designed for Hindi NLP tasks and achieves results comparable to Multilingual BERT on tasks such as news classification and sentiment analysis. The model can be fine-tuned with SimpleTransformers, and resources are provided for further exploration and training.

Architecture

The model is based on the ELECTRA architecture, which pretrains efficiently through replaced-token detection: a small generator network replaces some tokens in the input sentence, and the main model (the discriminator) learns to classify each token as original or replaced. The pretrained discriminator can then be fine-tuned for downstream Hindi tasks such as news classification and sentiment analysis.
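The replaced-token-detection objective can be illustrated with a toy example (plain Python, no model weights; the token strings and replacement positions below are made up for illustration, not taken from the actual model):

```python
# Toy illustration of ELECTRA's replaced-token-detection objective.
# A small generator proposes replacements for a few positions; the
# discriminator is trained to label every token as original (0) or
# replaced (1). Tokens here are illustrative Hindi words, not real
# model vocabulary entries.

original  = ["दिल्ली", "में", "आज", "बारिश", "हुई"]
# Suppose the generator replaced positions 2 and 3 with plausible words:
corrupted = ["दिल्ली", "में", "कल", "धूप", "हुई"]

# Discriminator targets: 1 wherever the token differs from the original.
labels = [int(o != c) for o, c in zip(original, corrupted)]
print(labels)  # [0, 0, 1, 1, 0]
```

Because every token position gets a training signal (not just the ~15% masked positions of BERT-style pretraining), this objective is more sample-efficient.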

Training

Training uses a structured data directory containing vocabulary files, pretraining records, and model checkpoints. The corpus combines Hindi CommonCrawl data with a recent Hindi Wikipedia dump. The vocabulary can be rebuilt with Hugging Face Tokenizers, allowing the vocabulary size to be customized for different tasks.
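Conceptually, rebuilding the vocabulary means training a subword tokenizer on the corpus and keeping only the top vocab_size units. Below is a minimal stdlib-only sketch of that truncation idea; the real pipeline uses Hugging Face Tokenizers' WordPiece trainer over subword units, and the corpus lines and vocabulary size here are invented for illustration:

```python
from collections import Counter

# Sketch of vocabulary-size customization: count units in a corpus and
# keep the most frequent entries up to `vocab_size`, reserving slots for
# special tokens. A real WordPiece trainer operates on subword pieces;
# this whole-word version only illustrates the frequency truncation.

corpus = [
    "भारत एक देश है",
    "हिंदी भारत की भाषा है",
    "भाषा मॉडल हिंदी समझता है",
]

special_tokens = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]
vocab_size = 8  # illustrative; real models typically use 30k or more

counts = Counter(word for line in corpus for word in line.split())
kept = [w for w, _ in counts.most_common(vocab_size - len(special_tokens))]
vocab = special_tokens + kept

print(len(vocab), vocab)
```

A smaller vocabulary shrinks the embedding table (and thus the model), while a larger one reduces how often words fragment into many subword pieces; the right trade-off depends on the downstream task.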

Guide: Running Locally

  1. Setup: Clone the Transformers repository from Hugging Face.
    git clone https://github.com/huggingface/transformers
  2. Convert Checkpoints: Use the provided script to convert the TensorFlow ELECTRA checkpoint into a PyTorch checkpoint compatible with Transformers.
    python ./transformers/src/transformers/convert_electra_original_tf_checkpoint_to_pytorch.py \
      --tf_checkpoint_path=./models/checkpointdir \
      --config_file=config.json \
      --pytorch_dump_path=pytorch_model.bin \
      --discriminator_or_generator=discriminator
    
  3. Load Model: Load the converted PyTorch weights into a TensorFlow model and save a TensorFlow checkpoint.
    from transformers import TFElectraForPreTraining
    model = TFElectraForPreTraining.from_pretrained("./dir_with_pytorch", from_pt=True)
    model.save_pretrained("tf")
    
  4. Upload Model: Prepare the model directory and use transformers-cli to upload.
    transformers-cli upload directory
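The conversion step above expects a config.json describing the checkpoint's architecture. As a hedged example, a small ELECTRA discriminator config might look like the fragment below; the field names come from Transformers' ElectraConfig, but the values are illustrative (ELECTRA-small-style sizes), and vocab_size in particular must match the checkpoint's actual vocab.txt rather than the placeholder shown:

```json
{
  "model_type": "electra",
  "embedding_size": 128,
  "hidden_size": 256,
  "num_hidden_layers": 12,
  "num_attention_heads": 4,
  "intermediate_size": 1024,
  "max_position_embeddings": 512,
  "vocab_size": 30522
}
```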
    

For enhanced performance, consider using cloud GPUs such as those provided by Google Colab or AWS.

License

The model and its accompanying resources are released under the license specified by their creators. Review the license details in the respective repositories and documentation to confirm usage and distribution terms.
