bert-fa-zwnj-base

HooshvareLab

Introduction

ParsBERT is a monolingual language model for Persian language understanding. It is based on Google's BERT architecture and was pre-trained on a large collection of Persian text covering diverse writing styles and subjects, including scientific articles, novels, and news. This variant also handles the zero-width non-joiner (ZWNJ, U+200C), a character that Persian orthography uses to separate word components without a visible space.
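
A quick way to see the ZWNJ handling in practice is to tokenize a word that contains the character. A minimal sketch, assuming the model ID HooshvareLab/bert-fa-zwnj-base on the Hugging Face Model Hub:

    from transformers import AutoTokenizer

    # Load the ParsBERT tokenizer from the Hugging Face Model Hub.
    tokenizer = AutoTokenizer.from_pretrained("HooshvareLab/bert-fa-zwnj-base")

    # "می‌روم" ("I go") contains a ZWNJ (U+200C) between the prefix and the stem.
    word = "می\u200cروم"
    print(tokenizer.tokenize(word))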

Architecture

ParsBERT uses the Transformer-based architecture of BERT, which is well established for natural language processing tasks. It is adapted to Persian, most notably in its treatment of the zero-width non-joiner during tokenization.
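
Because ParsBERT follows the standard BERT architecture, its dimensions can be inspected through the Transformers configuration API. A short sketch, again assuming the HooshvareLab/bert-fa-zwnj-base model ID:

    from transformers import AutoConfig

    # Fetch the model configuration without downloading the weights.
    config = AutoConfig.from_pretrained("HooshvareLab/bert-fa-zwnj-base")

    # Standard BERT hyperparameters: layer count, hidden size, attention heads.
    print(config.num_hidden_layers, config.hidden_size, config.num_attention_heads)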

Training

The model was trained on a comprehensive set of Persian corpora spanning multiple text types. This exposure to a broad vocabulary and range of writing styles allows ParsBERT to model Persian text with improved accuracy.

Guide: Running Locally

To run ParsBERT locally, follow these basic steps:

  1. Installation: Make sure you have Python 3.6 or higher installed. Set up a virtual environment, then install the Hugging Face Transformers library along with a backend such as PyTorch:
    pip install transformers torch

  2. Download the Model: The weights are fetched automatically from Hugging Face's Model Hub the first time you load the model by its ID (HooshvareLab/bert-fa-zwnj-base).
  3. Run Inference: Use the model with the Transformers library for tasks such as fill-mask or text classification; a minimal fill-mask sketch follows this list.
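
The sketch below shows the fill-mask task from step 3. It assumes the HooshvareLab/bert-fa-zwnj-base model ID and uses an illustrative Persian sentence with one masked token:

    from transformers import pipeline

    # Build a fill-mask pipeline backed by ParsBERT.
    fill_mask = pipeline("fill-mask", model="HooshvareLab/bert-fa-zwnj-base")

    # Predict the masked token in "امروز هوا [MASK] است."
    # ("The weather is [MASK] today.")
    for prediction in fill_mask("امروز هوا [MASK] است."):
        print(prediction["token_str"], prediction["score"])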

For optimal performance, consider using a cloud GPU service such as Google Colab, AWS, or Azure, which can significantly speed up processing times for large models like ParsBERT.
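
If a GPU is available, the pipeline can be placed on it explicitly. A minimal sketch, assuming CUDA and the same model ID as above:

    import torch
    from transformers import pipeline

    # Select the first GPU if one is visible, otherwise fall back to the CPU.
    device = 0 if torch.cuda.is_available() else -1
    fill_mask = pipeline("fill-mask",
                         model="HooshvareLab/bert-fa-zwnj-base",
                         device=device)
    print(fill_mask("امروز هوا [MASK] است.")[0]["sequence"])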

License

ParsBERT is released under the Apache 2.0 License, which permits use, distribution, and modification of the software, provided the license text accompanies any substantial portions of it.
