BERT-BASE-CHINESE-FINETUNED-NER

Introduction

BERT-BASE-CHINESE-FINETUNED-NER is a model fine-tuned from bert-base-chinese on the fdner dataset for token classification (named entity recognition). On the evaluation set it achieves a precision of 0.9146, recall of 0.9414, F1 score of 0.9278, and accuracy of 0.9751.

Architecture

This model is based on the BERT architecture, tailored for token classification in Chinese. It uses the Transformers library and is compatible with PyTorch and TensorBoard.
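
For a quick check, the model can also be loaded through the Transformers pipeline API. A minimal sketch, assuming the weights are available on the Hugging Face Hub; the input sentence is illustrative, and entity grouping depends on the model's label scheme:

    from transformers import pipeline
    
    # Token-classification pipeline; aggregation_strategy groups
    # word pieces into entity spans.
    ner = pipeline(
        "token-classification",
        model="leonadase/bert-base-chinese-finetuned-ner",
        aggregation_strategy="simple",
    )
    print(ner("复旦大学位于上海市杨浦区。"))  # illustrative input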

Training

Training Procedure

The model was trained using the following hyperparameters (see the sketch after this list):

  • Learning rate: 2e-05
  • Train batch size: 10
  • Eval batch size: 10
  • Seed: 42
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Learning rate scheduler type: Linear
  • Number of epochs: 30
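
These hyperparameters map onto Hugging Face TrainingArguments roughly as follows. This is a minimal sketch rather than the published training script, and the output directory name is a placeholder:

    from transformers import TrainingArguments
    
    # Illustrative mapping of the reported hyperparameters onto
    # TrainingArguments; the original training script is not published.
    training_args = TrainingArguments(
        output_dir="bert-base-chinese-finetuned-ner",  # hypothetical path
        learning_rate=2e-5,
        per_device_train_batch_size=10,
        per_device_eval_batch_size=10,
        seed=42,
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-8,
        lr_scheduler_type="linear",
        num_train_epochs=30,
    )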

Training Results

Training was conducted over 30 epochs, with metrics improving steadily and culminating in a final validation loss of 0.1016 and the evaluation metrics reported above.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Clone the Repository:

    git clone https://huggingface.co/leonadase/bert-base-chinese-finetuned-ner
    cd bert-base-chinese-finetuned-ner
    
  2. Install Dependencies: Install the necessary Python packages:

    pip install transformers torch datasets
    
  3. Load the Model: Use the Transformers library to load the tokenizer and model:

    from transformers import BertTokenizer, BertForTokenClassification
    
    # Load the tokenizer and the fine-tuned token-classification model from the Hub
    tokenizer = BertTokenizer.from_pretrained("leonadase/bert-base-chinese-finetuned-ner")
    model = BertForTokenClassification.from_pretrained("leonadase/bert-base-chinese-finetuned-ner")
    
  4. Inference: Tokenize your text and run it through the model, as in the sketch below.
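
A minimal inference sketch, continuing from step 3; the example sentence is illustrative, and the actual tag set comes from the fdner dataset and can be inspected via model.config.id2label:

    import torch
    
    # Illustrative Chinese sentence
    text = "复旦大学位于上海市杨浦区。"
    inputs = tokenizer(text, return_tensors="pt")
    
    with torch.no_grad():
        logits = model(**inputs).logits
    
    # Map each token (including word pieces) to its predicted label
    predicted_ids = logits.argmax(dim=-1)[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    for token, pred_id in zip(tokens, predicted_ids):
        print(token, model.config.id2label[pred_id.item()])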

Consider using cloud GPUs from providers like AWS, Google Cloud, or Azure for efficient training and inference, especially if working with large datasets or requiring high computational power.

License

The licensing information for this model is not specified in the provided details. Please refer to the Hugging Face repository or contact the model creator for more information on usage rights and restrictions.
