sbert-base-chinese-nli
Introduction
The SBERT-BASE-CHINESE-NLI model is a pre-trained sentence embedding model for sentence similarity tasks in Chinese, built on the Sentence-BERT architecture. It was trained with the UER-py framework and can also be trained with TencentPretrain.
Architecture
The model is based on the Sentence-BERT architecture, with the pre-trained chinese_roberta_L-12_H-768 model as its backbone. It is designed for feature extraction and sentence similarity tasks, comparing sentence embeddings by cosine similarity.
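For intuition, the sketch below reproduces this pipeline with plain transformers: it encodes the two example sentences from the guide and compares them by cosine similarity. Mean pooling over token embeddings is an assumption about this checkpoint's pooling layer (the standard Sentence-BERT choice), not something stated here; the supported path is the sentence-transformers loader shown in the guide below.

# Hypothetical sketch: encode sentences with plain transformers and compare them.
# Mean pooling is an assumed pooling strategy; prefer the sentence-transformers API.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('uer/sbert-base-chinese-nli')
model = AutoModel.from_pretrained('uer/sbert-base-chinese-nli')

sentences = ['那个人很开心', '那个人非常开心']
encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, hidden)

# Mean pooling: average token embeddings, ignoring padding positions.
mask = encoded['attention_mask'].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

similarity = torch.nn.functional.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(similarity.item())  # close to 1.0 for near-paraphrases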
Training
The SBERT-BASE-CHINESE-NLI model was fine-tuned on the ChineseTextualInference dataset using the UER-py framework. Training ran for five epochs with a sequence length of 128, and the model was saved at the end of any epoch in which it achieved a new best result on the development set:
python3 finetune/run_classifier_siamese.py --pretrained_model_path models/cluecorpussmall_roberta_base_seq512_model.bin-250000 \
--vocab_path models/google_zh_vocab.txt \
--config_path models/sbert/base_config.json \
--train_path datasets/ChineseTextualInference/train.tsv \
--dev_path datasets/ChineseTextualInference/dev.tsv \
--learning_rate 5e-5 --epochs_num 5 --batch_size 64
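Both train.tsv and dev.tsv are tab-separated files whose first row names the columns, following UER-py's classification data convention. The rows below are an invented illustration of that layout (the sentences and integer labels are hypothetical; check the actual dataset files for the real label scheme):

text_a	text_b	label
一个人在骑马。	有人在户外。	0
一个人在骑马。	有人在睡觉。	2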
The fine-tuned model is then converted to Hugging Face's format:
python3 scripts/convert_sbert_from_uer_to_huggingface.py --input_model_path models/finetuned_model.bin \
--output_model_path pytorch_model.bin \
--layers_num 12
Guide: Running Locally
To run the SBERT-BASE-CHINESE-NLI model locally, follow these steps:
- Install the sentence-transformers library:
pip install sentence-transformers
- Load the model and encode sentences:
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('uer/sbert-base-chinese-nli')
sentences = ['那个人很开心', '那个人非常开心']
sentence_embeddings = model.encode(sentences)
- Calculate cosine similarity:
from sklearn.metrics.pairwise import paired_cosine_distances

cosine_score = 1 - paired_cosine_distances([sentence_embeddings[0]], [sentence_embeddings[1]])
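Note that paired_cosine_distances returns a distance, so subtracting it from 1 recovers the similarity. Recent versions of sentence-transformers also ship their own cosine-similarity utility, which avoids the scikit-learn dependency; a minimal sketch:

# Alternative: sentence-transformers' built-in cosine similarity.
from sentence_transformers import util

# cos_sim accepts 1-D vectors and returns a (1, 1) similarity tensor here.
cosine_score = util.cos_sim(sentence_embeddings[0], sentence_embeddings[1])
print(float(cosine_score))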
For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure.
License
The SBERT-BASE-CHINESE-NLI model is released under the Apache License 2.0.