klue-sroberta-base-continue-learning-by-mnr
Introduction
The KLUE-SROBERTA-BASE-CONTINUE-LEARNING-BY-MNR model is a Korean sentence-transformer designed for sentence similarity tasks. It is trained on the KLUE/NLI and KLUE/STS datasets using a two-stage continue-learning method, which improves performance on downstream tasks such as clustering and semantic search.
Architecture
The model architecture is built using the SentenceTransformer framework, featuring a RobertaModel with a maximum sequence length of 512 and mean pooling for sentence embeddings. The embedding dimension is 768.
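For reference, an equivalent module stack can be assembled with the sentence-transformers `models` API. This is a minimal sketch; the `klue/roberta-base` starting checkpoint is an assumption, as the card does not name the base weights:

```python
from sentence_transformers import SentenceTransformer, models

# Transformer encoder: a RoBERTa model with max_seq_length=512
word_embedding_model = models.Transformer("klue/roberta-base", max_seq_length=512)

# Mean pooling over token embeddings -> 768-dim sentence vectors
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),  # 768
    pooling_mode="mean",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])
```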
Training
The training process involved two stages:
- First, the model was trained on the NLI dataset with negative sampling, using MultipleNegativesRankingLoss.
- It was further fine-tuned with the STS dataset using CosineSimilarityLoss.
The model was trained for 4 epochs with a batch size of 32, using the AdamW optimizer with a learning rate of 2e-5 and a linear warmup (WarmupLinear) scheduler.
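As a rough illustration of this two-stage recipe in code, the following sketch uses the sentence-transformers `fit` API with the hyperparameters stated above. The `klue/roberta-base` starting checkpoint, the placeholder training examples, and the `warmup_steps` value are assumptions not specified in the card:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("klue/roberta-base")  # assumed base checkpoint

# Stage 1: NLI (anchor, entailed sentence) pairs; other in-batch pairs act as negatives.
nli_examples = [InputExample(texts=["앵커 문장", "함의 문장"])]  # placeholder data
nli_loader = DataLoader(nli_examples, shuffle=True, batch_size=32)
nli_loss = losses.MultipleNegativesRankingLoss(model)
model.fit(
    train_objectives=[(nli_loader, nli_loss)],
    epochs=4,
    scheduler="WarmupLinear",
    warmup_steps=100,  # illustrative; the card does not state a value
    optimizer_params={"lr": 2e-5},
)

# Stage 2: STS sentence pairs with gold similarity scores scaled to [0, 1].
sts_examples = [InputExample(texts=["문장 1", "문장 2"], label=0.8)]  # placeholder data
sts_loader = DataLoader(sts_examples, shuffle=True, batch_size=32)
sts_loss = losses.CosineSimilarityLoss(model)
model.fit(
    train_objectives=[(sts_loader, sts_loss)],
    epochs=4,
    scheduler="WarmupLinear",
    warmup_steps=100,  # illustrative; the card does not state a value
    optimizer_params={"lr": 2e-5},
)
```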
Guide: Running Locally
- Installation: Ensure you have `sentence-transformers` or `transformers` installed:

  ```bash
  pip install -U sentence-transformers
  ```
- Usage with SentenceTransformers:

  ```python
  from sentence_transformers import SentenceTransformer

  model = SentenceTransformer("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
  embeddings = model.encode(["안녕하세요?", "반갑습니다."])  # example sentences
  ```
- Usage with HuggingFace Transformers (you must apply mean pooling to the token outputs yourself; see the sketch after this list):

  ```python
  from transformers import AutoTokenizer, AutoModel

  tokenizer = AutoTokenizer.from_pretrained("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
  model = AutoModel.from_pretrained("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
  ```
- Cloud GPUs: For optimal performance and faster computation, consider using cloud GPU services such as AWS, Google Cloud, or Azure.
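With plain `transformers`, the model returns token embeddings; to reproduce the sentence embeddings, mean pooling must be applied over the attention-masked tokens, matching the architecture described above. Below is a minimal sketch following the standard sentence-transformers mean-pooling pattern (the `mean_pooling` helper and example sentences are illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModel

def mean_pooling(model_output, attention_mask):
    # Average token embeddings, ignoring padding positions via the attention mask.
    token_embeddings = model_output[0]  # first element: all token embeddings
    mask = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * mask, 1) / torch.clamp(mask.sum(1), min=1e-9)

tokenizer = AutoTokenizer.from_pretrained("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
model = AutoModel.from_pretrained("bespin-global/klue-sroberta-base-continue-learning-by-mnr")

sentences = ["안녕하세요?", "반갑습니다."]  # example inputs
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
with torch.no_grad():
    model_output = model(**encoded_input)

sentence_embeddings = mean_pooling(model_output, encoded_input["attention_mask"])
print(sentence_embeddings.shape)  # (2, 768)
```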
License
This model is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.