klue-sroberta-base-continue-learning-by-mnr

bespin-global

Introduction

The klue-sroberta-base-continue-learning-by-mnr model is a Korean sentence-transformer for sentence-similarity tasks. It is trained on the KLUE/NLI and KLUE/STS datasets with a two-stage continued-learning approach, improving performance on downstream tasks such as clustering and semantic search.

Architecture

The model architecture is built using the SentenceTransformer framework, featuring a RobertaModel with a maximum sequence length of 512 and mean pooling for sentence embeddings. The embedding dimension is 768.
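
This composition corresponds to the standard sentence-transformers building blocks, roughly as follows (a sketch; using klue/roberta-base as the underlying checkpoint is an assumption, not stated in this section):

    from sentence_transformers import SentenceTransformer, models

    # Transformer backbone with the stated 512-token limit.
    word_embedding_model = models.Transformer("klue/roberta-base", max_seq_length=512)
    # Mean pooling over token embeddings -> 768-dimensional sentence vectors.
    pooling_model = models.Pooling(
        word_embedding_model.get_word_embedding_dimension(),  # 768 for this backbone
        pooling_mode_mean_tokens=True,
    )
    model = SentenceTransformer(modules=[word_embedding_model, pooling_model])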

Training

The training process involved two stages:

  1. First, the model was trained on the NLI dataset with MultipleNegativesRankingLoss, which treats the other examples in each batch as in-batch negatives.
  2. Second, it was fine-tuned on the STS dataset with CosineSimilarityLoss.

The model was trained for 4 epochs with a batch size of 32, using the AdamW optimizer with a learning rate of 2e-5 and a WarmupLinear learning-rate scheduler, as sketched below.
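
The two stages map onto the standard sentence-transformers training API roughly as follows. This is a minimal sketch: the tiny inline Korean pairs, the warmup_steps value, and the klue/roberta-base starting checkpoint are illustrative assumptions, not details taken from the model card.

    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, InputExample, losses

    model = SentenceTransformer("klue/roberta-base")  # assumed base checkpoint

    # Stage 1: entailment pairs with MultipleNegativesRankingLoss (in-batch negatives).
    nli_examples = [
        InputExample(texts=["오늘 날씨가 맑다.", "하늘이 맑고 화창하다."]),
        InputExample(texts=["그는 책을 읽는다.", "그는 독서를 하고 있다."]),
    ]
    nli_loader = DataLoader(nli_examples, shuffle=True, batch_size=32)
    model.fit(
        train_objectives=[(nli_loader, losses.MultipleNegativesRankingLoss(model))],
        epochs=4,
        optimizer_params={"lr": 2e-5},
        scheduler="WarmupLinear",
        warmup_steps=100,  # illustrative; the exact value is not stated
    )

    # Stage 2: scored sentence pairs with CosineSimilarityLoss (labels in [0, 1]).
    sts_examples = [
        InputExample(texts=["고양이가 잔다.", "고양이가 자고 있다."], label=0.9),
        InputExample(texts=["고양이가 잔다.", "주가가 올랐다."], label=0.1),
    ]
    sts_loader = DataLoader(sts_examples, shuffle=True, batch_size=32)
    model.fit(
        train_objectives=[(sts_loader, losses.CosineSimilarityLoss(model))],
        epochs=4,
        optimizer_params={"lr": 2e-5},
        scheduler="WarmupLinear",
        warmup_steps=100,
    )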

Guide: Running Locally

  1. Installation: Ensure you have sentence-transformers or transformers installed.

    pip install -U sentence-transformers
    
  2. Usage with SentenceTransformers:

    from sentence_transformers import SentenceTransformer
    model = SentenceTransformer("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
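    # Encode example sentences (illustrative); one 768-dimensional vector each.
    embeddings = model.encode(["안녕하세요?", "만나서 반갑습니다."])
    print(embeddings.shape)  # (2, 768)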
    
  3. Usage with HuggingFace Transformers:

    from transformers import AutoTokenizer, AutoModel
    tokenizer = AutoTokenizer.from_pretrained("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
    model = AutoModel.from_pretrained("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
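    
    # The raw transformer outputs token embeddings; mean pooling over the
    # attention mask yields sentence embeddings matching the model's pooling
    # module (a sketch of the standard recipe; the sentences are illustrative).
    import torch
    
    sentences = ["안녕하세요?", "만나서 반갑습니다."]
    encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        token_embeddings = model(**encoded).last_hidden_state
    mask = encoded["attention_mask"].unsqueeze(-1).float()
    sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)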
    
  4. Cloud GPUs: For optimal performance and faster computations, consider using cloud GPU services such as AWS, Google Cloud, or Azure.
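
For semantic search, query and corpus embeddings can be compared by cosine similarity. A minimal sketch using sentence_transformers.util follows; the query and corpus sentences are illustrative:

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("bespin-global/klue-sroberta-base-continue-learning-by-mnr")
    corpus = ["서울은 대한민국의 수도입니다.", "오늘은 날씨가 맑습니다."]
    query = "한국의 수도는 어디인가요?"

    corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
    query_embedding = model.encode(query, convert_to_tensor=True)

    scores = util.cos_sim(query_embedding, corpus_embeddings)  # shape (1, len(corpus))
    best = int(scores.argmax())
    print(corpus[best], float(scores[0, best]))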

License

This model is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
