guocheng98/HelsinkiNLP-FineTuned-Legal-es-zh
Introduction
HelsinkiNLP-FineTuned-Legal-es-zh is a fine-tuned version of Helsinki-NLP/opus-tatoeba-es-zh, designed for translation between Spanish and Chinese in the legal domain. The model was developed as part of a master's thesis on neural machine translation at the Autonomous University of Barcelona.
Architecture
The model is based on the Marian NMT architecture and is implemented with the Hugging Face Transformers library on the PyTorch framework.
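For a concrete look at the architecture, the checkpoint's configuration can be inspected directly. A minimal sketch; the printed fields are standard Marian config attributes rather than values stated in this card:

```python
from transformers import AutoConfig

# Read the checkpoint's configuration from the Hugging Face Hub.
config = AutoConfig.from_pretrained("guocheng98/HelsinkiNLP-FineTuned-Legal-es-zh")

print(config.model_type)  # "marian"
print(config.encoder_layers, config.decoder_layers, config.d_model)
```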
Training
The model was trained on a dataset of legal texts, including the Spanish Civil Code and other legal documents translated into Chinese. The dataset comprises 9,972 sentence pairs, of which 1,000 were held out for evaluation. Training ran for 10 epochs with a learning rate of 2e-05, a batch size of 8, and the Adam optimizer with a linear learning-rate schedule; the best validation loss, 1.338, was reached at step 5,600.
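As a reference point, these hyperparameters map onto the Hugging Face Seq2SeqTrainer roughly as follows. This is a hedged sketch, not the thesis' actual training code: the placeholder corpus, column names, and output directory are assumptions, and Trainer's default optimizer is AdamW (an Adam variant) with a linear schedule, which corresponds to the setup reported above.

```python
from datasets import Dataset
from transformers import (
    DataCollatorForSeq2Seq,
    MarianMTModel,
    MarianTokenizer,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base = "Helsinki-NLP/opus-tatoeba-es-zh"
tokenizer = MarianTokenizer.from_pretrained(base)
model = MarianMTModel.from_pretrained(base)

# Placeholder corpus standing in for the 8,972 training pairs.
raw = Dataset.from_dict({
    "es": ["El contrato se regirá por la ley española."],
    "zh": ["本合同受西班牙法律管辖。"],
})

def preprocess(batch):
    # text_target= tokenizes labels with the target-side vocabulary
    # (requires a reasonably recent transformers release).
    return tokenizer(batch["es"], text_target=batch["zh"],
                     truncation=True, max_length=128)

tokenized = raw.map(preprocess, batched=True, remove_columns=["es", "zh"])

args = Seq2SeqTrainingArguments(
    output_dir="legal-es-zh",        # assumed output directory
    learning_rate=2e-5,              # as reported above
    per_device_train_batch_size=8,   # batch size 8
    num_train_epochs=10,             # 10 epochs
    lr_scheduler_type="linear",      # linear learning-rate schedule
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    eval_dataset=tokenized,          # placeholder for the 1,000 held-out pairs
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()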
Guide: Running Locally
To run this model locally, follow these steps:
- Set Up Environment:
  - Install Python and create a virtual environment.
  - Install the necessary libraries: transformers, torch, datasets, and tokenizers.
- Download the Model:
  - Use the Hugging Face Transformers library to download the model:

    ```python
    from transformers import MarianMTModel, MarianTokenizer

    model_name = "guocheng98/HelsinkiNLP-FineTuned-Legal-es-zh"
    model = MarianMTModel.from_pretrained(model_name)
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    ```
- Prepare Data:
  - Tokenize your input text with the tokenizer and prepare it for translation.
- Perform Inference:
  - Run the model on the tokenized input to obtain translations; an end-to-end sketch follows this list.
- Cloud GPUs:
  - For large-scale inference or training, consider cloud GPU services such as AWS, Azure, or Google Cloud for better performance.
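Putting steps 2-4 together, a minimal translation script looks like the following; the Spanish sentence is an illustrative placeholder, not taken from the training data:

```python
from transformers import MarianMTModel, MarianTokenizer

model_name = "guocheng98/HelsinkiNLP-FineTuned-Legal-es-zh"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Tokenize a Spanish legal sentence (placeholder example).
text = "El arrendatario deberá abonar la renta en los plazos pactados."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Generate and decode the Chinese translation.
output_ids = model.generate(**inputs)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```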
License
The model is released under the Apache 2.0 License, allowing for both personal and commercial use with appropriate attribution.