es_text_neutralizer LLM Model

Introduction

The ES_TEXT_NEUTRALIZER model transforms Spanish text into gender-neutral language, in support of the United Nations' goal of gender equality. It rewrites non-inclusive words and expressions as their inclusive counterparts to encourage more equitable language use.

Architecture

The model is a fine-tuned version of spanish-t5-small, implemented in PyTorch. It performs Text2Text Generation, specifically gender neutralization of Spanish text.

Training

Training Data

The model was trained on text drawn from a variety of Spanish-language resources on non-sexist language. These sources include guidelines from the Spanish Ministry of Health, Social Services, and Equality, as well as guides published by universities and organizations that promote gender-neutral language.

Training Procedure

The model was trained with the following hyperparameters:

Learning Rate: 1e-04
Train Batch Size: 32
Seed: 42
Number of Epochs: 10
Weight Decay: 0.01
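
As an illustration of how these values might be passed to a trainer, the sketch below stores them under the matching Seq2SeqTrainingArguments parameter names from the Hugging Face transformers library. This is an assumption about the setup, not the authors' training script: it treats the train batch size as a per-device value on a single device with no gradient accumulation, and the names HYPERPARAMETERS, total_optimizer_steps, and build_training_args are hypothetical.

```python
import math

# Values copied from the list above, keyed by the corresponding
# Seq2SeqTrainingArguments parameter names (an assumed mapping).
HYPERPARAMETERS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 32,
    "seed": 42,
    "num_train_epochs": 10,
    "weight_decay": 0.01,
}


def total_optimizer_steps(num_examples, hp=HYPERPARAMETERS):
    """Estimate optimizer steps for a dataset of `num_examples` rows.

    Assumes one device, no gradient accumulation, and that the last
    partial batch of each epoch is kept.
    """
    steps_per_epoch = math.ceil(num_examples / hp["per_device_train_batch_size"])
    return steps_per_epoch * hp["num_train_epochs"]


def build_training_args(output_dir):
    """Build training arguments from the values above (hypothetical helper)."""
    # Imported lazily so the rest of the sketch runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Under these assumptions, a dataset of 1,000 examples would give 32 steps per epoch and 320 optimizer steps over the 10 epochs.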

Metrics

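The reported evaluation scores are sacrebleu (0.96), BertScoreF1 (0.98), and DiffBleu (0.35). These metrics measure the accuracy of the neutralized output and its semantic similarity to the reference text: sacrebleu scores n-gram overlap with the reference, and BertScoreF1 scores similarity between contextual embeddings.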

Guide: Running Locally

Set Up Environment: Make sure Python and PyTorch are installed. Use a virtual environment to keep the dependencies isolated.
Clone Repository: Download the model files from the Hugging Face repository.
Install Dependencies: Run pip install transformers to install the required library.
Run Model: Use the Hugging Face transformers library to load the model and run it on your local machine.
Cloud GPUs: For better performance, especially on large datasets, consider cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure.
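
The steps above could look roughly like the following minimal sketch. It assumes the model files were downloaded to a local directory (MODEL_DIR is a hypothetical path), that the model loads through the generic AutoTokenizer and AutoModelForSeq2SeqLM classes, and that the input needs no task prefix. The helper names batched and neutralize, the batch size, and the generation length are illustrative choices, not values from the model card.

```python
from typing import Iterator, List

# Assumption: local directory holding the downloaded model files.
MODEL_DIR = "./es_text_neutralizer"  # hypothetical path


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def neutralize(texts: List[str], model_dir: str = MODEL_DIR,
               batch_size: int = 8, max_new_tokens: int = 128) -> List[str]:
    """Run the model over `texts` in batches and return the generated strings."""
    # Imported lazily so `batched` works without the heavy dependencies.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Use a GPU when one is available, otherwise fall back to the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir).to(device)
    model.eval()

    outputs: List[str] = []
    for batch in batched(texts, batch_size):
        inputs = tokenizer(batch, return_tensors="pt",
                           padding=True, truncation=True).to(device)
        with torch.no_grad():
            generated = model.generate(**inputs, max_new_tokens=max_new_tokens)
        outputs.extend(tokenizer.batch_decode(generated, skip_special_tokens=True))
    return outputs
```

A call such as neutralize(["Los alumnos llegaron tarde."]) would return a list with one rewritten sentence. Processing the input in batches keeps memory use bounded, which matters most for the large datasets mentioned above.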

License

The ES_TEXT_NEUTRALIZER model is released under the Apache 2.0 License, which permits both personal and commercial use provided the license text and attribution notices are retained.
