Introduction

The ES_TEXT_NEUTRALIZER model is designed to transform Spanish text into gender-neutral language, supporting the United Nations' goal of gender equality. This model converts non-inclusive words or expressions into their inclusive counterparts, fostering a more equitable language use.

Architecture

The model is a fine-tuned version of spanish-t5-small, implemented in PyTorch. It performs Text2Text Generation, specializing in gender neutralization of Spanish text.

Training

Training Data

The training data was compiled from a variety of Spanish-language resources on non-sexist language use. These sources include guidelines from the Spanish Ministry of Health, Social Services, and Equality, as well as from various universities and organizations focused on gender-neutral language.

Training Procedure

The model was trained with the following hyperparameters:

  • Learning Rate: 1e-04
  • Train Batch Size: 32
  • Seed: 42
  • Number of Epochs: 10
  • Weight Decay: 0.01
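
As a rough illustration, the hyperparameters above could be collected in a small configuration object. This is only a sketch: the source does not say which training loop was used, so the mapping onto keyword names of the Hugging Face TrainingArguments class is an assumption, and TrainingConfig, to_trainer_kwargs, and total_optimizer_steps are hypothetical helper names. The step count also assumes a single device and no gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters listed in the model card (hypothetical container)."""
    learning_rate: float = 1e-4
    train_batch_size: int = 32
    seed: int = 42
    num_epochs: int = 10
    weight_decay: float = 0.01


def to_trainer_kwargs(cfg: TrainingConfig) -> dict:
    """Map the config onto keyword names used by transformers' TrainingArguments.

    Assumption: the original training used (or could be reproduced with)
    the Hugging Face Trainer API; the model card does not confirm this.
    """
    return {
        "learning_rate": cfg.learning_rate,
        "per_device_train_batch_size": cfg.train_batch_size,
        "seed": cfg.seed,
        "num_train_epochs": cfg.num_epochs,
        "weight_decay": cfg.weight_decay,
    }


def total_optimizer_steps(cfg: TrainingConfig, num_examples: int) -> int:
    """Optimizer steps for a dataset of num_examples examples.

    Assumes one device and no gradient accumulation, so each step
    consumes one batch and the last partial batch still counts.
    """
    steps_per_epoch = math.ceil(num_examples / cfg.train_batch_size)
    return steps_per_epoch * cfg.num_epochs
```

The returned dictionary could be unpacked into a training-arguments constructor together with an output directory. The dataset size passed to total_optimizer_steps is up to the caller, since the model card does not report one.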

Metrics

Reported evaluation scores are sacrebleu (0.96), BertScoreF1 (0.98), and DiffBleu (0.35). These metrics assess the semantic similarity and accuracy of the text neutralization process.

Guide: Running Locally

  1. Setup Environment: Ensure you have Python and PyTorch installed. Use a virtual environment for better management.

  2. Clone Repository: Download the model files from the Hugging Face repository.

  3. Install Dependencies: Run pip install transformers to install necessary libraries.

  4. Run Model: Utilize the Hugging Face transformers library to load and run the model on your local machine.

  5. Cloud GPUs: For better performance, especially for large datasets, consider using cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure.
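
Steps 3 and 4 above can be sketched roughly as follows, using the generic sequence-to-sequence classes from the transformers library. The neutralize function is a hypothetical helper, the model identifier is deliberately left as a parameter because the exact repository path is not given here, and whether the model expects a task prefix in its input is not stated, so none is added. T5-based tokenizers may additionally require the sentencepiece package.

```python
def neutralize(text, model_id, max_new_tokens=64):
    """Return a gender-neutral rewrite of a Spanish sentence.

    model_id is the Hugging Face repository name or a local directory
    containing the downloaded model files (supplied by the caller).
    """
    # Imported lazily so this module can be loaded before the
    # dependencies from step 3 are installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize to PyTorch tensors; truncate inputs longer than the
    # tokenizer's maximum length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Default (greedy) decoding; other decoding settings are a free choice.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (requires the model files and network or local access):
# print(neutralize("Los alumnos deben entregar el trabajo.", "path/to/model"))
```

This version reloads the tokenizer and model on every call, which keeps the sketch short. For repeated use, loading them once and reusing the objects would be the more practical design.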

License

The ES_TEXT_NEUTRALIZER model is released under the Apache 2.0 License, which permits both personal and commercial use provided the license and attribution notices are retained.
