mt5 small turkish summarization
ozcangundes
Introduction
The MT5-Small Turkish Summarization model is a fine-tuned version of Google's Multilingual T5-small (mT5-small) model, adapted for summarizing Turkish news articles. It was fine-tuned on the MLSUM dataset with PyTorch Lightning and treats summarization as a text-to-text generation task.
Architecture
The model is built on the mT5-small architecture, which contains about 300 million parameters and requires approximately 1.2GB of storage. The base model is pre-trained only on the mC4 dataset, with no supervised training, so it has to be fine-tuned before it is usable on a downstream task such as summarization. This version is fine-tuned for Turkish text summarization.
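The two figures above are consistent with each other if the weights are stored as 32-bit floats, which is an assumption here rather than something the model card states. A minimal sketch of that arithmetic, using a hypothetical helper name:

```python
def checkpoint_size_gb(num_parameters: int, bytes_per_parameter: int = 4) -> float:
    """Estimate checkpoint size in gigabytes (1 GB = 1e9 bytes).

    bytes_per_parameter=4 assumes float32 weights; this is an assumption,
    not a documented property of the published checkpoint.
    """
    return num_parameters * bytes_per_parameter / 1e9


# About 300 million parameters, as quoted for mT5-small.
size_gb = checkpoint_size_gb(300_000_000)
print(f"{size_gb:.1f} GB")
```

Passing bytes_per_parameter=2 gives the corresponding estimate for half-precision weights.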
Training
The model was trained using the MLSUM dataset, which includes over 250,000 Turkish news articles and summaries. Due to the model's size, a subset of 20,000 articles was used for training, and 4,000 for validation. Training involved 10 epochs, a batch size of 8, and a learning rate of 0.0001, taking around 4 hours.
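To make the scale of this run concrete, the hyperparameters above can be collected in a small configuration object that also derives the number of optimizer steps. This is only a sketch: the class and field names are hypothetical, and it assumes a single device with no gradient accumulation, neither of which the source confirms.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values taken from the training description above.
    train_examples: int = 20_000
    val_examples: int = 4_000
    epochs: int = 10
    batch_size: int = 8
    learning_rate: float = 1e-4

    @property
    def steps_per_epoch(self) -> int:
        # Assumes one optimizer step per batch (no gradient accumulation).
        return math.ceil(self.train_examples / self.batch_size)

    @property
    def total_steps(self) -> int:
        return self.steps_per_epoch * self.epochs


config = TrainingConfig()
print(config.steps_per_epoch)  # 2500
print(config.total_steps)      # 25000
```

Under these assumptions, the reported 4 hours of training corresponds to roughly 25,000 optimizer steps.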
Guide: Running Locally
- Install the transformers library from Hugging Face:
    pip install transformers
- Load the model and tokenizer:
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    tokenizer = AutoTokenizer.from_pretrained("ozcangundes/mt5-small-turkish-summarization")
    model = AutoModelForSeq2SeqLM.from_pretrained("ozcangundes/mt5-small-turkish-summarization")
- Use the provided generate_summary function to summarize your text.
- Given the model's size and computational requirements, using a cloud GPU is recommended for efficient processing. Services like AWS, Google Cloud Platform, or Azure can be considered.
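The body of generate_summary is not reproduced in this guide. The following is a minimal sketch of what such a helper could look like, not the author's implementation. The load_model helper, the explicit tokenizer and model arguments, and every length and beam-search setting are assumptions chosen for illustration. Only standard transformers calls are used: AutoTokenizer.from_pretrained, AutoModelForSeq2SeqLM.from_pretrained, model.generate, and tokenizer.decode.

```python
MODEL_ID = "ozcangundes/mt5-small-turkish-summarization"


def load_model(model_id=MODEL_ID):
    """Load the tokenizer and model (hypothetical helper).

    The import is deferred so this module can be imported
    without transformers being installed.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def generate_summary(text, tokenizer, model,
                     max_input_tokens=512, max_summary_tokens=120, num_beams=2):
    """Summarize one article. All default values are illustrative assumptions."""
    # Tokenize the article, truncating anything beyond the input budget.
    inputs = tokenizer(
        text,
        max_length=max_input_tokens,
        truncation=True,
        return_tensors="pt",
    )
    # Decode with beam search; the settings are not taken from the model card.
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_summary_tokens,
        num_beams=num_beams,
        early_stopping=True,
    )
    # Convert the best sequence back to text without special tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

To try it, call load_model() once to obtain the tokenizer and model, then pass them to generate_summary together with the article text. Passing the tokenizer and model explicitly, rather than reading globals, keeps the helper easy to reuse with a different checkpoint.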
License
The MT5-Small Turkish Summarization model is licensed under the MIT License, allowing for extensive freedom to use, modify, and distribute the software.