MT5-Small Turkish Summarization

ozcangundes

Introduction
The MT5-Small Turkish Summarization system is a fine-tuned version of Google's multilingual T5-small (mT5-small) model, adapted for summarizing Turkish news articles. It was fine-tuned on the Turkish portion of the MLSUM dataset, framing summarization as a text-to-text generation task, with training implemented in PyTorch Lightning.

Architecture
The model is built on the mT5-small architecture, which contains roughly 300 million parameters and requires approximately 1.2GB of storage. mT5 is pre-trained only on the multilingual mC4 corpus, so it must be fine-tuned before it can perform downstream tasks. This version is fine-tuned specifically for Turkish text summarization.

Training
The model was fine-tuned on the Turkish portion of the MLSUM dataset, which contains over 250,000 news articles paired with summaries. To keep training tractable given the model's size, a subset of 20,000 articles was used for training and 4,000 for validation. Fine-tuning ran for 10 epochs with a batch size of 8 and a learning rate of 1e-4, taking around 4 hours.
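The fine-tuning setup described above can be sketched roughly as follows. This is a minimal single-step sketch, not the author's PyTorch Lightning code: the tiny `MT5Config` stands in for the real mt5-small checkpoint (which would be loaded with `from_pretrained("google/mt5-small")`), and the random batch stands in for tokenized MLSUM article/summary pairs.

```python
import torch
from transformers import MT5Config, MT5ForConditionalGeneration

# Tiny stand-in config so the sketch runs without downloading the
# ~1.2GB mt5-small checkpoint; real fine-tuning would instead use
# MT5ForConditionalGeneration.from_pretrained("google/mt5-small").
config = MT5Config(
    vocab_size=1000, d_model=64, d_kv=16, d_ff=128,
    num_layers=2, num_decoder_layers=2, num_heads=4,
)
model = MT5ForConditionalGeneration(config)

# Hyperparameters from the description: batch size 8, learning rate 1e-4.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Dummy batch standing in for tokenized (article, summary) pairs from MLSUM.
input_ids = torch.randint(0, 1000, (8, 32))
labels = torch.randint(0, 1000, (8, 16))

# One training step: seq2seq cross-entropy loss, backward pass, update.
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In the real run this step would repeat over the 20,000-article training subset for 10 epochs, with validation on the 4,000-article split after each epoch.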

Guide: Running Locally

  1. Install the transformers library from Hugging Face (the mT5 tokenizer also requires sentencepiece):
    pip install transformers sentencepiece
    
  2. Load the model and tokenizer:
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    
    tokenizer = AutoTokenizer.from_pretrained("ozcangundes/mt5-small-turkish-summarization")
    model = AutoModelForSeq2SeqLM.from_pretrained("ozcangundes/mt5-small-turkish-summarization")
    
  3. Use the provided generate_summary function to summarize your text.
  4. Given the model's size and computational requirements, using a cloud GPU is recommended for efficient processing. Services like AWS, Google Cloud Platform, or Azure can be considered.

License
The MT5-Small Turkish Summarization model is licensed under the MIT License, allowing for extensive freedom to use, modify, and distribute the software.
