IT5 Large News Summarization
Introduction
The IT5 Large model is designed for Italian news summarization. It has been fine-tuned on the Fanpage and Il Post datasets. This model is part of the research on large-scale text-to-text pretraining for Italian language understanding and generation.
Architecture
IT5 Large is based on the T5 architecture, supporting text-to-text generation tasks. It is compatible with TensorFlow, PyTorch, and JAX libraries.
Training
The model was fine-tuned on news datasets from Fanpage and Il Post, with performance evaluated using ROUGE and BERTScore metrics. Training ran on a TPU v3-8 VM in Eemshaven, Netherlands, with reported CO2 emissions of 51 g.
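To make the evaluation metrics concrete, below is a minimal, illustrative ROUGE-1 F1 computation on whitespace tokens. The official scores were produced with standard ROUGE tooling, so this sketch is only meant to show what the metric measures (token overlap between a generated summary and a reference).

```python
# Illustrative ROUGE-1 F1 on whitespace tokens (not the official scorer).
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    pred_tokens = Counter(prediction.lower().split())
    ref_tokens = Counter(reference.lower().split())
    # Overlap counts each token at most as often as it appears in both texts.
    overlap = sum((pred_tokens & ref_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_tokens.values())
    recall = overlap / sum(ref_tokens.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("il governo approva la legge", "il governo ha approvato la legge")
print(round(score, 3))  # → 0.727
```

BERTScore works analogously but matches tokens by contextual embedding similarity rather than exact overlap, which makes it more tolerant of paraphrase.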
Guide: Running Locally
To run the IT5 Large model locally, follow these steps:
- Install the Transformers library:

  pip install transformers

- Load the model with the Transformers pipeline:

  from transformers import pipeline

  newsum = pipeline("summarization", model="it5/it5-large-news-summarization")

- Summarize text:

  result = newsum("Your news article text here")

- Use AutoTokenizer and AutoModelForSeq2SeqLM for finer control:

  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

  tokenizer = AutoTokenizer.from_pretrained("it5/it5-large-news-summarization")
  model = AutoModelForSeq2SeqLM.from_pretrained("it5/it5-large-news-summarization")
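Articles longer than the model's input window must be shortened or split before summarization. The helper below is a hypothetical, whitespace-based pre-processing sketch (pure Python, no model required); the 400-word budget is an assumption for illustration, not a documented limit of IT5 Large.

```python
# Hypothetical helper: split a long article into chunks of at most
# `max_words` whitespace tokens, so each chunk can be summarized separately.
# The default budget of 400 words is an assumption, not a model constant.
def chunk_article(text: str, max_words: int = 400) -> list[str]:
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

chunks = chunk_article("parola " * 1000)
print(len(chunks))  # → 3 (400 + 400 + 200 words)
```

Each chunk can then be passed to the summarization pipeline, and the per-chunk summaries concatenated or summarized again for a final result.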
For optimal performance, consider using cloud GPUs from platforms like Google Cloud, AWS, or Azure.
License
The model is licensed under the Apache-2.0 License, allowing for both personal and commercial use.