IT5 Large News Summarization
Introduction
The IT5 Large model is designed for Italian news summarization. It has been fine-tuned on the Fanpage and Il Post datasets. This model is part of the research on large-scale text-to-text pretraining for Italian language understanding and generation.
Architecture
IT5 Large is based on the T5 architecture, supporting text-to-text generation tasks. It is compatible with TensorFlow, PyTorch, and JAX libraries.
Training
The model was fine-tuned on news datasets from Fanpage and Il Post, with performance evaluated using ROUGE and BERTScore metrics. Training ran on a TPU v3-8 VM in Eemshaven, Netherlands, with reported CO2 emissions of 51 g.
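To make the evaluation metrics concrete, below is a minimal, illustrative ROUGE-1 F1 computation on whitespace tokens. The official scores were produced with standard ROUGE tooling, so this sketch is only meant to show what the metric measures (token overlap between a generated summary and a reference).

```python
# Illustrative ROUGE-1 F1 on whitespace tokens (not the official scorer).
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    pred_tokens = Counter(prediction.lower().split())
    ref_tokens = Counter(reference.lower().split())
    # Overlap counts each token at most as often as it appears in both texts.
    overlap = sum((pred_tokens & ref_tokens).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_tokens.values())
    recall = overlap / sum(ref_tokens.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("il governo approva la legge", "il governo ha approvato la legge")
print(round(score, 3))  # → 0.727
```

BERTScore works analogously but matches tokens by contextual embedding similarity rather than exact overlap, which makes it more tolerant of paraphrase.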
Guide: Running Locally
To run the IT5 Large model locally, follow these steps:
- Install the Transformers library:

  pip install transformers

- Load the model with the Transformers pipeline:

  from transformers import pipeline

  newsum = pipeline("summarization", model="it5/it5-large-news-summarization")

- Summarize text:

  result = newsum("Your news article text here")

- Use AutoTokenizer and AutoModelForSeq2SeqLM for finer control:

  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

  tokenizer = AutoTokenizer.from_pretrained("it5/it5-large-news-summarization")
  model = AutoModelForSeq2SeqLM.from_pretrained("it5/it5-large-news-summarization")
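Articles longer than the model's input window must be shortened or split before summarization. The helper below is a hypothetical, whitespace-based pre-processing sketch (pure Python, no model required); the 400-word budget is an assumption for illustration, not a documented limit of IT5 Large.

```python
# Hypothetical helper: split a long article into chunks of at most
# `max_words` whitespace tokens, so each chunk can be summarized separately.
# The default budget of 400 words is an assumption, not a model constant.
def chunk_article(text: str, max_words: int = 400) -> list[str]:
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

chunks = chunk_article("parola " * 1000)
print(len(chunks))  # → 3 (400 + 400 + 200 words)
```

Each chunk can then be passed to the summarization pipeline, and the per-chunk summaries concatenated or summarized again for a final result.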
For optimal performance, consider using cloud GPUs from platforms like Google Cloud, AWS, or Azure.
License
The model is licensed under the Apache-2.0 License, allowing for both personal and commercial use.