mT5_m2o_chinese_simplified_crossSum

Introduction

The csebuetnlp/mT5_m2o_chinese_simplified_crossSum model is a many-to-one multilingual T5 checkpoint finetuned on cross-lingual pairs from the CrossSum dataset. It summarizes text from its supported source languages into Simplified Chinese.
Architecture
The model is based on mT5, a multilingual variant of T5 designed for text-to-text tasks. It accepts input text in 43 different languages and produces summaries in Simplified Chinese.
Training
The model was finetuned using the CrossSum dataset, which includes cross-lingual pairs with target summaries in Simplified Chinese. Detailed training scripts and methodologies can be found in the associated research paper and the official repository linked in the documentation.
Guide: Running Locally
To run the model locally, follow these steps:
- Install the Transformers library. Ensure you have the transformers library installed (the model was tested with version 4.11.0.dev0):

  ```bash
  pip install transformers
  ```
- Import the required modules:

  ```python
  import re

  from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
  ```
- Define a whitespace handler to normalize the input text:

  ```python
  WHITESPACE_HANDLER = lambda k: re.sub(r'\s+', ' ', re.sub(r'\n+', ' ', k.strip()))
  ```
- Prepare the text for summarization:

  ```python
  article_text = """Your text here"""
  ```
- Load the model and tokenizer:

  ```python
  model_name = "csebuetnlp/mT5_m2o_chinese_simplified_crossSum"
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
  ```
- Tokenize the input and generate the summary:

  ```python
  input_ids = tokenizer(
      [WHITESPACE_HANDLER(article_text)],
      return_tensors="pt",
      padding="max_length",
      truncation=True,
      max_length=512,
  )["input_ids"]

  output_ids = model.generate(
      input_ids=input_ids,
      max_length=84,
      no_repeat_ngram_size=2,
      num_beams=4,
  )[0]

  summary = tokenizer.decode(
      output_ids,
      skip_special_tokens=True,
      clean_up_tokenization_spaces=False,
  )
  print(summary)
  ```
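The whitespace handler used in the steps above simply strips the ends of the text, replaces newlines with spaces, and collapses runs of whitespace. A quick standalone check of its behavior:

```python
import re

# Same normalization as the guide's WHITESPACE_HANDLER: strip the ends,
# replace newlines with spaces, then collapse remaining whitespace runs.
WHITESPACE_HANDLER = lambda k: re.sub(r'\s+', ' ', re.sub(r'\n+', ' ', k.strip()))

print(WHITESPACE_HANDLER("First line.\n\nSecond   line.\t End. "))
# → "First line. Second line. End."
```

This matters because the model was finetuned on single-line article text, so feeding it raw multi-paragraph input without normalization can degrade summary quality.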
For faster inference, consider running the model on a cloud GPU from providers such as AWS, GCP, or Azure.
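A minimal device-placement sketch, assuming PyTorch is installed: move both the model and the tokenized inputs to the same device before calling generate (the `model` and `input_ids` names refer to the variables from the steps above).

```python
import torch

# Select a CUDA GPU when available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# With the model and inputs from the guide above:
# model = model.to(device)
# input_ids = input_ids.to(device)
# output_ids = model.generate(input_ids=input_ids, max_length=84, num_beams=4)[0]
print(device)
```

Keeping the model and its inputs on the same device is required; mixing CPU tensors with a GPU model raises a runtime error in PyTorch.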
License
This model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (cc-by-nc-sa-4.0).