FLAN-T5-Small

Google

Introduction

FLAN-T5-Small is a language model developed by Google and distributed on Hugging Face. It is an enhanced version of the T5 model, fine-tuned on more than 1,000 additional tasks covering multiple languages, and shows improved performance and usability over the original T5 checkpoints in zero-shot and few-shot settings.
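In practice, zero-shot use phrases each task as a natural-language instruction inside the input text. A minimal sketch of that prompt formatting (the task phrasings and the `make_prompt` helper below are illustrative assumptions, not a fixed API):

```python
# Illustrative instruction-style prompts for zero-shot use.
# FLAN-T5 reads the task description as plain text in the input;
# the exact phrasings here are examples, not a fixed prompt syntax.

def make_prompt(task: str, text: str) -> str:
    """Prepend a natural-language task instruction to the input text."""
    return f"{task}: {text}"

prompts = [
    make_prompt("translate English to German", "How old are you?"),
    make_prompt("summarize", "FLAN-T5 is an instruction-tuned version of T5."),
    make_prompt("answer the question", "What is the capital of France?"),
]

for p in prompts:
    print(p)
```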

Architecture

  • Model Type: Language model
  • Languages: Supports a wide range of languages including English, French, German, Chinese, Arabic, and many more.
  • License: Apache 2.0
  • Related Models: Other FLAN-T5 checkpoints (Base, Large, XL, XXL) are available on Hugging Face.
  • Resources: The model's design and improvements are documented in the accompanying research paper on arXiv ("Scaling Instruction-Finetuned Language Models"), and the implementation can be reviewed on GitHub.

Training

FLAN-T5-Small was trained on TPU v3/v4 pods using the t5x codebase with JAX. The training data comprised a diverse mixture of tasks chosen to improve zero-shot and few-shot performance. The model inherits the T5 architecture and is fine-tuned on instruction-formatted examples for enhanced performance.
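Concretely, instruction fine-tuning trains on (instruction, target) text pairs drawn from many tasks. A toy sketch of what one such supervised example might look like (the field names are assumptions for illustration, not the actual t5x data schema):

```python
# Toy representation of an instruction-tuning example: the input is a
# natural-language instruction plus the task input, and the target is
# the expected output text. Field names are illustrative only.

example = {
    "input": "translate English to German: How old are you?",
    "target": "Wie alt bist du?",
}

# A fine-tuning mixture is simply a collection of such pairs drawn from
# many different tasks, which is what improves zero-shot transfer.
dataset = [example]
print(len(dataset), dataset[0]["input"])
```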

Guide: Running Locally

Basic Steps

  1. Install Dependencies: Ensure you have transformers and sentencepiece installed (the T5 tokenizer depends on sentencepiece). Use pip install transformers sentencepiece.

  2. Loading the Model:

    from transformers import T5Tokenizer, T5ForConditionalGeneration
    
    tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-small")
    model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-small")
    
  3. Running on CPU:

    input_text = "translate English to German: How old are you?"
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids
    
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0]))
    
  4. Running on GPU:

    # pip install accelerate
    from transformers import T5ForConditionalGeneration
    
    # device_map="auto" places the model on the available GPU (requires accelerate)
    model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-small", device_map="auto")
    
    input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
    outputs = model.generate(input_ids)
    print(tokenizer.decode(outputs[0]))
    

Cloud GPUs

For enhanced performance, especially for larger inputs or batch processing, consider using cloud GPU services such as AWS, GCP, or Azure.
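For batch processing, inputs are typically grouped into fixed-size chunks before being tokenized and passed to the model together. A minimal batching sketch (the batch size of 8 is an arbitrary choice, and `batched` is a hypothetical helper, not a transformers API):

```python
def batched(items, batch_size=8):
    """Yield successive fixed-size batches from a list of inputs."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

inputs = [f"summarize: document {n}" for n in range(20)]
batches = list(batched(inputs, batch_size=8))
print([len(b) for b in batches])  # [8, 8, 4]
```

Each batch would then be passed to the tokenizer (with padding enabled) and to model.generate in a single call, amortizing per-call overhead on the GPU.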

License

FLAN-T5-Small is licensed under the Apache 2.0 License, allowing for both personal and commercial use with proper attribution.
