flan-t5-base

google

Introduction

FLAN-T5 is an instruction-finetuned version of the T5 language model, fine-tuned on more than 1,000 additional tasks spanning multiple languages. The instruction finetuning makes the model far easier to prompt out of the box and yields strong zero-shot and few-shot performance, in some cases matching or exceeding that of considerably larger models.

Architecture

FLAN-T5 retains the encoder-decoder Transformer architecture of T5 and supports numerous languages, including English, Spanish, and Chinese. Checkpoints at several model sizes are available under the Apache 2.0 license. The model is designed to improve performance on zero-shot and few-shot tasks and was trained on TPU v3/v4 pods using a JAX-based codebase.

Training

FLAN-T5 was trained on a diverse set of tasks to improve zero-shot and few-shot capabilities. The model builds on the pretrained T5 architecture, with additional fine-tuning for better performance. It was trained on Google Cloud TPU Pods using the T5X framework, which leverages JAX for efficient processing.

Guide: Running Locally

To run FLAN-T5 locally, follow these steps:

  1. Install Dependencies:

    pip install transformers accelerate
    
  2. Load the Model and Tokenizer:

    • CPU:

      from transformers import T5Tokenizer, T5ForConditionalGeneration
      
      # Load the tokenizer and model onto the CPU (the default device)
      tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
      model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base")
      
      # T5-style models expect a task prefix followed by the input text
      input_text = "translate English to German: How old are you?"
      input_ids = tokenizer(input_text, return_tensors="pt").input_ids
      
      outputs = model.generate(input_ids)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))
      
    • GPU:

      from transformers import T5Tokenizer, T5ForConditionalGeneration
      
      tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-base")
      # device_map="auto" (provided by accelerate) places the model on available GPUs
      model = T5ForConditionalGeneration.from_pretrained("google/flan-t5-base", device_map="auto")
      
      input_text = "translate English to German: How old are you?"
      # Inputs must be moved to the same device as the model
      input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
      
      outputs = model.generate(input_ids)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))
      
  3. Cloud GPUs:

    • Consider using cloud GPU providers like AWS, Google Cloud, or Azure for enhanced computational resources.
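As a lighter-weight alternative to the steps above, the transformers pipeline API wraps tokenizer loading, model loading, generation, and decoding in one call. This is a minimal sketch using the same google/flan-t5-base checkpoint on CPU:

```python
from transformers import pipeline

# "text2text-generation" is the pipeline task for encoder-decoder models like T5
generator = pipeline("text2text-generation", model="google/flan-t5-base")

# The pipeline returns a list with one dict per input
result = generator("translate English to German: How old are you?")
print(result[0]["generated_text"])
```

This trades fine-grained control (device placement, generation parameters) for brevity; the explicit tokenizer/model approach above remains preferable when you need custom generation settings.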

License

FLAN-T5 is licensed under Apache 2.0, allowing for broad use and modification within the terms of the license.
