T5-Small-Awesome-Text-to-SQL

cssupport

Introduction

The T5-Small-Awesome-Text-to-SQL model generates SQL queries from natural language input. It is based on the T5-small architecture and fine-tuned to write queries over one or more tables whose schemas are supplied in the prompt as "CREATE TABLE" statements. This lightweight model is suitable for analytical applications that require SQL query generation.
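The input format can be sketched as follows. This is a minimal sketch, assuming the prompt layout matches the sample prompt in the local-run guide ("tables:" followed by the schemas, then "query for:" and the question). The helper name build_prompt is hypothetical and not part of the model's code.

```python
def build_prompt(schemas, question):
    """Join CREATE TABLE statements and a question into one prompt string.

    Hypothetical helper: the layout is inferred from the sample prompt,
    not taken from an official specification.
    """
    # Strip whitespace and trailing semicolons so the statements join cleanly.
    cleaned = [s.strip().rstrip(";").strip() for s in schemas]
    tables = "; ".join(cleaned)
    return f"tables:\n {tables}\n query for: {question.strip()}"


prompt = build_prompt(
    [
        "CREATE TABLE student_course_attendance (student_id VARCHAR)",
        "CREATE TABLE students (student_id VARCHAR)",
    ],
    "List the id of students who never attends courses?",
)
print(prompt)
```

Building the prompt in one place keeps the schema and question formatting consistent across calls, which matters because the model was fine-tuned on prompts of a particular shape.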

Architecture

The model is a T5-small language model fine-tuned for text-to-SQL generation. It runs on PyTorch through the Hugging Face Transformers library and accepts English-language input. Fine-tuning used datasets such as Clinton/Text-to-sql-v1 and b-mc2/sql-create-context.

Training

The model was trained on a combination of text-to-SQL datasets. Training ran on a single NVIDIA A100 80 GB GPU, using PyTorch and the Hugging Face Transformers library.

Guide: Running Locally

  1. Install Dependencies: Ensure you have Python and PyTorch installed. Install the Transformers library via pip, together with SentencePiece, which T5Tokenizer depends on:

    pip install transformers sentencepiece
    
  2. Initialize the Model: Load the tokenizer and model, then define a helper that generates SQL from a prompt:

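    import torch
    from transformers import T5Tokenizer, T5ForConditionalGeneration
    
    # The tokenizer comes from the base t5-small checkpoint; the fine-tuned weights come from the model repository.
    tokenizer = T5Tokenizer.from_pretrained('t5-small')
    # Use a GPU when one is available, otherwise fall back to the CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = T5ForConditionalGeneration.from_pretrained('cssupport/t5-small-awesome-text-to-sql').to(device)
    model.eval()  # inference mode: disables dropout
    
    def generate_sql(input_prompt):
        # Tokenize the prompt and move the tensors to the model's device.
        inputs = tokenizer(input_prompt, padding=True, truncation=True, return_tensors="pt").to(device)
        # Generate without tracking gradients; output is capped at 512 tokens.
        with torch.no_grad():
            outputs = model.generate(**inputs, max_length=512)
        # Decode the first (and only) generated sequence back into text.
        return tokenizer.decode(outputs[0], skip_special_tokens=True)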
    
  3. Test the Model: Use a sample input prompt to generate SQL:

    input_prompt = "tables:\n CREATE TABLE student_course_attendance (student_id VARCHAR); CREATE TABLE students (student_id VARCHAR)\n query for: List the id of students who never attends courses?"
    print("The generated SQL query is:", generate_sql(input_prompt))
    
  4. Cloud GPUs: For intensive workloads, consider cloud providers such as AWS, GCP, or Azure, which offer instances with GPUs like the NVIDIA A100.
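Generated SQL is not guaranteed to be valid, so it can help to check a query against the same schemas before using it. The sketch below is one possible approach, not part of the model: it assumes SQLite's dialect is close enough to the target database, and the helper name is_valid_sql is hypothetical. It uses only Python's standard sqlite3 module.

```python
import sqlite3


def is_valid_sql(schema_statements, query):
    """Return True if `query` compiles against the given schemas in SQLite.

    Hypothetical helper: SQLite's dialect may differ from the target database.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Create empty tables from the CREATE TABLE statements used in the prompt.
        for stmt in schema_statements:
            conn.execute(stmt)
        # EXPLAIN compiles the statement without running it against any data.
        conn.execute(f"EXPLAIN {query}")
        return True
    except (sqlite3.Error, sqlite3.Warning):
        # Syntax errors, unknown tables or columns, and similar problems land here.
        return False
    finally:
        conn.close()


schemas = [
    "CREATE TABLE student_course_attendance (student_id VARCHAR)",
    "CREATE TABLE students (student_id VARCHAR)",
]
candidate = (
    "SELECT student_id FROM students "
    "WHERE NOT student_id IN (SELECT student_id FROM student_course_attendance)"
)
print(is_valid_sql(schemas, candidate))
```

In practice, candidate would be the string returned by generate_sql. A check like this catches syntax errors and references to missing tables or columns, but it does not confirm that the query answers the question correctly.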

License

The T5-Small-Awesome-Text-to-SQL model is licensed under the Apache 2.0 License, allowing for wide use and modification with proper attribution.
