t5-base-finetuned-wikiSQL (by mrm8488)
Introduction
This project involves fine-tuning Google's T5 model on the WikiSQL dataset to translate English queries into SQL commands. The T5 model is renowned for its transfer learning capabilities, which have been applied to a variety of natural language processing tasks.
Architecture
The T5 model, introduced in the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer," uses a text-to-text framework to handle different NLP tasks. It is pre-trained on a broad range of tasks before being fine-tuned on specific datasets like WikiSQL for SQL translation.
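For this checkpoint, the text-to-text framing means that both the English question and the SQL query are plain strings, with a task prefix on the input. A purely illustrative pair is shown below (the strings are invented for illustration, not taken from WikiSQL):

```python
# Illustrative input/target pair in T5's text-to-text format for this task.
# Example strings are invented for illustration only.
source = "translate English to SQL: How many heads of departments are older than 56?"
target = "SELECT COUNT Head FROM table WHERE Age > 56"
```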
Training
The model is fine-tuned using a script adapted from a Colab Notebook by Suraj Patil. The WikiSQL dataset consists of 56,355 training samples and 14,436 validation samples, which are used to train and validate the model's SQL translation capabilities.
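The linked notebook contains the authoritative training code; the sketch below is only an assumed outline of the data preparation, using the Hugging Face datasets library and the same "translate English to SQL:" prefix that appears in the inference example further down. The field names (question, sql.human_readable) follow the public WikiSQL dataset schema.

```python
# Assumed preprocessing sketch: turn WikiSQL rows into text-to-text pairs for T5.
# This is not the original fine-tuning script, only an illustration of the idea.
from datasets import load_dataset

dataset = load_dataset("wikisql")  # splits: train / validation / test

def to_text_pair(example):
    return {
        "input_text": "translate English to SQL: %s </s>" % example["question"],
        "target_text": "%s </s>" % example["sql"]["human_readable"],
    }

train_pairs = dataset["train"].map(to_text_pair)
validation_pairs = dataset["validation"].map(to_text_pair)
```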
Guide: Running Locally
To run the model locally, follow these steps:
- Install the transformers library:

  ```bash
  pip install transformers
  ```

- Load the model and tokenizer (see the note after these steps if you are on a recent transformers release):

  ```python
  from transformers import AutoModelWithLMHead, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-wikiSQL")
  model = AutoModelWithLMHead.from_pretrained("mrm8488/t5-base-finetuned-wikiSQL")
  ```

- Define a function to generate SQL from English queries:

  ```python
  def get_sql(query):
      input_text = "translate English to SQL: %s </s>" % query
      features = tokenizer([input_text], return_tensors='pt')
      output = model.generate(input_ids=features['input_ids'],
                              attention_mask=features['attention_mask'])
      return tokenizer.decode(output[0])
  ```

- Example usage:

  ```python
  query = "How many models were finetuned using BERT as base model?"
  print(get_sql(query))
  ```
For faster inference, it is recommended to run the model on a GPU, for example on cloud providers such as AWS EC2, Google Cloud, or Azure.
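If a CUDA-capable GPU is available, a minimal sketch of moving inference onto it looks like the following (the helper name get_sql_gpu is hypothetical; it simply mirrors the function above with the device moves added):

```python
# Move the model and the tokenized inputs to the GPU for faster generation.
# Assumes a CUDA-capable PyTorch installation; falls back to CPU otherwise.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

def get_sql_gpu(query):
    input_text = "translate English to SQL: %s </s>" % query
    features = tokenizer([input_text], return_tensors='pt').to(device)
    output = model.generate(input_ids=features['input_ids'],
                            attention_mask=features['attention_mask'])
    return tokenizer.decode(output[0])
```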
License
This project is licensed under the Apache-2.0 License.