pip-sql-1.3b-GGUF
QuantFactory
Introduction
pip-sql-1.3b-GGUF is a GGUF-quantized text-to-SQL model published by QuantFactory, derived from PipableAI/pip-sql-1.3b. It is designed for text-to-SQL tasks and is reported to outperform many SQL-specialized models, and even ChatGPT, on certain benchmarks. The model builds on a DeepSeek base model and is further trained to improve SQL query generation.
Architecture
The model utilizes a 1.3 billion parameter architecture built for SQL query generation tasks. It leverages softmax cross-entropy and a modified policy gradient approach in an EM (Expectation-Maximization) setup to optimize performance. The architecture supports both PyTorch and JAX frameworks, ensuring flexibility and compatibility.
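The pairing of a supervised cross-entropy term with a policy-gradient term can be illustrated on a toy example. The sketch below is not the model's training code: the function names, the reward value, and the mixing weight `alpha` are assumptions chosen for illustration, and the EM alternation itself is not shown. It computes a softmax cross-entropy loss for a reference token and a REINFORCE-style loss for a sampled token, then blends the two.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target_index):
    """Supervised loss: negative log-probability of the reference token."""
    return -log_softmax(logits)[target_index]

def policy_gradient_loss(logits, sampled_index, reward, baseline=0.0):
    """REINFORCE-style surrogate: -(reward - baseline) * log p(sampled token)."""
    return -(reward - baseline) * log_softmax(logits)[sampled_index]

def combined_loss(logits, target_index, sampled_index, reward, alpha=0.5):
    """Hypothetical blend of both terms; alpha is an assumed mixing weight."""
    ce = cross_entropy(logits, target_index)
    pg = policy_gradient_loss(logits, sampled_index, reward)
    return alpha * ce + (1.0 - alpha) * pg

logits = [2.0, 0.5, -1.0]  # toy scores over a 3-token vocabulary
print(cross_entropy(logits, 0))
print(combined_loss(logits, target_index=0, sampled_index=0, reward=1.0))
```

When the sampled token equals the reference token and the reward is 1, the two terms coincide, so the blend reduces to plain cross-entropy. A reward of 0 removes the policy-gradient contribution entirely.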
Training
The model was trained using a dataset from PipableAI, known as the pip-txt-to-sql-spider-bird-dataset. The training process involved sophisticated loss functions, including Q loss, to enhance the model's ability to understand and generate SQL queries accurately. Benchmarking was conducted using the Semantic Evaluation for Text-to-SQL with Distilled Test Suites, which is an officially recognized evaluation framework.
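Test-suite evaluation judges a predicted query by what it returns when executed, not by its text. A minimal sketch of that idea follows, assuming a single toy SQLite database with a hypothetical `singer` table; the actual framework runs each query against several database instances to reduce false positives, and the helper names here are illustrative only.

```python
import sqlite3
from collections import Counter

def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None

def execution_match(conn, predicted_sql, gold_sql):
    """True when both queries run and return the same multiset of rows."""
    gold = run_query(conn, gold_sql)
    pred = run_query(conn, predicted_sql)
    return gold is not None and pred == gold

# Toy in-memory database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?, ?)",
    [(1, "Ann", 31), (2, "Bob", 45), (3, "Cy", 45)],
)

gold = "SELECT name FROM singer WHERE age > 40"
pred = "SELECT name FROM singer WHERE age >= 45"
print(execution_match(conn, pred, gold))
```

The two queries differ textually but return the same rows on this data, which is exactly why a single database can overstate accuracy. The comparison above ignores row order, a simplification that would need revisiting for queries with `ORDER BY`.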
Guide: Running Locally
- Installation:
  - Install the necessary libraries (Transformers plus PyTorch, which the steps below rely on) with the following command:
    pip install transformers torch
- Setup:
  - Load the model and tokenizer using PyTorch:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    device = "cuda"
    model = AutoModelForCausalLM.from_pretrained("PipableAI/pip-sql-1.3b")
    tokenizer = AutoTokenizer.from_pretrained("PipableAI/pip-sql-1.3b")
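- Execution:
  - Generate SQL queries with:
    # text must already hold the prompt string for the model
    inputs = tokenizer(text, return_tensors="pt").to(model.device)  # keep inputs on the model's device
    outputs = model.generate(**inputs, max_new_tokens=200)
    # keep only the text between the <sql> and </sql> tags
    print(tokenizer.decode(outputs[0], skip_special_tokens=True).split('<sql>')[1].split('</sql>')[0])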
- Cloud GPU:
  - Consider using cloud-based GPUs such as those provided by AWS, Google Cloud, or Azure for efficient model execution.
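The Execution step uses a `text` variable and splits the output on `<sql>` tags, but the guide does not show how the prompt is built. The helpers below are a sketch under assumptions: the `<sql>` and `</sql>` tags come from the extraction code above, while the `<schema>` and `<question>` wrappers and both function names are hypothetical and should be checked against the upstream model card before use.

```python
def build_prompt(schema: str, question: str) -> str:
    """Assemble a prompt; the <schema>/<question> tags are an assumption."""
    return f"<schema>{schema}</schema>\n<question>{question}</question>\n<sql>"

def extract_sql(decoded: str) -> str:
    """Return the text between the first <sql> tag and </sql>.

    Returns an empty string when no <sql> tag is present, instead of
    raising IndexError as a bare split('<sql>')[1] would.
    """
    _, sep, tail = decoded.partition("<sql>")
    if not sep:
        return ""
    return tail.split("</sql>", 1)[0].strip()

# Example with a made-up schema and a hand-written completion.
text = build_prompt("CREATE TABLE singer (id INT, name TEXT);", "How many singers are there?")
decoded = text + "SELECT COUNT(*) FROM singer</sql>"
print(extract_sql(decoded))
```

Passing `text` to the tokenizer as in the Execution step, then calling `extract_sql` on the decoded output, avoids a crash when the model emits no closing tag or no SQL at all.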
License
The model is open source and distributed under the Apache 2.0 License. This allows for use, modification, and distribution with proper attribution.