GODEL v1_1 large seq2seq
Introduction
GODEL is a large-scale pre-trained model developed for goal-directed dialogs. It uses a Transformer-based encoder-decoder architecture to generate responses grounded on external text, which makes it well suited to fine-tuning on dialog tasks that must condition on information outside the current conversation.
Architecture
GODEL's architecture is a Transformer encoder-decoder designed for response generation. It is trained on 551 million multi-turn dialogs from Reddit and 5 million instruction- and knowledge-grounded dialogs, which lets it generate responses that are aware of the dialog context and integrate external knowledge.
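As a quick way to inspect that architecture, the sketch below loads only the checkpoint's configuration and prints its dimensions. This is a minimal sketch assuming the checkpoint exposes T5-style configuration fields (the v1_1 seq2seq checkpoints follow a T5-style encoder-decoder layout); the field names are not documented in this card.

from transformers import AutoConfig

# Load only the configuration (no weights) to inspect the architecture.
config = AutoConfig.from_pretrained("microsoft/GODEL-v1_1-large-seq2seq")

# Assumes T5-style config fields on this checkpoint.
print(config.model_type)          # encoder-decoder family
print(config.num_layers)          # encoder layers
print(config.num_decoder_layers)  # decoder layers
print(config.d_model)             # hidden size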
Training
The model is pre-trained on a substantial dataset of dialogs from Reddit and other sources, enabling effective fine-tuning on specific dialog tasks with minimal additional data. This training approach allows the model to adapt to new dialog tasks by conditioning responses on external knowledge.
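To make that conditioning concrete, the sketch below shows how an instruction, a dialog history, and a piece of grounding text are serialized into a single input sequence. The [CONTEXT] and [KNOWLEDGE] markers and the EOS turn separator match the generation code later in this guide; the example strings themselves are purely illustrative.

# Illustrative example of GODEL's serialized input format.
instruction = 'Instruction: given a dialog context and related knowledge, you need to answer the question based on the knowledge.'
dialog = ['What does GODEL do?']  # illustrative dialog turn
knowledge = 'GODEL is a pre-trained model for goal-directed dialogs.'  # illustrative grounding text

# Turns are joined with ' EOS ', and markers delimit context and knowledge.
query = f"{instruction} [CONTEXT] {' EOS '.join(dialog)} [KNOWLEDGE] {knowledge}"
print(query)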
Guide: Running Locally
To run the GODEL model locally, follow these steps:
- Install the Transformers library (PyTorch is also required by the model class used below):

pip install transformers torch
- Load the pre-trained model and tokenizer:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/GODEL-v1_1-large-seq2seq")
model = AutoModelForSeq2SeqLM.from_pretrained("microsoft/GODEL-v1_1-large-seq2seq")
- Define a function to generate dialog responses:

def generate(instruction, knowledge, dialog):
    # Prefix the grounding text with its marker token when provided.
    if knowledge != '':
        knowledge = '[KNOWLEDGE] ' + knowledge
    # Join dialog turns with the EOS separator GODEL expects.
    dialog = ' EOS '.join(dialog)
    # Serialize instruction, context, and knowledge into a single query.
    query = f"{instruction} [CONTEXT] {dialog} {knowledge}"
    input_ids = tokenizer(query, return_tensors="pt").input_ids
    # Sample a response with nucleus sampling.
    outputs = model.generate(input_ids, max_length=128, min_length=8, top_p=0.9, do_sample=True)
    output = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return output
- Test the model with a sample dialog:

instruction = 'Instruction: given a dialog context, you need to respond empathically.'
knowledge = ''
dialog = [
    'Does money buy happiness?',
    'It is a question. Money buys you a lot of things, but not enough to buy happiness.',
    'What is the best way to buy happiness?'
]
response = generate(instruction, knowledge, dialog)
print(response)
For faster inference with this large checkpoint, consider using a GPU, either locally or through cloud services such as AWS EC2, Google Cloud, or Microsoft Azure.
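As a minimal sketch of GPU usage, assuming a CUDA-enabled PyTorch build, the model can be moved to the GPU after loading; the input tensor created inside generate() must then be moved to the same device:

import torch

# Use a GPU when one is available (assumes a CUDA-enabled PyTorch install).
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Inside generate(), move the inputs to the same device before calling model.generate():
# input_ids = tokenizer(query, return_tensors="pt").input_ids.to(device)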
License
The GODEL model is released under the MIT License, allowing for wide use and modification.