StableLM 2 12B Chat
stabilityai

Introduction
StableLM 2 12B Chat is a 12-billion-parameter language model developed by Stability AI for instruction-tuned applications. It is trained on a mixture of publicly available and synthetic datasets and is designed for text generation in conversational settings.
Architecture
StableLM 2 12B Chat is an auto-regressive language model based on the transformer decoder architecture. It supports text generation and function calling, and it expects input prompts in the ChatML format.
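For reference, ChatML wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers. A minimal prompt would look roughly like the sketch below; in practice the exact token layout should be produced by the tokenizer's `apply_chat_template` method rather than assembled by hand:

```
<|im_start|>user
Implement snake game using pygame<|im_end|>
<|im_start|>assistant
```

The trailing `<|im_start|>assistant` line is the generation prompt that cues the model to produce the assistant's reply.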
Training
The model is trained on a combination of large-scale datasets available on the HuggingFace Hub and an internal safety dataset. Key datasets include HuggingFaceH4/ultrachat_200k, meta-math/MetaMathQA, and others. The training also involves Direct Preference Optimization (DPO) to enhance performance.
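To make the DPO step concrete, here is a minimal, self-contained sketch of the DPO loss for a single preference pair (an illustration of the published objective, not Stability AI's actual training code; the `beta` value is an assumed default):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    the policy or the frozen reference model; beta scales the margin.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(beta * margin)): the loss shrinks as the policy prefers
    # the chosen response more strongly than the reference model does.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

loss_weak = dpo_loss(-10.0, -12.0, -11.0, -11.0)   # margin = 2
loss_strong = dpo_loss(-8.0, -14.0, -11.0, -11.0)  # margin = 6
print(loss_weak > loss_strong)  # True: a wider preference gap lowers the loss
```

Minimizing this loss pushes the model toward preferred responses without training a separate reward model.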
Guide: Running Locally
To run the StableLM 2 12B Chat model locally, follow these basic steps:
- Install the required libraries (`accelerate` is needed for `device_map="auto"`):

```shell
pip install transformers accelerate
```
- Load and interact with the model using the following Python code:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('stabilityai/stablelm-2-12b-chat')
model = AutoModelForCausalLM.from_pretrained(
    'stabilityai/stablelm-2-12b-chat',
    device_map="auto",
)

# Build a ChatML prompt and tokenize it.
prompt = [{'role': 'user', 'content': 'Implement snake game using pygame'}]
inputs = tokenizer.apply_chat_template(
    prompt, add_generation_prompt=True, return_tensors='pt'
)

# Sample a completion, then decode only the newly generated tokens.
tokens = model.generate(
    inputs.to(model.device),
    max_new_tokens=100,
    temperature=0.7,
    do_sample=True,
)
output = tokenizer.decode(tokens[:, inputs.shape[-1]:][0], skip_special_tokens=False)
print(output)
```
- Consider using cloud GPUs for efficient performance, such as those offered by AWS, Google Cloud, or Azure.
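A back-of-the-envelope estimate shows why a dedicated GPU is advisable: holding the weights alone in 16-bit precision already requires roughly 22 GB, before activations and the KV cache are counted.

```python
# Rough VRAM needed just to store the weights of a 12B-parameter model
# in float16 / bfloat16 (2 bytes per parameter).
params = 12e9
bytes_per_param = 2

weight_gb = params * bytes_per_param / 1024**3
print(f"~{weight_gb:.0f} GiB for weights alone")  # ~22 GiB
```

Running in 8-bit or 4-bit quantization reduces this proportionally, at some cost in output quality.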
License
StableLM 2 12B Chat is released under the StabilityAI Non-Commercial Research Community License. For commercial applications, users must contact Stability AI for licensing. Further details are available on the official Stability AI license page.