TowerInstruct-7B-v0.2
Unbabel
Introduction
TowerInstruct-7B-v0.2 is a language model developed by Unbabel in collaboration with Instituto Superior Técnico and CentraleSupélec University of Paris-Saclay. It is designed to handle a variety of translation-related tasks, including machine translation, automatic post-editing, named-entity recognition, grammatical error correction, and paraphrase generation. The model supports ten languages: English, Portuguese, Spanish, French, German, Dutch, Italian, Korean, Chinese, and Russian.
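As a small illustration of how the supported languages come into play, the sketch below builds a translation instruction in the same wording as the example prompt in the running guide later in this document. The helper name `build_translation_prompt` and the `SUPPORTED_LANGUAGES` constant are hypothetical and not part of any library; the language list is copied from the paragraph above.

```python
# Languages listed in the model description above.
SUPPORTED_LANGUAGES = {
    "English", "Portuguese", "Spanish", "French", "German",
    "Dutch", "Italian", "Korean", "Chinese", "Russian",
}


def build_translation_prompt(text: str, source: str, target: str) -> str:
    """Build a translation instruction (hypothetical helper, for illustration only)."""
    for lang in (source, target):
        if lang not in SUPPORTED_LANGUAGES:
            raise ValueError(f"Unsupported language: {lang}")
    if source == target:
        raise ValueError("Source and target languages must differ")
    # Same wording as the example prompt in the running guide.
    return (
        f"Translate the following text from {source} into {target}.\n"
        f"{source}: {text}\n"
        f"{target}:"
    )


if __name__ == "__main__":
    print(build_translation_prompt("Bom dia.", "Portuguese", "English"))
```

Validating the language names up front is only a convenience; the model itself does not enforce this list.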
Architecture
TowerInstruct-7B-v0.2 is a 7-billion-parameter model fine-tuned from TowerBase on a mixture of publicly available and synthetic datasets covering translation-related tasks, conversational data, and code instructions. It was trained with the ChatML prompt template, without a system prompt.
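To make the prompt format concrete, here is a minimal sketch of how a ChatML conversation without a system prompt is typically laid out. The function `format_chatml` is a hypothetical helper written for illustration; in practice the tokenizer's own chat template is the authoritative source, and it may add details (such as special tokens) that this sketch omits.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render messages in the ChatML layout (illustrative sketch, not the official template)."""
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|>.
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    # No system message is included, matching the description above.
    print(format_chatml([{"role": "user", "content": "Translate 'Bom dia' into English."}]))
```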
Training
The model was trained using the TowerBlocks dataset, which includes diverse data sources for translation, automatic post-editing, context-aware translation, and more. The training used a total batch size of 256, a learning rate of 7e-06 with a cosine scheduler, and ran for four epochs. The optimizer was Adam with specific parameters, and the maximum sequence length was set to 2048.
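For readers unfamiliar with cosine scheduling, the sketch below shows how the learning rate could decay from the stated peak of 7e-06. It assumes a plain cosine decay to zero with no warmup phase, which the text above does not confirm, and `total_steps` is a hypothetical value chosen only for the demonstration.

```python
import math

PEAK_LR = 7e-6  # learning rate stated in the training description


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr to 0 (assumes no warmup, for illustration only)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical number of optimizer steps
    for step in (0, 250, 500, 750, 1000):
        print(step, cosine_lr(step, total_steps))
```

The schedule starts at the peak rate, reaches half of it at the midpoint, and ends at zero.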
Guide: Running Locally
- Install Requirements
  - Ensure you have Python installed.
  - Install the Transformers library (from source) and Accelerate:

        pip install git+https://github.com/huggingface/transformers.git
        pip install accelerate
- Run the Model

      import torch
      from transformers import pipeline

      # Load the model in bfloat16; device_map="auto" relies on Accelerate for placement.
      pipe = pipeline(
          "text-generation",
          model="Unbabel/TowerInstruct-7B-v0.2",
          torch_dtype=torch.bfloat16,
          device_map="auto",
      )

      messages = [
          {
              "role": "user",
              "content": "Translate the following text from Portuguese into English.\nPortuguese: Um grupo de investigadores lançou um novo modelo para tarefas relacionadas com tradução.\nEnglish:",
          },
      ]
      # Format the conversation with the tokenizer's chat template.
      prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
      # do_sample=False selects greedy decoding, so the output is deterministic.
      outputs = pipe(prompt, max_new_tokens=256, do_sample=False)
      print(outputs[0]["generated_text"])
- For optimal performance, consider using a cloud GPU service like AWS, Google Cloud, or Azure.
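By default, the text-generation pipeline returns the prompt together with the completion in `generated_text`. The sketch below shows one way to keep only the model's reply, assuming the ChatML markers appear literally in the returned string. The helper `extract_response` is hypothetical, and the sample string is hand-written for the demonstration rather than real model output.

```python
ASSISTANT_MARKER = "<|im_start|>assistant\n"
END_MARKER = "<|im_end|>"


def extract_response(generated_text: str) -> str:
    """Return the text after the last assistant marker (illustrative helper)."""
    # rpartition yields the whole string as the tail when the marker is absent.
    tail = generated_text.rpartition(ASSISTANT_MARKER)[2]
    # Drop a trailing end-of-turn marker if the decoder kept it.
    return tail.split(END_MARKER, 1)[0].strip()


if __name__ == "__main__":
    # Hand-written sample, not actual model output.
    sample = (
        "<|im_start|>user\nTranslate 'Bom dia' into English.<|im_end|>\n"
        "<|im_start|>assistant\nGood morning.<|im_end|>"
    )
    print(extract_response(sample))
```

Alternatively, passing `return_full_text=False` to the pipeline call asks it to return only the newly generated text, which avoids the string handling altogether.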
License
TowerInstruct-7B-v0.2 is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0). Llama 2, on which the TowerBase model is built, is licensed under the LLAMA 2 Community License from Meta Platforms, Inc.