Llama 3.3 70B Instruct 4bit
mlx-community
Introduction
Llama-3.3-70B-Instruct-4bit is a model hosted on Hugging Face by the MLX Community. It supports text generation tasks and is quantized to 4-bit precision, which reduces its memory footprint compared with the full-precision weights. The model is built on Meta's Llama 3.3 architecture and is packaged for use with the MLX library.
Architecture
This model is based on meta-llama/Llama-3.3-70B-Instruct. It supports multiple languages, including English, French, Italian, Portuguese, Hindi, Spanish, Thai, and German. The base model is published for the Transformers library and PyTorch; this conversion targets MLX and stores its weights in the Safetensors format for safe and efficient tensor loading.
Training
The base model was trained by Meta, and this repository contains a conversion of it to the MLX format produced with version 0.20.1 of the mlx-lm library. Specific details about the dataset or training process are not provided here. The converted model inherits the capabilities and architecture of Meta's Llama 3.3 models, which are designed for large-scale language tasks.
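For readers who want to see what such a conversion might look like, here is a minimal sketch using the `convert` function from mlx-lm. The source states only the base repository, the 4-bit precision, and the mlx-lm version. The output directory, the group size of 64, and the helper names `conversion_kwargs` and `run_conversion` are assumptions made for illustration, not the settings actually used for this repository.

```python
def conversion_kwargs(out_dir="Llama-3.3-70B-Instruct-4bit"):
    """Build keyword arguments for mlx_lm.convert.

    Only hf_path and q_bits come from the model description; the output
    directory and group size are assumed values for illustration.
    """
    return {
        "hf_path": "meta-llama/Llama-3.3-70B-Instruct",  # base model named in the text
        "mlx_path": out_dir,                              # assumed local output directory
        "quantize": True,                                 # quantize while converting
        "q_bits": 4,                                      # 4-bit precision, as described
        "q_group_size": 64,                               # assumed quantization group size
    }


def run_conversion():
    """Download the base weights and write a quantized MLX copy to disk."""
    # Imported lazily so the helper above can be inspected without MLX installed.
    from mlx_lm import convert

    convert(**conversion_kwargs())
```

Calling `run_conversion()` downloads the full-precision base weights, so it needs access to the gated meta-llama repository on Hugging Face and a large amount of disk space and memory.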
Guide: Running Locally
To run the Llama-3.3-70B-Instruct-4bit model locally, follow these steps:
- Install the MLX Library:
    pip install mlx-lm
- Load and Generate Text with the Model:
    from mlx_lm import load, generate

    # Download (on first use) and load the quantized weights and tokenizer.
    model, tokenizer = load("mlx-community/Llama-3.3-70B-Instruct-4bit")

    prompt = "hello"

    # Wrap the prompt in the model's chat template when one is available.
    if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
        messages = [{"role": "user", "content": prompt}]
        prompt = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )

    # verbose=True prints the generated text as it is produced.
    response = generate(model, tokenizer, prompt=prompt, verbose=True)
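- Hardware Requirements: MLX primarily targets Apple silicon, so this conversion is intended to run on a Mac with a large amount of unified memory rather than on a conventional cloud GPU instance. At 4-bit precision the weights of a 70B-parameter model occupy roughly 40 GB, so the machine needs noticeably more memory than that to leave room for the operating system and the generation cache.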
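As a rough check on whether a given machine can hold the model, the weight footprint can be estimated from the parameter count and the bit width. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes a nominal 70 billion parameters, that every parameter is quantized, and group-wise quantization with a 16-bit scale and a 16-bit bias per group of 64 weights. The function name `quantized_weight_gb` is hypothetical.

```python
def quantized_weight_gb(n_params, bits=4, group_size=64, overhead_bits_per_group=32):
    """Approximate size, in GB (10**9 bytes), of group-wise quantized weights.

    Each group of `group_size` weights is assumed to carry
    `overhead_bits_per_group` extra bits (scale plus bias).
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter count for a "70B" model (assumption: exactly 70e9).
weights_gb = quantized_weight_gb(70e9)
print(f"approximate weight size: {weights_gb:.1f} GB")
```

Under these assumptions the estimate comes to about 39.4 GB for the weights alone. Memory used by the key-value cache and activations during generation is not included and grows with context length.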
License
The model is distributed under the Llama 3.3 Community License by Meta. Users are granted a non-exclusive, worldwide, non-transferable, and royalty-free license to use, reproduce, and distribute the model, with specific conditions for attribution and redistribution. Users must comply with all applicable laws and the Acceptable Use Policy, which prohibits illegal and harmful activities.