miscii-14b-1028-4bit

mlx-community

Introduction

miscii-14b-1028-4bit is a text-generation model published by the MLX Community, designed for chat and conversational use. It supports English and Chinese and is distributed in the MLX format with 4-bit quantization, which reduces memory use for efficient local deployment.
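To illustrate what 4-bit quantization does, the sketch below applies group-wise affine quantization to a list of floats: each group of weights is mapped to 16 integer levels plus a per-group offset and scale. This is a simplified illustration, not MLX's exact scheme (group size, rounding mode, and storage layout differ in practice):

```python
# Illustrative group-wise 4-bit affine quantization.
# NOTE: a simplified sketch, not the scheme MLX actually uses.
def quantize_4bit(values, group_size=32):
    """Quantize floats to 4-bit codes (0..15), one offset/scale per group."""
    out = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / 15 or 1.0  # 4 bits -> 16 quantization levels
        codes = [round((v - lo) / scale) for v in group]
        out.append((lo, scale, codes))
    return out

def dequantize_4bit(groups):
    """Reconstruct approximate floats from (offset, scale, codes) groups."""
    return [lo + scale * c for (lo, scale, codes) in groups for c in codes]
```

Each stored weight costs 4 bits plus a small per-group overhead, roughly a 4x saving over 16-bit weights.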

Architecture

The model is a 4-bit MLX conversion of the original sthenno-com/miscii-14b-1028, produced with mlx-lm version 0.19.3. It follows the transformers tokenizer and config conventions and is optimized for text generation, making it suitable for conversational interfaces and research applications.
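A conversion like this can be reproduced with mlx-lm's convert entry point (a sketch; flag names assume a recent mlx-lm release, and the command downloads the full-precision weights first):

```shell
# Convert the original checkpoint to MLX format with 4-bit quantization
# (illustrative; requires mlx-lm and a large download of the source model)
mlx_lm.convert --hf-path sthenno-com/miscii-14b-1028 -q --q-bits 4
```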

Evaluation

The model was evaluated on the MMLU-Pro benchmark with 5-shot prompting, achieving an exact-match score of 0.6143.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install MLX-LM:

    pip install mlx-lm
    
  2. Load and Use the Model:

    from mlx_lm import load, generate
    
    model, tokenizer = load("mlx-community/miscii-14b-1028-4bit")
    
    prompt = "hello"
    
    if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
        messages = [{"role": "user", "content": prompt}]
        prompt = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    
    response = generate(model, tokenizer, prompt=prompt, verbose=True)
    
  3. Hardware note: MLX targets Apple silicon. To run this model on other hardware, including cloud GPU services such as AWS, Google Cloud, or Azure, use the original sthenno-com/miscii-14b-1028 checkpoint with a CUDA-capable framework instead.
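As an alternative to the Python API above, mlx-lm also ships a command-line generator (a sketch; flag names assume a recent mlx-lm release, and the model is downloaded on first use):

```shell
# One-off generation from the command line (requires mlx-lm installed)
mlx_lm.generate --model mlx-community/miscii-14b-1028-4bit \
  --prompt "hello" --max-tokens 100
```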

License

The miscii-14b-1028-4bit model is released under the Apache 2.0 License, which permits use, modification, and distribution subject to the license terms.
