Llama3-Chinese-8B-Instruct
FlagAlpha/Llama3-Chinese-8B-Instruct
Introduction
Llama3-Chinese-8B-Instruct is a conversational model fine-tuned from Llama3-8B, developed jointly by the Llama Chinese community and AtomEcho. It targets Chinese text generation, and the community plans to release updated model parameters on an ongoing basis.
Architecture
The model keeps the Llama3-8B decoder-only transformer architecture; its Chinese capability comes from fine-tuning rather than structural modification. It is tuned specifically for dialogue and instruction-following tasks.
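As a quick sanity check, the architecture hyperparameters can be inspected without downloading the weights. A minimal sketch, assuming the checkpoint ships a standard config.json:

from transformers import AutoConfig

# Load only the configuration (no weights) to inspect the
# underlying Llama3 architecture hyperparameters.
config = AutoConfig.from_pretrained("FlagAlpha/Llama3-Chinese-8B-Instruct")

print(config.model_type)           # typically "llama" for Llama3 checkpoints
print(config.hidden_size)          # transformer hidden dimension
print(config.num_hidden_layers)    # number of decoder layers
print(config.num_attention_heads)  # attention heads per layer
print(config.vocab_size)           # Llama3's enlarged tokenizer vocabulary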
Training
The model was fine-tuned on Chinese datasets with a focus on dialogue and conversational contexts. Detailed instructions for deployment, training, and fine-tuning are maintained in the Llama Chinese community's GitHub repository, Llama-Chinese; a sketch of what such a fine-tuning setup can look like follows.
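The repository documents the community's actual recipe; purely as an illustration, a minimal LoRA-style supervised fine-tuning setup with the transformers, datasets, and peft libraries might look like the sketch below. The dataset file, LoRA ranks, and training hyperparameters are placeholders, not the community's real values.

import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_id = "FlagAlpha/Llama3-Chinese-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Llama tokenizers ship without a pad token; reuse EOS for padding.
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections so only a
# small fraction of the parameters is trained.
lora = LoraConfig(r=8, lora_alpha=16,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# "chinese_dialogue.jsonl" is a placeholder for your own corpus of
# {"text": ...} records.
dataset = load_dataset("json", data_files="chinese_dialogue.jsonl")["train"]

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1,
                           learning_rate=2e-4,
                           bf16=True,
                           logging_steps=10),
    train_dataset=dataset,
    # mlm=False gives standard causal-LM labels (inputs shifted by one).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()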
Guide: Running Locally
To run the LLAMA3-CHINESE-8B-INSTRUCT model locally, you can use the following Python code:
import transformers
import torch

model_id = "FlagAlpha/Llama3-Chinese-8B-Instruct"

# Build a text-generation pipeline with half-precision weights on the GPU.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="cuda",
)

# Assemble the conversation in the chat format the tokenizer expects.
messages = [{"role": "system", "content": ""}]
messages.append({"role": "user", "content": "介绍一下机器学习"})  # "Give an introduction to machine learning"

# Render the messages into a single prompt string with the model's
# built-in chat template, appending the assistant-turn header.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop generation at either the generic EOS token or Llama3's
# end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The pipeline returns the prompt plus the completion; strip the prompt.
content = outputs[0]["generated_text"][len(prompt):]
print(content)
For optimal performance, run the model on a GPU with roughly 16 GB of VRAM or more (the 8B parameters alone occupy about 16 GB in float16), such as a cloud GPU instance from AWS, Google Cloud Platform, or Azure.
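If a large GPU is unavailable, 4-bit quantization through the bitsandbytes integration in transformers can cut weight memory roughly fourfold relative to float16, at some cost in output quality. A minimal sketch, assuming bitsandbytes is installed:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "FlagAlpha/Llama3-Chinese-8B-Instruct"

# NF4 4-bit quantization; computations still run in float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers across available devices
)

The quantized model can then be used with the same chat template and generation settings shown above.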
License
This model is released under the Apache 2.0 License, which permits both personal and commercial use provided its terms are followed.