Llama3-Chinese-8B-Instruct
FlagAlpha/Llama3-Chinese-8B-Instruct
Introduction
Llama3-Chinese-8B-Instruct is a conversational model fine-tuned from Llama3-8B, developed jointly by the Llama Chinese community and AtomEcho. It targets Chinese text generation, and the community plans to release updated model parameters on an ongoing basis.
Architecture
The model keeps the Llama3-8B decoder-only transformer architecture; its Chinese capability comes from fine-tuning rather than structural modification. It is tuned specifically for dialogue and instruction-following tasks.
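As a quick sanity check, the architecture hyperparameters can be inspected without downloading the weights. A minimal sketch, assuming the checkpoint ships a standard config.json:

from transformers import AutoConfig

# Load only the configuration (no weights) to inspect the
# underlying Llama3 architecture hyperparameters.
config = AutoConfig.from_pretrained("FlagAlpha/Llama3-Chinese-8B-Instruct")

print(config.model_type)           # typically "llama" for Llama3 checkpoints
print(config.hidden_size)          # transformer hidden dimension
print(config.num_hidden_layers)    # number of decoder layers
print(config.num_attention_heads)  # attention heads per layer
print(config.vocab_size)           # Llama3's enlarged tokenizer vocabulary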
Training
The model was fine-tuned on Chinese datasets with a focus on dialogue and conversational contexts. Detailed instructions for deployment, training, and fine-tuning are maintained in the Llama Chinese community's GitHub repository, Llama-Chinese; a sketch of what such a fine-tuning setup can look like follows.
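The repository documents the community's actual recipe; purely as an illustration, a minimal LoRA-style supervised fine-tuning setup with the transformers, datasets, and peft libraries might look like the sketch below. The dataset file, LoRA ranks, and training hyperparameters are placeholders, not the community's real values.

import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_id = "FlagAlpha/Llama3-Chinese-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Llama tokenizers ship without a pad token; reuse EOS for padding.
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Attach low-rank adapters to the attention projections so only a
# small fraction of the parameters is trained.
lora = LoraConfig(r=8, lora_alpha=16,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# "chinese_dialogue.jsonl" is a placeholder for your own corpus of
# {"text": ...} records.
dataset = load_dataset("json", data_files="chinese_dialogue.jsonl")["train"]

def tokenize(example):
    return tokenizer(example["text"], truncation=True, max_length=1024)

dataset = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=8,
                           num_train_epochs=1,
                           learning_rate=2e-4,
                           bf16=True,
                           logging_steps=10),
    train_dataset=dataset,
    # mlm=False gives standard causal-LM labels (inputs shifted by one).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()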
Guide: Running Locally
To run the LLAMA3-CHINESE-8B-INSTRUCT model locally, you can use the following Python code:
import transformers
import torch

model_id = "FlagAlpha/Llama3-Chinese-8B-Instruct"

# Build a text-generation pipeline with half-precision weights on the GPU.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="cuda",
)

# Assemble the conversation in the chat format the tokenizer expects.
messages = [{"role": "system", "content": ""}]
messages.append({"role": "user", "content": "介绍一下机器学习"})  # "Give an introduction to machine learning"

# Render the messages into a single prompt string with the model's
# built-in chat template, appending the assistant-turn header.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Stop generation at either the generic EOS token or Llama3's
# end-of-turn token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# The pipeline returns the prompt plus the completion; strip the prompt.
content = outputs[0]["generated_text"][len(prompt):]
print(content)
For optimal performance, run the model on a GPU with roughly 16 GB of VRAM or more (the 8B parameters alone occupy about 16 GB in float16), such as a cloud GPU instance from AWS, Google Cloud Platform, or Azure.
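If a large GPU is unavailable, 4-bit quantization through the bitsandbytes integration in transformers can cut weight memory roughly fourfold relative to float16, at some cost in output quality. A minimal sketch, assuming bitsandbytes is installed:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "FlagAlpha/Llama3-Chinese-8B-Instruct"

# NF4 4-bit quantization; computations still run in float16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers across available devices
)

The quantized model can then be used with the same chat template and generation settings shown above.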
License
This model is released under the Apache 2.0 License, which permits both personal and commercial use provided its terms are followed.