llm-jp-3-172b-instruct3
Introduction
LLM-JP-3-172B-INSTRUCT3 is a large language model developed by the Research and Development Center for Large Language Models at the National Institute of Informatics. The model supports both English and Japanese and is designed for text generation tasks.
Architecture
The LLM-JP-3 models are based on the Transformer architecture. The variants, including LLM-JP-3-172B-INSTRUCT3, differ in scale and configuration, such as the number of layers, hidden size, and number of attention heads. The tokenizer is based on a Unigram byte-fallback model.
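To see the byte-fallback behavior in practice, the snippet below tokenizes a mixed Japanese/English string and decodes it back. This is a minimal illustration, not part of the official documentation; it assumes the tokenizer is fetched from the same Hugging Face repository used in the guide below.

from transformers import AutoTokenizer

# Load the Unigram byte-fallback tokenizer that ships with the model.
tokenizer = AutoTokenizer.from_pretrained("llm-jp/llm-jp-3-172b-instruct3")

text = "自然言語処理 (natural language processing)"
ids = tokenizer(text)["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))  # subword/byte-level segmentation
print(tokenizer.decode(ids))  # byte fallback lets decoding round-trip any input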
Training
Pre-Training
The models were pre-trained on datasets such as Japanese Wikipedia, Common Crawl, and English Wikipedia, with a total of 2.1 trillion tokens seen for LLM-JP-3-172B.
Post-Training
The model underwent supervised fine-tuning followed by Direct Preference Optimization (DPO). Various datasets, including ichikara-instruction and synthetic datasets, were used during fine-tuning to enhance the model's safety and helpfulness.
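For reference, DPO trains the policy directly on preference pairs, pushing the chosen response's likelihood above the rejected one's while staying close to a frozen reference model. The sketch below is a generic PyTorch implementation of the DPO loss, not the team's actual training code; the beta value and the log-probabilities are illustrative assumptions.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Each input is the summed log-probability of a full response, shape (batch,).
    # The loss rewards a larger likelihood margin for the chosen response
    # under the policy than under the frozen reference model.
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Illustrative numbers only.
print(dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.8]),
               torch.tensor([-12.9]), torch.tensor([-15.1])))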
Guide: Running Locally
- Install Required Libraries:
Ensure you have the following Python libraries (an install command is sketched after this list):
torch>=2.3.0
transformers>=4.40.1
tokenizers>=0.19.1
accelerate>=0.29.3
flash-attn>=2.5.8
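A pip install covering these requirements is sketched below; note that flash-attn builds a CUDA extension against an existing torch install, so depending on the environment it may need to be installed second, with pip's --no-build-isolation flag.

pip install "torch>=2.3.0" "transformers>=4.40.1" "tokenizers>=0.19.1" "accelerate>=0.29.3"
pip install "flash-attn>=2.5.8" --no-build-isolation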
- Load Model and Tokenizer:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("llm-jp/llm-jp-3-172b-instruct3")
model = AutoModelForCausalLM.from_pretrained(
    "llm-jp/llm-jp-3-172b-instruct3",
    device_map="auto",           # shard the weights across all visible GPUs
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory
)
- Prepare Input and Generate Output:

chat = [
    # System prompt: "The following is an instruction describing a task.
    # Write a response that appropriately satisfies the request."
    {"role": "system", "content": "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"},
    # User turn: "What is natural language processing?"
    {"role": "user", "content": "自然言語処理とは何か"},
]
tokenized_input = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, tokenize=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=100,
        do_sample=True,
        top_p=0.95,
        temperature=0.7,
        repetition_penalty=1.05,
    )[0]
print(tokenizer.decode(output))
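The sampling settings above (top_p=0.95, temperature=0.7, repetition_penalty=1.05) trade output diversity against repetition; passing do_sample=False instead yields deterministic greedy decoding. The decoded string includes the chat-template special tokens unless skip_special_tokens=True is passed to tokenizer.decode.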
- Use Cloud GPUs:
In bfloat16, the 172B-parameter weights alone occupy roughly 344 GB, so the model will not fit on a single GPU. To run it efficiently, consider multi-GPU instances from cloud providers such as AWS, Google Cloud, or Azure.
License
The model is released under the "llm-jp-3-172b-instruct3-tou" license. For detailed licensing information, refer to the LICENSE file.