sthenno-com/miscii-14b-1225
Introduction
MISCII-14B-1225 is a text-generation model designed for various language tasks, supporting both English and Chinese. It is intended for use with the Hugging Face transformers library and was built with the mergekit model-merging tool. The model is licensed under Apache 2.0.
Architecture
The model is the result of merging several pre-trained language models with the TIES method, using sthenno-com/miscii-14b-1028 as the base model. The merge configuration specifies the bfloat16 data type, weight normalization, and per-model weights and densities. The primary models contributing to the merge are sthenno/exp-002 and sthenno/miscii-1218.
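For orientation, a mergekit recipe for this kind of TIES merge typically looks like the sketch below. The weight and density values shown are placeholders for illustration, not the published configuration.

    merge_method: ties
    base_model: sthenno-com/miscii-14b-1028
    models:
      - model: sthenno/exp-002
        parameters:
          weight: 1.0   # placeholder value
          density: 0.5  # fraction of parameters retained by TIES (placeholder)
      - model: sthenno/miscii-1218
        parameters:
          weight: 1.0   # placeholder value
          density: 0.5  # placeholder value
    parameters:
      normalize: true   # normalize the merged weights
    dtype: bfloat16

A file like this is what the mergekit-yaml command consumes to produce the merged checkpoint.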
Training
Details on the specific training process for MISCII-14B-1225 are forthcoming. The model was produced by merging existing pre-trained checkpoints rather than by training from scratch, so its behavior comes from combining those models rather than from a new training run.
Guide: Running Locally
- Installation: Start by installing the Hugging Face Transformers library:

    pip install transformers
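The later steps assume PyTorch is available, and the loading sketch further below additionally assumes Accelerate for automatic device placement. Installing both alongside transformers is an optional convenience, not a documented requirement of this model:

    pip install torch accelerate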
- Model Download: Download the MISCII-14B-1225 model from the Hugging Face Model Hub:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("sthenno-com/miscii-14b-1225")
    model = AutoModelForCausalLM.from_pretrained("sthenno-com/miscii-14b-1225")
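For a 14B-parameter model it is usually worth loading the weights in half precision and letting Accelerate place them across the available devices. The following is a sketch assuming torch and accelerate are installed; the keyword arguments are standard transformers options, not settings mandated by the model card.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "sthenno-com/miscii-14b-1225"
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Load weights in bfloat16 (the dtype used for the merge) and let
    # accelerate spread them across the available GPUs/CPU.
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )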
- Inference: Use the model for text generation:

    inputs = tokenizer("Hello, how are you?", return_tensors="pt")
    outputs = model.generate(inputs["input_ids"], max_length=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
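If the tokenizer ships a chat template (plausible for a bilingual chat-oriented merge, though this should be verified against the repository), prompts are usually better framed as chat messages. A minimal sketch assuming such a template exists:

    # Format the prompt with the tokenizer's chat template, then generate.
    messages = [{"role": "user", "content": "Hello, how are you?"}]
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    outputs = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))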
- Hardware Recommendation: In bfloat16, the 14B parameters alone occupy roughly 28 GB, plus working memory for the KV cache, so plan for a high-memory GPU or a multi-GPU setup. Cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure are a practical option for resource-intensive workloads.
License
The MISCII-14B-1225 model is released under the Apache 2.0 license, allowing for both personal and commercial use with attribution.