xLSTM-7B
NX-AI

Introduction
xLSTM-7B is a pre-trained language model developed by NX-AI. It was trained on approximately 2.3 trillion tokens of high-quality data assembled with the DCLM framework, and it was built using the xlstm-jax framework.
Architecture
The model employs the xLSTM architecture and ships with Triton kernels for optimized performance. It is designed to be compatible with the standard Hugging Face tokenization and modeling interfaces (transformers).
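As a small illustration of that compatibility (a sketch, not part of the official documentation), the published checkpoint's configuration can be inspected with the standard AutoConfig API; which attributes it exposes beyond the kernel settings shown in the guide below is an assumption to verify.

```python
# Sketch: inspect the released checkpoint's configuration via Hugging Face's AutoConfig.
# The kernel-related fields used later in this card (step_kernel, chunkwise_kernel,
# sequence_kernel) live on this config object; other attribute names may differ.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("NX-AI/xLSTM-7b")
print(config)
```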
Training
xLSTM-7B was trained on extensive datasets using an efficient training framework, which improves its performance on text generation and a broad range of language tasks. The training process incorporated techniques aimed at optimizing both speed and accuracy.
Guide: Running Locally
To run xLSTM-7B locally, follow these steps:
- Install Dependencies:

```bash
pip install xlstm
pip install mlstm_kernels
pip install 'transformers @ git+ssh://git@github.com/NX-AI/transformers.git@integrate_xlstm'
```
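As an optional sanity check that the environment is set up, the installed packages can be imported; this sketch assumes the pip packages above expose modules named `xlstm` and `mlstm_kernels`.

```python
# Verify that the xLSTM dependencies and the patched transformers fork import cleanly.
# Module names are assumed to match the pip package names above.
import xlstm
import mlstm_kernels
import transformers

print(transformers.__version__)
```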
- Load the Model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

xlstm = AutoModelForCausalLM.from_pretrained("NX-AI/xLSTM-7b", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("NX-AI/xLSTM-7b")

tokens = tokenizer("Hello xLSTM, how are you doing?", return_tensors='pt')['input_ids'].to(device="cuda")

out = xlstm.generate(tokens, max_new_tokens=20)
print(tokenizer.decode(out[0]))
```
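For longer or more varied outputs, the usual transformers generation options can be passed to generate(); the sampling values below are illustrative placeholders, not settings recommended by this model card.

```python
# Sampling-based generation via the standard Hugging Face generate() API.
# temperature / top_p are placeholder values, not tuned for xLSTM-7B.
prompt = "Write a short note about recurrent language models."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

out = xlstm.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```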
- Optional: Change Kernels to Native PyTorch:

```python
from transformers import AutoConfig, AutoModelForCausalLM

xlstm_config = AutoConfig.from_pretrained("NX-AI/xLSTM-7b")
xlstm_config.step_kernel = "native"
xlstm_config.chunkwise_kernel = "chunkwise--native_autograd"
xlstm_config.sequence_kernel = "native_sequence__native"

xlstm = AutoModelForCausalLM.from_pretrained("NX-AI/xLSTM-7b", config=xlstm_config, device_map="auto")
```
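To confirm the switch took effect, the kernel fields can be read back from the loaded model and a short generation run end to end; this sketch assumes the tokenizer from the previous step is still in scope.

```python
# Read back the kernel selection and run a short generation with the native kernels.
# Assumes `tokenizer` was created in the "Load the Model" step above.
print(xlstm.config.step_kernel)       # expected: "native"
print(xlstm.config.chunkwise_kernel)  # expected: "chunkwise--native_autograd"

tokens = tokenizer("Hello xLSTM,", return_tensors="pt")["input_ids"].to(xlstm.device)
out = xlstm.generate(tokens, max_new_tokens=10)
print(tokenizer.decode(out[0]))
```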
For optimal performance, running the model on a data-center GPU such as an NVIDIA H100 is recommended.
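As a minimal sketch for checking what hardware is available at runtime (standard PyTorch calls, not specific to xLSTM):

```python
# Report the available accelerator. The Triton kernels expect a GPU;
# without one, the native PyTorch kernel configuration above is the alternative.
import torch

if torch.cuda.is_available():
    print("CUDA device:", torch.cuda.get_device_name(0))
else:
    print("No CUDA GPU detected; consider the native-kernel configuration above.")
```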
License
The model is released under the NXAI Community License. Please refer to the LICENSE file for detailed terms and conditions.