moxin llm 7b

moxin-org

Introduction

MOXIN LLM 7B is an open-source language model for text generation. It is distributed through Hugging Face, is built with PyTorch, and can be loaded with the Transformers library. The model is evaluated on several benchmarks in both few-shot and zero-shot settings.
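To make the distinction between the two evaluation settings concrete, here is a minimal sketch of how a zero-shot and a few-shot prompt differ. The helper build_prompt and the "Question:/Answer:" layout are hypothetical and chosen only for illustration; they are not the format used in the model's actual evaluation.

```python
def build_prompt(question, examples=()):
    """Hypothetical helper: prepend solved examples (if any) to the question.

    With no examples the prompt is zero-shot; with k examples it is k-shot.
    """
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


# Zero-shot: the model sees only the question.
zero_shot = build_prompt("What is the capital of France?")

# Few-shot: the model first sees solved examples in the same layout.
few_shot = build_prompt(
    "What is the capital of France?",
    examples=[
        ("What is the capital of Italy?", "Rome"),
        ("What is the capital of Japan?", "Tokyo"),
    ],
)

if __name__ == "__main__":
    print(zero_shot)
    print("-----")
    print(few_shot)
```

Either string could be passed as the prompt to a text-generation call; only the amount of in-context guidance changes.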

Architecture

The model has 7 billion parameters and is available in two versions: a base model and a fine-tuned chat model.
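The parameter count gives a rough lower bound on the memory needed to hold the weights. The sketch below assumes exactly 7 billion parameters (the real count may differ slightly) and covers weights only; activations, the key-value cache, and framework overhead come on top. The function name weight_memory_gb is made up for this example.

```python
# Bytes needed to store one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 7_000_000_000  # assumption: "7B" taken as exactly 7e9

if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:>8}: {weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

In bfloat16 this works out to about 14 GB for the weights alone, which helps when deciding whether a given GPU is large enough.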

Training

The base model is first trained on a large corpus. The chat model is then obtained by fine-tuning on the Tulu v2 dataset to improve conversational ability. Evaluation results on the AI2 Reasoning Challenge, HellaSwag, MMLU, and Winogrande benchmarks are provided to compare the model with similar models.
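The benchmarks named above are multiple-choice tasks, and a common way to score them is to take the answer option the model rates as most likely and compare it with the reference answer. The sketch below illustrates that idea under stated assumptions: the scores are made-up numbers standing in for per-option log-likelihoods, and pick_answer and accuracy are hypothetical helpers, not the evaluation code used for this model.

```python
def pick_answer(choice_scores):
    """Return the index of the highest-scoring answer option."""
    return max(range(len(choice_scores)), key=lambda i: choice_scores[i])


def accuracy(items):
    """Fraction of items where the top-scoring option matches the gold index.

    Each item is a pair (choice_scores, gold_index).
    """
    correct = sum(pick_answer(scores) == gold for scores, gold in items)
    return correct / len(items)


# Made-up log-likelihoods for three four-option questions (illustration only).
items = [
    ([-4.2, -1.3, -3.8, -5.0], 1),  # model prefers option 1, gold is 1
    ([-2.1, -2.5, -0.9, -3.3], 2),  # model prefers option 2, gold is 2
    ([-1.0, -0.8, -2.2, -3.1], 0),  # model prefers option 1, gold is 0
]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(items):.3f}")
```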

Guide: Running Locally

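  1. Set Up Environment: Ensure Python and PyTorch are installed, then install the transformers library from Hugging Face (for example, with pip install transformers). Loading a model with device_map="auto" also requires the accelerate package.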

  2. Download the Model (the weight files are stored with Git LFS, so run git lfs install before cloning):

    git clone https://huggingface.co/moxin-org/moxin-llm-7b
    
  3. Run Inference:

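    • Use the following code snippet to generate text. Passing the Hub ID downloads the weights automatically; to use the copy cloned in step 2, pass its local path as model_name instead:
      import torch
      from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
      
      model_name = 'moxin-org/moxin-llm-7b'
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      # Load the weights in bfloat16 and let accelerate place them on the available devices.
      # trust_remote_code=True allows any custom model code in the repository to run.
      model = AutoModelForCausalLM.from_pretrained(
          model_name,
          torch_dtype=torch.bfloat16,
          device_map="auto",
          trust_remote_code=True,
      )
      
      # The model already carries its dtype and device placement,
      # so the pipeline only needs the model and tokenizer objects.
      pipe = pipeline(
          "text-generation",
          model=model,
          tokenizer=tokenizer,
      )
      
      prompt = "Can you explain the concept of regularization in machine learning?"
      
      # Sample one completion of up to 1000 new tokens.
      sequences = pipe(
          prompt,
          do_sample=True,        # sample instead of greedy decoding
          max_new_tokens=1000,   # upper bound on generated tokens
          temperature=0.7,       # values below 1 sharpen the distribution
          top_k=50,              # keep only the 50 most likely tokens
          top_p=0.95,            # then keep the smallest set covering 95% probability
          num_return_sequences=1,
      )
      # By default, generated_text contains the prompt followed by the continuation.
      print(sequences[0]['generated_text'])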
      
  4. Cloud GPUs: If local hardware is insufficient, consider GPU instances from a cloud provider such as AWS, GCP, or Azure to meet the model's computational requirements.
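The sampling arguments used in step 3 (temperature, top_k, top_p) control how the next token is drawn from the model's output scores. The following is a simplified pure-Python illustration of that idea on a toy vocabulary, not the Transformers implementation; the function names and toy logits are invented for this sketch.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k, top_p):
    """Keep the top_k most likely tokens, then the smallest prefix of them
    whose renormalized cumulative probability reaches top_p.

    Returns a dict mapping token index to renormalized probability.
    """
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_total = sum(probs[i] for i in ranked)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / k_total
        if cumulative >= top_p:
            break
    p_total = sum(probs[i] for i in kept)
    return {i: probs[i] / p_total for i in kept}


def sample_token(logits, temperature=0.7, top_k=50, top_p=0.95, rng=random):
    """Draw one token index using temperature, top-k, and top-p filtering."""
    probs = softmax_with_temperature(logits, temperature)
    kept = filter_top_k_top_p(probs, top_k, top_p)
    ids = list(kept)
    return rng.choices(ids, weights=[kept[i] for i in ids], k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
    print(sample_token(toy_logits))
```

Lowering temperature, top_k, or top_p makes the output more deterministic; raising them makes it more varied.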

License

MOXIN LLM 7B is released under the Apache 2.0 license, which permits use, modification, and distribution provided that the license text and attribution notices are retained.
