
EpistemeAI

Fireball-Meta-Llama-3.1-8B-Instruct-Agent-0.003-128K-code-ds-auto

Introduction

Fireball-Meta-Llama-3.1-8B-Instruct-Agent-0.003-128K is a fine-tuned AI model designed for advanced coding, data science, and machine learning tasks. It offers tool-use features such as search and a calculator, along with automatic reasoning and self-learning capabilities. The model is intended for commercial and research use in multiple languages and is optimized for a range of natural language generation tasks.

Architecture

The model is based on the Llama architecture and fine-tuned using the ReAct strategy, which interleaves reasoning steps with tool calls to improve responses. It supports automatic reasoning, tool use, self-reflection, and self-learning, and is designed to work seamlessly with frameworks like LangChain and LlamaIndex.
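
The ReAct pattern can be illustrated with a minimal sketch. The tool names, prompt markers, and the stubbed model below are illustrative, not the model's actual prompt format; a real agent would call the fine-tuned model instead of the stub:

```python
import re

# Hypothetical tool registry; the card mentions search and calculator tools.
TOOLS = {
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def stub_model(prompt):
    # Stand-in for the LLM: requests the calculator once, then answers.
    if "Observation:" not in prompt:
        return "Thought: I need to compute this.\nAction: calculator[2 + 3]"
    return "Final Answer: 5"

def react_loop(question, model=stub_model, max_steps=5):
    """Alternate Thought/Action steps with tool Observations until the
    model emits a Final Answer (the core of the ReAct strategy)."""
    prompt = f"Question: {question}\n"
    for _ in range(max_steps):
        output = model(prompt)
        if "Final Answer:" in output:
            return output.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action: (\w+)\[(.*?)\]", output)
        if match:
            tool, arg = match.groups()
            observation = TOOLS[tool](arg)
            # Feed the tool result back so the model can reason over it.
            prompt += f"{output}\nObservation: {observation}\n"
    return None

print(react_loop("What is 2 + 3?"))  # prints "5"
```

Frameworks such as LangChain implement this loop (with real tool integrations) so you rarely need to write it by hand.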

Training

The model was evaluated on benchmarks such as IFEval, BBH, MATH, GPQA, MuSR, and MMLU-PRO, using metrics including strict accuracy, normalized accuracy, and exact match. The model benefits from self-learning and self-reflection features, improving its capabilities over time.
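
The exact-match style metrics mentioned above are simple to compute. A minimal sketch follows; the normalization here (lowercasing and whitespace stripping) is illustrative, as real evaluation harnesses apply benchmark-specific rules:

```python
def exact_match(prediction, reference):
    # 1 if the normalized strings are identical, else 0.
    return int(prediction.strip().lower() == reference.strip().lower())

def accuracy(predictions, references):
    # Mean exact-match score over a set of examples.
    scores = [exact_match(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)

print(accuracy(["Paris", "4", "blue"], ["paris", "4", "red"]))  # 2 of 3 correct
```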

Guide: Running Locally

  1. Installation:

    pip install --upgrade --no-cache-dir "git+https://github.com/huggingface/transformers.git"
    pip install --upgrade tokenizers
    pip install unsloth
    
  2. Integration:
    Use popular libraries like Transformers and vLLM to integrate the model. Example: loading the model with 4-bit quantization for faster, lower-memory inference:

    import transformers
    from transformers import BitsAndBytesConfig  # requires: pip install bitsandbytes accelerate

    # Quantize weights to 4-bit NF4 to cut memory use at a small quality cost
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype="float16",
        bnb_4bit_use_double_quant=True,
    )
    pipeline = transformers.pipeline(
        "text-generation",
        model="EpistemeAI/Fireball-Meta-Llama-3.1-8B-Instruct-Agent-0.003-128K-code-ds-auto",
        model_kwargs={"quantization_config": quantization_config},
        device_map="auto",
    )
    
  3. Execution:
    Use virtual environments to manage dependencies:

    python3 -m venv env
    source env/bin/activate
    
  4. Cloud GPUs:
    Consider using cloud GPUs for optimal performance, especially when dealing with large models and datasets.
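
Once the pipeline from step 2 is loaded, recent versions of Transformers accept a list of chat messages directly and apply the model's chat template for you, e.g. `pipeline(messages, max_new_tokens=256)`. For reference, the sketch below approximates the Llama 3.1 instruct prompt that such messages are rendered into; the header tokens follow the published Llama 3 format, and the example messages are illustrative:

```python
def to_llama31_prompt(messages):
    """Approximate the Llama 3.1 instruct chat layout.

    In normal use, tokenizer.apply_chat_template (or the pipeline itself)
    does this rendering; this sketch only shows the resulting structure.
    """
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # End with an open assistant header so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a string."},
]
prompt = to_llama31_prompt(messages)
```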

License

The model is licensed under the Apache-2.0 License, allowing for commercial use, modification, and distribution. Users are encouraged to cite the project in academic research to promote awareness.
