JinaColBERT V2 Model

Introduction

JinaColBERT V2 is a multilingual late-interaction retrieval model from Jina AI and the successor to JinaColBERT V1. Compared with V1, it offers improved retrieval performance, multilingual support, and Matryoshka embeddings, which let users trade some precision for efficiency by choosing a smaller embedding size.
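The efficiency side of this trade-off can be illustrated with a small sketch. It assumes, as is conventional for Matryoshka-style embeddings but not stated in this document, that a shorter embedding is obtained by keeping the leading components of each token vector and re-normalizing. The helper names `truncate_embedding` and `index_size_bytes`, the 16-bit storage width, and the token count are hypothetical choices for illustration only.

```python
import math


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and rescale to unit length.

    Assumption: leading-component truncation plus re-normalization,
    the usual Matryoshka convention; not confirmed for this model.
    """
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def index_size_bytes(num_tokens, dim, bytes_per_value=2):
    """Rough storage for one vector per token (hypothetical 16-bit values)."""
    return num_tokens * dim * bytes_per_value


# Toy 4-dimensional vector truncated to 2 dimensions.
short = truncate_embedding([3.0, 4.0, 1.0, 2.0], dim=2)
print(short)

# Storage for one million token vectors at two embedding sizes.
full = index_size_bytes(1_000_000, 128)
half = index_size_bytes(1_000_000, 64)
print(full, half)
```

Because a late-interaction index stores one vector per token, halving the embedding size roughly halves the index size under these assumptions; the effect on retrieval quality has to be measured separately.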

Architecture

JinaColBERT V2 extends JinaColBERT V1 with an 8192-token input context and uses late interaction, which compares queries and documents through token-level embeddings rather than a single vector per text. It supports many languages and is available in versions with 128-, 96-, and 64-dimensional embeddings.
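Late-interaction scoring can be sketched in a few lines. The example below assumes the ColBERT-style MaxSim rule (for each query token, take the highest dot product against any document token, then sum over query tokens) and uses tiny hand-written vectors in place of real model output. The function name `maxsim_score` and the toy data are illustrative and not part of any library.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best match among document tokens."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy unit-length token embeddings (3 dimensions instead of 128/96/64).
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
doc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
doc_b = [[0.0, 0.0, 1.0], [0.0, 0.6, 0.8]]

scores = {
    "doc_a": maxsim_score(query, doc_a),
    "doc_b": maxsim_score(query, doc_b),
}

# Rank documents from highest to lowest score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Here `doc_a` contains an exact match for both query tokens, so it outranks `doc_b`, which only partially matches the second one.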

Training

The model uses flash attention and requires the einops and flash_attn packages to be installed. It can be used through the Stanford ColBERT library or through the pylate and ragatouille packages for indexing and retrieval.
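Before loading the model, it can help to confirm that these dependencies are importable. The following is a minimal sketch using only the standard library. The helper name `check_packages` is hypothetical, and the import names for the front ends (`colbert`, `pylate`, `ragatouille`) are taken from the usage snippets in this document.

```python
import importlib.util


def check_packages(names):
    """Return {name: True/False} for whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Packages the model itself needs, and the optional front ends.
required = check_packages(["einops", "flash_attn"])
front_ends = check_packages(["colbert", "pylate", "ragatouille"])

missing = [name for name, found in required.items() if not found]
if missing:
    print("Missing required packages:", ", ".join(missing))
if not any(front_ends.values()):
    print("No retrieval front end found; install one of:", ", ".join(front_ends))
```

The check only looks up whether each module can be located; it does not import the packages or verify their versions.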

Guide: Running Locally

Installation:

Install the required packages, then one of the three retrieval libraries:

pip install -U einops flash_attn
# then install one of the following:
pip install -U ragatouille
pip install -U colbert-ai
pip install -U pylate

Usage:

Using Pylate:

from pylate import indexes, models, retrieve
# Load the model by its Hugging Face identifier
model = models.ColBERT(model_name_or_path="jinaai/jina-colbert-v2")

Using Ragatouille:

from ragatouille import RAGPretrainedModel
# Load the pretrained model by its Hugging Face identifier
RAG = RAGPretrainedModel.from_pretrained("jinaai/jina-colbert-v2")

Using Stanford ColBERT:

from colbert.infra import ColBERTConfig
from colbert.modeling.checkpoint import Checkpoint
# Load the checkpoint with a default ColBERT configuration
ckpt = Checkpoint("jinaai/jina-colbert-v2", colbert_config=ColBERTConfig())

Cloud GPUs: For better performance, consider running the model on a cloud GPU service such as AWS, Google Cloud, or Azure.

License

The JinaColBERT V2 model is released under the Creative Commons Attribution-NonCommercial 4.0 International License (cc-by-nc-4.0). This allows for non-commercial use with appropriate attribution.
