SPLADE-V3 Model Overview (naver)
Introduction
SPLADE-V3 is the latest series of SPLADE models. This model starts from the SPLADE++SelfDistil checkpoint and is trained with a mix of KL-Div and MarginMSE losses, using eight negatives per query. Training uses the MS MARCO collection without document titles.
Architecture
SPLADE-V3 keeps the architecture of SPLADE++SelfDistil; the improvements come from the training recipe (loss functions and negative sampling) rather than from structural changes. SPLADE is a learned sparse retrieval model built on a masked-language-model head, which is why it is tagged as fill-mask for the Transformers library and PyTorch. It targets English text.
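To make the sparse representation concrete, here is a minimal pure-Python sketch of the pooling commonly described for SPLADE models: each token's vocabulary logits pass through log(1 + ReLU(x)), and the results are max-pooled over token positions. The overview above does not spell out this formula, so treat it as an assumption carried over from earlier SPLADE descriptions; the function names and toy numbers are illustrative only.

```python
import math

def splade_vector(token_logits, attention_mask):
    """Pool per-token vocabulary logits into one term-weight vector.

    token_logits: list of per-token lists, one logit per vocabulary entry.
    attention_mask: list of 0/1 flags (1 = real token, 0 = padding).
    """
    vocab_size = len(token_logits[0])
    weights = [0.0] * vocab_size
    for logits, keep in zip(token_logits, attention_mask):
        if not keep:
            continue  # padding positions contribute nothing
        for j, logit in enumerate(logits):
            # log(1 + ReLU(x)): negatives become 0, large values are dampened
            w = math.log1p(max(logit, 0.0))
            if w > weights[j]:
                weights[j] = w  # max pooling over token positions
    return weights

def sparse_dot(query_weights, doc_weights):
    """Relevance score as the dot product of two term-weight vectors."""
    return sum(q * d for q, d in zip(query_weights, doc_weights))

# Toy input: 3 token positions (the last is padding), vocabulary of 4 terms.
logits = [[2.0, -1.0, 0.0, 0.5],
          [0.5, -3.0, 1.0, 0.0],
          [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
vec = splade_vector(logits, mask)
score = sparse_dot(vec, vec)
```

Most entries of a real vector are zero, which is what allows the output to be stored in an inverted index like a bag of weighted terms.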
Training
The model is trained with a combination of KL-Div and MarginMSE losses, using eight negatives per query sampled from SPLADE++SelfDistil. Training data comes from MS MARCO, a benchmark collection for information retrieval, used here without titles.
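The two losses can be sketched in plain Python for a single query with one positive and eight negatives. This is an illustration under stated assumptions, not the authors' training code: the teacher scores, the KL direction (teacher to student), the equal mixing weights, and all function names are hypothetical, since the text above names the losses but not these details.

```python
import math

def margin_mse(student_pos, student_negs, teacher_pos, teacher_negs):
    """Mean squared error between student and teacher score margins."""
    errors = [((student_pos - s_neg) - (teacher_pos - t_neg)) ** 2
              for s_neg, t_neg in zip(student_negs, teacher_negs)]
    return sum(errors) / len(errors)

def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def kl_div(student_scores, teacher_scores):
    """KL(teacher || student) over the candidates (positive + negatives)."""
    log_p = log_softmax(teacher_scores)
    log_q = log_softmax(student_scores)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def combined_loss(student_pos, student_negs, teacher_pos, teacher_negs,
                  kl_weight=1.0, mse_weight=1.0):
    # Placeholder weights: the mixing ratio is not given in the text.
    kl = kl_div([student_pos] + student_negs, [teacher_pos] + teacher_negs)
    mse = margin_mse(student_pos, student_negs, teacher_pos, teacher_negs)
    return kl_weight * kl + mse_weight * mse

# One query, one positive, eight negatives (made-up scores).
teacher_pos, teacher_negs = 12.0, [8.0, 7.5, 7.0, 6.0, 5.5, 5.0, 4.0, 3.0]
student_pos, student_negs = 10.0, [9.0, 7.0, 6.5, 6.0, 5.0, 5.0, 4.5, 2.0]
loss = combined_loss(student_pos, student_negs, teacher_pos, teacher_negs)
```

Both terms vanish when the student reproduces the teacher's scores exactly. MarginMSE matches pairwise score gaps, while KL-Div matches the whole ranking distribution over the candidate list.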
Guide: Running Locally
- Clone the Repository: Get the SPLADE GitHub repository, which contains the setup and installation instructions.
  git clone https://github.com/naver/splade
- Install Dependencies: From inside the cloned directory, install the required Python packages.
  pip install -r requirements.txt
- Download the Model: Fetch the SPLADE-V3 model from Hugging Face.
- Run Inference: Use the provided scripts to encode queries and documents. The model is exposed under the fill-mask task, so it loads as a masked-language model.
Cloud GPU Suggestion: A GPU speeds up encoding considerably, especially for large collections. If none is available locally, cloud GPUs such as those offered by AWS, Google Cloud, or Azure are an option.
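As a rough sketch of the download and inference steps, the snippet below loads the model through the Transformers library instead of the repository scripts and lists the highest-weighted terms for a text. The model id `naver/splade-v3` is an assumption based on the organization name in the title, access on Hugging Face may require accepting the license terms first, and the pooling formula is the one commonly described for SPLADE rather than something this overview states. The helper names are hypothetical.

```python
def top_terms(weights, id_to_token, k=10):
    """Return the k highest-weighted (token, weight) pairs, skipping zeros."""
    scored = [(id_to_token[i], w) for i, w in enumerate(weights) if w > 0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

def encode(text, model_id="naver/splade-v3", k=10):
    """Encode text into term weights; downloads the model on first use."""
    # Imported lazily so top_terms works without torch/transformers installed.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

    # Assumed SPLADE pooling: log(1 + ReLU(logits)), masked, max over tokens.
    mask = inputs["attention_mask"].unsqueeze(-1)
    weights = torch.max(torch.log1p(torch.relu(logits)) * mask, dim=1).values

    vocab_ids = list(range(weights.shape[-1]))
    id_to_token = tokenizer.convert_ids_to_tokens(vocab_ids)
    return top_terms(weights.squeeze(0).tolist(), id_to_token, k)

# Example call (needs network access and the torch/transformers packages):
# for token, weight in encode("how do sparse retrievers expand queries"):
#     print(token, round(weight, 2))
```

The returned list usually contains terms that do not appear in the input, which reflects the expansion behavior SPLADE models are known for.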
License
SPLADE-V3 is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license (cc-by-nc-sa-4.0).