SCIBERT-NLI (gsarti/scibert-nli)
Introduction
SCIBERT-NLI is a SciBERT model fine-tuned on the Stanford Natural Language Inference (SNLI) and MultiNLI datasets, using the sentence-transformers library to produce universal sentence embeddings. It uses SciBERT's scivocab wordpiece vocabulary, and the fine-tuning combines an average-pooling strategy with a softmax classification loss.
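As an illustration of this embedding pipeline, the sketch below composes the fine-tuned checkpoint with a mean-pooling layer using sentence-transformers. The example sentences are placeholders, and composing the model explicitly from the Transformer and Pooling modules is an assumption about how the checkpoint is packaged.

```python
from sentence_transformers import SentenceTransformer, models

# Load the fine-tuned transformer and add mean pooling over token embeddings.
word_embedding = models.Transformer("gsarti/scibert-nli", max_seq_length=128)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,  # average pooling strategy
)
encoder = SentenceTransformer(modules=[word_embedding, pooling])

# Encode two illustrative sentences into fixed-size embeddings.
embeddings = encoder.encode([
    "The spike protein mediates viral entry into host cells.",
    "Vaccines elicit neutralizing antibodies against the spike protein.",
])
print(embeddings.shape)  # e.g. (2, 768) for a BERT-base-sized encoder
```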
Architecture
The base model is allenai/scibert-scivocab-cased, loaded through Hugging Face's AutoModel class. Fine-tuning used a batch size of 64, 20,000 training steps, 1,450 warmup steps, lowercasing enabled, and a maximum sequence length of 128.
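A rough sketch of how this configuration maps onto the sentence-transformers training API is given below. The SNLI/MultiNLI data loader is a hypothetical placeholder; only the hyperparameters and the softmax NLI objective come from this section, and the base checkpoint's Hub id is written with underscores.

```python
from sentence_transformers import SentenceTransformer, models, losses

# SciBERT base checkpoint (Hub id uses underscores), with the settings listed above.
base = models.Transformer(
    "allenai/scibert_scivocab_cased",
    max_seq_length=128,
    do_lower_case=True,
)
pooling = models.Pooling(base.get_word_embedding_dimension(),
                         pooling_mode_mean_tokens=True)
model = SentenceTransformer(modules=[base, pooling])

# Softmax classification loss over the three NLI labels
# (entailment / neutral / contradiction).
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

# Hypothetical training call: nli_dataloader would yield batches of 64
# SNLI/MultiNLI examples, matching the 20,000 steps and 1,450 warmup steps above.
# model.fit(train_objectives=[(nli_dataloader, train_loss)],
#           epochs=1, steps_per_epoch=20000, warmup_steps=1450)
```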
Training
Training was conducted on an NVIDIA Tesla P100 GPU in Kaggle Notebooks and took approximately four hours. Performance was evaluated on the STS benchmark using Spearman rank correlation, giving a score of 74.50, versus 77.12 for a BERT-base model trained with the same procedure. The model is particularly useful for tasks like similarity-based scientific paper retrieval.
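The evaluation protocol can be reproduced in outline as follows: embed both sentences of each STS pair, score them with cosine similarity, and compute the Spearman rank correlation against the gold annotations. The pairs and scores below are illustrative placeholders, not STS benchmark data.

```python
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, models, util

# Same mean-pooling encoder as in the earlier sketch.
word_embedding = models.Transformer("gsarti/scibert-nli", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(),
                         pooling_mode_mean_tokens=True)
encoder = SentenceTransformer(modules=[word_embedding, pooling])

# Placeholder sentence pairs with hypothetical 0-5 gold similarity scores.
pairs = [
    ("A protein binds the receptor.", "The receptor is bound by a protein."),
    ("Mice were given the vaccine.", "The compound was synthesized in vitro."),
    ("The trial enrolled 200 patients.", "Two hundred patients joined the study."),
]
gold = [4.8, 0.5, 4.5]

emb1 = encoder.encode([a for a, _ in pairs])
emb2 = encoder.encode([b for _, b in pairs])
predicted = [float(util.cos_sim(e1, e2)) for e1, e2 in zip(emb1, emb2)]

# Spearman rank correlation between predicted similarities and gold scores.
correlation, _ = spearmanr(predicted, gold)
print(correlation)
```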
Guide: Running Locally
- Clone the Repository: clone the Covid Papers Browser repository for example usage.
- Install Dependencies: make sure Python and the necessary libraries are installed; required packages can be installed with pip.
- Load the Model: use the Hugging Face Transformers library to load the model. Example code:

  ```python
  from transformers import AutoModel, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("gsarti/scibert-nli")
  model = AutoModel.from_pretrained("gsarti/scibert-nli")
  ```

- Run Inference: use the model in your application for tasks such as sentence embeddings or similarity measurements (see the retrieval sketch after this list).
- Optional - Use Cloud GPUs: for enhanced performance, consider cloud-based GPU services such as AWS, Google Cloud, or Kaggle.
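The inference step can be fleshed out as a minimal sketch of similarity-based paper retrieval, the use case highlighted in the Training section. The query, the abstracts, and the mean-pooling composition on top of gsarti/scibert-nli are illustrative assumptions, not code taken from the Covid Papers Browser repository.

```python
from sentence_transformers import SentenceTransformer, models, util

# Compose the fine-tuned transformer with mean pooling to obtain sentence embeddings.
word_embedding = models.Transformer("gsarti/scibert-nli", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(),
                         pooling_mode_mean_tokens=True)
encoder = SentenceTransformer(modules=[word_embedding, pooling])

# Illustrative corpus of paper abstracts and a query (placeholders, not real data).
abstracts = [
    "We characterize the binding affinity of the spike protein to the ACE2 receptor.",
    "A randomized controlled trial of remdesivir in hospitalized adults.",
    "Graph neural networks for molecular property prediction.",
]
query = "How does the virus enter human cells?"

corpus_embeddings = encoder.encode(abstracts, convert_to_tensor=True)
query_embedding = encoder.encode(query, convert_to_tensor=True)

# Rank abstracts by cosine similarity to the query.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {abstracts[hit['corpus_id']]}")
```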
License
The SCIBERT-NLI model is released under the Apache 2.0 License, which means you are free to use, modify, and distribute the model, provided you comply with the terms of the license.