SCIBERT-NLI (gsarti/scibert-nli)
Introduction
SCIBERT-NLI is a SciBERT model fine-tuned on the Stanford Natural Language Inference (SNLI) and MultiNLI datasets, using the sentence-transformers library to produce universal sentence embeddings. It uses SciBERT's scivocab wordpiece vocabulary, and the fine-tuning combines an average-pooling strategy with a softmax classification loss.
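As an illustration of this embedding pipeline, the sketch below composes the fine-tuned checkpoint with a mean-pooling layer using sentence-transformers. The example sentences are placeholders, and composing the model explicitly from the Transformer and Pooling modules is an assumption about how the checkpoint is packaged.

```python
from sentence_transformers import SentenceTransformer, models

# Load the fine-tuned transformer and add mean pooling over token embeddings.
word_embedding = models.Transformer("gsarti/scibert-nli", max_seq_length=128)
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),
    pooling_mode_mean_tokens=True,  # average pooling strategy
)
encoder = SentenceTransformer(modules=[word_embedding, pooling])

# Encode two illustrative sentences into fixed-size embeddings.
embeddings = encoder.encode([
    "The spike protein mediates viral entry into host cells.",
    "Vaccines elicit neutralizing antibodies against the spike protein.",
])
print(embeddings.shape)  # e.g. (2, 768) for a BERT-base-sized encoder
```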
Architecture
The base model is allenai/scibert-scivocab-cased, loaded through Hugging Face's AutoModel class. Fine-tuning used a batch size of 64, 20,000 training steps, 1,450 warmup steps, lowercasing enabled, and a maximum sequence length of 128.
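A rough sketch of how this configuration maps onto the sentence-transformers training API is given below. The SNLI/MultiNLI data loader is a hypothetical placeholder; only the hyperparameters and the softmax NLI objective come from this section, and the base checkpoint's Hub id is written with underscores.

```python
from sentence_transformers import SentenceTransformer, models, losses

# SciBERT base checkpoint (Hub id uses underscores), with the settings listed above.
base = models.Transformer(
    "allenai/scibert_scivocab_cased",
    max_seq_length=128,
    do_lower_case=True,
)
pooling = models.Pooling(base.get_word_embedding_dimension(),
                         pooling_mode_mean_tokens=True)
model = SentenceTransformer(modules=[base, pooling])

# Softmax classification loss over the three NLI labels
# (entailment / neutral / contradiction).
train_loss = losses.SoftmaxLoss(
    model=model,
    sentence_embedding_dimension=model.get_sentence_embedding_dimension(),
    num_labels=3,
)

# Hypothetical training call: nli_dataloader would yield batches of 64
# SNLI/MultiNLI examples, matching the 20,000 steps and 1,450 warmup steps above.
# model.fit(train_objectives=[(nli_dataloader, train_loss)],
#           epochs=1, steps_per_epoch=20000, warmup_steps=1450)
```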
Training
Training was conducted on an NVIDIA Tesla P100 GPU in Kaggle Notebooks and took approximately four hours. Performance was evaluated on the STS benchmark using Spearman rank correlation, giving a score of 74.50, versus 77.12 for a BERT-base model trained with the same procedure. The model is particularly useful for tasks like similarity-based scientific paper retrieval.
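The evaluation protocol can be reproduced in outline as follows: embed both sentences of each STS pair, score them with cosine similarity, and compute the Spearman rank correlation against the gold annotations. The pairs and scores below are illustrative placeholders, not STS benchmark data.

```python
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer, models, util

# Same mean-pooling encoder as in the earlier sketch.
word_embedding = models.Transformer("gsarti/scibert-nli", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(),
                         pooling_mode_mean_tokens=True)
encoder = SentenceTransformer(modules=[word_embedding, pooling])

# Placeholder sentence pairs with hypothetical 0-5 gold similarity scores.
pairs = [
    ("A protein binds the receptor.", "The receptor is bound by a protein."),
    ("Mice were given the vaccine.", "The compound was synthesized in vitro."),
    ("The trial enrolled 200 patients.", "Two hundred patients joined the study."),
]
gold = [4.8, 0.5, 4.5]

emb1 = encoder.encode([a for a, _ in pairs])
emb2 = encoder.encode([b for _, b in pairs])
predicted = [float(util.cos_sim(e1, e2)) for e1, e2 in zip(emb1, emb2)]

# Spearman rank correlation between predicted similarities and gold scores.
correlation, _ = spearmanr(predicted, gold)
print(correlation)
```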
Guide: Running Locally
- Clone the Repository: clone the Covid Papers Browser repository for example usage.
- Install Dependencies: make sure Python and the necessary libraries are installed; required packages can be installed with pip.
- Load the Model: use the Hugging Face Transformers library to load the model. Example code:

  ```python
  from transformers import AutoModel, AutoTokenizer

  tokenizer = AutoTokenizer.from_pretrained("gsarti/scibert-nli")
  model = AutoModel.from_pretrained("gsarti/scibert-nli")
  ```

- Run Inference: use the model in your application for tasks such as sentence embeddings or similarity measurements (see the retrieval sketch after this list).
- Optional - Use Cloud GPUs: for enhanced performance, consider cloud-based GPU services such as AWS, Google Cloud, or Kaggle.
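The inference step can be fleshed out as a minimal sketch of similarity-based paper retrieval, the use case highlighted in the Training section. The query, the abstracts, and the mean-pooling composition on top of gsarti/scibert-nli are illustrative assumptions, not code taken from the Covid Papers Browser repository.

```python
from sentence_transformers import SentenceTransformer, models, util

# Compose the fine-tuned transformer with mean pooling to obtain sentence embeddings.
word_embedding = models.Transformer("gsarti/scibert-nli", max_seq_length=128)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(),
                         pooling_mode_mean_tokens=True)
encoder = SentenceTransformer(modules=[word_embedding, pooling])

# Illustrative corpus of paper abstracts and a query (placeholders, not real data).
abstracts = [
    "We characterize the binding affinity of the spike protein to the ACE2 receptor.",
    "A randomized controlled trial of remdesivir in hospitalized adults.",
    "Graph neural networks for molecular property prediction.",
]
query = "How does the virus enter human cells?"

corpus_embeddings = encoder.encode(abstracts, convert_to_tensor=True)
query_embedding = encoder.encode(query, convert_to_tensor=True)

# Rank abstracts by cosine similarity to the query.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {abstracts[hit['corpus_id']]}")
```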
License
The SCIBERT-NLI model is released under the Apache 2.0 License, which means you are free to use, modify, and distribute the model, provided you comply with the terms of the license.