distilbert-base-cased-distilled-squad
Introduction
DistilBERT is a smaller, faster, cheaper, and lighter version of BERT developed by Hugging Face. It retains over 95% of BERT's performance while using 40% fewer parameters and running 60% faster. The specific model covered here, distilbert-base-cased-distilled-squad, is fine-tuned for question answering on the SQuAD dataset.
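As a quick sanity check on those figures, the sketch below compares the parameter counts of the two base checkpoints (it downloads both models, so expect a few hundred megabytes of traffic):

    from transformers import AutoModel

    # Load both base checkpoints and compare their sizes
    bert = AutoModel.from_pretrained("bert-base-cased")
    distilbert = AutoModel.from_pretrained("distilbert-base-cased")

    print(f"BERT base:  {bert.num_parameters():,} parameters")
    print(f"DistilBERT: {distilbert.num_parameters():,} parameters")
    print(f"Reduction:  {1 - distilbert.num_parameters() / bert.num_parameters():.0%}")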
Architecture
DistilBERT is a Transformer-based language model. It achieves its efficiency through knowledge distillation: the smaller student network is trained to reproduce the behavior of the larger BERT base teacher. The result is a compact architecture that handles English-language tasks efficiently while retaining most of the teacher's capabilities.
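The core idea can be illustrated with the soft-target component of a distillation loss: the student is trained to match the teacher's full output distribution, softened by a temperature, rather than only hard labels. The sketch below is illustrative, not DistilBERT's exact training code; the distillation_loss name and the temperature value are assumptions, and DistilBERT's actual objective additionally combines masked language modeling and cosine embedding losses.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions so the teacher's relative preferences
        # over wrong answers ("dark knowledge") carry more signal
        soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)
        # KL divergence between student and teacher distributions, scaled
        # by T^2 to keep gradient magnitudes comparable across temperatures
        return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * temperature ** 2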
Training
DistilBERT is pretrained on the BookCorpus and English Wikipedia datasets, the same data used to train BERT. This model is then fine-tuned on the SQuAD v1.1 dataset for question answering, reaching an F1 score of 87.1 on the dev set. Preprocessing and training follow the same procedures as the distilbert-base-cased model.
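To make the reported F1 concrete: SQuAD scores a predicted answer by its token overlap with the reference answer. Below is a simplified version of that metric (the official evaluation script also lowercases text and strips punctuation and articles before comparing):

    from collections import Counter

    def squad_f1(prediction: str, ground_truth: str) -> float:
        # Token-level overlap between predicted and reference answers
        pred_tokens = prediction.split()
        gold_tokens = ground_truth.split()
        common = Counter(pred_tokens) & Counter(gold_tokens)
        num_same = sum(common.values())
        if num_same == 0:
            return 0.0
        precision = num_same / len(pred_tokens)
        recall = num_same / len(gold_tokens)
        return 2 * precision * recall / (precision + recall)

    print(squad_f1("a nice puppet", "a nice puppeteer"))  # 0.67: two of three tokens match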
Guide: Running Locally
To run the model locally, follow these steps (a higher-level pipeline alternative is sketched after the list):
- Install the Transformers library:

    pip install transformers
- Question answering with PyTorch:

    from transformers import DistilBertTokenizer, DistilBertForQuestionAnswering
    import torch

    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")
    model = DistilBertForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-squad")

    question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
    inputs = tokenizer(question, text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Pick the most likely start and end positions of the answer span
    answer_start_index = torch.argmax(outputs.start_logits)
    answer_end_index = torch.argmax(outputs.end_logits)

    predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
    print(tokenizer.decode(predict_answer_tokens))
- Question answering with TensorFlow:

    from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering
    import tensorflow as tf

    tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")
    model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-cased-distilled-squad")

    question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"
    inputs = tokenizer(question, text, return_tensors="tf")
    outputs = model(**inputs)

    # Pick the most likely start and end positions of the answer span
    answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
    answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

    predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
    print(tokenizer.decode(predict_answer_tokens))
- Cloud GPUs: for improved performance, consider cloud services such as AWS, GCP, or Azure, which offer GPU options like the NVIDIA V100.
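Alternatively, the high-level pipeline API wraps tokenization, model inference, and answer decoding in a single call (the question and context below mirror the examples above):

    from transformers import pipeline

    question_answerer = pipeline(
        "question-answering",
        model="distilbert-base-cased-distilled-squad",
    )

    result = question_answerer(
        question="Who was Jim Henson?",
        context="Jim Henson was a nice puppet",
    )

    # result is a dict with 'score', 'start', 'end', and 'answer' keys
    print(result["answer"])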
License
The model is released under the Apache 2.0 License, which permits broad use, modification, and distribution, subject to the license's conditions on attribution and notices.