BERT Large Cased SQuAD v1.1 Portuguese Model

Introduction

This is a Portuguese question-answering model: BERT Large Cased fine-tuned on the Portuguese version of the SQuAD v1.1 dataset from the Deep Learning Brasil group. The underlying language model, BERTimbau Large, is a BERT model pretrained on Brazilian Portuguese that achieves state-of-the-art performance on several NLP tasks.

Architecture

The model uses the BERTimbau Large architecture, a variant of BERT pretrained for Brazilian Portuguese. BERTimbau is available in two sizes, Base and Large; this model builds on the Large one and has been fine-tuned for question answering on the SQuAD v1.1 dataset translated into Portuguese.
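To make the question-answering setup concrete: a BERT model fine-tuned on SQuAD typically produces one start score and one end score per token, and the answer is the token span with the highest combined score. The sketch below illustrates that decoding step in plain Python. The function name best_span, the max_answer_len limit, and the toy scores are assumptions for illustration, not details taken from this model's code.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most max_answer_len tokens are considered.
    """
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must be non-empty and equally long")

    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_logits):
        last = min(len(end_logits), s + max_answer_len)
        for e in range(s, last):
            score = s_score + end_logits[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Toy example: made-up scores over a tokenized context.
tokens = ["A", "capital", "do", "Brasil", "é", "Brasília", "."]
start = [0.1, 0.2, 0.0, 0.3, 0.1, 4.0, 0.0]
end = [0.0, 0.1, 0.0, 0.5, 0.2, 3.5, 0.1]

s, e = best_span(start, end)
answer = " ".join(tokens[s : e + 1])
```

In practice the pipeline performs this step internally and also maps token indices back to character offsets in the context.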

Training

The model was trained on the Portuguese version of the SQuAD v1.1 dataset. It reaches an F1 score of 84.43 and an exact match score of 72.68, both higher than the scores of its Base-size counterpart.
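For readers unfamiliar with the two metrics, the following is a simplified sketch of how exact match and token-level F1 are commonly computed for a single prediction. It is an assumption-laden illustration, not the evaluation code used for this model: the official SQuAD script also strips English articles and takes the best score over several reference answers, and the reported figures are averages over the whole evaluation set expressed as percentages.

```python
import string
from collections import Counter
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between prediction and reference."""
    pred = normalize(prediction)
    ref = normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

For instance, a prediction of "em 2019" against the reference "2019" is not an exact match, but it still earns partial F1 credit because one of its two tokens is correct.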

Guide: Running Locally

To use the model locally, follow these steps:

1. Install the transformers library, along with a backend such as PyTorch (for example, pip install transformers torch).

2. Load the model, either through the pipeline helper or through the AutoTokenizer and AutoModelForQuestionAnswering classes:

from transformers import pipeline
nlp = pipeline("question-answering", model="pierreguillou/bert-large-cased-squad-v1.1-portuguese")

3. Pass a context and a question to the model to receive an answer.
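The steps above can be sketched end to end as follows. This is a minimal illustration under a few assumptions: the helper names (load_qa_pipeline, load_tokenizer_and_model, ask) are hypothetical, transformers is imported lazily so that the ask helper can be exercised without downloading the model, and the result keys (answer, score, start, end) are those returned by the question-answering pipeline for a single question.

```python
from typing import Any, Callable, Dict

MODEL_ID = "pierreguillou/bert-large-cased-squad-v1.1-portuguese"


def load_qa_pipeline(model_id: str = MODEL_ID):
    """Load the model through the pipeline helper (downloads weights on first use)."""
    from transformers import pipeline

    return pipeline("question-answering", model=model_id)


def load_tokenizer_and_model(model_id: str = MODEL_ID):
    """Alternative: load the tokenizer and model classes explicitly."""
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForQuestionAnswering.from_pretrained(model_id)
    return tokenizer, model


def ask(qa: Callable[..., Dict[str, Any]], question: str, context: str) -> Dict[str, Any]:
    """Run one question against one context and return the answer, score, and character span."""
    result = qa(question=question, context=context)
    return {
        "answer": result["answer"],
        "score": float(result["score"]),
        "span": (result["start"], result["end"]),
    }
```

A typical call would be qa = load_qa_pipeline() followed by ask(qa, "Qual é a capital do Brasil?", "A capital do Brasil é Brasília."). The returned span holds character offsets into the context, so context[start:end] reproduces the answer text.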

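The model runs on a CPU, but inference with a BERT Large model is considerably faster on a GPU. If no GPU is available locally, cloud GPU instances from providers such as AWS, Google Cloud, or Azure are an option.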
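As a rough sketch of how GPU use might look in code, the pipeline accepts an integer device argument, where 0 selects the first CUDA GPU and -1 selects the CPU. The helper names below are hypothetical, and the example assumes PyTorch is the installed backend.

```python
MODEL_ID = "pierreguillou/bert-large-cased-squad-v1.1-portuguese"


def pick_device(cuda_available: bool) -> int:
    """Map GPU availability to the pipeline's device argument: 0 = first GPU, -1 = CPU."""
    return 0 if cuda_available else -1


def load_qa_pipeline_on_best_device(model_id: str = MODEL_ID):
    """Load the question-answering pipeline on a GPU when one is visible, else on the CPU."""
    import torch  # assumed backend; imported lazily
    from transformers import pipeline

    device = pick_device(torch.cuda.is_available())
    return pipeline("question-answering", model=model_id, device=device)
```

The same code then works unchanged on a laptop and on a cloud GPU instance.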

License

The model is licensed under the MIT License.