apple/ane-distilbert-base-uncased-finetuned-sst-2-english
Introduction
The ane-distilbert-base-uncased-finetuned-sst-2-english model is a version of DistilBERT optimized for the Apple Neural Engine (ANE). It is based on the distilbert-base-uncased-finetuned-sst-2-english model and has been adapted for improved performance on Apple hardware.
Architecture
This model uses the DistilBERT architecture, which is a smaller, faster, cheaper, and lighter version of BERT. The model has been fine-tuned on the SST-2 dataset for sentiment analysis and optimized for execution on the Apple Neural Engine.
Training
The model was fine-tuned on the SST-2 dataset, a part of the GLUE benchmark commonly used for training and evaluating text classification models. The optimization for ANE was done by modifying Apple's ml-ane-transformers repository to be compatible with the Hugging Face Transformers library.
Guide: Running Locally
Basic Steps
-
Install Required Libraries
Ensure you have transformers, torch, and coremltools installed. You can install them using pip:
    pip install transformers torch coremltools
-
Load Model and Tokenizer
Load the model and tokenizer using the Hugging Face Transformers library:
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_checkpoint = "apple/ane-distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
    # trust_remote_code=True allows Transformers to run the custom model code stored in the checkpoint repository
    model = AutoModelForSequenceClassification.from_pretrained(
        model_checkpoint, trust_remote_code=True, return_dict=False
    )
-
Tokenize Input
Prepare your input text for the model:
    # Pad every sequence to a fixed length of 128 tokens and return PyTorch tensors
    inputs = tokenizer(
        ["The Neural Engine is really fast"],
        return_tensors="pt",
        max_length=128,
        padding="max_length",
    )
-
Inference
Run the model to obtain predictions:
    with torch.no_grad():
        outputs = model(**inputs)
-
Using Core ML
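For optimal performance on Apple devices, use the Core ML version of the model (prediction through coremltools requires macOS):
    import coremltools as ct
    import numpy as np

    mlmodel = ct.models.MLModel("DistilBERT_fp16.mlpackage")
    # predict() takes NumPy arrays, so convert the PyTorch tensors from the
    # tokenization step before casting them to int32
    outputs_coreml = mlmodel.predict({
        "input_ids": inputs["input_ids"].numpy().astype(np.int32),
        "attention_mask": inputs["attention_mask"].numpy().astype(np.int32),
    })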
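The steps above stop at the raw model outputs. As a minimal sketch of how those outputs might be turned into a sentiment label, the snippet below applies a softmax to a pair of logits and picks the more probable class. It assumes the logits have already been flattened into a plain Python list (for example with something like outputs[0].flatten().tolist() in the PyTorch path), and that index 0 means NEGATIVE and index 1 means POSITIVE, as in the base distilbert-base-uncased-finetuned-sst-2-english checkpoint; verify this against the model's config.id2label before relying on it. The names ID2LABEL and logits_to_label, and the example logits, are hypothetical and not part of the model card.

```python
import math

# Assumed label order; verify against model.config.id2label.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_to_label(logits):
    """Return (label, probability) for the most likely class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for illustration, not real model output.
label, prob = logits_to_label([-2.0, 3.0])
print(f"{label} ({prob:.3f})")
```

The same post-processing could be applied to the Core ML result once the logits array has been pulled out of the returned dictionary; the output key name depends on how the model package was exported, so inspect the dictionary rather than assuming one.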
Cloud GPUs
For faster training or inference, consider using cloud GPU services such as AWS, Google Cloud Platform, or Azure, which provide powerful hardware resources for machine learning tasks.
License
This model is licensed under the Apache-2.0 License, allowing for both personal and commercial use with attribution.