ane-distilbert-base-uncased-finetuned-sst-2-english

apple

Introduction

The ane-distilbert-base-uncased-finetuned-sst-2-english model is an optimized version of DistilBERT for the Apple Neural Engine (ANE). It is based on the distilbert-base-uncased-finetuned-sst-2-english model and has been adapted for improved performance on Apple hardware.

Architecture

This model uses the DistilBERT architecture, which is a smaller, faster, cheaper, and lighter version of BERT. The model has been fine-tuned on the SST-2 dataset for sentiment analysis and optimized for execution on the Apple Neural Engine.

Training

The model was fine-tuned on the SST-2 dataset, part of the GLUE benchmark and commonly used for training and evaluating text classification models. The ANE optimization was carried out by adapting code from Apple's ml-ane-transformers repository to work with the Hugging Face Transformers library.
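
The core idea behind the ml-ane-transformers restructuring is to express the model's linear projections as 1x1 convolutions over a (batch, channels, 1, sequence) tensor layout, a data format that maps more efficiently onto the Neural Engine. The snippet below is a minimal illustrative sketch of that substitution using DistilBERT's 768-dimensional hidden size, not the model's actual implementation:

    import torch
    import torch.nn as nn

    # Standard Transformers layout: (batch, seq, hidden)
    linear = nn.Linear(768, 768)

    # ANE-friendly equivalent: a 1x1 conv over a (batch, hidden, 1, seq) layout,
    # with the weights reshaped so both modules compute the same projection
    conv = nn.Conv2d(768, 768, kernel_size=1)
    conv.weight.data = linear.weight.data.view(768, 768, 1, 1)
    conv.bias.data = linear.bias.data

    x = torch.randn(1, 16, 768)              # (batch, seq, hidden)
    x_ane = x.transpose(1, 2).unsqueeze(2)   # (batch, hidden, 1, seq)

    out_linear = linear(x)
    out_conv = conv(x_ane).squeeze(2).transpose(1, 2)
    print(torch.allclose(out_linear, out_conv, atol=1e-5))  # True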

Guide: Running Locally

Basic Steps

  1. Install Required Libraries
    Ensure you have transformers, torch, and coremltools installed. You can install them using pip:

    pip install transformers torch coremltools
    
  2. Load Model and Tokenizer
    Load the model and tokenizer using the Hugging Face Transformers library:

    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer
    
    model_checkpoint = "apple/ane-distilbert-base-uncased-finetuned-sst-2-english"
    tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_checkpoint, trust_remote_code=True, return_dict=False
    )
    
  3. Tokenize Input
    Prepare your input text for the model:

    inputs = tokenizer(
        ["The Neural Engine is really fast"],
        return_tensors="pt",
        max_length=128,
        padding="max_length"
    )
    
  4. Inference
    Run the model to obtain predictions (a sketch for converting the logits into a label follows this list):

    with torch.no_grad():
        outputs = model(**inputs)
    
  5. Using Core ML
    For optimal performance on Apple devices, use the Core ML version of the model (see the note after this list on obtaining the .mlpackage):

    import numpy as np
    import coremltools as ct

    # Load the Core ML package and run prediction on int32 NumPy inputs;
    # the PyTorch tensors from the tokenizer must be converted to NumPy first
    mlmodel = ct.models.MLModel("DistilBERT_fp16.mlpackage")
    outputs_coreml = mlmodel.predict({
        "input_ids": inputs["input_ids"].numpy().astype(np.int32),
        "attention_mask": inputs["attention_mask"].numpy().astype(np.int32)
    })
    
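To turn the raw logits from step 4 into a sentiment label, apply a softmax and look up the label names from the model config. A minimal sketch, assuming the checkpoint returns the logits as the first element of the output tuple (as the standard sequence-classification head does) and ships the usual SST-2 id2label mapping:

    import torch

    # outputs is a tuple because the model was loaded with return_dict=False;
    # outputs[0] is assumed to hold the logits with shape (batch, 2)
    probs = torch.nn.functional.softmax(outputs[0], dim=-1)
    pred_id = int(probs.argmax(dim=-1)[0])
    print(model.config.id2label[pred_id])  # e.g. "POSITIVE" for the example sentence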

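The DistilBERT_fp16.mlpackage referenced in step 5 is expected to be distributed alongside the model weights in the same repository. One way to fetch a local copy is to download a snapshot of the repository with huggingface_hub; the snippet below is a sketch that assumes the package sits at the repository root:

    from huggingface_hub import snapshot_download
    import coremltools as ct

    # Download (or reuse a cached copy of) the model repository, which is
    # assumed to include the DistilBERT_fp16.mlpackage bundle at its root
    repo_path = snapshot_download("apple/ane-distilbert-base-uncased-finetuned-sst-2-english")
    mlmodel = ct.models.MLModel(f"{repo_path}/DistilBERT_fp16.mlpackage")
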
Cloud GPUs

For faster training or inference, consider using cloud GPU services such as AWS, Google Cloud Platform, or Azure, which provide powerful hardware resources for machine learning tasks.

License

This model is licensed under the Apache-2.0 License, allowing for both personal and commercial use with attribution.
