AmbatronBERTa

Peerawat2024

Introduction

AmbatronBERTa is a Thai language model fine-tuned for text classification. It builds on the WangchanBERTa architecture to improve classification accuracy on Thai text and was fine-tuned on a dataset of more than 3,000 research papers.

Architecture

AmbatronBERTa is based on WangchanBERTa, a transformer model pre-trained on Thai text. This foundation lets it capture the nuances of the Thai language, making it suitable for a range of document classification tasks.
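
Because the checkpoint is distributed in the standard transformers format (as the loading code below shows), the underlying architecture can be inspected from its configuration alone. A minimal sketch, assuming the hosted checkpoint exposes a standard configuration:

    from transformers import AutoConfig

    # Load the model configuration without downloading the weights.
    config = AutoConfig.from_pretrained("Peerawat2024/AmbatronBERTa")

    # Report the architecture family and key dimensions inherited
    # from the WangchanBERTa base.
    print(config.model_type)
    print(config.hidden_size)
    print(config.num_hidden_layers)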

Training

The model was fine-tuned on a dataset of more than 3,000 research papers, with the goal of improving its ability to classify Thai text across different domains.
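
The exact training recipe has not been published. As an illustration only, the sketch below shows how such fine-tuning is commonly done with the transformers Trainer API; the base checkpoint, the tiny example dataset, the label set, and the hyperparameters are all assumptions, not the actual setup used for AmbatronBERTa:

    from datasets import Dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    # Hypothetical labeled examples: Thai abstracts mapped to domain labels.
    train_data = Dataset.from_dict({
        "text": ["ตัวอย่างบทคัดย่องานวิจัยด้านวิศวกรรม", "ตัวอย่างบทคัดย่องานวิจัยด้านการแพทย์"],
        "label": [0, 1],
    })

    # Public WangchanBERTa checkpoint, assumed here as the fine-tuning base.
    base = "airesearch/wangchanberta-base-att-spm-uncased"
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

    train_data = train_data.map(tokenize, batched=True)

    # Hyperparameters chosen for illustration, not taken from the model card.
    args = TrainingArguments(
        output_dir="ambatronberta-finetune",
        num_train_epochs=3,
        per_device_train_batch_size=8,
    )

    Trainer(model=model, args=args, train_dataset=train_data).train()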

Guide: Running Locally

To use AmbatronBERTa with the transformers library, follow these steps:

  1. Install the transformers library if not already installed:

    pip install transformers
    
  2. Use the following code to load the tokenizer and model (a short inference example follows these steps):

    from transformers import AutoTokenizer, AutoModelForSequenceClassification
    
    # Load the tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained("Peerawat2024/AmbatronBERTa")
    model = AutoModelForSequenceClassification.from_pretrained("Peerawat2024/AmbatronBERTa")
    
  3. Optionally, consider using cloud GPUs from providers like AWS, GCP, or Azure to enhance performance during model training and inference.
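
Once the tokenizer and model from step 2 are loaded, inference follows the usual sequence-classification pattern. A minimal sketch: the Thai input sentence is an arbitrary example, and the meaning of the predicted index depends on the fine-tuning label set, which is not documented on the model card:

    import torch

    # Tokenize an example Thai sentence (hypothetical input).
    inputs = tokenizer("ตัวอย่างข้อความภาษาไทยสำหรับการจำแนกประเภท", return_tensors="pt")

    # Run the classifier without tracking gradients.
    with torch.no_grad():
        logits = model(**inputs).logits

    # Predicted class index; the label it maps to depends on the training data.
    predicted_class = logits.argmax(dim=-1).item()
    print(predicted_class)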

License

The license for AmbatronBERTa is currently unknown.
