Introduction

FNet-base is an English-language model developed by Google and pretrained with masked language modeling (MLM) and next sentence prediction (NSP) objectives. Its architecture replaces the attention mechanism with Fourier transforms.

Architecture

FNet keeps the overall transformer layout but replaces the attention mechanism with Fourier transforms, which mix information across tokens without any learned attention weights. Because there is no attention, the model does not use an attention mask. It is pretrained on a large English text corpus in a self-supervised fashion, so its embeddings can serve as features for downstream tasks.
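To make the mixing step concrete, here is a minimal sketch, assuming the formulation from the FNet paper: a discrete Fourier transform along the hidden dimension, another along the sequence dimension, and only the real part kept. The names dft and fourier_mixing are illustrative, and the naive O(n^2) transform stands in for the FFT routines a real implementation would use.

```python
import cmath

def dft(values):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def fourier_mixing(x):
    """Mix a (seq_len x hidden) matrix with a 2-D DFT and keep the real part."""
    # Transform each token vector along the hidden dimension.
    rows = [dft(row) for row in x]
    # Transform each hidden feature along the sequence dimension.
    cols = [dft([row[h] for row in rows]) for h in range(len(rows[0]))]
    # Transpose back to (seq_len x hidden) and drop the imaginary part.
    return [[cols[h][s].real for h in range(len(cols))] for s in range(len(x))]

# Toy input: 3 tokens, hidden size 4 (values are arbitrary).
x = [[1.0, 2.0, 3.0, 4.0],
     [0.0, 1.0, 0.0, 1.0],
     [2.0, 0.0, 2.0, 0.0]]
mixed = fourier_mixing(x)
print(len(mixed), len(mixed[0]))  # same shape as the input
```

The function has no trainable parameters and takes no mask argument, which is why every output position depends on every input position.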

Training

Training Data

FNet was pretrained on the C4 dataset, a cleaned version of the Common Crawl dataset.

Training Procedure

  • Preprocessing: Texts are lowercased and tokenized using SentencePiece with a vocabulary size of 32,000. The model processes inputs in the form of "[CLS] Sentence A [SEP] Sentence B [SEP]".
  • Pretraining: FNet-base was trained on 4 cloud TPUs in Pod configuration for one million steps. The training used the Adam optimizer with a learning rate of 1e-4 and a batch size of 256.
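The input layout described under Preprocessing can be sketched as a simple string template. This is only an illustration of the pattern: format_pair is a hypothetical helper, and in practice the tokenizer inserts the special tokens itself when given a sentence pair.

```python
def format_pair(sentence_a, sentence_b):
    """Lay out two sentences in the '[CLS] A [SEP] B [SEP]' pattern."""
    return f"[CLS] {sentence_a} [SEP] {sentence_b} [SEP]"

# Example sentences are arbitrary placeholders.
example = format_pair("the cat sat.", "it was tired.")
print(example)
```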

Guide: Running Locally

To use FNet-base locally, follow these steps:

  1. Install the Transformers Library: Install transformers along with PyTorch and SentencePiece, which the FNet model and tokenizer classes used below depend on.

    pip install transformers torch sentencepiece
    
  2. Load the Model and Tokenizer:

    from transformers import FNetForMaskedLM, FNetTokenizer, pipeline
    
    tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
    model = FNetForMaskedLM.from_pretrained("google/fnet-base")
    
  3. Run a Prediction:

    unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
    print(unmasker("Hello I'm a [MASK] model."))
    
  4. Cloud Accelerators (optional): For compute-intensive workloads, consider cloud services such as AWS EC2 GPU instances or Google Cloud TPUs.

License

The FNet-base model is released under the Apache-2.0 license, allowing for free use, distribution, and modification under the license's terms.
