FNet-base (google/fnet-base)
Introduction
FNet-base is a pretrained model developed by Google for English-language tasks. It was pretrained with masked language modeling (MLM) and next sentence prediction (NSP) objectives, and is built on a novel architecture in which the attention mechanism is replaced with Fourier transforms.
Architecture
FNet utilizes a transformer architecture, substituting the traditional attention mechanism with Fourier transforms. This approach removes the need for an attention mask. The model is pretrained on a large English text corpus using self-supervised techniques, enabling it to generate feature-rich embeddings for downstream tasks.
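As a rough illustration of this token-mixing idea (a minimal sketch in PyTorch, not the implementation used in the released checkpoint), the attention sublayer can be replaced by a parameter-free 2D discrete Fourier transform whose real part is kept:

import torch

def fourier_mixing(hidden_states):
    # Apply an FFT along the hidden dimension, then along the sequence
    # dimension, and keep the real part; this mixes information across
    # tokens without any learned attention weights or attention mask.
    return torch.fft.fft(torch.fft.fft(hidden_states, dim=-1), dim=-2).real

x = torch.randn(1, 128, 768)    # (batch, sequence length, hidden size)
print(fourier_mixing(x).shape)  # torch.Size([1, 128, 768])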
Training
Training Data
FNet was pretrained on the C4 dataset, a cleaned version of the Common Crawl dataset.
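For reference, the corpus is available on the Hugging Face Hub; one hypothetical way to stream its English split with the datasets library (the "allenai/c4" dataset id is an assumption here, not part of the original training setup):

from datasets import load_dataset

# Stream the English split of C4 rather than downloading it in full.
c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)
print(next(iter(c4))["text"][:200])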
Training Procedure
- Preprocessing: Texts are lowercased and tokenized using SentencePiece with a vocabulary size of 32,000. Inputs take the form "[CLS] Sentence A [SEP] Sentence B [SEP]" (see the tokenization sketch after this list).
- Pretraining: FNet-base was trained on 4 cloud TPUs in Pod configuration for one million steps. The training used the Adam optimizer with a learning rate of 1e-4 and a batch size of 256.
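A quick way to see this sentence-pair format (a sketch using the FNetTokenizer class from the transformers library; the decoded string is indicative, and exact spacing depends on the SentencePiece model):

from transformers import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
# Passing two texts produces the "[CLS] Sentence A [SEP] Sentence B [SEP]" layout.
encoded = tokenizer("sentence a", "sentence b")
print(tokenizer.decode(encoded["input_ids"]))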
Guide: Running Locally
To use FNet-base locally, follow these steps:
- Install the Transformers Library: Ensure you have the transformers library installed (the FNet tokenizer also relies on the sentencepiece package):

  pip install transformers sentencepiece
- Load the Model and Tokenizer:

  from transformers import FNetForMaskedLM, FNetTokenizer, pipeline

  tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
  model = FNetForMaskedLM.from_pretrained("google/fnet-base")
- Run a Prediction:

  unmasker = pipeline('fill-mask', model=model, tokenizer=tokenizer)
  print(unmasker("Hello I'm a [MASK] model."))
- Cloud GPUs: For intensive tasks, consider using cloud services such as AWS EC2 GPU instances or Google Cloud TPUs.
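Beyond fill-mask, the pretrained encoder can also be used as a feature extractor for downstream tasks, as noted in the Architecture section. A minimal sketch (assuming PyTorch and the FNetModel class from transformers):

import torch
from transformers import FNetModel, FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained("google/fnet-base")
model = FNetModel.from_pretrained("google/fnet-base")

inputs = tokenizer("FNet replaces attention with Fourier transforms.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One embedding vector per input token.
print(outputs.last_hidden_state.shape)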
License
The FNet-base model is released under the Apache-2.0 license, allowing for free use, distribution, and modification under the license's terms.