Propaganda Techniques Analysis en-BERT
Introduction
The Propaganda Techniques Analysis BERT model, developed by QCRI, predicts propaganda techniques in English news articles. It builds on a BERT architecture to perform fine-grained analysis, detecting the specific propaganda techniques used within a text.
Architecture
The model uses a BERT-based architecture designed for joint token and sequence classification. It classifies both whole text sequences and individual tokens to identify propaganda techniques, with a pre-trained BERT model (bert-base-cased) as its backbone.
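To make the joint setup concrete, the sketch below shows one plausible way such a model can be wired: a shared BERT encoder feeding a sequence-level head (from the pooled representation) and a token-level head (from each token representation). This is an illustrative approximation under assumed label counts and plain linear heads, not the actual implementation shipped with the model.

import torch.nn as nn
from transformers import BertModel

class JointTokenSequenceClassifier(nn.Module):
    """Illustrative joint model: one label for the sequence, one tag per token."""

    def __init__(self, num_sequence_labels=2, num_token_labels=19):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-cased")
        hidden = self.bert.config.hidden_size
        # Sequence head: propaganda vs. non-propaganda for the whole input (binary label assumed).
        self.sequence_head = nn.Linear(hidden, num_sequence_labels)
        # Token head: one of the technique tags plus an "outside" tag (label count assumed).
        self.token_head = nn.Linear(hidden, num_token_labels)

    def forward(self, input_ids, attention_mask=None):
        encoded = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        sequence_logits = self.sequence_head(encoded.pooler_output)   # (batch, num_sequence_labels)
        token_logits = self.token_head(encoded.last_hidden_state)     # (batch, seq_len, num_token_labels)
        return sequence_logits, token_logits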
Training
The model was trained using a dataset of news articles annotated at the fragment level with 18 different propaganda techniques. This approach allows for a detailed analysis of texts, identifying not only the presence of propaganda but also categorizing the specific techniques used.
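Fragment-level annotations are character spans over the article text, so before training they are typically projected onto token-level tags. The snippet below is only a hypothetical illustration of that projection step; the example text, span, label name, and helper logic are invented for illustration and are not the actual training pipeline.

from transformers import BertTokenizerFast

# Hypothetical fragment-level annotation: (start_char, end_char, technique).
text = "They are destroying our country, and everyone knows it."
annotations = [(0, 31, "Loaded_Language")]  # span covers "They are destroying our country"

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
encoded = tokenizer(text, return_offsets_mapping=True, add_special_tokens=False)

# Assign each token the technique of any annotated span that fully contains it, else "O".
tags = []
for start, end in encoded["offset_mapping"]:
    label = "O"
    for span_start, span_end, technique in annotations:
        if start >= span_start and end <= span_end:
            label = technique
    tags.append(label)

print(list(zip(tokenizer.convert_ids_to_tokens(encoded["input_ids"]), tags)))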
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies:
  - Ensure you have Python installed.
  - Install the Hugging Face Transformers library:
    pip install transformers
  - The example script below also uses PyTorch for tensor operations, so install it as well if it is not already present:
    pip install torch
- Load the Model:
  - Use the following Python script to load and run the model:
import torch
from transformers import BertTokenizerFast
# Relative import: BertForTokenAndSequenceJointClassification is provided alongside the model,
# not as part of the transformers library.
from .model import BertForTokenAndSequenceJointClassification

tokenizer = BertTokenizerFast.from_pretrained('bert-base-cased')
model = BertForTokenAndSequenceJointClassification.from_pretrained(
    "QCRI/PropagandaTechniquesAnalysis-en-BERT",
    revision="v0.1.0",
)

inputs = tokenizer.encode_plus("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)

# Sequence-level prediction for the whole input.
sequence_class_index = torch.argmax(outputs.sequence_logits, dim=-1)
sequence_class = model.sequence_tags[sequence_class_index[0]]

# Token-level predictions, skipping the [CLS] and [SEP] special tokens.
token_class_index = torch.argmax(outputs.token_logits, dim=-1)
tokens = tokenizer.convert_ids_to_tokens(inputs.input_ids[0][1:-1])
tags = [model.token_tags[i] for i in token_class_index[0].tolist()[1:-1]]
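After running the script, the predictions can be inspected directly from the variables it defines; the short follow-up snippet below simply continues from them and prints the results.

# Sentence-level verdict (one of model.sequence_tags).
print("Sequence class:", sequence_class)

# Token-level technique tags aligned with the input tokens.
for token, tag in zip(tokens, tags):
    print(f"{token:>12}  {tag}")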
- Cloud GPUs (optional):
  - For faster inference, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure; a minimal sketch for moving the model onto a GPU follows below.
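If a GPU is available, moving the model and inputs onto it follows the standard PyTorch pattern and is not specific to this model; a minimal sketch, assuming the loading script above has already been run:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Move the encoded inputs onto the same device before the forward pass.
inputs = {name: tensor.to(device) for name, tensor in inputs.items()}
outputs = model(**inputs)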
License
The model is licensed under the MIT License, allowing for wide use and distribution.