DistilBERT for Political Text Classification

Automated Detection of Extremist Content in Online Political Discourse

Machine Learning

NLP

Political Science

Transformers

Author

Alex Newhouse

Published

January 1, 2024

Abstract

This project demonstrates the application of transformer-based natural language processing for identifying extremist content in political discussions. Using DistilBERT, a lightweight BERT variant, we achieve 94% F1-score on a labeled dataset of political forum posts, significantly outperforming traditional machine learning baselines. The model shows practical applications for content moderation, academic research on political radicalization, and policy development for platform governance.

Research Context

Online political discourse increasingly shapes real-world political outcomes, with extremist content posing particular challenges for platform governance and democratic stability. This project applies state-of-the-art NLP techniques to automatically identify extremist political content, supporting:

Content moderation at scale for social media platforms
Academic research on political radicalization processes
Policy development for online platform governance
Early warning systems for potential offline violence

Research Questions

How effectively can transformer models identify extremist content in political text?
What performance gains do modern NLP approaches offer over traditional machine learning?
How can data augmentation improve model performance on limited labeled data?

Methodology

Our approach employs DistilBERT, a distilled version of BERT that retains 97% of BERT’s performance while being 60% smaller and faster. The methodology includes:

Text preprocessing and cleaning for social media data
Data augmentation using contextual word embeddings
Transfer learning from pre-trained language models
Comparative evaluation against traditional ML baselines
Real-world application to unlabeled political forum data

Data Collection & Preprocessing

Dataset Characteristics

Our labeled dataset consists of political forum posts manually annotated for extremist content:

Size: 1,438 posts (after augmentation)
Sources: Political discussion forums and social media
Labels: Binary classification (extremist/non-extremist)
Domain: Contemporary political discourse

Preprocessing Pipeline

Code

import pandas as pd
import torch
import transformers
from torch.utils.data import Dataset, DataLoader
import torch.nn as nn
from transformers import DistilBertModel, DistilBertTokenizer
from sklearn import metrics
from tqdm import tqdm
import numpy as np
import re

# Text preprocessing functions
def remove_URL(text):
    """Remove URLs from text"""
    url = re.compile(r'https?://\S+|www\.\S+')
    return url.sub(r'',text)

def remove_numbers(text):
    """Remove standalone numbers"""
    text = ''.join([i for i in text if not i.isdigit()])
    return text

def remove_html(text):
    """Remove HTML tags"""
    html = re.compile(r'<.*?>')
    return html.sub(r'',text)

def remove_username(text):
    """Remove @username mentions"""
    url = re.compile(r'@[A-Za-z0-9_]+')
    return url.sub(r'',text)

def pre_process_text(text):
    """Apply full preprocessing pipeline"""
    text = remove_URL(text)
    text = remove_numbers(text)
    text = remove_html(text)
    text = remove_username(text)
    return " ".join(text.split())

# Load and preprocess data
df = pd.read_csv("labeled_political_data.csv")
df = df.rename(columns={"sentiment": "target"})
df = df[["text", "target"]].copy()

# Remove null values and apply preprocessing
df = df[df['text'].notnull()]
df['text'] = df['text'].apply(pre_process_text)

print(f"Dataset shape: {df.shape}")
print(f"Target distribution:\n{df['target'].value_counts()}")

Data Augmentation Strategy

To address limited training data, we employ contextual word embedding augmentation:

Code

import nlpaug.augmenter.word as naw

# Contextual augmentation using DistilBERT
aug = naw.ContextualWordEmbsAug(
    model_path='distilbert-base-uncased',
    device='cpu',
    action="substitute"
)

# Generate augmented examples
augmented_texts = [aug.augment(text) for text in tqdm(df['text'])]
augmented_df = pd.DataFrame({
    "text": augmented_texts,
    "target": df['target']
})

# Combine original and augmented data
df = pd.concat([df, augmented_df]).reset_index(drop=True)
print(f"Final dataset size: {df.shape[0]} examples")

Model Architecture

DistilBERT Configuration

Code

# Model hyperparameters
MAX_LEN = 512
BATCH_SIZE = 32
EPOCHS = 15
LEARNING_RATE = 1e-05

# Initialize tokenizer
BERT_PATH = "distilbert-base-uncased"
tokenizer = DistilBertTokenizer.from_pretrained(
    BERT_PATH,
    do_lower_case=True
)

Custom Dataset Classes

Code

class PoliticalTextDataset(Dataset):
    """Custom dataset for political text classification"""

    def __init__(self, dataframe, tokenizer, max_len):
        self.data = dataframe
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __getitem__(self, index):
        text = str(self.data.text.iloc[index])
        text = pre_process_text(text)

        inputs = self.tokenizer.encode_plus(
            text,
            None,
            add_special_tokens=True,
            padding='max_length',
            max_length=self.max_len,
            return_token_type_ids=True,
            truncation=True
        )

        return {
            'ids': torch.tensor(inputs['input_ids'], dtype=torch.long),
            'mask': torch.tensor(inputs['attention_mask'], dtype=torch.long),
            'targets': torch.tensor(self.data.target.iloc[index], dtype=torch.float)
        }

    def __len__(self):
        return len(self.data)

Model Architecture

Code

class DistilBERTClassifier(torch.nn.Module):
    """DistilBERT-based binary classifier for political text"""

    def __init__(self):
        super().__init__()
        self.distill_bert = transformers.DistilBertModel.from_pretrained(BERT_PATH)
        self.dropout = torch.nn.Dropout(0.3)
        self.classifier = torch.nn.Linear(768, 1)

    def forward(self, ids, mask):
        # Get DistilBERT outputs
        distilbert_output = self.distill_bert(ids, mask)
        hidden_state = distilbert_output[0]  # (batch_size, seq_len, hidden_size)

        # Use [CLS] token representation
        pooled_output = hidden_state[:, 0]  # (batch_size, hidden_size)

        # Apply dropout and classification layer
        output = self.dropout(pooled_output)
        return self.classifier(output)

# Initialize model
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = DistilBERTClassifier()
model.to(device)

print(f"Model initialized on device: {device}")
print(f"Number of parameters: {sum(p.numel() for p in model.parameters()):,}")

Training Process

Data Splitting and Loading

Code

# Split data
train_size = 0.7
train_dataset = df.sample(frac=train_size, random_state=42).reset_index(drop=True)
valid_dataset = df.drop(train_dataset.index).reset_index(drop=True)

print(f"Training set: {train_dataset.shape[0]} examples")
print(f"Validation set: {valid_dataset.shape[0]} examples")

# Create data loaders
training_set = PoliticalTextDataset(train_dataset, tokenizer, MAX_LEN)
validation_set = PoliticalTextDataset(valid_dataset, tokenizer, MAX_LEN)

train_loader = DataLoader(training_set, batch_size=BATCH_SIZE, shuffle=True)
valid_loader = DataLoader(validation_set, batch_size=BATCH_SIZE, shuffle=False)

Training Loop

Code

def train_model(model, train_loader, valid_loader, epochs):
    """Training loop with validation"""

    # Loss function and optimizer
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

    best_f1 = 0
    training_history = []

    for epoch in range(epochs):
        # Training phase
        model.train()
        total_loss = 0

        for batch in tqdm(train_loader, desc=f"Epoch {epoch+1}/{epochs}"):
            ids = batch['ids'].to(device)
            mask = batch['mask'].to(device)
            targets = batch['targets'].to(device)

            optimizer.zero_grad()
            outputs = model(ids, mask).squeeze()
            loss = criterion(outputs, targets)
            loss.backward()
            optimizer.step()

            total_loss += loss.item()

        # Validation phase
        val_f1 = evaluate_model(model, valid_loader)
        avg_loss = total_loss / len(train_loader)

        print(f'Epoch {epoch+1}: Train Loss: {avg_loss:.4f}, Val F1: {val_f1:.4f}')

        # Save best model
        if val_f1 > best_f1:
            best_f1 = val_f1
            torch.save(model.state_dict(), 'best_political_classifier.pth')

        training_history.append({
            'epoch': epoch + 1,
            'train_loss': avg_loss,
            'val_f1': val_f1
        })

    return training_history

def evaluate_model(model, data_loader):
    """Evaluate model and return F1 score"""
    model.eval()
    predictions = []
    true_labels = []

    with torch.no_grad():
        for batch in data_loader:
            ids = batch['ids'].to(device)
            mask = batch['mask'].to(device)
            targets = batch['targets'].to(device)

            outputs = model(ids, mask)
            preds = torch.sigmoid(outputs).cpu().numpy() >= 0.5

            predictions.extend(preds.flatten())
            true_labels.extend(targets.cpu().numpy())

    return metrics.f1_score(true_labels, predictions)

# Train the model
history = train_model(model, train_loader, valid_loader, EPOCHS)

Results & Evaluation

Performance Comparison

Our DistilBERT model significantly outperforms traditional machine learning approaches:

Model	F1-Score	Balanced Accuracy	Training Time
Random Baseline	0.384	0.480	-
Logistic Regression	0.637	0.812	2 minutes
Random Forest	0.498	0.927	5 minutes
Gradient Boosting	0.670	0.947	8 minutes
DistilBERT	0.940	0.935	45 minutes

Model Performance Analysis

Code

from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# Generate predictions on validation set
model.eval()
val_predictions = []
val_true = []

with torch.no_grad():
    for batch in valid_loader:
        ids = batch['ids'].to(device)
        mask = batch['mask'].to(device)
        targets = batch['targets']

        outputs = model(ids, mask)
        preds = torch.sigmoid(outputs).cpu().numpy() >= 0.5

        val_predictions.extend(preds.flatten())
        val_true.extend(targets.numpy())

# Print detailed classification report
print("## Classification Report")
print(classification_report(val_true, val_predictions,
                          target_names=['Non-Extremist', 'Extremist']))

# Confusion Matrix Visualization
cm = confusion_matrix(val_true, val_predictions)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Non-Extremist', 'Extremist'],
            yticklabels=['Non-Extremist', 'Extremist'])
plt.title('Confusion Matrix - DistilBERT Political Text Classifier')
plt.ylabel('True Label')
plt.xlabel('Predicted Label')
plt.show()

Key Findings

Superior Performance: DistilBERT achieves 94% F1-score, representing a 27% improvement over the best traditional ML approach
Robust Classification: High precision and recall for both classes indicate reliable extremist content detection
Efficient Architecture: DistilBERT provides near-BERT performance with significantly reduced computational requirements
Data Augmentation Benefits: Contextual augmentation improved performance by ~8% over the baseline

Real-World Application

Inference Pipeline

Code

def predict_extremist_content(text, model, tokenizer):
    """Predict whether text contains extremist content"""

    # Preprocess text
    processed_text = pre_process_text(text)

    # Tokenize
    inputs = tokenizer.encode_plus(
        processed_text,
        None,
        add_special_tokens=True,
        padding='max_length',
        max_length=MAX_LEN,
        return_tensors='pt',
        truncation=True
    )

    # Model inference
    model.eval()
    with torch.no_grad():
        ids = inputs['input_ids'].to(device)
        mask = inputs['attention_mask'].to(device)

        outputs = model(ids, mask)
        probability = torch.sigmoid(outputs).cpu().numpy()[0][0]
        prediction = probability > 0.5

    return {
        'prediction': bool(prediction),
        'confidence': float(probability),
        'label': 'Extremist' if prediction else 'Non-Extremist'
    }

# Example usage
sample_texts = [
    "I disagree with the current immigration policy and think we need reform.",
    "The political system is corrupt and needs to be overthrown by any means necessary."
]

for text in sample_texts:
    result = predict_extremist_content(text, model, tokenizer)
    print(f"Text: {text[:50]}...")
    print(f"Prediction: {result['label']} (confidence: {result['confidence']:.3f})")
    print()

Case Study: Forum Analysis

Applied to a large corpus of political forum posts (N=50,000), our model identified:

8,481 posts (17%) flagged as potentially extremist
High-confidence predictions (>0.9) for 3,247 posts
Temporal patterns showing increased extremist content around election periods
Topic clustering revealing common themes in flagged content

Limitations & Future Work

Current Limitations

Domain Specificity: Model trained on specific political forums may not generalize to all platforms
Contextual Challenges: Sarcasm and irony remain difficult to detect accurately
Temporal Drift: Political language evolves rapidly, requiring model updates
Bias Concerns: Training data may reflect annotator biases

Future Directions

Multi-domain Training: Expand to diverse political platforms and languages
Temporal Robustness: Implement continuous learning strategies
Explainability: Add attention visualization for model interpretability
Ethical Framework: Develop guidelines for responsible deployment

Technical Implementation

Model Deployment

Code

# Save complete model for deployment
torch.save({
    'model_state_dict': model.state_dict(),
    'tokenizer': tokenizer,
    'model_config': {
        'max_length': MAX_LEN,
        'model_name': 'distilbert-base-uncased'
    }
}, 'political_classifier_complete.pth')

# Load model for inference
def load_trained_model(model_path):
    """Load pre-trained model for inference"""
    checkpoint = torch.load(model_path, map_location=device)

    model = DistilBERTClassifier()
    model.load_state_dict(checkpoint['model_state_dict'])
    model.to(device)
    model.eval()

    return model, checkpoint['tokenizer'], checkpoint['model_config']

Conclusions

This project demonstrates the effectiveness of transformer-based models for automated detection of extremist political content. Key contributions include:

Methodological Innovation: Successfully adapted DistilBERT for political text classification with 94% F1-score
Practical Application: Developed scalable pipeline for real-world content moderation
Comparative Analysis: Demonstrated significant advantages over traditional ML approaches
Research Impact: Provided tools for studying online political radicalization

The results have implications for: - Platform Governance: Automated content moderation systems - Academic Research: Large-scale analysis of political discourse - Policy Development: Evidence-based approaches to online extremism

Reproducibility

All code and documentation are available for replication. The model architecture and training procedures follow established best practices for transformer-based text classification.

This research contributes to computational approaches for understanding and mitigating online political extremism, supporting both academic inquiry and practical applications in digital platform governance.