MobileNetV2 Fake News CNN
Model Performance Metrics
Project Overview
The MobileNetV2 Fake News CNN is a custom-trained deep learning model for detecting image manipulation. Built on a MobileNetV2 convolutional backbone with transfer learning from ImageNet, it classifies whether images embedded in online news stories have been photoshopped, spliced, or otherwise digitally modified to spread fake news.
The model was trained on a balanced dataset of 5,000 images (2,500 Real and 2,500 Fake), which avoids class bias during optimization. The final weight checkpoint (FINAL_ROBUST_MODEL.h5) reached an AUC of 0.96 and an Average Precision of 0.98. The model is served through a local Python Flask dashboard for interactive analysis.
Training Results & Evaluation Charts
Accuracy Graph — Train vs Validation
Training accuracy climbs steadily to ~94% over 15 epochs. Validation accuracy peaks at ~82% around epoch 9; after that the gap to training accuracy widens, a sign of mild overfitting. Epoch 9 is therefore the natural early-stopping point for the best-generalizing checkpoint.
Learning Curves — Loss (Train vs Validation)
Training loss falls from 0.80 to 0.21. Validation loss reaches a minimum of ~0.41 before diverging from the training loss, which is consistent with the model generalizing well up to epoch 9. The loss function is binary cross-entropy, optimized with Adam.
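The early-stopping rule implied by these curves can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's training code: the function name, the `patience` value, and the loss values are assumptions, with the numbers chosen only to resemble the curve described above (minimum of 0.41 at epoch 9).

```python
def best_epoch_with_patience(val_losses, patience=3):
    """Return (best_epoch, stop_epoch), both 1-indexed.

    Training stops once validation loss has failed to improve for
    `patience` consecutive epochs; best_epoch is where the checkpoint
    would be kept.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(val_losses)


# Illustrative values only (not the project's logged losses).
example_val_losses = [0.78, 0.66, 0.58, 0.52, 0.48, 0.45, 0.43, 0.42,
                      0.41, 0.42, 0.44, 0.45, 0.47, 0.48, 0.50]
best, stopped = best_epoch_with_patience(example_val_losses, patience=3)
print(f"keep checkpoint from epoch {best}, stop at epoch {stopped}")
```

In Keras, the same behavior is available through the `tf.keras.callbacks.EarlyStopping` callback with `monitor="val_loss"` and `restore_best_weights=True`; whether this project used that callback is not stated.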
Confusion Matrix — Real vs Fake
Of 100 test samples (50 Real, 50 Fake): 46 Real were classified correctly and 4 were misclassified as Fake; 47 Fake were classified correctly and 3 were misclassified as Real. Overall test accuracy is 93%. Taking Fake as the positive class, the false negative rate is 6% (3 of 50 Fake images missed).
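The headline metrics follow directly from the four confusion-matrix counts. The short calculation below uses only the counts reported above; treating Fake as the positive class is an assumption about the labeling convention.

```python
# Counts from the confusion matrix; "Fake" is assumed to be the positive class.
tp = 47  # Fake correctly classified as Fake
fn = 3   # Fake misclassified as Real
tn = 46  # Real correctly classified as Real
fp = 4   # Real misclassified as Fake

total = tp + fn + tn + fp
accuracy = (tp + tn) / total
recall = tp / (tp + fn)                # share of Fake images that were caught
precision = tp / (tp + fp)             # share of "Fake" predictions that were right
false_negative_rate = fn / (tp + fn)   # Fake images that slipped through
false_positive_rate = fp / (fp + tn)   # Real images flagged as Fake
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.3f}")
print(f"FNR={false_negative_rate:.2f} FPR={false_positive_rate:.2f} F1={f1:.3f}")
```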
ROC Curve — AUC = 0.96
The Receiver Operating Characteristic curve gives AUC = 0.96, well above the 0.5 of a random classifier (the diagonal). Because AUC summarizes ranking quality across all classification thresholds, this indicates strong separation between Real and Fake images.
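AUC can be read as the probability that a randomly chosen Fake image receives a higher score than a randomly chosen Real one. The sketch below computes it that way in plain Python; the function name and the toy labels and scores are illustrative assumptions, not project data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = Fake, 0 = Real; scores are the predicted P(Fake).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(f"AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

This pairwise form is quadratic in the number of samples, which is fine for a 100-image test set; library implementations such as scikit-learn's `roc_auc_score` are the practical choice for larger sets.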
Precision-Recall Curve — AP = 0.98
Average Precision is 0.98. Precision stays near 100% in the low-recall region and the curve remains high as recall increases, so the model gives up little precision to gain recall.
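Average Precision can be computed by ranking samples by score and averaging the precision at each rank where a Fake image appears. The following is a minimal plain-Python sketch of that definition; the function name and toy data are illustrative assumptions, and tied scores are not given any special treatment.

```python
def average_precision(labels, scores):
    """Mean of the precision values at each rank where a positive is retrieved."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this cut-off
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return precision_sum / hits


# Toy example: 1 = Fake, 0 = Real; scores are the predicted P(Fake).
ap_labels = [1, 1, 1, 0, 0, 0]
ap_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(f"AP = {average_precision(ap_labels, ap_scores):.3f}")
```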
Hyperparameter Tuning — LR vs Batch
A grid search over learning rate and batch size covered 4 configurations, with validation accuracies of 0.82, 0.85, 0.89, and 0.86. The best configuration (val acc 0.89) was selected for the final ROBUST model checkpoint.
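A grid search of this kind reduces to a loop over every learning-rate and batch-size pair. The sketch below shows the pattern; `grid_search` and the `train_and_evaluate` callback are hypothetical names, and the actual learning rates and batch sizes used in this project are not stated, so none are filled in here.

```python
from itertools import product


def grid_search(train_and_evaluate, learning_rates, batch_sizes):
    """Try every (lr, batch_size) pair; return the best pair and all scores.

    `train_and_evaluate(lr, batch_size)` is supplied by the caller and is
    expected to train a model and return its validation accuracy.
    """
    results = {}
    for lr, batch_size in product(learning_rates, batch_sizes):
        results[(lr, batch_size)] = train_and_evaluate(lr, batch_size)
    best_config = max(results, key=results.get)
    return best_config, results
```

A 2 × 2 grid (two learning rates, two batch sizes) would produce four configurations, matching the count reported above, though the real grid layout is an assumption.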
Dataset Distribution — Perfectly Balanced
The dataset contains exactly 2,500 Real and 2,500 Fake images, a 50/50 split. This removes class imbalance as a source of bias during training, so the model has no built-in incentive to favor one class over the other.
Preprocessing Pipeline
3-stage pipeline: Raw Images & Text → Corruption / Noise Injection (augmentation for robustness) → Resize / Tokenize (resize to 224×224, scale pixel values to floats) → Model Input tensor ready for MobileNetV2.
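The image branch of this pipeline can be approximated with NumPy alone. The sketch below is an illustration under stated assumptions: the function names, the Gaussian noise model with `sigma=10.0`, nearest-neighbor resizing, and scaling to [0, 1] are all choices made here, not details confirmed by the project. Keras's own `mobilenet_v2.preprocess_input` scales to [-1, 1] instead, so the real scaling may differ. The text and tokenization branch is not covered.

```python
import numpy as np


def add_gaussian_noise(image, sigma=10.0, rng=None):
    """Corrupt a uint8 image with Gaussian noise (sigma in pixel units)."""
    if rng is None:
        rng = np.random.default_rng()
    noisy = image.astype(np.float32) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def resize_nearest(image, size=(224, 224)):
    """Nearest-neighbor resize of an (H, W, C) array using index lookup."""
    height, width = image.shape[:2]
    rows = np.arange(size[0]) * height // size[0]
    cols = np.arange(size[1]) * width // size[1]
    return image[rows[:, None], cols]


def preprocess(image, augment=False, rng=None):
    """Raw uint8 image -> float32 array of shape (224, 224, C) in [0, 1]."""
    if augment:
        image = add_gaussian_noise(image, rng=rng)  # training-time corruption
    image = resize_nearest(image)
    return image.astype(np.float32) / 255.0         # assumed [0, 1] scaling


# Example with a synthetic 300x400 RGB image.
raw = np.random.default_rng(0).integers(0, 256, size=(300, 400, 3), dtype=np.uint8)
tensor = preprocess(raw, augment=True, rng=np.random.default_rng(1))
print(tensor.shape, tensor.dtype)
```

Noise is applied before resizing to follow the stage order listed above; a production pipeline would more likely use an image library's interpolating resize than nearest-neighbor lookup.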
Training Loop Flowchart
Per-epoch training cycle: Batch Data → Forward Pass (MobileNetV2 + Dense layers) → Calculate Loss (Binary Cross-Entropy) → Backpropagation (compute gradients) → Update Weights (Adam optimizer), repeated for each batch, followed by Validation.
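The same cycle can be shown end to end on a toy problem. In the sketch below a single logistic layer stands in for MobileNetV2 plus the dense head, and two synthetic Gaussian blobs stand in for image features; both are simplifying assumptions made for illustration, as are all names and hyperparameter values. The loss is binary cross-entropy and the update rule is Adam, as in the flowchart.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce(y_true, y_prob, eps=1e-7):
    """Binary cross-entropy averaged over the samples."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))


def train(x_train, y_train, x_val, y_val, epochs=15, batch_size=32, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    w, b = np.zeros(x_train.shape[1]), 0.0
    m, v = np.zeros(w.size + 1), np.zeros(w.size + 1)   # Adam moment estimates
    beta1, beta2, eps, step = 0.9, 0.999, 1e-8, 0
    history = []
    for _ in range(epochs):
        order = rng.permutation(len(x_train))
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]        # batch data
            xb, yb = x_train[idx], y_train[idx]
            prob = sigmoid(xb @ w + b)                   # forward pass
            err = prob - yb                              # dLoss/dlogit for BCE + sigmoid
            grad = np.append(xb.T @ err, err.sum()) / len(idx)  # backpropagation
            step += 1                                    # Adam update
            m = beta1 * m + (1 - beta1) * grad
            v = beta2 * v + (1 - beta2) * grad ** 2
            update = lr * (m / (1 - beta1 ** step)) / (np.sqrt(v / (1 - beta2 ** step)) + eps)
            w -= update[:-1]
            b -= update[-1]
        history.append(bce(y_val, sigmoid(x_val @ w + b)))  # validation
    return w, b, history


# Synthetic two-class data: label 0 around (-1, -1), label 1 around (1, 1).
data_rng = np.random.default_rng(0)
x = np.vstack([data_rng.normal(-1.0, 1.0, (200, 2)), data_rng.normal(1.0, 1.0, (200, 2))])
y = np.concatenate([np.zeros(200), np.ones(200)])
shuffle = data_rng.permutation(400)
x, y = x[shuffle], y[shuffle]
w, b, history = train(x[:320], y[:320], x[320:], y[320:])
print(f"val loss: first epoch {history[0]:.3f}, last epoch {history[-1]:.3f}")
```

In a Keras project the inner loop is handled by `model.fit`, so code like this is useful mainly for understanding what each flowchart box does.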
Core Features
- MobileNetV2 Transfer Learning: Pretrained ImageNet backbone with custom top layers — GlobalAveragePooling → Dense(256, ReLU) → Dropout(0.5) → Softmax output.
- Perfectly Balanced Dataset (5,000 images): 2,500 Real + 2,500 Fake, so neither class dominates the gradient updates during training.
- AUC 0.96 / AP 0.98: Strong discriminative power; the model separates manipulated from authentic images well across classification thresholds.
- Hyperparameter Grid Search: A systematic LR × Batch Size sweep tested 4 configurations, and the best one (val acc 0.89) was used for the final model.
- Flask Local Dashboard: Web UI where users can drag-and-drop image files or paste URLs to receive instant Fake/Real classification with softmax confidence scores.
- Robust Data Augmentation: Horizontal flips, brightness jitter, and noise corruption applied during preprocessing to improve generalization against unseen manipulations.
Neural Network Architecture
The model is a convolutional layer stack built and compiled with Keras:
- Pretrained Base: MobileNetV2 feature extractor (excluding top classification layers, loaded with ImageNet base weights). Frozen during initial training, then fine-tuned.
- Global Average Pooling: Averages each feature map down to a single value. Compared with flattening into a fully connected layer, this uses far fewer parameters and lowers the risk of overfitting.
- Dense Layer (256 nodes): ReLU activation to capture complex non-linear patterns in the extracted feature maps.
- Dropout Layer (50%): Randomly zeroes half of the units at each training step, which pushes the network to learn redundant features and improves generalization to unseen images. Dropout is inactive at inference time.
- Output Layer: Softmax activation producing a two-class probability distribution (Real vs. Fake) for each image.
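The layer stack listed above maps onto the Keras functional API roughly as follows. This is a sketch, not the project's actual model file: the function name `build_model`, the default Adam settings, and the use of one-hot labels are assumptions. With a two-unit softmax and one-hot labels, binary cross-entropy gives the same value as categorical cross-entropy, so either loss name fits the description.

```python
import tensorflow as tf


def build_model(weights="imagenet", input_shape=(224, 224, 3)):
    """MobileNetV2 backbone + GAP -> Dense(256, ReLU) -> Dropout(0.5) -> Softmax(2)."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape,
        include_top=False,   # drop the ImageNet classification head
        weights=weights,     # "imagenet" downloads pretrained weights
    )
    base.trainable = False   # frozen for the initial training stage

    inputs = tf.keras.Input(shape=input_shape)
    x = base(inputs, training=False)  # keep BatchNorm layers in inference mode
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation="relu")(x)
    x = tf.keras.layers.Dropout(0.5)(x)
    outputs = tf.keras.layers.Dense(2, activation="softmax")(x)  # [P(Real), P(Fake)], order assumed

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(),
        loss="binary_crossentropy",   # assumes one-hot labels of shape (batch, 2)
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model()` returns a compiled model ready for `model.fit`. For the later fine-tuning stage, a common approach is to set `base.trainable = True` for some or all backbone layers and recompile with a smaller learning rate; the exact schedule used here is not documented.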
Customization & Extensibility
- Transfer Learning Fine-Tuning: Configure custom dense layers, dropout ratios, optimizers and learning rates (Adam/SGD), and loss functions (Binary/Categorical Cross-Entropy).
- Alternative Backbone Swap: Swap MobileNetV2 for ResNet50, EfficientNetB0, or VGG16 in the TensorFlow model-loading code. The same training loop applies, though each backbone expects its own input preprocessing.
- Dataset Expansion: Plug in custom image datasets (e.g. medical images, satellite imagery, document fraud detection); the same preprocessing pipeline can be reused.
- Flask REST API: Refactor Flask endpoints to output RESTful JSON structures, enabling integration with React, Vue, mobile apps, or browser extensions.
- Class Expansion: Extend from binary (Real/Fake) to multi-class classification (e.g. Spliced / Copy-Move / Retouched / Authentic) with minimal architecture changes.
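As an illustration of the REST refactor mentioned in the list above, the sketch below exposes a single JSON endpoint with Flask. The route `/api/predict`, the form field name `image`, the response layout, and the `predict_image` helper are all hypothetical; the helper is a placeholder that returns fixed values where the real code would preprocess the image and call the loaded model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_image(image_bytes):
    """Hypothetical stand-in for the model call; returns (p_real, p_fake).

    Placeholder only: a real implementation would decode the bytes,
    preprocess the image, and run it through the loaded checkpoint.
    """
    return 0.5, 0.5


@app.route("/api/predict", methods=["POST"])
def predict():
    upload = request.files.get("image")
    if upload is None:
        return jsonify({"error": "missing 'image' file field"}), 400
    p_real, p_fake = predict_image(upload.read())
    label = "Fake" if p_fake >= p_real else "Real"
    return jsonify({"label": label, "confidence": {"Real": p_real, "Fake": p_fake}})


if __name__ == "__main__":
    app.run(debug=True)  # local development server only
```

A client would send a multipart POST with the file in the `image` field and read `label` and `confidence` from the JSON body, which keeps the front end independent of how the model is implemented.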