Forensic Analysis

Digital Forensics for Deepfakes

Metadata Examination

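Standard: EXIF (Exchangeable Image File Format) for still images; MP4 and MOV video carries comparable metadata in QuickTime atoms, which exiftool also reads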

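# Extract all metadata, including duplicate tags (-a), with group names (-G1)
exiftool -a -G1 suspicious_video.mp4

# Key indicators:
# - Software / Encoder: names of editing or face-swap tools
# - CreateDate vs ModifyDate: a large gap can indicate re-encoding or editing
# - GPS: is the location consistent with the claimed recording site?
# - Camera Model: does it match the claimed source device?
# Metadata is easy to strip or forge, so treat these as leads, not proof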

Research: Verdoliva, L. (2020) - “Media Forensics and DeepFakes: An Overview” IEEE Journal of Selected Topics in Signal Processing, 14(5), 910-932 DOI: 10.1109/JSTSP.2020.3002101
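
The indicators listed above can be checked programmatically once the metadata has been exported. The following is a minimal sketch, assuming the tags have already been loaded into a flat Python dict keyed by tag name (for example from exiftool's JSON output). The function names, the one-day threshold, and the `SUSPICIOUS_SOFTWARE` watch list are illustrative assumptions, not part of exiftool.

```python
from datetime import datetime, timedelta

# Hypothetical watch list; extend it for your own casework
SUSPICIOUS_SOFTWARE = ("deepfacelab", "faceswap")


def parse_exif_date(value):
    """Parse the 'YYYY:MM:DD HH:MM:SS' date format; return None if unparseable."""
    try:
        return datetime.strptime(str(value)[:19], "%Y:%m:%d %H:%M:%S")
    except ValueError:
        return None


def flag_metadata_anomalies(metadata, max_gap=timedelta(days=1)):
    """Return human-readable flags for a flat metadata dict."""
    flags = []

    # Software field: look for known manipulation tool names
    software = str(metadata.get("Software", "")).lower()
    for name in SUSPICIOUS_SOFTWARE:
        if name in software:
            flags.append(f"software field mentions {name}")

    # CreateDate vs ModifyDate: flag gaps larger than max_gap (assumed threshold)
    created = parse_exif_date(metadata.get("CreateDate", ""))
    modified = parse_exif_date(metadata.get("ModifyDate", ""))
    if created and modified and abs(modified - created) > max_gap:
        flags.append(f"CreateDate and ModifyDate differ by {modified - created}")

    return flags
```

An empty list only means that none of these simple checks fired; it does not establish that the file is authentic.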

File System Analysis

import os
import hashlib
from datetime import datetime

class ForensicAnalyzer:
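    def analyze_file(self, filepath):
        """
        Collect size, file system timestamps, and hashes for one file
        """
        stat = os.stat(filepath)

        # st_ctime is the creation time on Windows but the last inode
        # (metadata) change on Unix, so prefer st_birthtime where it exists
        created = getattr(stat, 'st_birthtime', stat.st_ctime)

        return {
            'size': stat.st_size,
            'created': datetime.fromtimestamp(created),
            'modified': datetime.fromtimestamp(stat.st_mtime),
            'accessed': datetime.fromtimestamp(stat.st_atime),
            # MD5 is kept only for matching legacy records; rely on SHA-256
            'md5': self.calculate_hash(filepath, 'md5'),
            'sha256': self.calculate_hash(filepath, 'sha256')
        }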
    
    def calculate_hash(self, filepath, algorithm='sha256'):
        h = hashlib.new(algorithm)
        with open(filepath, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                h.update(chunk)
        return h.hexdigest()
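
The hash recorded at acquisition is only useful if it is checked again later. The sketch below shows one way to do that, assuming the SHA-256 digest was stored as a hex string; `sha256_of` and `verify_integrity` are hypothetical helper names introduced for illustration.

```python
import hashlib
import hmac


def sha256_of(filepath, chunk_size=65536):
    """Hash a file in chunks so large videos do not need to fit in memory."""
    h = hashlib.sha256()
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_integrity(filepath, recorded_sha256):
    """Return True if the file still matches the digest recorded earlier."""
    expected = recorded_sha256.strip().lower()
    return hmac.compare_digest(sha256_of(filepath), expected)
```

A mismatch means the bytes changed after the digest was recorded; it does not say when or by whom.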

Chain of Custody

Evidence Preservation

Standard: ISO/IEC 27037:2012 - Digital Evidence Guidelines

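import hashlib
import os
from datetime import datetime, timezone

class ChainOfCustody:
    def __init__(self):
        self.log = []

    def calculate_hash(self, filepath, algorithm='sha256'):
        """
        Hash the evidence file in chunks
        """
        h = hashlib.new(algorithm)
        with open(filepath, 'rb') as f:
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    def acquire_evidence(self, source, investigator):
        """
        Document evidence acquisition
        """
        entry = {
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'action': 'ACQUIRED',
            'source': source,
            'investigator': investigator,
            'hash': self.calculate_hash(source),
            'location': os.path.abspath(source)
        }
        self.log.append(entry)
        return entry

    def transfer_custody(self, from_person, to_person, reason):
        """
        Document custody transfer
        """
        entry = {
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'action': 'TRANSFERRED',
            'from': from_person,
            'to': to_person,
            'reason': reason
        }
        self.log.append(entry)
        return entry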
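
A custody log held in a plain Python list can be edited after the fact without leaving a trace. One common mitigation is to chain the entries with hashes so that changing any earlier entry invalidates every later seal. The following is a minimal sketch under that assumption; `seal_entry`, `seal_log`, and `verify_log` are hypothetical names and are not required by ISO/IEC 27037.

```python
import hashlib
import json


def seal_entry(entry, previous_seal=""):
    """Return a SHA-256 seal over the entry plus the previous entry's seal."""
    payload = json.dumps(entry, sort_keys=True) + previous_seal
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def seal_log(log):
    """Seal every entry in order and return the list of seals."""
    seals, previous = [], ""
    for entry in log:
        previous = seal_entry(entry, previous)
        seals.append(previous)
    return seals


def verify_log(log, seals):
    """Return True if recomputing the chain reproduces the stored seals."""
    return len(log) == len(seals) and seal_log(log) == seals
```

The seals only help if they are stored somewhere the log's editor cannot also rewrite, such as a signed report or a write-once medium.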

Frame-Level Analysis

Compression Artifacts

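Research: Matern, F., Riess, C., & Stamminger, M. (2019) - “Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations” IEEE Winter Applications of Computer Vision Workshops (WACVW)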

import cv2
import numpy as np

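def analyze_compression_artifacts(video_path):
    """
    Measure how much high-frequency DCT energy varies across frames.
    Deepfakes often show inconsistent compression between frames.
    """
    cap = cv2.VideoCapture(video_path)
    artifact_scores = []

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break

        # Convert to frequency domain; cv2.dct needs even dimensions,
        # so crop a trailing row or column if the frame size is odd
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        h, w = gray.shape
        gray = gray[:h - h % 2, :w - w % 2]
        dct = cv2.dct(np.float32(gray))

        # Analyze higher-frequency components: everything beyond the
        # first 32 rows and columns of coefficients
        high_freq = dct[32:, 32:]
        artifact_scores.append(float(np.mean(np.abs(high_freq))))

    cap.release()

    if not artifact_scores:
        raise ValueError(f"No frames could be read from {video_path}")

    # A high spread is a cue for closer inspection, not proof of
    # manipulation; scene cuts and motion also change the score
    return float(np.std(artifact_scores))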
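
A single standard deviation says that the scores vary but not where. If the per-frame scores are kept, individual frames can be flagged for manual review. The following is a minimal sketch using the modified z-score based on the median absolute deviation; `flag_outlier_frames` is a hypothetical helper, and the 3.5 cutoff is a common rule of thumb rather than a value from the cited research.

```python
import statistics


def flag_outlier_frames(scores, threshold=3.5):
    """Return indices of frames whose score is a robust outlier."""
    if len(scores) < 3:
        return []

    median = statistics.median(scores)
    # Median absolute deviation: robust to the outliers we are looking for
    mad = statistics.median(abs(s - median) for s in scores)
    if mad == 0:
        return []

    # 0.6745 rescales the MAD so the result is comparable to a z-score
    return [
        i for i, s in enumerate(scores)
        if 0.6745 * abs(s - median) / mad > threshold
    ]
```

Flagged indices are starting points for frame-by-frame inspection; a scene cut can trigger the same flag as a spliced face.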

Biological Signal Detection

Method: Blood flow analysis via remote photoplethysmography (rPPG), the approach used by Intel FakeCatcher

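def detect_blood_flow_inconsistencies(video_path):
    """
    Real faces show subtle, periodic color changes caused by blood flow.
    Deepfakes often lack or distort this biological signal.
    """
    cap = cv2.VideoCapture(video_path)
    frames = []

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        frames.append(frame)

    cap.release()

    # Analyze subtle color changes in the face region:
    # real faces show periodic changes at the heart rate, while
    # deepfakes tend to show weak or incoherent patterns.
    # analyze_temporal_color_patterns is a helper that must be defined
    # separately; it is not part of OpenCV or NumPy.
    # Note: holding every frame in memory only suits short clips.
    return analyze_temporal_color_patterns(frames)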
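
One possible shape for the `analyze_temporal_color_patterns` helper called above is sketched here. It is a deliberately simplified stand-in, not Intel's method: it assumes the frames are already cropped to the face, averages the green channel per frame, and reports what fraction of the signal's spectral power falls in a plausible heart-rate band. The `fps` and `band` parameters and their defaults are assumptions added for illustration.

```python
import numpy as np


def analyze_temporal_color_patterns(frames, fps=30.0, band=(0.7, 4.0)):
    """Return the fraction of spectral power inside the pulse band (0 to 1)."""
    if len(frames) < 2:
        return 0.0

    # OpenCV frames are BGR, so channel index 1 is green
    signal = np.array([float(np.mean(f[:, :, 1])) for f in frames])
    signal = signal - signal.mean()  # remove the constant (DC) component

    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)

    total = power[1:].sum()  # ignore the DC bin
    if total <= 0:
        return 0.0  # a perfectly static signal carries no pulse information

    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(power[in_band].sum() / total)
```

A value near 1 means most of the color variation is periodic at heart-rate frequencies; a value near 0 means little or none is. Lighting flicker and head motion can also put power in this band, so the number needs context.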

Legal Admissibility

Daubert Standard (US Courts)

Criteria for Expert Testimony:

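Testability: Can the method be (and has it been) tested?
Peer Review: Has it been subjected to peer review and publication?
Error Rate: Is the known or potential error rate documented?
Standards: Do standards exist that control how the technique is applied?
General Acceptance: Is the method accepted in the relevant scientific community?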

Case Law: Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993)

Documentation Requirements

## Forensic Report Template

### Case Information
- Case Number: [ID]
- Date: [YYYY-MM-DD]
- Investigator: [Name, Credentials]
- Qualifications: [Certifications, Experience]

### Evidence Description
- File: [filename]
- Hash (SHA-256): [hash]
- Size: [bytes]
- Source: [origin]
- Acquisition Method: [how obtained]

### Analysis Methods
1. Method: [Name]
   - Tool: [Software version]
   - Standard: [ISO/IEEE reference]
   - Result: [Finding]
   - Confidence: [percentage]

### Findings
- Conclusion: [AUTHENTIC / MANIPULATED / INCONCLUSIVE]
- Confidence Level: [percentage]
- Supporting Evidence: [details]
- Alternative Explanations: [considered]

### Chain of Custody
[Complete log with timestamps and signatures]

### Limitations
- Known limitations of methods
- Assumptions made
- Scope of analysis

### Signature
[Digital signature with timestamp]

Statistical Analysis

Benford’s Law Application

Research: Applying Benford’s Law to detect manipulation

import numpy as np
from collections import Counter

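def benfords_law_test(pixel_values):
    """
    Compare first-digit frequencies against Benford's Law.
    The law is usually applied to values derived from the image, such
    as block-DCT coefficients or gradient magnitudes; raw 8-bit pixel
    intensities (0-255) generally do not follow it.
    Returns True when the deviation is significant at the 5% level.
    """
    # Extract first significant digits; scientific notation also
    # handles values below 1 (0.05 -> '5.000000e-02' -> 5)
    first_digits = [int(f"{abs(float(x)):e}"[0])
                    for x in pixel_values if x != 0]
    n = len(first_digits)
    if n == 0:
        raise ValueError("need at least one non-zero value")

    # Count observed occurrences of each leading digit
    counts = Counter(first_digits)
    observed = [counts[d] for d in range(1, 10)]

    # Benford's expected counts
    expected = [n * np.log10(1 + 1/d) for d in range(1, 10)]

    # Pearson chi-square statistic, computed on counts (not proportions)
    chi_square = sum((o - e)**2 / e for o, e in zip(observed, expected))

    # Critical value for 8 degrees of freedom at the 5% level: 15.507
    return chi_square > 15.507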
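
For reference, the quantities behind this test can be written out as follows. Here $N$ is the number of non-zero values and $O_d$ is the observed count of leading digit $d$; both symbols are introduced here for illustration.

```latex
% Benford's expected probability for leading digit d
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, \dots, 9

% Examples: P(1) \approx 0.301, \quad P(9) \approx 0.046

% Pearson chi-square statistic on counts, with 9 - 1 = 8 degrees of freedom
\chi^2 = \sum_{d=1}^{9} \frac{\bigl(O_d - N\,P(d)\bigr)^2}{N\,P(d)}
```

The statistic is compared against 15.507, the critical value of the chi-square distribution with 8 degrees of freedom at the 5% significance level. Because the statistic grows with $N$, very large samples can reject the null hypothesis for deviations that are too small to matter in practice.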

Timeline Reconstruction

Event Sequencing

class TimelineAnalyzer:
    def reconstruct_timeline(self, evidence_files):
        """
        Build chronological timeline of events
        """
        events = []
        
        for file in evidence_files:
            metadata = self.extract_metadata(file)
            
            events.append({
                'timestamp': metadata['created'],
                'event': 'FILE_CREATED',
                'file': file,
                'source': metadata.get('camera_model')
            })
            
            if metadata['modified'] != metadata['created']:
                events.append({
                    'timestamp': metadata['modified'],
                    'event': 'FILE_MODIFIED',
                    'file': file
                })
        
        # Sort chronologically
        events.sort(key=lambda x: x['timestamp'])
        return events
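
The class above relies on an `extract_metadata` method that is not shown. A minimal sketch of what it could return is given below, using file system timestamps only. This is an assumption about the intended interface: the `created`, `modified`, and `camera_model` keys are inferred from how the timeline code reads them, and a real implementation would likely merge in embedded metadata from a tool such as exiftool.

```python
import os
from datetime import datetime, timezone


def extract_metadata(filepath):
    """Return file system timestamps in UTC; camera_model needs an EXIF reader."""
    stat = os.stat(filepath)

    # st_birthtime is the true creation time where the platform provides it;
    # on Linux the fallback st_ctime is the last inode change instead
    created_ts = getattr(stat, "st_birthtime", stat.st_ctime)

    return {
        "created": datetime.fromtimestamp(created_ts, tz=timezone.utc),
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
        "camera_model": None,  # not available from the file system
    }
```

File system timestamps change when a file is copied or downloaded, so a timeline built from them describes the copy on the analyst's disk rather than the original recording.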

Multimodal Deepfake Detection

Approach: Combining multiple detection methods

class MultimodalDetector:
    def analyze(self, video_path):
        """
        Combine spatial, temporal, and frequency analysis
        """
        results = {
            'spatial': self.spatial_analysis(video_path),
            'temporal': self.temporal_analysis(video_path),
            'frequency': self.frequency_analysis(video_path),
            'biological': self.biological_signal_analysis(video_path)
        }
        
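        # Aggregate results into a manipulation score in [0, 1]
        confidence = self.aggregate_results(results)
        # A score at or below 0.7 does not establish authenticity, so the
        # middle band is reported as INCONCLUSIVE; 0.3 is an assumed cutoff
        if confidence > 0.7:
            verdict = 'MANIPULATED'
        elif confidence < 0.3:
            verdict = 'AUTHENTIC'
        else:
            verdict = 'INCONCLUSIVE'
        return {
            'verdict': verdict,
            'confidence': confidence,
            'details': results
        }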
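
The `aggregate_results` step is left undefined above. One simple option is a weighted mean, sketched below as a standalone function. It assumes every analysis returns a manipulation score between 0 and 1, which the original does not state, and the `weights` parameter is a hypothetical addition for methods that have proven more reliable in validation.

```python
def aggregate_results(results, weights=None):
    """Weighted mean of per-method scores, each assumed to lie in [0, 1]."""
    if not results:
        raise ValueError("no detector results to aggregate")

    # Default to equal weights when none are supplied
    if weights is None:
        weights = {name: 1.0 for name in results}

    total_weight = sum(weights[name] for name in results)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")

    score = sum(results[name] * weights[name] for name in results) / total_weight
    # Clamp in case an upstream method returns a value outside [0, 1]
    return min(1.0, max(0.0, score))
```

A weighted mean treats the methods as independent votes. If two methods react to the same artifact, their agreement is weaker evidence than the combined score suggests.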

Research Citations

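Verdoliva, L. (2020) - “Media Forensics and DeepFakes: An Overview” IEEE Journal of Selected Topics in Signal Processing, 14(5), 910-932
- DOI: 10.1109/JSTSP.2020.3002101
Tolosana, R., et al. (2020) - “DeepFakes and Beyond: A Survey of Face Manipulation and Fake Detection” Information Fusion, 64, 131-148
- DOI: 10.1016/j.inffus.2020.06.014
ISO/IEC 27037:2012 - Guidelines for identification, collection, acquisition and preservation of digital evidence
Matern, F., Riess, C., & Stamminger, M. (2019) - “Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations” IEEE Winter Applications of Computer Vision Workshops (WACVW)
Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993)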
