Forensic Analysis
Digital Forensics for Deepfakes
Metadata Examination
Standard: EXIF (Exchangeable Image File Format) for still images; MP4 video carries QuickTime-style metadata, which exiftool also reads
# Extract comprehensive metadata
exiftool -a -G1 suspicious_video.mp4
# Key indicators:
# - Software: Check for deepfake tools
# - CreateDate vs ModifyDate: Large gaps suspicious
# - GPS: Location consistency
# - Camera Model: Matches claimed source?
Research: Verdoliva, L. (2020) - “Media Forensics and DeepFakes: An Overview,” IEEE Journal of Selected Topics in Signal Processing, 14(5), 910-932. DOI: 10.1109/JSTSP.2020.3002101
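The indicators listed above can also be checked in code. The following is a minimal sketch, assuming the metadata is exported with exiftool's JSON output (`-j`) and that the relevant tags appear under the names `Software`, `CreateDate`, `ModifyDate`, and `Model`; actual tag names vary by container and device. The function names, the watch list of tool names, and the one-day gap threshold are illustrative assumptions, not part of any standard.

```python
import json
import subprocess
from datetime import datetime, timedelta

# Hypothetical watch list; extend it for the tools relevant to a case.
SUSPECT_SOFTWARE = ("deepfacelab", "faceswap")


def read_metadata(path):
    """Run exiftool with JSON output (-j) and return the tag dict for one file."""
    out = subprocess.run(
        ["exiftool", "-j", path], capture_output=True, text=True, check=True
    ).stdout
    return json.loads(out)[0]  # exiftool emits a JSON array, one object per file


def _parse_exif_date(value):
    """Parse 'YYYY:MM:DD HH:MM:SS' (ignoring any suffix); return None on failure."""
    if not value:
        return None
    try:
        return datetime.strptime(str(value)[:19], "%Y:%m:%d %H:%M:%S")
    except ValueError:
        return None


def flag_metadata_indicators(tags, claimed_model=None, max_gap=timedelta(days=1)):
    """Return a list of indicator names that deserve a closer look."""
    flags = []
    software = str(tags.get("Software", "")).lower()
    if any(name in software for name in SUSPECT_SOFTWARE):
        flags.append("software")
    created = _parse_exif_date(tags.get("CreateDate"))
    modified = _parse_exif_date(tags.get("ModifyDate"))
    if created and modified and abs(modified - created) > max_gap:
        flags.append("date_gap")
    if claimed_model and tags.get("Model") and tags["Model"] != claimed_model:
        flags.append("camera_model")
    return flags
```

A flag here is only a prompt for manual review: metadata is easy to strip or forge, so an empty result does not establish authenticity.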
File System Analysis
import os
import hashlib
from datetime import datetime, timezone

class ForensicAnalyzer:
    def analyze_file(self, filepath):
        """
        Collect size, timestamps (UTC), and hashes for one file
        """
        stat = os.stat(filepath)
        return {
            'size': stat.st_size,
            # st_ctime is creation time on Windows but inode-change time on Unix
            'created': datetime.fromtimestamp(stat.st_ctime, tz=timezone.utc),
            'modified': datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
            'accessed': datetime.fromtimestamp(stat.st_atime, tz=timezone.utc),
            # MD5 is kept for legacy tooling; rely on SHA-256 for integrity
            'md5': self.calculate_hash(filepath, 'md5'),
            'sha256': self.calculate_hash(filepath, 'sha256')
        }

    def calculate_hash(self, filepath, algorithm='sha256'):
        # Read in chunks so large video files are never loaded whole
        h = hashlib.new(algorithm)
        with open(filepath, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                h.update(chunk)
        return h.hexdigest()
Chain of Custody
Evidence Preservation
Standard: ISO/IEC 27037:2012 - Digital Evidence Guidelines
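import os
import hashlib
from datetime import datetime, timezone

class ChainOfCustody:
    def __init__(self):
        self.log = []

    def calculate_hash(self, filepath, algorithm='sha256'):
        # Hash the evidence file in chunks
        h = hashlib.new(algorithm)
        with open(filepath, 'rb') as f:
            for chunk in iter(lambda: f.read(4096), b""):
                h.update(chunk)
        return h.hexdigest()

    def acquire_evidence(self, source, investigator):
        """
        Document evidence acquisition
        """
        entry = {
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'action': 'ACQUIRED',
            'source': source,
            'investigator': investigator,
            'hash': self.calculate_hash(source),
            'location': os.path.abspath(source)
        }
        self.log.append(entry)
        return entry

    def transfer_custody(self, from_person, to_person, reason):
        """
        Document custody transfer
        """
        entry = {
            'timestamp': datetime.now(timezone.utc).isoformat(),
            'action': 'TRANSFERRED',
            'from': from_person,
            'to': to_person,
            'reason': reason
        }
        self.log.append(entry)
        return entry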
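A custody log held in a plain Python list can be edited after the fact without leaving a trace. One common way to make such a log tamper-evident is to chain the entries with hashes, so that each entry's seal covers both its own content and the previous seal. The sketch below illustrates the idea; `seal_entry`, `verify_log`, and the `prev_seal`/`seal` field names are hypothetical and are not prescribed by ISO/IEC 27037.

```python
import hashlib
import json


def seal_entry(entry, previous_seal):
    """Return a copy of entry with a seal covering it and the previous seal."""
    # sort_keys makes the serialization independent of dict insertion order
    payload = json.dumps(entry, sort_keys=True) + previous_seal
    sealed = dict(entry)
    sealed["prev_seal"] = previous_seal
    sealed["seal"] = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return sealed


def verify_log(log):
    """Recompute every seal in order; False means an entry was altered or reordered."""
    previous_seal = ""
    for sealed in log:
        entry = {k: v for k, v in sealed.items() if k not in ("prev_seal", "seal")}
        if sealed.get("prev_seal") != previous_seal:
            return False
        if seal_entry(entry, previous_seal)["seal"] != sealed.get("seal"):
            return False
        previous_seal = sealed["seal"]
    return True


# Usage: seal each entry against the seal of the one before it.
log = [seal_entry({"action": "ACQUIRED", "investigator": "A"}, "")]
log.append(seal_entry({"action": "TRANSFERRED", "from": "A", "to": "B"}, log[-1]["seal"]))
```

Entries must be JSON-serializable, which holds for the ISO-format timestamp strings used in the class above. A hash chain detects edits but does not prevent someone from rebuilding the whole chain, so the latest seal would still need to be recorded somewhere outside the log.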
Frame-Level Analysis
Compression Artifacts
Research: Matern et al. (2019) - “Exploiting Visual Artifacts to Expose Deepfakes and Face Manipulations”
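import cv2
import numpy as np

def analyze_compression_artifacts(video_path):
    """
    Deepfakes often show inconsistent compression across frames.
    Returns the standard deviation of a per-frame high-frequency score.
    """
    cap = cv2.VideoCapture(video_path)
    artifact_scores = []
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        # Convert to frequency domain
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # cv2.dct only supports even-sized arrays, so crop odd dimensions
        h, w = gray.shape
        gray = gray[:h - h % 2, :w - w % 2]
        dct = cv2.dct(np.float32(gray))
        # Analyze high-frequency components (both indices >= 32)
        high_freq = dct[32:, 32:]
        artifact_score = float(np.mean(np.abs(high_freq)))
        artifact_scores.append(artifact_score)
    cap.release()
    if not artifact_scores:
        raise ValueError(f"No frames could be read from {video_path}")
    # A large spread across frames may indicate manipulation; it is not proof on its own
    return float(np.std(artifact_scores))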
Biological Signal Detection
Method: Blood flow analysis via remote photoplethysmography (rPPG), the approach used by Intel FakeCatcher
def detect_blood_flow_inconsistencies(video_path):
    """
    Real faces show subtle, periodic color changes caused by blood flow.
    Deepfakes often lack this biological signal.
    """
    cap = cv2.VideoCapture(video_path)
    frames = []
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
        frames.append(frame)
    cap.release()
    # Analyze subtle color changes in face region
    # Real faces show periodic changes from blood flow
    # Deepfakes typically show static patterns
    # analyze_temporal_color_patterns is assumed to be defined elsewhere
    return analyze_temporal_color_patterns(frames)
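The function above delegates the real work to `analyze_temporal_color_patterns`, which is not defined here. One possible shape for that helper is sketched below. It is not FakeCatcher's algorithm: it assumes the face sits in the central region of each frame (a real pipeline would use a face detector), takes the mean green value per frame, and reports how much of the signal's power falls in a plausible heart-rate band. The `fps` and `band` parameters and the central-crop heuristic are illustrative assumptions.

```python
import numpy as np


def analyze_temporal_color_patterns(frames, fps=30.0, band=(0.7, 4.0)):
    """
    Return the fraction of green-channel signal power inside the band
    0.7-4.0 Hz (about 42-240 beats per minute). Values near 1 suggest a
    periodic, pulse-like signal; values near 0 suggest none.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    signal = []
    for frame in frames:
        h, w = frame.shape[:2]
        # Assumption: the face occupies the central half of the frame.
        roi = frame[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
        signal.append(roi[:, :, 1].mean())  # index 1 is green in BGR and RGB
    signal = np.asarray(signal, dtype=np.float64)
    signal -= signal.mean()  # remove the constant (DC) component
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    total = power[1:].sum()  # ignore the DC bin
    if total == 0:
        return 0.0  # perfectly static color: no temporal signal at all
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    return float(power[in_band].sum() / total)
```

The frame rate matters: passing the wrong `fps` shifts every frequency, so it should be read from the video (for example with `cap.get(cv2.CAP_PROP_FPS)`) instead of relying on the default.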
Legal Admissibility
Daubert Standard (US Courts)
Criteria for Expert Testimony:
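- Testability: Can the method be (and has it been) tested?
- Peer Review: Has it been subjected to peer review and publication?
- Error Rate: Is there a known or potential error rate?
- Standards: Do standards exist that control how the technique is applied?
- General Acceptance: Is it accepted in the relevant scientific community?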
Case Law: Daubert v. Merrell Dow Pharmaceuticals, 509 U.S. 579 (1993)
Documentation Requirements
## Forensic Report Template
### Case Information
- Case Number: [ID]
- Date: [YYYY-MM-DD]
- Investigator: [Name, Credentials]
- Qualifications: [Certifications, Experience]
### Evidence Description
- File: [filename]
- Hash (SHA-256): [hash]
- Size: [bytes]
- Source: [origin]
- Acquisition Method: [how obtained]
### Analysis Methods
1. Method: [Name]
   - Tool: [Software version]
   - Standard: [ISO/IEEE reference]
   - Result: [Finding]
   - Confidence: [percentage]
### Findings
- Conclusion: [AUTHENTIC / MANIPULATED / INCONCLUSIVE]
- Confidence Level: [percentage]
- Supporting Evidence: [details]
- Alternative Explanations: [considered]
### Chain of Custody
[Complete log with timestamps and signatures]
### Limitations
- Known limitations of methods
- Assumptions made
- Scope of analysis
### Signature
[Digital signature with timestamp]
Statistical Analysis
Benford’s Law Application
Research: Applying Benford’s Law to detect manipulation
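import numpy as np
from collections import Counter

def benfords_law_test(pixel_values):
    """
    Chi-square test of leading digits against Benford's Law.
    The law is usually checked on transform-domain values such as DCT
    coefficients; raw 0-255 pixel intensities do not follow it.
    Returns True when the deviation is significant at the 5% level.
    """
    # Extract first significant digits (scientific notation handles values below 1)
    first_digits = [int(f"{abs(float(x)):.15e}"[0]) for x in pixel_values if x != 0]
    n = len(first_digits)
    if n == 0:
        raise ValueError("No nonzero values to test")
    # Count frequencies
    counts = Counter(first_digits)
    # Benford's expected counts for digits 1-9
    expected = [n * np.log10(1 + 1/d) for d in range(1, 10)]
    # Chi-square statistic, computed on counts rather than proportions
    chi_square = sum((counts[d] - e)**2 / e for d, e in zip(range(1, 10), expected))
    # Critical value for 8 degrees of freedom at the 5% significance level: 15.507
    return bool(chi_square > 15.507)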
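To see how the statistic behaves, it helps to run it on synthetic data where the answer is known. The sketch below is a standalone check, not part of any cited method: it builds the Benford probabilities, computes a count-based chi-square statistic, and compares a log-uniform sample (which follows Benford's Law when it spans whole decades) with a sample whose leading digits are uniform. The names `BENFORD`, `first_digit_counts`, and `benford_chi_square` are illustrative.

```python
import numpy as np

BENFORD = np.log10(1 + 1 / np.arange(1, 10))  # P(d) for d = 1..9


def first_digit_counts(values):
    """Count the leading significant digits 1-9 of the nonzero, finite values."""
    values = np.abs(np.asarray(values, dtype=np.float64))
    values = values[np.isfinite(values) & (values > 0)]
    # Scientific notation puts the leading significant digit first.
    digits = np.array([int(f"{v:.15e}"[0]) for v in values], dtype=int)
    return np.bincount(digits, minlength=10)[1:10]


def benford_chi_square(values):
    """Chi-square statistic of observed digit counts against Benford counts."""
    counts = first_digit_counts(values)
    n = counts.sum()
    if n == 0:
        raise ValueError("no nonzero values to test")
    expected = n * BENFORD
    return float(((counts - expected) ** 2 / expected).sum())


rng = np.random.default_rng(0)
benford_like = 10 ** rng.uniform(0, 4, 10_000)   # log-uniform over four decades
uniform_digits = rng.integers(1, 10, 10_000)     # leading digits 1-9 equally likely

print("log-uniform sample:", benford_chi_square(benford_like))
print("uniform digits:    ", benford_chi_square(uniform_digits))
```

The statistic grows with sample size, so with millions of coefficients even tiny, harmless deviations exceed the 15.507 threshold. In practice the result is better treated as one signal among several than as a pass/fail verdict.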
Timeline Reconstruction
Event Sequencing
class TimelineAnalyzer:
    def reconstruct_timeline(self, evidence_files):
        """
        Build chronological timeline of events
        """
        events = []
        for file in evidence_files:
            # extract_metadata is assumed to return a dict with 'created',
            # 'modified', and optionally 'camera_model'
            metadata = self.extract_metadata(file)
            events.append({
                'timestamp': metadata['created'],
                'event': 'FILE_CREATED',
                'file': file,
                'source': metadata.get('camera_model')
            })
            if metadata['modified'] != metadata['created']:
                events.append({
                    'timestamp': metadata['modified'],
                    'event': 'FILE_MODIFIED',
                    'file': file
                })
        # Sort chronologically
        events.sort(key=lambda x: x['timestamp'])
        return events
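`reconstruct_timeline` relies on an `extract_metadata` helper that this section does not define. A minimal sketch of one possible version follows, using only file-system timestamps plus an optional dict of already-extracted EXIF tags. The `exif_tags` parameter and the `Model` key are assumptions for illustration; the function could be attached to `TimelineAnalyzer` as a method.

```python
import os
from datetime import datetime, timezone


def extract_metadata(path, exif_tags=None):
    """Return the 'created', 'modified', and 'camera_model' fields for a file."""
    stat = os.stat(path)
    # st_birthtime is the true creation time where the platform provides it;
    # otherwise fall back to st_ctime, which on Linux is inode-change time.
    created_ts = getattr(stat, "st_birthtime", stat.st_ctime)
    return {
        "created": datetime.fromtimestamp(created_ts, tz=timezone.utc),
        "modified": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc),
        "camera_model": (exif_tags or {}).get("Model"),
    }
```

File-system timestamps change when a file is copied or downloaded, so a timeline built from them describes the evidence copy. Timestamps embedded in the media itself are usually the better source when they are available.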
Multimodal Deepfake Detection
Approach: Combining multiple detection methods
class MultimodalDetector:
    def analyze(self, video_path):
        """
        Combine spatial, temporal, and frequency analysis
        """
        # The four analysis methods and aggregate_results are assumed to be
        # implemented elsewhere on this class
        results = {
            'spatial': self.spatial_analysis(video_path),
            'temporal': self.temporal_analysis(video_path),
            'frequency': self.frequency_analysis(video_path),
            'biological': self.biological_signal_analysis(video_path)
        }
        # Aggregate results
        confidence = self.aggregate_results(results)
        return {
            'verdict': 'MANIPULATED' if confidence > 0.7 else 'AUTHENTIC',
            'confidence': confidence,
            'details': results
        }
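The aggregation step is left open above. One simple option is a weighted mean, sketched below under the assumption that every analysis returns a manipulation score between 0 and 1. The sketch also adds an INCONCLUSIVE band, matching the three outcomes in the report template, since a score just under the threshold is weak grounds for calling a file authentic. The function names, the equal default weights, and the 0.3 lower threshold are hypothetical choices.

```python
def aggregate_results(results, weights=None):
    """Weighted mean of per-method manipulation scores, each assumed in [0, 1]."""
    if not results:
        raise ValueError("no results to aggregate")
    weights = weights or {name: 1.0 for name in results}
    total = sum(weights[name] for name in results)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(results[name] * weights[name] for name in results) / total


def verdict_from_confidence(confidence, low=0.3, high=0.7):
    """Map an aggregate score to one of three verdicts."""
    if confidence > high:
        return "MANIPULATED"
    if confidence < low:
        return "AUTHENTIC"
    return "INCONCLUSIVE"


# Usage with made-up scores
scores = {"spatial": 0.8, "temporal": 0.6, "frequency": 0.9, "biological": 0.7}
confidence = aggregate_results(scores)
print(verdict_from_confidence(confidence), round(confidence, 2))
```

Weights and thresholds would normally be calibrated on labeled data, and the resulting error rates reported, which is also what the Daubert error-rate criterion asks for.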
Research Citations
- Verdoliva, L. (2020) - Media Forensics Overview
  - DOI: 10.1109/JSTSP.2020.3002101
- Tolosana, R., et al. (2020) - DeepFakes and Beyond: A Survey
  - DOI: 10.1016/j.inffus.2020.06.014
- ISO/IEC 27037:2012 - Digital Evidence Guidelines
- Matern et al. (2019) - Visual Artifacts
- Daubert v. Merrell Dow - 509 U.S. 579 (1993)