1. Introduction

I put this guide together as a structured approach to security-focused code review for Python applications. Whether you’re just starting to identify security vulnerabilities in Python code or you’re an experienced developer looking for a language-specific checklist, I’ve tried to make it useful at both levels.

Python’s dynamic typing, rich standard library, and extensive third-party ecosystem make it enormously productive, but the more I reviewed Python codebases, the more I realised these same qualities introduce security pitfalls that static analysis alone cannot always catch. What follows covers manual review strategies, common anti-patterns, recommended tooling, and vulnerability patterns organised by class, with cross-references to the intentionally vulnerable examples in this project.

Audience: Security trainees, application developers, code reviewers, and anyone evaluating Python codebases for security weaknesses.

GitHub Repo of Examples


2. Manual Review Best Practices

2.1 Trace Data from Entry to Sink

Python web frameworks (Flask, Django, FastAPI) accept user input through query parameters, form fields, JSON bodies, headers, and cookies (in Flask: request.args, request.form, request.get_json()). The approach I’ve found most effective is to trace every external input to where it is consumed: database queries, shell commands, file operations, HTTP requests, or rendered HTML. Any path from source to sink without validation or sanitisation is a potential vulnerability.

2.2 Inspect String Formatting in Sensitive Contexts

Python offers multiple string formatting mechanisms: f-strings, str.format(), % formatting, and concatenation. When any of these is used to build SQL queries, shell commands, HTML output, or URLs from user-controlled data, treat it as a high-priority finding. This is one of the first things I look for in any Python review.

2.3 Review Import Statements

Scan imports and call sites for modules and functions known to introduce risk:

  • pickle, shelve, marshal: deserialisation of untrusted data
  • os.system, os.popen, subprocess with shell=True: command injection
  • eval, exec, compile: arbitrary code execution
  • yaml.load with yaml.Loader or yaml.FullLoader: YAML deserialisation attacks
  • xml.etree.ElementTree, lxml.etree with entity resolution: XXE attacks

2.4 Check for Hardcoded Secrets

Search for string literals that look like passwords, API keys, tokens, or connection strings. Common patterns include variables named SECRET_KEY, API_KEY, PASSWORD, TOKEN, or connection strings containing ://user:pass@host.
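A quick first pass over a file can be automated with two regular expressions. The sketch below is only illustrative: the patterns, labels, and function name are assumptions of mine, it will produce false positives and negatives, and dedicated secret scanners use far larger rule sets.

```python
import re

# Illustrative pattern: a secret-looking name assigned a string literal.
SECRET_ASSIGNMENT = re.compile(
    r"""\w*(?:secret|key|passw(?:or)?d|token)\w*["']?\]?\s*=\s*b?(["'])[^"']{4,}\1""",
    re.IGNORECASE,
)
# Illustrative pattern: scheme://user:pass@host inside any string.
CREDENTIAL_URL = re.compile(
    r"[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s:@]+@[^\s\"']+", re.IGNORECASE
)


def find_hardcoded_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line number, finding label) for lines that look like hardcoded secrets."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SECRET_ASSIGNMENT.search(line):
            findings.append((lineno, "secret-like assignment"))
        elif CREDENTIAL_URL.search(line):
            findings.append((lineno, "credentials in URL"))
    return findings
```

Values read from the environment, such as password = os.environ["DB_PASSWORD"], are not flagged because the right-hand side is not a string literal.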

2.5 Examine Error Handling

Look for broad except Exception (or bare except:) blocks that expose stack traces, internal paths, or database details to the caller. In Flask, check for custom error handlers that return traceback.format_exc() or raw exception messages.
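The usual fix is to log the details server-side and return only a generic message. The following is a minimal, framework-free sketch of that idea; the function name, logger name, and the incident_id field are hypothetical, and in Flask the returned dict would typically be wrapped with jsonify inside an error handler.

```python
import logging
import uuid

logger = logging.getLogger("app.errors")


def safe_error_response(exc: Exception) -> tuple[dict, int]:
    """Log full details server-side; return a generic body plus a reference ID."""
    incident_id = uuid.uuid4().hex
    # exc_info attaches the traceback to the log record, not to the response.
    logger.error("Unhandled error [%s]", incident_id, exc_info=exc)
    return {"error": "Internal server error", "incident_id": incident_id}, 500
```

The incident ID lets support staff correlate a user report with the logged traceback without exposing the traceback itself.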

2.6 Evaluate Cryptographic Choices

Flag uses of hashlib.md5() or hashlib.sha1() for password hashing or integrity verification. Check for DES, ECB mode, or hardcoded encryption keys. Verify that security-sensitive tokens come from secrets or os.urandom, not from the random module, which is not designed for cryptographic use.

2.7 Assess Access Control Logic

For every endpoint, verify:

  • Authentication is checked before any business logic executes
  • Authorisation checks use server-side session data, not client-supplied headers (e.g., X-User-Role)
  • Object-level access control prevents users from accessing resources belonging to others (IDOR)

2.8 Review Configuration Defaults

Check the Flask configuration for DEBUG = True, and check responses for permissive CORS (Access-Control-Allow-Origin: *), missing security headers, and verbose error output in production code.


3. Common Security Pitfalls

3.1 Injection

Anti-Pattern | Risk
"SELECT * FROM t WHERE x = '" + user_input + "'" | SQL injection via string concatenation
"SELECT ... ORDER BY {} {}".format(col, order) | SQL injection via unvalidated ORDER BY clause
os.popen("ping -c 3 " + host) | Command injection via os.popen
subprocess.Popen(cmd, shell=True) | Command injection when cmd includes user data
os.system("tar czf /tmp/{}.tar.gz {}".format(name, path)) | Command injection via os.system
Rendering user input directly into HTML with .format() | Stored and reflected XSS
JSONP callback without sanitisation | XSS via callback parameter

3.2 Broken Access Control

Anti-Pattern | Risk
Endpoint returns full user record (including SSN, salary) without field filtering | Excessive data exposure
Authorisation check reads role from X-User-Role header instead of session | Client-side role spoofing
No ownership check on document access/update | IDOR: any authenticated user can read/modify any resource
Debug/config endpoints with no authentication | Information disclosure of secrets and credentials

3.3 Cryptographic Failures

Anti-Pattern | Risk
hashlib.md5(password.encode()).hexdigest() for password storage | Weak, unsalted password hashing
DES.new(key, DES.MODE_ECB) | Deprecated cipher in insecure mode
random.seed(int(time.time())) for token generation | Predictable PRNG seeding
Hardcoded ENCRYPTION_KEY = b"s3cr3t!!" | Key exposure in source code

3.4 Insecure Design

Anti-Pattern | Risk
Plaintext passwords stored in dictionaries | No hashing at all
Error responses revealing usernames, email existence, or attempt counts | User enumeration
Password reset tokens returned in API response body | Token leakage
traceback.format_exc() returned to client | Stack trace information disclosure

3.5 Security Misconfiguration

Anti-Pattern | Risk
app.run(debug=True) in production code | Debug mode exposes interactive debugger
app.config["SECRET_KEY"] = "changeme" | Weak/default secret key
Access-Control-Allow-Origin reflecting any Origin header | Overly permissive CORS
etree.XMLParser(resolve_entities=True, no_network=False) | XXE via XML entity resolution
Server header exposing framework and Python version | Technology fingerprinting

3.6 Vulnerable Components

Anti-Pattern | Risk
yaml.load(payload, Loader=yaml.FullLoader) | YAML deserialisation (FullLoader still allows some unsafe constructs)
Template(user_input) with Jinja2 | Server-side template injection (SSTI)
Outdated package versions in requirements.txt | Known CVEs in dependencies

3.7 Integrity Failures (Deserialisation)

Anti-Pattern | Risk
pickle.loads(untrusted_data) | Arbitrary code execution via pickle
yaml.load(data, Loader=yaml.Loader) | Arbitrary object instantiation via YAML
exec(compile(code, ...)) with remote code | Remote code execution
importlib.import_module(user_input) | Arbitrary module loading
urllib.request.urlretrieve(url, path) without validation | Downloading and executing untrusted code

3.8 Logging and Monitoring Failures

Anti-Pattern | Risk
Login response includes password and api_key fields | Credential leakage in responses
No logging of failed authentication attempts | Brute-force attacks go undetected
No logging of privilege changes (role updates, deactivations) | Unauthorised changes are invisible
Bulk operations with no audit trail | Mass data exfiltration undetected
Data export endpoint returns raw passwords | Sensitive data in export payloads

3.9 SSRF

Anti-Pattern | Risk
urllib.request.urlopen(user_url) without allowlist | Unrestricted SSRF
Blocklist checking only localhost and 127.0.0.1 (missing 0.0.0.0, [::1], decimal IPs) | SSRF blocklist bypass
Proxy endpoint accepting arbitrary base_url parameter | Open proxy to internal services
Webhook callback URLs not validated against internal ranges | SSRF via webhook registration

3.10 Race Conditions

Anti-Pattern | Risk
Check-then-act on self.balance without locking | Double-spend / negative balance
check_availability() then reserve() without atomicity | Overselling tickets
Read-modify-write on shared file without file locking | Lost counter updates
Singleton __new__ with time.sleep and no lock | Multiple singleton instances

4. Recommended Tooling

4.1 Bandit

Bandit is a Python-specific static analysis tool that finds common security issues by examining the AST of Python source files.

Installation:

pip install bandit

Basic usage (scan a single file):

bandit injection/sql-injection/python/app.py

Scan an entire directory recursively, excluding tests:

bandit -r security-bug-examples/ -x tests

Generate a JSON report:

bandit -r security-bug-examples/ -f json -o bandit_report.json

Filter by severity (medium and above):

bandit -r security-bug-examples/ -ll

What Bandit catches: Use of eval/exec, pickle.loads, subprocess with shell=True, os.system, os.popen, hardcoded passwords, weak cryptographic functions (md5, sha1, DES), yaml.load without SafeLoader, binding to 0.0.0.0, and debug=True in Flask.

4.2 Semgrep

Semgrep is a multi-language static analysis tool with pattern-based rules. It supports custom rules and has a large community rule registry.

Installation:

pip install semgrep
# or
brew install semgrep

Scan with the default Python security ruleset:

semgrep --config "p/python" security-bug-examples/

Scan with OWASP Top 10 rules:

semgrep --config "p/owasp-top-ten" security-bug-examples/

Scan a single file:

semgrep --config "p/python" injection/sql-injection/python/app.py

Run with auto configuration (recommended for first-time scans):

semgrep --config auto security-bug-examples/

What Semgrep catches: SQL injection patterns, command injection, XSS in Flask templates, insecure deserialisation (pickle, yaml), SSRF patterns, hardcoded secrets, weak cryptography, and many framework-specific issues. Semgrep’s pattern matching is particularly effective at detecting string-formatting-based injection that Bandit may miss.


5. Language-Specific Vulnerability Patterns

5.1 SQL Injection (CWE-89)

Pattern: String concatenation in queries

# Vulnerable: user input concatenated directly into the query
query = "SELECT * FROM products WHERE name LIKE '%" + keyword + "%'"
cursor = db.execute(query)

Pattern: str.format() in query construction

# Vulnerable: str.format() builds the WHERE clause from user input
clauses.append("status = '{}'".format(filters["status"]))

Pattern: Unvalidated ORDER BY injection

# Vulnerable: neither sort_by nor the 'order' parameter (ASC/DESC) is validated
query = "SELECT * FROM products ORDER BY {} {}".format(sort_by, order)

Safe alternative:

cursor = db.execute("SELECT * FROM products WHERE name LIKE ?", (f"%{keyword}%",))
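Placeholders cannot stand in for identifiers or keywords, so the ORDER BY pattern needs a different fix: map the request values onto an allowlist before building the query. The sketch below assumes a hypothetical set of sortable columns and a helper name of my choosing.

```python
# Hypothetical allowlist; list only columns that are safe to sort on.
SORTABLE_COLUMNS = {"name", "price", "created_at"}
SORT_DIRECTIONS = {"ASC", "DESC"}


def build_product_query(sort_by: str, order: str) -> str:
    """Return an ORDER BY query built only from allowlisted identifiers."""
    if sort_by not in SORTABLE_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_by!r}")
    direction = order.upper()
    if direction not in SORT_DIRECTIONS:
        raise ValueError(f"unsupported sort direction: {order!r}")
    # Interpolation is acceptable here: both values are known constants, not raw input.
    return f"SELECT * FROM products ORDER BY {sort_by} {direction}"
```

Any values in a WHERE clause alongside this would still be passed as bound parameters, as in the safe alternative above.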

5.2 Command Injection (CWE-78)

Pattern: os.popen() with string concatenation

result = os.popen("ping -c 3 " + host).read()

Pattern: subprocess.Popen with shell=True

cmd = "dig {} {} +short".format(record_type, domain)
proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)

Pattern: os.system() with formatted strings

cmd = "tar czf /tmp/{}.tar.gz {}".format(sanitized_name, filepath)
os.system(cmd)

Safe alternative:

# Use list form without shell=True
subprocess.run(["ping", "-c", "3", host], capture_output=True, text=True)
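List-form arguments stop shell metacharacters from being interpreted, but a value that begins with "-" can still be read by the target program as an option. For the dig pattern above, a hedged sketch is to validate both inputs before building the argument list; the allowed record types, the hostname pattern, and the function names are illustrative assumptions.

```python
import re
import subprocess

# Hypothetical policy: only these record types may be queried.
ALLOWED_RECORD_TYPES = {"A", "AAAA", "MX", "TXT", "NS", "CNAME"}
# Conservative hostname pattern: dot-separated labels of letters, digits, hyphens.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
HOSTNAME_RE = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def build_dig_command(record_type: str, domain: str) -> list[str]:
    """Validate inputs and return an argv list suitable for subprocess.run."""
    record_type = record_type.upper()
    if record_type not in ALLOWED_RECORD_TYPES:
        raise ValueError(f"unsupported record type: {record_type!r}")
    if len(domain) > 253 or not HOSTNAME_RE.fullmatch(domain):
        raise ValueError(f"invalid domain: {domain!r}")
    return ["dig", record_type, domain, "+short"]


def run_dig(record_type: str, domain: str) -> str:
    """Run dig without a shell; each list element reaches dig as one argument."""
    argv = build_dig_command(record_type, domain)
    result = subprocess.run(argv, capture_output=True, text=True, timeout=10, check=False)
    return result.stdout
```

Because the hostname pattern requires each label to start with a letter or digit, inputs such as "-f/etc/passwd" are rejected before dig ever sees them.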

5.3 Cross-Site Scripting (XSS) (CWE-79)

Pattern: Stored XSS via .format() in HTML

# Vulnerable: post content rendered without escaping
page += "<h2>{}</h2>".format(post["title"])
page += "<div>{}</div>".format(post["content"])

Pattern: Reflected XSS in search results

# Vulnerable: query reflected back into the page
page += "<p>Results for: <em>{}</em></p>".format(query)

Pattern: XSS via href attribute (javascript: protocol)

page += '<a href="{}">{}</a>'.format(user["website"], user["website"])

Pattern: JSONP callback injection

response = make_response("{}({})".format(callback, data))

Safe alternative:

from markupsafe import escape
page += "<h2>{}</h2>".format(escape(post["title"]))
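HTML escaping alone does not cover the href and JSONP patterns: a javascript: URL contains nothing that needs escaping, and a callback name is emitted as script rather than markup. A minimal sketch of the extra checks follows, using only the standard library; the allowed schemes, the callback pattern, the length limit, and the function names are assumptions of mine.

```python
import html
import re
from urllib.parse import urlsplit

SAFE_URL_SCHEMES = {"http", "https"}
# Hypothetical policy: callbacks must look like a dotted JavaScript identifier.
JSONP_CALLBACK_RE = re.compile(r"[A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*", re.ASCII)


def render_link(url: str) -> str:
    """Render an anchor only for http(s) URLs; otherwise show inert escaped text."""
    url = url.strip()
    if urlsplit(url).scheme.lower() not in SAFE_URL_SCHEMES:
        return html.escape(url)
    safe = html.escape(url, quote=True)  # escapes quotes for the attribute context
    return f'<a href="{safe}" rel="noopener noreferrer">{safe}</a>'


def validate_jsonp_callback(callback: str) -> str:
    """Return the callback name unchanged if it matches the policy, else raise."""
    if len(callback) > 64 or not JSONP_CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback name")
    return callback
```

Under this policy relative URLs are also shown as plain text, which may be stricter than some applications want.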

5.4 Broken Access Control (CWE-200, CWE-284, CWE-639)

Pattern: No authentication on sensitive endpoint

@app.route("/api/users/<int:user_id>", methods=["GET"])
def get_user_profile(user_id):
    user = USERS.get(user_id)
    return jsonify(user)  # Returns SSN and salary; no auth check

Pattern: Client-controlled authorisation header

role = request.headers.get("X-User-Role", "employee")
if role != "admin":
    return jsonify({"error": "Admin access required"}), 403

Pattern: Missing object-level authorisation

# Any authenticated user can access any document; no ownership check
doc = DOCUMENTS.get(doc_id)
return jsonify(doc)
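A safe alternative for the object-level case is to compare the resource owner with the identity held in the server-side session. The sketch below is framework-free and uses a hypothetical in-memory store, exception classes, and an owner_id field that are not part of the original examples.

```python
class NotFound(Exception):
    """Raised when the requested document does not exist."""


class Forbidden(Exception):
    """Raised when the caller may not access the document."""


# Hypothetical store; owner_id is an assumed field for illustration.
DOCUMENTS = {
    1: {"owner_id": 10, "title": "Q3 budget"},
    2: {"owner_id": 20, "title": "Offer letter"},
}


def get_document_for_user(session_user: dict, doc_id: int) -> dict:
    """session_user must come from the server-side session, never from request headers."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None:
        raise NotFound(doc_id)
    is_owner = doc["owner_id"] == session_user["id"]
    is_admin = session_user.get("role") == "admin"
    if not (is_owner or is_admin):
        raise Forbidden(doc_id)
    return doc
```

Some applications raise NotFound in both failure cases so that callers cannot learn which document IDs exist.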

5.5 Cryptographic Failures (CWE-327, CWE-328, CWE-330)

Pattern: MD5 for password hashing

password_hash = hashlib.md5("admin123".encode()).hexdigest()

Pattern: DES in ECB mode with hardcoded key

ENCRYPTION_KEY = b"s3cr3t!!"
cipher = DES.new(ENCRYPTION_KEY, DES.MODE_ECB)

Pattern: Predictable PRNG for session tokens

random.seed(int(time.time()) + user_id)
token = "".join(random.choice(chars) for _ in range(32))

Safe alternative:

import bcrypt
hashed = bcrypt.hashpw(password.encode(), bcrypt.gensalt())

import secrets
token = secrets.token_hex(32)
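Where adding the third-party bcrypt package is not an option, the standard library offers a salted, iterated alternative. This is a sketch only: the iteration count is an assumption that should be tuned to current guidance and hardware, and the function names are hypothetical.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 600_000  # assumption: adjust to current guidance and hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a fresh random salt per password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, PBKDF2_ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Because each password gets its own salt, two users with the same password end up with different stored digests.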

5.6 Insecure Design (CWE-209, CWE-522)

Pattern: Plaintext password storage

USERS = {
    1: {"username": "admin", "password": "Adm1n_Pr0d!"},  # other fields elided
}

Pattern: Verbose error responses

return jsonify({
    "error": "Report generation failed",
    "details": str(e),
    "trace": traceback.format_exc(),
    "database": DATABASE_PATH,
}), 500

Pattern: User enumeration via distinct error messages

return jsonify({"error": f"No account found for username '{username}'"}), 404
# vs.
return jsonify({"error": "Incorrect password", "attempts": user["failed_attempts"]}), 401
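One way to avoid the enumeration leak is to return the same body and status for every failure, and to do roughly the same amount of work whether or not the account exists. The sketch below is framework-free; the user store, iteration count, and function names are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumption: tune to current guidance


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)


# Hypothetical user store holding salted hashes rather than plaintext passwords.
_ALICE_SALT = os.urandom(16)
USERS = {"alice": {"salt": _ALICE_SALT, "hash": _hash("s3cure-pass", _ALICE_SALT)}}

# Dummy credentials so unknown usernames still trigger a hash computation.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("dummy", _DUMMY_SALT)

GENERIC_FAILURE = ({"error": "Invalid username or password"}, 401)


def login(username: str, password: str) -> tuple[dict, int]:
    """Return an identical failure response for unknown users and wrong passwords."""
    user = USERS.get(username)
    salt, expected = (user["salt"], user["hash"]) if user else (_DUMMY_SALT, _DUMMY_HASH)
    matches = hmac.compare_digest(_hash(password, salt), expected)
    if user is None or not matches:
        return GENERIC_FAILURE
    return {"status": "ok"}, 200
```

Attempt counters can still be kept server-side for lockout and alerting; they just should not be echoed to the caller.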

5.7 Security Misconfiguration (CWE-16, CWE-611)

Pattern: XXE via lxml with entity resolution

parser = etree.XMLParser(resolve_entities=True, no_network=False)
tree = etree.fromstring(raw_xml, parser=parser)

Pattern: Debug mode and verbose error handler

app.run(host="0.0.0.0", port=5008, debug=True)

@app.errorhandler(Exception)
def handle_exception(e):
    return jsonify({"trace": traceback.format_exc()}), 500

Pattern: Overly permissive CORS

origin = request.headers.get("Origin", "*")
response.headers["Access-Control-Allow-Origin"] = origin
response.headers["Access-Control-Allow-Credentials"] = "true"
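A safer shape for the CORS pattern is to echo the Origin header only when it appears in an explicit allowlist. The origins and function name below are placeholders for illustration.

```python
from typing import Optional

# Hypothetical allowlist of trusted front-end origins.
ALLOWED_ORIGINS = {"https://app.example.com", "https://admin.example.com"}


def cors_headers(request_origin: Optional[str]) -> dict[str, str]:
    """Return CORS headers for trusted origins, or no CORS headers at all."""
    if request_origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response varies with the requesting origin.
        "Vary": "Origin",
    }
```

In Flask, the returned mapping could be applied with response.headers.update(...) in an after_request hook.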

5.8 Vulnerable and Outdated Components (CWE-1104)

Pattern: Unsafe YAML loading

parsed = yaml.load(payload, Loader=yaml.FullLoader)

Pattern: Jinja2 SSTI via user-controlled template string

template_str = request.args.get("template", "{{ name }} - ${{ price }}")
tmpl = Template(template_str)
label = tmpl.render(name=product["name"], price=product["price"])

5.9 Integrity Failures (Deserialisation) (CWE-502, CWE-829)

Pattern: Pickle deserialisation of untrusted data

raw_bytes = base64.b64decode(data["payload"])
workflow_data = pickle.loads(raw_bytes)

Pattern: Unsafe YAML deserialisation

workflow_data = yaml.load(raw_bytes.decode("utf-8"), Loader=yaml.Loader)

Pattern: Remote code execution via exec

response = urllib.request.urlopen(ext_url)
code = response.read().decode("utf-8")
exec(compile(code, f"<extension:{ext_name}>", "exec"))

Pattern: Dynamic module import from user input

mod = importlib.import_module(module_name)  # module_name from request
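Two hedged alternatives for the patterns above: exchange data as JSON, which can only produce plain dicts, lists, strings, and numbers, and restrict dynamic imports to an allowlist. The expected "steps" field, the allowlisted module names, and the function names are assumptions for illustration.

```python
import base64
import importlib
import json

# Hypothetical allowlist of module names that may be loaded on request.
ALLOWED_PLUGINS = {"json", "csv"}


def load_workflow(payload_b64: str) -> dict:
    """Decode a base64 JSON payload and check its basic shape."""
    raw = base64.b64decode(payload_b64, validate=True)
    data = json.loads(raw.decode("utf-8"))
    if not isinstance(data, dict) or not isinstance(data.get("steps"), list):
        raise ValueError("workflow must be an object with a 'steps' list")
    return data


def load_plugin(module_name: str):
    """Import a module only if its name is on the allowlist."""
    if module_name not in ALLOWED_PLUGINS:
        raise ValueError(f"module not allowed: {module_name!r}")
    return importlib.import_module(module_name)
```

Malformed base64, non-UTF-8 bytes, and invalid JSON all surface as ValueError subclasses, so a caller can handle rejection in one place.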

5.10 Logging and Monitoring Failures (CWE-778)

Pattern: Credentials in API responses

return jsonify({
    "token": token,
    "password": user["password"],
    "api_key": user["api_key"],
})

Pattern: No audit logging for privilege changes

target["role"] = new_role  # Role changed with no log entry
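A minimal sketch of the missing audit trail, using the standard logging module. The logger name, message format, and function names are my own choices; a production setup would also route this logger to tamper-resistant storage.

```python
import logging

audit_log = logging.getLogger("audit")


def change_role(actor: dict, target: dict, new_role: str) -> None:
    """Apply a role change and record who changed what."""
    old_role = target["role"]
    target["role"] = new_role
    audit_log.warning(
        "role_change actor=%s target=%s old=%s new=%s",
        actor["username"], target["username"], old_role, new_role,
    )


def record_failed_login(username: str, source_ip: str) -> None:
    """Log a failed attempt; %r escapes newlines in attacker-controlled usernames."""
    audit_log.warning("login_failed username=%r ip=%s", username, source_ip)
```

Note that the log entries carry identities and outcomes only, never passwords or tokens.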

5.11 SSRF (CWE-918)

Pattern: Unrestricted URL fetch

url = data.get("url", "")
resp = urllib.request.urlopen(url, timeout=10)

Pattern: Incomplete blocklist

blocked_hosts = ["localhost", "127.0.0.1"]
# Missing: 0.0.0.0, [::1], 169.254.x.x, decimal IP representations

Pattern: Open proxy via user-supplied base URL

base_url = request.args.get("base_url", "")
full_url = base_url + path
resp = urllib.request.urlopen(full_url, timeout=10)
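Rather than extending a blocklist of hostnames, a more robust check resolves the host and inspects the resulting addresses. The sketch below is an assumption-laden starting point, not a complete defence: the function name is hypothetical, and because the later fetch resolves the name again, DNS rebinding and HTTP redirects need separate handling (for example, by connecting to the validated address and disabling redirects).

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_url(url: str) -> bool:
    """Return True only if every address the host resolves to is globally routable."""
    parts = urlsplit(url)
    if parts.scheme not in {"http", "https"} or not parts.hostname:
        return False
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
        infos = socket.getaddrinfo(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except (socket.gaierror, ValueError):
        return False  # unresolvable host or malformed port
    for info in infos:
        # Drop any IPv6 scope suffix such as "%eth0" before parsing.
        address = ipaddress.ip_address(info[4][0].split("%")[0])
        if not address.is_global:
            return False  # loopback, private, link-local, unspecified, ...
    return bool(infos)
```

Checking is_global on the resolved address covers 0.0.0.0, [::1], and the 169.254.x.x metadata range without listing each one by hand.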

5.12 Race Conditions (CWE-362)

Pattern: Check-then-act without locking

def withdraw(self, amount):
    if self.balance >= amount:
        time.sleep(0.001)
        self.balance -= amount  # Another thread may have changed balance

Pattern: Read-modify-write on shared file

with open(filepath, "r") as f:
    count = int(f.read().strip())
count += 1
with open(filepath, "w") as f:
    f.write(str(count))  # Lost update if concurrent

Safe alternative:

import threading
lock = threading.Lock()

def withdraw(self, amount):
    with lock:
        if self.balance >= amount:
            self.balance -= amount
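For the read-modify-write pattern on a shared file, a thread lock is not enough when several processes touch the same file. One option on POSIX systems is an advisory lock via fcntl.flock; this sketch assumes every writer cooperates by taking the same lock, and the function name is hypothetical.

```python
import fcntl  # POSIX only; not available on Windows
import os


def increment_counter(filepath: str) -> int:
    """Atomically increment the integer stored in filepath and return the new value."""
    # Open read/write without truncating, so nothing changes before the lock is held.
    fd = os.open(filepath, os.O_RDWR | os.O_CREAT, 0o600)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # blocks until no other holder remains
        try:
            count = int(f.read().strip() or 0) + 1
            f.seek(0)
            f.truncate()
            f.write(str(count))
            f.flush()  # make the write visible before releasing the lock
            return count
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The read, increment, and write all happen while the exclusive lock is held, which closes the window in which a concurrent update could be lost.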

6. Cross-References to Examples

The table below maps each vulnerability class to the Python source file and companion documentation in this project. All paths are relative to the docs/ directory.

Vulnerability Class | Source File | Companion Doc
SQL Injection (CWE-89) | ../injection/sql-injection/python/app.py | ../injection/sql-injection/python/app_SECURITY.md
Command Injection (CWE-78) | ../injection/command-injection/python/app.py | ../injection/command-injection/python/app_SECURITY.md
Cross-Site Scripting (CWE-79) | ../injection/xss/python/app.py | ../injection/xss/python/app_SECURITY.md
Broken Access Control (CWE-200, CWE-284, CWE-639) | ../broken-access-control/python/app.py | ../broken-access-control/python/app_SECURITY.md
Cryptographic Failures (CWE-327, CWE-328, CWE-330) | ../cryptographic-failures/python/app.py | ../cryptographic-failures/python/app_SECURITY.md
Insecure Design (CWE-209, CWE-522) | ../insecure-design/python/app.py | ../insecure-design/python/app_SECURITY.md
Security Misconfiguration (CWE-16, CWE-611) | ../security-misconfiguration/python/app.py | ../security-misconfiguration/python/app_SECURITY.md
Vulnerable Components (CWE-1104) | ../vulnerable-components/python/app.py | ../vulnerable-components/python/app_SECURITY.md
Auth Failures (CWE-287, CWE-307, CWE-798) | ../auth-failures/python/app.py | ../auth-failures/python/app_SECURITY.md
Integrity Failures (CWE-502, CWE-829) | ../integrity-failures/python/app.py | ../integrity-failures/python/app_SECURITY.md
Logging/Monitoring Failures (CWE-778) | ../logging-monitoring-failures/python/app.py | ../logging-monitoring-failures/python/app_SECURITY.md
SSRF (CWE-918) | ../ssrf/python/app.py | ../ssrf/python/app_SECURITY.md
Race Condition (CWE-362) | ../race-condition/python/app.py | ../race-condition/python/app_SECURITY.md