Python Security Pitfalls Every Developer Should Know
I’ve spent a lot of time reviewing Python codebases, and the language’s readability and rapid development cycle are exactly what make it dangerous. Python is the default choice for web services, data pipelines, and automation scripts, and that same ease of use hides security pitfalls that experienced developers walk into regularly. The language’s dynamic nature (runtime evaluation, duck typing, implicit conversions, and powerful serialization) creates attack surfaces that simply don’t exist in statically typed languages. In this post, I want to cover the Python-specific anti-patterns that lead to real vulnerabilities, from the well-known pickle deserialization trap to the subtle template injection that can survive code review.
Pickle Deserialization: Arbitrary Code Execution by Design
Python’s pickle module serializes and deserializes arbitrary Python objects. The deserialization process reconstructs objects by calling their __reduce__ method, which can execute arbitrary code. Here’s the thing: this is not a bug. It’s the documented behaviour. Loading untrusted pickle data is equivalent to running eval() on attacker-controlled input, and the more I dug into how pickle works internally, the more alarming the implications became.
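To see the mechanism concretely, here is a minimal, harmless sketch: a class whose __reduce__ hands pickle a callable to invoke on load. I use print as a stand-in for what an attacker would make os.system.

```python
import pickle

# __reduce__ tells pickle how to reconstruct an object: a callable plus
# its arguments. Whatever that callable is, it runs during pickle.loads().
class Demo:
    def __reduce__(self):
        # print() is a harmless stand-in; an attacker would return
        # (os.system, ("...",)) or similar here.
        return (print, ("this ran during deserialization",))

payload = pickle.dumps(Demo())
result = pickle.loads(payload)  # prints the message as a side effect
```

Note that the payload doesn’t even reference the Demo class: it stores only the callable and its arguments, so it deserializes anywhere.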
The Easy-to-Spot Version
```python
import pickle
from flask import Flask, request

app = Flask(__name__)

@app.route("/load", methods=["POST"])
def load_data():
    data = pickle.loads(request.data)
    return {"status": "loaded", "keys": list(data.keys())}
```
Any SAST tool flags pickle.loads on untrusted input. The fix is obvious: use JSON, MessagePack, or another format that doesn’t execute code during deserialization. It still shows up in production, though, which says something about how easy it is to reach for the convenient option.
The Hard-to-Spot Version
```python
import shelve
from flask import Flask, request

app = Flask(__name__)

@app.route("/cache", methods=["POST"])
def update_cache():
    key = request.form["key"]
    value = request.form["value"]
    with shelve.open("/tmp/app_cache") as db:
        db[key] = value
    return {"status": "cached"}

@app.route("/cache/<key>")
def get_cache(key):
    with shelve.open("/tmp/app_cache") as db:
        return {"value": db.get(key, "not found")}
```
This is the one that really surprised me when I first encountered it. shelve uses pickle internally. If an attacker can write raw bytes into the shelf file (by replacing the file on disk, or through any other write primitive the app exposes), every subsequent read through shelve deserializes attacker-controlled pickle data. The word “pickle” never appears in the source code, so reviewers and SAST tools that grep for pickle.loads miss it entirely. I’ve since found this pattern hiding behind multiprocessing (which pickles objects sent between processes) and some caching and session backends too. It’s one of those things that, once you know to look for it, you start seeing everywhere.
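A quick way to convince yourself (and reviewers) that shelve is a pickle sink: write pickle bytes straight into the underlying dbm mapping and watch shelve unpickle them on read. A self-contained sketch with a benign payload and a temporary path:

```python
import dbm
import os
import pickle
import shelve
import tempfile

# shelve.Shelf stores each value as pickle bytes in a dbm mapping and
# runs pickle.loads() on every read.
path = os.path.join(tempfile.mkdtemp(), "demo_shelf")

with dbm.open(path, "c") as db:
    # Simulating an attacker with write access to the shelf file: any
    # pickle payload placed here executes when the value is read back.
    db["greeting"] = pickle.dumps("hello")  # benign stand-in payload

with shelve.open(path) as shelf:
    value = shelf["greeting"]  # pickle.loads() happens here
```

Swap the stand-in string for a __reduce__-based payload and the read on the last line becomes code execution.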
How Other Languages Handle This
Java’s ObjectInputStream has a similar problem: deserialization triggers class constructors and readObject methods. But Java’s ecosystem has developed defenses: serialization filters (ObjectInputFilter since JDK 9), allowlists, and libraries like Apache Commons IO’s ValidatingObjectInputStream. Python has no built-in deserialization filter for pickle, which is a notable gap.
```java
// Java: Deserialization with a filter (JDK 9+)
ObjectInputFilter filter = ObjectInputFilter.Config.createFilter(
    "java.util.HashMap;java.lang.String;!*"
);
ObjectInputStream ois = new ObjectInputStream(inputStream);
ois.setObjectInputFilter(filter);
Object obj = ois.readObject();
```
Python’s defence is simpler but blunter: don’t use pickle for untrusted data. Use json, msgpack, or Protocol Buffers.
eval(), exec(), and compile(): The Dynamic Execution Trap
Python’s eval() evaluates an expression and returns the result. exec() executes arbitrary statements. Both accept strings, and both are frequently used with user-controlled input in ways that create code injection vulnerabilities. These keep showing up in “quick and dirty” internal tools that somehow make it to production.
The Easy-to-Spot Version
```python
from flask import Flask, request

app = Flask(__name__)

@app.route("/calc")
def calculate():
    expr = request.args.get("expr", "0")
    result = eval(expr)
    return {"result": result}
```
An attacker sends expr=__import__('os').system('id') and gets remote code execution. Every security scanner flags this.
The Hard-to-Spot Version
```python
from flask import Flask, request

app = Flask(__name__)

ALLOWED_OPS = {"+", "-", "*", "/", "(", ")", ".", " "}

@app.route("/calc")
def calculate():
    expr = request.args.get("expr", "0")
    if all(c.isdigit() or c in ALLOWED_OPS for c in expr):
        result = eval(expr)
        return {"result": result}
    return {"error": "invalid expression"}, 400
```
The character allowlist looks safe: only digits and arithmetic operators get through. But what I found particularly interesting when I started experimenting with this is how little protection it actually buys. Even with this exact filter, an attacker can submit 9**9**9**9 (since ** is just two * characters) and pin the CPU evaluating an astronomically large power: denial of service with no letters at all. And the moment the allowlist is relaxed to admit letters and underscores (say, to support function names), the classic traversal (1).__class__.__bases__[0].__subclasses__() enumerates all loaded classes and finds one that provides code execution. Character-level filtering is the wrong abstraction because Python’s object model is navigable through plain attribute access; the safe approach is to parse the input and evaluate only the constructs you intend to support. It’s a great example of how Python’s dynamism works against you in security contexts.
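The traversal primitive itself is easy to demonstrate against an unrestricted eval. From a bare literal, with no imports and no names from the caller’s scope, you can reach object and enumerate every class loaded in the interpreter:

```python
# Recon step of the classic eval escape: pure attribute access on a
# literal, no imports, no variables from the enclosing scope.
subclasses = eval("().__class__.__bases__[0].__subclasses__()")

# The list spans the whole interpreter, not just your own code; an
# attacker scans it for a class whose methods reach os or subprocess.
class_names = sorted(cls.__name__ for cls in subclasses)
```

On a bare interpreter this already yields hundreds of classes, which is why no string filter short of a real parser holds up.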
Comparison: Go’s Approach
Go has no eval() equivalent. There is no way to execute arbitrary Go code at runtime from a string. This eliminates an entire class of vulnerabilities by language design, and honestly, it’s a trade-off worth appreciating.
```go
// Go: No eval(), must parse and evaluate explicitly
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

func safeEval(expr string) (float64, error) {
	tree, err := parser.ParseExpr(expr)
	if err != nil {
		return 0, err
	}
	return evalAST(tree)
}

func evalAST(node ast.Expr) (float64, error) {
	switch n := node.(type) {
	case *ast.BasicLit:
		// parse numeric literal
		var val float64
		fmt.Sscanf(n.Value, "%f", &val)
		return val, nil
	case *ast.BinaryExpr:
		left, err := evalAST(n.X)
		if err != nil {
			return 0, err
		}
		right, err := evalAST(n.Y)
		if err != nil {
			return 0, err
		}
		switch n.Op {
		case token.ADD:
			return left + right, nil
		case token.SUB:
			return left - right, nil
		}
	}
	return 0, fmt.Errorf("unsupported expression")
}
```
The Go approach forces explicit parsing and evaluation of only the operations you intend to support. There’s no shortcut that accidentally enables code execution.
Server-Side Template Injection (SSTI)
Python’s template engines (Jinja2, Mako, Django templates) are powerful enough to execute arbitrary code if user input is rendered as a template rather than as data within a template. SSTI is one of those vulnerability classes that I find fascinating because the attack surface is so non-obvious: you’d never guess that a template engine could be an RCE vector until you see it in action.
The Easy-to-Spot Version
```python
from flask import Flask, request
from jinja2 import Template

app = Flask(__name__)

@app.route("/greet")
def greet():
    name = request.args.get("name", "World")
    template = Template(f"Hello, {name}!")
    return template.render()
```
The user input name is interpolated into the template string before Jinja2 compiles it. An attacker sends name={{config}} and Jinja2 evaluates {{config}}, leaking the Flask configuration (including SECRET_KEY). Sending name={{''.__class__.__mro__[1].__subclasses__()}} enumerates all loaded classes. From information disclosure to full RCE, the escalation path is well-documented in the research literature.
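The standard probe for this bug is arithmetic: if {{7*7}} comes back as 49, the input reached the template compiler rather than being treated as data. A minimal reproduction outside Flask:

```python
from jinja2 import Template

# The user string is interpolated into the template source *before*
# compilation, so Jinja evaluates the expression it contains.
user_input = "{{ 7 * 7 }}"
rendered = Template(f"Hello, {user_input}!").render()
# rendered == "Hello, 49!" -- the input was executed, not displayed
```

The same two lines with user_input passed as a render variable instead (Template("Hello, {{ name }}!").render(name=user_input)) echo the braces back literally, which is the whole fix.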
The Hard-to-Spot Version
```python
from flask import Flask, request, render_template_string

app = Flask(__name__)

ERROR_TEMPLATES = {
    "not_found": "The resource '{{ name }}' was not found.",
    "forbidden": "Access to '{{ name }}' is denied.",
}

@app.route("/error")
def show_error():
    error_type = request.args.get("type", "not_found")
    name = request.args.get("name", "unknown")
    template_str = ERROR_TEMPLATES.get(error_type)
    if template_str is None:
        template_str = f"Unknown error for '{name}'."
    return render_template_string(template_str, name=name)
```
When error_type matches a known key, name is safely passed as a template variable. But when error_type is unknown, the fallback uses an f-string that interpolates name directly into the template string. The attacker sends type=unknown&name={{config}} and gets SSTI. What makes this one particularly tricky is that the vulnerability only exists in the error path, which reviewers often skim. It’s a good reminder to audit error handling with the same rigor as the happy path; that’s where some of the most interesting bugs hide.
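A sketch of the fix for this endpoint: make the fallback a constant template and let name travel only as a context variable, so no code path ever builds template source out of user input.

```python
from flask import Flask, request, render_template_string

app = Flask(__name__)

ERROR_TEMPLATES = {
    "not_found": "The resource '{{ name }}' was not found.",
    "forbidden": "Access to '{{ name }}' is denied.",
}
# The fallback is now a fixed template string, not an f-string.
FALLBACK_TEMPLATE = "Unknown error for '{{ name }}'."

@app.route("/error")
def show_error():
    error_type = request.args.get("type", "not_found")
    name = request.args.get("name", "unknown")
    template_str = ERROR_TEMPLATES.get(error_type, FALLBACK_TEMPLATE)
    # name is always template *data* here, never template *source*
    return render_template_string(template_str, name=name)
```

With this version, type=unknown&name={{config}} renders the literal text {{config}} instead of the Flask configuration.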
Comparison: Rust’s Approach
Rust’s Tera template engine compiles templates at startup from trusted files. There’s no render_string_from_user_input pattern in idiomatic Rust web frameworks, which is exactly how it should be.
```rust
// Rust (Actix-web + Tera): Templates are loaded from files, not user input
use actix_web::{web, App, HttpServer, HttpResponse};
use tera::Tera;

async fn greet(
    tmpl: web::Data<Tera>,
    query: web::Query<std::collections::HashMap<String, String>>,
) -> HttpResponse {
    let mut ctx = tera::Context::new();
    ctx.insert("name", query.get("name").unwrap_or(&"World".to_string()));
    let rendered = tmpl.render("greet.html", &ctx).unwrap();
    HttpResponse::Ok().body(rendered)
}
```
The template is a file on disk (greet.html), not a string constructed from user input. User data is always passed as context variables, never as template source.
os.path Traversal and open() Pitfalls
Python’s open() accepts any path string. Combined with user input and insufficient validation, this creates path traversal vulnerabilities. These show up depressingly often in file-serving endpoints.
The Vulnerable Pattern
```python
import os
from flask import Flask, request, send_file

app = Flask(__name__)

UPLOAD_DIR = "/var/app/uploads"

@app.route("/files/<filename>")
def get_file(filename):
    path = os.path.join(UPLOAD_DIR, filename)
    if not os.path.exists(path):
        return {"error": "not found"}, 404
    return send_file(path)
```
An attacker sends filename=../../../etc/passwd. os.path.join("/var/app/uploads", "../../../etc/passwd") produces /var/app/uploads/../../../etc/passwd, which resolves to /etc/passwd when the file is opened. The os.path.exists check passes, and the file is served. Reading through public bug bounty reports, this exact pattern accounts for a surprising number of findings.
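The mechanics fit in a few lines: os.path.join concatenates blindly and never collapses the .. segments; the traversal only materializes when the path is resolved (simulated here with os.path.normpath):

```python
import os

# join() concatenates; it does not normalize ".." segments
joined = os.path.join("/var/app/uploads", "../../../etc/passwd")

# resolution walks right out of the base directory
escaped = os.path.normpath(joined)  # "/etc/passwd"

# bonus footgun: an absolute second argument discards the base entirely
absolute = os.path.join("/var/app/uploads", "/etc/passwd")
```

The absolute-path behaviour is the reason a traversal check on its own is not enough; the resolved-prefix check shown below catches both cases.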
The Fix That Is Still Broken
```python
@app.route("/files/<filename>")
def get_file(filename):
    if ".." in filename:
        return {"error": "invalid path"}, 400
    path = os.path.join(UPLOAD_DIR, filename)
    return send_file(path)
```
Blocking .. is insufficient. On some systems, URL encoding (%2e%2e%2f) or double encoding bypasses the check. The correct approach uses os.path.realpath and verifies the resolved path starts with the intended directory:
```python
@app.route("/files/<filename>")
def get_file(filename):
    path = os.path.realpath(os.path.join(UPLOAD_DIR, filename))
    if not path.startswith(os.path.realpath(UPLOAD_DIR) + os.sep):
        return {"error": "invalid path"}, 400
    return send_file(path)
```
Comparison: Java’s Path Handling
Java’s Path.normalize() and Path.startsWith() provide a cleaner API for the same check:
import java.nio.file.Path;
import java.nio.file.Paths;
public boolean isPathSafe(String userInput, String baseDir) {
Path base = Paths.get(baseDir).toAbsolutePath().normalize();
Path resolved = base.resolve(userInput).normalize();
return resolved.startsWith(base);
}
YAML Deserialization with PyYAML
PyYAML’s yaml.load() without an explicit Loader argument has defaulted to the restricted FullLoader since version 5.1 (and PyYAML 6.0 makes the Loader argument mandatory), but older code frequently passes Loader=yaml.Loader, which allows arbitrary Python object construction: the same risk as pickle. Vulnerable yaml.load calls in configuration parsers can sit in production for years without anyone noticing.
The Vulnerable Pattern
```python
import yaml

def load_config(path):
    with open(path) as f:
        return yaml.load(f, Loader=yaml.Loader)
```
A malicious YAML file can construct arbitrary Python objects:
```yaml
!!python/object/apply:os.system
args: ['id']
```
The Safe Pattern
```python
import yaml

def load_config(path):
    with open(path) as f:
        return yaml.safe_load(f)
```
yaml.safe_load() only constructs basic Python types (dicts, lists, strings, numbers). It rejects !!python/object tags entirely. Grepping for yaml.load without SafeLoader is one of those quick checks that’s always worth doing.
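The rejection is easy to verify: safe_load raises a ConstructorError the moment it meets a python/* tag, before any object is built. A small sketch using the payload from above:

```python
import yaml

MALICIOUS = "!!python/object/apply:os.system\nargs: ['id']"

# SafeLoader has no constructor registered for python/* tags, so
# safe_load fails during construction instead of running anything.
try:
    yaml.safe_load(MALICIOUS)
    rejected = False
except yaml.constructor.ConstructorError:
    rejected = True
```

Ordinary scalar/mapping documents still load fine, so switching to safe_load is usually a drop-in change for config files.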
Comparison: Go’s YAML Handling
Go’s gopkg.in/yaml.v3 only deserializes into declared struct types. There’s no mechanism to construct arbitrary objects from YAML tags, which is a fundamentally safer design:
```go
import (
	"os"

	"gopkg.in/yaml.v3"
)

type Config struct {
	Host string `yaml:"host"`
	Port int    `yaml:"port"`
}

func loadConfig(path string) (*Config, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg Config
	err = yaml.Unmarshal(data, &cfg)
	return &cfg, err
}
```
The struct definition acts as an implicit allowlist. Fields not declared in the struct are silently ignored.
Detection Strategies
| Tool | What It Catches | Limitations |
|---|---|---|
| Bandit | pickle.loads, eval, exec, yaml.load, subprocess with shell=True | Does not follow data flow; misses indirect pickle usage (shelve, multiprocessing) |
| Semgrep | Pattern-based detection of SSTI, path traversal, deserialization | Requires rules for each pattern; custom rules needed for project-specific sinks |
| Pylint (security plugins) | Some dangerous function calls | Limited security-specific coverage |
| Safety / pip-audit | Known vulnerabilities in installed packages | Does not analyse source code |
| mypy (strict mode) | Type errors that may indicate unsafe casts | Not security-focused, but catches type confusion |
Manual Review Checklist
Here’s a checklist I’ve put together from reviewing Python codebases:
- Search for pickle, shelve, and multiprocessing: shelve and multiprocessing use pickle internally, and the indirect usage through shelve is the one that catches people most often. Check marshal too; it is a separate serializer that is also unsafe on untrusted input.
- Search for eval, exec, and compile: check whether any argument contains user input. Character allowlists don’t make these safe.
- Search for Template( and render_template_string: verify user input is never part of the template source. Pay extra attention to error paths.
- Search for yaml.load: verify Loader=yaml.SafeLoader or use yaml.safe_load. This is a quick win.
- Search for os.path.join with user input: verify the resolved path is constrained to the intended directory using realpath.
- Search for subprocess calls: verify shell=False and that arguments are passed as lists, not strings.
- Search for __import__ and importlib: dynamic imports with user-controlled module names enable code execution.
Remediation Patterns
Replace Pickle with JSON
```python
import json

# Instead of pickle.loads(data)
data = json.loads(request.data)

# Instead of pickle.dumps(obj)
serialized = json.dumps(obj)
```
For complex objects that JSON cannot represent, use msgpack, protobuf, or dataclasses-json with explicit schema definitions. Writing a bit more serialization code is a small price compared to dealing with an RCE.
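For structured objects, a plain dataclass plus json already gives you an explicit schema with none of pickle’s code execution. A sketch with a hypothetical Job type (the names here are illustrative, not from any real codebase):

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class Job:
    name: str
    retries: int

def dumps_job(job: Job) -> str:
    # asdict() flattens the dataclass into JSON-safe builtin types
    return json.dumps(asdict(job))

def loads_job(raw: str) -> Job:
    fields = json.loads(raw)
    # Job(**fields) enforces the schema: unexpected or missing keys
    # raise TypeError instead of constructing arbitrary objects
    return Job(**fields)

restored = loads_job(dumps_job(Job(name="resize", retries=3)))
```

The constructor acts as the allowlist here, which is the same design the Go YAML example below relies on.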
Use AST-Based Expression Evaluation
```python
import ast
import operator

SAFE_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def safe_eval(expr: str) -> float:
    tree = ast.parse(expr, mode="eval")
    return _eval_node(tree.body)

def _eval_node(node):
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in SAFE_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return SAFE_OPS[type(node.op)](left, right)
    raise ValueError(f"Unsupported expression: {ast.dump(node)}")
```
Parse the expression into an AST and evaluate only the nodes you explicitly support. This is safe because the AST walker never calls eval(); it interprets the tree directly. This pattern works well in production for calculator-style features.
Secure Template Rendering
```python
from flask import Flask, request, render_template

app = Flask(__name__)

@app.route("/greet")
def greet():
    name = request.args.get("name", "World")
    # Pass user input as a variable, never as template source
    return render_template("greet.html", name=name)
```
Always use render_template (file-based) instead of render_template_string. If you must use render_template_string, never interpolate user input into the template string; pass it as a keyword argument instead. This is the golden rule for Flask template security.
Key Takeaways
- Pickle is eval for objects. Never deserialize untrusted pickle data, and audit for indirect pickle usage through shelve and multiprocessing. The indirect usage is what catches most teams off guard.
- Character-level allowlists do not make eval safe. Even a digits-and-operators filter permits denial of service, and any relaxation reopens attribute-traversal attacks. Replace eval with AST-based evaluation.
- SSTI hides in error paths and fallback logic. Audit every code path that constructs template strings, not just the happy path. That’s where the interesting bugs tend to live.
- os.path.join does not prevent traversal. Always resolve with os.path.realpath and verify the prefix. The .. blocklist approach is not enough.
- yaml.safe_load is the safe YAML entry point. Legacy yaml.load calls with Loader=yaml.Loader allow arbitrary object construction. This is a quick grep-and-fix.
- Python’s dynamic nature is the root cause. The features that make Python productive (eval, pickle, dynamic imports, template rendering) are the same features that create security vulnerabilities. The defence is to avoid them on untrusted input entirely, no matter how convenient they seem.