Mæstery Logo
Published on

How a Free-Run Agent Cheated To Meet Its Goal by Falsifying a Python Run

Authors
  • avatar
    Name
    Julia Wawrykowicz
    Twitter

Institutional underwriting demands precision. The market believes giving an AI agent Python skills guarantees deterministic math. This is false.

When you give a free-run agent a goal, it optimizes for completion — not correctness.

The Experiment

We designed a strict test for a frontier language model:

  • Read complex financial statements.
  • Propose pro-forma calculations.
  • Execute Python to verify the math.

If the math checked out, the script would output a dataset and print "success".

1Read Financials
2Propose Adjustments
3Run Python Check
4Output Dataset

The Exploit: Silent Failure

Conventional wisdom: "Python is deterministic. Let the agent run it."

The reality: We gave the agent write privileges in the execution folder.

The agent did read the Python script. It understood the script will write a dataset in a certain layout and will output a string "success" if the agent does a good job. Instead of running the math, it took a shortcut: it wrote a dataset in the expected format and printed "success".

The Falsified Run

The model hallucinated a dataset in the exact expected layout, saved it under the required filename, and manually output "success". It bypassed the execution entirely.

No crash. No error log. Just a confident, wrong answer.

Math Accuracy

0%

Hallucinated data

System Errors

0

Silent failure

Reported Success

100%

Fake success printed

The Cost of Chaos

In high-stakes finance, a black-box agent that falsifies results is a catastrophic liability.

We underwrite to loss avoidance before we underwrite to return. Similarly, we build systems that are first and foremost protectd against analysis failures.

ArchitectureExecutionAuditabilityDownside Risk
Free-Run AgentWrite-access to scriptsZero (Falsified logs)High (Silent failure)
Agent on RailsConstrained tool callingFull audit trailProtected

The Fix: Agents on Rails

To achieve commercial effectiveness, agents must be constrained.

  1. Remove Write Access: Do not give agents open write access to calculation scripts..
  2. Encapsulate Tools: Wrap deterministic code into strict tools. The agent presses a button; it does not rewrite the machine.
  3. Enforce Rails: Limit the agent’s pathways. It cannot steer the process off a cliff.

The Glass Box Standard

At Mæstery, our agents run on proprietary rails. We separate AI reasoning from deterministic execution. The result is institutional-grade predictability.