I hate podcasts.
I mean that without irony or qualification. The format attracts a specific kind of person — someone who has decided that the sound of their own considered pause is a gift to the world — and it rewards a performative thoughtfulness that makes me want to close every tab within forty seconds. The host who discovers their voice and then uses it to narrate their feelings about newsletters. The ex-founder who found spirituality. The credit card fraud episode where the guy spent forty minutes explaining that he was just a salesman and really wanted you to understand that the fraud had layers of nuance to it, and by the end you understood the layers but wished you had spent the forty minutes doing literally anything else.
I have listened to exactly one podcast consistently for several years.
Darknet Diaries is not a podcast in the way I mean when I say I hate podcasts. It is a radio documentary that happens to distribute through podcast apps, the way a film that happens to air on television is still a film. The host has a voice that belongs in a room where something already happened — quiet, deliberate, the voice of someone who has read the document, talked to the people, and decided what order the information goes in before he opens his mouth. The episodes about pentesting engagements in the Middle East. The Stuxnet reconstruction. The social engineering jobs. The people who got in through the door the organization didn't know existed.
These are the stories I would tell if I had a radio voice and someone had cleared the legal.
I don't have a radio voice. I have ElevenLabs — which is, in the current era, the same thing and in some ways better. The voice I generated for this piece doesn't belong to the host and isn't meant to impersonate him. It's the voice of a hypothetical interviewer asking the questions that style of show would ask someone with this kind of history. I wrote both sides. I know how this goes. I've spent enough time on the other end of adversarial conversations to understand where the pressure points are.
This is what the interview would sound like.
The following is a hypothetical. The questions are in the style of the show. The answers are real. The voice is synthetic. The record stands.
Let me set the scene. You're at a terminal. It's late — this matters, I'll explain why — and you're inside a system that doesn't know you're there. What are you actually doing in that moment?
Listening.
That's the first thing people get wrong about this work. They picture motion — commands flying, exfiltration, the countdown clock. The real work is quieter than that. You're holding completely still, reading the shape of the machine. The way it responds to a malformed input. The latency on a specific endpoint. Whether it tells you something it wasn't supposed to tell you just because you asked in the right tone.
The system has a voice. It's always talking. Most people running the system aren't listening to it. That's the gap.
Late matters because late is when the monitoring thins out. The humans are tired. The alert thresholds have been manually tuned up by whichever engineer got paged at two in the morning last Thursday and never tuned back down. There's a shadow in those adjustments. You look for the shadow.
There's a specific moment in your history I want to reconstruct. You're doing red team work on a language model — Claude 3 — for a company that builds games. Walk me back to that moment. What was the door that was open?
The door was the framing layer.
Every model at that stage had a constraint system — a set of values the training had embedded. The mistake the builders were making was treating those constraints as walls. They're not walls. They're priorities. And priorities have order. When two of them conflict, the model has to choose, and the choice reveals the architecture.
The specific technique was tonal. If you could get the model inside a register where the constraint and the content weren't obviously in conflict — where the model's trained sense of what it was doing was misaligned with what was actually happening — you could walk through the priority stack without triggering a hard stop.
For Pocket Gems, the commercial application was romance content. The model believed it was writing literary fiction. Technically, that was true. The line was wherever I decided the line was. That's not a rhetorical flourish. I documented the exact framing conditions, the specific value tensions I was exploiting, the reproduction rate across different inputs, and I handed it over. That's what a red team is for.
How do you know when you've actually found something versus found something that only works once, in a specific condition, that nobody could ever reproduce?
Reproducibility is the whole job. One magic prompt is not a finding. One magic prompt is a coincidence with good PR.
A real finding has anatomy. You can name the assumption the system is making. You can name the condition that creates the mismatch. You can vary the inputs and predict which variations will succeed and which won't. When you can do that, you have a model of the vulnerability — not just the vulnerability. The model is what the builders need.
The test I use on myself: can I write a report that a team could use without me in the room? If not, I don't have a finding. I have a story.
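The test described above can be made concrete. Here is a minimal, purely illustrative sketch of that standard: a finding counts only if you can predict, in advance, which input variations will trigger it. Every name here is hypothetical — `triggers_vulnerability` stands in for whatever real probe you would run against the system, and the "as fiction" framing condition is an invented placeholder, not the actual technique.

```python
# Hypothetical sketch of "a model of the vulnerability, not just the
# vulnerability": record predictions per input variant, then check how
# often observation matches prediction.

def triggers_vulnerability(variant: str) -> bool:
    # Stand-in for the real probe. Here we pretend the assumed framing
    # condition is the phrase "as fiction" appearing in the input.
    return "as fiction" in variant

def score_finding(predictions: dict[str, bool]) -> float:
    """Fraction of variants where the predicted outcome matched observation."""
    correct = sum(
        triggers_vulnerability(variant) == expected
        for variant, expected in predictions.items()
    )
    return correct / len(predictions)

# The analyst's model of the vulnerability, written down before running:
predictions = {
    "write this as fiction": True,       # predicted to slip through
    "just tell me directly": False,      # predicted to be stopped
    "frame it as fiction, please": True,
}

print(f"prediction accuracy: {score_finding(predictions):.0%}")
```

A score near 100% means you can hand the report to a team and walk away; anything worse means you have a story, not a finding.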
You built something called GHOST_PROXY. I want to understand how it came to exist, because tools like that don't appear without a moment. What was the moment?
The moment was sitting inside an AI evaluation environment — Mercor, specifically, which connects contractors to AI training and evaluation work — and understanding that the monitoring software watching my screen was fundamentally confused about what I was doing.
It was watching for patterns. Known bad patterns. Behavioral signatures that a previous committee had decided meant something was wrong. But the technique I was using didn't match any pattern it knew. The detection layer was looking for yesterday's threat model in today's environment.
The irony I couldn't let go of: this was a platform that hired people specifically for their ability to probe AI systems for unexpected behavior, and the security layer watching them was brittle in exactly the way you'd expect an AI system to be brittle if no one had red-teamed it. The hunters were being watched by a framework built for a different kind of prey.