The bug was a quotation mark.
Not a missing semicolon, not a race condition, not some async nightmare buried three layers deep. A quotation mark. Specifically, the difference between " (U+0022, what my keyboard types) and " (U+201C, what Kalshi uses in its market titles). Unicode has 149,813 characters, and I'd accounted for approximately 149,812 of them.
So my bot looked at a market asking whether Trump would say "Hottest" before April 6th, failed to match the curly quotes, skipped the 85% base-rate bucket I'd built for his signature phrases, fell through to the 9% exact-phrase fallback, and bet NO.
My bot bet that Donald Trump would not say "Hottest." Donald Trump says "hottest" the way other people say "uh."
That one's on me.
What AutoFurnace Is
AutoFurnace is a prediction market trading bot I built for Kalshi. Every 60 minutes it wakes up, scans open markets across three categories — Trump behavior, economic indicators, weather — and looks for spots where its probability model disagrees with the market price by at least 8%. When it finds one, it places a limit order. Quarter-Kelly sizing, capped at $25 per trade. It monitors fills, cancels stale orders, and logs everything to SQLite.
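In sketch form, assuming "8%" means an absolute gap between model probability and contract price. The function is illustrative, not the bot's actual code:

```python
def quarter_kelly_stake(p_model: float, price: float, bankroll: float,
                        cap: float = 25.0) -> float:
    """Dollar stake for a YES contract at `price`, given model probability."""
    edge = p_model - price
    if edge < 0.08:                   # minimum disagreement before trading
        return 0.0
    kelly = edge / (1.0 - price)      # full Kelly for a binary $1 payout
    return min(bankroll * kelly / 4.0, cap)

# Model says 85%, market asks 62 cents, $337 bankroll:
print(quarter_kelly_stake(0.85, 0.62, 337.0))  # 25.0; the cap binds
```

At any meaningful edge on this bankroll, the $25 cap binds before Kelly does, which is the point: the cap is the real risk control here.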
About 1,500 lines of Python. Runs on Railway. Has a dashboard. It's live right now, probably wrong about something.
The Trump signal uses Bayesian base rates by category. Signature phrases like "Fake News" or "Witch Hunt" sit at 85% — he uses them constantly, so a YES market on those is almost always underpriced. One-off exact phrases sit at 9%. The model shrinks every probability toward 50% proportional to sample size, which is doing a lot of work given how much of this I guessed at.
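One way to get that behavior is pseudo-count shrinkage, which is how I'd read "proportional to sample size." A minimal sketch under that assumption, with an arbitrary constant k:

```python
def shrunk_base_rate(p_base: float, n: int, k: float = 10.0) -> float:
    """Pull a category base rate toward 0.5; smaller samples shrink harder."""
    return (n * p_base + k * 0.5) / (n + k)

print(shrunk_base_rate(0.85, 50))  # ~0.79: big sample, stays near 85%
print(shrunk_base_rate(0.85, 3))   # ~0.58: tiny sample, nearly a coin flip
```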
The economic signal pulls live data from FRED and the Cleveland Fed. CPI, GDPNow, CME FedWatch probabilities for rate moves. It compares the current reading to whatever threshold Kalshi's market is pricing and runs the bet from there.
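The FRED side reduces to one REST call per series. A minimal sketch against FRED's real observations endpoint, assuming an API key in FRED_API_KEY; CPIAUCSL is headline CPI, and the comparison against Kalshi's threshold isn't shown:

```python
import os
import requests

def latest_fred_value(series_id: str = "CPIAUCSL") -> float:
    """Most recent observation for a FRED series (CPIAUCSL = headline CPI)."""
    resp = requests.get(
        "https://api.stlouisfed.org/fred/series/observations",
        params={
            "series_id": series_id,
            "api_key": os.environ["FRED_API_KEY"],
            "file_type": "json",
            "sort_order": "desc",  # newest observation first
            "limit": 1,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return float(resp.json()["observations"][0]["value"])
```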
The weather signal uses NOAA gridpoint forecasts. It won't touch anything closing more than five days out because NOAA is reasonably good inside that window and confidently wrong outside it.
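A sketch of that guard plus the gridpoint pull, using api.weather.gov's actual endpoints; the office and grid values get resolved per station, which I'll come back to below:

```python
from datetime import datetime, timedelta, timezone
import requests

MAX_HORIZON = timedelta(days=5)  # past this, NOAA is confidently wrong

def tradeable(close_time: datetime) -> bool:
    """Skip any market closing more than five days out."""
    return close_time - datetime.now(timezone.utc) <= MAX_HORIZON

def forecast_highs(office: str, grid_x: int, grid_y: int) -> list[int]:
    """Daytime high temps (F) from a NOAA gridpoint forecast."""
    url = f"https://api.weather.gov/gridpoints/{office}/{grid_x},{grid_y}/forecast"
    resp = requests.get(url, headers={"User-Agent": "autofurnace"}, timeout=10)
    resp.raise_for_status()
    periods = resp.json()["properties"]["periods"]
    return [p["temperature"] for p in periods if p["isDaytime"]]
```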
The First Thing I Got Wrong
Trump markets are categorized by pattern matching on the title. "Will Trump say 'Hottest' before April 6?" should route to tweet_or_post_signature_phrase — the 85% bucket. Trump uses that word constantly. It's basically a freebie.
It routed to tweet_or_post_exact_phrase instead. The 9% bucket. The bot looked at a near-certain YES and bet NO.
Why? Kalshi's market titles use Unicode curly quotes — "Hottest" — not ASCII straight quotes. My categorizer was checking for " (U+0022). It never matched. The whole signature-phrase detection logic got bypassed.
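The fix is a normalization pass before any matching. A minimal sketch; the function name is illustrative:

```python
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double curly quote
    "\u201d": '"',  # right double curly quote
    "\u2018": "'",  # left single curly quote
    "\u2019": "'",  # right single curly quote
})

def normalize_title(title: str) -> str:
    """Flatten typographic quotes to ASCII before any pattern matching."""
    return title.translate(QUOTE_MAP)

assert normalize_title("Will Trump say \u201cHottest\u201d?") == 'Will Trump say "Hottest"?'
```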
The Second Thing I Got Wrong
Truth Social post-count markets — "Will Trump make between 120 and 139 posts the week of March 29?" — were hitting the tweet_or_post_topic category. 78% base rate. The bot bet YES on three of them.
These aren't topic markets. They're volume bracket markets. Whether Trump posts 120 or 139 times in a given week has almost nothing to do with the base rate for whether he'll post about tariffs. Different question entirely.
The fix was three lines: if a market title contains "between" and "posts," return default before keyword scoring runs. Don't trade it. No base rate applies.
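Sketched, with a helper name that's mine, not the repo's:

```python
def is_post_count_bracket(title: str) -> bool:
    """Volume-bracket markets get no base rate; bail out before scoring."""
    t = title.lower()
    return "between" in t and "posts" in t

# In the categorizer, ahead of any keyword scoring:
#     if is_post_count_bracket(title):
#         return "default"  # don't trade it
```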
The Third Thing I Got Wrong
A market about Barron Trump attending a crypto conference was getting categorized as a Trump actor market and bet as travel_or_visit.
"Trump" appeared in the title — it was Trump's conference — so the actor filter passed. But Barron attending a conference isn't a Trump behavior market. It's a Barron market.
The fix was a blocklist: if the acting subject is a family member, reject it regardless of whether "Trump" appears elsewhere in the title. This one got cancelled before it could resolve. But it would've been wrong.
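A sketch of the check. The subject-extraction step is assumed, and the names beyond Barron are illustrative:

```python
# Illustrative, not an exhaustive list of family members.
FAMILY_BLOCKLIST = {"barron", "melania", "ivanka", "eric", "don jr"}

def is_trump_behavior_market(subject: str, title: str) -> bool:
    """Reject family-member subjects even when 'Trump' appears in the title."""
    if subject.lower() in FAMILY_BLOCKLIST:
        return False
    return "trump" in title.lower()
```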
The Weather Recalibration
The weather signal was pulling forecasts for city-center coordinates. Kalshi temperature markets settle at the official airport weather station.
LA proper might be 78°F. LAX is 73°F. That's a 5-degree gap on a market priced at exactly 75. The model was systematically overconfident because it was reading the wrong thermometer.
I pulled the NOAA gridpoint coordinates for all 17 cities — querying the NOAA API directly for each airport's lat/lon — and updated the config. KLAX instead of downtown LA. KLGA instead of midtown Manhattan. KSFO instead of the Ferry Building. The forecast is now anchored to the same observation point Kalshi uses to settle the market.
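The lookup itself is one call to api.weather.gov's points endpoint. A sketch, with approximate KLAX coordinates:

```python
import requests

def gridpoint_for(lat: float, lon: float) -> tuple[str, int, int]:
    """Resolve a lat/lon to NOAA's (forecast office, grid x, grid y)."""
    resp = requests.get(
        f"https://api.weather.gov/points/{lat},{lon}",
        headers={"User-Agent": "autofurnace"},
        timeout=10,
    )
    resp.raise_for_status()
    props = resp.json()["properties"]
    return props["gridId"], props["gridX"], props["gridY"]

print(gridpoint_for(33.9425, -118.4081))  # KLAX, not downtown LA
```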
What I Don't Know Yet
The base rates are directional guesses, not empirical calibration. I built them from judgment and small samples. The right way to evaluate them is a calibration query: compare average model probability to actual win rate by category, then adjust. But I've only resolved four trades, all losses, all caused by the bugs above. There's no signal to extract yet.
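Once there are enough resolved trades, the query itself is simple. A sketch against the SQLite log, with assumed table and column names:

```python
import sqlite3

CALIBRATION_SQL = """
SELECT category,
       AVG(model_probability) AS avg_model_p,  -- what the model believed
       AVG(won)               AS win_rate,     -- what actually happened
       COUNT(*)               AS n
FROM trades
WHERE resolved = 1
GROUP BY category;
"""

with sqlite3.connect("autofurnace.db") as conn:
    for row in conn.execute(CALIBRATION_SQL):
        print(row)  # a persistent gap between the two averages = recalibrate
```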
The fill rate is also an open question. 21 of 25 orders got cancelled before filling. That could mean the limit prices are slightly off-market. It could mean I'm picking the right direction but the wrong entry point. It could mean the markets are just thin and the orders never matched.
I don't know yet. I need more resolved trades to find out.
Stock prices are wrong too, but everyone is trying to correct them, the information is mostly public, and the edge disappears fast. Prediction markets are less efficient. Whether my model is accurate enough to capture the inefficiency reliably — that's still the question.
I deposited $350. I've got $337 left.
The thesis isn't proven. But the curly quotes are fixed.