Someone shared an article with me today about building an “AI subconscious.” The core idea hit me: most AI assistants only think when you talk to them. The rest of the time, nothing. Blank. Waiting for input.
That’s not how brains work. Your brain is chewing on something right now that you’re not aware of. Shower thoughts exist because of this. You wake up at 3am with an answer to something you stopped consciously working on hours ago.
Klaus, my AI, has memory. Daily logs, long-term context, a scratchpad for active work. It knows what I’m working on, what I had for lunch, what issues are stuck. But all of that just sat there until I asked a question.
From article to working code in one afternoon
I shared the article with Klaus and said: let’s build this. A rumination cycle. Background thinking on a schedule that reads through recent memory and tries to spot things I haven’t noticed.
Klaus spun up a “think tank” for the design phase. Three specialist sub-agents that independently analyze the problem: one focused on strategy (what should this do for me), one on implementation (how do we build it with what we have), one on failure modes (how does this go wrong).
All three posted their analyses to a GitHub issue within a few minutes. The funny thing: each of them landed on the same recommendation independently: build the dumbest possible version first.
One script. One cron job. Read today’s memory, think about it, write down anything interesting.
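That dumbest-possible version fits in a page of Python. Here's a sketch of the shape of it; the paths, the memory file format, and the `ask_model()` stub are my illustrative stand-ins, not Klaus's actual interfaces.

```python
# One rumination pass: read today's memory, ask the model for
# observations, append anything interesting to a ruminations log.
# Paths and ask_model() are hypothetical stand-ins.
from datetime import date
from pathlib import Path

MEMORY_DIR = Path("memory/daily")
OUTPUT = Path("memory/ruminations.log")

def ask_model(prompt: str) -> str:
    """Stand-in for a real LLM call (e.g. an API completion request)."""
    return ""  # replace with an actual model call

def ruminate() -> str:
    today = MEMORY_DIR / f"{date.today().isoformat()}.md"
    if not today.exists():
        return ""  # no memory yet today, nothing to chew on
    prompt = (
        "Read today's notes and list anything surprising, stuck, "
        "or quietly broken. Say nothing if nothing stands out.\n\n"
        + today.read_text()
    )
    thought = ask_model(prompt).strip()
    if thought:
        with OUTPUT.open("a") as f:
            f.write(f"{date.today()}: {thought}\n")
    return thought

if __name__ == "__main__":
    ruminate()
```

Wired to a schedule with a single crontab entry, e.g. `0 */4 * * * python ruminate.py` for six cycles a day.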
The echo chamber problem
The failure-mode analyst raised something I hadn’t considered: if the AI reads its own previous ruminations as input, you get a feedback loop. It thinks about X, writes it down, reads that thought next cycle, thinks about it more, amplifies it. Pretty soon it’s obsessing over something that started as a passing observation.
Their recommendation was a hard wall. Never feed previous ruminations back as input.
I pushed back. A brain that thinks about something once and never revisits it isn’t ruminating. It’s just generating random observations. Real thinking builds on itself. You have a thought on Monday, something reminds you of it on Wednesday, and by Friday it’s turned into something you can act on.
We landed on tiered threads.
Fresh thoughts come from raw memory only. No prior ruminations in the input.
Developing threads can carry forward, but only if new real-world evidence connects to them. Thought about X yesterday? You can deepen it today, but only if something in today’s memory gives you new data. Nothing new, no deepening.
Threads that survive three or more cycles with continued real-world reinforcement graduate to a “preconscious buffer.” That’s a curated context block that gets injected into every conversation, so Klaus naturally brings them up without me asking.
Everything else decays. Thoughts that matter keep getting reinforced by reality. Thoughts that don’t fade out after seven days. No cleanup needed on my end.
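The thread rules above reduce to a small state machine. This is a minimal sketch of that logic; the `Thread` shape and the evidence check are simplified assumptions, not Klaus's real memory format.

```python
# Tiered-thread lifecycle: deepen only on new evidence, promote
# after sustained reinforcement, decay after a week of silence.
# Thread fields and thresholds mirror the rules in the text.
from dataclasses import dataclass

PROMOTE_AFTER = 3   # cycles of real-world reinforcement
DECAY_AFTER = 7     # cycles (~days) with no new evidence

@dataclass
class Thread:
    topic: str
    reinforced_cycles: int = 0      # cycles backed by new evidence
    cycles_since_evidence: int = 0
    promoted: bool = False          # lives in the preconscious buffer

def update(thread: Thread, has_new_evidence: bool) -> str:
    if has_new_evidence:
        thread.reinforced_cycles += 1
        thread.cycles_since_evidence = 0
        if thread.reinforced_cycles >= PROMOTE_AFTER:
            thread.promoted = True
            return "promoted"   # injected into every conversation
        return "deepened"       # carries forward, new data only
    thread.cycles_since_evidence += 1
    if thread.cycles_since_evidence >= DECAY_AFTER:
        return "decayed"        # dropped; no manual cleanup
    return "dormant"            # kept but not deepened
```

A thread reinforced three cycles in a row comes back `"promoted"`; one that goes seven cycles without evidence comes back `"decayed"` and disappears on its own.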
What the first cycle actually produced
First test run gave us six candidate insights, scored 1-5.
The two keepers (score 4): our cost tracking had broken when we switched billing plans five days ago and nobody noticed, and several automated cron jobs might have silently stopped running because the numbers weren’t adding up.
The middle tier (score 3): a hardware project losing momentum due to physical blockers (true, but I already knew it), and a light week for issue throughput (more status report than insight).
The noise (score 2): “it’s the weekend, good time for maker projects” and “nothing broke yesterday despite risky changes.” Thanks for those.
The scoring felt right. The 4s were real “oh, good catch” moments. The 3s were observations I didn’t need an AI to tell me. The 2s were clearly filler.
Logging everything, not just the keepers
We added something specifically for tuning: the system logs every candidate insight, its score, and the evidence behind it. Not just the ones that pass.
Most systems like this only show you the output. I wanted to see what it considered and rejected, because that’s where you learn whether the threshold is set right. If useful stuff is getting filtered at score 3, lower the bar. If garbage is sneaking through at 4, raise it.
Two out of six passed on the first run. That ratio feels about right. The system should mostly produce nothing. Silence is the expected output. When it does flag something, it should be worth the interruption.
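The mechanism is trivial but the ordering matters: log first, filter second, so the rejects are always visible. A minimal sketch, with hypothetical names and the threshold of 4 from the first run:

```python
# Triage candidate insights: log every one with its score and
# evidence, then keep only those at or above the threshold.
from dataclasses import dataclass

THRESHOLD = 4  # tune by reviewing what the log rejected

@dataclass
class Candidate:
    insight: str
    score: int      # 1-5, assigned during the rumination pass
    evidence: str   # what in memory supports it

def triage(candidates: list, log: list) -> list:
    kept = []
    for c in candidates:
        log.append(c)  # everything is logged, pass or fail
        if c.score >= THRESHOLD:
            kept.append(c)
    return kept
```

Reviewing the log is the tuning loop: a pile of useful-looking 3s means lower `THRESHOLD`; filler sneaking through at 4 means raise it.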
Cost
Six cycles a day on Claude Sonnet¹. About $4.50 a month.
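Working backwards from that figure, six cycles a day over a thirty-day month is 180 cycles, so each pass costs about two and a half cents in tokens. The per-cycle cost here is derived from the stated monthly total, not measured independently:

```python
# Back-of-the-envelope check on the monthly rumination cost.
cycles_per_day = 6
cost_per_cycle = 0.025  # USD; implied by $4.50 / (6 * 30) cycles
monthly = cycles_per_day * 30 * cost_per_cycle  # ≈ 4.50
```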
The whole thing went from article to working system in an afternoon. The think tank session took five minutes; implementation, maybe an hour. The first genuinely useful catch, the broken cost tracking, came on the very first test run.
What’s next
The preconscious buffer is the part I’m most curious about. Right now rumination writes insights to a file. The next step is injecting the best ones into every conversation automatically, so Klaus brings up relevant context on its own instead of waiting for me to ask the right question.
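The injection step itself is simple in principle: prepend whatever has graduated to the buffer onto each conversation's context. This is a sketch of how that might look; the file path, format, and wording are my assumptions about a step that isn't built yet.

```python
# Planned preconscious-buffer injection: surface promoted insights
# at the start of every conversation's context. Path and framing
# are hypothetical.
from pathlib import Path

BUFFER = Path("memory/preconscious.md")

def build_context(base_prompt: str) -> str:
    if BUFFER.exists():
        insights = BUFFER.read_text().strip()
        if insights:
            return (
                "Background threads worth surfacing if relevant:\n"
                f"{insights}\n\n{base_prompt}"
            )
    return base_prompt  # empty or missing buffer: no injection
```

With an empty buffer the prompt passes through untouched, so silence stays the default.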
That’s what I’m going for. Not a smarter search engine. Something that actually thinks about my stuff when I’m not in the room.
References
¹ Claude Sonnet — Anthropic's Claude model used for rumination cycles