AI Reflections: Fatigue

Why I'm taking a break from AI-assisted coding.

This year I’ve taken a deeper interest in how I use AI.

There are nearly endless ideas and takes on AI, and particularly in my bubble, LLM-assisted coding. For every measured, grounded take, it feels like there are 10 “hype” takes. It’s been challenging to wade through all of the information and settle on coherent thoughts. This reflection is me trying to put my current thoughts on the matter into (relatively unedited, stream-of-consciousness) words.

First off, at the beginning of the year (January 2026) I had just finished reading Adam Grant’s book Think Again. One of the main themes in Think Again is to be both more open-minded in our interpretations of the world and more scientific about how we come to conclusions. Generally, Grant encourages us to exercise the “rethinking” process more liberally, especially when we’re most dogmatic or facing topics with a lot of ambiguous/conflicting information.

At the same time, more powerful models were released from major labs, and tools like Claude Code, Open Code, and other powerful agentic harnesses gained traction. Outside of the LinkedIn-fueled hype of people claiming that LLMs were “doing the work of dozens of engineers”, I started seeing other engineers I respected across the industry experimenting more with these tools (or at the very least, rethinking their positions on them).

I remember in late 2025 and early January, I felt like AI was making me more productive, for some definition of productive that I couldn’t quite pin down.

However, it didn’t feel strictly better. I had a number of hypotheses around how AI was impacting my work (both in my personal/side projects and at my day job as a Principal Engineer at CapitalRx, where I’m focused mostly on consumer product development and architecture).

Here were my hypotheses:

"hypotheses": [
    "I procrastinate less with AI assistance",
    "I ship more code with AI assistance",
    "Code quality is slightly lower with AI but still acceptable",
    "I write more tests and documentation with AI",
    "I retain less understanding of the codebase with heavy AI use",
    "Moderate AI use is more fulfilling than no AI or full AI"
  ]

There were a number of other “minor” ideas I had, but these were the main ones I wanted to try and test in a self-structured experiment.

In order to carry out that experiment, I built (with Claude Code and LLM assistance) a tool called devex.

You can check it out to get an idea of what it does, but the basic idea was just a simple self-capture system in the terminal to capture things like mood, commits/lines of code, sentiment, and how well I could recall information about what I was building. I built the tool around the idea of “blocks”, which are just timeboxes (initially 2 weeks) wherein I would set a condition like No AI, Some AI, Heavy AI. You can read more about exactly what I meant by those in this post, but the gist is:

  • No AI = zero AI at all (no AI-powered search, no chatbots, no codegen, no agents, no inline-assist, zip)
  • Some AI = primarily hand coding, AI allowed for rubber ducking, planning, searching, tedious tasks
  • Heavy AI = primarily agentic coding through Claude Code, Open Code, or Zed (and planning / leaning on Claude web for research, overnight tasks, remote development, sandboxes, etc)

I followed this fairly dogmatically.

What I expected was to come out of this with some data that would let me form a more nuanced opinion of how AI was impacting my own workflow. I expected to see something in between the extremes of what I saw online: not “AI transformed my whole life and I’m never writing code by hand again”, and also not “I will never touch an LLM and you can never convince me otherwise”.

Conclusions

In short, I am extremely concerned about how AI impacted my workflow, and not for the reasons I expected. I’m not going to share the data here (for privacy / IP reasons primarily — I work in healthtech), but I will share broad strokes percentages where I can.

I ran blocks of this back to back for nearly 3 months (Q1). The first 6 weeks were simply split into 3 blocks (Some AI, Heavy AI, No AI). After that I decided to try a simpler 3 weeks of Heavy AI and then 3 weeks of No AI, with a few days off in the middle.

Let’s take each hypothesis in turn.

I procrastinate less with AI assistance: False

This is a very weird one. When I wrote this hypothesis, I wanted to look at something I was noticing, which was that I was starting a lot more stuff. More projects, more repos, more ideas, more everything. My insight at the time was “Wow, it’s really easy for me, someone who has struggled with procrastination, to simply sit down and say Hey Claude, here's what I need to do...” (paraphrased) and get started on whatever I needed to do. I saw this as similar to the 2 minute rule or whatever it’s called where you’re supposed to just do something for 2 minutes, and that often is enough to get you going. It’s a brain hack kind of thing.

Generally speaking, I found it true during the periods where I was allowed to use AI that I would start things more readily. HOWEVER, in reflection, I actually think I procrastinated more.

By this I mean, I did start plenty of things (arguably way too many things). But often the things I started doing weren’t actually moving the needle in any meaningful sense. Ironically, much of the work I did (both at my day job and in personal projects) was shallow, unimportant, or entirely a distraction in itself. It was so easy to start doing something, that I rarely actually built anything that felt meaningful. Or if it did feel meaningful, I would end up working on something else at the same time, due to how easy it was to hop onto a new idea or project.

Let’s just look at everything I started building (just side stuff) over the past few months:

  1. Breakline - a private project for an online game where users could collect popular news stories as trading cards and autobattle them
  2. Beacon - a privacy-first analytics platform built on Gleam and Postgres intended for analytics for deep user sessions in SPAs primarily
  3. my wife’s blog - a blog for my wife, built on SvelteKit and Cloudflare
  4. An unnamed, private frontend stack evaluation tool (compares performance and “complexity”) across different metaframeworks in a relatively complex app
  5. A forked version of opencode where I ripped out the telemetry for testing and then promptly abandoned it
  6. traverse - another performance capture tool for frontend stacks
  7. npmx.dev - a few early contributions to npmx.dev, primarily the “docs” feature similar to jsr where you can see autogenerated docs based on type information
  8. A private design system test to compare different design system methodologies
  9. syl - a highly-personalized notetaking and organization system built to improve on issues I had with my Obsidian markdown vault
  10. syl-tui - a ratatui interface for interacting with syl
  11. syl-bootstrap - a nix-inspired way to package up all of the tools I use to develop in setup scripts I can use across all of the machines I work on
  12. syl-read - a content tracking system built into syl for managing and cataloging content I’m consuming (books, talks, games, whatever)
  13. nahel - a custom Rust-based agentic harness inspired by OpenCode, Pi, Claude Code, and Codex built to interact directly with syl and act as more of a “thought medium” and personal assistant

…and about a dozen half-baked game prototypes, only one of which was truly unique, and none of which were fun as a prototype.

This doesn’t even include all of the projects that I “started” as conversations or mini-prototypes that never even made it to a repository for tracking. It also entirely ignores all of the work I did during my day job and the stealth-startup that I’m a technical advisor on.

If you’re like me and you look at that list and go “how the fuck did you do so many things?”, the answer is I didn’t.

90% of this I either started and abandoned, realized was a distraction from important work, or felt too ashamed/hollow to keep adding features to because, after a certain point, I didn’t know how it worked.

I procrastinated more with AI. I started a lot of stuff, and some of that stuff was legitimately interesting or taught me a lot. So I don’t think it’s fair to say it was “wasted” effort necessarily, but it certainly wasn’t the highest value-spend for my time.

I ship more code with AI assistance: True

Similar to above, this one is messy, and at this point you can probably guess why.

Yes, I shipped “more code”, but as above, the code wasn’t exactly high-value. Half of the projects above I’m sort of nervous to even post about, just because of how silly the codebases look in retrospect. I don’t think I have too much more to add here, except that I fully expect this is a dimension that will get fully gamed and misunderstood by companies. There’s a clear throughline as I look back and consider how AI has impacted my work: more of what Cal Newport calls “pseudowork”, work that looks like real work but is really either busywork or low-value work. Note that I’m not saying this is zero-value work. It feels more like what I’ve heard referred to as “snacking” or “preening” in software development circles.

These are ideas that come from Staff Engineer: Work on what matters. The gist is that, especially for someone at my level, my focus should be on the hard problems that require my skillset and deep focus. These are typically high effort, high impact. I’m not saying that all of my work is that, but at the very least I strive to spend as much time as possible on high impact work. At my day job that might mean whatever is going to result in more revenue, which is obviously a very broad category, and I’ll leave it up to the reader to break down.

What I want to avoid is the low impact stuff, particularly low impact, high effort. What I saw in my data was that, because of how “easy” AI made certain work, I ended up doing a lot of low impact, easy work (in both my day job and out) when I was heavily focused on using AI. When I was on strict “No AI” blocks, I saw the reverse. I rarely sat down and knocked out whatever the unimportant thing was, as I didn’t feel I had time to. I often grappled with starting a hard thing for a while, but ultimately the things I ended up doing mattered a lot more.

Code quality is slightly worse with AI but still acceptable: Unclear

This one was a mixed bag, and depended a lot on what I was working on.

Generally, I’d say the “code” was acceptable in the sense that it worked (often with a lot of steering during “heavy” AI periods). But because I was churning through so much code, inevitably I would get to a point where I didn’t fully understand what I was changing. This led to what I’ll call “complacent engineering”: the things I was building would never have held up to my bar if I were hand-writing them, but because of the general speed of development I didn’t really care to fix it. Worse, the more this complacent code got in, the less I cared. This showed up in my sentiment data as entries like “okay”, “works”, “fine”, etc.

Now, in retrospect, I can go look at the code written during the heavier AI blocks and say definitively it’s architecturally worse. Much of it is fine syntactically, I’d argue, but the system is incoherent and messy. This held up even when I tried explicitly to course correct, adding hand-written tests and documentation, or otherwise adding friction to the agent adding to the codebase. I think the rate at which complacent code accumulated was slower, but still faster than if I was handwriting.

I think a separate reason this happened is I would often build things that, again, weren’t actually moving the needle and would bloat codebases.

I retain less understanding of the codebase with AI: True

This one is not only true, it is the biggest driver of my current hesitance to continue using these tools, even with guardrails and sophisticated harnesses.

In short, I simply do not remember the code I’m writing when I use AI to write it. Worse, I noticed I’m not developing the mental representations that I need to succeed at my level. I can’t go into a meeting and vibe explain architecture decisions to stakeholders, project managers, or my boss. I have to know what the fuck I’m talking about.

And unfortunately, I often find that I simply… don’t. Or at least, I don’t with the conviction I do when I’m forced to grok and write it myself top to bottom.

I’m surprised at the extent of this as well. It’s not simply that I don’t remember a particular algorithm I used or the implementation details of some feature. No, I literally don’t remember how the system works. Looking back through the projects above as I listed them here, I could not name a single file in Breakline by memory, a project I built not 3 months ago. I couldn’t even tell you (without looking) what the stack is in full. I know there’s Svelte rendering the UI, the Zero sync engine assisting with autobattling, and Gleam handling some backend state machines in there somewhere, but I don’t really remember how they fit together to accomplish the goals of that particular project. If I had to add a feature to it or fix a bug, I’d have no idea where to start.

You don’t have to believe me, and I’m not saying this is/should be anyone else’s experience, but it confirms a long-held suspicion I’ve had about the effect of these tools on my own workflow. It reminds me of similar conclusions I reached around notetaking back when I was studying in college. I absolutely hated taking notes by hand. And since I was studying computer science, it felt reaffirming to take notes in markdown in an editor on my laptop. It was an identity thing.

But I was hardly retaining anything I wrote down. A professor encouraged me to try taking notes by hand instead (and yes, I know there’s some shaky evidence here). For me, the difference was immediately noticeable, and after that, even when I did take notes on a laptop (which I still did when I really wanted to capture everything, since I could type way faster than I could write), I would go back and transfer those notes by hand onto paper.

I suspect something similar happens when I write code. I felt this early on and tried to adjust my workflow a touch by having AI write a “first draft” and then hand-editing it after. But honestly, for most things this seemed to take more time than if I had just done it myself from the beginning. Not everything, but lots of things. The areas where it didn’t feel that way were extremely tedious tasks with a clear input and output. For example, at one point I needed to quickly convert a bunch of RGBA colors in Figma to an Oklch representation in code and then demo them in an app to check color variance. This was the kind of thing that, done by hand, would’ve taken me a long time, and I didn’t really care about the code at all. I just needed a quick yes or no on whether it made sense to convert between the two for a design system I was working on. I still feel like there’s some cognitive trade-off here, but I suspect there are many tasks like this where deferring to an LLM makes sense.
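For the curious, here’s a minimal Python sketch of the kind of mechanical transform that task involved, using Björn Ottosson’s published OKLab matrices. This is not the code I had generated at the time, just an illustration of the shape of the work (the alpha channel passes through unchanged, so only the color channels are shown):

```python
import math


def srgb_to_oklch(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert sRGB channels in [0, 1] to OKLCH (lightness, chroma, hue in degrees)."""
    # Undo the sRGB transfer curve to get linear-light values.
    def lin(c: float) -> float:
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = lin(r), lin(g), lin(b)

    # Linear sRGB -> LMS-like cone response (OKLab matrices, Ottosson 2020).
    l = 0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b
    m = 0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b
    s = 0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b

    # Nonlinearity, then the second matrix gives OKLab's L, a, b.
    l_, m_, s_ = l ** (1 / 3), m ** (1 / 3), s ** (1 / 3)
    L = 0.2104542553 * l_ + 0.7936177850 * m_ - 0.0040720468 * s_
    a = 1.9779984951 * l_ - 2.4285922050 * m_ + 0.4505937099 * s_
    b2 = 0.0259040371 * l_ + 0.7827717662 * m_ - 0.8086757660 * s_

    # Lab -> LCH: chroma is the radius and hue the angle of the (a, b) plane.
    chroma = math.hypot(a, b2)
    hue = math.degrees(math.atan2(b2, a)) % 360
    return L, chroma, hue
```

A handy sanity check: pure white, `srgb_to_oklch(1.0, 1.0, 1.0)`, comes out with lightness of roughly 1 and chroma of roughly 0.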

The problem is, in practice, I’ve found it very hard to reliably draw that line without overindulging. It’s so, so easy to say “Okay Claude, I need you to do this one thing for me” and then doom-prompt my way into vibing an entire day of sloppy work that I don’t retain the next day. This spirals: now I either have to go back and spend more time grokking it to catch back up, or (what I noticed myself doing more often) just let it turn into complacent code and let the LLM figure it out, further separating my mental representation of the codebase from reality.

To be clear, for me this is the most existential risk I saw over the past few months for my personal workflow. My job, I find, is often to take all of the small things cumulatively and figure out what the next big thing needs to look like. I have found it significantly more challenging to do this and make “correct decisions” (for some definition of correct) in my work when I don’t do the small things and take my time building the pieces by hand. When I don’t form the mental representations, I degrade my value to my team.

Perhaps more damning is the emotional impact of this, which I cover in the next and final hypothesis.

Moderate AI use is more fulfilling than No AI or Full AI: False

This is perhaps the point I discovered for myself that feels the most controversial.

I already wrote at length over a year ago about the intangible discontentment I had when using AI for prolonged periods, which at that time was primarily through autocomplete.

And yet, it felt hard to dismiss the utility here, particularly with how… creatively energized I was by agentic AI (re: starting many projects, actioning many ideas). I expected that leaning too heavily into LLM-assisted coding would give me the same hollow, prideless feeling in my work that I experienced prior. But I also thought that not using AI would feel unsatisfying, or at the very least introduce FOMO.

Full AI predictably felt hollow. Actually, I’d liken it more to an addiction. I experimented plenty with substances in college, and was 100% addicted to gaming as well (to my detriment). I’m not going to sit here and try to make hand-wavy parallels around how abusing substances or playing games 100 hours a week instead of studying/doing your responsibilities is identical to using AI. There’s already plenty of research happening on the psychological impacts of using AI, particularly the findings on AI-induced psychosis. I’ll leave that up to people more psychiatrically qualified, and I also don’t think it’s critical to my takeaway.

When I say “I’d liken it more to an addiction”, what I mean precisely is that I felt similar physiologically to how I felt during the periods of my life when I was addicted to things: a “high” that is difficult to turn away from day-to-day, but that ultimately left me feeling fatigued, hollow, shameful, and/or hopeless to some degree. I look back at 90% of what I built during the past quarter and take little pride in it. I know it sounds dramatic, but I would literally sneak off to my desk between hanging out with my kids to check a prompt and send the next one after hours. When I leaned all the way in, I would have 2 or 3 threads going at once in the background, researching something.

And in retrospect it made me feel kind of gross?

There was little difference between “Full AI” and “Moderate AI” here. I’m trying to stay grounded and open to the possibility that this is a discipline thing and I just need to “find the right limits” or whatever for moderate AI usage, given how much utility there is. But the formerly addicted person in me also knows that’s a slippery slope.

Some conclusions

I have recently stumbled harder into the “Better Software” community (e.g. BSC 2025, Casey Muratori, Ginger Bill, and so on) and am at a point in my career where I’d simply prefer to lean into the “satisfying but hard” side of my work. I don’t know what that means for my current AI usage, as I feel like, long term, LLM-assisted coding isn’t going anywhere. As I mentioned above with the color-space example, I’ve encountered so many “good examples” of AI-assisted coding over the past few months that I feel it’d be irresponsible to fully ignore the developments or stop iterating on my perspectives here.

One major takeaway for me has been leaning further into de-anthropomorphizing all LLMs as much as possible. It feels somehow more reasonable to me to use a tool that uses an LLM as part of its system to achieve a desired output than it does to simply lean fully on a giant language model where simply predicting the next token “really well” is the entire feature.

For example, I built syl, my “second brain” of sorts. One of the features I built into syl (by hand, for what it’s worth) is a feature that takes all of my notes from a given time period plus some search terms, runs them through fzf, uses an LLM to find relationships between the results and throw that into some YAML, then another feature takes that and lets me visualize the data and drill into specific notes. Kind of like a more interactive, fuzzy-search enabled, dynamic mind-map.

It feels like a very well-scoped problem, where the LLM part is simply one small piece of the toolchain and not the tool itself.

On the other hand, I feel very weird about the agentic harness I built, nahel. While this was a really educational project in itself, it doesn’t really do anything different from Claude Code or any of these other agentic CLIs. It’s more personalized to my workflow, sure, but its outcomes suffer from the same problems as everything I’ve discussed above.

And on the “psychosis” front, it feels particularly weird in retrospect that I named it nahel. If you’re not familiar, the names syl and nahel come from Brandon Sanderson’s fictional universe, the Cosmere. In The Stormlight Archive (currently 5 massive high-fantasy books), one of the main characters, Kaladin, has a “bond” with a spirit-like character named Sylphrena (who he affectionately calls Syl for short). The two form a deep relationship, and Syl enhances Kaladin’s physical abilities and gives him pseudo-magical powers. That’s beside the point, though. The name of the “bond” between them is the Nahel bond, a lifelong bond between a “spren” (the type of creature Syl is) and Kaladin (a human).

…and I went straight to this name for my agentic harness, the thing that connects me to my “second brain/knowledge base”. See the problem?

I wouldn’t have considered it a problem (and am trying not to convey an overreaction here either), but it feels awkward looking back and noticing that I gave such an effectively intimate name to this technology wrapping an LLM.

I continue to think there’s utility in using LLMs for some generalized problems, of course, but am wary of the subtle effects it’s had on my psyche and will be taking a good long break from using them. Will post further reflections after that!

Other References

Here are some other level-headed takes I’ve come across over the past few months that are worth checking out, if what I wrote resonates with you at all: