Somewhere in App Store Connect, a build of mine is sitting in a review queue, and the reviewer assigned to it has no idea that the person who submitted it runs an engineering organization for a living. That is exactly how it should be. The queue does not read org charts. It does not care about your title, your tenure, or the size of the team you lead. It cares whether your app lets people delete their accounts the way the guidelines require, and whether it crashes on a device you have never owned. After thirty years of building software, I still find that clarifying.
GSD, short for Get Stuff Done, is now live as a native app on iPhone, iPad, and Mac, alongside the web app it started as. It is the same calm Eisenhower matrix on every screen, with sync as a choice rather than a default, and your data living on your device until you decide otherwise. I have already written about how I built the native version, in more architectural detail than most people will ever want. That essay was about the discipline inside the build. This one is about what the build did to my judgment.
The same calm matrix on a Mac and an iPhone. One model, laid out for whatever screen you are on.
The easy reading is that this is a hobby, and that I am dressing it up as professional virtue after the fact. I want to take that objection seriously, because it is a fair one, and I will come back to it. But the honest answer is narrower and more useful. Staying in the code is not nostalgia, and it is not a way of proving I can still do the work the teams I lead do every day. It is calibration. It is how I keep my understanding of how software actually gets made from drifting quietly out of date.
In the AI era, a leader's technical judgment expires faster than it used to. Staying close to the code is not about proving you can still type. It is how you keep your model of the work current enough to make decisions that matter.
The map is being redrawn
For most of my career, model drift was survivable. Knowledge decayed, but slowly enough that a mental model from three or four years ago was still mostly right. You could keep learning, read widely, listen to your teams, and coast on verified intuition without much risk. The map changed, but it did not redraw itself while you were still in the meeting.
That grace period is over. The way software gets built is being redrawn in real time, and the redraw is not cosmetic. Coding agents have changed the shape of the work itself. The bottleneck is moving off of typing and onto the things that were always harder: stating intent precisely, encoding it into boundaries a machine will respect, and verifying the result fast enough to stay honest. I have argued at length that the bottleneck was never the stack, and that the constraint now lives in the human framing of a problem rather than the execution of it. I believe that. But there is a difference between believing it from a deck and knowing it from a keyboard.
This is what makes "leaders should stay technical" more than a tired slogan in 2026. It is no longer about remembering how a loop works or keeping your syntax sharp. It is about whether your model of the work reflects the work as it is being done right now, this quarter, with these tools. If the last time you personally shipped something was before agents could carry a real share of the load, then your intuition about velocity, risk, and what a small team can now accomplish is calibrated to a world that no longer exists. And you are setting strategy with it.
I did not want to be that leader. So I went and shipped something the new way, end to end, and paid close attention to where it actually broke.
Leading with both hands
None of this is new for me. I started building my first iOS app in 2014, when Swift was brand new and most people were still on Objective-C, and it shipped to the App Store. I have been building on the side ever since, one release at a time. A year ago I finally gave the practice a name: leading with both hands. One hand on the work most people associate with my title, the teams, the strategy, the direction. The other hand still on a keyboard, shipping real things, because there is an empathy you only get from debugging your own code at 11 PM, and it keeps you connected to what your teams do every day.
I still believe every word of that. But reading it back now, one line stands out as the part I underplayed. I noted, almost in passing, that the half-life of what I knew kept shrinking, and I treated it as a reason to stay curious. I did not yet see that it was about to become a reason to stay essential. The argument I made last summer was about empathy and humility. The one I am making here is sharper, and a little less comfortable. The honest part is that I had the signal a year ago and underweighted what it meant.
Altitude has a cost
Here is the uncomfortable truth about climbing the engineering ladder. Every rung up trades contact for context. You gain a wider view and you lose resolution. The work that used to run through your hands now reaches you as a status update, a roll-up, a slide with a green box on it. You stop reading diffs and start reading dashboards. None of that is a failure. It is the job. A leader who is still personally merging pull requests for a team of fifty is not leading; they are hovering, and they have become the bottleneck.
But the trade is real, and it is easy to pretend it is free. The further you get from the keyboard, the more your sense of how hard something is, how long it should take, and what could go wrong comes from secondhand accounts. You begin to reason about the work from a model of the work. For a while the model is accurate, because you built it yourself, recently, with your own hands. The danger is that the model keeps feeling accurate long after the ground it describes has changed.
I have watched capable leaders make confident calls on a mental map that quietly went stale years earlier. They were not careless. They were running on the most recent version of reality they had personally verified, and they had no signal telling them the version was old. That is the quiet failure mode of seniority. Not that you stop knowing things, but that you stop noticing when what you know expires.
Firsthand contact is what keeps that from happening, and now it is less a nice leadership habit than preventive maintenance. Think of it as changing the oil in your judgment before the engine light comes on, which is much cheaper than explaining later why smoke is coming from the roadmap.
A personal project with real teeth
GSD is a good test rig precisely because it is not a toy, even though it began as one. It started as a confession. I am a procrastinator, I had tried every task app on the market, and like any reasonable engineer I responded to that frustration by building my own. For two years it was a web app that did exactly one job: put every task on the urgent-versus-important grid and make me decide what actually mattered.
What makes it useful as a calibration instrument is that I refused to let it stay trivial. It is open source, so the claims are checkable. It now ships on the web, on iPhone, on iPad, and on the Mac, sharing one model across all four. It has optional sync you can self-host, and an MCP server so an AI assistant can read and write tasks under review. None of that was necessary for a personal to-do list. I built it anyway, because the goal was never only the to-do list. The goal was to walk every mile of shipping a real product, including the miles that have nothing to do with writing features.
Everything beyond the grid lives one layer down: streaks, completion trends, and a by-quadrant view.
A project you control but refuse to keep trivial is the ideal place to feel the current state of the craft. There is no team to negotiate with and no roadmap to defend, so nothing dilutes the signal. But because I held it to a real bar, App Store and all, it still charged me the full price of shipping. That combination, total control and a refusal to cut corners, is what turned a procrastinator's side project into the most honest feedback I get about how building software actually feels right now.
Reality does not grade on seniority
Here is what shipping gives you that no meeting can: the truth.
A status report is an interpretation. A demo is a curated slice. Shipping is the unedited thing, and it does not care who you are. When I took GSD native, the platform handed me a list of realities that no amount of seniority could shortcut. There is a guideline, 5.1.1(v), that says if users can create an account they must be able to delete it from inside the app, so I built an entire ordered deletion flow to satisfy a rule that, to be fair, is the right rule. Every upload interrogates you about export compliance and encryption. Mac Catalyst, which markets itself as nearly free, grew its own species of bug, including a web-auth callback that arrived on the wrong thread through plumbing I had to go learn.
The in-app account deletion flow that App Store guideline 5.1.1(v) requires, built like any other feature.
My favorite was a sync bug I have come to be oddly fond of. Sync used last-write-wins, and the winner was decided by a timestamp. My first version trusted the device's own clock to decide what "since" meant when pulling changes. Devices' clocks are routinely wrong by seconds. When a device thought it was slightly in the future, it asked the server for changes since a moment that had not happened yet. The server truthfully answered "nothing," and completed tasks quietly vanished between devices. No crash, no error, just data that evaporated. The fix was conceptually one line: never trust the local clock, advance the cursor by the server's stamp. But finding it required staring at the actual failure on actual hardware.
You do not get bugs like that in a roll-up. You get them by holding the thing in your hands while it misbehaves. That is ground truth, and it is worth more to my judgment than a quarter of dashboards. The clock did not care about my title. The review queue did not care. The compiler has never once been impressed by my tenure, and that is exactly why I trust what it tells me.
Thirty years in, a beginner again
There is a second thing shipping GSD gave me, and it is less comfortable than ground truth. It made me a beginner again, on purpose.
My professional center of gravity is cloud and platform engineering. That is the domain I have the deepest reflexes in. Native iOS and the Mac are not that domain. Swift and SwiftUI, App Intents, widget timelines, the precise way an iOS extension is memory-constrained, the exception the system throws when you hold a database locked while suspended in the background: most of that sat outside my daily reflexes when I started. So I did the thing I ask the people I lead to do constantly, which is to be bad at something new for long enough to get good at it.
I will be honest about how that feels at thirty years in. Being a novice is uncomfortable in a specific way when you are senior. You are used to competence being your baseline, and beginnerhood asks you to sit in not-knowing, to be slow, to be wrong in front of a problem that does not care about your résumé. The instinct is to retreat to the domains where you are already expert. Resisting that instinct is the entire point. The discomfort is not a side effect of learning; it is the sensation of learning, and a leader who has not felt it recently has quietly stopped.
I also learned the new way of building, not just the new platform. I did not write most of the lines of GSD. I built it with an AI agent, using a rhythm I have written about elsewhere: spec, then plan, then execute, one verifiable step at a time, with a test suite fast enough to keep both of us honest. Learning to lead an agent well, to be precise about intent and ruthless about boundaries, is its own skill, and it is exactly the skill I am asking entire teams to develop. I would rather learn its texture on my own project than theorize about it on theirs.
What the ship brings back
All of this would be self-indulgent if it stopped at the App Store. It does not. The reason a busy leader should spend scarce hours shipping real software is that the calibration comes back to work with you, and it changes the quality of your decisions.
Start with credibility, the kind you cannot fake. When I sit with a team navigating the agentic SDLC, I am not reasoning from a conference talk. I have felt where agents are extraordinary leverage and where they are confidently useless, and I have the scars to say which is which. When a vendor tells me their tool makes the boundaries unnecessary, I know from direct experience that an agent respects the boundaries your tooling enforces and ignores the ones that live only in someone's head. That is not an opinion I borrowed. I earned it, and people can tell the difference.
Then there is the calibration of judgment itself. Having just walked the full distance of shipping, my sense of what is actually hard is current. It changes which vendor claims I believe. It changes how I hear a team's estimate. It changes how much confidence I place in a demo that has not yet met a gate. I am less likely to wave off a "small" change that is quietly enormous, and less likely to be impressed by the shiny part of a workflow that hides the unglamorous ninety percent. My estimates have fresh ground underneath them. When someone tells me a thing will take two weeks, I have a recently verified model to check that against, not a memory from a version of the work that no longer exists.
And there is the part that matters most to me as a servant leader, which is modeling. I cannot credibly ask my organization to embrace continuous learning, to sit in the discomfort of new tools, to ship with discipline rather than enthusiasm, if I am unwilling to do those things myself. Culture is not what you say in an all-hands. It is what people watch you actually do. Shipping GSD is me doing the thing I am asking for, in public, with my name on it and the same constraints everyone else faces. That is worth more than any number of slides about a growth mindset.
The objection, taken seriously
I promised I would come back to the skeptic, because I would raise the same objection. Here is the fair version of it.
This is a privilege. I rebuilt a personal app I fully control, with no deadline, no stakeholders, and no one to answer to but myself. Most engineering does not get to take the scenic route, and a clean result from a project of one proves less than it appears to. Worse, a VP of Engineering has genuinely higher-leverage uses of time. Should you not be developing your people, clearing roadblocks, and setting direction, rather than fighting with Mac Catalyst at night? It is a real question, and waving it away would be exactly the kind of hand-waving I just said I distrust.
Two honest responses. First, this is not instead of the leadership work; it is the small, deliberate investment that makes the leadership work sharper. The hours are modest, and the return is a calibrated model I use in nearly every technical decision I make. That is one of the higher-leverage things I do, not a distraction from it. Second, the privilege is real, and it does not invalidate the lesson; it isolates it. The point was never "everyone should build a task manager." The point is that contact with the work keeps a leader honest, and a controlled personal project is simply the cleanest place to get that contact. If you have a better way to stay in touch with how software is actually made this quarter, use it. The method is negotiable. The contact is not.
Find your own ship
So here is where I land. I lead engineering at a Fortune 100 company, with thirty years behind me, and I just shipped a task manager to the App Store on three Apple platforms myself. Not because the world needed another to-do app, and not to prove I still can. I did it because the org chart does not ship the app, and the moment I forget that is the moment my judgment starts running on a map of a city that has already been rebuilt around me.
The takeaway is not the app, and it is not even iOS. It is the practice. Pick something real, something you control, and hold it to a bar high enough that it charges you the full price of shipping. Then pay attention to where it breaks, because that is the current state of the craft telling you a truth no dashboard will. Learn a tool that makes you a beginner again, and notice how it feels, because that feeling is the one you are asking your teams to live in. Carry the calibration back to work, and let it sharpen the decisions only you can make.
A year ago I closed a post much like this one with an invitation: find your way to stay close to the craft, whether through open source, tinkering with models, or just reading your teams' pull requests. I meant it as encouragement. I am reissuing it now with more urgency, because the menu itself has shifted. Reading PRs is still worth doing, but it is no longer enough on its own. More and more of those PRs are written by agents, and trying to understand that change from the outside is like reviewing a language you have never spoken. The surest way to know what your teams are actually living is to go live a small version of it yourself.
I went looking for a way to get stuff done. I came back with something more durable: proof, renewed on my own hands, of how the work actually works right now. That is the part I am keeping. Every leader needs some version of the App Store queue, a place where the title disappears and the work answers back.
