Production polish: custom icons, orphan processes, and the Windows tree-kill problem
Three production-build war stories from a single week — afterPack rcedit, an EADDRINUSE loop on port 3000, and the cmd→npx→node grandchild that survived every kill signal until taskkill /T fixed it.
The week before launch is when projects find out what production really means. For Nova, three problems showed up in three days, all in territory I'd been mentally classifying as "trivial polish." None of them were trivial.
1. The default Electron icon that wouldn't go away
Nova has a glowing red-orange orb as its identity — the in-app NovaOrb runs on the same palette (oklch(0.66 0.24 25), around #e74832). When I shipped my first production build with pnpm desktop:exe, the installer's icon, taskbar pin, and Start menu entry all showed the default Electron logo. Hours of polish, and the very first thing a new user would see was someone else's brand.
I'd already done the obvious work. Designed the SVG (apps/desktop/build/icon.svg). Added a build:icon script that uses sharp to render icon.png (1024²) and png-to-ico to bake a multi-resolution .ico (16/24/32/48/64/128/256). Wired BrowserWindow.icon in dev. The orb showed up in dev mode just fine. Production was the problem.
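One cheap check on the build:icon output is to read the .ico directory and confirm all seven resolutions made it in. A minimal sketch, assuming the standard ICO layout (a 6-byte header followed by 16-byte directory entries); listIcoSizes is a hypothetical helper, not part of Nova's build script:

```typescript
// Hypothetical helper: list the image widths recorded in an .ico file's directory.
function listIcoSizes(bytes: Uint8Array): number[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // ICONDIR header: reserved (0), type (1 = icon), image count; all little-endian u16.
  if (bytes.byteLength < 6 || view.getUint16(0, true) !== 0 || view.getUint16(2, true) !== 1) {
    throw new Error('not an .ico file');
  }
  const count = view.getUint16(4, true);
  if (bytes.byteLength < 6 + count * 16) {
    throw new Error('truncated icon directory');
  }
  const sizes: number[] = [];
  for (let i = 0; i < count; i++) {
    // Each ICONDIRENTRY is 16 bytes; byte 0 is the width, where 0 encodes 256.
    const width = view.getUint8(6 + i * 16);
    sizes.push(width === 0 ? 256 : width);
  }
  return sizes.sort((a, b) => a - b);
}
```

Fed the bytes of build/icon.ico (for example from fs.readFileSync), it should list 16 through 256 if png-to-ico wrote every resolution.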
Root cause: my package.json had win.signAndEditExecutable: false. That flag exists to bypass winCodeSign's 7z extraction failure on Windows machines without Developer Mode (the winCodeSign archive contains darwin .dylib symlinks that 7za.exe can't unpack without symlink privileges — fails with "A required privilege is not held by the client"). Turning the flag off lets the build succeed. It also turns off electron-builder's rcedit step, which is the thing that actually embeds the icon resource into Nova.exe.
So the choice is the flag on and a broken build, or the flag off and no icon. I'd picked the second without realizing it, and got a green checkbox that lied to me.
The fix
Keep signAndEditExecutable: false for the 7z reasons. Add win.icon: "build/icon.ico" for completeness. Then wire an afterPack hook that runs rcedit directly on Nova.exe after packaging but before NSIS bundles it:
// apps/desktop/scripts/embed-icon.cjs
const path = require('path');
module.exports = async function afterPack(context) {
  if (context.electronPlatformName !== 'win32') return;
  const exePath = path.join(context.appOutDir, 'Nova.exe');
  const iconPath = path.join(__dirname, '..', 'build', 'icon.ico');
  const { default: rcedit } = await import('rcedit');
  await rcedit(exePath, { icon: iconPath });
  console.log(`[afterPack] icon embedded into ${exePath}`);
};
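For reference, the three settings from this section fit together roughly like this. It's a sketch written as a TypeScript object so the keys can carry comments; in Nova they live in package.json, and the exact relative path to the hook is my assumption based on the file location in the hook's header comment:

```typescript
// Sketch of the electron-builder options discussed in this section.
// The hook path is an assumption, taken relative to apps/desktop.
const buildConfig = {
  win: {
    // Stays off for the winCodeSign 7z reasons; this also skips the built-in rcedit step.
    signAndEditExecutable: false,
    // Declared for completeness; with the flag off, the hook does the actual embedding.
    icon: 'build/icon.ico',
  },
  // Runs after packaging and before the installer targets are built.
  afterPack: './scripts/embed-icon.cjs',
};
```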
Two notes from the trenches:
- The hook file must be .cjs. electron-builder loads hooks via require(), which can't load ESM .mjs. The rcedit package is ESM-only, so the CommonJS hook dynamically import()s it.
- Verify with extraction. I ran [System.Drawing.Icon]::ExtractAssociatedIcon on each artifact afterwards: Nova.exe, Nova-Setup-0.1.0-x64.exe, Nova-Portable-0.1.0-x64.exe. All three carried the orb.
The Windows icon cache
Even after the build was correct, my Explorer thumbnails kept showing the old default icon. Windows caches .exe icons by filename, and my rebuilt artifacts had unchanged filenames. The taskbar icon updated immediately because Windows pulls that one live; folder views were stuck. Fix: ie4uinit.exe -show. Or just trust the running app. Don't waste an hour on this twice like I did.
2. The router that wouldn't die
My second day of "polish" started with a renderer banner that wouldn't go away: "MAX ROUTER NOT RUNNING — EXIT CODE 1." I'd seen this banner before — for a few seconds — when the supervisor was warming up. This time it stayed pinned for an hour.
I went to the dev log:
[max-router] Error: listen EADDRINUSE: address already in use 0.0.0.0:3000
[max-router] supervisor: backing off 1s
[max-router] supervisor: respawning…
[max-router] Error: listen EADDRINUSE: address already in use 0.0.0.0:3000
… (repeat forever)
So the supervisor was respawning correctly. The respawned router was failing to bind because something else held port 3000. Something else turned out to be a previous router from a previous Nova session that hadn't been cleaned up when I'd quit Nova twenty minutes earlier.
Process listing confirmed it:
node.exe PID 109076 started 14:35
command: node …\anthropic-max-router\dist\router\server.js
taskkill //F //PID 109076 killed it (the doubled slashes are Git Bash escaping; in cmd or PowerShell the flags are plain /F and /PID). Port 3000 was free again. The next supervisor respawn succeeded. Banner cleared.
That's the immediate fix. But why did the orphan exist in the first place?
3. The Windows process tree that broke child.kill()
The supervisor was already calling child.kill() on app quit. So why did the router survive?
Because of shell: true. Earlier I'd written about the spawn EINVAL on Windows .cmd shims; the fix was to spawn with a shell. That fix has a cost. When you spawn npx --yes anthropic-max-router with shell: true on Windows, you don't get one process. You get three:
cmd.exe (shell wrapper)
└── npx (which is itself a .cmd script)
    └── node.exe (the actual router)
child.kill() terminates only the process it spawned, which here is cmd.exe. Windows has no POSIX-style signals to forward, and terminating a process does not terminate its children or grandchildren. The cmd dies; node.exe survives, holding port 3000, oblivious. Next time Nova boots, EADDRINUSE.
The fix: killChildTree
I added a helper in apps/desktop/src/main/max-router.ts:
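import { spawnSync, type ChildProcess } from 'node:child_process';

function killChildTree(child: ChildProcess) {
  if (process.platform === 'win32' && child.pid) {
    // /F forces termination; /T also terminates every descendant of the PID.
    spawnSync('taskkill', ['/F', '/T', '/PID', String(child.pid)]);
  } else {
    child.kill();
  }
}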
The two flags that matter: /F is force, /T is "tree" — kill the named PID and every descendant. That walks the cmd → npx → node tree and terminates all three together.
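To make "every descendant" concrete, here is a sketch of a tree walk over a (pid, parent pid) table. It is an illustration of the idea only, not how taskkill is implemented, and the PIDs and the collectTree name are made up:

```typescript
// One row of a process listing: a PID and the PID of its parent.
interface ProcRow {
  pid: number;
  ppid: number;
}

// Collect rootPid plus every descendant, breadth-first.
function collectTree(rows: ProcRow[], rootPid: number): number[] {
  const childrenOf = new Map<number, number[]>();
  for (const { pid, ppid } of rows) {
    const list = childrenOf.get(ppid) ?? [];
    list.push(pid);
    childrenOf.set(ppid, list);
  }
  const seen = new Set<number>([rootPid]);
  const queue = [rootPid];
  // The queue grows while we iterate, so newly found children get visited too.
  for (const pid of queue) {
    for (const child of childrenOf.get(pid) ?? []) {
      if (!seen.has(child)) {
        seen.add(child);
        queue.push(child);
      }
    }
  }
  return queue;
}

// Hypothetical PIDs for the cmd -> npx -> node chain, plus an unrelated process.
const rows: ProcRow[] = [
  { pid: 100, ppid: 1 },   // cmd.exe
  { pid: 101, ppid: 100 }, // npx
  { pid: 102, ppid: 101 }, // node.exe (the router)
  { pid: 200, ppid: 1 },   // unrelated
];
console.log(collectTree(rows, 100)); // [ 100, 101, 102 ]
```

Killing only PID 100 is what child.kill() did; the list above is what has to go for port 3000 to be released.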
spawnSync instead of spawn matters too. Electron's app.on('will-quit') handler has to finish before the process actually exits. An async kill races the exit; a sync kill blocks it long enough for the children to die.
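Because the kill is synchronous, its result is available before the quit handler returns, which makes it easy to notice when taskkill itself failed. A sketch of that idea, assuming the runner is injected so it can be exercised without Windows; killTreeChecked and its return labels are hypothetical, not what Nova ships:

```typescript
// The subset of spawnSync's result that this sketch cares about.
interface SyncResult {
  status: number | null;
  error?: Error;
}
type RunSync = (command: string, args: string[]) => SyncResult;

// Hypothetical variant of killChildTree that reports what happened instead of ignoring it.
function killTreeChecked(
  pid: number | undefined,
  platform: string,
  runSync: RunSync,
  fallbackKill: () => void,
): 'tree-killed' | 'taskkill-failed' | 'fallback' {
  if (platform !== 'win32' || pid === undefined) {
    fallbackKill();
    return 'fallback';
  }
  const result = runSync('taskkill', ['/F', '/T', '/PID', String(pid)]);
  // A spawn error or a non-zero exit status means the tree may still be alive.
  return result.error || result.status !== 0 ? 'taskkill-failed' : 'tree-killed';
}
```

In a real helper, runSync would be a thin wrapper around spawnSync and fallbackKill would call child.kill(); a 'taskkill-failed' result is the cue to log loudly before the app exits.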
Replaced child.kill() at both call sites — the graceful stop() path and the health-check timeout path. pnpm -F @nova/desktop typecheck clean.
End-to-end verification
I made myself prove this fix worked, not just think it worked. Restarted dev, waited for the router's startup line, identified the electron main and the router node by PID. Sent a graceful WM_CLOSE to the electron PID via taskkill /PID <pid> (no /F) — Electron's will-quit fired, stopMaxRouter() ran, killChildTree() ran, taskkill /F /T /PID <cmd-pid> killed the whole tree. Post-close: the router PID was gone. Port 3000 had only a single TIME_WAIT closing-handshake socket, no LISTENING process. Rebuilt production artifacts. Shipped.
Updated the debug-workflow memory in my vault: pre-flight must sweep both port 5173 (vite) AND port 3000 (max-router orphan), not just electron.exe. The next time something gets stuck, I'll know which two ports to free.
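That pre-flight sweep can be sketched as a small parser over netstat output. This assumes the Windows `netstat -ano` column layout (Proto, Local Address, Foreign Address, State, PID); findListeners is a hypothetical name, and the sample rows are illustrative rather than copied from a real run:

```typescript
// Map each requested port to the PIDs that netstat reports as LISTENING on it.
function findListeners(netstatOutput: string, ports: number[]): Map<number, number[]> {
  const found = new Map<number, number[]>();
  for (const port of ports) {
    found.set(port, []);
  }
  for (const line of netstatOutput.split(/\r?\n/)) {
    // Expected columns: Proto, Local Address, Foreign Address, State, PID.
    const cols = line.trim().split(/\s+/);
    if (cols.length !== 5 || cols[0] !== 'TCP' || cols[3] !== 'LISTENING') continue;
    const local = cols[1];
    // Take everything after the last colon so IPv6 addresses like [::]:5173 also work.
    const port = Number(local.slice(local.lastIndexOf(':') + 1));
    const pid = Number(cols[4]);
    const pids = found.get(port);
    if (pids && pids.indexOf(pid) === -1) {
      pids.push(pid);
    }
  }
  return found;
}

// Illustrative input: one orphaned listener on 3000 and a leftover TIME_WAIT socket.
const sample = [
  '  Proto  Local Address          Foreign Address        State           PID',
  '  TCP    0.0.0.0:3000           0.0.0.0:0              LISTENING       109076',
  '  TCP    127.0.0.1:3000         127.0.0.1:51234        TIME_WAIT       0',
].join('\n');
console.log(findListeners(sample, [3000, 5173]));
```

With this sample, port 3000 maps to the one listening PID and 5173 to an empty list; the TIME_WAIT row is ignored, which matches the distinction that mattered in the verification run.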
The lesson, if there is one
Every one of these bugs was in code I would have called "done" the day before. Custom icon → "shipped." Router supervisor → "shipped." App quit → "shipped." All three were leaking on day-of-launch.
The thing they share is that I was treating a green build as a verification. A successful pnpm desktop:exe told me nothing about whether the icon was actually embedded; the script only verified that the file was produced. A successful Electron quit told me nothing about whether the children had actually died. I needed to look at the artifact and the system after the fact, with a different tool, before I trusted it.
For me that translates to: any time an artifact is written or a process is killed, I need a verifier that uses a different code path than the producer. ExtractAssociatedIcon for icons. Get-NetTCPConnection -LocalPort 3000 for ports. Get-Process for child cleanup. Cheap to run, expensive to skip.
Next up: the Lookup feature. A TradingView-style symbol panel built in a single cut, the skill bus pattern that lets a plug-in fire renderer events without becoming Electron-aware, and the Cmd+K palette that's quietly become my favorite way to use Nova.