Sébastien Stormacq 4815273dc3 Fix race condition in Lambda+LocalServer causing NIOAsyncWriter fatal error (Bug #635) (#636)
On fast machines, the local Lambda server crashes with:
```
Fatal error: Deinited NIOAsyncWriter without calling finish()
```

This occurs in `NIOAsyncChannelHandler.channelActive()` when child
connection channels are created.

## Root Cause

This is a known issue with NIO's async server channel API (see
[swift-nio#2637](https://github.com/apple/swift-nio/issues/2637)).

**The fundamental problem:**

1. The async `bind()` API creates `NIOAsyncChannel` instances for
incoming connections
2. These channels are yielded through an async stream to the server loop
3. When the serving task is cancelled (or completes), the async stream
iteration stops
4. Any channels that were accepted but not yet read from the stream are
dropped
5. These unread channels never have `executeThenClose()` called on them
6. Their `NIOAsyncWriter` is deallocated without `finish()` being called
→ fatal error

**Why graceful shutdown doesn't help:**

Even closing the server channel gracefully doesn't eliminate the race.
There is still a timing window where:
- A connection is accepted and queued in the async stream
- The server task is cancelled or completes
- The queued channel is never read and gets dropped

IMHO, this is an inherent limitation of the `async bind()` API when
combined with task cancellation.
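
The drop described above can be reproduced without NIO. The following is a minimal sketch using only the Swift standard library: `FakeConnection` and `LeakCounter` are hypothetical stand-ins (not NIO types), and where the real `NIOAsyncWriter` traps in `deinit`, the stand-in only counts the event. Connections that are yielded into an `AsyncStream` but never read are deallocated without `finish()` once the consumer stops iterating.

```swift
/// Counts connections that were deallocated without `finish()` being called.
/// Hypothetical helper for this sketch only.
final class LeakCounter {
    var count = 0
}

/// Hypothetical stand-in for a channel that owns a writer.
/// The real NIOAsyncWriter crashes in deinit; this one only records the event.
final class FakeConnection {
    private let leaks: LeakCounter
    private var finished = false

    init(leaks: LeakCounter) { self.leaks = leaks }

    func finish() { finished = true }

    deinit {
        if !finished { leaks.count += 1 }
    }
}

/// Queues `accepted` connections in a stream, consumes only `handled` of them,
/// and returns how many were dropped without `finish()`.
func simulateAcceptLoop(accepted: Int, handled: Int) async -> Int {
    let leaks = LeakCounter()
    do {
        let (stream, continuation) = AsyncStream.makeStream(of: FakeConnection.self)

        // "Accept" every connection up front: they sit in the stream's buffer.
        for _ in 0..<accepted {
            continuation.yield(FakeConnection(leaks: leaks))
        }
        continuation.finish()

        // The serving loop stops early, as it would on cancellation.
        var iterator = stream.makeAsyncIterator()
        var served = 0
        while served < handled, let connection = await iterator.next() {
            connection.finish()
            served += 1
        }
        // Leaving this scope releases the stream and everything still buffered in it.
    }
    return leaks.count
}
```

Calling `await simulateAcceptLoop(accepted: 3, handled: 1)` leaves two queued connections unread; with the real writer, each of them would hit the fatal error instead of incrementing a counter.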

## Solution

I stopped using the `async bind()` API entirely. Instead, I use the
traditional callback-based `childChannelInitializer`:

1. Create `NIOAsyncChannel` directly in `childChannelInitializer`
(synchronous context)
2. Immediately spawn a `Task.detached` to handle the connection
3. Each connection is handled independently, not through a cancellable
async stream
4. Detached tasks are not affected by task group cancellation
5. Every channel has `executeThenClose()` called immediately, preventing
the writer from being dropped

This approach avoids the async stream entirely, eliminating the race
condition.
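
Point 4 can be checked in isolation. The sketch below uses only the Swift standard library and is not the code from this change: `CancellationReport` and `compareCancellation()` are hypothetical names introduced for illustration. It cancels a task group and reports whether a structured child task and a `Task.detached` created inside the group body each observed the cancellation.

```swift
/// What each kind of task observed after the group was cancelled.
/// Hypothetical type for this sketch only.
struct CancellationReport {
    let childSawCancellation: Bool
    let detachedSawCancellation: Bool
}

func compareCancellation() async -> CancellationReport {
    await withTaskGroup(of: Bool.self, returning: CancellationReport.self) { group in
        // Structured child task: it inherits cancellation from the group.
        group.addTask {
            // Task.sleep throws as soon as the task is cancelled.
            try? await Task.sleep(nanoseconds: 5_000_000_000)
            return Task.isCancelled
        }

        // Unstructured task: created inside the group body, but not a member of it.
        let detached = Task.detached { () -> Bool in
            try? await Task.sleep(nanoseconds: 50_000_000)
            return Task.isCancelled
        }

        // Cancel the group, as happens when the server task is torn down.
        group.cancelAll()

        let childSawCancellation = await group.next() ?? false
        let detachedSawCancellation = await detached.value
        return CancellationReport(
            childSawCancellation: childSawCancellation,
            detachedSawCancellation: detachedSawCancellation
        )
    }
}
```

The child task reports that it was cancelled, while the detached task runs to completion untouched. That independence is what lets each connection reach `executeThenClose()`, and it is also why the detached tasks have to be accounted for separately at shutdown.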

## Changes

- Replaced `async bind()` with traditional `childChannelInitializer`
- Each connection spawns a `Task.detached` that immediately calls
`executeThenClose()`
- Removed the connection iteration loop (no longer needed)
- Server task now simply waits for the channel to close
- Simplified shutdown logic since there's no async stream to drain

## Trade-offs

- Uses `Task.detached` (unstructured concurrency) to bridge NIO's
event-loop world with Swift concurrency
- This is necessary until NIO provides a new bootstrap API that properly
handles cancellation
- Each connection is handled independently rather than through
structured concurrency

## Testing

Tested on fast machines where the race condition was reliably
reproducible. The crash no longer occurs.

## References

- [swift-nio#2637](https://github.com/apple/swift-nio/issues/2637) -
Known issue with async server channels and cancellation
- [Comment from NIO
maintainer](https://github.com/apple/swift-nio/issues/2637#issuecomment-1921317577) -
Recommends avoiding cancellation or using callback-based API

Fixes #635

---------

Co-authored-by: Sebastien Stormacq <stormacq@amazon.lu>
2026-01-27 09:10:58 +00:00