3 Commits

Author SHA1 Message Date
Sébastien Stormacq fef8c0d130 Fix for swift test : _Concurrency/Executor.swift:357: Fatal error (#639)
Fix the issue described at
https://github.com/awslabs/swift-aws-lambda-runtime/issues/640

Here is the proposed fix:

I added a new function, `assumeIsolatedOnEventLoop`, a nonisolated
method that:
1. Calls `self.eventLoop.preconditionInEventLoop()` to verify we're on
the correct event loop (NIO's own thread-identity check, which always
works)
2. Uses `unsafeBitCast` to strip the `isolated` annotation from the
closure type, the same pattern NIO uses internally and that I found on
the Swift Forums.

See:
https://github.com/swiftlang/swift/blob/main/stdlib/public/Concurrency/ExecutorAssertions.swift#L348

See:
https://forums.swift.org/t/actor-assumeisolated-erroneously-crashes-when-using-a-dispatch-queue-as-the-underlying-executor/72434/3
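
The two steps above can be sketched without NIO as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `Client`, `isOnExecutor`, `handleEvent`, and `simulateEvent` are hypothetical names, and the injected `isOnExecutor` closure stands in for `eventLoop.preconditionInEventLoop()`. The cast mirrors what the standard library's `assumeIsolated` does after its own executor check.

```swift
actor Client {
    private var counter = 0

    // Hypothetical stand-in for `eventLoop.preconditionInEventLoop()`.
    private let isOnExecutor: @Sendable () -> Bool

    init(isOnExecutor: @escaping @Sendable () -> Bool) {
        self.isOnExecutor = isOnExecutor
    }

    nonisolated func assumeIsolatedOnEventLoop<T: Sendable>(
        _ operation: (isolated Client) -> T
    ) -> T {
        typealias YesActor = (isolated Client) -> T
        typealias NoActor = (Client) -> T

        // Step 1: crash early if we are not where we claim to be.
        precondition(self.isOnExecutor(), "not on the expected executor")

        // Step 2: pretend the closure escapes so it can be bit-cast to the
        // same function type without the `isolated` annotation, then call it.
        return withoutActuallyEscaping(operation) { (fn: @escaping YesActor) -> T in
            let rawFn = unsafeBitCast(fn, to: NoActor.self)
            return rawFn(self)
        }
    }

    // A synchronous, nonisolated callback, like one invoked by a channel handler.
    nonisolated func handleEvent() -> Int {
        return assumeIsolatedOnEventLoop { (client: isolated Client) -> Int in
            client.counter += 1
            return client.counter
        }
    }

    // Runs on the actor, so the synchronous call below really is isolated.
    func simulateEvent() -> Int {
        return handleEvent()
    }
}

let client = Client(isOnExecutor: { true })
let value = await client.simulateEvent()
print(value)
```

The cast is only sound when the check in step 1 is trustworthy; with a check that always passes, as in this demo, the caller is responsible for actually being on the actor's executor.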

---------

Co-authored-by: Sebastien Stormacq <stormacq@amazon.lu>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-02-14 13:05:33 +01:00
Sébastien Stormacq 102f92aafb Fix race condition crash in LambdaRuntimeClient channel lifecycle (Bug #624) (#632)
This PR fixes a race condition in `LambdaRuntimeClient` that causes a
fatal crash when an old channel's `closeFuture` callback fires after a
new connection has been established. The fix adds proper channel
lifecycle tracking and replaces the fatal error with graceful handling.

## Problem

**Crash Location**: `LambdaRuntimeClient.swift:270` in `channelClosed()`

**Error Message**:
```
Fatal error: Invalid state: connected(SocketChannel { ... }), closed
```

**Root Cause**: Race condition where:
1. An old channel's `closeFuture` callback fires
2. AFTER a new connection has been established (`connectionState =
.connected`)
3. BUT `closingState` is `.closed` from a previous close operation
4. The code asserted this was impossible and crashed with `fatalError`

This can occur when:
- Network conditions cause delayed channel cleanup
- Connection is recycled quickly (old channel still closing while new
one connects)
- Timing issues between channel close callbacks and new connection
establishment

## Solution

### Key Changes

1. **Added channel identity tracking**:
   ```swift
   private var channelsBeingClosed: Set<ObjectIdentifier> = []
   ```
   Tracks which channels are in the process of closing to distinguish old
   channels from the current one.

2. **Enhanced `connectionWillClose()`**:
   - Marks channels as "being closed" using `ObjectIdentifier`
   - Adds logging when old channels close while new connection is active

3. **Rewrote `channelClosed()` with defensive logic**:
   - **Early return for tracked old channels**: Handles them gracefully
     without affecting current connection
   - **Replaced `fatalError` with warning log**: The `(_, .closed)` case
     now logs a warning instead of crashing
   - **Channel identity checks**: Only transitions state if the closing
     channel is the CURRENT channel
   - **Removed unconditional state change**: Previously set
     `connectionState = .disconnected` for ANY channel close, now only for
     the current channel
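
The identity-tracking idea in these changes can be sketched in isolation. This is a simplified model under stated assumptions, not the actual `LambdaRuntimeClient` code: `FakeChannel`, `ChannelLifecycle`, `connected(_:)`, `current`, and `warnings` are hypothetical, and the real `connectionState` / `closingState` pair is collapsed into a single optional.

```swift
// Hypothetical stand-in for a NIO `Channel`; only object identity matters here.
final class FakeChannel {}

struct ChannelLifecycle {
    private(set) var current: FakeChannel?
    private(set) var channelsBeingClosed: Set<ObjectIdentifier> = []
    private(set) var warnings: [String] = []

    mutating func connected(_ channel: FakeChannel) {
        current = channel
    }

    // Mark the channel as "being closed" so a late close callback is recognized.
    mutating func connectionWillClose(_ channel: FakeChannel) {
        channelsBeingClosed.insert(ObjectIdentifier(channel))
    }

    mutating func channelClosed(_ channel: FakeChannel) {
        let wasTracked = channelsBeingClosed.remove(ObjectIdentifier(channel)) != nil
        let isCurrent = current === channel

        // Early return: a tracked old channel must not touch the current connection.
        if wasTracked && !isCurrent { return }

        // Only the current channel's close transitions the state.
        if isCurrent {
            current = nil
            return
        }

        // Unexpected close: record a warning instead of calling fatalError.
        warnings.append("untracked channel closed; ignoring")
    }
}

var lifecycle = ChannelLifecycle()
let oldChannel = FakeChannel()
let newChannel = FakeChannel()

lifecycle.connected(oldChannel)
lifecycle.connectionWillClose(oldChannel)
lifecycle.connected(newChannel)       // reconnect before the old close callback fires
lifecycle.channelClosed(oldChannel)   // the late callback from the old channel

print(lifecycle.current === newChannel)
```

An `ObjectIdentifier` is only guaranteed unique while its object is alive, which is one reason the sketch removes the entry as soon as the close callback arrives.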

### Why This Fixes the Bug

The fix addresses the race condition by:
- Distinguishing between "current channel closing" vs "old channel
closing"
- Handling old channel closes gracefully without crashing or corrupting
state
- Not overwriting connection state when old channels close
- Providing visibility through logging when the race condition occurs

## Changes

### Modified Files

- **Sources/AWSLambdaRuntime/HTTPClient/LambdaRuntimeClient.swift**
  - Added `channelsBeingClosed: Set<ObjectIdentifier>` property
  - Enhanced `connectionWillClose()` with channel tracking
  - Rewrote `channelClosed()` with defensive logic and identity checks
  - Replaced `fatalError` with warning log for unexpected states
  - Removed unconditional state change in `closeFuture` callback

**Lines Changed**: ~150 lines modified/added

**Backward Compatibility**: Fully compatible, no API changes

## Testing

### All Existing Tests Pass

```bash
swift test
# Result: 91 tests passed in 14 suites
```

All original functionality is preserved with no regressions.

### ⚠️ Note on Test Coverage

While we cannot reproduce the exact race condition from bug #624 in a
deterministic test (it requires specific network timing), the fix:
- Is logically sound for the described race condition
- Improves defensive programming around channel lifecycle
- Replaces a fatal crash with graceful handling + logging
- Should prevent the crash by properly tracking channel identity

## Related Issues

Fixes #624

---------

Co-authored-by: Sebastien Stormacq <stormacq@amazon.lu>
2026-01-27 08:53:31 +00:00
Sébastien Stormacq e0f064a93e Refactor project directories (#621)
This PR refactors the project's directories.
As the number of source files grows, I created subdirectories to
separate the runtime itself from its HTTP client (`RuntimeClient`) and
local HTTP server (`Lambda+LocalServer`).

The new layout looks like this:

```text
Sources
├── AWSLambdaRuntime
│   ├── FoundationSupport
│   │   ├── Context+Foundation.swift
│   │   ├── Lambda+JSON.swift
│   │   └── Vendored
│   │       ├── ByteBuffer-foundation.swift
│   │       └── JSON+ByteBuffer.swift
│   ├── HTTPClient
│   │   ├── ControlPlaneRequest.swift
│   │   ├── ControlPlaneRequestEncoder.swift
│   │   ├── LambdaRuntimeClient+ChannelHandler.swift
│   │   ├── LambdaRuntimeClient.swift
│   │   └── LambdaRuntimeClientProtocol.swift
│   ├── HTTPServer
│   │   ├── Lambda+LocalServer+Pool.swift
│   │   └── Lambda+LocalServer.swift
│   ├── Lambda.swift
│   ├── LambdaClock.swift
│   ├── LambdaContext.swift
│   ├── LambdaRequestID.swift
│   ├── LambdaResponseStreamWriter+Headers.swift
│   ├── LambdaRuntimeError.swift
│   ├── Runtime
│   │   ├── LambdaHandlers.swift
│   │   ├── LambdaRuntime+Codable.swift
│   │   ├── LambdaRuntime+Handler.swift
│   │   ├── LambdaRuntime+ServiceLifecycle.swift
│   │   └── LambdaRuntime.swift
│   ├── SendableMetatype.swift
│   ├── Utils.swift
│   └── Version.swift
└── MockServer
    └── MockHTTPServer.swift
```
2026-01-01 11:19:13 -05:00