Files
async-http-client/Sources/AsyncHTTPClient/ConnectionPool/HTTP1/HTTP1ConnectionStateMachine.swift
T
Fabian Fett e2ab0d176f Full support for bidirectional streaming (#879)
> ## Note: 
> This is a long LLM generated PR description. However it captures very
well, what has been changed and has already been reduced for brevity.
The PR is sadly quite complex but I think the description captures the
changes quite well.

This is foundational work needed to properly support HTTP trailers and
scenarios where the server sends a complete response before the client
finishes uploading (e.g., early rejection, 100-continue flows, or
bidirectional streaming protocols).

## Changes

### State Machine Improvements

- **Added `endForwarded` state** to
`Transaction.StateMachine.RequestStreamState`
- This new state distinguishes between "request data forwarded to the
channel" and "request data written to the network"
- Properly handles the race condition where response completes before
the request write completes

- **Renamed `succeedRequest` → `forwardResponseEnd`** in both
`HTTPRequestStateMachine.Action` and
`HTTP1ConnectionStateMachine.Action`
- Better reflects the semantic meaning: we're forwarding the end of the
response stream, not necessarily succeeding the entire request yet
  - More accurate naming for bidirectional streaming scenarios

### Protocol Changes

- **Added `requestBodyStreamSent()` to `HTTPExecutableRequest`
protocol**
- Called by the channel handler when the request body stream has been
fully written to the network
- Allows proper coordination between request and response stream
completion
  - Implemented in both `Transaction` and `RequestBag`

### Request State Machine Updates

- **Updated `FinalSuccessfulRequestAction`**
- Changed `.sendRequestEnd(EventLoopPromise<Void>?)` to simpler
`.requestDone`
- Added `.none` case for when response completes but request is still
in-flight
- Removed the need to pass promises around, simplifying the state
machine

- **`sendRequestEnd` action now includes
`FinalSuccessfulRequestAction`**
- Allows the state machine to signal what should happen after the
request completes
- Enables proper cleanup coordination (idle connection, close, or
continue)

### Channel Handler Updates

- **HTTP1ClientChannelHandler**
- `sendRequestEnd` now properly handles scenarios where response has
already completed
- Added future callback to coordinate request completion with final
actions
- Properly manages connection state (idle vs close) based on both
streams completing

- **HTTP2ClientRequestHandler**
  - Updated to handle new `sendRequestEnd` signature
  - Properly ignores HTTP/1-specific final actions (like `.requestDone`)

### RequestBag State Machine

- **Added `endReceived` state to `ResponseStreamState`**
- Tracks when the response has completed while request is still ongoing
- Enables proper sequencing: response end → request end → task
completion

- **Updated `FinishAction`**
- Added `.forwardStreamFinishedAndSucceedTask` for the case where both
streams complete simultaneously
  - Ensures delegate methods are called in the correct order

### Error Handling

- **Improved failure handling in `Transaction.StateMachine`**
- Now properly handles errors that occur after response completes but
before request finishes
  - Added `cancelExecutor` action to the fail path
- Executor is now passed to `failRequestStreamContinuation` for proper
cleanup

## Technical Details

### The Problem

Previously, when a server sent a complete response before the client
finished uploading the request body, AHC would:
1. Receive the full response (head, body, end)
2. But NOT inform the user that the response was complete if the request
was still streaming
3. Only succeed the request after both streams completed

This made it impossible to implement proper bidirectional streaming or
handle scenarios like:
- Server rejecting a large upload early (e.g., 413 Payload Too Large)
- 100-continue flows where the server responds before request completes
- HTTP trailers sent by the server

### The Solution

The new state machine properly tracks four completion states:
1. **Neither complete**: Normal request/response in flight
2. **Response complete, request ongoing**: New
`endForwarded`/`endReceived` states
3. **Request complete, response ongoing**: Existing logic
4. **Both complete**: Request succeeds

The key insight is the `endForwarded` state, which represents "we've
given all request data to the channel, but it hasn't been written to the
network yet". This allows us to:
- Immediately forward response completion to the user
- Wait for the write to complete before cleaning up resources
- Properly sequence connection state transitions

## Future Work

This PR lays the groundwork for:
- Proper internal HTTP trailer support (both sending and receiving)

---------

Co-authored-by: George Barnett <gbarnett@apple.com>
2026-02-03 12:21:31 +01:00

546 lines
20 KiB
Swift

//===----------------------------------------------------------------------===//
//
// This source file is part of the AsyncHTTPClient open source project
//
// Copyright (c) 2021 Apple Inc. and the AsyncHTTPClient project authors
// Licensed under Apache License v2.0
//
// See LICENSE.txt for license information
// See CONTRIBUTORS.txt for the list of AsyncHTTPClient project authors
//
// SPDX-License-Identifier: Apache-2.0
//
//===----------------------------------------------------------------------===//
import NIOCore
import NIOHTTP1
struct HTTP1ConnectionStateMachine {
fileprivate enum State {
case initialized
case idle
case inRequest(HTTPRequestStateMachine, close: Bool)
case closing
case closed
case modifying
}
enum Action {
/// An additional action to execute, when either the response or request stream has finished.
enum FinalSuccessfulStreamAction {
/// Nothing todo
case none
/// Close the connection
case close
/// Inform an observer that the connection has become idle
case informConnectionIsIdle
}
/// A action to execute, when we consider a request "done".
enum FinalFailedStreamAction {
/// Close the connection
///
/// The promise is an optional write promise.
case close(EventLoopPromise<Void>?)
/// Inform an observer that the connection has become idle
case informConnectionIsIdle
/// Fail the write promise
case failWritePromise(EventLoopPromise<Void>?)
/// Do nothing.
case none
}
case sendRequestHead(HTTPRequestHead, sendEnd: Bool)
case notifyRequestHeadSendSuccessfully(
resumeRequestBodyStream: Bool,
startIdleTimer: Bool
)
case sendBodyPart(IOData, EventLoopPromise<Void>?)
case sendRequestEnd(EventLoopPromise<Void>?, FinalSuccessfulStreamAction)
case failSendBodyPart(Error, EventLoopPromise<Void>?)
case failSendStreamFinished(Error, EventLoopPromise<Void>?)
case pauseRequestBodyStream
case resumeRequestBodyStream
case forwardResponseHead(HTTPResponseHead, pauseRequestBodyStream: Bool)
case forwardResponseBodyParts(CircularBuffer<ByteBuffer>)
case forwardResponseEnd(FinalSuccessfulStreamAction, CircularBuffer<ByteBuffer>)
case failRequest(Error, FinalFailedStreamAction)
case read
case close
case wait
case fireChannelActive
case fireChannelInactive
case fireChannelError(Error, closeConnection: Bool)
}
private var state: State
private var isChannelWritable: Bool = true
init() {
self.state = .initialized
}
mutating func channelActive(isWritable: Bool) -> Action {
switch self.state {
case .initialized:
self.isChannelWritable = isWritable
self.state = .idle
return .fireChannelActive
case .idle, .inRequest, .closing, .closed:
// Since NIO triggers promise before pipeline, the handler might have been added to the
// pipeline, before the channelActive callback was triggered. For this reason, we might
// get the channelActive call twice
return .wait
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func channelInactive() -> Action {
switch self.state {
case .initialized:
fatalError("A channel that isn't active, must not become inactive")
case .inRequest(var requestStateMachine, close: _):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.channelInactive()
state = .closed
return state.modify(with: action)
}
case .idle, .closing:
self.state = .closed
return .fireChannelInactive
case .closed:
return .wait
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func errorHappened(_ error: Error) -> Action {
switch self.state {
case .initialized:
self.state = .closed
return .fireChannelError(error, closeConnection: false)
case .inRequest(var requestStateMachine, let close):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.errorHappened(error)
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
case .idle:
self.state = .closing
return .fireChannelError(error, closeConnection: true)
case .closing, .closed:
return .fireChannelError(error, closeConnection: false)
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func writabilityChanged(writable: Bool) -> Action {
self.isChannelWritable = writable
switch self.state {
case .initialized, .idle, .closing, .closed:
return .wait
case .inRequest(var requestStateMachine, let close):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.writabilityChanged(writable: writable)
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func runNewRequest(
head: HTTPRequestHead,
metadata: RequestFramingMetadata
) -> Action {
switch self.state {
case .initialized, .inRequest:
// These states are unreachable as the connection pool state machine has put the
// connection into these states. In other words the connection pool state machine must
// be aware about these states before the connection itself. For this reason the
// connection pool state machine must not send a new request to the connection, if the
// connection is `.initialized`, `.closing` or `.inRequest`
fatalError("Invalid state: \(self.state)")
case .closing, .closed:
// The remote may have closed the connection and the connection pool state machine
// was not updated yet because of a race condition. New request vs. marking connection
// as closed.
//
// TODO: AHC should support a fast rescheduling mechanism here.
return .failRequest(HTTPClientError.remoteConnectionClosed, .none)
case .idle:
var requestStateMachine = HTTPRequestStateMachine(isChannelWritable: self.isChannelWritable)
let action = requestStateMachine.startRequest(head: head, metadata: metadata)
// by default we assume a persistent connection. however in `requestVerified`, we read the
// "connection" header.
self.state = .inRequest(requestStateMachine, close: metadata.connectionClose)
return self.state.modify(with: action)
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func requestStreamPartReceived(_ part: IOData, promise: EventLoopPromise<Void>?) -> Action {
guard case .inRequest(var requestStateMachine, let close) = self.state else {
fatalError("Invalid state: \(self.state)")
}
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.requestStreamPartReceived(part, promise: promise)
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
}
mutating func requestStreamFinished(promise: EventLoopPromise<Void>?) -> Action {
guard case .inRequest(var requestStateMachine, let close) = self.state else {
fatalError("Invalid state: \(self.state)")
}
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.requestStreamFinished(promise: promise)
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
}
mutating func requestCancelled(closeConnection: Bool) -> Action {
switch self.state {
case .initialized:
fatalError(
"This event must only happen, if the connection is leased. During startup this is impossible. Invalid state: \(self.state)"
)
case .idle:
if closeConnection {
self.state = .closing
return .close
} else {
return .wait
}
case .inRequest(var requestStateMachine, let close):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.requestCancelled()
state = .inRequest(requestStateMachine, close: close || closeConnection)
return state.modify(with: action)
}
case .closing, .closed:
return .wait
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
// MARK: - Response
mutating func read() -> Action {
switch self.state {
case .initialized:
fatalError("Why should we read something, if we are not connected yet")
case .idle:
return .read
case .inRequest(var requestStateMachine, let close):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.read()
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
case .closing, .closed:
// there might be a race in us closing the connection and receiving another read event
return .read
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func channelRead(_ part: HTTPClientResponsePart) -> Action {
switch self.state {
case .initialized, .idle:
fatalError("Invalid state: \(self.state)")
case .inRequest(var requestStateMachine, var close):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.channelRead(part)
if case .head(let head) = part, close == false {
// since the HTTPClient does not support protocol switching, we must close any
// connection that has received a status `.switchingProtocols`
close = !head.isKeepAlive || head.status == .switchingProtocols
}
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
case .closing, .closed:
return .wait
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func channelReadComplete() -> Action {
switch self.state {
case .initialized, .idle, .closing, .closed:
return .wait
case .inRequest(var requestStateMachine, let close):
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.channelReadComplete()
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
mutating func demandMoreResponseBodyParts() -> Action {
guard case .inRequest(var requestStateMachine, let close) = self.state else {
fatalError("Invalid state: \(self.state)")
}
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.demandMoreResponseBodyParts()
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
}
mutating func idleReadTimeoutTriggered() -> Action {
guard case .inRequest(var requestStateMachine, let close) = self.state else {
fatalError("Invalid state: \(self.state)")
}
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.idleReadTimeoutTriggered()
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
}
mutating func idleWriteTimeoutTriggered() -> Action {
guard case .inRequest(var requestStateMachine, let close) = self.state else {
return .wait
}
return self.avoidingStateMachineCoW { state -> Action in
let action = requestStateMachine.idleWriteTimeoutTriggered()
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
}
mutating func headSent() -> Action {
guard case .inRequest(var requestStateMachine, let close) = self.state else {
return .wait
}
return self.avoidingStateMachineCoW { state in
let action = requestStateMachine.headSent()
state = .inRequest(requestStateMachine, close: close)
return state.modify(with: action)
}
}
}
extension HTTP1ConnectionStateMachine {
/// So, uh...this function needs some explaining.
///
/// While the state machine logic above is great, there is a downside to having all of the state machine data in
/// associated data on enumerations: any modification of that data will trigger copy on write for heap-allocated
/// data. That means that for _every operation on the state machine_ we will CoW our underlying state, which is
/// not good.
///
/// The way we can avoid this is by using this helper function. It will temporarily set state to a value with no
/// associated data, before attempting the body of the function. It will also verify that the state machine never
/// remains in this bad state.
///
/// A key note here is that all callers must ensure that they return to a good state before they exit.
///
/// Sadly, because it's generic and has a closure, we need to force it to be inlined at all call sites, which is
/// not ideal.
@inline(__always)
private mutating func avoidingStateMachineCoW<ReturnType>(_ body: (inout State) -> ReturnType) -> ReturnType {
self.state = .modifying
defer {
assert(!self.isModifying)
}
return body(&self.state)
}
private var isModifying: Bool {
if case .modifying = self.state {
return true
} else {
return false
}
}
}
extension HTTP1ConnectionStateMachine.State {
fileprivate mutating func modify(with action: HTTPRequestStateMachine.Action) -> HTTP1ConnectionStateMachine.Action
{
switch action {
case .sendRequestHead(let head, let sendEnd):
return .sendRequestHead(head, sendEnd: sendEnd)
case .notifyRequestHeadSendSuccessfully(let resumeRequestBodyStream, let startIdleTimer):
return .notifyRequestHeadSendSuccessfully(
resumeRequestBodyStream: resumeRequestBodyStream,
startIdleTimer: startIdleTimer
)
case .pauseRequestBodyStream:
return .pauseRequestBodyStream
case .resumeRequestBodyStream:
return .resumeRequestBodyStream
case .sendBodyPart(let part, let writePromise):
return .sendBodyPart(part, writePromise)
case .sendRequestEnd(let writePromise, let finalAction):
guard case .inRequest(_, close: let close) = self else {
assertionFailure("Invalid state: \(self)")
self = .closing
return .failRequest(HTTPClientError.internalStateFailure(), .close(writePromise))
}
let newFinalAction: HTTP1ConnectionStateMachine.Action.FinalSuccessfulStreamAction
switch finalAction {
case .close:
self = .closing
newFinalAction = .close
case .requestDone:
if close {
self = .closing
newFinalAction = .close
} else {
self = .idle
newFinalAction = .informConnectionIsIdle
}
case .none:
newFinalAction = .none
}
return .sendRequestEnd(writePromise, newFinalAction)
case .forwardResponseHead(let head, let pauseRequestBodyStream):
return .forwardResponseHead(head, pauseRequestBodyStream: pauseRequestBodyStream)
case .forwardResponseBodyParts(let parts):
return .forwardResponseBodyParts(parts)
case .forwardResponseEnd(let finalAction, let finalParts):
guard case .inRequest(_, close: let close) = self else {
assertionFailure("Invalid state: \(self)")
self = .closing
return .failRequest(HTTPClientError.internalStateFailure(), .close(nil))
}
let newFinalAction: HTTP1ConnectionStateMachine.Action.FinalSuccessfulStreamAction
switch finalAction {
case .close:
self = .closing
newFinalAction = .close
case .requestDone:
if close {
self = .closing
newFinalAction = .close
} else {
self = .idle
newFinalAction = .informConnectionIsIdle
}
case .none:
// request is ongoing. request stream is still alive
newFinalAction = .none
}
return .forwardResponseEnd(newFinalAction, finalParts)
case .failRequest(let error, let finalAction):
switch self {
case .initialized:
fatalError("Invalid state: \(self)")
case .idle:
fatalError("How can we fail a task, if we are idle")
case .inRequest(_, let close):
if case .close(let promise) = finalAction {
self = .closing
return .failRequest(error, .close(promise))
} else if close {
self = .closing
return .failRequest(error, .close(nil))
} else {
self = .idle
return .failRequest(error, .informConnectionIsIdle)
}
case .closing:
return .failRequest(error, .none)
case .closed:
// this state can be reached, if the connection was unexpectedly closed by remote
return .failRequest(error, .none)
case .modifying:
fatalError("Invalid state: \(self)")
}
case .read:
return .read
case .wait:
return .wait
case .failSendBodyPart(let error, let writePromise):
return .failSendBodyPart(error, writePromise)
case .failSendStreamFinished(let error, let writePromise):
return .failSendStreamFinished(error, writePromise)
}
}
}
extension HTTP1ConnectionStateMachine: CustomStringConvertible {
var description: String {
switch self.state {
case .initialized:
return ".initialized"
case .idle:
return ".idle"
case .inRequest(let request, let close):
return ".inRequest(\(request), closeAfterRequest: \(close))"
case .closing:
return ".closing"
case .closed:
return ".closed"
case .modifying:
fatalError("Invalid state: \(self.state)")
}
}
}