Stacked on #29551
Flight pings much more often than Fizz because async function components
will always take at least a microtask to resolve . Rather than
scheduling this work as a new macrotask Flight now schedules pings in a
microtask. This allows more microtasks to ping before actually doing a
work flush but doesn't force the vm to spin up a new task which is quite
common give n the nature of Server Components
A while back we implemented a heuristic that if a chunk was large it was
assumed to be produced by the render and thus was safe to stream which
results in transferring the underlying object memory. Later we ran into
an issue where a precomputed chunk grew large enough to trigger this
hueristic and it started causing renders to fail because once a second
render had occurred the precomputed chunk would not have an underlying
buffer of bytes to send and these bytes would be omitted from the
stream. We implemented a technique to detect large precomputed chunks
and we enforced that these always be cloned before writing.
Unfortunately our test coverage was not perfect and there has been for a
very long time now a usage pattern where if you complete a boundary in
one flush and then complete a boundary that has stylehsheet dependencies
in another flush you can get a large precomputed chunk that was not
being cloned to be sent twice causing streaming errors.
I've thought about why we even went with this solution in the first
place and I think it was a mistake. It relies on a dev only check to
catch paired with potentially version specific order of operations on
the streaming side. This is too unreliable. Additionally the low limit
of view size for Edge is not used in Node.js but there is not real
justification for this.
In this change I updated the view size for edge streaming to match Node
at 2048 bytes which is still relatively small and we have no data one
way or another to preference 512 over this. Then I updated the assertion
logic to error anytime a precomputed chunk exceeds the size. This
eliminates the need to clone these chunks by just making sure our view
size is always larger than the largest precomputed chunk we can possibly
write. I'm generally in favor of this for a few reasons.
First, we'll always know during testing whether we've violated the limit
as long as we exercise each stream config because the precomputed chunks
are created in module scope. Second, we can always split up large chunks
so making sure the precomptued chunk is smaller than whatever view size
we actually desire is relatively trivial.
To support MPA-style form submissions, useFormState sends down a key
that represents the identity of the hook on the page. It's based on the
key path of the component within the React tree; for deeply nested
hooks, this keypath can become very long. We can hash the key to make it
shorter.
Adds a method called createFastHash to the Stream Config interface.
We're not using this for security or obfuscation, only to generate a
more compact key without sacrificing too much collision resistance.
- In Node.js builds, createFastHash uses the built-in crypto module.
- In Bun builds, createFastHash uses Bun.hash. See:
https://bun.sh/docs/api/hashing#bun-hash
I have not yet implemented createFastHash in the Edge, Browser, or FB
(Hermes) stream configs because those environments do not have a
built-in hashing function that meets our requirements. (We can't use the
web standard `crypto` API because those methods are async, and yielding
to the main thread is too costly to be worth it for this particular use
case.) We'll likely use a pure JS implementation in those environments;
for now, they just return the original string without hashing it. I'll
address this in separate PRs.
This uses the same mechanism as [large
strings](https://github.com/facebook/react/pull/26932) to encode chunks
of length based binary data in the RSC payload behind a flag.
I introduce a new BinaryChunk type that's specific to each stream and
ways to convert into it. That's because we sometimes need all chunks to
be Uint8Array for the output, even if the source is another array buffer
view, and sometimes we need to clone it before transferring.
Each type of typed array is its own row tag. This lets us ensure that
the instance is directly in the right format in the cached entry instead
of creating a wrapper at each reference. Ideally this is also how
Map/Set should work but those are lazy which complicates that approach a
bit.
We assume both server and client use little-endian for now. If we want
to support other modes, we'd convert it to/from little-endian so that
the transfer protocol is always little-endian. That way the common
clients can be the fastest possible.
So far this only implements Server to Client. Still need to implement
Client to Server for parity.
NOTE: This is the first time we make RSC effectively a binary format.
This is not compatible with existing SSR techniques which serialize the
stream as unicode in the HTML. To be compatible, those implementations
would have to use base64 or something like that. Which is what we'll do
when we move this technique to be built-in to Fizz.
This introduces a Text row (T) which is essentially a string blob and
refactors the parsing to now happen at the binary level.
```
RowID + ":" + "T" + ByteLengthInHex + "," + Text
```
Today, we encode all row data in JSON, which conveniently never has
newline characters and so we use newline as the line terminator. We
can't do that if we pass arbitrary unicode without escaping it. Instead,
we pass the byte length (in hexadecimal) in the leading header for this
row tag followed by a comma.
We could be clever and use fixed or variable-length binary integers for
the row id and length but it's not worth the more difficult
debuggability so we keep these human readable in text.
Before this PR, we used to decode the binary stream into UTF-8 strings
before parsing them. This is inefficient because sometimes the slices
end up having to be copied so it's better to decode it directly into the
format. The follow up to this is also to add support for binary data and
then we can't assume the entire payload is UTF-8 anyway. So this
refactors the parser to parse the rows in binary and then decode the
result into UTF-8. It does add some overhead to decoding on a per row
basis though.
Since we do this, we need to encode the byte length that we want decode
- not the string length. Therefore, this requires clients to receive
binary data and why I had to delete the string option.
It also means that I had to add a way to get the byteLength from a chunk
since they're not always binary. For Web streams it's easy since they're
always typed arrays. For Node streams it's trickier so we use the
byteLength helper which may not be very efficient. Might be worth
eagerly encoding them to UTF8 - perhaps only for this case.
Stacked on #26557
Supporting Float methods such as ReactDOM.preload() are challenging for
flight because it does not have an easy means to convey direct
executions in other environments. Because the flight wire format is a
JSON-like serialization that is expected to be rendered it currently
only describes renderable elements. We need a way to convey a function
invocation that gets run in the context of the client environment
whether that is Fizz or Fiber.
Fiber is somewhat straightforward because the HostDispatcher is always
active and we can just have the FlightClient dispatch the serialized
directive.
Fizz is much more challenging becaue the dispatcher is always scoped but
the specific request the dispatch belongs to is not readily available.
Environments that support AsyncLocalStorage (or in the future
AsyncContext) we will use this to be able to resolve directives in Fizz
to the appropriate Request. For other environments directives will be
elided. Right now this is pragmatic and non-breaking because all
directives are opportunistic and non-critical. If this changes in the
future we will need to reconsider how widespread support for async
context tracking is.
For Flight, if AsyncLocalStorage is available Float methods can be
called before and after await points and be expected to work. If
AsyncLocalStorage is not available float methods called in the sync
phase of a component render will be captured but anything after an await
point will be a noop. If a float call is dropped in this manner a DEV
warning should help you realize your code may need to be modified.
This PR also introduces a way for resources (Fizz) and hints (Flight) to
flush even if there is not active task being worked on. This will help
when Float methods are called in between async points within a function
execution but the task is blocked on the entire function finishing.
This PR also introduces deduping of Hints in Flight using the same
resource keys used in Fizz. This will help shrink payload sizes when the
same hint is attempted to emit over and over again
Added an explicit type to all $FlowFixMe suppressions to reduce
over-suppressions of new errors that might be caused on the same lines.
Also removes suppressions that aren't used (e.g. in a `@noflow` file as
they're purely misleading)
Test Plan:
yarn flow-ci
We rely heavily on being able to batch rendering after multiple fetches
etc. have completed on the server. However, we only do this in the
Node.js build. Node.js `setImmediate` has the exact semantics we need.
To be after the current cycle of I/O so that we can collect after all
those I/O events already in the queue has been processed.
This doesn't exist in standard browsers, so we ended up not using it
there. We could've used `setTimeout` but that risks being throttled
which would severely negatively affect the performance so we just did it
synchronously there. We probably could just use the `scheduler` there.
Now we have a separate build for Edge where `setTimeout(..., 0)`
actually behaves like `setImmediate` which is what we want. So we can
just use that in that build.
@Jarred-Sumner not sure what you want for Bun.
We currently abuse the browser builds for Web streams derived
environments. We already have a special build for Bun but we should also
have one for [other "edge"
runtimes](https://runtime-keys.proposal.wintercg.org/) so that we can
maximally take advantage of the APIs that exist on each platform.
In practice, we currently check for a global property called
`AsyncLocalStorage` in the server browser builds which we shouldn't
really do since browsers likely won't ever have it. Additionally, this
should probably move to an import which we can't add to actual browser
builds where that will be an invalid import. So it has to be a separate
build. That's not done yet in this PR but Vercel will follow
Cloudflare's lead here.
The `deno` key still points to the browser build since there's no
AsyncLocalStorage there but it could use this same or a custom build if
support is added.