This is an incremental step to removing `Place.memberPath`. This PR changes how
we handle MemberExpressions in rvalue position, converting to a new
`PropertyLoad` InstructionValue variant. Example:
```
let x = a.b;
x.y = b.c;
=>
Const tmp1 = PropertyLoad a, 'b';
Const x = Place tmp1;
Const tmp2 = PropertyLoad b, 'c';
Reassign x.y = tmp2
```
That we already recently made a chance to ensure that _if_ the lvalue is a
member expression, that we convert the RHS to a Place. So although `x.y = b.c`
could technically be lowered to a single instruction (with the`b.c` as a
PropertyLoad), we force this to a temporary to ensure that we can independently
memoize the RHS value.
The net result is that the following combinations are possible:
* `x = y`, lvalue identifier, rvalue identifier
* `x.y = y` lvalue member path, rvalue identifier
* `x = y.z` lvalue identifier, rvalue property load
As noted above, `x.y = a.b` no longer occurs (and there's an invariant for this
in one of the passes).
A follow-up PR will add a PropertyStore instruction so that we can remove member
paths in lvalue position too.
@josephsavona had the intuition that we were picking the wrong root in the
previous infinite loop test case, so the fix for this is to force a root to
always be picked. This works because `find` implements path compression (if a <-
b and b <- c, then we can just point a <- c to "flatten" the tree which makes
subsequent `find` operations more efficient since we don't have to follow the
ancestor chain each time), so we're forcing all those unions to pick one parent.
From some googling it looks like the traditional way to implement union is to
call `find` so this should be the "right" way to fix it (?).
I'm also adding a basic unit test for DisjointSet, I think we could revisit
later and see if property testing is worth it but for now I mainly wanted to
capture the regression test as a unit test.
Discovered this by accident modifying Joe's playground example. This seemed to
be triggered in inferReactiveScopeVariables when iterating over the DisjointSet
of scopeIdentifiers. In particular this test case contains a cycle and
DisjointSet.find would never terminate.
The github action was exceeding maximum allowed memory size because we were no
longer grouping messages correctly prior to formatting them in the script. I
think these were introduced when we integrated the Babel plugin into the
preprocessor. This PR strips out filenames from the message so they can be
grouped together again. Also added some light comments
Test plan: manually ran `scripts/test262.sh` and verified that the JSON was
grouped together correctly
Previously we weren't ever clearing the `dist` directory so there were likely
some vestigial files left over from previous builds that might have confused the
import path. This commit fixes the import path and also deletes the `dist`
directory on every build to ensure a clean slate.
Given that our type inference needs to be very conservative, there's not a lot
of benefit to having such fine grained type inference.
In the future, we can use type information from flow/ts for inference.
For now, we've decided to punt on super fine grained aliasing of fields (ie,
mutating x.y shouldn't mutate x.z or it's aliases).
This PR removes the code that tracks the aliases of each field. We can re-add
this when we revisit this functionality.
For the rest of the field aliasing, most of it has been replaced by
InferAliasForStores.
Splits handling of simple assignment updates (`=`) from update-assignments (`+=`
etc). This unblocks starting to support destructuring for the simple case in the
subsequent PR.
The test262 preprocessor was correctly running forget against all the functions
in each test file, but it was incorrectly transforming the file contents
overall. Previously we swapped the contents of the file for the transformed
output of the last function, now we use the babel plugin to rewrite functions in
place and keep the rest of the file intact.
Of course, the tests that failed before still fail. But when we fix them the
tests will work now.
Addressed a TODO from the previous PR. When we enter SSA form, when we rewrite
variable reassignments we currently change the identifier but leave the kind of
the lvalue alone; technically we should convert from Reassign to Const. After
doing that, it's easier to correctly update when we leave SSA form, we can
convert just a subset back into let/reassign (but leave most things alone as
const).
Reorders LeaveSSA so that it runs before we begin evaluating reactive scopes.
Note that reactive scopes must span the full construction of each variable — for
variables with a phi, this must span the declaration and all assignments of the
phi operands. And that's exactly what the new LeaveSSA does! LeaveSSA removes
phi nodes and ensure that all versions of a variable which flow into a phi have
been assigned a single canonical identifier (with an appropriate mutable range).
This PR includes this and some related changes:
* Reorders the pass
* Changes hir-test to print the final HIR, eg just prior to codegen
* Teaches LeaveSSA to update the mutable range of the canonical identifiers it
assigns, based on the min/max of the variables assigned.
Rather than special casing for field stores, look for Effect.Store to alias
fields.
In the future, this will be extended to other constructs like Array#push.
Effect.Store is exactly like Effect.Mutate, the only difference is that Store
aliases one into a another value.
There is no practical difference between Effect.Mutate and Effect.Store
currently.
Assignment expressions to a member path are a special case because they're the
only place where a value isn't assigned to a (possibly temporary) variable,
which is our unit of memoization. #901 demonstrated how this can lead to values
that can't be independently memoized:
```javascript
const x = {a: a}
x.y = [b, c]; // array recomputed w `x`, even if only `a` changed
```
This PR ensures that assignment expressions where the LHS is a member path lower
the RHS to a Place. That means the above example is handled as if you wrote:
```javascript
const x = {a: a};
const tmp1 = [b, c];
x.y = tmp1;
```
And we independently memoize the temporary.
Completes a todo for the `if` condition of scopes without any inputs. In this
case since there are no inputs we can check for changes, we check if the first
_output_ cache slot is set to the sentinel.
For this to work we need to ensure that all scopes have at least one output,
which isn't currently the case. Dead code can produce output-less sentinels. So
this PR also adds a pass to find scopes w/o any outputs and convert them to
regular blocks. We could in theory also just delete them, but for now let's be
more conservative. This is something we'd want to highlight as a diagnostic in
an IDE, though existing dead code linters would almost certainly find this case
too.
Remove our existing compiler flags since they were only being used for
enabling/disabling passes to aid debugging and to simplify in preparation for
the upcoming work on diagnostics and bailouts. Additionally with the new
playground tabs disabling passes has become less necessary. In the future when
we have actual compiler flags (eg tweaking optimization levels) we can add this
back.
I opted to keep the existing `CompilerResult` return value instead of just
returning the optimized AST as we're still using `scopes` in our test fixtures.
Small reorganization to move Pipeline out of HIR since it's a compiler module.
It used to make sense before to be in HIR since the old architecture was the
still the primary, but no longer!
It's not safe to infer types of arguments and return values because Javascript
is so polymorphic. Instead just infer the type of the callee for non methods.
Interestingly, even this is not conservative enough for JavaScript because Proxy
can also be callable. But I think for our use cases we will treat Proxy and
Functions similarly (they're all just objects) so it's ok.
Thanks old architecture! We learned a lot about what does and doesn't work via
this POC, but it's time to move forward w the ~~new~~ current architecture.
When collecting the output of each scope I was checking to see whether each
operand was being used after the end of the _current_ scope. What we really want
to be checking is whether the operand is used after _the scope in which it's
defined_. Those are the same thing when there is no nesting, involved, so the
previous logic worked for most examples.
It isn't super easy to tell when if the operand's scope has ended, because we
don't always know what the "current" InstructionId is inside a ReactiveFunction.
And that in turn isn't quite so easy to change, because of some edge cases like
break statements that we synthesize. The solution here is to track the set of
active scopes, and if an operand is used and its scope is not active then voila,
it's scope must have completed and its an output.
Moves _some_ files from HIR into new top-level directories. To not make
@gsathya's life a pain I left the files he's touching alone, but I moved some
others. My intent is to have something like this:
* Babel/ - code for the Babel plugin, though ideally this actually gets split
into a separate package and the compiler itself is AST in, AST out w no Babel
dep.
* HIR/ - the core HIRFunction, HIR and related data types, plus the HIR
construction and printing, HIR visitors.
* Inference/ - the core inference passes that operate on the HIR, including type
inference, reference effects, alias analysis, mutable range analysis, etc. I'll
let @gsathya move these files when at a good stopping point.
* ReactiveScopes/ - inference relating to reactive scopes,
constructing/printing/codegenning ReactiveFunction
* SSA/ - enter/leave SSA and eliminate redundant phis
* Utils/ - every project needs a place to put stuff that doesn't fit into the
other categories, this is ours.
This leaves just index.ts at the top level, and overall feels pretty tidy. Not
too tedious to figure out where anything goes, hopefully.
It's pretty tedious to keep the playground in sync w `Pipeline` — we need some
abstraction so we can write the sequence of passes once and reuse it (while
inspecting intermediate states).
This is mostly unused and there's just not enough benefit for now. We can re-add
this if folks start using this site on mobile. Removing now to simplify the
codebase.
Updates the ReactiveFunction-based codegen from the previous PR to emit
memoization code for each scope. This is currently naive and has some bugs, but
it gets the idea across. The core logic is straightforward at this point, all
the hard work is in earlier passes:
* Compute one change variable per scope dependency, eg `const c_0 = $[0] ===
maxItems`
* Generate one `let` binding for each scope output
* Generate an if block where the test is if any of the change variables are true
(`||` them together)
* Generate the consequent block with the original code block, plus statements to
save dependencies and outputs to their cache slots
* Generate the alternate block to populate the scope outputs from their cached
values
## Todos
A few things don't quite work yet:
* Codegen is designed to avoid emitting variables for temporary values, but
that's causing a few values to sort of disappear n the examples, or get emitted
twice. There are a variety of ways to achieve this but we'll need to ensure that
this category of values gets assigned to a variable and then reference the
variable. This is more involved.
* Scopes can end up with zero dependencies, in which case we should check that
the first output cache is initialized. This one is more straightforward.
* If there are early returns, we don't record that they occurred and replay them
in the `else` branch for each scope. We know the algorithm though so i'm okay
delaying that for now.