lifter: allow resolved indirect jumps to participate in structured loop generalization (#98)

* docs: sync rewrite workflow guidance

* docs: drop machine-local pointers and fix stale README branch link

* lifter: allow resolved indirect jumps to participate in structured loop generalization

When a register-indirect jmp has already been resolved to a concrete target via solvePath (as a ConstantInt or by the solver), it is no longer speculative. If that target also points backward at a visited block, treat it as a loop back-edge for generalization purposes, the same way a direct or conditional jump is treated.
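The back-edge condition described above can be sketched roughly as follows. This is a minimal illustration, not the lifter's code: `PathState` and `isBackwardVisitedTarget` are hypothetical names, and the comparison only mirrors the `addr > blockInfo.block_address` / `visitedAddresses.contains(addr)` guards visible in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical stand-in for the lifter's per-path state (illustration only).
struct PathState {
  std::uint64_t currentBlockAddress;  // address of the block being lifted
  std::set<std::uint64_t> visited;    // block addresses already visited
};

// A concretely resolved target is a back-edge candidate when it does not lie
// past the current block and has already been visited on this path.
inline bool isBackwardVisitedTarget(const PathState& state,
                                    std::uint64_t target) {
  return target <= state.currentBlockAddress &&
         state.visited.count(target) != 0;
}
```

Under this sketch a forward target, or a backward target that was never visited, is not treated as a loop back-edge.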

Introduces currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget() alongside the existing narrow predicate. canGeneralizeStructuredLoopHeader gains an opt-in targetResolvedConcretely parameter that routes through the widened check. getLiftedBackedgeBB uses the widened variant so back-edge reuse fires for resolved indirect jumps. resolveTargetBlock passes targetResolvedConcretely=true (its entry condition requires a concrete destination) and extends stackBypassGeneralizedLoopAddresses to include IndirectJump-context inserts.
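As a rough sketch of how the narrow and widened predicates relate, the context check can be reduced to free functions over the context enum. The enum values match those used in the diff; the function names `allowsGeneralization`, `allowsGeneralizationForResolvedTarget`, and `contextAllows` are shortened, hypothetical stand-ins for the member functions, and the remaining guards in canGeneralizeStructuredLoopHeader are omitted.

```cpp
#include <cassert>

enum class PathSolveContext { ConditionalBranch, DirectJump, IndirectJump, Ret };

// Narrow predicate: only conditional branches and direct jumps qualify.
constexpr bool allowsGeneralization(PathSolveContext ctx) {
  return ctx == PathSolveContext::ConditionalBranch ||
         ctx == PathSolveContext::DirectJump;
}

// Widened predicate: additionally admits indirect jumps. Callers are expected
// to use it only once the target has resolved to a concrete address.
constexpr bool allowsGeneralizationForResolvedTarget(PathSolveContext ctx) {
  return allowsGeneralization(ctx) || ctx == PathSolveContext::IndirectJump;
}

// Opt-in routing: the default keeps the narrow behaviour for existing callers.
constexpr bool contextAllows(PathSolveContext ctx,
                             bool targetResolvedConcretely = false) {
  return targetResolvedConcretely ? allowsGeneralizationForResolvedTarget(ctx)
                                  : allowsGeneralization(ctx);
}

static_assert(!contextAllows(PathSolveContext::IndirectJump),
              "unresolved indirect jumps stay blocked");
static_assert(contextAllows(PathSolveContext::IndirectJump, true),
              "resolved indirect jumps are admitted");
static_assert(!contextAllows(PathSolveContext::Ret, true),
              "ret contexts stay excluded either way");
```

Defaulting the flag to false means call sites that do not pass it keep the narrow check, which is why only resolveTargetBlock needs to change.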

Ret-path contexts remain excluded.

Tests updated: the old runLoopGeneralizationIndirectJumpBlocked splits into runLoopGeneralizationIndirectJumpBlockedWhenUnresolved (unchanged semantics) and runLoopGeneralizationIndirectJumpAllowedWhenResolved (new). runPendingGeneralizedLoopBlockedByContext becomes runPendingGeneralizedLoopByContext with an expectReuse parameter: Ret still expects no reuse, while IndirectJump with a resolved target now expects reuse.
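The expectReuse parameterization can be illustrated with a small helper. This is a simplified sketch rather than the test code itself: `checkPendingHeaderReuse` is a hypothetical name, and the `reused` flag stands in for the `branch->getSuccessor(0) == pending` comparison performed by the real test.

```cpp
#include <cassert>
#include <string>

// Returns an empty string when the observed behaviour matches the
// expectation, otherwise a message describing the mismatch.
inline std::string checkPendingHeaderReuse(const char* contextName,
                                           bool expectReuse, bool reused) {
  if (expectReuse && !reused) {
    return std::string(contextName) +
           " context must reuse the pending generalized loop header";
  }
  if (!expectReuse && reused) {
    return std::string(contextName) +
           " context must not reuse a pending generalized loop header";
  }
  return std::string();
}
```

A single helper with an expectation flag lets the indirect-jump and return-path tests share one body while asserting opposite outcomes.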

---------

Co-authored-by: yusufcanislek <yusuf.canislek@meetdandy.com>
naci
2026-04-19 05:36:45 +03:00
committed by GitHub
parent 0fbc2e9a52
commit 5708deef54
7 changed files with 131 additions and 36 deletions
+20
@@ -47,6 +47,8 @@ Important invariants:
- `.editorconfig` and `.clang-format` — formatting contract (2 spaces, LF, UTF-8, 100-column LLVM-based style).
## Development Commands
Before running any command in this section, confirm the exact repo root and cwd. Prefer these repo-provided scripts over ad hoc shell commands.
Preferred Windows build flow:
```bat
cmd /c scripts\dev\configure_iced.cmd
@@ -109,6 +111,24 @@ scripts\rewrite\run_microtests.cmd --check-flags xor
- Coverage/vector plumbing: `python test.py coverage --full` and `python test.py report --json`
- Build script/CMake changes: rerun the affected `scripts\dev\configure_*.cmd` + `build_*.cmd` lane
## Operator workflow defaults
> Use these with the repo-specific architecture/test rules above.
- Confirm the real repo root, source-of-truth file, and owning subsystem before searching or editing.
- Narrow search scope before using broad repo scans.
- Prefer `read`, `find`, `grep`, `ast_grep`, `edit`, `ast_edit`, and `lsp` before bash for discovery or structural edits.
- Before build/test/git/bash commands, confirm the exact cwd and lane you intend to run.
- If you edit the same file twice, re-read it first.
- Default to one main line of work; split into subtasks only when file boundaries are real and outputs are independent.
- Do not finish non-trivial work without focused verification that matches the changed subsystem.
## What not to do
- Do not start with repo-root scans when a narrower directory or entry document can answer the question.
- Do not run configure/build/test commands from an assumed cwd.
- Do not use bash-first discovery when a specialized tool can answer it.
- Do not spawn reviewer/subtask branches just to spread a single code path across multiple agents.
## Process Notes For AI Assistants
- Prefer `docs/REWRITE_BASELINE.md` and CI workflows over older generic build docs when commands disagree.
- Do not edit generated files or artifact outputs unless the task is explicitly about generation.
+1 -1
@@ -119,7 +119,7 @@ jump next_handler;
```
We try to always analyze values and keep track of them. This allows us to understand control flow.
[For jumptable-like branches](https://github.com/NaC-L/Mergen/blob/experimental-pattern-matching/testcases/test_branches.asm)
[For jumptable-like branches](https://github.com/NaC-L/Mergen/blob/main/testcases/test_branches.asm)
Optimized output would be a simple
```llvm
define i64 @main(i64 %rax, i64 %rcx, i64 %rdx, i64 %rbx, i64 %rsp, i64 %rbp, i64 %rsi, i64 %rdi, i64 %r8, i64 %r9, i64 %r10, i64 %r11, i64 %r12, i64 %r13, i64 %r14, i64 %r15, ptr nocapture readnone %TEB, ptr nocapture readnone %memory) local_unnamed_addr #0 {
+5
@@ -33,6 +33,8 @@ cmd /c scripts\dev\build_zydis.cmd
## Verify After Building
Primary checks:
The rewrite gate's sample-build lane is stricter than the core CMake build. CI requires a pinned `clang-cl` via `CLANG_CL_EXE`, `CMAKE_C_COMPILER`, or `LLVM_DIR`; for local `python test.py quick` / `all` runs, set `CLANG_CL_EXE=C:\Program Files\LLVM\bin\clang-cl.exe` when you want the same sample-build compiler resolution as CI instead of relying on local fallback discovery.
```bat
python test.py quick
python test.py all
@@ -53,12 +55,15 @@ Use `python test.py vmp` for larger control-flow/semantics/inlining changes when
- `LLVM_DIR` — points CMake at `LLVMConfig.cmake`
- `MERGEN_BUILD_JOBS` — overrides build parallelism (default `4`)
- `CMAKE_C_COMPILER` / `CMAKE_CXX_COMPILER` — optional compiler override for the configure scripts
- `CLANG_CL_EXE` — optional local override for the rewrite gate's sample-build path; set it to the pinned `clang-cl` when you want local `python test.py quick` / `all` runs to match CI compiler resolution
Example:
```bat
set CLANG_CL_EXE=C:\Program Files\LLVM\bin\clang-cl.exe
set MERGEN_BUILD_JOBS=8
cmd /c scripts\dev\build_iced.cmd
python test.py quick
```
## Secondary Flows
+13 -12
@@ -22,17 +22,17 @@ Sample sources live in:
- `scripts/rewrite/manifest_validation.ps1` — shared strict manifest validator used by both `run.ps1` and `verify.ps1`
- `scripts/rewrite/run.cmd` — one-command Windows entrypoint
- `scripts/rewrite/run_microtests.cmd` — runs `rewrite_microtests.exe` (in-process instruction-byte tests from `lifter/test/TestInstructions.cpp`); builds lazily only when the executable is missing, supports `--build` to force rebuild and `--no-build` to require prebuilt binaries
- `scripts/rewrite/collect_instruction_tests.cmd` — reports handler coverage against `lifter/x86_64_opcodes.x` using oracle vector metadata (`handler` field) to track missing instruction tests
- `scripts/rewrite/generate_oracle_vectors.cmd` — regenerates `lifter/test_vectors/oracle_vectors.json` from seed vectors using oracle providers (currently Unicorn)
- `scripts/rewrite/collect_instruction_tests.cmd` — reports handler coverage against `lifter/semantics/x86_64_opcodes.x` using oracle vector metadata (`handler` field) to track missing instruction tests
- `scripts/rewrite/generate_oracle_vectors.cmd` — regenerates `lifter/test/test_vectors/oracle_vectors.json` from seed vectors using oracle providers (currently Unicorn)
- `scripts/rewrite/oracle_seed_vectors.json` — seed cases with instruction bytes, initial state, and tracked outputs for oracle generation
- `scripts/rewrite/build_full_handler_seed.cmd` — builds `oracle_seed_full_handlers.json` (base semantic vectors + auto-discovered smoke vectors for missing handlers)
- `scripts/rewrite/build_full_handler_seed.py` — Capstone-based opcode discovery that fills missing handlers and marks known-crashing handlers as `skip`
- `scripts/rewrite/run_all_handlers.cmd` — generates full-handler seed/vectors and executes `rewrite_microtests.exe` across the full suite
- `scripts/rewrite/generate_flag_stress_vectors.cmd` — builds `lifter/test_vectors/oracle_vectors_flagstress.json` with multiple strict flag-oracle cases per flag-writing handler
- `scripts/rewrite/generate_flag_stress_vectors.py` — derives flag-writing handlers from `lifter/Semantics.ipp`, generates deterministic initial states, and computes expected flags via Unicorn
- `scripts/rewrite/generate_flag_stress_vectors.cmd` — builds `lifter/test/test_vectors/oracle_vectors_flagstress.json` with multiple strict flag-oracle cases per flag-writing handler
- `scripts/rewrite/generate_flag_stress_vectors.py` — derives flag-writing handlers from `lifter/semantics/Semantics.ipp`, generates deterministic initial states, and computes expected flags via Unicorn
- `scripts/rewrite/run_flagstress.cmd` — one-command strict flag suite runner (auto-generates flag-stress vectors and executes microtests with strict flag assertions)
- `run.ps1` validates that `instruction_microtests.json` covers every `testcases/rewrite_smoke/*` source file
- `scripts/rewrite/check_semantic.py` — runtime semantic regression for all lifted samples; reads `semantic` cases from the manifest, generates lli-executable wrappers, and verifies return values across all declared inputs (23 samples, 107 test cases)
- `scripts/rewrite/check_semantic.py` — runtime semantic regression for all lifted samples; reads `semantic` cases from the manifest, generates lli-executable wrappers, and verifies return values across all declared inputs (33 samples, 177 test cases)
Helper build scripts for local development are in:
@@ -56,7 +56,7 @@ set MERGEN_BUILD_JOBS=8 &rem fast builds on large machines
Use `run_microtests.cmd --check-flags <filter>` to enforce oracle flag comparisons (strict mode, expected to fail until flag semantics are fixed).
Use `run_microtests.cmd --build <filter>` to force rebuilding `rewrite_microtests.exe`, or `run_microtests.cmd --no-build <filter>` to skip any build step.
Set `SKIP_ORACLE_GENERATION=1` to reuse a pre-generated oracle file. Set `MERGEN_TEST_VECTORS=<path>` to point tests at a custom oracle JSON file.
Use `run_all_handlers.cmd` to exercise full handler coverage smoke tests. It writes `lifter/test_vectors/oracle_vectors_full_handlers.json` and then runs microtests against it through `run_microtests.cmd` (which now builds lazily).
Use `run_all_handlers.cmd` to exercise full handler coverage smoke tests. It writes `lifter/test/test_vectors/oracle_vectors_full_handlers.json` and then runs microtests against it through `run_microtests.cmd` (which now builds lazily).
Oracle vector JSON fixtures are deterministic by design; regenerating them should only change tracked files when the underlying cases change, not because of wall-clock metadata.
Full-handler vectors are expected to execute end-to-end (no default `skip: true` crash exclusions).
Use `run_flagstress.cmd` (or `python test.py flags`) for broad strict-flag validation across all handlers that explicitly write flags.
@@ -69,13 +69,13 @@ By default, regression artifacts are written to a sibling folder outside the rep
- `../rewrite-regression-work/`
Artifacts include:
- `lifter/test_vectors/oracle_vectors_flagstress.json` (generated strict-flag stress suite)
- `lifter/test/test_vectors/oracle_vectors_flagstress.json` (generated strict-flag stress suite)
- compiled sample binaries/maps/objects for every manifest entry
- `ir_outputs/*.ll` and `ir_outputs/*_no_opts.ll` (replaced on each run after stale `.ll` cleanup)
- `ir_outputs/*_semantic.ll` (generated by `check_semantic.py` for lli execution)
- `lifter/test_vectors/oracle_vectors_full_handlers.json` (generated by `run_all_handlers.cmd`)
- `lifter/test/test_vectors/oracle_vectors_full_handlers.json` (generated by `run_all_handlers.cmd`)
## Running the baseline gate
From repository root:
@@ -84,6 +84,8 @@ From repository root:
scripts\rewrite\run.cmd
```
CI requires a pinned sample-build compiler via `CLANG_CL_EXE`, `CMAKE_C_COMPILER`, or `LLVM_DIR`. For local runs, set `CLANG_CL_EXE=C:\Program Files\LLVM\bin\clang-cl.exe` when you want `scripts\rewrite\run.cmd` or `python test.py quick` to use the same sample-build compiler resolution as CI instead of relying on fallback discovery.
Optional custom output directory:
```bat
@@ -131,13 +133,12 @@ Samples without a `semantic` field are not tested. The `semantic` field is optio
### Coverage summary
Current active quick-gate semantic coverage is **30 samples / 171 cases** on CI.
Current active quick-gate semantic coverage is **33 samples / 177 cases** on CI and local pinned-toolchain runs.
Notable current state:
- `dummy_vm_loop`, `bytecode_vm_loop`, and `stack_vm_loop` are active VM-shaped control-flow samples.
- `calc_sum_to_n` is active again under the safe structured-loop recovery path.
- `calc_fib` and `calc_sum_array` are `ci_skip` on `windows-latest` because the current hosted toolchain still emits loop/array codegen shapes that fail lifting there even though local developer runs pass.
- `calc_cout` remains `ci_skip` because its C++ codegen is toolchain-dependent on CI.
- `calc_sum_to_n`, `calc_fib`, and `calc_sum_array` are active again under the current safe path.
- `calc_cout` is active again after SSE2 `PUNPCKLQDQ` support landed; the manifest currently has zero `ci_skip` entries.
## Call-boundary ABI framework
+13 -3
@@ -61,15 +61,25 @@ MERGEN_LIFTER_DEFINITION_TEMPLATES(PATH_info)::solvePath(
auto it = addrToBB.find(target);
const bool hasPendingGeneralization =
pendingLoopGeneralizationAddresses.contains(target);
// `resolveTargetBlock` is only reached with a concrete destination, so an
// indirect jump whose target has just resolved participates in the same
// structured-loop generalization path that direct and conditional jumps
// already take.
const bool canUseStructuredLoopGeneralization =
currentPathSolveAllowsStructuredLoopGeneralization();
currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget();
const bool canReusePendingGeneralization =
hasPendingGeneralization && canUseStructuredLoopGeneralization;
const bool wantsGeneralization =
canReusePendingGeneralization ||
(backwardVisitedTarget && canGeneralizeStructuredLoopHeader(target));
(backwardVisitedTarget &&
canGeneralizeStructuredLoopHeader(target,
/*targetResolvedConcretely=*/true));
if (wantsGeneralization) {
if (currentPathSolveContext == PathSolveContext::DirectJump) {
// A resolved backward target participates in the same stack-concolic
// bypass regime regardless of whether the source jump is direct or
// indirect — both represent a confirmed loop back-edge.
if (currentPathSolveContext == PathSolveContext::DirectJump ||
currentPathSolveContext == PathSolveContext::IndirectJump) {
stackBypassGeneralizedLoopAddresses.insert(target);
}
const bool generalizedBackup =
+23 -4
@@ -507,6 +507,16 @@ public:
return currentPathSolveContext == PathSolveContext::ConditionalBranch ||
currentPathSolveContext == PathSolveContext::DirectJump;
}
// Widened variant: when the path solver has already resolved the branch
// target to a concrete address, an indirect jump is no longer speculative.
// If its target also points backward at a visited block it is legitimately
// a loop back-edge and should enter structured loop generalization alongside
// direct and conditional jumps. Ret-path contexts have their own lifecycle
// and stay excluded here.
bool currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget() const {
return currentPathSolveAllowsStructuredLoopGeneralization() ||
currentPathSolveContext == PathSolveContext::IndirectJump;
}
bool isStructuredLoopHeaderShape(BasicBlock* block) const {
std::set<BasicBlock*> seenBlocks;
auto* current = block;
@@ -558,9 +568,13 @@ public:
return false;
}
bool canGeneralizeStructuredLoopHeader(uint64_t addr) {
if (getControlFlow() != ControlFlow::Unflatten ||
!currentPathSolveAllowsStructuredLoopGeneralization() ||
bool canGeneralizeStructuredLoopHeader(uint64_t addr,
bool targetResolvedConcretely = false) {
const bool contextAllows =
targetResolvedConcretely
? currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()
: currentPathSolveAllowsStructuredLoopGeneralization();
if (getControlFlow() != ControlFlow::Unflatten || !contextAllows ||
addr > blockInfo.block_address || !visitedAddresses.contains(addr) ||
pendingLoopGeneralizationAddresses.contains(addr) ||
generalizedLoopAddresses.contains(addr)) {
@@ -821,8 +835,13 @@ public:
BasicBlock* getLiftedBackedgeBB(uint64_t addr) {
// A resolved backward target is eligible for reuse regardless of whether
// the branching source was direct, conditional, or indirect. Once we have
// a non-empty generalized block for the address, re-entering it on a
// subsequent iteration should branch into that block rather than cutting a
// fresh empty one through `getOrCreateBB` (which would orphan the body).
if (getControlFlow() != ControlFlow::Unflatten ||
!currentPathSolveAllowsStructuredLoopGeneralization()) {
!currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()) {
return nullptr;
}
if (addr > blockInfo.block_address ||
+56 -16
@@ -435,13 +435,37 @@ private:
return true;
}
bool runLoopGeneralizationIndirectJumpBlocked(std::string& details) {
bool runLoopGeneralizationIndirectJumpBlockedWhenUnresolved(std::string& details) {
// The unresolved-indirect-jump predicate must still exclude indirect
// dispatchers from speculative loop generalization. Without a concrete
// target, we have no proof the jump forms a backward loop edge.
LifterUnderTest lifter;
lifter.currentPathSolveContext =
LifterUnderTest::PathSolveContext::IndirectJump;
if (lifter.currentPathSolveAllowsStructuredLoopGeneralization()) {
details =
" indirect-jump dispatcher context must not generalize loop state\n";
" unresolved indirect-jump context must not generalize loop state\n";
return false;
}
return true;
}
bool runLoopGeneralizationIndirectJumpAllowedWhenResolved(std::string& details) {
// Once `solvePath` has pinned an indirect jump to a concrete destination,
// the resolved-target predicate widens to admit it. Ret-path contexts
// still have their own lifecycle and stay excluded.
LifterUnderTest lifter;
lifter.currentPathSolveContext =
LifterUnderTest::PathSolveContext::IndirectJump;
if (!lifter.currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()) {
details =
" resolved indirect-jump context must allow structured loop generalization\n";
return false;
}
lifter.currentPathSolveContext = LifterUnderTest::PathSolveContext::Ret;
if (lifter.currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()) {
details =
" ret context must never participate in structured loop generalization\n";
return false;
}
return true;
@@ -457,9 +481,9 @@ private:
return true;
}
bool runPendingGeneralizedLoopBlockedByContext(
bool runPendingGeneralizedLoopByContext(
LifterUnderTest::PathSolveContext context, const char* contextName,
std::string& details) {
bool expectReuse, std::string& details) {
LifterUnderTest lifter;
lifter.currentPathSolveContext = context;
@@ -486,13 +510,19 @@ private:
" context did not emit the expected direct branch\n";
return false;
}
if (branch->getSuccessor(0) == pending) {
const bool reused = branch->getSuccessor(0) == pending;
if (expectReuse && !reused) {
details = std::string(" ") + contextName +
" context must reuse the pending generalized loop header when the target resolved concretely\n";
return false;
}
if (!expectReuse && reused) {
details = std::string(" ") + contextName +
" context must not reuse a pending generalized loop header\n";
return false;
}
if (lifter.unvisitedBlocks.empty() ||
lifter.unvisitedBlocks.back().block == pending) {
if (!expectReuse && (lifter.unvisitedBlocks.empty() ||
lifter.unvisitedBlocks.back().block == pending)) {
details = std::string(" ") + contextName +
" context queued the pending generalized loop header instead of a fresh block\n";
return false;
@@ -505,14 +535,22 @@ private:
return true;
}
bool runPendingGeneralizedLoopIndirectJumpBlocked(std::string& details) {
return runPendingGeneralizedLoopBlockedByContext(
LifterUnderTest::PathSolveContext::IndirectJump, "indirect-jump", details);
bool runPendingGeneralizedLoopIndirectJumpAllowedWhenResolved(std::string& details) {
// After the resolved-target relaxation, a constant-folded indirect-jump
// target that matches a pending generalized loop header is reused just
// like a direct-jump target would be.
return runPendingGeneralizedLoopByContext(
LifterUnderTest::PathSolveContext::IndirectJump, "indirect-jump",
/*expectReuse=*/true, details);
}
bool runPendingGeneralizedLoopRetBlocked(std::string& details) {
return runPendingGeneralizedLoopBlockedByContext(
LifterUnderTest::PathSolveContext::Ret, "return-path", details);
// Return-path contexts keep their own lifecycle — they must not reuse
// pending generalized loop headers, even now that the resolved-target
// relaxation admits indirect jumps.
return runPendingGeneralizedLoopByContext(
LifterUnderTest::PathSolveContext::Ret, "return-path",
/*expectReuse=*/false, details);
}
@@ -936,12 +974,14 @@ private:
&InstructionTester::runLoopGeneralizationConditionalBranchAllowed);
runCustom("loop_generalization_direct_jump_allowed",
&InstructionTester::runLoopGeneralizationDirectJumpAllowed);
runCustom("loop_generalization_indirect_jump_blocked",
&InstructionTester::runLoopGeneralizationIndirectJumpBlocked);
runCustom("loop_generalization_indirect_jump_blocked_when_unresolved",
&InstructionTester::runLoopGeneralizationIndirectJumpBlockedWhenUnresolved);
runCustom("loop_generalization_indirect_jump_allowed_when_resolved",
&InstructionTester::runLoopGeneralizationIndirectJumpAllowedWhenResolved);
runCustom("loop_generalization_ret_blocked",
&InstructionTester::runLoopGeneralizationRetBlocked);
runCustom("pending_generalized_loop_indirect_jump_blocked",
&InstructionTester::runPendingGeneralizedLoopIndirectJumpBlocked);
runCustom("pending_generalized_loop_indirect_jump_allowed_when_resolved",
&InstructionTester::runPendingGeneralizedLoopIndirectJumpAllowedWhenResolved);
runCustom("pending_generalized_loop_ret_blocked",
&InstructionTester::runPendingGeneralizedLoopRetBlocked);
runCustom("structured_loop_header_allows_conditional_backedge",