diff --git a/AGENTS.md b/AGENTS.md
index 5861e07..c76dc90 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -47,6 +47,8 @@ Important invariants:
 - `.editorconfig` and `.clang-format` — formatting contract (2 spaces, LF, UTF-8, 100-column LLVM-based style).
 
 ## Development Commands
+Before running any command in this section, confirm the exact repo root and cwd. Prefer these repo-provided scripts over ad hoc shell commands.
+
 Preferred Windows build flow:
 ```bat
 cmd /c scripts\dev\configure_iced.cmd
@@ -109,6 +111,24 @@ scripts\rewrite\run_microtests.cmd --check-flags xor
 - Coverage/vector plumbing: `python test.py coverage --full` and `python test.py report --json`
 - Build script/CMake changes: rerun the affected `scripts\dev\configure_*.cmd` + `build_*.cmd` lane
 
+## Operator workflow defaults
+
+> Use these with the repo-specific architecture/test rules above.
+
+- Confirm the real repo root, source-of-truth file, and owning subsystem before searching or editing.
+- Narrow search scope before using broad repo scans.
+- Prefer `read`, `find`, `grep`, `ast_grep`, `edit`, `ast_edit`, and `lsp` before bash for discovery or structural edits.
+- Before build/test/git/bash commands, confirm the exact cwd and lane you intend to run.
+- If you edit the same file twice, re-read it first.
+- Default to one main line of work; split into subtasks only when file boundaries are real and outputs are independent.
+- Do not finish non-trivial work without focused verification that matches the changed subsystem.
+
+## What not to do
+- Do not start with repo-root scans when a narrower directory or entry document can answer the question.
+- Do not run configure/build/test commands from an assumed cwd.
+- Do not use bash-first discovery when a specialized tool can answer it.
+- Do not spawn reviewer/subtask branches just to spread a single code path across multiple agents.
+
 ## Process Notes For AI Assistants
 - Prefer `docs/REWRITE_BASELINE.md` and CI workflows over older generic build docs when commands disagree.
 - Do not edit generated files or artifact outputs unless the task is explicitly about generation.
diff --git a/README.md b/README.md
index 3e6b21e..3441ea4 100644
--- a/README.md
+++ b/README.md
@@ -119,7 +119,7 @@ jump next_handler;
 ```
 We try to always analyze values and keep track of them. This allows us to understand control flow.
 
-[For jumptable-like branches](https://github.com/NaC-L/Mergen/blob/experimental-pattern-matching/testcases/test_branches.asm)
+[For jumptable-like branches](https://github.com/NaC-L/Mergen/blob/main/testcases/test_branches.asm)
 Optimized output would be a simple
 ```llvm
 define i64 @main(i64 %rax, i64 %rcx, i64 %rdx, i64 %rbx, i64 %rsp, i64 %rbp, i64 %rsi, i64 %rdi, i64 %r8, i64 %r9, i64 %r10, i64 %r11, i64 %r12, i64 %r13, i64 %r14, i64 %r15, ptr nocapture readnone %TEB, ptr nocapture readnone %memory) local_unnamed_addr #0 {
diff --git a/docs/BUILDING.md b/docs/BUILDING.md
index 0f9c73c..0338afd 100644
--- a/docs/BUILDING.md
+++ b/docs/BUILDING.md
@@ -33,6 +33,8 @@ cmd /c scripts\dev\build_zydis.cmd
 ## Verify After Building
 Primary checks:
 
+The rewrite gate's sample-build lane is stricter than the core CMake build. CI requires a pinned `clang-cl` via `CLANG_CL_EXE`, `CMAKE_C_COMPILER`, or `LLVM_DIR`; for local `python test.py quick` / `all` runs, set `CLANG_CL_EXE=C:\Program Files\LLVM\bin\clang-cl.exe` when you want the same sample-build compiler resolution as CI instead of relying on local fallback discovery.
+
 ```bat
 python test.py quick
 python test.py all
@@ -53,12 +55,15 @@ Use `python test.py vmp` for larger control-flow/semantics/inlining changes when
 - `LLVM_DIR` — points CMake at `LLVMConfig.cmake`
 - `MERGEN_BUILD_JOBS` — overrides build parallelism (default `4`)
 - `CMAKE_C_COMPILER` / `CMAKE_CXX_COMPILER` — optional compiler override for the configure scripts
+- `CLANG_CL_EXE` — optional local override for the rewrite gate's sample-build path; set it to the pinned `clang-cl` when you want local `python test.py quick` / `all` runs to match CI compiler resolution
 
 Example:
 
 ```bat
+set CLANG_CL_EXE=C:\Program Files\LLVM\bin\clang-cl.exe
 set MERGEN_BUILD_JOBS=8
 cmd /c scripts\dev\build_iced.cmd
+python test.py quick
 ```
 
 ## Secondary Flows
diff --git a/docs/REWRITE_BASELINE.md b/docs/REWRITE_BASELINE.md
index 3e0a88f..234d3b8 100644
--- a/docs/REWRITE_BASELINE.md
+++ b/docs/REWRITE_BASELINE.md
@@ -22,17 +22,17 @@ Sample sources live in:
 - `scripts/rewrite/manifest_validation.ps1` — shared strict manifest validator used by both `run.ps1` and `verify.ps1`
 - `scripts/rewrite/run.cmd` — one-command Windows entrypoint
 - `scripts/rewrite/run_microtests.cmd` — runs `rewrite_microtests.exe` (in-process instruction-byte tests from `lifter/test/TestInstructions.cpp`); builds lazily only when the executable is missing, supports `--build` to force rebuild and `--no-build` to require prebuilt binaries
-- `scripts/rewrite/collect_instruction_tests.cmd` — reports handler coverage against `lifter/x86_64_opcodes.x` using oracle vector metadata (`handler` field) to track missing instruction tests
-- `scripts/rewrite/generate_oracle_vectors.cmd` — regenerates `lifter/test_vectors/oracle_vectors.json` from seed vectors using oracle providers (currently Unicorn)
+- `scripts/rewrite/collect_instruction_tests.cmd` — reports handler coverage against `lifter/semantics/x86_64_opcodes.x` using oracle vector metadata (`handler` field) to track missing instruction tests
+- `scripts/rewrite/generate_oracle_vectors.cmd` — regenerates `lifter/test/test_vectors/oracle_vectors.json` from seed vectors using oracle providers (currently Unicorn)
 - `scripts/rewrite/oracle_seed_vectors.json` — seed cases with instruction bytes, initial state, and tracked outputs for oracle generation
 - `scripts/rewrite/build_full_handler_seed.cmd` — builds `oracle_seed_full_handlers.json` (base semantic vectors + auto-discovered smoke vectors for missing handlers)
 - `scripts/rewrite/build_full_handler_seed.py` — Capstone-based opcode discovery that fills missing handlers and marks known-crashing handlers as `skip`
 - `scripts/rewrite/run_all_handlers.cmd` — generates full-handler seed/vectors and executes `rewrite_microtests.exe` across the full suite
-- `scripts/rewrite/generate_flag_stress_vectors.cmd` — builds `lifter/test_vectors/oracle_vectors_flagstress.json` with multiple strict flag-oracle cases per flag-writing handler
-- `scripts/rewrite/generate_flag_stress_vectors.py` — derives flag-writing handlers from `lifter/Semantics.ipp`, generates deterministic initial states, and computes expected flags via Unicorn
+- `scripts/rewrite/generate_flag_stress_vectors.cmd` — builds `lifter/test/test_vectors/oracle_vectors_flagstress.json` with multiple strict flag-oracle cases per flag-writing handler
+- `scripts/rewrite/generate_flag_stress_vectors.py` — derives flag-writing handlers from `lifter/semantics/Semantics.ipp`, generates deterministic initial states, and computes expected flags via Unicorn
 - `scripts/rewrite/run_flagstress.cmd` — one-command strict flag suite runner (auto-generates flag-stress vectors and executes microtests with strict flag assertions)
 - `run.ps1` validates that `instruction_microtests.json` covers every `testcases/rewrite_smoke/*` source file
-- `scripts/rewrite/check_semantic.py` — runtime semantic regression for all lifted samples; reads `semantic` cases from the manifest, generates lli-executable wrappers, and verifies return values across all declared inputs (23 samples, 107 test cases)
+- `scripts/rewrite/check_semantic.py` — runtime semantic regression for all lifted samples; reads `semantic` cases from the manifest, generates lli-executable wrappers, and verifies return values across all declared inputs (33 samples, 177 test cases)
 
 Helper build scripts for local development are in:
 
@@ -56,7 +56,7 @@ set MERGEN_BUILD_JOBS=8 &rem fast builds on large machines
 Use `run_microtests.cmd --check-flags ` to enforce oracle flag comparisons (strict mode, expected to fail until flag semantics are fixed).
 Use `run_microtests.cmd --build ` to force rebuilding `rewrite_microtests.exe`, or `run_microtests.cmd --no-build ` to skip any build step.
 Set `SKIP_ORACLE_GENERATION=1` to reuse a pre-generated oracle file. Set `MERGEN_TEST_VECTORS=` to point tests at a custom oracle JSON file.
-Use `run_all_handlers.cmd` to exercise full handler coverage smoke tests. It writes `lifter/test_vectors/oracle_vectors_full_handlers.json` and then runs microtests against it through `run_microtests.cmd` (which now builds lazily).
+Use `run_all_handlers.cmd` to exercise full handler coverage smoke tests. It writes `lifter/test/test_vectors/oracle_vectors_full_handlers.json` and then runs microtests against it through `run_microtests.cmd` (which now builds lazily).
 Oracle vector JSON fixtures are deterministic by design; regenerating them should only change tracked files when the underlying cases change, not because of wall-clock metadata.
 Full-handler vectors are expected to execute end-to-end (no default `skip: true` crash exclusions).
 Use `run_flagstress.cmd` (or `python test.py flags`) for broad strict-flag validation across all handlers that explicitly write flags.
@@ -69,13 +69,13 @@ By default, regression artifacts are written to a sibling folder outside the rep
 - `../rewrite-regression-work/`
 
 Artifacts include:
-- `lifter/test_vectors/oracle_vectors_flagstress.json` (generated strict-flag stress suite)
+- `lifter/test/test_vectors/oracle_vectors_flagstress.json` (generated strict-flag stress suite)
 - compiled sample binaries/maps/objects for every manifest entry
 - `ir_outputs/*.ll` and `ir_outputs/*_no_opts.ll` (replaced on each run after stale `.ll` cleanup)
 - `ir_outputs/*_semantic.ll` (generated by `check_semantic.py` for lli execution)
-- `lifter/test_vectors/oracle_vectors_full_handlers.json` (generated by `run_all_handlers.cmd`)
+- `lifter/test/test_vectors/oracle_vectors_full_handlers.json` (generated by `run_all_handlers.cmd`)
 
 ## Running the baseline gate
 
 From repository root:
@@ -84,6 +84,8 @@ From repository root:
 scripts\rewrite\run.cmd
 ```
 
+CI requires a pinned sample-build compiler via `CLANG_CL_EXE`, `CMAKE_C_COMPILER`, or `LLVM_DIR`. For local runs, set `CLANG_CL_EXE=C:\Program Files\LLVM\bin\clang-cl.exe` when you want `scripts\rewrite\run.cmd` or `python test.py quick` to use the same sample-build compiler resolution as CI instead of relying on fallback discovery.
+
 Optional custom output directory:
 
 ```bat
@@ -131,13 +133,12 @@ Samples without a `semantic` field are not tested. The `semantic` field is optio
 
 ### Coverage summary
 
-Current active quick-gate semantic coverage is **30 samples / 171 cases** on CI.
+Current active quick-gate semantic coverage is **33 samples / 177 cases** on CI and local pinned-toolchain runs.
 
 Notable current state:
 - `dummy_vm_loop`, `bytecode_vm_loop`, and `stack_vm_loop` are active VM-shaped control-flow samples.
-- `calc_sum_to_n` is active again under the safe structured-loop recovery path.
-- `calc_fib` and `calc_sum_array` are `ci_skip` on `windows-latest` because the current hosted toolchain still emits loop/array codegen shapes that fail lifting there even though local developer runs pass.
-- `calc_cout` remains `ci_skip` because its C++ codegen is toolchain-dependent on CI.
+- `calc_sum_to_n`, `calc_fib`, and `calc_sum_array` are active again under the current safe path.
+- `calc_cout` is active again after SSE2 `PUNPCKLQDQ` support landed; the manifest currently has zero `ci_skip` entries.
 
 ## Call-boundary ABI framework
 
diff --git a/lifter/analysis/PathSolver.ipp b/lifter/analysis/PathSolver.ipp
index c227692..eb874b0 100644
--- a/lifter/analysis/PathSolver.ipp
+++ b/lifter/analysis/PathSolver.ipp
@@ -61,15 +61,25 @@ MERGEN_LIFTER_DEFINITION_TEMPLATES(PATH_info)::solvePath(
     auto it = addrToBB.find(target);
     const bool hasPendingGeneralization =
         pendingLoopGeneralizationAddresses.contains(target);
+    // `resolveTargetBlock` is only reached with a concrete destination, so an
+    // indirect jump whose target has just resolved participates in the same
+    // structured-loop generalization path that direct and conditional jumps
+    // already take.
     const bool canUseStructuredLoopGeneralization =
-        currentPathSolveAllowsStructuredLoopGeneralization();
+        currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget();
     const bool canReusePendingGeneralization =
         hasPendingGeneralization && canUseStructuredLoopGeneralization;
     const bool wantsGeneralization =
         canReusePendingGeneralization ||
-        (backwardVisitedTarget && canGeneralizeStructuredLoopHeader(target));
+        (backwardVisitedTarget &&
+         canGeneralizeStructuredLoopHeader(target,
+                                           /*targetResolvedConcretely=*/true));
     if (wantsGeneralization) {
-      if (currentPathSolveContext == PathSolveContext::DirectJump) {
+      // A resolved backward target participates in the same stack-concolic
+      // bypass regime regardless of whether the source jump is direct or
+      // indirect — both represent a confirmed loop back-edge.
+      if (currentPathSolveContext == PathSolveContext::DirectJump ||
+          currentPathSolveContext == PathSolveContext::IndirectJump) {
         stackBypassGeneralizedLoopAddresses.insert(target);
       }
       const bool generalizedBackup =
diff --git a/lifter/core/LifterClass.hpp b/lifter/core/LifterClass.hpp
index fb06818..9a50429 100644
--- a/lifter/core/LifterClass.hpp
+++ b/lifter/core/LifterClass.hpp
@@ -507,6 +507,16 @@ public:
     return currentPathSolveContext == PathSolveContext::ConditionalBranch ||
            currentPathSolveContext == PathSolveContext::DirectJump;
   }
+  // Widened variant: when the path solver has already resolved the branch
+  // target to a concrete address, an indirect jump is no longer speculative.
+  // If its target also points backward at a visited block it is legitimately
+  // a loop back-edge and should enter structured loop generalization alongside
+  // direct and conditional jumps. Ret-path contexts have their own lifecycle
+  // and stay excluded here.
+  bool currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget() const {
+    return currentPathSolveAllowsStructuredLoopGeneralization() ||
+           currentPathSolveContext == PathSolveContext::IndirectJump;
+  }
   bool isStructuredLoopHeaderShape(BasicBlock* block) const {
     std::set seenBlocks;
     auto* current = block;
@@ -558,9 +568,13 @@ public:
     return false;
   }
 
-  bool canGeneralizeStructuredLoopHeader(uint64_t addr) {
-    if (getControlFlow() != ControlFlow::Unflatten ||
-        !currentPathSolveAllowsStructuredLoopGeneralization() ||
+  bool canGeneralizeStructuredLoopHeader(uint64_t addr,
+                                         bool targetResolvedConcretely = false) {
+    const bool contextAllows =
+        targetResolvedConcretely
+            ? currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()
+            : currentPathSolveAllowsStructuredLoopGeneralization();
+    if (getControlFlow() != ControlFlow::Unflatten || !contextAllows ||
         addr > blockInfo.block_address || !visitedAddresses.contains(addr) ||
         pendingLoopGeneralizationAddresses.contains(addr) ||
         generalizedLoopAddresses.contains(addr)) {
@@ -821,8 +835,13 @@ public:
 
   BasicBlock* getLiftedBackedgeBB(uint64_t addr) {
+    // A resolved backward target is eligible for reuse regardless of whether
+    // the branching source was direct, conditional, or indirect. Once we have
+    // a non-empty generalized block for the address, re-entering it on a
+    // subsequent iteration should branch into that block rather than cutting a
+    // fresh empty one through `getOrCreateBB` (which would orphan the body).
     if (getControlFlow() != ControlFlow::Unflatten ||
-        !currentPathSolveAllowsStructuredLoopGeneralization()) {
+        !currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()) {
       return nullptr;
     }
     if (addr > blockInfo.block_address ||
diff --git a/lifter/test/Tester.hpp b/lifter/test/Tester.hpp
index 1fda4ae..67ee378 100644
--- a/lifter/test/Tester.hpp
+++ b/lifter/test/Tester.hpp
@@ -435,13 +435,37 @@ private:
     return true;
   }
 
-  bool runLoopGeneralizationIndirectJumpBlocked(std::string& details) {
+  bool runLoopGeneralizationIndirectJumpBlockedWhenUnresolved(std::string& details) {
+    // The unresolved-indirect-jump predicate must still exclude indirect
+    // dispatchers from speculative loop generalization. Without a concrete
+    // target, we have no proof the jump forms a backward loop edge.
     LifterUnderTest lifter;
     lifter.currentPathSolveContext =
         LifterUnderTest::PathSolveContext::IndirectJump;
     if (lifter.currentPathSolveAllowsStructuredLoopGeneralization()) {
       details =
-          " indirect-jump dispatcher context must not generalize loop state\n";
+          " unresolved indirect-jump context must not generalize loop state\n";
+      return false;
+    }
+    return true;
+  }
+
+  bool runLoopGeneralizationIndirectJumpAllowedWhenResolved(std::string& details) {
+    // Once `solvePath` has pinned an indirect jump to a concrete destination,
+    // the resolved-target predicate widens to admit it. Ret-path contexts
+    // still have their own lifecycle and stay excluded.
+    LifterUnderTest lifter;
+    lifter.currentPathSolveContext =
+        LifterUnderTest::PathSolveContext::IndirectJump;
+    if (!lifter.currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()) {
+      details =
+          " resolved indirect-jump context must allow structured loop generalization\n";
+      return false;
+    }
+    lifter.currentPathSolveContext = LifterUnderTest::PathSolveContext::Ret;
+    if (lifter.currentPathSolveAllowsStructuredLoopGeneralizationForResolvedTarget()) {
+      details =
+          " ret context must never participate in structured loop generalization\n";
       return false;
     }
     return true;
@@ -457,9 +481,9 @@ private:
     return true;
   }
 
-  bool runPendingGeneralizedLoopBlockedByContext(
+  bool runPendingGeneralizedLoopByContext(
       LifterUnderTest::PathSolveContext context, const char* contextName,
-      std::string& details) {
+      bool expectReuse, std::string& details) {
     LifterUnderTest lifter;
     lifter.currentPathSolveContext = context;
 
@@ -486,13 +510,19 @@ private:
           " context did not emit the expected direct branch\n";
       return false;
     }
-    if (branch->getSuccessor(0) == pending) {
+    const bool reused = branch->getSuccessor(0) == pending;
+    if (expectReuse && !reused) {
+      details = std::string(" ") + contextName +
+          " context must reuse the pending generalized loop header when the target resolved concretely\n";
+      return false;
+    }
+    if (!expectReuse && reused) {
       details = std::string(" ") + contextName +
           " context must not reuse a pending generalized loop header\n";
       return false;
     }
-    if (lifter.unvisitedBlocks.empty() ||
-        lifter.unvisitedBlocks.back().block == pending) {
+    if (!expectReuse && (lifter.unvisitedBlocks.empty() ||
+        lifter.unvisitedBlocks.back().block == pending)) {
       details = std::string(" ") + contextName +
           " context queued the pending generalized loop header instead of a fresh block\n";
       return false;
@@ -505,14 +535,22 @@ private:
     return true;
   }
 
-  bool runPendingGeneralizedLoopIndirectJumpBlocked(std::string& details) {
-    return runPendingGeneralizedLoopBlockedByContext(
-        LifterUnderTest::PathSolveContext::IndirectJump, "indirect-jump", details);
+  bool runPendingGeneralizedLoopIndirectJumpAllowedWhenResolved(std::string& details) {
+    // After the resolved-target relaxation, a constant-folded indirect-jump
+    // target that matches a pending generalized loop header is reused just
+    // like a direct-jump target would be.
+    return runPendingGeneralizedLoopByContext(
+        LifterUnderTest::PathSolveContext::IndirectJump, "indirect-jump",
+        /*expectReuse=*/true, details);
   }
 
   bool runPendingGeneralizedLoopRetBlocked(std::string& details) {
-    return runPendingGeneralizedLoopBlockedByContext(
-        LifterUnderTest::PathSolveContext::Ret, "return-path", details);
+    // Return-path contexts keep their own lifecycle — they must not reuse
+    // pending generalized loop headers, even now that the resolved-target
+    // relaxation admits indirect jumps.
+    return runPendingGeneralizedLoopByContext(
+        LifterUnderTest::PathSolveContext::Ret, "return-path",
+        /*expectReuse=*/false, details);
   }
 
@@ -936,12 +974,14 @@ private:
               &InstructionTester::runLoopGeneralizationConditionalBranchAllowed);
     runCustom("loop_generalization_direct_jump_allowed",
               &InstructionTester::runLoopGeneralizationDirectJumpAllowed);
-    runCustom("loop_generalization_indirect_jump_blocked",
-              &InstructionTester::runLoopGeneralizationIndirectJumpBlocked);
+    runCustom("loop_generalization_indirect_jump_blocked_when_unresolved",
+              &InstructionTester::runLoopGeneralizationIndirectJumpBlockedWhenUnresolved);
+    runCustom("loop_generalization_indirect_jump_allowed_when_resolved",
+              &InstructionTester::runLoopGeneralizationIndirectJumpAllowedWhenResolved);
     runCustom("loop_generalization_ret_blocked",
               &InstructionTester::runLoopGeneralizationRetBlocked);
-    runCustom("pending_generalized_loop_indirect_jump_blocked",
-              &InstructionTester::runPendingGeneralizedLoopIndirectJumpBlocked);
+    runCustom("pending_generalized_loop_indirect_jump_allowed_when_resolved",
+              &InstructionTester::runPendingGeneralizedLoopIndirectJumpAllowedWhenResolved);
     runCustom("pending_generalized_loop_ret_blocked",
               &InstructionTester::runPendingGeneralizedLoopRetBlocked);
     runCustom("structured_loop_header_allows_conditional_backedge",