15 Commits

Author SHA1 Message Date
naci babe982b65 docs: correct SCOPE loop-header generalization status (#117)
Line 28 read 'Temporarily disabled while the team keeps required VMP 3.8.x targets on the safe high-budget path'.  That is stale relative to the current code: canGeneralizeStructuredLoopHeader (lifter/core/LifterClass.hpp) gates generalization on path-solve context plus nine operational guards, and the corresponding loop_generalization_* microtests pass on main.  Describe the actual gating and point readers at docs/LOOP_HANDLING.md.

Co-authored-by: yusufcanislek <yusuf.canislek@meetdandy.com>
2026-04-22 23:06:47 +03:00
naci 0fbc2e9a52 Upgrade rewrite gate clang-cl to 21.1.8; re-enable calc_fib/calc_sum_array (#96)
The windows-latest preinstalled clang-cl (currently 20.1.8 at
`C:\Program Files\LLVM\bin\clang-cl.exe`) produces a lifter binary
that segfaults on calc_fib before emitting any IR, causing the rewrite
gate to fail. Clang 21.1.8 has been verified locally to compile the
lifter into a binary that lifts both calc_fib and calc_sum_array to
their expected constant returns (`ret i64 13` and `ret i64 150`).

Rolling back to clang 18.x is not an option: the runner image's MSVC STL
(14.44+) hard-requires clang 19.0.0 or newer via a static_assert in
yvals_core.h. Clang 21 satisfies that bound and dodges the clang 20.1.8
miscompile.

Upgrading via `choco upgrade llvm --version=21.1.8` keeps the existing
`C:\Program Files\LLVM\bin\clang-cl.exe` path valid, so the rest of
the pipeline (Resolve LLVM_DIR, Resolve clang-cl, Configure, Build) is
unchanged.

## Changes
- `.github/workflows/rewrite-strict-gate.yml`: add an "Upgrade clang-cl
  to 21.1.8" step before `Resolve LLVM_DIR` that runs `choco upgrade
  llvm` and pins `CMAKE_{C,CXX}_COMPILER` to the upgraded binary.
- `scripts/rewrite/instruction_microtests.json`: drop the `ci_skip`
  entries on `calc_fib` and `calc_sum_array`.
- `docs/SCOPE.md`: bump the corpus counts to 33 samples / 177 runtime
  semantic cases.

## Follow-up
Investigating the underlying clang 20.1.8 miscompile in the lifter is
still worth doing: it's almost certainly UB somewhere in the
structured-loop recovery path that clang 21 happens to tolerate. Tracked
separately.

Co-authored-by: NaC-L <nac-l@users.noreply.github.com>
2026-04-07 18:33:05 +03:00
naci acab499d3f Re-skip calc_fib and calc_sum_array in CI (#95)
PR #93 un-skipped both samples after a clean local Release build proved
they lift correctly, but the windows-latest CI lane still fails on them
with `Lifter failed for calc_fib` (run 24077021868). The HANDOFF note that
windows-latest clang-cl produces a different codegen shape than the
locally pinned clang-cl turned out to be the actual root cause; the
"stale build cache" theory only explained the local symptom.

Restoring the `ci_skip` entries unbreaks the rewrite-strict-gate and
rewrite-quick-gate workflows. Real fix tracked as a follow-up: either
teach the lifter the CI codegen shape, or pin the rewrite CI lane to a
toolchain that matches the local one byte-for-byte.

Also reverts the `docs/SCOPE.md` corpus counts to 31 samples / 175 cases.

Co-authored-by: NaC-L <nac-l@users.noreply.github.com>
2026-04-07 17:04:16 +03:00
naci 089e10ac08 Re-enable calc_fib and calc_sum_array in rewrite gate (#93)
Both samples were originally CI-skipped because windows-latest clang-cl
produced loop/array codegen shapes that tripped the lifter on CI even
though local runs passed. Since then the rewrite CI lane has been pinned
to the same LLVM 18.1.8 clang-cl used locally (eb49a35, 949acaa, a28a368)
and several structured loop recovery fixes have landed (2989e5a, 2eaa22e),
so the codegen mismatch that motivated the skips is gone.

Verified locally with a clean Release build (`cmd /c scripts\dev\configure_iced.cmd`
followed by `build_iced.cmd`):
- `calc_fib` lifts to `ret i64 13` and passes its semantic case
- `calc_sum_array` lifts to `ret i64 150` and passes its semantic case
- `python test.py all` is fully green: semantic 33/33 (was 31/31),
  baseline, micro --check-flags, full handler suite 115/119, determinism

Drops the two `ci_skip` entries from `instruction_microtests.json` and
updates `docs/SCOPE.md` corpus counts to 33 samples / 177 cases.

Co-authored-by: NaC-L <nac-l@users.noreply.github.com>
2026-04-07 13:38:38 +03:00
naci 5ccd498998 Implement PUNPCKLQDQ and re-enable calc_cout (#92)
- Add lift_punpcklqdq handler in Semantics_Misc.ipp (XMM dest, low-quadword
  interleave from dest+src into a 128-bit result; rejects MMX/non-XMM forms
  via the standard not_implemented bailout)
- Wire OPCODE(punpcklqdq, PUNPCKLQDQ) in x86_64_opcodes.x and add a missing
  trailing newline
- Add manual punpcklqdq case to TestInstructions.cpp (rdrand-style XMM seed)
  and matching seeds in build_full_handler_seed.py
- Regenerate oracle_seed_full_handlers{,_enriched}.json, oracle_seed_vectors.json,
  and oracle_vectors_full_handlers.json with two punpcklqdq vectors
  (basic interleave, low-source-zero edge case)
- Drop ci_skip on calc_cout in instruction_microtests.json now that the STL
  PUNPCKLQDQ path lifts cleanly (4/4 semantic cases pass locally)
- Keep calc_fib and calc_sum_array ci_skipped: they still trip a separate
  lifter dyn_cast assertion that is not related to PUNPCKLQDQ; tracked as
  follow-up
- Update docs/SCOPE.md handler counts (115/119 covered, 4 intentional skips)
  and corpus counts (31 active samples / 175 cases)
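
For reference, the low-quadword interleave that the PUNPCKLQDQ handler models can be sketched in plain Python on 128-bit integers. This illustrates the instruction's XMM semantics only, not the code in Semantics_Misc.ipp; the function name and sample values are hypothetical.

```python
MASK64 = (1 << 64) - 1

def punpcklqdq(dest: int, src: int) -> int:
    """Model PUNPCKLQDQ xmm_dest, xmm_src on 128-bit integers (illustrative only)."""
    low = dest & MASK64    # result[63:0]   = dest[63:0]
    high = src & MASK64    # result[127:64] = src[63:0]
    return (high << 64) | low

# Basic interleave: the high quadwords of both operands are discarded.
basic = punpcklqdq(0xAAAAAAAAAAAAAAAA_1111111111111111,
                   0xBBBBBBBBBBBBBBBB_2222222222222222)

# Edge case: a zero low quadword in the source zeroes the result's high half.
low_source_zero = punpcklqdq(5, 0xFFFF << 64)
```

The two calls correspond loosely to the "basic interleave" and "low-source-zero" vectors named above; the actual oracle vectors may use different inputs.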

Co-authored-by: NaC-L <nac-l@users.noreply.github.com>
2026-04-07 12:58:44 +03:00
yusufcanislek 825b29946d Fix CI coverage counts in docs 2026-04-04 16:57:17 +03:00
yusufcanislek fa95a27dae CI-skip calc_sum_array on windows-latest 2026-04-04 16:46:34 +03:00
yusufcanislek 81bc3a89da CI-skip calc_fib on windows-latest 2026-04-04 16:34:38 +03:00
yusufcanislek 2989e5ab58 Recover structured loop lifting safely 2026-04-03 19:54:51 +03:00
yusufcanislek 8fba033cc6 Fix VMP gate and loop safety 2026-04-03 15:00:42 +03:00
yusufcanislek 1020775ec0 feat: prototype minimization + canonical IR naming
Two new post-optimization passes that run after the final O2 pipeline:

PrototypeMinimizationPass:
- Removes unused function arguments based on Argument::use_empty()
- Typical reduction: 34 params -> 0-2 (e.g. @main(i64 %RCX) instead of all 16 GPRs + 16 XMMs + 2 ptrs)
- Splices basic blocks into new function, remaps argument uses, erases old function
- Updated check_semantic.py to parse actual IR signatures instead of hardcoded 34-param list

CanonicalNamingPass:
- Strips address-derived suffixes from block/value names for deterministic output
- Blocks: entry, bb1, bb2, ... (sequential)
- Values: semantic prefix preserved, address suffix removed (realadd-5368713230- -> realadd)
- Same input now produces byte-identical IR across rebuilds
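
A string-level sketch of the renaming rules listed above, written in Python for brevity. The real pass operates on LLVM values and blocks in C++; the regex assumes the suffix always has the `-<decimal address>-` shape from the `realadd` example, and both helper names are hypothetical.

```python
import re

# Assumed suffix shape, taken from the example above: "-<decimal address>-".
_ADDR_SUFFIX = re.compile(r"-\d+-?$")

def canonical_value_name(name: str) -> str:
    """Drop a trailing address suffix, keeping the semantic prefix."""
    return _ADDR_SUFFIX.sub("", name)

def canonical_block_names(blocks: list[str]) -> dict[str, str]:
    """Map blocks, in layout order, to entry, bb1, bb2, ..."""
    return {old: ("entry" if i == 0 else f"bb{i}") for i, old in enumerate(blocks)}
```

Because neither helper looks at the address itself, two lifts of the same input that differ only in addresses produce the same names.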

Also fixed writeFunctionToFile to use stored module pointer M instead of
fnc->getParent() (dangling after prototype minimization erases the old function).

Review fixes:
- CanonicalNamingPass: use StringMap<unsigned> instead of DenseMap<StringRef> (dangling key)
- PrototypeMinimizationPass: restrict call rewriting to CallInst (not InvokeInst/CallBrInst)
- PrototypeMinimizationPass: guard F->eraseFromParent() with use_empty() check
- check_semantic.py: widen define regex to handle dso_local and other prefixes
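
One way to read parameters from the minimized signature instead of a hardcoded list is sketched below. This is not the regex used in check_semantic.py; `DEFINE_RE` and `parse_signature` are hypothetical names, and the pattern assumes no parameter attribute contains parentheses.

```python
import re

# Accept anything between "define" and "@name" (linkage, dso_local, return type).
DEFINE_RE = re.compile(r"^define\b[^@\n]*@(?P<name>[\w.$]+)\((?P<params>[^)]*)\)", re.M)

def parse_signature(ir: str):
    """Return (function name, [parameter names]) for the first define, or None."""
    m = DEFINE_RE.search(ir)
    if m is None:
        return None
    params = [p.split()[-1].lstrip("%") for p in m["params"].split(",") if p.strip()]
    return m["name"], params

sig = parse_signature("define dso_local i64 @main(i64 %RCX) {\n  ret i64 0\n}\n")
```

Here `sig` is `("main", ["RCX"])`, matching the `@main(i64 %RCX)` example above.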

All 28 samples pass, 146 semantic cases, 56 golden hashes updated.
2026-03-29 11:00:07 +03:00
naci 6ee50d315e test: add jump table regression suite (5 samples, 39 semantic cases) (#80)
* test: add jump table regression suite (5 samples, 39 semantic cases)

Add 5 new jump table test cases covering the major dispatch patterns:

- jumptable_rel32.asm: RIP-relative dword offset table (lea+movsxd+add+jmp)
- jumptable_shifted.asm: base-shifted range check (sub before index)
- jumptable_shared_targets.asm: multiple cases sharing handlers
- jumptable_computation.asm: case bodies with symbolic arithmetic
- calc_jumptable_large.c: 16-case dense C switch compiled at /O2
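
For readers unfamiliar with the rel32 pattern, target resolution for a dword offset table (lea + movsxd + add + jmp) can be sketched as follows. The base address and table contents are made up for illustration, and the sketch assumes offsets are relative to whatever address the `lea` loaded.

```python
import struct

def rel32_targets(table: bytes, base: int, count: int) -> list[int]:
    """Resolve `count` signed 32-bit entries relative to `base` (movsxd + add)."""
    offsets = struct.unpack_from(f"<{count}i", table)  # little-endian, sign-extended
    return [(base + off) & 0xFFFFFFFFFFFFFFFF for off in offsets]

# Hypothetical three-entry table; the last entry is a negative offset.
table = struct.pack("<3i", 0x40, 0x58, -0x10)
targets = rel32_targets(table, base=0x140001000, count=3)
```

The negative entry shows why the sign extension in `movsxd` matters: a handler may sit below the base address.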

All 5 pass lifting and semantic validation (39 new cases, 146 total).
Update golden hashes (46 -> 56 files), manifest, and docs.

* fix(ci): exclude C-compiled samples from golden IR hashes

C-compiled samples (calc_*) produce address-dependent IR because the
linker places symbols at different addresses depending on toolchain
version, link order, and build environment. The determinism check
comment (test.py L123-125) already documented this exclusion policy
but the golden hash file included them anyway, causing rewrite-quick-gate
to fail on CI.

Remove all 14 calc_* entries from golden_ir_hashes.json (56 -> 42).
C-compiled sample correctness is still validated by semantic tests.

---------

Co-authored-by: yusufcanislek <yusuf.canislek@meetdandy.com>
2026-03-29 09:46:52 +03:00
yusufcanislek 6d0157f26b feat: call-boundary ABI framework with strict clobber + speculative inlining scaffolding
Cross-ABI call contract (AbiCallContract.hpp):
- AbiKind enum (x64_msvc, x86_cdecl/stdcall/fastcall, unknown)
- CallModelMode: strict (default) clobbers volatile regs, compat preserves all
- CallEffects: arg regs, return regs, volatile set, stack cleanup, memory effect
- Pre-built descriptors for x64 MSVC and x86 calling conventions
- Structured diagnostics at every call site ([call-abi] prefix)

Call-site semantics (lift_call):
- applyPostCallEffects: assigns RAX=result, clobbers volatile in strict mode
- emittedExternalCall flag: skips Unflatten inlining when CreateCall emitted
- Import thunk detection (FF 25 jmp [IAT]): auto-outlines DLL imports
- shouldOutlineCall hook: extensible policy for inline/outline decisions
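
The difference between strict and compat mode can be modeled on a plain register map. This is a sketch under assumptions, not the lifter's code: the Python function only borrows the name of applyPostCallEffects, `UNDEF` stands in for an LLVM undef value, and only the x64 MSVC volatile GPRs are listed (XMM0-XMM5 are omitted).

```python
# Volatile (caller-saved) general-purpose registers in the x64 MSVC convention.
X64_MSVC_VOLATILE_GPRS = ("RAX", "RCX", "RDX", "R8", "R9", "R10", "R11")
UNDEF = object()  # placeholder for "value unknown after the call"

def apply_post_call_effects(regs: dict, result, mode: str = "strict") -> dict:
    """Return the register state after an external call (illustrative model)."""
    out = dict(regs)
    if mode == "strict":
        for reg in X64_MSVC_VOLATILE_GPRS:
            out[reg] = UNDEF      # strict: volatile registers are clobbered
    out["RAX"] = result           # both modes: RAX carries the call result
    return out

before = {"RAX": 1, "RBX": 2, "R10": 3}
strict = apply_post_call_effects(before, result=99)
compat = apply_post_call_effects(before, result=99, mode="compat")
```

In this model `strict["R10"]` is `UNDEF` while `compat["R10"]` keeps its value, and `RBX` survives in both because it is non-volatile.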

Bug fixes:
- parseArgs(nullptr) duplicated RDI (18 values for 16-type slots) — now 16 GPRs + memory ptr
- Unknown calls in lift_call never assigned RAX = call result — now they do
- callFunctionIR routed through applyPostCallEffects for consistency

Speculative inlining (disabled by default, opt-in via maxCallInlineBudget):
- Budget-limited call inlining with bail-out to CreateCall + ABI effects
- Worklist trimming on bail-out restores pre-call continuation
- Works mechanically but needs smarter trigger policy (see open issue)

Tests:
- call_abi_compat_preserves_volatile: R10 survives, RAX = result
- call_abi_strict_clobbers_volatile: R10 = undef, RBX preserved, RAX = result
- call_abi_default_is_strict: verifies strict is the default
- All existing baseline (90+), semantic (23/23), micro (15) tests pass
- VMP 3.8.1 target produces identical a+b+c deobfuscation
2026-03-26 09:53:16 +03:00
yusufcanislek 8e2ada491f Add SSE2 integer XMM lifting and oracle coverage 2026-03-07 16:14:34 +03:00
yusufcanislek a67bcf3ee2 Add C test binaries, NASM test cases, deterministic IR hashing, SCOPE doc
Test infra:
- test.py: flag checks always-on for quick/all; deterministic IR hash
  verification via SHA-256; update-golden subcommand
- run.ps1: accept both .asm and .c source files in manifest validation
- build_samples.cmd: compile C files with cl.exe /Od /GS- alongside NASM
- CI: rewrite-strict-gate.yml uses test.py defaults (flags always on)
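
A minimal sketch of SHA-256 golden-hash verification, assuming the golden table maps file names to hex digests. The actual layout of the golden file and the function names in test.py may differ; `ir_hash` and `check_golden` are hypothetical.

```python
import hashlib

def ir_hash(ir_bytes: bytes) -> str:
    """Hex SHA-256 digest of one emitted .ll file's raw bytes."""
    return hashlib.sha256(ir_bytes).hexdigest()

def check_golden(outputs: dict[str, bytes], golden: dict[str, str]) -> list[str]:
    """Return names whose hash is missing from, or differs from, the golden table."""
    return sorted(name for name, data in outputs.items()
                  if golden.get(name) != ir_hash(data))

golden = {"sample.ll": ir_hash(b"ret i64 13\n")}
unchanged = check_golden({"sample.ll": b"ret i64 13\n"}, golden)
drifted = check_golden({"sample.ll": b"ret i64 14\n"}, golden)
```

Hashing raw bytes means any nondeterminism in the output, including names or ordering, shows up as a mismatch.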

New test cases (10 total):
- 6 NASM: nested_branch, loop_simple, bitchain, multi_arg, diamond, cmov_chain
- 4 C (MSVC /Od): calc_grade (5-way branch), calc_mixed (symbolic+concrete),
  calc_fib (loop->const fold to 13), calc_sum_array (array->const fold to 150)

Manifest: 17 samples, 40 pattern checks
Golden hashes: 34 .ll files (17 optimized + 17 unoptimized)
Handler microtests: 108/111 (97.3%), flags enforced

Docs:
- docs/SCOPE.md: supported/unsupported pattern matrix
2026-03-05 20:31:53 +03:00