37 Commits

Author SHA1 Message Date
naci 0fbc2e9a52 Upgrade rewrite gate clang-cl to 21.1.8; re-enable calc_fib/calc_sum_array (#96)
The windows-latest preinstalled clang-cl (currently 20.1.8 at
`C:\Program Files\LLVM\bin\clang-cl.exe`) produces a lifter binary
that segfaults on calc_fib before emitting any IR, causing the rewrite
gate to fail. Clang 21.1.8 has been verified locally to compile the
lifter into a binary that lifts both calc_fib and calc_sum_array to
their expected constant returns (`ret i64 13` and `ret i64 150`).

Rolling back to clang 18.x is not an option: the runner image's MSVC STL
(14.44+) hard-requires clang 19.0.0 or newer via a static_assert in
yvals_core.h. Clang 21 satisfies that bound and dodges the clang 20.1.8
miscompile.
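
The version constraints above (at least 19.0.0 for the MSVC STL, and not 20.1.8 because of the miscompile) can be expressed as a small pre-flight check. This is only a sketch: `parse_version`, `is_acceptable`, `MIN_CLANG`, and `KNOWN_BAD` are hypothetical names that do not exist in the repository, and the assumption is that the version string contains a plain `X.Y.Z` triple.

```python
import re

# Lower bound taken from the commit message (MSVC STL static_assert).
MIN_CLANG = (19, 0, 0)
# Version the commit message reports as producing a segfaulting lifter.
KNOWN_BAD = {(20, 1, 8)}


def parse_version(text):
    """Return the first X.Y.Z triple found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no X.Y.Z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_acceptable(version):
    """True if the version meets the lower bound and is not known-bad."""
    return version >= MIN_CLANG and version not in KNOWN_BAD
```

For example, `is_acceptable(parse_version("clang version 21.1.8"))` is true, while 18.1.8 fails the lower bound and 20.1.8 is rejected as known-bad.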

Upgrading via `choco upgrade llvm --version=21.1.8` keeps the existing
`C:\Program Files\LLVM\bin\clang-cl.exe` path valid, so the rest of
the pipeline (Resolve LLVM_DIR, Resolve clang-cl, Configure, Build) is
unchanged.

## Changes
- `.github/workflows/rewrite-strict-gate.yml`: add an "Upgrade clang-cl
  to 21.1.8" step before `Resolve LLVM_DIR` that runs `choco upgrade
  llvm` and pins `CMAKE_{C,CXX}_COMPILER` to the upgraded binary.
- `scripts/rewrite/instruction_microtests.json`: drop the `ci_skip`
  entries on `calc_fib` and `calc_sum_array`.
- `docs/SCOPE.md`: bump the corpus counts to 33 samples / 177 runtime
  semantic cases.

## Follow-up
Investigating the underlying clang 20.1.8 miscompile in the lifter is
still worth doing; it's almost certainly UB somewhere in the
structured-loop recovery path that clang 21 happens to tolerate. Tracked
separately.

Co-authored-by: NaC-L <nac-l@users.noreply.github.com>
2026-04-07 18:33:05 +03:00
yusufcanislek 0406553c21 Fix rewrite coverage summary JSON collection 2026-04-05 23:56:49 +03:00
yusufcanislek b4861ef3f2 Fix rewrite coverage summary command 2026-04-05 23:50:53 +03:00
yusufcanislek d3dda532dd Export clang-cl for rewrite gates 2026-04-05 23:44:16 +03:00
yusufcanislek 2eaa22ee63 Fix structured loop recovery regressions 2026-04-05 23:33:30 +03:00
yusufcanislek a28a368a7d Use runner clang for rewrite CI 2026-04-04 16:42:14 +03:00
yusufcanislek a24188e9e9 Install sample compiler after lifter build 2026-04-04 16:26:20 +03:00
yusufcanislek 8d0a6f34d5 Use LLVM 18 only for rewrite samples 2026-04-04 16:19:23 +03:00
yusufcanislek 3c2b7ec609 Split LLVM config and clang sources in CI 2026-04-04 16:11:13 +03:00
yusufcanislek 9a590484bc Find LLVMConfig under installed LLVM 2026-04-04 16:06:36 +03:00
yusufcanislek 949acaa4ff Install LLVM 18.1.8 in Windows CI 2026-04-04 16:02:20 +03:00
yusufcanislek eb49a35cc7 Pin rewrite CI clang-cl toolchain 2026-04-04 15:53:28 +03:00
yusufcanislek 555a23def5 Force pinned LLVM clang in rewrite CI 2026-04-04 09:56:03 +03:00
yusufcanislek aa101dc6d0 ci: add Windows build workflow with artifact upload
Build both iced and zydis variants on windows-latest.
Upload lifter.exe and rewrite_microtests.exe as downloadable artifacts.
Triggers on push to main, tags (v*), PRs to main, and manual dispatch.
Uses pre-built LLVM 18.1.8 from vovkos/llvm-package-windows.
2026-03-29 10:05:17 +03:00
yusufcanislek cdf03f1721 issue: speculative call inlining policy design (inline vs outline)
Documents the unsolved problem of distinguishing real function calls
(library, CRT) from obfuscation call gadgets (VM handlers, push+ret).

Current state: import thunks auto-detected, speculative budget mechanism
built but disabled. Needs call-depth scoped policy or hybrid approach.

See .github/SPECULATIVE_INLINE_ISSUE.md for full analysis and proposals.
2026-03-26 09:54:08 +03:00
yusufcanislek 1a60446f57 Refactor: split Semantics.ipp, remove dead code, add editor configs
- Split Semantics.ipp (4509 lines) into 5 category sub-files:
  Semantics_Helpers.ipp, Semantics_ControlFlow.ipp,
  Semantics_Arithmetic.ipp, Semantics_Bitwise.ipp, Semantics_Misc.ipp
- Remove dead simplifyValue() function and all call sites
- Remove 330 lines of commented-out Z3 integration code from OperandUtils.ipp
- Remove commented-out dead code from ZydisDisassembler.hpp and lifterClass_symbolic.hpp
- Remove using namespace llvm from GEPTracker.h header
- Add .clang-format (LLVM base) and .editorconfig
- Update .gitignore for temp/debug file patterns
- Fix Docker workflow placeholder tag (my-image-name -> mergen)
- Align cmake.toml rewrite_microtests target with lifter target
2026-03-06 17:56:51 +03:00
yusufcanislek a67bcf3ee2 Add C test binaries, NASM test cases, deterministic IR hashing, SCOPE doc
Test infra:
- test.py: flag checks always-on for quick/all; deterministic IR hash
  verification via SHA-256; update-golden subcommand
- run.ps1: accept both .asm and .c source files in manifest validation
- build_samples.cmd: compile C files with cl.exe /Od /GS- alongside NASM
- CI: rewrite-strict-gate.yml uses test.py defaults (flags always on)
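
One way the deterministic IR hash check could work is sketched below. It is not the actual test.py implementation: the normalization rules (dropping `; ModuleID` and `source_filename` lines, unifying line endings, trimming trailing whitespace) and the names `normalized_ir_hash` and `check_golden` are assumptions made for illustration.

```python
import hashlib


def normalized_ir_hash(ir_text):
    """SHA-256 of LLVM IR text after removing path- and platform-dependent noise."""
    kept = []
    for line in ir_text.replace("\r\n", "\n").split("\n"):
        stripped = line.rstrip()
        # Assumed normalization: these header lines embed file names/paths.
        if stripped.startswith("; ModuleID") or stripped.startswith("source_filename"):
            continue
        if stripped:
            kept.append(stripped)
    canonical = "\n".join(kept) + "\n"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_golden(ir_text, golden_hex):
    """Compare the normalized hash against a stored golden hex digest."""
    return normalized_ir_hash(ir_text) == golden_hex.strip().lower()
```

Under this scheme an update-golden style command would simply rewrite the stored digest with `normalized_ir_hash(ir_text)` for each .ll file.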

New test cases (10 total):
- 6 NASM: nested_branch, loop_simple, bitchain, multi_arg, diamond, cmov_chain
- 4 C (MSVC /Od): calc_grade (5-way branch), calc_mixed (symbolic+concrete),
  calc_fib (loop->const fold to 13), calc_sum_array (array->const fold to 150)

Manifest: 17 samples, 40 pattern checks
Golden hashes: 34 .ll files (17 optimized + 17 unoptimized)
Handler microtests: 108/111 (97.3%), flags enforced

Docs:
- docs/SCOPE.md: supported/unsupported pattern matrix
2026-03-05 20:31:53 +03:00
yusufcanislek 7362486e82 Address PR review: fail-fast oracle pipeline, stable shift vectors, stricter validation
- Workflow: enforce per-package choco install exit-code checks in rewrite gate
- Seed builder:
  - make sar/shl/shr overrides use immediate count=1 (stable OF semantics)
  - merge DEFAULT_INITIAL into all smoke cases
  - define explicit default flag state (CF/PF/AF/ZF/SF/OF/DF=0, IF=1)
- Enrichment:
  - validate seed schema and cases array
  - validate expected payload type
  - strict computed-helper input checks (required RSI/RDI/RBP where needed)
  - reject malformed initial/register/flag objects
- Oracle generation:
  - fail fast on emulation errors (no silent skip downgrade)
  - validate expected.registers/expected.flags are objects for none/computed
- Lifter/tests:
  - reset hadConditionalBranch/lastBranchTaken per testcase
  - disambiguate branchHelper when true/false destinations are equal
  - reject invalid expected.branch_taken types and non {0,1} ints
- Coverage/reporting:
  - count covered handlers only within opcode universe
  - guard text coverage percentage against divide-by-zero
  - normalize test.py report --vectors path relative to repo root
- Regenerated oracle seeds/vectors and updated full-handler vectors
  - oracle_vectors_full_handlers.json now: 130 cases, 3 skipped (cpuid, rdtsc, ret)
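
Two of the checks listed above, the `expected.branch_taken` validation and the guarded coverage percentage, can be illustrated with a short sketch. The function names are hypothetical and the real checks may live in different languages and files. Rejecting booleans is also an assumption; it is made explicit here because Python treats `bool` as a subclass of `int`.

```python
def validate_branch_taken(value):
    """Accept only the integers 0 and 1; reject every other type or value."""
    # bool is a subclass of int in Python, so it is excluded explicitly
    # (assumption: JSON true/false are not valid for this field).
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"branch_taken must be an int, got {type(value).__name__}")
    if value not in (0, 1):
        raise ValueError(f"branch_taken must be 0 or 1, got {value}")
    return value


def coverage_percent(covered, universe):
    """Percentage of the opcode universe that is covered.

    Handlers outside the universe are ignored, and an empty universe
    yields 0.0 instead of raising ZeroDivisionError.
    """
    universe = set(universe)
    if not universe:
        return 0.0
    in_scope = set(covered) & universe
    return 100.0 * len(in_scope) / len(universe)
```

With these definitions, `coverage_percent({"add", "foo"}, {"add", "sub"})` counts only `add` and returns 50.0, and an empty universe returns 0.0.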
2026-03-04 12:08:00 +03:00
yusufcanislek 505bb7aadf Deduplicate rewrite CI jobs and remove dead startup block 2026-03-04 01:22:30 +03:00
yusufcanislek 7ab07baee6 Build lifter before running rewrite CI gates 2026-03-04 00:17:50 +03:00
yusufcanislek 44208263ca Install full LLVM package artifact in Windows CI gates 2026-03-04 00:14:43 +03:00
yusufcanislek 7c47ed9410 Harden Windows CI LLVM discovery across runner layouts 2026-03-03 23:41:40 +03:00
yusufcanislek 0c4812bcc8 Fix vswhere path expansion in Windows CI workflow 2026-03-03 23:39:03 +03:00
yusufcanislek 669b7569b1 Fix Windows CI LLVM resolution for rewrite gate workflows 2026-03-03 23:37:10 +03:00
yusufcanislek 42106ff274 Add quick rewrite gate job to GitHub Actions workflow 2026-03-03 23:32:58 +03:00
yusufcanislek 082a2d94c1 Add GitHub Actions workflow for strict rewrite gate 2026-03-03 23:30:44 +03:00
naci b5c8880f8c Update docker-image.yml 2025-04-24 05:39:47 +03:00
naci 46e5567394 Update docker-image.yml 2025-04-24 05:37:16 +03:00
naci 97a1065d73 Update docker-image.yml 2025-04-24 05:36:10 +03:00
naci 1ec49af423 Update docker-image.yml 2025-04-24 05:35:28 +03:00
naci fada7b74e4 try build without rust 2025-04-24 05:34:33 +03:00
wcscpy 23e39dca53 Update docker-image.yml 2024-11-27 18:30:47 +01:00
wcscpy 6f81e22e38 Update docker-image.yml 2024-11-27 18:28:04 +01:00
wcscpy 8b152a9e8c Update docker-image.yml 2024-11-27 18:24:56 +01:00
wcscpy 5a2a68fbd6 Update docker-image.yml 2024-11-27 18:22:38 +01:00
naci 4d965607fa Create docker-image.yml 2024-08-14 13:02:03 +03:00
naci 1984be8b76 Create FUNDING.yml 2024-04-16 10:51:04 +03:00