40 Commits

Author SHA1 Message Date
Jordan Tunstill e9734c1ff2 Fix pre-receive hook hangs and missing logs by flushing logs on signal and using CommandContext for git commands (#4714)
When TruffleHog times out in pre-receive hooks, it fails to output
diagnostic logs.

Changes:
- Ensure log output is flushed before process termination in all exit paths:
  * Defer log sync in run() function for normal exits
  * Sync logs in signal handler before os.Exit(0)
  * Sync logs before os.Exit(183) when results are found
  * Sync logs in logFatalFunc before os.Exit() calls
- Use exec.CommandContext instead of exec.Command for git log/diff
  to ensure processes are killed when context is cancelled
- Add WaitDelay to git commands to provide grace period for cleanup

This ensures diagnostic output is captured when git operations block in
pre-receive hook environments and logs are visible when process is killed.
2026-02-06 11:48:02 -08:00
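The subprocess half of the change above can be sketched as follows. This is a minimal illustration, not the TruffleHog code: `runWithDeadline` is a hypothetical helper, and `sleep` stands in for a git command that blocks (which assumes a Unix-like system). `exec.CommandContext` kills the process when the context is cancelled, and `Cmd.WaitDelay` (available since Go 1.20) bounds how long `Wait` keeps waiting for I/O pipes afterwards.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// runWithDeadline (hypothetical helper) runs a command and returns its stdout.
// The process is killed once timeout elapses, because the command is bound
// to a context with that deadline.
func runWithDeadline(timeout, waitDelay time.Duration, name string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, name, args...)
	// Grace period: after cancellation, Wait stops waiting on the output
	// pipes once this much time has passed.
	cmd.WaitDelay = waitDelay
	return cmd.Output()
}

func main() {
	// "sleep 5" plays the role of a git subprocess that never finishes in time.
	_, err := runWithDeadline(200*time.Millisecond, time.Second, "sleep", "5")
	fmt.Println("error after cancellation:", err)
}
```

With plain `exec.Command`, the same call would block for the full five seconds, because nothing ties the child process to the caller's context.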
Gleb Haranin 6a0bc788d2 fix(git): use --iso-strict git arg to prevent locale issue (#4653)
Co-authored-by: Kashif Khan <70996046+kashifkhan0771@users.noreply.github.com>

Fixes #3338

TruffleHog fails to parse git commit dates when the system locale is non-English (e.g., German outputs Sa. instead of Sat), which prevents users from scanning git repos (this affects all git-based sources: local git repos, GitHub, GitLab, etc.). The format in use, --date=format:%a %b %d %H:%M:%S %Y %z, depends on the strftime(3) C library function, which is locale-dependent (it honors POSIX LC_ALL, LC_TIME, and LANG). Setting LC_ALL=C, or any other locale such as LC_ALL=de_DE.UTF-8 trufflehog ..., on the user's side doesn't help because the code overwrites the subprocess environment.

This means git only receives GIT_DIR and no locale settings. Rather than fixing env inheritance or adding LC_TIME=C, this PR switches to --date=iso-strict which outputs locale-independent ISO 8601 timestamps (2024-09-28T07:59:21+00:00). I think this is a pretty good solution: iso format has been stable since Git 2.2 (2014), requires no environment manipulation, and uses Go's native time.RFC3339 parser.

Error message when parsing:

2026-01-10T00:28:06-05:00	error	trufflehog	failed to parse commit date	{"source_manager_worker_id": "rc4SK", "unit_kind": "dir", "unit": "/var/folders/hw/r5j4bcyd3472ccwjz0klh5840000gn/T/trufflehog-25309-2688385567", "repo": "file:///Users/xxx/xxx/sample-repo", "commit": "5f506baa305831998a2e15aa07cd381a69fde48f", "latestState": "AuthorDateLine", "error": "parsing time \"Sa. Jan. 10 00:11:44 2026 -0500\" as \"Mon Jan 2 15:04:05 2006 -0700\": cannot parse \"Sa. Jan. 10 00:11:44 2026 -0500\" as \"Mon\""}

Tested on Homebrew git on macOS (Apple git seems to ignore locales completely) and on Linux (Debian).
2026-01-15 10:25:04 -05:00
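The parsing difference described above can be reproduced with the standard library alone. In this small sketch the layout string is copied from the error message, and the two function names are made up for illustration. The ISO 8601 form parses with `time.RFC3339`, while the localized weekday abbreviation breaks the old layout.

```go
package main

import (
	"fmt"
	"time"
)

// oldLayout is the layout quoted in the error message; it expects English
// weekday and month abbreviations.
const oldLayout = "Mon Jan 2 15:04:05 2006 -0700"

// parseISOStrict parses a timestamp in the form printed by --date=iso-strict.
func parseISOStrict(s string) (time.Time, error) {
	return time.Parse(time.RFC3339, s)
}

// parsesWithOldLayout reports whether s can be parsed with oldLayout.
func parsesWithOldLayout(s string) bool {
	_, err := time.Parse(oldLayout, s)
	return err == nil
}

func main() {
	t, err := parseISOStrict("2024-09-28T07:59:21+00:00")
	fmt.Println(t, err)

	// German-locale output from the bug report versus the English form.
	fmt.Println(parsesWithOldLayout("Sa. Jan. 10 00:11:44 2026 -0500"))
	fmt.Println(parsesWithOldLayout("Sat Jan 10 00:11:44 2026 -0500"))
}
```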
Kashif Khan 9d7c0afbc8 Expanded test coverage for binary content (#4332)
* Expanded test coverage for binary git diffs

* reverted old changes and added binary test in chunker

* revert more

* added filesystem binary file scan test
2025-08-05 12:40:30 +05:00
Nabeel Alam e7489bff11 decrease max diff size in gitparse maxdiffsize test (#4240) 2025-06-24 13:51:13 +05:00
Martin Locklear 867fa0165d Skip Intermittently Failing Git Tests (#4187)
Not sure why these are failing right now, but this gives us a little breathing room
2025-05-29 09:11:49 -04:00
Martin Locklear 3550d2019b Better Test Output For Git Diff Parsing (#4018)
* Better Test Output For Git Diff Parsing

This test file previously compared parsed diffs using a method `Equal` on the Diff
struct. This would fail (and stop) at the first mismatch, and left reporting the
failure to the caller.  The existing test output made it hard to understand _what_
the difference was when the difference was in the diff content itself because the
_content_ of the contentWriter field wasn't actually processed and printed.

This change adds an `assertDiffEqualToExpected` function that leverages much of the
existing test structure but centralizes the assertion _and_ reporting of the differences
into a single reusable method.
2025-04-09 09:41:31 -04:00
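The idea behind the change above (report every difference, including the content, rather than stopping at the first mismatch) might look roughly like this. The `fileDiff` struct and `mismatches` function are hypothetical stand-ins; the real Diff struct and `assertDiffEqualToExpected` are not shown in this log and may differ.

```go
package main

import "fmt"

// fileDiff is a simplified, hypothetical stand-in for a parsed diff.
type fileDiff struct {
	Path      string
	LineStart int
	Content   string
}

// mismatches compares every field and returns one message per difference,
// so a failing test can print all of them at once.
func mismatches(expected, actual fileDiff) []string {
	var out []string
	if expected.Path != actual.Path {
		out = append(out, fmt.Sprintf("Path: expected %q, got %q", expected.Path, actual.Path))
	}
	if expected.LineStart != actual.LineStart {
		out = append(out, fmt.Sprintf("LineStart: expected %d, got %d", expected.LineStart, actual.LineStart))
	}
	if expected.Content != actual.Content {
		// Printing the content itself shows what differs, not just that it differs.
		out = append(out, fmt.Sprintf("Content: expected %q, got %q", expected.Content, actual.Content))
	}
	return out
}

func main() {
	want := fileDiff{Path: "a.txt", LineStart: 1, Content: "secret=1\n"}
	got := fileDiff{Path: "a.txt", LineStart: 3, Content: "secret=2\n"}
	for _, m := range mismatches(want, got) {
		fmt.Println(m)
	}
}
```

In a test, the returned messages would typically be passed to `t.Errorf` one by one, so a single run surfaces every differing field.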
ahrav ad0bc11a5b [fix] - integer types (#3793)
* fix types

* fix
2024-12-18 09:02:12 -08:00
Dustin Decker f3630da1e0 Improve process cleanup (#3339)
* ensures that cmd.Wait() is always called, even if there's a panic in the FromReader function or if stdOut.Close() returns an error

* close stdout and ensure wait is called when handling binaries

* process cleanup improvements

* lint
2024-09-26 10:17:47 -07:00
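A common way to guarantee that `cmd.Wait()` always runs, as the first bullet above describes, is to defer it right after the process starts. The sketch below uses a hypothetical `countLines` function and `echo` as a stand-in command (assuming a Unix-like system); it is not the code from this commit.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
)

// countLines (hypothetical) starts a command, counts its stdout lines, and
// always reaps the process, even if reading fails or panics.
func countLines(name string, args ...string) (n int, err error) {
	cmd := exec.Command(name, args...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return 0, err
	}
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	// Deferred immediately after Start: the pipe is closed and Wait is
	// called on every exit path from here on.
	defer func() {
		_ = stdout.Close()
		if waitErr := cmd.Wait(); waitErr != nil && err == nil {
			err = waitErr
		}
	}()

	sc := bufio.NewScanner(stdout)
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

func main() {
	n, err := countLines("echo", "hello")
	fmt.Println(n, err)
}
```

Calling `Wait` inside the deferred function keeps it on the panic path too, so the child process is not left as a zombie.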
Richard Gomez 11e5febeee feat(git): scan commit metadata (#2754)
This is a follow-up to #2713 that fixes the strange test error.

As suspected, the failure was caused by additional diffs not being included in the test's expected data.
2024-04-29 16:58:45 -04:00
Miccah fadf9c6286 [chore] Remove broken test (#2748)
This wasn't actually testing the fix, which is more difficult to
orchestrate than it is worth.

See: https://github.com/trufflesecurity/trufflehog/pull/2742
2024-04-25 11:27:17 -07:00
ahrav b430dae83e [refactor] - lazy buffer retrieval (#2745)
* only create the contentWriter once

* update test

* Lazily fetch buffer from the pool

* fix tests

* fix test

* remove ctx
2024-04-25 08:27:15 -07:00
ahrav 8ceeb5d5a1 [bug] - Refactor newDiff constructor to avoid double initialization of contentWriter (#2742)
* only create the contentWriter once

* update test

* correctly use mock

* remove deprecated pkg
2024-04-25 08:01:38 -07:00
Cody Rose 11452e8a57 Revert "feat(git): scan commit metadata (#2713)" (#2747)
This reverts commit 81a9c813a1.
2024-04-25 10:56:48 -04:00
Richard Gomez 81a9c813a1 feat(git): scan commit metadata (#2713)
This fixes #2683. It scans the commit author, committer (which is typically GitHub <noreply@github.com> for GitHub, but can be different), and message.

It also scans Git notes.
2024-04-25 10:13:09 -04:00
ahrav 4a5fbf8417 [refactor] - Update Write method signature in contentWriter interface (#2721)
* Update write method in contentWriter interface

* fix lint
2024-04-23 08:47:53 -07:00
Richard Gomez baf7ea1458 feat(gitparse): avoid unneeded calls to strconv.Unquote (#2605) 2024-03-22 08:35:10 -07:00
Richard Gomez aa862e46bb fix(git): decode unicode paths (#2585) 2024-03-19 08:50:27 -07:00
Miccah 88c1bb3289 [chore] Increase TestMaxDiffSize timeout (#2472) 2024-02-16 11:09:25 -08:00
ahrav 40bbab8add [cleanup] - Extract buffer logic (#2409)
* extract the buffer logic into its own package

* address comments
2024-02-15 11:40:34 -08:00
ahrav e8006f1bee 2396 since commit stopped working (#2402)
* Ensure we handle commits with no diffs correctly.

* cleanup

* add nil check

* address comments

* move comment

* revert

* add comment
2024-02-13 07:21:22 -08:00
Richard Gomez 3b40c4fa63 Update GitParse to handle quoted binary filenames (#2391)
* fix(gitparse): quoted binary files

* fix(gitparse): use bytes.Cut instead of regexp

* fix lint warning

---------

Co-authored-by: Zachary Rice <zachary.rice@trufflesec.com>
2024-02-08 09:25:04 -06:00
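As a rough illustration of the `bytes.Cut` approach named above, the sketch below pulls the new-side path out of a `Binary files ... differ` line and unquotes it only when git has quoted it. The `binaryPath` function is hypothetical and simplified (a path containing " and " would be split in the wrong place), so it should not be read as the gitparse implementation. It relies on `bytes.CutPrefix` and `bytes.CutSuffix`, which require Go 1.20 or later.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
	"strings"
)

// binaryPath (hypothetical) extracts the new-side path from a line such as
// "Binary files a/x.png and b/x.png differ".
func binaryPath(line []byte) (string, bool) {
	rest, ok := bytes.CutPrefix(bytes.TrimSpace(line), []byte("Binary files "))
	if !ok {
		return "", false
	}
	rest, ok = bytes.CutSuffix(rest, []byte(" differ"))
	if !ok {
		return "", false
	}
	// Split the old side from the new side without a regular expression.
	_, newSide, found := bytes.Cut(rest, []byte(" and "))
	if !found {
		return "", false
	}
	path := string(newSide)
	// Git quotes paths with special characters; unquote only in that case.
	if strings.HasPrefix(path, `"`) {
		unquoted, err := strconv.Unquote(path)
		if err != nil {
			return "", false
		}
		path = unquoted
	}
	return strings.TrimPrefix(path, "b/"), true
}

func main() {
	fmt.Println(binaryPath([]byte("Binary files a/img.png and b/img.png differ")))
	fmt.Println(binaryPath([]byte(`Binary files "a/\303\251.bin" and "b/\303\251.bin" differ`)))
}
```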
ahrav 7b492a690a [feat] - use diff chan (#2387)
* use diff chan

* address comments

* add comment

* address comments

* use old ordering

* add correct author line

* Add required *Commit arg to newDiff

* address comments
2024-02-06 10:06:10 -08:00
ahrav 9867ce8eb8 Allow for configuring the buffered file writer (#2319)
* Write large diffs to tmp files

* address comments

* Move bufferedfilewriter to own pkg

* update test

* swallow write err

* use buffer pool

* use size vs len

* use interface

* fix test

* update comments

* fix test

* Allow for configuring the buffered file writer

* remove unused

* add missing method

* remove

* remove unused

* move parser and commit struct closer to where they are used

* linter change

* fix snifftest

* address comments

* add more kvp pairs to error

* fix test

* update

* add back missing metadata fields

* address comments

* remove bufferedfile writer

* fix

* address comments

* use uint8

* update interface

* adjust interface

* fix tests

* make linter happy

* fix finalize

* address comments

* update test

* address comments

* lint

* remove guard

* fix test

* fix

* add TODO

* fix tests
2024-01-30 12:51:58 -08:00
ahrav 7c59ff95d5 [feat] - tmp file diffs (#2306)
* Write large diffs to tmp files

* address comments

* Move bufferedfilewriter to own pkg

* update test

* swallow write err

* use buffer pool

* use size vs len

* use interface

* fix test

* update comments

* fix test

* remove unused

* remove

* remove unused

* move parser and commit struct closer to where they are used

* linter change

* add more kvp pairs to error

* fix test

* update

* address comments

* remove bufferedfile writer

* address comments

* adjust interface

* fix finalize

* address comments

* lint

* remove guard

* fix

* add TODO
2024-01-30 12:30:51 -08:00
ahrav 383f8a1f67 [chore] - reduce test time (#2321)
* reduce test time

* remove commented out code
2024-01-22 09:40:32 -08:00
Richard Gomez 241e153dfb fix(gitparse): handle fromFileLine edge case (#2206) 2024-01-04 14:53:08 -08:00
Richard Gomez e72fdb62e4 fix(gitparse): don't trim filename (#2201) 2023-12-14 08:29:46 -08:00
Bill Rich 00a00ef651 Fix binary handling (#1999) 2023-10-26 10:07:02 -07:00
Zachary Rice 3897454dbb add merge support (#1561) 2023-07-27 09:24:49 -05:00
Richard Gomez f48a635c34 feat: update gitparse logic (#1486) 2023-07-25 17:52:34 -05:00
ahrav 1da7720912 Replace context.TODO. (#1349) 2023-05-19 11:09:51 -07:00
Zachary Rice 458c79165a fix extra log messages (#1253)
* fix extra log messages

* add small test, move flag to isindex
2023-04-13 09:53:21 -05:00
Dustin Decker 8f10938bf7 forager requires direct access to gitparse.FromReader (#1233) 2023-04-02 17:54:43 -07:00
Bill Rich b37080e6a5 Add max commit size (#1079)
* Add max commit size

* Use common.IsDone

* Use breaks instead of return
2023-02-07 15:25:00 -08:00
Bill Rich af6e3f8fdf Pull gitparse config options out of pkg consts (#1072)
* Pull gitparse config options out of pkg consts.

* Adjust naming
2023-02-04 13:19:23 -08:00
Bill Rich ac1dd23d37 Limit diff size to prevent out of control memory use. (#1035)
* Limit diff size to prevent out of control memory use.

* Group consts
2023-01-23 10:14:10 -08:00
Bill Rich 912d8e461d Add context so as to avoid splitting creds. (#791)
* Add context so as to avoid splitting creds.

* Add context newlines to expected results
2022-09-09 15:00:33 -07:00
Bill Rich 3fe916fe1e add tests (#785) 2022-09-08 21:46:12 -07:00
ahrav 20cdcbc970 [bug] - Fix the starting index value for plus line check. (#734)
* Fix the starting index value for plus line check.

* Set the correct source type for notifications.

* Reset old value.

* Fix the starting index value for plus line check.

* Fix len check.

* Reset old value.

* Add tests.

* Update tests.

* Update tests.
2022-08-25 10:45:35 -07:00
Bill Rich a0d44a39f1 Use trufflesec git parser (#729)
* Use trufflesec git parser.

* wip

* Fix line numbers and linter feedback
2022-08-23 13:29:20 -07:00