diff options
author | René Scharfe <l.s.r@web.de> | 2018-09-03 14:49:27 +0000 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2018-09-12 15:17:46 -0700 |
commit | 3b41fb0cb217f4b4491f2e67ce4183e5d2a5d873 (patch) | |
tree | b229f85f43239388a68e21c09c556f7d71974d6a /Documentation | |
parent | fsck: use strbuf_getline() to read skiplist file (diff) | |
download | tgif-3b41fb0cb217f4b4491f2e67ce4183e5d2a5d873.tar.xz |
fsck: use oidset instead of oid_array for skipList
Change the implementation of the skipList feature to use oidset
instead of oid_array to store SHA-1s for later lookup.
This list is parsed once on startup by fsck, fetch-pack or
receive-pack depending on the *.skipList config in use. I.e. only once
per invocation, but note that for "clone --recurse-submodules" each
submodule will re-parse the list, in addition to the main project, and
it will be re-parsed when checking .gitmodules blobs, see
fb16287719 ("fsck: check skiplist for object in fsck_blob()",
2018-06-27).
Memory usage is a bit higher, but we don't need to keep track of the
sort order anymore. Embed the oidset into struct fsck_options to make
its ownership clear (no hidden sharing) and avoid unnecessary pointer
indirection.
The cumulative impact on performance of this & the preceding change,
using the test setup described in the previous commit:
Test HEAD~2 HEAD~ HEAD
----------------------------------------------------------------------------------------------------------------
1450.3: fsck with 0 skipped bad commits 7.70(7.31+0.38) 7.72(7.33+0.38) +0.3% 7.70(7.30+0.40) +0.0%
1450.5: fsck with 1 skipped bad commits 7.84(7.47+0.37) 7.69(7.32+0.36) -1.9% 7.71(7.29+0.41) -1.7%
1450.7: fsck with 10 skipped bad commits 7.81(7.40+0.40) 7.94(7.57+0.36) +1.7% 7.92(7.55+0.37) +1.4%
1450.9: fsck with 100 skipped bad commits 7.81(7.42+0.38) 7.95(7.53+0.41) +1.8% 7.83(7.42+0.41) +0.3%
1450.11: fsck with 1000 skipped bad commits 7.99(7.62+0.36) 7.90(7.50+0.40) -1.1% 7.86(7.49+0.37) -1.6%
1450.13: fsck with 10000 skipped bad commits 7.98(7.57+0.40) 7.94(7.53+0.40) -0.5% 7.90(7.45+0.44) -1.0%
1450.15: fsck with 100000 skipped bad commits 7.97(7.57+0.39) 8.03(7.67+0.36) +0.8% 7.84(7.43+0.41) -1.6%
1450.17: fsck with 1000000 skipped bad commits 7.72(7.22+0.50) 7.28(7.07+0.20) -5.7% 7.13(6.87+0.25) -7.6%
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/config.txt | 11 |
1 files changed, 6 insertions, 5 deletions
diff --git a/Documentation/config.txt b/Documentation/config.txt index 3287c7ef8a..161ffe259e 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1731,11 +1731,12 @@ all three of them they must all set to the same values. + Older versions of Git (before 2.20) documented that the object names list should be sorted. This was never a requirement, the object names -can appear in any order, but when reading the list we track whether -the list is sorted for the purposes of an internal binary search -implementation, which can save itself some work with an already sorted -list. Unless you have a humongous list there's no reason to go out of -your way to pre-sort the list. +could appear in any order, but when reading the list we tracked whether +the list was sorted for the purposes of an internal binary search +implementation, which could save itself some work with an already sorted +list. Unless you had a humongous list there was no reason to go out of +your way to pre-sort the list. After Git version 2.20 a hash implementation +is used instead, so there's now no reason to pre-sort the list. gc.aggressiveDepth:: The depth parameter used in the delta compression |