author | Junio C Hamano <gitster@pobox.com> | 2016-10-10 14:03:46 -0700 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2016-10-10 14:03:47 -0700 |
commit | e6e24c94df9df6d39f2316113c14fe07d2ab03d7 (patch) | |
tree | 1f883be2ac3f3a84ce5af3cd1e9e32c45648e21e /t | |
parent | Merge branch 'rs/qsort' (diff) | |
parent | pack-objects: use mru list when iterating over packs (diff) | |
download | tgif-e6e24c94df9df6d39f2316113c14fe07d2ab03d7.tar.xz |
Merge branch 'jk/pack-objects-optim-mru'

"git pack-objects" in a repository with many packfiles used to
spend a lot of time looking for/at objects in them; the accesses to
the packfiles are now optimized by checking the most-recently-used
packfile first.

* jk/pack-objects-optim-mru:
  pack-objects: use mru list when iterating over packs
  pack-objects: break delta cycles before delta-search phase
  sha1_file: make packed_object_info public
  provide an initializer for "struct object_info"
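
To make the "most-recently-used packfile first" idea concrete, here is a minimal toy model in POSIX shell. It is only a sketch: the pack names, the object names, and the `pack_has` and `mru_lookup` helpers are all hypothetical, and the real logic lives in git's C code, not in a script like this. The point it illustrates is that a hit moves the matching pack to the front of the search order, so later lookups for nearby objects tend to succeed on the first probe.

```shell
#!/bin/sh
# Toy model of most-recently-used (MRU) pack ordering.
# Everything here is made up for illustration; it is not git's code.

mru="pack1 pack2 pack3"   # current search order, most recent first

# Hypothetical stand-in for "does pack $1 contain object $2?"
pack_has () {
	case "$1:$2" in
	pack1:a|pack2:b|pack2:c|pack3:d) return 0 ;;
	*) return 1 ;;
	esac
}

# Search the packs in MRU order for object $1. On a hit, set $found
# to the pack name and move that pack to the front of $mru. On a
# miss, leave $found empty and the order unchanged.
mru_lookup () {
	found=
	rest=
	for p in $mru
	do
		if test -z "$found" && pack_has "$p" "$1"
		then
			found=$p
		else
			rest="$rest $p"
		fi
	done
	if test -n "$found"
	then
		mru="$found$rest"
	else
		mru=${rest# }
	fi
}

mru_lookup c
echo "found in: $found; new order: $mru"
```

Call `mru_lookup` directly rather than inside `$(...)`, because it updates the `mru` variable in the current shell.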
Diffstat (limited to 't')
-rwxr-xr-x | t/t5314-pack-cycle-detection.sh | 113 |
1 files changed, 113 insertions, 0 deletions
diff --git a/t/t5314-pack-cycle-detection.sh b/t/t5314-pack-cycle-detection.sh
new file mode 100755
index 0000000000..f7dbdfb412
--- /dev/null
+++ b/t/t5314-pack-cycle-detection.sh
@@ -0,0 +1,113 @@
+#!/bin/sh
+
+test_description='test handling of inter-pack delta cycles during repack
+
+The goal here is to create a situation where we have two blobs, A and B, with A
+as a delta against B in one pack, and vice versa in the other. Then if we can
+persuade a full repack to find A from one pack and B from the other, that will
+give us a cycle when we attempt to reuse those deltas.
+
+The trick is in the "persuade" step, as it depends on the internals of how
+pack-objects picks which pack to reuse the deltas from. But we can assume
+that it does so in one of two general strategies:
+
+ 1. Using a static ordering of packs. In this case, no inter-pack cycles can
+    happen. Any objects with a delta relationship must be present in the same
+    pack (i.e., no "--thin" packs on disk), so we will find all related objects
+    from that pack. So assuming there are no cycles within a single pack (and
+    we avoid generating them via pack-objects or importing them via
+    index-pack), then our result will have no cycles.
+
+    So this case should pass the tests no matter how we arrange things.
+
+ 2. Picking the next pack to examine based on locality (i.e., where we found
+    something else recently).
+
+    In this case, we want to make sure that we find the delta versions of A and
+    B and not their base versions. We can do this by putting two blobs in each
+    pack. The first is a "dummy" blob that can only be found in the pack in
+    question. And then the second is the actual delta we want to find.
+
+    The two blobs must be present in the same tree, not present in other trees,
+    and the dummy pathname must sort before the delta path.
+
+The setup below focuses on case 2. We have two commits HEAD and HEAD^, each
+which has two files: "dummy" and "file". Then we can make two packs which
+contain:
+
+  [pack one]
+  HEAD:dummy
+  HEAD:file (as delta against HEAD^:file)
+  HEAD^:file (as base)
+
+  [pack two]
+  HEAD^:dummy
+  HEAD^:file (as delta against HEAD:file)
+  HEAD:file (as base)
+
+Then no matter which order we start looking at the packs in, we know that we
+will always find a delta for "file", because its lookup will always come
+immediately after the lookup for "dummy".
+'
+. ./test-lib.sh
+
+
+
+# Create a pack containing the the tree $1 and blob $1:file, with
+# the latter stored as a delta against $2:file.
+#
+# We convince pack-objects to make the delta in the direction of our choosing
+# by marking $2 as a preferred-base edge. That results in $1:file as a thin
+# delta, and index-pack completes it by adding $2:file as a base.
+#
+# Note that the two variants of "file" must be similar enough to convince git
+# to create the delta.
+make_pack () {
+	{
+		printf '%s\n' "-$(git rev-parse $2)"
+		printf '%s dummy\n' "$(git rev-parse $1:dummy)"
+		printf '%s file\n' "$(git rev-parse $1:file)"
+	} |
+	git pack-objects --stdout |
+	git index-pack --stdin --fix-thin
+}
+
+test_expect_success 'setup' '
+	test-genrandom base 4096 >base &&
+	for i in one two
+	do
+		# we want shared content here to encourage deltas...
+		cp base file &&
+		echo $i >>file &&
+
+		# ...whereas dummy should be short, because we do not want
+		# deltas that would create duplicates when we --fix-thin
+		echo $i >dummy &&
+
+		git add file dummy &&
+		test_tick &&
+		git commit -m $i ||
+		return 1
+	done &&
+
+	make_pack HEAD^ HEAD &&
+	make_pack HEAD HEAD^
+'
+
+test_expect_success 'repack' '
+	# We first want to check that we do not have any internal errors,
+	# and also that we do not hit the last-ditch cycle-breaking code
+	# in write_object(), which will issue a warning to stderr.
+	>expect &&
+	git repack -ad 2>stderr &&
+	test_cmp expect stderr &&
+
+	# And then double-check that the resulting pack is usable (i.e.,
+	# we did not fail to notice any cycles). We know we are accessing
+	# the objects via the new pack here, because "repack -d" will have
+	# removed the others.
+	git cat-file blob HEAD:file >/dev/null &&
+	git cat-file blob HEAD^:file >/dev/null
+'
+
+test_done
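
The cycle that the test description sets up can be pictured with a small standalone sketch. It is not part of the patch and does not reflect how pack-objects breaks cycles internally; the `base_of` table and the `has_cycle` helper are hypothetical, and the object names A and B are the placeholders used in the test description. The table mixes one delta entry from each pack, which is exactly the situation a locality-based pack choice could produce, and the helper simply walks the delta-base chain until it either runs out of bases or revisits an object.

```shell
#!/bin/sh
# Toy illustration of an inter-pack delta cycle (not git's algorithm).

# Hypothetical delta-base table: prints the base of object $1,
# or nothing if $1 is stored whole.
base_of () {
	case "$1" in
	A) echo B ;;   # taken from pack one: A is a delta against B
	B) echo A ;;   # taken from pack two: B is a delta against A
	*) echo "" ;;  # anything else is stored as a full object
	esac
}

# Return 0 if following delta bases from $1 ever revisits an object.
has_cycle () {
	seen=" "
	cur=$1
	while test -n "$cur"
	do
		case "$seen" in
		*" $cur "*) return 0 ;;
		esac
		seen="$seen$cur "
		cur=$(base_of "$cur")
	done
	return 1
}

if has_cycle A
then
	echo "cycle detected starting from A"
fi
```

If both entries came from the same pack instead (for example A as a delta against B, and B stored whole), the walk would terminate and `has_cycle` would return non-zero, which matches case 1 in the test description.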