From 50b72ede0592b16cb62e1b92d52bdccc4cee9b20 Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Fri, 14 Jun 2013 17:51:01 -0400
Subject: t5303: drop "count=1" from corruption dd

This test corrupts pack objects by using "dd" with a seek
command. It passes "count=1 bs=1" to munge just a single
byte. However, the test added in commit b3118bdc wants to
munge two bytes, and the second byte of corruption is
silently ignored.

This turned out not to impact the test, however. The idea
was to reduce the "size of this entry" part of the header so
that zlib runs out of input bytes while inflating the entry.
That header is two bytes long, and the test reduced the
value of both bytes; since we experience the problem if we
are off by even 1 byte, it is sufficient to munge only the
first one.

Even though the test would have worked with only a single
byte munged, and we could simply tweak the test to use a
single byte, it makes sense to lift this 1-byte restriction
from do_corrupt_object. It will allow future tests that do
need to change multiple bytes to do so.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 t/t5303-pack-corruption-resilience.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 5b1250f0d2..9cb8172adf 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -51,7 +51,7 @@ do_corrupt_object() {
 	ofs=`git show-index < ${pack}.idx | grep $1 | cut -f1 -d" "` &&
 	ofs=$(($ofs + $2)) &&
 	chmod +w ${pack}.pack &&
-	dd of=${pack}.pack count=1 bs=1 conv=notrunc seek=$ofs &&
+	dd of=${pack}.pack bs=1 conv=notrunc seek=$ofs &&
 	test_must_fail git verify-pack ${pack}.pack
 }
 
-- 
cgit v1.2.3

From 1ee886c1f08f4dd672a342b7811191b02291a597 Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Fri, 14 Jun 2013 17:53:34 -0400
Subject: unpack_entry: do not die when we fail to apply a delta

When we try to load an object from disk and fail, our
general strategy is to see if we can get it from somewhere
else (e.g., a loose object). That lets users fix corruption
problems by copying known-good versions of objects into the
object database.

We already handle the case where we were not able to read
the delta from disk. However, when we find that the delta we
read does not apply, we simply die. This case is harder to
trigger, as corruption in the delta data itself would
trigger a crc error from zlib. However, a corruption that
pointed us at the wrong delta base might cause it.

We can do the same "fail and try to find the object
elsewhere" trick instead of dying. This not only gives us a
chance to recover, but also puts us on code paths that will
alert the user to the problem (with the current message,
they do not even know which sha1 caused the problem).

Note that unlike some other pack corruptions, we do not
recover automatically from this case when doing a repack.
There is nothing apparently wrong with the delta, as it
points to a valid, accessible object, and we realize the
error only when the resulting size does not match up. And in
theory, one could even have a case where the corrupted size
is the same, and the problem would only be noticed by
recomputing the sha1.

We can get around this by recomputing the deltas with
--no-reuse-delta, which our test does (and this is probably
good advice for anyone recovering from pack corruption).

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 sha1_file.c                           | 11 ++++++++++-
 t/t5303-pack-corruption-resilience.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index b114cc922d..742cf342c0 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2135,8 +2135,17 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 		data = patch_delta(base, base_size,
 				   delta_data, delta_size,
 				   &size);
+
+		/*
+		 * We could not apply the delta; warn the user, but keep going.
+		 * Our failure will be noticed either in the next iteration of
+		 * the loop, or if this is the final delta, in the caller when
+		 * we return NULL. Those code paths will take care of making
+		 * a more explicit warning and retrying with another copy of
+		 * the object.
+		 */
 		if (!data)
-			die("failed to apply delta");
+			error("failed to apply delta");
 
 		free(delta_data);
 	}
diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 9cb8172adf..35926debe3 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -275,6 +275,33 @@ test_expect_success \
      git cat-file blob $blob_2 > /dev/null &&
      git cat-file blob $blob_3 > /dev/null'
 
+test_expect_success \
+    'corruption of delta base reference pointing to wrong object' \
+    'create_new_pack --delta-base-offset &&
+     git prune-packed &&
+     printf "\220\033" | do_corrupt_object $blob_3 2 &&
+     git cat-file blob $blob_1 >/dev/null &&
+     git cat-file blob $blob_2 >/dev/null &&
+     test_must_fail git cat-file blob $blob_3 >/dev/null'
+
+test_expect_success \
+    '... but having a loose copy allows for full recovery' \
+    'mv ${pack}.idx tmp &&
+     git hash-object -t blob -w file_3 &&
+     mv tmp ${pack}.idx &&
+     git cat-file blob $blob_1 > /dev/null &&
+     git cat-file blob $blob_2 > /dev/null &&
+     git cat-file blob $blob_3 > /dev/null'
+
+test_expect_success \
+    '... and then a repack "clears" the corruption' \
+    'do_repack --delta-base-offset --no-reuse-delta &&
+     git prune-packed &&
+     git verify-pack ${pack}.pack &&
+     git cat-file blob $blob_1 > /dev/null &&
+     git cat-file blob $blob_2 > /dev/null &&
+     git cat-file blob $blob_3 > /dev/null'
+
 test_expect_success \
     'corrupting header to have too small output buffer fails unpack' \
     'create_new_pack &&
-- 
cgit v1.2.3
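
Note on the first patch: the effect of dropping "count=1" can be reproduced outside the test suite. The following is only a sketch under stated assumptions, not part of either patch. It works on scratch 4-byte files rather than a real pack, and the file names, the seek offset of 1, and the hex helper are all made up for illustration. It feeds the same two bytes used by the new test ("\220\033") to both dd invocations and compares what ends up on disk.

```shell
#!/bin/sh
# Illustration only: scratch files stand in for ${pack}.pack.
tmp=$(mktemp -d) || exit 1

# Two identical 4-byte files containing "ABCD" (hex 41 42 43 44).
printf 'ABCD' >"$tmp/with_count"
printf 'ABCD' >"$tmp/without_count"

# Old invocation: count=1 bs=1 copies exactly one 1-byte block,
# so only the first input byte (\220 = 0x90) is written.
printf '\220\033' |
	dd of="$tmp/with_count" count=1 bs=1 conv=notrunc seek=1 2>/dev/null

# New invocation: with no count, dd copies until end of input,
# so both bytes (0x90 and 0x1b) are written.
printf '\220\033' |
	dd of="$tmp/without_count" bs=1 conv=notrunc seek=1 2>/dev/null

# Hypothetical helper: dump a file as one unbroken hex string.
hex() { od -An -tx1 "$1" | tr -d ' \n'; }

with_count=$(hex "$tmp/with_count")
without_count=$(hex "$tmp/without_count")
echo "count=1:  $with_count"
echo "no count: $without_count"

rm -rf "$tmp"
```

With count=1 the dump should read 41904344 (only the byte at offset 1 changed); without it, 41901b44 (offsets 1 and 2 changed). In both cases conv=notrunc keeps the remainder of the file intact, which is what lets do_corrupt_object patch bytes in the middle of a pack.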