summaryrefslogtreecommitdiff
path: root/t/t5319-multi-pack-index.sh
AgeCommit message (Collapse)AuthorFilesLines
2018-09-17multi-pack-index: verify bad headerLibravatar Derrick Stolee1-1/+45
When verifying if a multi-pack-index file is valid, we want the command to fail to signal an invalid file. Previously, we wrote an error to stderr and continued as if we had no multi-pack-index. Now, die() instead of error(). Add tests that check corrupted headers in a few ways: * Bad signature * Bad file version * Bad hash version * Truncated hash count * Extended hash count Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-17multi-pack-index: add 'verify' verbLibravatar Derrick Stolee1-0/+8
The multi-pack-index builtin writes multi-pack-index files, and uses a 'write' verb to do so. Add a 'verify' verb that checks this file matches the contents of the pack-indexes it replaces. The current implementation is a no-op, but will be extended in small increments in later commits. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20pack-objects: consider packs in multi-pack-indexLibravatar Derrick Stolee1-1/+7
When running 'git pack-objects --local', we want to avoid packing objects that are in an alternate. Currently, we check for these objects using the packed_git_mru list, which excludes the pack-files covered by a multi-pack-index. Add a new iteration over the multi-pack-indexes to find these copies and mark them as unwanted. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20midx: test a few commands that use get_all_packsLibravatar Derrick Stolee1-3/+18
The new get_all_packs() method exposed the packfiles coverede by a multi-pack-index. Before, the 'git cat-file --batch' and 'git count-objects' commands would skip objects in an environment with a multi-pack-index. Further, a reachability bitmap would be ignored if its pack-file was covered by a multi-pack-index. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20midx: fix bug that skips midx with alternatesLibravatar Derrick Stolee1-0/+17
The logic for constructing the linked list of multi-pack-indexes in the object store is incorrect. If the local object store has a multi-pack-index, but an alternate does not, then the list is dropped. Add tests that would have revealed this bug. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: clear midx on repackLibravatar Derrick Stolee1-0/+9
If a 'git repack' command replaces existing packfiles, then we must clear the existing multi-pack-index before moving the packfiles it references. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20config: create core.multiPackIndex settingLibravatar Derrick Stolee1-13/+38
The core.multiPackIndex config setting controls the multi-pack- index (MIDX) feature. If false, the setting will disable all reads from the multi-pack-index file. Read this config setting in the new prepare_multi_pack_index_one() which is called during prepare_packed_git(). This check is run once per repository. Add comparison commands in t5319-multi-pack-index.sh to check typical Git behavior remains the same as the config setting is turned on and off. This currently includes 'git rev-list' and 'git log' commands to trigger several object database reads. Currently, these would only catch an error in the prepare_multi_pack_index_one(), but with later commits will catch errors in object lookups, abbreviations, and approximate object counts. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: write object offsetsLibravatar Derrick Stolee1-10/+39
The final pair of chunks for the multi-pack-index file stores the object offsets. We default to using 32-bit offsets as in the pack-index version 1 format, but if there exists an offset larger than 32-bits, we use a trick similar to the pack-index version 2 format by storing all offsets at least 2^31 in a 64-bit table; we use the 32-bit table to point into that 64-bit table as necessary. We only store these 64-bit offsets if necessary, so create a test that manipulates a version 2 pack-index to fake a large offset. This allows us to test that the large offset table is created, but the data does not match the actual packfile offsets. The multi-pack-index offset does match the (corrupted) pack-index offset, so a future feature will compare these offsets during a 'verify' step. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: write object id fanout chunkLibravatar Derrick Stolee1-7/+9
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: write object ids in a chunkLibravatar Derrick Stolee1-2/+2
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: read pack names into arrayLibravatar Derrick Stolee1-5/+12
Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20multi-pack-index: write pack names in chunkLibravatar Derrick Stolee1-1/+2
The multi-pack-index needs to track which packfiles it indexes. Store these in our first required chunk. Since filenames are not well structured, add padding to keep good alignment in later chunks. Modify the 'git multi-pack-index read' subcommand to output the existence of the pack-file name chunk. Modify t5319-multi-pack-index.sh to reflect this new output and the new expected number of chunks. Defense in depth: A pattern we are using in the multi-pack-index feature is to verify the data as we write it. We want to ensure we never write invalid data to the multi-pack-index. There are many checks that verify that the values we are writing fit the format definitions. This mainly helps developers while working on the feature, but it can also identify issues that only appear when dealing with very large data sets. These large sets are hard to encode into test cases. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20multi-pack-index: read packfile listLibravatar Derrick Stolee1-7/+8
When constructing a multi-pack-index file for a given object directory, read the files within the enclosed pack directory and find matches that end with ".idx" and find the correct paired packfile using add_packed_git(). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20t5319: expand test dataLibravatar Derrick Stolee1-0/+84
As we build the multi-pack-index file format, we want to test the format on real repositories. Add tests that create repository data including multiple packfiles with both version 1 and version 2 formats. The current 'git multi-pack-index write' command will always write the same file with no "real" data. This will be expanded in future commits, along with the test expectations. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20multi-pack-index: load into memoryLibravatar Derrick Stolee1-1/+10
Create a new multi_pack_index struct for loading multi-pack-indexes into memory. Create a test-tool builtin for reading basic information about that multi-pack-index to verify the correct data is written. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20midx: write header information to lockfileLibravatar Derrick Stolee1-1/+3
As we begin writing the multi-pack-index format to disk, start with the basics: the 12-byte header and the 20-byte checksum footer. Start with these basics so we can add the rest of the format in small increments. As we implement the format, we will use a technique to check that our computed offsets within the multi-pack-index file match what we are actually writing. Each method that writes to the hashfile will return the number of bytes written, and we will track that those values match our expectations. Currently, write_midx_header() returns 12, but is not checked. We will check the return value in a later commit. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-20multi-pack-index: add 'write' verbLibravatar Derrick Stolee1-0/+10
In anticipation of writing multi-pack-indexes, add a skeleton 'git multi-pack-index write' subcommand and send the options to a write_midx_file() method. Also create a skeleton test script that tests the 'write' subcommand. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>