From 81efa161546d8fc9a89f2565d912137df8cc6c89 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Mon, 10 Jun 2019 16:35:22 -0700 Subject: Docs: rearrange subcommands for multi-pack-index MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We will add new subcommands to the multi-pack-index, and that will make the documentation a bit messier. Clean up the 'verb' descriptions by renaming the concept to 'subcommand' and removing the reference to the object directory. Helped-by: Stefan Beller Helped-by: Szeder Gábor Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- Documentation/git-multi-pack-index.txt | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) (limited to 'Documentation/git-multi-pack-index.txt') diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index f7778a2c85..1af406aca2 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -9,7 +9,7 @@ git-multi-pack-index - Write and verify multi-pack-indexes SYNOPSIS -------- [verse] -'git multi-pack-index' [--object-dir=] +'git multi-pack-index' [--object-dir=] DESCRIPTION ----------- @@ -23,13 +23,13 @@ OPTIONS `/packs/multi-pack-index` for the current MIDX file, and `/packs` for the pack-files to index. +The following subcommands are available: + write:: - When given as the verb, write a new MIDX file to - `/packs/multi-pack-index`. + Write a new MIDX file. verify:: - When given as the verb, verify the contents of the MIDX file - at `/packs/multi-pack-index`. + Verify the contents of the MIDX file. EXAMPLES -- cgit v1.2.3 From cff97116160e57741c9a954a51bfcb0057b3e89d Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Mon, 10 Jun 2019 16:35:23 -0700 Subject: multi-pack-index: prepare for 'expire' subcommand The multi-pack-index tracks objects in a collection of pack-files. Only one copy of each object is indexed, using the modified time of the pack-files to determine tie-breakers. It is possible to have a pack-file with no referenced objects because all objects have a duplicate in a newer pack-file. Introduce a new 'expire' subcommand to the multi-pack-index builtin. This subcommand will delete these unused pack-files and rewrite the multi-pack-index to no longer refer to those files. More details about the specifics will follow as the method is implemented. Add a test that verifies the 'expire' subcommand is correctly wired, but will still be valid when the verb is implemented. Specifically, create a set of packs that should all have referenced objects and should not be removed during an 'expire' operation. The packs are created carefully to ensure they have a specific order when sorted by size. This will be important in a later test. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- Documentation/git-multi-pack-index.txt | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'Documentation/git-multi-pack-index.txt') diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index 1af406aca2..6186c4c936 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -31,6 +31,11 @@ write:: verify:: Verify the contents of the MIDX file. +expire:: + Delete the pack-files that are tracked by the MIDX file, but + have no objects referenced by the MIDX. Rewrite the MIDX file + afterward to remove all references to these pack-files. + EXAMPLES -------- -- cgit v1.2.3 From 2af890bb28013ae9edf8f5de7873fac849d39c32 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Mon, 10 Jun 2019 16:35:26 -0700 Subject: multi-pack-index: prepare 'repack' subcommand In an environment where the multi-pack-index is useful, it is due to many pack-files and an inability to repack the object store into a single pack-file. However, it is likely that many of these pack-files are rather small, and could be repacked into a slightly larger pack-file without too much effort. It may also be important to ensure the object store is highly available and the repack operation does not interrupt concurrent git commands. Introduce a 'repack' subcommand to 'git multi-pack-index' that takes a '--batch-size' option. The subcommand will inspect the multi-pack-index for referenced pack-files whose size is smaller than the batch size, until collecting a list of pack-files whose sizes sum to larger than the batch size. Then, a new pack-file will be created containing the objects from those pack-files that are referenced by the multi-pack-index. The resulting pack is likely to actually be smaller than the batch size due to compression and the fact that there may be objects in the pack- files that have duplicate copies in other pack-files. The current change introduces the command-line arguments, and we add a test that ensures we parse these options properly. Since we specify a small batch size, we will guarantee that future implementations do not change the list of pack-files. In addition, we hard-code the modified times of the packs in the pack directory to ensure the list of packs sorted by modified time matches the order if sorted by size (ascending). This will be important in a future test. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- Documentation/git-multi-pack-index.txt | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'Documentation/git-multi-pack-index.txt') diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt index 6186c4c936..233b2b7862 100644 --- a/Documentation/git-multi-pack-index.txt +++ b/Documentation/git-multi-pack-index.txt @@ -36,6 +36,23 @@ expire:: have no objects referenced by the MIDX. Rewrite the MIDX file afterward to remove all references to these pack-files. +repack:: + Create a new pack-file containing objects in small pack-files + referenced by the multi-pack-index. If the size given by the + `--batch-size=` argument is zero, then create a pack + containing all objects referenced by the multi-pack-index. For + a non-zero batch size, Select the pack-files by examining packs + from oldest-to-newest, computing the "expected size" by counting + the number of objects in the pack referenced by the + multi-pack-index, then divide by the total number of objects in + the pack and multiply by the pack size. We select packs with + expected size below the batch size until the set of packs have + total expected size at least the batch size. If the total size + does not reach the batch size, then do nothing. If a new pack- + file is created, rewrite the multi-pack-index to reference the + new pack-file. A later run of 'git multi-pack-index expire' will + delete the pack-files that were part of this batch. + EXAMPLES -------- -- cgit v1.2.3