From 2057d75038541cd16debb1c55f3f897fd244965c Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Thu, 17 Sep 2020 18:11:42 +0000 Subject: maintenance: create basic maintenance runner The 'gc' builtin is our current entrypoint for automatically maintaining a repository. This one tool does many operations, such as repacking the repository, packing refs, and rewriting the commit-graph file. The name implies it performs "garbage collection" which means several different things, and some users may not want to use this operation that rewrites the entire object database. Create a new 'maintenance' builtin that will become a more general- purpose command. To start, it will only support the 'run' subcommand, but will later expand to add subcommands for scheduling maintenance in the background. For now, the 'maintenance' builtin is a thin shim over the 'gc' builtin. In fact, the only option is the '--auto' toggle, which is handed directly to the 'gc' builtin. The current change is isolated to this simple operation to prevent more interesting logic from being lost in all of the boilerplate of adding a new builtin. Use existing builtin/gc.c file because we want to share code between the two builtins. It is possible that we will have 'maintenance' replace the 'gc' builtin entirely at some point, leaving 'git gc' as an alias for some specific arguments to 'git maintenance run'. Create a new test_subcommand helper that allows us to test if a certain subcommand was run. It requires storing the GIT_TRACE2_EVENT logs in a file. A negation mode is available that will be used in later tests. Helped-by: Jonathan Nieder Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t7900-maintenance.sh | 23 +++++++++++++++++++++++ t/test-lib-functions.sh | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+) create mode 100755 t/t7900-maintenance.sh (limited to 't') diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh new file mode 100755 index 0000000000..c2f0b1d0c0 --- /dev/null +++ b/t/t7900-maintenance.sh @@ -0,0 +1,23 @@ +#!/bin/sh + +test_description='git maintenance builtin' + +. ./test-lib.sh + +test_expect_success 'help text' ' + test_expect_code 129 git maintenance -h 2>err && + test_i18ngrep "usage: git maintenance run" err && + test_expect_code 128 git maintenance barf 2>err && + test_i18ngrep "invalid subcommand: barf" err && + test_expect_code 129 git maintenance 2>err && + test_i18ngrep "usage: git maintenance" err +' + +test_expect_success 'run [--auto]' ' + GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run && + GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && + test_subcommand git gc ... < +# +# For example, to look for an invocation of "git upload-pack +# /path/to/repo" +# +# GIT_TRACE2_EVENT=event.log git fetch ... && +# test_subcommand git upload-pack "$PATH" Date: Thu, 17 Sep 2020 18:11:43 +0000 Subject: maintenance: add --quiet option Maintenance activities are commonly used as steps in larger scripts. Providing a '--quiet' option allows those scripts to be less noisy when run on a terminal window. Turn this mode on by default when stderr is not a terminal. Pipe the option to the 'git gc' child process. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t7900-maintenance.sh | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) (limited to 't') diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index c2f0b1d0c0..5637053bf6 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -13,11 +13,16 @@ test_expect_success 'help text' ' test_i18ngrep "usage: git maintenance" err ' -test_expect_success 'run [--auto]' ' - GIT_TRACE2_EVENT="$(pwd)/run-no-auto.txt" git maintenance run && - GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" git maintenance run --auto && - test_subcommand git gc /dev/null && + GIT_TRACE2_EVENT="$(pwd)/run-auto.txt" \ + git maintenance run --auto 2>/dev/null && + GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \ + git maintenance run --no-quiet 2>/dev/null && + test_subcommand git gc --quiet Date: Thu, 17 Sep 2020 18:11:44 +0000 Subject: maintenance: replace run_auto_gc() The run_auto_gc() method is used in several places to trigger a check for repo maintenance after some Git commands, such as 'git commit' or 'git fetch'. To allow for extra customization of this maintenance activity, replace the 'git gc --auto [--quiet]' call with one to 'git maintenance run --auto [--quiet]'. As we extend the maintenance builtin with other steps, users will be able to select different maintenance activities. Rename run_auto_gc() to run_auto_maintenance() to be clearer what is happening on this call, and to expose all callers in the current diff. Rewrite the method to use a struct child_process to simplify the calls slightly. Since 'git fetch' already allows disabling the 'git gc --auto' subprocess, add an equivalent option with a different name to be more descriptive of the new behavior: '--[no-]maintenance'. Update the documentation to include these options at the same time. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t5510-fetch.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 't') diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 2a1abe91f0..a9682c5669 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -934,7 +934,7 @@ test_expect_success 'fetching with auto-gc does not lock up' ' git config fetch.unpackLimit 1 && git config gc.autoPackLimit 1 && git config gc.autoDetach false && - GIT_ASK_YESNO="$D/askyesno" git fetch >fetch.out 2>&1 && + GIT_ASK_YESNO="$D/askyesno" git fetch --verbose >fetch.out 2>&1 && test_i18ngrep "Auto packing the repository" fetch.out && ! grep "Should I try again" fetch.out ) -- cgit v1.2.3 From 663b2b1b90bf76275044824ddeca96aaec240f09 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Thu, 17 Sep 2020 18:11:46 +0000 Subject: maintenance: add commit-graph task The first new task in the 'git maintenance' builtin is the 'commit-graph' task. This updates the commit-graph file incrementally with the command git commit-graph write --reachable --split By writing an incremental commit-graph file using the "--split" option we minimize the disruption from this operation. The default behavior is to merge layers until the new "top" layer is less than half the size of the layer below. This provides quick writes most of the time, with the longer writes following a power law distribution. Most importantly, concurrent Git processes only look at the commit-graph-chain file for a very short amount of time, so they will verly likely not be holding a handle to the file when we try to replace it. (This only matters on Windows.) If a concurrent process reads the old commit-graph-chain file, but our job expires some of the .graph files before they can be read, then those processes will see a warning message (but not fail). This could be avoided by a future update to use the --expire-time argument when writing the commit-graph. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t7900-maintenance.sh | 2 ++ 1 file changed, 2 insertions(+) (limited to 't') diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 5637053bf6..505a1b4d60 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -4,6 +4,8 @@ test_description='git maintenance builtin' . ./test-lib.sh +GIT_TEST_COMMIT_GRAPH=0 + test_expect_success 'help text' ' test_expect_code 129 git maintenance -h 2>err && test_i18ngrep "usage: git maintenance run" err && -- cgit v1.2.3 From 090511bc0b73bf3054f11df6f275ad3a3abbf34f Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Thu, 17 Sep 2020 18:11:47 +0000 Subject: maintenance: add --task option A user may want to only run certain maintenance tasks in a certain order. Add the --task= option, which allows a user to specify an ordered list of tasks to run. These cannot be run multiple times, however. Here is where our array of maintenance_task pointers becomes critical. We can sort the array of pointers based on the task order, but we do not want to move the struct data itself in order to preserve the hashmap references. We use the hashmap to match the --task= arguments into the task struct data. Keep in mind that the 'enabled' member of the maintenance_task struct is a placeholder for a future 'maintenance..enabled' config option. Thus, we use the 'enabled' member to specify which tasks are run when the user does not specify any --task= arguments. The 'enabled' member should be ignored if --task= appears. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t7900-maintenance.sh | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) (limited to 't') diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 505a1b4d60..fb4cadd30c 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -27,4 +27,31 @@ test_expect_success 'run [--auto|--quiet]' ' test_subcommand git gc --no-quiet ' ' + GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" \ + git maintenance run --task=commit-graph 2>/dev/null && + GIT_TRACE2_EVENT="$(pwd)/run-gc.txt" \ + git maintenance run --task=gc 2>/dev/null && + GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" \ + git maintenance run --task=commit-graph 2>/dev/null && + GIT_TRACE2_EVENT="$(pwd)/run-both.txt" \ + git maintenance run --task=commit-graph --task=gc 2>/dev/null && + test_subcommand ! git gc --quiet err && + test_i18ngrep "is not a valid task" err +' + +test_expect_success 'run --task duplicate' ' + test_must_fail git maintenance run --task=gc --task=gc 2>err && + test_i18ngrep "cannot be selected multiple times" err +' + test_done -- cgit v1.2.3 From 65d655b52df3d4db0b4980f60106ed9785d025f8 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Thu, 17 Sep 2020 18:11:49 +0000 Subject: maintenance: create maintenance..enabled config Currently, a normal run of "git maintenance run" will only run the 'gc' task, as it is the only one enabled. This is mostly for backwards- compatible reasons since "git maintenance run --auto" commands replaced previous "git gc --auto" commands after some Git processes. Users could manually run specific maintenance tasks by calling "git maintenance run --task=" directly. Allow users to customize which steps are run automatically using config. The 'maintenance..enabled' option then can turn on these other tasks (or turn off the 'gc' task). Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t7900-maintenance.sh | 8 ++++++++ 1 file changed, 8 insertions(+) (limited to 't') diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index fb4cadd30c..8a162a18ba 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -27,6 +27,14 @@ test_expect_success 'run [--auto|--quiet]' ' test_subcommand git gc --no-quiet .enabled' ' + git config maintenance.gc.enabled false && + git config maintenance.commit-graph.enabled true && + GIT_TRACE2_EVENT="$(pwd)/run-config.txt" git maintenance run 2>err && + test_subcommand ! git gc --quiet ' ' GIT_TRACE2_EVENT="$(pwd)/run-commit-graph.txt" \ git maintenance run --task=commit-graph 2>/dev/null && -- cgit v1.2.3 From 916d0626c202c18e6d615abb69ba1f15b335c3ea Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Thu, 17 Sep 2020 18:11:50 +0000 Subject: maintenance: use pointers to check --auto The 'git maintenance run' command has an '--auto' option. This is used by other Git commands such as 'git commit' or 'git fetch' to check if maintenance should be run after adding data to the repository. Previously, this --auto option was only used to add the argument to the 'git gc' command as part of the 'gc' task. We will be expanding the other tasks to perform a check to see if they should do work as part of the --auto flag, when they are enabled by config. First, update the 'gc' task to perform the auto check inside the maintenance process. This prevents running an extra 'git gc --auto' command when not needed. It also shows a model for other tasks. Second, use the 'auto_condition' function pointer as a signal for whether we enable the maintenance task under '--auto'. For instance, we do not want to enable the 'fetch' task in '--auto' mode, so that function pointer will remain NULL. Now that we are not automatically calling 'git gc', a test in t5514-fetch-multiple.sh must be changed to watch for 'git maintenance' instead. We continue to pass the '--auto' option to the 'git gc' command when necessary, because of the gc.autoDetach config option changes behavior. Likely, we will want to absorb the daemonizing behavior implied by gc.autoDetach as a maintenance.autoDetach config option. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- t/t5514-fetch-multiple.sh | 2 +- t/t7900-maintenance.sh | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) (limited to 't') diff --git a/t/t5514-fetch-multiple.sh b/t/t5514-fetch-multiple.sh index de8e2f1531..bd202ec6f3 100755 --- a/t/t5514-fetch-multiple.sh +++ b/t/t5514-fetch-multiple.sh @@ -108,7 +108,7 @@ test_expect_success 'git fetch --multiple (two remotes)' ' GIT_TRACE=1 git fetch --multiple one two 2>trace && git branch -r > output && test_cmp ../expect output && - grep "built-in: git gc" trace >gc && + grep "built-in: git maintenance" trace >gc && test_line_count = 1 gc ) ' diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh index 8a162a18ba..53c883531e 100755 --- a/t/t7900-maintenance.sh +++ b/t/t7900-maintenance.sh @@ -23,7 +23,7 @@ test_expect_success 'run [--auto|--quiet]' ' GIT_TRACE2_EVENT="$(pwd)/run-no-quiet.txt" \ git maintenance run --no-quiet 2>/dev/null && test_subcommand git gc --quiet