diff options
Diffstat (limited to 'Documentation')
75 files changed, 1573 insertions, 382 deletions
diff --git a/Documentation/MyFirstContribution.txt b/Documentation/MyFirstContribution.txt index 7c9a037cc2..af0a9da62e 100644 --- a/Documentation/MyFirstContribution.txt +++ b/Documentation/MyFirstContribution.txt @@ -664,7 +664,7 @@ mention the right animal somewhere: ---- test_expect_success 'runs correctly with no args and good output' ' git psuh >actual && - test_i18ngrep Pony actual + grep Pony actual ' ---- diff --git a/Documentation/RelNotes/2.17.6.txt b/Documentation/RelNotes/2.17.6.txt new file mode 100644 index 0000000000..2f181e8064 --- /dev/null +++ b/Documentation/RelNotes/2.17.6.txt @@ -0,0 +1,16 @@ +Git v2.17.6 Release Notes +========================= + +This release addresses the security issues CVE-2021-21300. + +Fixes since v2.17.5 +------------------- + + * CVE-2021-21300: + On case-insensitive file systems with support for symbolic links, + if Git is configured globally to apply delay-capable clean/smudge + filters (such as Git LFS), Git could be fooled into running + remote code during a clone. + +Credit for finding and fixing this vulnerability goes to Matheus +Tavares, helped by Johannes Schindelin. diff --git a/Documentation/RelNotes/2.18.5.txt b/Documentation/RelNotes/2.18.5.txt new file mode 100644 index 0000000000..dfb1de4ceb --- /dev/null +++ b/Documentation/RelNotes/2.18.5.txt @@ -0,0 +1,6 @@ +Git v2.18.5 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6 to address +the security issue CVE-2021-21300; see the release notes for that +version for details. diff --git a/Documentation/RelNotes/2.19.6.txt b/Documentation/RelNotes/2.19.6.txt new file mode 100644 index 0000000000..bcca6cd258 --- /dev/null +++ b/Documentation/RelNotes/2.19.6.txt @@ -0,0 +1,6 @@ +Git v2.19.6 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6 and +v2.18.5 to address the security issue CVE-2021-21300; see the +release notes for these versions for details. diff --git a/Documentation/RelNotes/2.20.5.txt b/Documentation/RelNotes/2.20.5.txt new file mode 100644 index 0000000000..1dfb784ded --- /dev/null +++ b/Documentation/RelNotes/2.20.5.txt @@ -0,0 +1,6 @@ +Git v2.20.5 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5 +and v2.19.6 to address the security issue CVE-2021-21300; see +the release notes for these versions for details. diff --git a/Documentation/RelNotes/2.21.4.txt b/Documentation/RelNotes/2.21.4.txt new file mode 100644 index 0000000000..0089dd6702 --- /dev/null +++ b/Documentation/RelNotes/2.21.4.txt @@ -0,0 +1,6 @@ +Git v2.21.4 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6 and v2.20.5 to address the security issue CVE-2021-21300; +see the release notes for these versions for details. diff --git a/Documentation/RelNotes/2.22.5.txt b/Documentation/RelNotes/2.22.5.txt new file mode 100644 index 0000000000..6b280d9321 --- /dev/null +++ b/Documentation/RelNotes/2.22.5.txt @@ -0,0 +1,7 @@ +Git v2.22.5 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, +v2.18.5, v2.19.6, v2.20.5 and v2.21.4 to address the security +issue CVE-2021-21300; see the release notes for these versions +for details. diff --git a/Documentation/RelNotes/2.23.4.txt b/Documentation/RelNotes/2.23.4.txt new file mode 100644 index 0000000000..6e5424d0da --- /dev/null +++ b/Documentation/RelNotes/2.23.4.txt @@ -0,0 +1,7 @@ +Git v2.23.4 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4 and v2.22.5 to address the security +issue CVE-2021-21300; see the release notes for these versions +for details. diff --git a/Documentation/RelNotes/2.24.4.txt b/Documentation/RelNotes/2.24.4.txt new file mode 100644 index 0000000000..4e216eec2a --- /dev/null +++ b/Documentation/RelNotes/2.24.4.txt @@ -0,0 +1,7 @@ +Git v2.24.4 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4, v2.22.5 and v2.23.4 to address the +security issue CVE-2021-21300; see the release notes for these +versions for details. diff --git a/Documentation/RelNotes/2.25.5.txt b/Documentation/RelNotes/2.25.5.txt new file mode 100644 index 0000000000..fcb9566b15 --- /dev/null +++ b/Documentation/RelNotes/2.25.5.txt @@ -0,0 +1,7 @@ +Git v2.25.5 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4 and v2.24.4 to address +the security issue CVE-2021-21300; see the release notes for +these versions for details. diff --git a/Documentation/RelNotes/2.26.3.txt b/Documentation/RelNotes/2.26.3.txt new file mode 100644 index 0000000000..4111c38f0a --- /dev/null +++ b/Documentation/RelNotes/2.26.3.txt @@ -0,0 +1,7 @@ +Git v2.26.3 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4 and v2.25.5 +to address the security issue CVE-2021-21300; see the release +notes for these versions for details. diff --git a/Documentation/RelNotes/2.27.1.txt b/Documentation/RelNotes/2.27.1.txt new file mode 100644 index 0000000000..a1e08a9f72 --- /dev/null +++ b/Documentation/RelNotes/2.27.1.txt @@ -0,0 +1,7 @@ +Git v2.27.1 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, v2.25.5 +and v2.26.3 to address the security issue CVE-2021-21300; see +the release notes for these versions for details. diff --git a/Documentation/RelNotes/2.28.1.txt b/Documentation/RelNotes/2.28.1.txt new file mode 100644 index 0000000000..8484c8297c --- /dev/null +++ b/Documentation/RelNotes/2.28.1.txt @@ -0,0 +1,7 @@ +Git v2.28.1 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, v2.25.5, +v2.26.3 and v2.27.1 to address the security issue CVE-2021-21300; +see the release notes for these versions for details. diff --git a/Documentation/RelNotes/2.29.3.txt b/Documentation/RelNotes/2.29.3.txt new file mode 100644 index 0000000000..e10eedb35a --- /dev/null +++ b/Documentation/RelNotes/2.29.3.txt @@ -0,0 +1,8 @@ +Git v2.29.3 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, +v2.18.5, v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, +v2.25.5, v2.26.3, v2.27.1 and v2.28.1 to address the security +issue CVE-2021-21300; see the release notes for these versions +for details. diff --git a/Documentation/RelNotes/2.30.1.txt b/Documentation/RelNotes/2.30.1.txt new file mode 100644 index 0000000000..249ef1492f --- /dev/null +++ b/Documentation/RelNotes/2.30.1.txt @@ -0,0 +1,55 @@ +Git v2.30.1 Release Notes +========================= + +This release is primarily to merge fixes accumulated on the 'master' +front to prepare for 2.31 release that are still relevant to 2.30.x +maintenance track. + +Fixes since v2.30 +----------------- + + * "git fetch --recurse-submodules" failed to update a submodule + when it has an uninitialized (hence of no interest to the user) + sub-submodule, which has been corrected. + + * Command line error of "git rebase" are diagnosed earlier. + + * "git stash" did not work well in a sparsely checked out working + tree. + + * Some tests expect that "ls -l" output has either '-' or 'x' for + group executable bit, but setgid bit can be inherited from parent + directory and make these fields 'S' or 's' instead, causing test + failures. + + * "git for-each-repo --config=<var> <cmd>" should not run <cmd> for + any repository when the configuration variable <var> is not defined + even once. + + * "git mergetool --tool-help" was broken in 2.29 and failed to list + all the available tools. + + * Fix for procedure to building CI test environment for mac. + + * Newline characters in the host and path part of git:// URL are + now forbidden. + + * When more than one commit with the same patch ID appears on one + side, "git log --cherry-pick A...B" did not exclude them all when a + commit with the same patch ID appears on the other side. Now it + does. + + * Documentation for "git fsck" lost stale bits that has become + incorrect. + + * Doc for packfile URI feature has been clarified. + + * The implementation of "git branch --sort" wrt the detached HEAD + display has always been hacky, which has been cleaned up. + + * Our setting of GitHub CI test jobs were a bit too eager to give up + once there is even one failure found. Tweak the knob to allow + other jobs keep running even when we see a failure, so that we can + find more failures in a single run. + +Also contains minor documentation updates and code clean-ups. diff --git a/Documentation/RelNotes/2.30.2.txt b/Documentation/RelNotes/2.30.2.txt new file mode 100644 index 0000000000..bada398501 --- /dev/null +++ b/Documentation/RelNotes/2.30.2.txt @@ -0,0 +1,8 @@ +Git v2.30.2 Release Notes +========================= + +This release merges up the fixes that appear in v2.17.6, v2.18.5, +v2.19.6, v2.20.5, v2.21.4, v2.22.5, v2.23.4, v2.24.4, v2.25.5, +v2.26.3, v2.27.1, v2.28.1 and v2.29.3 to address the security +issue CVE-2021-21300; see the release notes for these versions +for details. diff --git a/Documentation/RelNotes/2.31.0.txt b/Documentation/RelNotes/2.31.0.txt index 7f53db7de9..cf0c7d8d40 100644 --- a/Documentation/RelNotes/2.31.0.txt +++ b/Documentation/RelNotes/2.31.0.txt @@ -14,6 +14,10 @@ Backward incompatible and other important changes * The development community has adopted Contributor Covenant v2.0 to update from v1.4 that we have been using. + * The support for deprecated PCRE1 library has been dropped. + + * Fixes for CVE-2021-21300 in Git 2.30.2 (and earlier) is included. + UI, Workflows & Features @@ -48,6 +52,62 @@ UI, Workflows & Features standard input. Also, it now does not lose refs whey they point at the same object. + * "git log" learned a new "--diff-merges=<how>" option. + + * "git ls-files" can and does show multiple entries when the index is + unmerged, which is a source for confusion unless -s/-u option is in + use. A new option --deduplicate has been introduced. + + * `git worktree list` now annotates worktrees as prunable, shows + locked and prunable attributes in --porcelain mode, and gained + a --verbose option. + + * "git clone" tries to locally check out the branch pointed at by + HEAD of the remote repository after it is done, but the protocol + did not convey the information necessary to do so when copying an + empty repository. The protocol v2 learned how to do so. + + * There are other ways than ".." for a single token to denote a + "commit range", namely "<rev>^!" and "<rev>^-<n>", but "git + range-diff" did not understand them. + + * The "git range-diff" command learned "--(left|right)-only" option + to show only one side of the compared range. + + * "git mergetool" feeds three versions (base, local and remote) of + a conflicted path unmodified. The command learned to optionally + prepare these files with unconflicted parts already resolved. + + * The .mailmap is documented to be read only from the root level of a + working tree, but a stray file in a bare repository also was read + by accident, which has been corrected. + + * "git maintenance" tool learned a new "pack-refs" maintenance task. + + * The error message given when a configuration variable that is + expected to have a boolean value has been improved. + + * Signed commits and tags now allow verification of objects, whose + two object names (one in SHA-1, the other in SHA-256) are both + signed. + + * "git rev-list" command learned "--disk-usage" option. + + * "git {diff,log} --{skip,rotate}-to=<path>" allows the user to + discard diff output for early paths or move them to the end of the + output. + + * "git difftool" learned "--skip-to=<path>" option to restart an + interrupted session from an arbitrary path. + + * "git grep" has been tweaked to be limited to the sparse checkout + paths. + + * "git rebase --[no-]fork-point" gained a configuration variable + rebase.forkPoint so that users do not have to keep specifying a + non-default setting. + + Performance, Internal Implementation, Development Support etc. * A 3-year old test that was not testing anything useful has been @@ -75,45 +135,130 @@ Performance, Internal Implementation, Development Support etc. * "git fetch" learns to treat ref updates atomically in all-or-none fashion, just like "git push" does, with the new "--atomic" option. + * The peel_ref() API has been replaced with peel_iterated_oid(). + + * The .use_shell flag in struct child_process that is passed to + run_command() API has been clarified with a bit more documentation. + + * Document, clean-up and optimize the code around the cache-tree + extension in the index. + + * The ls-refs protocol operation has been optimized to narrow the + sub-hierarchy of refs/ it walks to produce response. + + * When removing many branches and tags, the code used to do so one + ref at a time. There is another API it can use to delete multiple + refs, and it makes quite a lot of performance difference when the + refs are packed. + + * The "pack-objects" command needs to iterate over all the tags when + automatic tag following is enabled, but it actually iterated over + all refs and then discarded everything outside "refs/tags/" + hierarchy, which was quite wasteful. + + * A perf script was made more portable. + + * Our setting of GitHub CI test jobs were a bit too eager to give up + once there is even one failure found. Tweak the knob to allow + other jobs keep running even when we see a failure, so that we can + find more failures in a single run. + + * We've carried compatibility codepaths for compilers without + variadic macros for quite some time, but the world may be ready for + them to be removed. Force compilation failure on exotic platforms + where variadic macros are not available to find out who screams in + such a way that we can easily revert if it turns out that the world + is not yet ready. + + * Code clean-up to ensure our use of hashtables using object names as + keys use the "struct object_id" objects, not the raw hash values. + + * Lose the debugging aid that may have been useful in the past, but + no longer is, in the "grep" codepaths. + + * Some pretty-format specifiers do not need the data in commit object + (e.g. "%H"), but we were over-eager to load and parse it, which has + been made even lazier. + + * Get rid of "GETTEXT_POISON" support altogether, which may or may + not be controversial. + + * Introduce an on-disk file to record revindex for packdata, which + traditionally was always created on the fly and only in-core. + + * The commit-graph learned to use corrected commit dates instead of + the generation number to help topological revision traversal. + + * Piecemeal of rewrite of "git bisect" in C continues. + + * When a pager spawned by us exited, the trace log did not record its + exit status correctly, which has been corrected. + + * Removal of GIT_TEST_GETTEXT_POISON continues. + + * The code to implement "git merge-base --independent" was poorly + done and was kept from the very beginning of the feature. + + * Preliminary changes to fsmonitor integration. + + * Performance improvements for rename detection. + + * The common code to deal with "chunked file format" that is shared + by the multi-pack-index and commit-graph files have been factored + out, to help codepaths for both filetypes to become more robust. + + * The approach to "fsck" the incoming objects in "index-pack" is + attractive for performance reasons (we have them already in core, + inflated and ready to be inspected), but fundamentally cannot be + applied fully when we receive more than one pack stream, as a tree + object in one pack may refer to a blob object in another pack as + ".gitmodules", when we want to inspect blobs that are used as + ".gitmodules" file, for example. Teach "index-pack" to emit + objects that must be inspected later and check them in the calling + "fetch-pack" process. + + * The logic to handle "trailer" related placeholders in the + "--format=" mechanisms in the "log" family and "for-each-ref" + family is getting unified. + + * Raise the buffer size used when writing the index file out from + (obviously too small) 8kB to (clearly sufficiently large) 128kB. + + * It is reported that open() on some platforms (e.g. macOS Big Sur) + can return EINTR even though our timers are set up with SA_RESTART. + A workaround has been implemented and enabled for macOS to rerun + open() transparently from the caller when this happens. + Fixes since v2.30 ----------------- * Diagnose command line error of "git rebase" early. - (merge ca5120c339 rs/rebase-commit-validation later to maint). * Clean up option descriptions in "git cmd --help". - (merge e73fe3dd02 zh/arg-help-format later to maint). * "git stash" did not work well in a sparsely checked out working tree. - (merge ba359fd507 en/stash-apply-sparse-checkout later to maint). * Some tests expect that "ls -l" output has either '-' or 'x' for group executable bit, but setgid bit can be inherited from parent directory and make these fields 'S' or 's' instead, causing test failures. - (merge ea8bbf2a4e mt/t4129-with-setgid-dir later to maint). * "git for-each-repo --config=<var> <cmd>" should not run <cmd> for any repository when the configuration variable <var> is not defined even once. - (merge 6c62f01552 ds/for-each-repo-noopfix later to maint). * Fix 2.29 regression where "git mergetool --tool-help" fails to list all the available tools. - (merge 80f5a16798 pb/mergetool-tool-help-fix later to maint). * Fix for procedure to building CI test environment for mac. - (merge 3831132ace jc/macos-install-dependencies-fix later to maint). * The implementation of "git branch --sort" wrt the detached HEAD display has always been hacky, which has been cleaned up. - (merge 4045f659bd ab/branch-sort later to maint). * Newline characters in the host and path part of git:// URL are now forbidden. - (merge 6aed56736b jk/forbid-lf-in-git-url later to maint). * "git diff" showed a submodule working tree with untracked cruft as "Submodule commit <objectname>-dirty", but a natural expectation is @@ -125,24 +270,96 @@ Fixes since v2.30 side, "git log --cherry-pick A...B" did not exclude them all when a commit with the same patch ID appears on the other side. Now it does. - (merge c9e3a4e76d jk/log-cherry-pick-duplicate-patches later to maint). + + * Documentation for "git fsck" lost stale bits that has become + incorrect. + + * Doc fix for packfile URI feature. + + * When "git rebase -i" processes "fixup" insn, there is no reason to + clean up the commit log message, but we did the usual stripspace + processing. This has been corrected. + (merge f7d42ceec5 js/rebase-i-commit-cleanup-fix later to maint). + + * Fix in passing custom args from "git clone" to "upload-pack" on the + other side. + (merge ad6b5fefbd jv/upload-pack-filter-spec-quotefix later to maint). + + * The command line completion (in contrib/) completed "git branch -d" + with branch names, but "git branch -D" offered tagnames in addition, + which has been corrected. "git branch -M" had the same problem. + (merge 27dc071b9a jk/complete-branch-force-delete later to maint). + + * When commands are started from a subdirectory, they may have to + compare the path to the subdirectory (called prefix and found out + from $(pwd)) with the tracked paths. On macOS, $(pwd) and + readdir() yield decomposed path, while the tracked paths are + usually normalized to the precomposed form, causing mismatch. This + has been fixed by taking the same approach used to normalize the + command line arguments. + (merge 5c327502db tb/precompose-prefix-too later to maint). + + * Even though invocations of "die()" were logged to the trace2 + system, "BUG()"s were not, which has been corrected. + (merge 0a9dde4a04 jt/trace2-BUG later to maint). + + * "git grep --untracked" is meant to be "let's ALSO find in these + files on the filesystem" when looking for matches in the working + tree files, and does not make any sense if the primary search is + done against the index, or the tree objects. The "--cached" and + "--untracked" options have been marked as mutually incompatible. + (merge 0c5d83b248 mt/grep-cached-untracked later to maint). + + * Fix "git fsck --name-objects" which apparently has not been used by + anybody who is motivated enough to report breakage. + (merge e89f89361c js/fsck-name-objects-fix later to maint). + + * Avoid individual tests in t5411 from getting affected by each other + by forcing them to use separate output files during the test. + (merge 822ee894f6 jx/t5411-unique-filenames later to maint). + + * Test to make sure "git rev-parse one-thing one-thing" gives + the same thing twice (when one-thing is --since=X). + (merge a5cdca4520 ew/rev-parse-since-test later to maint). + + * When certain features (e.g. grafts) used in the repository are + incompatible with the use of the commit-graph, we used to silently + turned commit-graph off; we now tell the user what we are doing. + (merge c85eec7fc3 js/commit-graph-warning later to maint). + + * Objects that lost references can be pruned away, even when they + have notes attached to it (and these notes will become dangling, + which in turn can be pruned with "git notes prune"). This has been + clarified in the documentation. + (merge fa9ab027ba mz/doc-notes-are-not-anchors later to maint). + + * The error codepath around the "--temp/--prefix" feature of "git + checkout-index" has been improved. + (merge 3f7ba60350 mt/checkout-index-corner-cases later to maint). + + * The "git maintenance register" command had trouble registering bare + repositories, which had been corrected. + + * A handful of multi-word configuration variable names in + documentation that are spelled in all lowercase have been corrected + to use the more canonical camelCase. + (merge 7dd0eaa39c dl/doc-config-camelcase later to maint). + + * "git push $there --delete ''" should have been diagnosed as an + error, but instead turned into a matching push, which has been + corrected. + (merge 20e416409f jc/push-delete-nothing later to maint). + + * Test script modernization. + (merge 488acf15df sv/t7001-modernize later to maint). + + * An under-allocation for the untracked cache data has been corrected. + (merge 6347d649bc jh/untracked-cache-fix later to maint). * Other code cleanup, docfix, build fix, etc. - (merge 505a276596 pk/subsub-fetch-fix-take-2 later to maint). - (merge 33fc56253b fc/t6030-bisect-reset-removes-auxiliary-files later to maint). - (merge 7efc378205 ta/doc-typofix later to maint). - (merge 1f4e9319c7 pb/doc-modules-git-work-tree-typofix later to maint). - (merge 04f6b0a192 ma/t1300-cleanup later to maint). - (merge 7b77f5a13e ma/doc-pack-format-varint-for-sizes later to maint). - (merge cc2d43be2b nk/perf-fsmonitor-cleanup later to maint). - (merge c8302c6c00 ar/t6016-modernise later to maint). - (merge 0454986e78 jc/sign-off later to maint). - (merge 155067ab4f vv/send-email-with-less-secure-apps-access later to maint). - (merge acaabcf391 jk/t5516-deflake later to maint). - (merge a1e03535db ad/t4129-setfacl-target-fix later to maint). - (merge b356d23638 ug/doc-lose-dircache later to maint). - (merge 9371c0e9dd ab/gettext-charset-comment-fix later to maint). - (merge 52fc4f195c dl/p4-encode-after-kw-expansion later to maint). - (merge 4eb56b56e7 bc/doc-status-short later to maint). - (merge a4a1ca22ef tb/local-clone-race-doc later to maint). - (merge 6a8c89d053 ma/more-opaque-lock-file later to maint). + (merge e3f5da7e60 sg/t7800-difftool-robustify later to maint). + (merge 9d336655ba js/doc-proto-v2-response-end later to maint). + (merge 1b5b8cf072 jc/maint-column-doc-typofix later to maint). + (merge 3a837b58e3 cw/pack-config-doc later to maint). + (merge 01168a9d89 ug/doc-commit-approxidate later to maint). + (merge b865734760 js/params-vs-args later to maint). diff --git a/Documentation/RelNotes/2.32.0.txt b/Documentation/RelNotes/2.32.0.txt new file mode 100644 index 0000000000..aef141557b --- /dev/null +++ b/Documentation/RelNotes/2.32.0.txt @@ -0,0 +1,115 @@ +Git 2.32 Release Notes +====================== + +Backward compatibility notes +---------------------------- + + * ".gitattributes", ".gitignore", and ".mailmap" files that are + symbolic links are ignored. + + +Updates since v2.32 +------------------- + +UI, Workflows & Features + + * It does not make sense to make ".gitattributes", ".gitignore" and + ".mailmap" symlinks, as they are supposed to be usable from the + object store (think: bare repositories where HEAD:.mailmap etc. are + used). When these files are symbolic links, we used to read the + contents of the files pointed by them by mistake, which has been + corrected. + + * "git stash show" learned to optionally show untracked part of the + stash. + + * "git log --format='...'" learned "%(describe)" placeholder. + + * "git repack" so far has been only capable of repacking everything + under the sun into a single pack (or split by size). A cleverer + strategy to reduce the cost of repacking a repository has been + introduced. + + +Performance, Internal Implementation, Development Support etc. + + * Rename detection rework continues. + + * GIT_TEST_FAIL_PREREQS is a mechanism to skip test pieces with + prerequisites to catch broken tests that depend on the side effects + of optional pieces, but did not work at all when negative + prerequisites were involved. + (merge 27d578d904 jk/fail-prereq-testfix later to maint). + + * "git diff-index" codepath has been taught to trust fsmonitor status + to reduce number of lstat() calls. + (merge 7e5aa13d2c nk/diff-index-fsmonitor later to maint). + + +Fixes since v2.31 +----------------- + + * The fsmonitor interface read from its input without making sure + there is something to read from. This bug is new in 2.31 + timeframe. + (merge 097ea2c848 jh/fsmonitor-prework later to maint). + + * The data structure used by fsmonitor interface was not properly + duplicated during an in-core merge, leading to use-after-free etc. + (merge 4abc57848d js/fsmonitor-unpack-fix later to maint). + + * "git bisect" reimplemented more in C during 2.30 timeframe did not + take an annotated tag as a good/bad endpoint well. This regression + has been corrected. + (merge 7730f85594 jk/bisect-peel-tag-fix later to maint). + + * Fix macros that can silently inject unintended null-statements. + (merge 116affac3f rs/avoid-null-statement-after-macro-call later to maint). + + * CALLOC_ARRAY() macro replaces many uses of xcalloc(). + (merge 1c57cc70ec rs/calloc-array later to maint). + + * Update insn in Makefile comments to run fuzz-all target. + (merge 68b5c3aa48 ah/make-fuzz-all-doc-update later to maint). + + * Fix a corner case bug in "git mv" on case insensitive systems, + which was introduced in 2.29 timeframe. + (merge 93c3d297b5 tb/git-mv-icase-fix later to maint). + + * We had a code to diagnose and die cleanly when a required + clean/smudge filter is missing, but an assert before that + unnecessarily fired, hiding the end-user facing die() message. + (merge 6fab35f748 mt/cleanly-die-upon-missing-required-filter later to maint). + + * Update C code that sets a few configuration variables when a remote + is configured so that it spells configuration variable names in the + canonical camelCase. + (merge 0f1da600e6 ab/remote-write-config-in-camel-case later to maint). + + * A new configuration variable has been introduced to allow choosing + which version of the generation number gets used in the + commit-graph file. + (merge 702110aac6 ds/commit-graph-generation-config later to maint). + + * Perf test update to work better in secondary worktrees. + (merge 36e834abc1 jk/perf-in-worktrees later to maint). + + * Updates to memory allocation code around the use of pcre2 library. + (merge c1760352e0 ab/grep-pcre2-allocfix later to maint). + + * "git -c core.bare=false clone --bare ..." would have segfaulted, + which has been corrected. + (merge 75555676ad bc/clone-bare-with-conflicting-config later to maint). + + * Other code cleanup, docfix, build fix, etc. + (merge 486f4bd183 jc/calloc-fix later to maint). + (merge 5f70859c15 jt/clone-unborn-head later to maint). + (merge cfd409ed09 km/config-doc-typofix later to maint). + (merge 8588aa8657 jk/slimmed-down later to maint). + (merge 241b5d3ebe rs/xcalloc-takes-nelem-first later to maint). + (merge f451960708 dl/cat-file-doc-cleanup later to maint). + (merge 12604a8d0c sv/t9801-test-path-is-file-cleanup later to maint). + (merge ea7e63921c jr/doc-ignore-typofix later to maint). + (merge 23c781f173 ps/update-ref-trans-hook-doc later to maint). + (merge 42efa1231a jk/filter-branch-sha256 later to maint). + (merge 4c8e3dca6e tb/push-simple-uses-branch-merge-config later to maint). diff --git a/Documentation/blame-options.txt b/Documentation/blame-options.txt index dc3bceb6d1..117f4cf806 100644 --- a/Documentation/blame-options.txt +++ b/Documentation/blame-options.txt @@ -1,6 +1,6 @@ -b:: Show blank SHA-1 for boundary commits. This can also - be controlled via the `blame.blankboundary` config option. + be controlled via the `blame.blankBoundary` config option. --root:: Do not treat root commits as boundaries. This can also be diff --git a/Documentation/config.txt b/Documentation/config.txt index 6ba50b1104..bf82766a6a 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -46,7 +46,7 @@ Subsection names are case sensitive and can contain any characters except newline and the null byte. Doublequote `"` and backslash can be included by escaping them as `\"` and `\\`, respectively. Backslashes preceding other characters are dropped when reading; for example, `\t` is read as -`t` and `\0` is read as `0` Section headers cannot span multiple lines. +`t` and `\0` is read as `0`. Section headers cannot span multiple lines. Variables may belong directly to a section or to a given subsection. You can have `[section]` if you have `[section "subsection"]`, but you don't need to. @@ -398,6 +398,8 @@ include::config/interactive.txt[] include::config/log.txt[] +include::config/lsrefs.txt[] + include::config/mailinfo.txt[] include::config/mailmap.txt[] diff --git a/Documentation/config/commitgraph.txt b/Documentation/config/commitgraph.txt index 4582c39fc4..30604e4a4c 100644 --- a/Documentation/config/commitgraph.txt +++ b/Documentation/config/commitgraph.txt @@ -1,3 +1,9 @@ +commitGraph.generationVersion:: + Specifies the type of generation number version to use when writing + or reading the commit-graph file. If version 1 is specified, then + the corrected commit dates will not be written or read. Defaults to + 2. + commitGraph.maxNewFilters:: Specifies the default value for the `--max-new-filters` option of `git commit-graph write` (c.f., linkgit:git-commit-graph[1]). diff --git a/Documentation/config/init.txt b/Documentation/config/init.txt index dc77f8c844..79c79d6617 100644 --- a/Documentation/config/init.txt +++ b/Documentation/config/init.txt @@ -4,4 +4,4 @@ init.templateDir:: init.defaultBranch:: Allows overriding the default branch name e.g. when initializing - a new repository or when cloning an empty repository. + a new repository. diff --git a/Documentation/config/lsrefs.txt b/Documentation/config/lsrefs.txt new file mode 100644 index 0000000000..adeda0f24d --- /dev/null +++ b/Documentation/config/lsrefs.txt @@ -0,0 +1,9 @@ +lsrefs.unborn:: + May be "advertise" (the default), "allow", or "ignore". If "advertise", + the server will respond to the client sending "unborn" (as described in + protocol-v2.txt) and will advertise support for this feature during the + protocol v2 capability advertisement. "allow" is the same as + "advertise" except that the server will not advertise support for this + feature; this is useful for load-balanced servers that cannot be + updated atomically (for example), since the administrator could + configure "allow", then after a delay, configure "advertise". diff --git a/Documentation/config/maintenance.txt b/Documentation/config/maintenance.txt index a5ead09e4b..18f0562131 100644 --- a/Documentation/config/maintenance.txt +++ b/Documentation/config/maintenance.txt @@ -15,8 +15,9 @@ maintenance.strategy:: * `none`: This default setting implies no task are run at any schedule. * `incremental`: This setting optimizes for performing small maintenance activities that do not delete any data. This does not schedule the `gc` - task, but runs the `prefetch` and `commit-graph` tasks hourly and the - `loose-objects` and `incremental-repack` tasks daily. + task, but runs the `prefetch` and `commit-graph` tasks hourly, the + `loose-objects` and `incremental-repack` tasks daily, and the `pack-refs` + task weekly. maintenance.<task>.enabled:: This boolean config option controls whether the maintenance task diff --git a/Documentation/config/mergetool.txt b/Documentation/config/mergetool.txt index 16a27443a3..cafbbef46a 100644 --- a/Documentation/config/mergetool.txt +++ b/Documentation/config/mergetool.txt @@ -13,6 +13,11 @@ mergetool.<tool>.cmd:: merged; 'MERGED' contains the name of the file to which the merge tool should write the results of a successful merge. +mergetool.<tool>.hideResolved:: + Allows the user to override the global `mergetool.hideResolved` value + for a specific tool. See `mergetool.hideResolved` for the full + description. + mergetool.<tool>.trustExitCode:: For a custom merge command, specify whether the exit code of the merge command can be used to determine whether the merge was @@ -40,6 +45,16 @@ mergetool.meld.useAutoMerge:: value of `false` avoids using `--auto-merge` altogether, and is the default value. +mergetool.hideResolved:: + During a merge Git will automatically resolve as many conflicts as + possible and write the 'MERGED' file containing conflict markers around + any conflicts that it cannot resolve; 'LOCAL' and 'REMOTE' normally + represent the versions of the file from before Git's conflict + resolution. This flag causes 'LOCAL' and 'REMOTE' to be overwriten so + that only the unresolved conflicts are presented to the merge tool. Can + be configured per-tool via the `mergetool.<tool>.hideResolved` + configuration variable. Defaults to `false`. + mergetool.keepBackup:: After performing a merge, the original file with conflict markers can be saved as a file with a `.orig` extension. If this variable diff --git a/Documentation/config/pack.txt b/Documentation/config/pack.txt index 837f1b1679..3da4ea98e2 100644 --- a/Documentation/config/pack.txt +++ b/Documentation/config/pack.txt @@ -133,3 +133,10 @@ pack.writeBitmapHashCache:: between an older, bitmapped pack and objects that have been pushed since the last gc). The downside is that it consumes 4 bytes per object of disk space. Defaults to true. + +pack.writeReverseIndex:: + When true, git will write a corresponding .rev file (see: + link:../technical/pack-format.html[Documentation/technical/pack-format.txt]) + for each new packfile that it writes in all places except for + linkgit:git-fast-import[1] and in the bulk checkin mechanism. + Defaults to false. diff --git a/Documentation/config/rebase.txt b/Documentation/config/rebase.txt index 7f7a07d22f..214f31b451 100644 --- a/Documentation/config/rebase.txt +++ b/Documentation/config/rebase.txt @@ -68,3 +68,6 @@ rebase.rescheduleFailedExec:: Automatically reschedule `exec` commands that failed. This only makes sense in interactive mode (or when an `--exec` option was provided). This is the same as specifying the `--reschedule-failed-exec` option. + +rebase.forkPoint:: + If set to false set `--no-fork-point` option by default. diff --git a/Documentation/config/stash.txt b/Documentation/config/stash.txt index 00eb35434e..413f907cba 100644 --- a/Documentation/config/stash.txt +++ b/Documentation/config/stash.txt @@ -5,6 +5,11 @@ stash.useBuiltin:: is always used. Setting this will emit a warning, to alert any remaining users that setting this now does nothing. +stash.showIncludeUntracked:: + If this is set to true, the `git stash show` command without an + option will show the untracked files of a stash entry. Defaults to + false. See description of 'show' command in linkgit:git-stash[1]. + stash.showPatch:: If this is set to true, the `git stash show` command without an option will show the stash entry in patch form. Defaults to false. diff --git a/Documentation/date-formats.txt b/Documentation/date-formats.txt index f1097fac69..99c455f51c 100644 --- a/Documentation/date-formats.txt +++ b/Documentation/date-formats.txt @@ -1,10 +1,7 @@ DATE FORMATS ------------ -The `GIT_AUTHOR_DATE`, `GIT_COMMITTER_DATE` environment variables -ifdef::git-commit[] -and the `--date` option -endif::git-commit[] +The `GIT_AUTHOR_DATE` and `GIT_COMMITTER_DATE` environment variables support the following date formats: Git internal format:: @@ -26,3 +23,9 @@ ISO 8601:: + NOTE: In addition, the date part is accepted in the following formats: `YYYY.MM.DD`, `MM/DD/YYYY` and `DD.MM.YYYY`. + +ifdef::git-commit[] +In addition to recognizing all date formats above, the `--date` option +will also try to make sense of other, more human-centric date formats, +such as relative dates like "yesterday" or "last Friday at noon". +endif::git-commit[] diff --git a/Documentation/diff-generate-patch.txt b/Documentation/diff-generate-patch.txt index b10ff4caa6..2db8eacc3e 100644 --- a/Documentation/diff-generate-patch.txt +++ b/Documentation/diff-generate-patch.txt @@ -81,9 +81,9 @@ Combined diff format Any diff-generating command can take the `-c` or `--cc` option to produce a 'combined diff' when showing a merge. This is the default format when showing merges with linkgit:git-diff[1] or -linkgit:git-show[1]. Note also that you can give the `-m` option to any -of these commands to force generation of diffs with individual parents -of a merge. +linkgit:git-show[1]. Note also that you can give suitable +`--diff-merges` option to any of these commands to force generation of +diffs in specific format. A "combined diff" format looks like this: diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt index 746b144c76..aa2b5c11f2 100644 --- a/Documentation/diff-options.txt +++ b/Documentation/diff-options.txt @@ -33,6 +33,57 @@ endif::git-diff[] show the patch by default, or to cancel the effect of `--patch`. endif::git-format-patch[] +ifdef::git-log[] +--diff-merges=(off|none|first-parent|1|separate|m|combined|c|dense-combined|cc):: +--no-diff-merges:: + Specify diff format to be used for merge commits. Default is + {diff-merges-default} unless `--first-parent` is in use, in which case + `first-parent` is the default. ++ +--diff-merges=(off|none)::: +--no-diff-merges::: + Disable output of diffs for merge commits. Useful to override + implied value. ++ +--diff-merges=first-parent::: +--diff-merges=1::: + This option makes merge commits show the full diff with + respect to the first parent only. ++ +--diff-merges=separate::: +--diff-merges=m::: +-m::: + This makes merge commits show the full diff with respect to + each of the parents. Separate log entry and diff is generated + for each parent. `-m` doesn't produce any output without `-p`. ++ +--diff-merges=combined::: +--diff-merges=c::: +-c::: + With this option, diff output for a merge commit shows the + differences from each of the parents to the merge result + simultaneously instead of showing pairwise diff between a + parent and the result one at a time. Furthermore, it lists + only files which were modified from all parents. `-c` implies + `-p`. ++ +--diff-merges=dense-combined::: +--diff-merges=cc::: +--cc::: + With this option the output produced by + `--diff-merges=combined` is further compressed by omitting + uninteresting hunks whose contents in the parents have only + two variants and the merge result picks one of them without + modification. `--cc` implies `-p`. + +--combined-all-paths:: + This flag causes combined diffs (used for merge commits) to + list the name of the file from all parents. It thus only has + effect when `--diff-merges=[dense-]combined` is in use, and + is likely only useful if filename changes are detected (i.e. + when either rename or copy detection have been requested). +endif::git-log[] + -U<n>:: --unified=<n>:: Generate diffs with <n> lines of context instead of @@ -649,6 +700,14 @@ matches a pattern if removing any number of the final pathname components matches the pattern. For example, the pattern "`foo*bar`" matches "`fooasdfbar`" and "`foo/bar/baz/asdf`" but not "`foobarx`". +--skip-to=<file>:: +--rotate-to=<file>:: + Discard the files before the named <file> from the output + (i.e. 'skip to'), or move them to the end of the output + (i.e. 'rotate to'). These were invented primarily for use + of the `git difftool` command, and may not be very useful + otherwise. + ifndef::git-format-patch[] -R:: Swap two inputs; that is, show differences from index or diff --git a/Documentation/git-am.txt b/Documentation/git-am.txt index 06bc063542..decd8ae122 100644 --- a/Documentation/git-am.txt +++ b/Documentation/git-am.txt @@ -79,7 +79,7 @@ OPTIONS Pass `-u` flag to 'git mailinfo' (see linkgit:git-mailinfo[1]). The proposed commit log message taken from the e-mail is re-coded into UTF-8 encoding (configuration variable - `i18n.commitencoding` can be used to specify project's + `i18n.commitEncoding` can be used to specify project's preferred encoding if it is not UTF-8). + This was optional in prior versions of git, but now it is the diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt index adaa1782a8..94dc9a54f2 100644 --- a/Documentation/git-branch.txt +++ b/Documentation/git-branch.txt @@ -78,8 +78,8 @@ renaming. If <newbranch> exists, -M must be used to force the rename to happen. The `-c` and `-C` options have the exact same semantics as `-m` and -`-M`, except instead of the branch being renamed it along with its -config and reflog will be copied to a new name. +`-M`, except instead of the branch being renamed, it will be copied to a +new name, along with its config and reflog. With a `-d` or `-D` option, `<branchname>` will be deleted. You may specify more than one branch for deletion. If the branch currently @@ -153,7 +153,7 @@ OPTIONS --column[=<options>]:: --no-column:: Display branch listing in columns. See configuration variable - column.branch for option syntax.`--column` and `--no-column` + `column.branch` for option syntax. `--column` and `--no-column` without options are equivalent to 'always' and 'never' respectively. + This option is only applicable in non-verbose mode. diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt index 8e192d87db..4eb0421b3f 100644 --- a/Documentation/git-cat-file.txt +++ b/Documentation/git-cat-file.txt @@ -35,42 +35,42 @@ OPTIONS -t:: Instead of the content, show the object type identified by - <object>. + `<object>`. -s:: Instead of the content, show the object size identified by - <object>. + `<object>`. -e:: - Exit with zero status if <object> exists and is a valid - object. If <object> is of an invalid format exit with non-zero and + Exit with zero status if `<object>` exists and is a valid + object. If `<object>` is of an invalid format exit with non-zero and emits an error on stderr. -p:: - Pretty-print the contents of <object> based on its type. + Pretty-print the contents of `<object>` based on its type. <type>:: - Typically this matches the real type of <object> but asking + Typically this matches the real type of `<object>` but asking for a type that can trivially be dereferenced from the given - <object> is also permitted. An example is to ask for a - "tree" with <object> being a commit object that contains it, - or to ask for a "blob" with <object> being a tag object that + `<object>` is also permitted. An example is to ask for a + "tree" with `<object>` being a commit object that contains it, + or to ask for a "blob" with `<object>` being a tag object that points at it. --textconv:: Show the content as transformed by a textconv filter. In this case, - <object> has to be of the form <tree-ish>:<path>, or :<path> in + `<object>` has to be of the form `<tree-ish>:<path>`, or `:<path>` in order to apply the filter to the content recorded in the index at - <path>. + `<path>`. --filters:: Show the content as converted by the filters configured in - the current working tree for the given <path> (i.e. smudge filters, - end-of-line conversion, etc). In this case, <object> has to be of - the form <tree-ish>:<path>, or :<path>. + the current working tree for the given `<path>` (i.e. smudge filters, + end-of-line conversion, etc). In this case, `<object>` has to be of + the form `<tree-ish>:<path>`, or `:<path>`. --path=<path>:: - For use with --textconv or --filters, to allow specifying an object + For use with `--textconv` or `--filters`, to allow specifying an object name and a path separately, e.g. when it is difficult to figure out the revision from which the blob came. @@ -115,15 +115,15 @@ OPTIONS repository. --allow-unknown-type:: - Allow -s or -t to query broken/corrupt objects of unknown type. + Allow `-s` or `-t` to query broken/corrupt objects of unknown type. --follow-symlinks:: - With --batch or --batch-check, follow symlinks inside the + With `--batch` or `--batch-check`, follow symlinks inside the repository when requesting objects with extended SHA-1 expressions of the form tree-ish:path-in-tree. Instead of providing output about the link itself, provide output about the linked-to object. If a symlink points outside the - tree-ish (e.g. a link to /foo or a root-level link to ../foo), + tree-ish (e.g. a link to `/foo` or a root-level link to `../foo`), the portion of the link which is outside the tree will be printed. + @@ -175,15 +175,15 @@ respectively print: OUTPUT ------ -If `-t` is specified, one of the <type>. +If `-t` is specified, one of the `<type>`. -If `-s` is specified, the size of the <object> in bytes. +If `-s` is specified, the size of the `<object>` in bytes. -If `-e` is specified, no output, unless the <object> is malformed. +If `-e` is specified, no output, unless the `<object>` is malformed. -If `-p` is specified, the contents of <object> are pretty-printed. +If `-p` is specified, the contents of `<object>` are pretty-printed. -If <type> is specified, the raw (though uncompressed) contents of the <object> +If `<type>` is specified, the raw (though uncompressed) contents of the `<object>` will be returned. BATCH OUTPUT @@ -200,7 +200,7 @@ object, with placeholders of the form `%(atom)` expanded, followed by a newline. The available atoms are: `objectname`:: - The 40-hex object name of the object. + The full hex representation of the object name. `objecttype`:: The type of the object (the same as `cat-file -t` reports). @@ -215,8 +215,9 @@ newline. The available atoms are: `deltabase`:: If the object is stored as a delta on-disk, this expands to the - 40-hex sha1 of the delta base object. Otherwise, expands to the - null sha1 (40 zeroes). See `CAVEATS` below. + full hex representation of the delta base object name. + Otherwise, expands to the null OID (all zeroes). See `CAVEATS` + below. `rest`:: If this atom is used in the output string, input lines are split @@ -235,14 +236,14 @@ newline. For example, `--batch` without a custom format would produce: ------------ -<sha1> SP <type> SP <size> LF +<oid> SP <type> SP <size> LF <contents> LF ------------ Whereas `--batch-check='%(objectname) %(objecttype)'` would produce: ------------ -<sha1> SP <type> LF +<oid> SP <type> LF ------------ If a name is specified on stdin that cannot be resolved to an object in @@ -258,7 +259,7 @@ If a name is specified that might refer to more than one object (an ambiguous sh <object> SP ambiguous LF ------------ -If --follow-symlinks is used, and a symlink in the repository points +If `--follow-symlinks` is used, and a symlink in the repository points outside the repository, then `cat-file` will ignore any custom format and print: @@ -267,11 +268,11 @@ symlink SP <size> LF <symlink> LF ------------ -The symlink will either be absolute (beginning with a /), or relative -to the tree root. For instance, if dir/link points to ../../foo, then -<symlink> will be ../foo. <size> is the size of the symlink in bytes. +The symlink will either be absolute (beginning with a `/`), or relative +to the tree root. For instance, if dir/link points to `../../foo`, then +`<symlink>` will be `../foo`. `<size>` is the size of the symlink in bytes. -If --follow-symlinks is used, the following error messages will be +If `--follow-symlinks` is used, the following error messages will be displayed: ------------ diff --git a/Documentation/git-difftool.txt b/Documentation/git-difftool.txt index 484c485fd0..143b0c49d7 100644 --- a/Documentation/git-difftool.txt +++ b/Documentation/git-difftool.txt @@ -34,6 +34,14 @@ OPTIONS This is the default behaviour; the option is provided to override any configuration settings. +--rotate-to=<file>:: + Start showing the diff for the given path, + the paths before it will move to end and output. + +--skip-to=<file>:: + Start showing the diff for the given path, skipping all + the paths before it. + -t <tool>:: --tool=<tool>:: Use the diff tool specified by <tool>. Valid values include diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt index 2962f85a50..2ae2478de7 100644 --- a/Documentation/git-for-each-ref.txt +++ b/Documentation/git-for-each-ref.txt @@ -260,11 +260,9 @@ contents:lines=N:: The first `N` lines of the message. Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1] -are obtained as `trailers` (or by using the historical alias -`contents:trailers`). Non-trailer lines from the trailer block can be omitted -with `trailers:only`. Whitespace-continuations can be removed from trailers so -that each trailer appears on a line by itself with its full content with -`trailers:unfold`. Both can be used together as `trailers:unfold,only`. +are obtained as `trailers[:options]` (or by using the historical alias +`contents:trailers[:options]`). For valid [:option] values see `trailers` +section of linkgit:git-log[1]. For sorting purposes, fields with numeric values sort in numeric order (`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`). diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt index d72d15be5b..bd596619c0 100644 --- a/Documentation/git-fsck.txt +++ b/Documentation/git-fsck.txt @@ -129,14 +129,6 @@ using 'git commit-graph verify'. See linkgit:git-commit-graph[1]. Extracted Diagnostics --------------------- -expect dangling commits - potential heads - due to lack of head information:: - You haven't specified any nodes as heads so it won't be - possible to differentiate between un-parented commits and - root nodes. - -missing sha1 directory '<dir>':: - The directory holding the sha1 objects is missing. - unreachable <type> <object>:: The <type> object <object>, isn't actually referred to directly or indirectly in any of the trees or commits seen. This can diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt index 0c114ad1ca..853967dea0 100644 --- a/Documentation/git-gc.txt +++ b/Documentation/git-gc.txt @@ -117,12 +117,14 @@ NOTES 'git gc' tries very hard not to delete objects that are referenced anywhere in your repository. In particular, it will keep not only objects referenced by your current set of branches and tags, but also -objects referenced by the index, remote-tracking branches, notes saved -by 'git notes' under refs/notes/, reflogs (which may reference commits -in branches that were later amended or rewound), and anything else in -the refs/* namespace. If you are expecting some objects to be deleted -and they aren't, check all of those locations and decide whether it -makes sense in your case to remove those references. +objects referenced by the index, remote-tracking branches, reflogs +(which may reference commits in branches that were later amended or +rewound), and anything else in the refs/* namespace. Note that a note +(of the kind created by 'git notes') attached to an object does not +contribute in keeping the object alive. If you are expecting some +objects to be deleted and they aren't, check all of those locations +and decide whether it makes sense in your case to remove those +references. On the other hand, when 'git gc' runs concurrently with another process, there is a risk of it deleting an object that the other process is using diff --git a/Documentation/git-http-fetch.txt b/Documentation/git-http-fetch.txt index 4deb4893f5..9fa17b60e4 100644 --- a/Documentation/git-http-fetch.txt +++ b/Documentation/git-http-fetch.txt @@ -41,11 +41,17 @@ commit-id:: <commit-id>['\t'<filename-as-in--w>] --packfile=<hash>:: - Instead of a commit id on the command line (which is not expected in + For internal use only. Instead of a commit id on the command + line (which is not expected in this case), 'git http-fetch' fetches the packfile directly at the given URL and uses index-pack to generate corresponding .idx and .keep files. The hash is used to determine the name of the temporary file and is - arbitrary. The output of index-pack is printed to stdout. + arbitrary. The output of index-pack is printed to stdout. Requires + --index-pack-args. + +--index-pack-args=<args>:: + For internal use only. The command to run on the contents of the + downloaded pack. Arguments are URL-encoded separated by spaces. --recover:: Verify that everything reachable from target is fetched. Used after diff --git a/Documentation/git-index-pack.txt b/Documentation/git-index-pack.txt index af0c26232c..7fa74b9e79 100644 --- a/Documentation/git-index-pack.txt +++ b/Documentation/git-index-pack.txt @@ -9,17 +9,18 @@ git-index-pack - Build pack index file for an existing packed archive SYNOPSIS -------- [verse] -'git index-pack' [-v] [-o <index-file>] <pack-file> +'git index-pack' [-v] [-o <index-file>] [--[no-]rev-index] <pack-file> 'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>] - [<pack-file>] + [--[no-]rev-index] [<pack-file>] DESCRIPTION ----------- Reads a packed archive (.pack) from the specified file, and -builds a pack index file (.idx) for it. The packed archive -together with the pack index can then be placed in the -objects/pack/ directory of a Git repository. +builds a pack index file (.idx) for it. Optionally writes a +reverse-index (.rev) for the specified pack. The packed +archive together with the pack index can then be placed in +the objects/pack/ directory of a Git repository. OPTIONS @@ -35,6 +36,13 @@ OPTIONS fails if the name of packed archive does not end with .pack). +--[no-]rev-index:: + When this flag is provided, generate a reverse index + (a `.rev` file) corresponding to the given pack. If + `--verify` is given, ensure that the existing + reverse index is correct. Takes precedence over + `pack.writeReverseIndex`. + --stdin:: When this flag is provided, the pack is read from stdin instead and a copy is then written to <pack-file>. If @@ -78,7 +86,12 @@ OPTIONS Die if the pack contains broken links. For internal use only. --fsck-objects:: - Die if the pack contains broken objects. For internal use only. + For internal use only. ++ +Die if the pack contains broken objects. If the pack contains a tree +pointing to a .gitmodules blob that does not exist, prints the hash of +that blob (for the caller to check) after the hash that goes into the +name of the pack/idx file (see "Notes"). --threads=<n>:: Specifies the number of threads to spawn when resolving diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt index dd189a353a..1bbf865a1b 100644 --- a/Documentation/git-log.txt +++ b/Documentation/git-log.txt @@ -107,47 +107,15 @@ DIFF FORMATTING By default, `git log` does not generate any diff output. The options below can be used to show the changes made by each commit. -Note that unless one of `-c`, `--cc`, or `-m` is given, merge commits -will never show a diff, even if a diff format like `--patch` is -selected, nor will they match search options like `-S`. The exception is -when `--first-parent` is in use, in which merges are treated like normal -single-parent commits (this can be overridden by providing a -combined-diff option or with `--no-diff-merges`). - --c:: - With this option, diff output for a merge commit - shows the differences from each of the parents to the merge result - simultaneously instead of showing pairwise diff between a parent - and the result one at a time. Furthermore, it lists only files - which were modified from all parents. - ---cc:: - This flag implies the `-c` option and further compresses the - patch output by omitting uninteresting hunks whose contents in - the parents have only two variants and the merge result picks - one of them without modification. - ---combined-all-paths:: - This flag causes combined diffs (used for merge commits) to - list the name of the file from all parents. It thus only has - effect when -c or --cc are specified, and is likely only - useful if filename changes are detected (i.e. when either - rename or copy detection have been requested). - --m:: - This flag makes the merge commits show the full diff like - regular commits; for each merge parent, a separate log entry - and diff is generated. An exception is that only diff against - the first parent is shown when `--first-parent` option is given; - in that case, the output represents the changes the merge - brought _into_ the then-current branch. - ---diff-merges=off:: ---no-diff-merges:: - Disable output of diffs for merge commits (default). Useful to - override `-m`, `-c`, or `--cc`. +Note that unless one of `--diff-merges` variants (including short +`-m`, `-c`, and `--cc` options) is explicitly given, merge commits +will not show a diff, even if a diff format like `--patch` is +selected, nor will they match search options like `-S`. The exception +is when `--first-parent` is in use, in which case `first-parent` is +the default format. :git-log: 1 +:diff-merges-default: `off` include::diff-options.txt[] include::diff-generate-patch.txt[] diff --git a/Documentation/git-ls-files.txt b/Documentation/git-ls-files.txt index 0a3b5265b3..6d11ab506b 100644 --- a/Documentation/git-ls-files.txt +++ b/Documentation/git-ls-files.txt @@ -13,6 +13,7 @@ SYNOPSIS (--[cached|deleted|others|ignored|stage|unmerged|killed|modified])* (-[c|d|o|i|s|u|k|m])* [--eol] + [--deduplicate] [-x <pattern>|--exclude=<pattern>] [-X <file>|--exclude-from=<file>] [--exclude-per-directory=<file>] @@ -80,6 +81,13 @@ OPTIONS \0 line termination on output and do not quote filenames. See OUTPUT below for more information. +--deduplicate:: + When only filenames are shown, suppress duplicates that may + come from having multiple stages during a merge, or giving + `--deleted` and `--modified` option at the same time. + When any of the `-t`, `--unmerged`, or `--stage` option is + in use, this option has no effect. + -x <pattern>:: --exclude=<pattern>:: Skip untracked files matching pattern. diff --git a/Documentation/git-mailinfo.txt b/Documentation/git-mailinfo.txt index 7a6aed0e30..d343f040f5 100644 --- a/Documentation/git-mailinfo.txt +++ b/Documentation/git-mailinfo.txt @@ -53,7 +53,7 @@ character. The commit log message, author name and author email are taken from the e-mail, and after minimally decoding MIME transfer encoding, re-coded in the charset specified by - i18n.commitencoding (defaulting to UTF-8) by transliterating + `i18n.commitEncoding` (defaulting to UTF-8) by transliterating them. This used to be optional but now it is the default. + Note that the patch is always used as-is without charset @@ -61,7 +61,7 @@ conversion, even with this flag. --encoding=<encoding>:: Similar to -u. But when re-coding, the charset specified here is - used instead of the one specified by i18n.commitencoding or UTF-8. + used instead of the one specified by `i18n.commitEncoding` or UTF-8. -n:: Disable all charset re-coding of the metadata. diff --git a/Documentation/git-maintenance.txt b/Documentation/git-maintenance.txt index 3b432171d6..80ddd33ceb 100644 --- a/Documentation/git-maintenance.txt +++ b/Documentation/git-maintenance.txt @@ -145,6 +145,12 @@ incremental-repack:: which is a special case that attempts to repack all pack-files into a single pack-file. +pack-refs:: + The `pack-refs` task collects the loose reference files and + collects them into a single file. This speeds up operations that + need to iterate across many references. See linkgit:git-pack-refs[1] + for more information. + OPTIONS ------- --auto:: diff --git a/Documentation/git-mergetool--lib.txt b/Documentation/git-mergetool--lib.txt index 4da9d24096..3e8f59ac0e 100644 --- a/Documentation/git-mergetool--lib.txt +++ b/Documentation/git-mergetool--lib.txt @@ -38,6 +38,10 @@ get_merge_tool_cmd:: get_merge_tool_path:: returns the custom path for a merge tool. +initialize_merge_tool:: + bring merge tool specific functions into scope so they can be used or + overridden. + run_merge_tool:: launches a merge tool given the tool name and a true/false flag to indicate whether a merge base is present. diff --git a/Documentation/git-mergetool.txt b/Documentation/git-mergetool.txt index 6b14702e78..e587c7763a 100644 --- a/Documentation/git-mergetool.txt +++ b/Documentation/git-mergetool.txt @@ -99,6 +99,10 @@ success of the resolution after the custom tool has exited. (see linkgit:git-config[1]). To cancel `diff.orderFile`, use `-O/dev/null`. +CONFIGURATION +------------- +include::config/mergetool.txt[] + TEMPORARY FILES --------------- `git mergetool` creates `*.orig` backup files while resolving merges. diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index 54d715ead1..25d9fbe37a 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -85,6 +85,16 @@ base-name:: reference was included in the resulting packfile. This can be useful to send new tags to native Git clients. +--stdin-packs:: + Read the basenames of packfiles (e.g., `pack-1234abcd.pack`) + from the standard input, instead of object names or revision + arguments. The resulting pack contains all objects listed in the + included packs (those not beginning with `^`), excluding any + objects listed in the excluded packs (beginning with `^`). ++ +Incompatible with `--revs`, or options that imply `--revs` (such as +`--all`), with the exception of `--unpacked`, which is compatible. + --window=<n>:: --depth=<n>:: These two options affect how the objects contained in @@ -400,6 +410,17 @@ Note that we pick a single island for each regex to go into, using "last one wins" ordering (which allows repo-specific config to take precedence over user-wide config, and so forth). + +CONFIGURATION +------------- + +Various configuration variables affect packing, see +linkgit:git-config[1] (search for "pack" and "delta"). + +Notably, delta compression is not used on objects larger than the +`core.bigFileThreshold` configuration variable and on files with the +attribute `delta` set to false. + SEE ALSO -------- linkgit:git-rev-list[1] diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt index ab103c82cf..a953c7c387 100644 --- a/Documentation/git-push.txt +++ b/Documentation/git-push.txt @@ -600,7 +600,7 @@ EXAMPLES `git push origin`:: Without additional configuration, pushes the current branch to - the configured upstream (`remote.origin.merge` configuration + the configured upstream (`branch.<name>.merge` configuration variable) if it has the same name as the current branch, and errors out without pushing otherwise. + diff --git a/Documentation/git-range-diff.txt b/Documentation/git-range-diff.txt index 9701c1e5fd..fe350d7f40 100644 --- a/Documentation/git-range-diff.txt +++ b/Documentation/git-range-diff.txt @@ -10,6 +10,7 @@ SYNOPSIS [verse] 'git range-diff' [--color=[<when>]] [--no-color] [<diff-options>] [--no-dual-color] [--creation-factor=<factor>] + [--left-only | --right-only] ( <range1> <range2> | <rev1>...<rev2> | <base> <rev1> <rev2> ) DESCRIPTION @@ -28,6 +29,17 @@ Finally, the list of matching commits is shown in the order of the second commit range, with unmatched commits being inserted just after all of their ancestors have been shown. +There are three ways to specify the commit ranges: + +- `<range1> <range2>`: Either commit range can be of the form + `<base>..<rev>`, `<rev>^!` or `<rev>^-<n>`. See `SPECIFYING RANGES` + in linkgit:gitrevisions[7] for more details. + +- `<rev1>...<rev2>`. This is equivalent to + `<rev2>..<rev1> <rev1>..<rev2>`. + +- `<base> <rev1> <rev2>`: This is equivalent to `<base>..<rev1> + <base>..<rev2>`. OPTIONS ------- @@ -57,6 +69,14 @@ to revert to color all lines according to the outer diff markers See the ``Algorithm`` section below for an explanation why this is needed. +--left-only:: + Suppress commits that are missing from the first specified range + (or the "left range" when using the `<rev1>...<rev2>` format). + +--right-only:: + Suppress commits that are missing from the second specified range + (or the "right range" when using the `<rev1>...<rev2>` format). + --[no-]notes[=<ref>]:: This flag is passed to the `git log` program (see linkgit:git-log[1]) that generates the patches. diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt index 92f146d27d..317d63cf0d 100644 --- a/Documentation/git-repack.txt +++ b/Documentation/git-repack.txt @@ -165,9 +165,35 @@ depth is 4095. Pass the `--delta-islands` option to `git-pack-objects`, see linkgit:git-pack-objects[1]. -Configuration +-g=<factor>:: +--geometric=<factor>:: + Arrange resulting pack structure so that each successive pack + contains at least `<factor>` times the number of objects as the + next-largest pack. ++ +`git repack` ensures this by determining a "cut" of packfiles that need +to be repacked into one in order to ensure a geometric progression. It +picks the smallest set of packfiles such that as many of the larger +packfiles (by count of objects contained in that pack) may be left +intact. ++ +Unlike other repack modes, the set of objects to pack is determined +uniquely by the set of packs being "rolled-up"; in other words, the +packs determined to need to be combined in order to restore a geometric +progression. ++ +When `--unpacked` is specified, loose objects are implicitly included in +this "roll-up", without respect to their reachability. This is subject +to change in the future. This option (implying a drastically different +repack mode) is not guaranteed to work with all other combinations of +option to `git repack`). + +CONFIGURATION ------------- +Various configuration variables affect packing, see +linkgit:git-config[1] (search for "pack" and "delta"). + By default, the command passes `--delta-base-offset` option to 'git pack-objects'; this typically results in slightly smaller packs, but the generated packs are incompatible with versions of Git older than @@ -178,6 +204,10 @@ need to set the configuration variable `repack.UseDeltaBaseOffset` to is unaffected by this option as the conversion is performed on the fly as needed in that case. +Delta compression is not used on objects larger than the +`core.bigFileThreshold` configuration variable and on files with the +attribute `delta` set to false. + SEE ALSO -------- linkgit:git-pack-objects[1] diff --git a/Documentation/git-rev-list.txt b/Documentation/git-rev-list.txt index 5da66232dc..20bb8e8217 100644 --- a/Documentation/git-rev-list.txt +++ b/Documentation/git-rev-list.txt @@ -31,6 +31,99 @@ include::rev-list-options.txt[] include::pretty-formats.txt[] +EXAMPLES +-------- + +* Print the list of commits reachable from the current branch. ++ +---------- +git rev-list HEAD +---------- + +* Print the list of commits on this branch, but not present in the + upstream branch. ++ +---------- +git rev-list @{upstream}..HEAD +---------- + +* Format commits with their author and commit message (see also the + porcelain linkgit:git-log[1]). ++ +---------- +git rev-list --format=medium HEAD +---------- + +* Format commits along with their diffs (see also the porcelain + linkgit:git-log[1], which can do this in a single process). ++ +---------- +git rev-list HEAD | +git diff-tree --stdin --format=medium -p +---------- + +* Print the list of commits on the current branch that touched any + file in the `Documentation` directory. ++ +---------- +git rev-list HEAD -- Documentation/ +---------- + +* Print the list of commits authored by you in the past year, on + any branch, tag, or other ref. ++ +---------- +git rev-list --author=you@example.com --since=1.year.ago --all +---------- + +* Print the list of objects reachable from the current branch (i.e., all + commits and the blobs and trees they contain). ++ +---------- +git rev-list --objects HEAD +---------- + +* Compare the disk size of all reachable objects, versus those + reachable from reflogs, versus the total packed size. This can tell + you whether running `git repack -ad` might reduce the repository size + (by dropping unreachable objects), and whether expiring reflogs might + help. ++ +---------- +# reachable objects +git rev-list --disk-usage --objects --all +# plus reflogs +git rev-list --disk-usage --objects --all --reflog +# total disk size used +du -c .git/objects/pack/*.pack .git/objects/??/* +# alternative to du: add up "size" and "size-pack" fields +git count-objects -v +---------- + +* Report the disk size of each branch, not including objects used by the + current branch. This can find outliers that are contributing to a + bloated repository size (e.g., because somebody accidentally committed + large build artifacts). ++ +---------- +git for-each-ref --format='%(refname)' | +while read branch +do + size=$(git rev-list --disk-usage --objects HEAD..$branch) + echo "$size $branch" +done | +sort -n +---------- + +* Compare the on-disk size of branches in one group of refs, excluding + another. If you co-mingle objects from multiple remotes in a single + repository, this can show which remotes are contributing to the + repository size (taking the size of `origin` as a baseline). ++ +---------- +git rev-list --disk-usage --objects --remotes=$suspect --not --remotes=origin +---------- + GIT --- Part of the linkgit:git[1] suite diff --git a/Documentation/git-shortlog.txt b/Documentation/git-shortlog.txt index c16cc3b608..c9c7f3065c 100644 --- a/Documentation/git-shortlog.txt +++ b/Documentation/git-shortlog.txt @@ -113,6 +113,10 @@ MAPPING AUTHORS See linkgit:gitmailmap[5]. +Note that if `git shortlog` is run outside of a repository (to process +log contents on standard input), it will look for a `.mailmap` file in +the current directory. + GIT --- Part of the linkgit:git[1] suite diff --git a/Documentation/git-show.txt b/Documentation/git-show.txt index fcf528c1b3..2b1bc7288d 100644 --- a/Documentation/git-show.txt +++ b/Documentation/git-show.txt @@ -45,10 +45,13 @@ include::pretty-options.txt[] include::pretty-formats.txt[] -COMMON DIFF OPTIONS -------------------- +DIFF FORMATTING +--------------- +The options below can be used to change the way `git show` generates +diff output. :git-log: 1 +:diff-merges-default: `dense-combined` include::diff-options.txt[] include::diff-generate-patch.txt[] diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt index 31f1beb65b..a8c8c32f1e 100644 --- a/Documentation/git-stash.txt +++ b/Documentation/git-stash.txt @@ -8,8 +8,8 @@ git-stash - Stash the changes in a dirty working directory away SYNOPSIS -------- [verse] -'git stash' list [<options>] -'git stash' show [<options>] [<stash>] +'git stash' list [<log-options>] +'git stash' show [-u|--include-untracked|--only-untracked] [<diff-options>] [<stash>] 'git stash' drop [-q|--quiet] [<stash>] 'git stash' ( pop | apply ) [--index] [-q|--quiet] [<stash>] 'git stash' branch <branchname> [<stash>] @@ -67,7 +67,7 @@ save [-p|--patch] [-k|--[no-]keep-index] [-u|--include-untracked] [-a|--all] [-q Instead, all non-option arguments are concatenated to form the stash message. -list [<options>]:: +list [<log-options>]:: List the stash entries that you currently have. Each 'stash entry' is listed with its name (e.g. `stash@{0}` is the latest entry, `stash@{1}` is @@ -83,7 +83,7 @@ stash@{1}: On master: 9cc0589... Add git-stash The command takes options applicable to the 'git log' command to control what is shown and how. See linkgit:git-log[1]. -show [<options>] [<stash>]:: +show [-u|--include-untracked|--only-untracked] [<diff-options>] [<stash>]:: Show the changes recorded in the stash entry as a diff between the stashed contents and the commit back when the stash entry was first @@ -91,8 +91,8 @@ show [<options>] [<stash>]:: By default, the command shows the diffstat, but it will accept any format known to 'git diff' (e.g., `git stash show -p stash@{1}` to view the second most recent entry in patch form). - You can use stash.showStat and/or stash.showPatch config variables - to change the default behavior. + You can use stash.showIncludeUntracked, stash.showStat, and + stash.showPatch config variables to change the default behavior. pop [--index] [-q|--quiet] [<stash>]:: @@ -160,10 +160,18 @@ up with `git clean`. -u:: --include-untracked:: - This option is only valid for `push` and `save` commands. +--no-include-untracked:: + When used with the `push` and `save` commands, + all untracked files are also stashed and then cleaned up with + `git clean`. ++ +When used with the `show` command, show the untracked files in the stash +entry as part of the diff. + +--only-untracked:: + This option is only valid for the `show` command. + -All untracked files are also stashed and then cleaned up with -`git clean`. +Show only the untracked files in the stash entry as part of the diff. --index:: This option is only valid for `pop` and `apply` commands. diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt index c0764e850a..83f38e3198 100644 --- a/Documentation/git-status.txt +++ b/Documentation/git-status.txt @@ -130,7 +130,7 @@ ignored, then the directory is not shown, but all contents are shown. --column[=<options>]:: --no-column:: Display untracked files in columns. See configuration variable - column.status for option syntax.`--column` and `--no-column` + `column.status` for option syntax. `--column` and `--no-column` without options are equivalent to 'always' and 'never' respectively. diff --git a/Documentation/git-tag.txt b/Documentation/git-tag.txt index 56656d1be6..31a97a1b6c 100644 --- a/Documentation/git-tag.txt +++ b/Documentation/git-tag.txt @@ -134,7 +134,7 @@ options for details. --column[=<options>]:: --no-column:: Display tag listing in columns. See configuration variable - column.tag for option syntax.`--column` and `--no-column` + `column.tag` for option syntax. `--column` and `--no-column` without options are equivalent to 'always' and 'never' respectively. + This option is only applicable when listing tags without annotation lines. diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt index 02a706c4c0..f1bb1fa5f5 100644 --- a/Documentation/git-worktree.txt +++ b/Documentation/git-worktree.txt @@ -97,8 +97,9 @@ list:: List details of each working tree. The main working tree is listed first, followed by each of the linked working trees. The output details include whether the working tree is bare, the revision currently checked out, the -branch currently checked out (or "detached HEAD" if none), and "locked" if -the worktree is locked. +branch currently checked out (or "detached HEAD" if none), "locked" if +the worktree is locked, "prunable" if the worktree can be pruned by `prune` +command. lock:: @@ -231,9 +232,14 @@ This can also be set up as the default behaviour by using the -v:: --verbose:: With `prune`, report all removals. ++ +With `list`, output additional information about worktrees (see below). --expire <time>:: With `prune`, only expire unused working trees older than `<time>`. ++ +With `list`, annotate missing working trees as prunable if they are +older than `<time>`. --reason <string>:: With `lock`, an explanation why the working tree is locked. @@ -372,13 +378,46 @@ $ git worktree list /path/to/other-linked-worktree 1234abc (detached HEAD) ------------ +The command also shows annotations for each working tree, according to its state. +These annotations are: + + * `locked`, if the working tree is locked. + * `prunable`, if the working tree can be pruned via `git worktree prune`. + +------------ +$ git worktree list +/path/to/linked-worktree abcd1234 [master] +/path/to/locked-worktreee acbd5678 (brancha) locked +/path/to/prunable-worktree 5678abc (detached HEAD) prunable +------------ + +For these annotations, a reason might also be available and this can be +seen using the verbose mode. The annotation is then moved to the next line +indented followed by the additional information. + +------------ +$ git worktree list --verbose +/path/to/linked-worktree abcd1234 [master] +/path/to/locked-worktree-no-reason abcd5678 (detached HEAD) locked +/path/to/locked-worktree-with-reason 1234abcd (brancha) + locked: working tree path is mounted on a portable device +/path/to/prunable-worktree 5678abc1 (detached HEAD) + prunable: gitdir file points to non-existent location +------------ + +Note that the annotation is moved to the next line if the additional +information is available, otherwise it stays on the same line as the +working tree itself. + Porcelain Format ~~~~~~~~~~~~~~~~ The porcelain format has a line per attribute. Attributes are listed with a label and value separated by a single space. Boolean attributes (like `bare` and `detached`) are listed as a label only, and are present only -if the value is true. The first attribute of a working tree is always -`worktree`, an empty line indicates the end of the record. For example: +if the value is true. Some attributes (like `locked`) can be listed as a label +only or with a value depending upon whether a reason is available. The first +attribute of a working tree is always `worktree`, an empty line indicates the +end of the record. For example: ------------ $ git worktree list --porcelain @@ -393,6 +432,33 @@ worktree /path/to/other-linked-worktree HEAD 1234abc1234abc1234abc1234abc1234abc1234a detached +worktree /path/to/linked-worktree-locked-no-reason +HEAD 5678abc5678abc5678abc5678abc5678abc5678c +branch refs/heads/locked-no-reason +locked + +worktree /path/to/linked-worktree-locked-with-reason +HEAD 3456def3456def3456def3456def3456def3456b +branch refs/heads/locked-with-reason +locked reason why is locked + +worktree /path/to/linked-worktree-prunable +HEAD 1233def1234def1234def1234def1234def1234b +detached +prunable gitdir file points to non-existent location + +------------ + +If the lock reason contains "unusual" characters such as newline, they +are escaped and the entire reason is quoted as explained for the +configuration variable `core.quotePath` (see linkgit:git-config[1]). +For Example: + +------------ +$ git worktree list --porcelain +... +locked "reason\nwhy is locked" +... ------------ EXAMPLES diff --git a/Documentation/git.txt b/Documentation/git.txt index d36e6fd482..3a9c44987f 100644 --- a/Documentation/git.txt +++ b/Documentation/git.txt @@ -88,7 +88,7 @@ foo.bar= ...`) sets `foo.bar` to the empty string which `git config empty string, instead the environment variable itself must be set to the empty string. It is an error if the `<envvar>` does not exist in the environment. `<envvar>` may not contain an equals sign - to avoid ambiguity with `<name>`s which contain one. + to avoid ambiguity with `<name>` containing one. + This is useful for cases where you want to pass transitory configuration options to git, but are doing so on OS's where diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt index e84e104f93..0a60472bb5 100644 --- a/Documentation/gitattributes.txt +++ b/Documentation/gitattributes.txt @@ -1174,7 +1174,8 @@ tag then no replacement will be done. The placeholders are the same as those for the option `--pretty=format:` of linkgit:git-log[1], except that they need to be wrapped like this: `$Format:PLACEHOLDERS$` in the file. E.g. the string `$Format:%H$` will be replaced by the -commit hash. +commit hash. However, only one `%(describe)` placeholder is expanded +per archive to avoid denial-of-service attacks. Packing objects diff --git a/Documentation/gitdiffcore.txt b/Documentation/gitdiffcore.txt index c970d9fe43..0d57f86abc 100644 --- a/Documentation/gitdiffcore.txt +++ b/Documentation/gitdiffcore.txt @@ -74,6 +74,7 @@ into another list. There are currently 5 such transformations: - diffcore-merge-broken - diffcore-pickaxe - diffcore-order +- diffcore-rotate These are applied in sequence. The set of filepairs 'git diff-{asterisk}' commands find are used as the input to diffcore-break, and @@ -168,6 +169,26 @@ a similarity score different from the default of 50% by giving a number after the "-M" or "-C" option (e.g. "-M8" to tell it to use 8/10 = 80%). +Note that when rename detection is on but both copy and break +detection are off, rename detection adds a preliminary step that first +checks if files are moved across directories while keeping their +filename the same. If there is a file added to a directory whose +contents is sufficiently similar to a file with the same name that got +deleted from a different directory, it will mark them as renames and +exclude them from the later quadratic step (the one that pairwise +compares all unmatched files to find the "best" matches, determined by +the highest content similarity). So, for example, if a deleted +docs/ext.txt and an added docs/config/ext.txt are similar enough, they +will be marked as a rename and prevent an added docs/ext.md that may +be even more similar to the deleted docs/ext.txt from being considered +as the rename destination in the later step. For this reason, the +preliminary "match same filename" step uses a bit higher threshold to +mark a file pair as a rename and stop considering other candidates for +better matches. At most, one comparison is done per file in this +preliminary pass; so if there are several remaining ext.txt files +throughout the directory hierarchy after exact rename detection, this +preliminary step may be skipped for those files. + Note. When the "-C" option is used with `--find-copies-harder` option, 'git diff-{asterisk}' commands feed unmodified filepairs to diffcore mechanism as well as modified ones. This lets the copy @@ -276,6 +297,26 @@ Documentation t ------------------------------------------------ +diffcore-rotate: For Changing At Which Path Output Starts +--------------------------------------------------------- + +This transformation takes one pathname, and rotates the set of +filepairs so that the filepair for the given pathname comes first, +optionally discarding the paths that come before it. This is used +to implement the `--skip-to` and the `--rotate-to` options. It is +an error when the specified pathname is not in the set of filepairs, +but it is not useful to error out when used with "git log" family of +commands, because it is unreasonable to expect that a given path +would be modified by each and every commit shown by the "git log" +command. For this reason, when used with "git log", the filepair +that sorts the same as, or the first one that sorts after, the given +pathname is where the output starts. + +Use of this transformation combined with diffcore-order will produce +unexpected results, as the input to this transformation is likely +not sorted when diffcore-order is in effect. + + SEE ALSO -------- linkgit:git-diff[1], diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt index 1f3b57d04d..b51959ff94 100644 --- a/Documentation/githooks.txt +++ b/Documentation/githooks.txt @@ -138,7 +138,7 @@ given); `template` (if a `-t` option was given or the configuration option `commit.template` is set); `merge` (if the commit is a merge or a `.git/MERGE_MSG` file exists); `squash` (if a `.git/SQUASH_MSG` file exists); or `commit`, followed by -a commit SHA-1 (if a `-c`, `-C` or `--amend` option was given). +a commit object name (if a `-c`, `-C` or `--amend` option was given). If the exit status is non-zero, `git commit` will abort. @@ -231,19 +231,19 @@ named remote is not being used both values will be the same. Information about what is to be pushed is provided on the hook's standard input with lines of the form: - <local ref> SP <local sha1> SP <remote ref> SP <remote sha1> LF + <local ref> SP <local object name> SP <remote ref> SP <remote object name> LF For instance, if the command +git push origin master:foreign+ were run the hook would receive a line like the following: refs/heads/master 67890 refs/heads/foreign 12345 -although the full, 40-character SHA-1s would be supplied. If the foreign ref -does not yet exist the `<remote SHA-1>` will be 40 `0`. If a ref is to be -deleted, the `<local ref>` will be supplied as `(delete)` and the `<local -SHA-1>` will be 40 `0`. If the local commit was specified by something other -than a name which could be expanded (such as `HEAD~`, or a SHA-1) it will be -supplied as it was originally given. +although the full object name would be supplied. If the foreign ref does not +yet exist the `<remote object name>` will be the all-zeroes object name. If a +ref is to be deleted, the `<local ref>` will be supplied as `(delete)` and the +`<local object name>` will be the all-zeroes object name. If the local commit +was specified by something other than a name which could be expanded (such as +`HEAD~`, or an object name) it will be supplied as it was originally given. If this hook exits with a non-zero status, `git push` will abort without pushing anything. Information about why the push is rejected may be sent @@ -268,7 +268,7 @@ input a line of the format: where `<old-value>` is the old object name stored in the ref, `<new-value>` is the new object name to be stored in the ref and `<ref-name>` is the full name of the ref. -When creating a new ref, `<old-value>` is 40 `0`. +When creating a new ref, `<old-value>` is the all-zeroes object name. If the hook exits with non-zero status, none of the refs will be updated. If the hook exits with zero, updating of individual refs can @@ -473,7 +473,8 @@ reference-transaction This hook is invoked by any Git command that performs reference updates. It executes whenever a reference transaction is prepared, -committed or aborted and may thus get called multiple times. +committed or aborted and may thus get called multiple times. The hook +does not cover symbolic references (but that may change in the future). The hook takes exactly one argument, which is the current state the given reference transaction is in: @@ -492,6 +493,14 @@ receives on standard input a line of the format: <old-value> SP <new-value> SP <ref-name> LF +where `<old-value>` is the old object name passed into the reference +transaction, `<new-value>` is the new object name to be stored in the +ref and `<ref-name>` is the full name of the ref. When force updating +the reference regardless of its current value or when the reference is +to be created anew, `<old-value>` is the all-zeroes object name. To +distinguish these cases, you can inspect the current value of +`<ref-name>` via `git rev-parse`. + The exit status of the hook is ignored for any state except for the "prepared" state. In the "prepared" state, a non-zero exit status will cause the transaction to be aborted. The hook will not be called with @@ -550,7 +559,7 @@ command-dependent arguments may be passed in the future. The hook receives a list of the rewritten commits on stdin, in the format - <old-sha1> SP <new-sha1> [ SP <extra-info> ] LF + <old-object-name> SP <new-object-name> [ SP <extra-info> ] LF The 'extra-info' is again command-dependent. If it is empty, the preceding SP is also omitted. Currently, no commands pass any @@ -566,7 +575,7 @@ rebase:: For the 'squash' and 'fixup' operation, all commits that were squashed are listed as being rewritten to the squashed commit. This means that there will be several lines sharing the same - 'new-sha1'. + 'new-object-name'. + The commits are guaranteed to be listed in the order that they were processed by rebase. diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt index d47b1ae296..5751603b13 100644 --- a/Documentation/gitignore.txt +++ b/Documentation/gitignore.txt @@ -153,7 +153,7 @@ EXAMPLES -------- - The pattern `hello.*` matches any file or folder - whose name begins with `hello`. If one wants to restrict + whose name begins with `hello.`. If one wants to restrict this only to the directory and not in its subdirectories, one can prepend the pattern with a slash, i.e. `/hello.*`; the pattern now matches `hello.txt`, `hello.c` but not diff --git a/Documentation/gitmailmap.txt b/Documentation/gitmailmap.txt index 052209b33b..3fb39f801f 100644 --- a/Documentation/gitmailmap.txt +++ b/Documentation/gitmailmap.txt @@ -50,9 +50,9 @@ which allows mailmap to replace both the name and the email of a commit matching both the specified commit name and email address. Both E-Mails and names are matched case-insensitively. For example -this would also match the 'Commit Name <commit@email.xx>' above: +this would also match the 'Commit Name <commit@email.xx>' above: -- -Proper Name <proper@email.xx> CoMmIt NaMe <CoMmIt@EmAiL.xX> + Proper Name <proper@email.xx> CoMmIt NaMe <CoMmIt@EmAiL.xX> -- EXAMPLES @@ -79,9 +79,9 @@ Jane Doe <jane@example.com> Jane Doe <jane@desktop.(none)> ------------ -Note that there's no need to map the name for 'jane@laptop.(none)' to +Note that there's no need to map the name for '<jane@laptop.(none)>' to only correct the names. However, leaving the obviously broken -`<jane@laptop.(none)>' and '<jane@desktop.(none)>' E-Mails as-is is +'<jane@laptop.(none)>' and '<jane@desktop.(none)>' E-Mails as-is is usually not what you want. A `.mailmap` file which also corrects those is: diff --git a/Documentation/i18n.txt b/Documentation/i18n.txt index 7e36e5b55b..6c6baeeeb7 100644 --- a/Documentation/i18n.txt +++ b/Documentation/i18n.txt @@ -38,7 +38,7 @@ mind. a warning if the commit log message given to it does not look like a valid UTF-8 string, unless you explicitly say your project uses a legacy encoding. The way to say this is to - have i18n.commitencoding in `.git/config` file, like this: + have `i18n.commitEncoding` in `.git/config` file, like this: + ------------ [i18n] diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt index 6b59e28d44..45133066e4 100644 --- a/Documentation/pretty-formats.txt +++ b/Documentation/pretty-formats.txt @@ -208,6 +208,19 @@ The placeholders are: '%cs':: committer date, short format (`YYYY-MM-DD`) '%d':: ref names, like the --decorate option of linkgit:git-log[1] '%D':: ref names without the " (", ")" wrapping. +'%(describe[:options])':: human-readable name, like + linkgit:git-describe[1]; empty string for + undescribable commits. The `describe` string + may be followed by a colon and zero or more + comma-separated options. Descriptions can be + inconsistent when tags are added or removed at + the same time. ++ +** 'match=<pattern>': Only consider tags matching the given + `glob(7)` pattern, excluding the "refs/tags/" prefix. +** 'exclude=<pattern>': Do not consider tags matching the given + `glob(7)` pattern, excluding the "refs/tags/" prefix. + '%S':: ref name given on the command line by which the commit was reached (like `git log --source`), only works with `git log` '%e':: encoding diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 002379056a..b1c8f86c6e 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -129,6 +129,11 @@ parents) and `--max-parents=-1` (negative numbers denote no upper limit). adjusting to updated upstream from time to time, and this option allows you to ignore the individual commits brought in to your history by such a merge. +ifdef::git-log[] ++ +This option also changes default diff format for merge commits +to `first-parent`, see `--diff-merges=first-parent` for details. +endif::git-log[] --not:: Reverses the meaning of the '{caret}' prefix (or lack thereof) @@ -222,6 +227,15 @@ ifdef::git-rev-list[] test the exit status to see if a range of objects is fully connected (or not). It is faster than redirecting stdout to `/dev/null` as the output does not have to be formatted. + +--disk-usage:: + Suppress normal output; instead, print the sum of the bytes used + for on-disk storage by the selected commits or objects. This is + equivalent to piping the output into `git cat-file + --batch-check='%(objectsize:disk)'`, except that it runs much + faster (especially with `--use-bitmap-index`). See the `CAVEATS` + section in linkgit:git-cat-file[1] for the limitations of what + "on-disk storage" means. endif::git-rev-list[] --cherry-mark:: diff --git a/Documentation/technical/chunk-format.txt b/Documentation/technical/chunk-format.txt new file mode 100644 index 0000000000..593614fced --- /dev/null +++ b/Documentation/technical/chunk-format.txt @@ -0,0 +1,116 @@ +Chunk-based file formats +======================== + +Some file formats in Git use a common concept of "chunks" to describe +sections of the file. This allows structured access to a large file by +scanning a small "table of contents" for the remaining data. This common +format is used by the `commit-graph` and `multi-pack-index` files. See +link:technical/pack-format.html[the `multi-pack-index` format] and +link:technical/commit-graph-format.html[the `commit-graph` format] for +how they use the chunks to describe structured data. + +A chunk-based file format begins with some header information custom to +that format. That header should include enough information to identify +the file type, format version, and number of chunks in the file. From this +information, that file can determine the start of the chunk-based region. + +The chunk-based region starts with a table of contents describing where +each chunk starts and ends. This consists of (C+1) rows of 12 bytes each, +where C is the number of chunks. Consider the following table: + + | Chunk ID (4 bytes) | Chunk Offset (8 bytes) | + |--------------------|------------------------| + | ID[0] | OFFSET[0] | + | ... | ... | + | ID[C] | OFFSET[C] | + | 0x0000 | OFFSET[C+1] | + +Each row consists of a 4-byte chunk identifier (ID) and an 8-byte offset. +Each integer is stored in network-byte order. + +The chunk identifier `ID[i]` is a label for the data stored within this +fill from `OFFSET[i]` (inclusive) to `OFFSET[i+1]` (exclusive). Thus, the +size of the `i`th chunk is equal to the difference between `OFFSET[i+1]` +and `OFFSET[i]`. This requires that the chunk data appears contiguously +in the same order as the table of contents. + +The final entry in the table of contents must be four zero bytes. This +confirms that the table of contents is ending and provides the offset for +the end of the chunk-based data. + +Note: The chunk-based format expects that the file contains _at least_ a +trailing hash after `OFFSET[C+1]`. + +Functions for working with chunk-based file formats are declared in +`chunk-format.h`. Using these methods provide extra checks that assist +developers when creating new file formats. + +Writing chunk-based file formats +-------------------------------- + +To write a chunk-based file format, create a `struct chunkfile` by +calling `init_chunkfile()` and pass a `struct hashfile` pointer. The +caller is responsible for opening the `hashfile` and writing header +information so the file format is identifiable before the chunk-based +format begins. + +Then, call `add_chunk()` for each chunk that is intended for write. This +populates the `chunkfile` with information about the order and size of +each chunk to write. Provide a `chunk_write_fn` function pointer to +perform the write of the chunk data upon request. + +Call `write_chunkfile()` to write the table of contents to the `hashfile` +followed by each of the chunks. This will verify that each chunk wrote +the expected amount of data so the table of contents is correct. + +Finally, call `free_chunkfile()` to clear the `struct chunkfile` data. The +caller is responsible for finalizing the `hashfile` by writing the trailing +hash and closing the file. + +Reading chunk-based file formats +-------------------------------- + +To read a chunk-based file format, the file must be opened as a +memory-mapped region. The chunk-format API expects that the entire file +is mapped as a contiguous memory region. + +Initialize a `struct chunkfile` pointer with `init_chunkfile(NULL)`. + +After reading the header information from the beginning of the file, +including the chunk count, call `read_table_of_contents()` to populate +the `struct chunkfile` with the list of chunks, their offsets, and their +sizes. + +Extract the data information for each chunk using `pair_chunk()` or +`read_chunk()`: + +* `pair_chunk()` assigns a given pointer with the location inside the + memory-mapped file corresponding to that chunk's offset. If the chunk + does not exist, then the pointer is not modified. + +* `read_chunk()` takes a `chunk_read_fn` function pointer and calls it + with the appropriate initial pointer and size information. The function + is not called if the chunk does not exist. Use this method to read chunks + if you need to perform immediate parsing or if you need to execute logic + based on the size of the chunk. + +After calling these methods, call `free_chunkfile()` to clear the +`struct chunkfile` data. This will not close the memory-mapped region. +Callers are expected to own that data for the timeframe the pointers into +the region are needed. + +Examples +-------- + +These file formats use the chunk-format API, and can be used as examples +for future formats: + +* *commit-graph:* see `write_commit_graph_file()` and `parse_commit_graph()` + in `commit-graph.c` for how the chunk-format API is used to write and + parse the commit-graph file format documented in + link:technical/commit-graph-format.html[the commit-graph file format]. + +* *multi-pack-index:* see `write_midx_internal()` and `load_multi_pack_index()` + in `midx.c` for how the chunk-format API is used to write and + parse the multi-pack-index file format documented in + link:technical/pack-format.html[the multi-pack-index file format]. diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt index b3b58880b9..87971c27dd 100644 --- a/Documentation/technical/commit-graph-format.txt +++ b/Documentation/technical/commit-graph-format.txt @@ -4,11 +4,7 @@ Git commit graph format The Git commit graph stores a list of commit OIDs and some associated metadata, including: -- The generation number of the commit. Commits with no parents have - generation number 1; commits with parents have generation number - one more than the maximum generation number of its parents. We - reserve zero as special, and can be used to mark a generation - number invalid or as "not computed". +- The generation number of the commit. - The root tree OID. @@ -65,6 +61,9 @@ CHUNK LOOKUP: the length using the next chunk position if necessary.) Each chunk ID appears at most once. + The CHUNK LOOKUP matches the table of contents from + link:technical/chunk-format.html[the chunk-based file format]. + The remaining data in the body is described one chunk at a time, and these chunks may be given in any order. Chunks are required unless otherwise specified. @@ -86,13 +85,33 @@ CHUNK DATA: position. If there are more than two parents, the second value has its most-significant bit on and the other bits store an array position into the Extra Edge List chunk. - * The next 8 bytes store the generation number of the commit and + * The next 8 bytes store the topological level (generation number v1) + of the commit and the commit time in seconds since EPOCH. The generation number uses the higher 30 bits of the first 4 bytes, while the commit time uses the 32 bits of the second 4 bytes, along with the lowest 2 bits of the lowest byte, storing the 33rd and 34th bit of the commit time. + Generation Data (ID: {'G', 'D', 'A', 'T' }) (N * 4 bytes) [Optional] + * This list of 4-byte values store corrected commit date offsets for the + commits, arranged in the same order as commit data chunk. + * If the corrected commit date offset cannot be stored within 31 bits, + the value has its most-significant bit on and the other bits store + the position of corrected commit date into the Generation Data Overflow + chunk. + * Generation Data chunk is present only when commit-graph file is written + by compatible versions of Git and in case of split commit-graph chains, + the topmost layer also has Generation Data chunk. + + Generation Data Overflow (ID: {'G', 'D', 'O', 'V' }) [Optional] + * This list of 8-byte values stores the corrected commit date offsets + for commits with corrected commit date offsets that cannot be + stored within 31 bits. + * Generation Data Overflow chunk is present only when Generation Data + chunk is present and atleast one corrected commit date offset cannot + be stored within 31 bits. + Extra Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional] This list of 4-byte values store the second through nth parents for all octopus merges. The second parent value in the commit data stores diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt index f14a7659aa..f05e7bda1a 100644 --- a/Documentation/technical/commit-graph.txt +++ b/Documentation/technical/commit-graph.txt @@ -38,14 +38,31 @@ A consumer may load the following info for a commit from the graph: Values 1-4 satisfy the requirements of parse_commit_gently(). -Define the "generation number" of a commit recursively as follows: +There are two definitions of generation number: +1. Corrected committer dates (generation number v2) +2. Topological levels (generation nummber v1) - * A commit with no parents (a root commit) has generation number one. +Define "corrected committer date" of a commit recursively as follows: - * A commit with at least one parent has generation number one more than - the largest generation number among its parents. + * A commit with no parents (a root commit) has corrected committer date + equal to its committer date. -Equivalently, the generation number of a commit A is one more than the + * A commit with at least one parent has corrected committer date equal to + the maximum of its commiter date and one more than the largest corrected + committer date among its parents. + + * As a special case, a root commit with timestamp zero has corrected commit + date of 1, to be able to distinguish it from GENERATION_NUMBER_ZERO + (that is, an uncomputed corrected commit date). + +Define the "topological level" of a commit recursively as follows: + + * A commit with no parents (a root commit) has topological level of one. + + * A commit with at least one parent has topological level one more than + the largest topological level among its parents. + +Equivalently, the topological level of a commit A is one more than the length of a longest path from A to a root commit. The recursive definition is easier to use for computation and observing the following property: @@ -60,6 +77,9 @@ is easier to use for computation and observing the following property: generation numbers, then we always expand the boundary commit with highest generation number and can easily detect the stopping condition. +The property applies to both versions of generation number, that is both +corrected committer dates and topological levels. + This property can be used to significantly reduce the time it takes to walk commits and determine topological relationships. Without generation numbers, the general heuristic is the following: @@ -67,7 +87,9 @@ numbers, the general heuristic is the following: If A and B are commits with commit time X and Y, respectively, and X < Y, then A _probably_ cannot reach B. -This heuristic is currently used whenever the computation is allowed to +In absence of corrected commit dates (for example, old versions of Git or +mixed generation graph chains), +this heuristic is currently used whenever the computation is allowed to violate topological relationships due to clock skew (such as "git log" with default order), but is not used when the topological order is required (such as merge base calculations, "git log --graph"). @@ -77,7 +99,7 @@ in the commit graph. We can treat these commits as having "infinite" generation number and walk until reaching commits with known generation number. -We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not +We use the macro GENERATION_NUMBER_INFINITY to mark commits not in the commit-graph file. If a commit-graph file was written by a version of Git that did not compute generation numbers, then those commits will have generation number represented by the macro GENERATION_NUMBER_ZERO = 0. @@ -93,12 +115,12 @@ fully-computed generation numbers. Using strict inequality may result in walking a few extra commits, but the simplicity in dealing with commits with generation number *_INFINITY or *_ZERO is valuable. -We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose -generation numbers are computed to be at least this value. We limit at -this value since it is the largest value that can be stored in the -commit-graph file using the 30 bits available to generation numbers. This -presents another case where a commit can have generation number equal to -that of a parent. +We use the macro GENERATION_NUMBER_V1_MAX = 0x3FFFFFFF for commits whose +topological levels (generation number v1) are computed to be at least +this value. We limit at this value since it is the largest value that +can be stored in the commit-graph file using the 30 bits available +to topological levels. This presents another case where a commit can +have generation number equal to that of a parent. Design Details -------------- @@ -267,6 +289,35 @@ The merge strategy values (2 for the size multiple, 64,000 for the maximum number of commits) could be extracted into config settings for full flexibility. +## Handling Mixed Generation Number Chains + +With the introduction of generation number v2 and generation data chunk, the +following scenario is possible: + +1. "New" Git writes a commit-graph with the corrected commit dates. +2. "Old" Git writes a split commit-graph on top without corrected commit dates. + +A naive approach of using the newest available generation number from +each layer would lead to violated expectations: the lower layer would +use corrected commit dates which are much larger than the topological +levels of the higher layer. For this reason, Git inspects the topmost +layer to see if the layer is missing corrected commit dates. In such a case +Git only uses topological level for generation numbers. + +When writing a new layer in split commit-graph, we write corrected commit +dates if the topmost layer has corrected commit dates written. This +guarantees that if a layer has corrected commit dates, all lower layers +must have corrected commit dates as well. + +When merging layers, we do not consider whether the merged layers had corrected +commit dates. Instead, the new layer will have corrected commit dates if the +layer below the new layer has corrected commit dates. + +While writing or merging layers, if the new layer is the only layer, it will +have corrected commit dates when written by compatible versions of Git. Thus, +rewriting split commit-graph as a single file (`--split=replace`) creates a +single layer with corrected commit dates. + ## Deleting graph-{hash} files After a new tip file is written, some `graph-{hash}` files may no longer diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt index 6fd20ebbc2..7c1630bf83 100644 --- a/Documentation/technical/hash-function-transition.txt +++ b/Documentation/technical/hash-function-transition.txt @@ -33,16 +33,9 @@ researchers. On 23 February 2017 the SHAttered attack Git v2.13.0 and later subsequently moved to a hardened SHA-1 implementation by default, which isn't vulnerable to the SHAttered -attack. +attack, but SHA-1 is still weak. -Thus Git has in effect already migrated to a new hash that isn't SHA-1 -and doesn't share its vulnerabilities, its new hash function just -happens to produce exactly the same output for all known inputs, -except two PDFs published by the SHAttered researchers, and the new -implementation (written by those researchers) claims to detect future -cryptanalytic collision attacks. - -Regardless, it's considered prudent to move past any variant of SHA-1 +Thus it's considered prudent to move past any variant of SHA-1 to a new hash. There's no guarantee that future attacks on SHA-1 won't be published in the future, and those attacks may not have viable mitigations. @@ -57,6 +50,38 @@ SHA-1 still possesses the other properties such as fast object lookup and safe error checking, but other hash functions are equally suitable that are believed to be cryptographically secure. +Choice of Hash +-------------- +The hash to replace the hardened SHA-1 should be stronger than SHA-1 +was: we would like it to be trustworthy and useful in practice for at +least 10 years. + +Some other relevant properties: + +1. A 256-bit hash (long enough to match common security practice; not + excessively long to hurt performance and disk usage). + +2. High quality implementations should be widely available (e.g., in + OpenSSL and Apple CommonCrypto). + +3. The hash function's properties should match Git's needs (e.g. Git + requires collision and 2nd preimage resistance and does not require + length extension resistance). + +4. As a tiebreaker, the hash should be fast to compute (fortunately + many contenders are faster than SHA-1). + +There were several contenders for a successor hash to SHA-1, including +SHA-256, SHA-512/256, SHA-256x16, K12, and BLAKE2bp-256. + +In late 2018 the project picked SHA-256 as its successor hash. + +See 0ed8d8da374 (doc hash-function-transition: pick SHA-256 as +NewHash, 2018-08-04) and numerous mailing list threads at the time, +particularly the one starting at +https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/ +for more information. + Goals ----- 1. The transition to SHA-256 can be done one local repository at a time. @@ -94,7 +119,7 @@ Overview -------- We introduce a new repository format extension. Repositories with this extension enabled use SHA-256 instead of SHA-1 to name their objects. -This affects both object names and object content --- both the names +This affects both object names and object content -- both the names of objects and all references to other objects within an object are switched to the new hash function. @@ -107,7 +132,7 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names interchangeably. "git cat-file" and "git hash-object" gain options to display an object -in its sha1 form and write an object given its sha1 form. This +in its SHA-1 form and write an object given its SHA-1 form. This requires all objects referenced by that object to be present in the object database so that they can be named using the appropriate name (using the bidirectional hash mapping). @@ -115,7 +140,7 @@ object database so that they can be named using the appropriate name Fetches from a SHA-1 based server convert the fetched objects into SHA-256 form and record the mapping in the bidirectional mapping table (see below for details). Pushes to a SHA-1 based server convert the -objects being pushed into sha1 form so the server does not have to be +objects being pushed into SHA-1 form so the server does not have to be aware of the hash function the client is using. Detailed Design @@ -151,38 +176,38 @@ repository extensions. Object names ~~~~~~~~~~~~ -Objects can be named by their 40 hexadecimal digit sha1-name or 64 -hexadecimal digit sha256-name, plus names derived from those (see +Objects can be named by their 40 hexadecimal digit SHA-1 name or 64 +hexadecimal digit SHA-256 name, plus names derived from those (see gitrevisions(7)). -The sha1-name of an object is the SHA-1 of the concatenation of its -type, length, a nul byte, and the object's sha1-content. This is the +The SHA-1 name of an object is the SHA-1 of the concatenation of its +type, length, a nul byte, and the object's SHA-1 content. This is the traditional <sha1> used in Git to name objects. -The sha256-name of an object is the SHA-256 of the concatenation of its -type, length, a nul byte, and the object's sha256-content. +The SHA-256 name of an object is the SHA-256 of the concatenation of its +type, length, a nul byte, and the object's SHA-256 content. Object format ~~~~~~~~~~~~~ The content as a byte sequence of a tag, commit, or tree object named -by sha1 and sha256 differ because an object named by sha256-name refers to -other objects by their sha256-names and an object named by sha1-name -refers to other objects by their sha1-names. +by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to +other objects by their SHA-256 names and an object named by SHA-1 name +refers to other objects by their SHA-1 names. -The sha256-content of an object is the same as its sha1-content, except -that objects referenced by the object are named using their sha256-names -instead of sha1-names. Because a blob object does not refer to any -other object, its sha1-content and sha256-content are the same. +The SHA-256 content of an object is the same as its SHA-1 content, except +that objects referenced by the object are named using their SHA-256 names +instead of SHA-1 names. Because a blob object does not refer to any +other object, its SHA-1 content and SHA-256 content are the same. -The format allows round-trip conversion between sha256-content and -sha1-content. +The format allows round-trip conversion between SHA-256 content and +SHA-1 content. Object storage ~~~~~~~~~~~~~~ Loose objects use zlib compression and packed objects use the packed format described in Documentation/technical/pack-format.txt, just like -today. The content that is compressed and stored uses sha256-content -instead of sha1-content. +today. The content that is compressed and stored uses SHA-256 content +instead of SHA-1 content. Pack index ~~~~~~~~~~ @@ -191,21 +216,21 @@ hash functions. They have the following format (all integers are in network byte order): - A header appears at the beginning and consists of the following: - - The 4-byte pack index signature: '\377t0c' - - 4-byte version number: 3 - - 4-byte length of the header section, including the signature and + * The 4-byte pack index signature: '\377t0c' + * 4-byte version number: 3 + * 4-byte length of the header section, including the signature and version number - - 4-byte number of objects contained in the pack - - 4-byte number of object formats in this pack index: 2 - - For each object format: - - 4-byte format identifier (e.g., 'sha1' for SHA-1) - - 4-byte length in bytes of shortened object names. This is the + * 4-byte number of objects contained in the pack + * 4-byte number of object formats in this pack index: 2 + * For each object format: + ** 4-byte format identifier (e.g., 'sha1' for SHA-1) + ** 4-byte length in bytes of shortened object names. This is the shortest possible length needed to make names in the shortened object name table unambiguous. - - 4-byte integer, recording where tables relating to this format + ** 4-byte integer, recording where tables relating to this format are stored in this index file, as an offset from the beginning. - - 4-byte offset to the trailer from the beginning of this file. - - Zero or more additional key/value pairs (4-byte key, 4-byte + * 4-byte offset to the trailer from the beginning of this file. + * Zero or more additional key/value pairs (4-byte key, 4-byte value). Only one key is supported: 'PSRC'. See the "Loose objects and unreachable objects" section for supported values and how this is used. All other keys are reserved. Readers must ignore @@ -213,37 +238,36 @@ network byte order): - Zero or more NUL bytes. This can optionally be used to improve the alignment of the full object name table below. - Tables for the first object format: - - A sorted table of shortened object names. These are prefixes of + * A sorted table of shortened object names. These are prefixes of the names of all objects in this pack file, packed together without offset values to reduce the cache footprint of the binary search for a specific object name. - - A table of full object names in pack order. This allows resolving + * A table of full object names in pack order. This allows resolving a reference to "the nth object in the pack file" (from a reachability bitmap or from the next table of another object format) to its object name. - - A table of 4-byte values mapping object name order to pack order. + * A table of 4-byte values mapping object name order to pack order. For an object in the table of sorted shortened object names, the value at the corresponding index in this table is the index in the previous table for that same object. - This can be used to look up the object in reachability bitmaps or to look up its name in another object format. - - A table of 4-byte CRC32 values of the packed object data, in the + * A table of 4-byte CRC32 values of the packed object data, in the order that the objects appear in the pack file. This is to allow compressed data to be copied directly from pack to pack during repacking without undetected data corruption. - - A table of 4-byte offset values. For an object in the table of + * A table of 4-byte offset values. For an object in the table of sorted shortened object names, the value at the corresponding index in this table indicates where that object can be found in the pack file. These are usually 31-bit pack file offsets, but large offsets are encoded as an index into the next table with the most significant bit set. - - A table of 8-byte offset entries (empty for pack files less than + * A table of 8-byte offset entries (empty for pack files less than 2 GiB). Pack files are organized with heavily used objects toward the front, so most object references should not need to refer to this table. @@ -252,10 +276,10 @@ network byte order): up to and not including the table of CRC32 values. - Zero or more NUL bytes. - The trailer consists of the following: - - A copy of the 20-byte SHA-256 checksum at the end of the + * A copy of the 20-byte SHA-256 checksum at the end of the corresponding packfile. - - 20-byte SHA-256 checksum of all of the above. + * 20-byte SHA-256 checksum of all of the above. Loose object index ~~~~~~~~~~~~~~~~~~ @@ -288,18 +312,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"): Translation table ~~~~~~~~~~~~~~~~~ -The index files support a bidirectional mapping between sha1-names -and sha256-names. The lookup proceeds similarly to ordinary object -lookups. For example, to convert a sha1-name to a sha256-name: +The index files support a bidirectional mapping between SHA-1 names +and SHA-256 names. The lookup proceeds similarly to ordinary object +lookups. For example, to convert a SHA-1 name to a SHA-256 name: 1. Look for the object in idx files. If a match is present in the - idx's sorted list of truncated sha1-names, then: - a. Read the corresponding entry in the sha1-name order to pack + idx's sorted list of truncated SHA-1 names, then: + a. Read the corresponding entry in the SHA-1 name order to pack name order mapping. - b. Read the corresponding entry in the full sha1-name table to + b. Read the corresponding entry in the full SHA-1 name table to verify we found the right object. If it is, then - c. Read the corresponding entry in the full sha256-name table. - That is the object's sha256-name. + c. Read the corresponding entry in the full SHA-256 name table. + That is the object's SHA-256 name. 2. Check for a loose object. Read lines from loose-object-idx until we find a match. @@ -313,10 +337,10 @@ Since all operations that make new objects (e.g., "git commit") add the new objects to the corresponding index, this mapping is possible for all objects in the object store. -Reading an object's sha1-content -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The sha1-content of an object can be read by converting all sha256-names -its sha256-content references to sha1-names using the translation table. +Reading an object's SHA-1 content +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The SHA-1 content of an object can be read by converting all SHA-256 names +of its SHA-256 content references to SHA-1 names using the translation table. Fetch ~~~~~ @@ -339,7 +363,7 @@ the following steps: 1. index-pack: inflate each object in the packfile and compute its SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against objects the client has locally. These objects can be looked up - using the translation table and their sha1-content read as + using the translation table and their SHA-1 content read as described above to resolve the deltas. 2. topological sort: starting at the "want"s from the negotiation phase, walk through objects in the pack and emit a list of them, @@ -348,12 +372,12 @@ the following steps: (This list only contains objects reachable from the "wants". If the pack from the server contained additional extraneous objects, then they will be discarded.) -3. convert to sha256: open a new (sha256) packfile. Read the topologically +3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically sorted list just generated. For each object, inflate its - sha1-content, convert to sha256-content, and write it to the sha256 - pack. Record the new sha1<->sha256 mapping entry for use in the idx. + SHA-1 content, convert to SHA-256 content, and write it to the SHA-256 + pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx. 4. sort: reorder entries in the new pack to match the order of objects - in the pack the server generated and include blobs. Write a sha256 idx + in the pack the server generated and include blobs. Write a SHA-256 idx file 5. clean up: remove the SHA-1 based pack file, index, and topologically sorted list obtained from the server in steps 1 @@ -378,19 +402,20 @@ experimenting to get this to perform well. Push ~~~~ Push is simpler than fetch because the objects referenced by the -pushed objects are already in the translation table. The sha1-content +pushed objects are already in the translation table. The SHA-1 content of each object being pushed can be read as described in the "Reading -an object's sha1-content" section to generate the pack written by git +an object's SHA-1 content" section to generate the pack written by git send-pack. Signed Commits ~~~~~~~~~~~~~~ We add a new field "gpgsig-sha256" to the commit object format to allow signing commits without relying on SHA-1. It is similar to the -existing "gpgsig" field. Its signed payload is the sha256-content of the +existing "gpgsig" field. Its signed payload is the SHA-256 content of the commit object with any "gpgsig" and "gpgsig-sha256" fields removed. This means commits can be signed + 1. using SHA-1 only, as in existing signed commit objects 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig fields. @@ -404,10 +429,11 @@ Signed Tags ~~~~~~~~~~~ We add a new field "gpgsig-sha256" to the tag object format to allow signing tags without relying on SHA-1. Its signed payload is the -sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP +SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP SIGNATURE-----" delimited in-body signature removed. This means tags can be signed + 1. using SHA-1 only, as in existing signed tag objects 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body signature. @@ -415,11 +441,11 @@ This means tags can be signed Mergetag embedding ~~~~~~~~~~~~~~~~~~ -The mergetag field in the sha1-content of a commit contains the -sha1-content of a tag that was merged by that commit. +The mergetag field in the SHA-1 content of a commit contains the +SHA-1 content of a tag that was merged by that commit. -The mergetag field in the sha256-content of the same commit contains the -sha256-content of the same tag. +The mergetag field in the SHA-256 content of the same commit contains the +SHA-256 content of the same tag. Submodules ~~~~~~~~~~ @@ -494,7 +520,7 @@ Caveats ------- Invalid objects ~~~~~~~~~~~~~~~ -The conversion from sha1-content to sha256-content retains any +The conversion from SHA-1 content to SHA-256 content retains any brokenness in the original object (e.g., tree entry modes encoded with leading 0, tree objects whose paths are not sorted correctly, and commit objects without an author or committer). This is a deliberate @@ -513,15 +539,15 @@ allow lifting this restriction. Alternates ~~~~~~~~~~ -For the same reason, a sha256 repository cannot borrow objects from a -sha1 repository using objects/info/alternates or +For the same reason, a SHA-256 repository cannot borrow objects from a +SHA-1 repository using objects/info/alternates or $GIT_ALTERNATE_OBJECT_REPOSITORIES. git notes ~~~~~~~~~ -The "git notes" tool annotates objects using their sha1-name as key. +The "git notes" tool annotates objects using their SHA-1 name as key. This design does not describe a way to migrate notes trees to use -sha256-names. That migration is expected to happen separately (for +SHA-256 names. That migration is expected to happen separately (for example using a file at the root of the notes tree to describe which hash it uses). @@ -555,7 +581,7 @@ unclear: Git 2.12 -Does this mean Git v2.12.0 is the commit with sha1-name +Does this mean Git v2.12.0 is the commit with SHA-1 name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7? @@ -598,44 +624,12 @@ The user can also explicitly specify which format to use for a particular revision specifier and for output, overriding the mode. For example: -git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256} - -Choice of Hash --------------- -In early 2005, around the time that Git was written, Xiaoyun Wang, -Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1 -collisions in 2^69 operations. In August they published details. -Luckily, no practical demonstrations of a collision in full SHA-1 were -published until 10 years later, in 2017. - -Git v2.13.0 and later subsequently moved to a hardened SHA-1 -implementation by default that mitigates the SHAttered attack, but -SHA-1 is still believed to be weak. - -The hash to replace this hardened SHA-1 should be stronger than SHA-1 -was: we would like it to be trustworthy and useful in practice for at -least 10 years. - -Some other relevant properties: - -1. A 256-bit hash (long enough to match common security practice; not - excessively long to hurt performance and disk usage). - -2. High quality implementations should be widely available (e.g., in - OpenSSL and Apple CommonCrypto). - -3. The hash function's properties should match Git's needs (e.g. Git - requires collision and 2nd preimage resistance and does not require - length extension resistance). - -4. As a tiebreaker, the hash should be fast to compute (fortunately - many contenders are faster than SHA-1). - -We choose SHA-256. + git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256} Transition plan --------------- Some initial steps can be implemented independently of one another: + - adding a hash function API (vtable) - teaching fsck to tolerate the gpgsig-sha256 field - excluding gpgsig-* from the fields copied by "git commit --amend" @@ -647,9 +641,9 @@ Some initial steps can be implemented independently of one another: - introducing index v3 - adding support for the PSRC field and safer object pruning - The first user-visible change is the introduction of the objectFormat extension (without compatObjectFormat). This requires: + - teaching fsck about this mode of operation - using the hash function API (vtable) when computing object names - signing objects and verifying signatures @@ -657,6 +651,7 @@ extension (without compatObjectFormat). This requires: repository Next comes introduction of compatObjectFormat: + - implementing the loose-object-idx - translating object names between object formats - translating object content between object formats @@ -669,10 +664,11 @@ Next comes introduction of compatObjectFormat: "Object names on the command line" above) The next step is supporting fetches and pushes to SHA-1 repositories: + - allow pushes to a repository using the compat format - generate a topologically sorted list of the SHA-1 names of fetched objects -- convert the fetched packfile to sha256 format and generate an idx +- convert the fetched packfile to SHA-256 format and generate an idx file - re-sort to match the order of objects in the fetched packfile @@ -734,6 +730,7 @@ Using hash functions in parallel Objects newly created would be addressed by the new hash, but inside such an object (e.g. commit) it is still possible to address objects using the old hash function. + * You cannot trust its history (needed for bisectability) in the future without further work * Maintenance burden as the number of supported hash functions grows @@ -743,36 +740,38 @@ using the old hash function. Signed objects with multiple hashes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Instead of introducing the gpgsig-sha256 field in commit and tag objects -for sha256-content based signatures, an earlier version of this design -added "hash sha256 <sha256-name>" fields to strengthen the existing -sha1-content based signatures. +for SHA-256 content based signatures, an earlier version of this design +added "hash sha256 <SHA-256 name>" fields to strengthen the existing +SHA-1 content based signatures. In other words, a single signature was used to attest to the object content using both hash functions. This had some advantages: + * Using one signature instead of two speeds up the signing process. * Having one signed payload with both hashes allows the signer to - attest to the sha1-name and sha256-name referring to the same object. + attest to the SHA-1 name and SHA-256 name referring to the same object. * All users consume the same signature. Broken signatures are likely to be detected quickly using current versions of git. However, it also came with disadvantages: -* Verifying a signed object requires access to the sha1-names of all + +* Verifying a signed object requires access to the SHA-1 names of all objects it references, even after the transition is complete and translation table is no longer needed for anything else. To support - this, the design added fields such as "hash sha1 tree <sha1-name>" - and "hash sha1 parent <sha1-name>" to the sha256-content of a signed + this, the design added fields such as "hash sha1 tree <SHA-1 name>" + and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed commit, complicating the conversion process. -* Allowing signed objects without a sha1 (for after the transition is +* Allowing signed objects without a SHA-1 (for after the transition is complete) complicated the design further, requiring a "nohash sha1" - field to suppress including "hash sha1" fields in the sha256-content + field to suppress including "hash sha1" fields in the SHA-256 content and signed payload. Lazily populated translation table ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Some of the work of building the translation table could be deferred to push time, but that would significantly complicate and slow down pushes. -Calculating the sha1-name at object creation time at the same time it is -being streamed to disk and having its sha256-name calculated should be +Calculating the SHA-1 name at object creation time at the same time it is +being streamed to disk and having its SHA-256 name calculated should be an acceptable cost. Document History @@ -782,18 +781,19 @@ Document History bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com, sbeller@google.com -Initial version sent to -http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com +* Initial version sent to https://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com 2017-03-03 jrnieder@gmail.com Incorporated suggestions from jonathantanmy and sbeller: -* describe purpose of signed objects with each hash type -* redefine signed object verification using object content under the + +* Describe purpose of signed objects with each hash type +* Redefine signed object verification using object content under the first hash function 2017-03-06 jrnieder@gmail.com + * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2] -* Make sha3-based signatures a separate field, avoiding the need for +* Make SHA3-based signatures a separate field, avoiding the need for "hash" and "nohash" fields (thanks to peff[3]). * Add a sorting phase to fetch (thanks to Junio for noticing the need for this). @@ -805,23 +805,26 @@ Incorporated suggestions from jonathantanmy and sbeller: especially Junio). 2017-09-27 jrnieder@gmail.com, sbeller@google.com -* use placeholder NewHash instead of SHA3-256 -* describe criteria for picking a hash function. -* include a transition plan (thanks especially to Brandon Williams + +* Use placeholder NewHash instead of SHA3-256 +* Describe criteria for picking a hash function. +* Include a transition plan (thanks especially to Brandon Williams for fleshing these ideas out) -* define the translation table (thanks, Shawn Pearce[5], Jonathan +* Define the translation table (thanks, Shawn Pearce[5], Jonathan Tan, and Masaya Suzuki) -* avoid loose object overhead by packing more aggressively in +* Avoid loose object overhead by packing more aggressively in "git gc --auto" Later history: - See the history of this file in git.git for the history of subsequent - edits. This document history is no longer being maintained as it - would now be superfluous to the commit log +* See the history of this file in git.git for the history of subsequent + edits. This document history is no longer being maintained as it + would now be superfluous to the commit log + +References: -[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/ -[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/ -[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/ -[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net -[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/ + [1] https://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/ + [2] https://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/ + [3] https://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/ + [4] https://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net + [5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/ diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index 69edf46c03..d363a71c37 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -26,7 +26,7 @@ Git index format Extensions are identified by signature. Optional extensions can be ignored if Git does not understand them. - Git currently supports cached tree and resolve undo extensions. + Git currently supports cache tree and resolve undo extensions. 4-byte extension signature. If the first byte is 'A'..'Z' the extension is optional and can be ignored. @@ -136,14 +136,35 @@ Git index format == Extensions -=== Cached tree - - Cached tree extension contains pre-computed hashes for trees that can - be derived from the index. It helps speed up tree object generation - from index for a new commit. - - When a path is updated in index, the path must be invalidated and - removed from tree cache. +=== Cache tree + + Since the index does not record entries for directories, the cache + entries cannot describe tree objects that already exist in the object + database for regions of the index that are unchanged from an existing + commit. The cache tree extension stores a recursive tree structure that + describes the trees that already exist and completely match sections of + the cache entries. This speeds up tree object generation from the index + for a new commit by only computing the trees that are "new" to that + commit. It also assists when comparing the index to another tree, such + as `HEAD^{tree}`, since sections of the index can be skipped when a tree + comparison demonstrates equality. + + The recursive tree structure uses nodes that store a number of cache + entries, a list of subnodes, and an object ID (OID). The OID references + the existing tree for that node, if it is known to exist. The subnodes + correspond to subdirectories that themselves have cache tree nodes. The + number of cache entries corresponds to the number of cache entries in + the index that describe paths within that tree's directory. + + The extension tracks the full directory structure in the cache tree + extension, but this is generally smaller than the full cache entry list. + + When a path is updated in index, Git invalidates all nodes of the + recursive cache tree corresponding to the parent directories of that + path. We store these tree nodes as being "invalid" by using "-1" as the + number of cache entries. Invalid nodes still store a span of index + entries, allowing Git to focus its efforts when reconstructing a full + cache tree. The signature for this extension is { 'T', 'R', 'E', 'E' }. @@ -174,7 +195,8 @@ Git index format first entry represents the root level of the repository, followed by the first subtree--let's call this A--of the root level (with its name relative to the root level), followed by the first subtree of A (with - its name relative to A), ... + its name relative to A), and so on. The specified number of subtrees + indicates when the current level of the recursive stack is complete. === Resolve undo @@ -251,14 +273,14 @@ Git index format - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from ctime field until "file size". - - Stat data of core.excludesfile + - Stat data of core.excludesFile - 32-bit dir_flags (see struct dir_struct) - Hash of $GIT_DIR/info/exclude. A null hash means the file does not exist. - - Hash of core.excludesfile. A null hash means the file does + - Hash of core.excludesFile. A null hash means the file does not exist. - NUL-terminated string of per-dir exclude file name. This usually diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt index 96d2fc589f..1faa949bf6 100644 --- a/Documentation/technical/pack-format.txt +++ b/Documentation/technical/pack-format.txt @@ -274,6 +274,26 @@ Pack file entry: <+ Index checksum of all of the above. +== pack-*.rev files have the format: + + - A 4-byte magic number '0x52494458' ('RIDX'). + + - A 4-byte version identifier (= 1). + + - A 4-byte hash function identifier (= 1 for SHA-1, 2 for SHA-256). + + - A table of index positions (one per packed object, num_objects in + total, each a 4-byte unsigned integer in network order), sorted by + their corresponding offsets in the packfile. + + - A trailer, containing a: + + checksum of the corresponding packfile, and + + a checksum of all of the above. + +All 4-byte numbers are in network order. + == multi-pack-index (MIDX) files have the following format: The multi-pack-index files refer to multiple pack-files and loose objects. @@ -316,6 +336,9 @@ CHUNK LOOKUP: (Chunks are provided in file-order, so you can infer the length using the next chunk position if necessary.) + The CHUNK LOOKUP matches the table of contents from + link:technical/chunk-format.html[the chunk-based file format]. + The remaining data in the body is described one chunk at a time, and these chunks may be given in any order. Chunks are required unless otherwise specified. diff --git a/Documentation/technical/packfile-uri.txt b/Documentation/technical/packfile-uri.txt index 318713abc3..f7eabc6c76 100644 --- a/Documentation/technical/packfile-uri.txt +++ b/Documentation/technical/packfile-uri.txt @@ -37,8 +37,11 @@ at least so that we can test the client. This is the implementation: a feature, marked experimental, that allows the server to be configured by one or more `uploadpack.blobPackfileUri=<sha1> <uri>` entries. Whenever the list of objects to be sent is assembled, all such -blobs are excluded, replaced with URIs. The client will download those URIs, -expecting them to each point to packfiles containing single blobs. +blobs are excluded, replaced with URIs. As noted in "Future work" below, the +server can evolve in the future to support excluding other objects (or other +implementations of servers could be made that support excluding other objects) +without needing a protocol change, so clients should not expect that packfiles +downloaded in this way only contain single blobs. Client design ------------- diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index 85daeb5d9e..a7c806a73e 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -33,8 +33,8 @@ In protocol v2 these special packets will have the following semantics: * '0000' Flush Packet (flush-pkt) - indicates the end of a message * '0001' Delimiter Packet (delim-pkt) - separates sections of a message - * '0002' Message Packet (response-end-pkt) - indicates the end of a response - for stateless connections + * '0002' Response End Packet (response-end-pkt) - indicates the end of a + response for stateless connections Initial Client Request ---------------------- @@ -192,11 +192,20 @@ ls-refs takes in the following arguments: When specified, only references having a prefix matching one of the provided prefixes are displayed. +If the 'unborn' feature is advertised the following argument can be +included in the client's request. + + unborn + The server will send information about HEAD even if it is a symref + pointing to an unborn branch in the form "unborn HEAD + symref-target:<target>". + The output of ls-refs is as follows: output = *ref flush-pkt - ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF) + obj-id-or-unborn = (obj-id | "unborn") + ref = PKT-LINE(obj-id-or-unborn SP refname *(SP ref-attribute) LF) ref-attribute = (symref | peeled) symref = "symref-target:" symref-target peeled = "peeled:" obj-id diff --git a/Documentation/technical/reftable.txt b/Documentation/technical/reftable.txt index 8095ab2590..3ef169af27 100644 --- a/Documentation/technical/reftable.txt +++ b/Documentation/technical/reftable.txt @@ -872,17 +872,11 @@ A repository must set its `$GIT_DIR/config` to configure reftable: Layout ^^^^^^ -A collection of reftable files are stored in the `$GIT_DIR/reftable/` -directory: - -.... -00000001-00000001.log -00000002-00000002.ref -00000003-00000003.ref -.... - -where reftable files are named by a unique name such as produced by the -function `${min_update_index}-${max_update_index}.ref`. +A collection of reftable files are stored in the `$GIT_DIR/reftable/` directory. +Their names should have a random element, such that each filename is globally +unique; this helps avoid spurious failures on Windows, where open files cannot +be removed or overwritten. It suggested to use +`${min_update_index}-${max_update_index}-${random}.ref` as a naming convention. Log-only files use the `.log` extension, while ref-only and mixed ref and log files use `.ref`. extension. @@ -893,9 +887,9 @@ current files, one per line, in order, from oldest (base) to newest .... $ cat .git/reftable/tables.list -00000001-00000001.log -00000002-00000002.ref -00000003-00000003.ref +00000001-00000001-RANDOM1.log +00000002-00000002-RANDOM2.ref +00000003-00000003-RANDOM3.ref .... Readers must read `$GIT_DIR/reftable/tables.list` to determine which @@ -940,7 +934,7 @@ new reftable and atomically appending it to the stack: 3. Select `update_index` to be most recent file's `max_update_index + 1`. 4. Prepare temp reftable `tmp_XXXXXX`, including log entries. -5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}.ref`. +5. Rename `tmp_XXXXXX` to `${update_index}-${update_index}-${random}.ref`. 6. Copy `tables.list` to `tables.list.lock`, appending file from (5). 7. Rename `tables.list.lock` to `tables.list`. @@ -993,7 +987,7 @@ prevents other processes from trying to compact these files. should always be the case, assuming that other processes are adhering to the locking protocol. 7. Rename `${min_update_index}-${max_update_index}_XXXXXX` to -`${min_update_index}-${max_update_index}.ref`. +`${min_update_index}-${max_update_index}-${random}.ref`. 8. Write the new stack to `tables.list.lock`, replacing `B` and `C` with the file from (4). 9. Rename `tables.list.lock` to `tables.list`. @@ -1005,6 +999,22 @@ This strategy permits compactions to proceed independently of updates. Each reftable (compacted or not) is uniquely identified by its name, so open reftables can be cached by their name. +Windows +^^^^^^^ + +On windows, and other systems that do not allow deleting or renaming to open +files, compaction may succeed, but other readers may prevent obsolete tables +from being deleted. + +On these platforms, the following strategy can be followed: on closing a +reftable stack, reload `tables.list`, and delete any tables no longer mentioned +in `tables.list`. + +Irregular program exit may still leave about unused files. In this case, a +cleanup operation can read `tables.list`, note its modification timestamp, and +delete any unreferenced `*.ref` files that are older. + + Alternatives considered ~~~~~~~~~~~~~~~~~~~~~~~ |