summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)AuthorFilesLines
2020-10-07Documentation/config/checkout: replace sq with backticksLibravatar Denton Liu1-8/+8
The modern style for Git documentation is to use backticks to quote any command-line documenation so that it is typeset in monospace. Replace all single quotes with backticks to conform to this. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-05Git 2.29-rc0Libravatar Junio C Hamano1-0/+31
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-05Merge branch 'sn/fast-import-doc'Libravatar Junio C Hamano1-1/+1
Doc update. * sn/fast-import-doc: fast-import: fix typo in documentation
2020-10-05Merge branch 'pb/submodule-doc-fix'Libravatar Junio C Hamano1-10/+13
Doc update. * pb/submodule-doc-fix: gitsubmodules doc: invoke 'ls-files' with '--recurse-submodules'
2020-10-05Merge branch 'jk/format-auto-base-when-able'Libravatar Junio C Hamano1-1/+3
"git format-patch" learns to take "whenAble" as a possible value for the format.useAutoBase configuration variable to become no-op when the automatically computed base does not make sense. * jk/format-auto-base-when-able: format-patch: teach format.useAutoBase "whenAble" option
2020-10-05Merge branch 'jk/refspecs-negative'Libravatar Junio C Hamano1-0/+16
"git fetch" and "git push" support negative refspecs. * jk/refspecs-negative: refspec: add support for negative refspecs
2020-10-05Merge branch 'rs/archive-add-file'Libravatar Junio C Hamano1-0/+6
"git archive" learns the "--add-file" option to include untracked files into a snapshot from a tree-ish. * rs/archive-add-file: Makefile: use git-archive --add-file archive: add --add-file archive: read short blobs in archive.c::write_archive_entry()
2020-10-04fast-import: fix typo in documentationLibravatar Samanta Navarro1-1/+1
Signed-off-by: Samanta Navarro <ferivoz@riseup.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-04gitsubmodules doc: invoke 'ls-files' with '--recurse-submodules'Libravatar Philippe Blain1-10/+13
`git ls-files` was never taught to respect the `submodule.recurse` configuration variable, and it is too late now to change that [1], but still the command is mentioned in 'gitsubmodules(7)' as if it does respect that config. Adjust the call in 'gitsubmodules(7)' by calling 'ls-files' with the '--recurse-submodules' option. While at it, uniformize the capitalization in that file, and use backticks instead of quotes for Git commands and configuration variables. [1] https://lore.kernel.org/git/pull.732.git.1599707259907.gitgitgadget@gmail.com/T/#u Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-04Nineteenth batchLibravatar Junio C Hamano1-0/+33
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-04Merge branch 'jk/shortlog-group-by-trailer'Libravatar Junio C Hamano1-1/+30
"git shortlog" has been taught to group commits by the contents of the trailer lines, like "Reviewed-by:", "Coauthored-by:", etc. * jk/shortlog-group-by-trailer: shortlog: allow multiple groups to be specified shortlog: parse trailer idents shortlog: rename parse_stdin_ident() shortlog: de-duplicate trailer values shortlog: match commit trailers with --group trailer: add interface for iterating over commit trailers shortlog: add grouping option shortlog: change "author" variables to "ident"
2020-10-04Merge branch 'jc/fmt-merge-msg-suppress-destination'Libravatar Junio C Hamano1-1/+1
Docfix. * jc/fmt-merge-msg-suppress-destination: config/fmt-merge-msg.txt: drop space in quote
2020-10-04Merge branch 'tb/upload-pack-filters'Libravatar Junio C Hamano1-1/+1
Hotfix. * tb/upload-pack-filters: config/uploadpack.txt: fix typo in `--filter=tree:<n>`
2020-10-04Merge branch 'eg/mailinfo-doc-scissors'Libravatar Junio C Hamano1-4/+3
The explanation of the "scissors line" has been clarified. * eg/mailinfo-doc-scissors: Doc: show example scissors line
2020-10-01format-patch: teach format.useAutoBase "whenAble" optionLibravatar Jacob Keller1-1/+3
The format.useAutoBase configuration option exists to allow users to enable '--base=auto' for format-patch by default. This can sometimes lead to poor workflow, due to unexpected failures when attempting to format an ancient patch: $ git format-patch -1 <an old commit> fatal: base commit shouldn't be in revision list This can be very confusing, as it is not necessarily immediately obvious that the user requested a --base (since this was in the configuration, not on the command line). We do want --base=auto to fail when it cannot provide a suitable base, as it would be equally confusing if a formatted patch did not include the base information when it was requested. Teach format.useAutoBase a new mode, "whenAble". This mode will cause format-patch to attempt to include a base commit when it can. However, if no valid base commit can be found, then format-patch will continue formatting the patch without a base commit. In order to avoid making yet another branch name unusable with --base, do not teach --base=whenAble or --base=whenable. Instead, refactor the base_commit option to use a callback, and rely on the global configuration variable auto_base. This does mean that a user cannot request this optional base commit generation from the command line. However, this is likely not too valuable. If the user requests base information manually, they will be immediately informed of the failure to acquire a suitable base commit. This allows the user to make an informed choice about whether to continue the format. Add tests to cover the new mode of operation for --base. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30refspec: add support for negative refspecsLibravatar Jacob Keller1-0/+16
Both fetch and push support pattern refspecs which allow fetching or pushing references that match a specific pattern. Because these patterns are globs, they have somewhat limited ability to express more complex situations. For example, suppose you wish to fetch all branches from a remote except for a specific one. To allow this, you must setup a set of refspecs which match only the branches you want. Because refspecs are either explicit name matches, or simple globs, many patterns cannot be expressed. Add support for a new type of refspec, referred to as "negative" refspecs. These are prefixed with a '^' and mean "exclude any ref matching this refspec". They can only have one "side" which always refers to the source. During a fetch, this refers to the name of the ref on the remote. During a push, this refers to the name of the ref on the local side. With negative refspecs, users can express more complex patterns. For example: git fetch origin refs/heads/*:refs/remotes/origin/* ^refs/heads/dontwant will fetch all branches on origin into remotes/origin, but will exclude fetching the branch named dontwant. Refspecs today are commutative, meaning that order doesn't expressly matter. Rather than forcing an implied order, negative refspecs will always be applied last. That is, in order to match, a ref must match at least one positive refspec, and match none of the negative refspecs. This is similar to how negative pathspecs work. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-29Eighteenth batchLibravatar Junio C Hamano1-0/+24
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-29Merge branch 'jk/make-protocol-v2-the-default'Libravatar Junio C Hamano2-6/+1
The transport protocol v2 has become the default again. * jk/make-protocol-v2-the-default: protocol: re-enable v2 protocol by default
2020-09-29Merge branch 'tb/bloom-improvements'Libravatar Junio C Hamano4-1/+18
"git commit-graph write" learned to limit the number of bloom filters that are computed from scratch with the --max-new-filters option. * tb/bloom-improvements: commit-graph: introduce 'commitGraph.maxNewFilters' builtin/commit-graph.c: introduce '--max-new-filters=<n>' commit-graph: rename 'split_commit_graph_opts' bloom: encode out-of-bounds filters as non-empty bloom/diff: properly short-circuit on max_changes bloom: use provided 'struct bloom_filter_settings' bloom: split 'get_bloom_filter()' in two commit-graph.c: store maximum changed paths commit-graph: respect 'commitGraph.readChangedPaths' t/helper/test-read-graph.c: prepare repo settings commit-graph: pass a 'struct repository *' in more places t4216: use an '&&'-chain commit-graph: introduce 'get_bloom_filter_settings()'
2020-09-29Merge branch 'bc/faq-misc'Libravatar Junio C Hamano1-0/+86
More FAQ entries. * bc/faq-misc: docs: explain how to deal with files that are always modified docs: explain why reverts are not always applied on merge docs: explain why squash merges are broken with long-running branches
2020-09-28Doc: show example scissors lineLibravatar Evan Gates1-4/+3
The text tries to say the code accepts many variations that look remotely like scissors and perforation marks, but gives too little detail for users to decide what is and what is not taken as a scissors line for themselves. Instead of describing the heuristics more, just spell out what will always be accepted, namely "-- >8 --", as it would not help users to give them more choices and flexibility and be "creative" in their scissors line. Signed-off-by: Evan Gates <evan.gates@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27config/uploadpack.txt: fix typo in `--filter=tree:<n>`Libravatar Martin Ågren1-1/+1
That should be a ":", not a second "=". While at it, refer to the placeholder "<n>" as "<n>", not "n" (see, e.g., the entry just before this one). Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27config/fmt-merge-msg.txt: drop space in quoteLibravatar Martin Ågren1-1/+1
We document how `merge.suppressDest` can be used to omit " into <branch name>" from the title of the merge message. It is true that we omit the space character before "into", but that lone double quote character risks ending up on the wrong side of a line break, looking a bit out of place. This currently happens with, e.g., 80-character terminals. Drop that leading quoted space. The result should be just as clear about how this option affects the formatted message. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: allow multiple groups to be specifiedLibravatar Jeff King1-0/+7
Now that shortlog supports reading from trailers, it can be useful to combine counts from multiple trailers, or between trailers and authors. This can be done manually by post-processing the output from multiple runs, but it's non-trivial to make sure that each name/commit pair is counted only once. This patch teaches shortlog to accept multiple --group options on the command line, and pull data from all of them. That makes it possible to run: git shortlog -ns --group=author --group=trailer:co-authored-by to get a shortlog that counts authors and co-authors equally. The implementation is mostly straightforward. The "group" enum becomes a bitfield, and the trailer key becomes a list. I didn't bother implementing the multi-group semantics for reading from stdin. It would be possible to do, but the existing matching code makes it awkward, and I doubt anybody cares. The duplicate suppression we used for trailers now covers authors and committers as well (though in non-trailer single-group mode we can skip the hash insertion and lookup, since we only see one value per commit). There is one subtlety: we now care about the case when no group bit is set (in which case we default to showing the author). The caller in builtin/log.c needs to be adapted to ask explicitly for authors, rather than relying on shortlog_init(). It would be possible with some gymnastics to make this keep working as-is, but it's not worth it for a single caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: parse trailer identsLibravatar Jeff King1-3/+4
Trailers don't necessarily contain name/email identity values, so shortlog has so far treated them as opaque strings. However, since many trailers do contain identities, it's useful to treat them as such when they can be parsed. That lets "-e" work as usual, as well as mailmap. When they can't be parsed, we'll continue with the old behavior of treating them as a single string (there's no new test for that here, since the existing tests cover a trailer like this). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: de-duplicate trailer valuesLibravatar Jeff King1-1/+2
The current documentation is vague about what happens with --group=trailer:signed-off-by when we see a commit with: Signed-off-by: One Signed-off-by: Two Signed-off-by: One We clearly should credit both "One" and "Two", but should "One" get credited twice? The current code does so, but mostly because that was the easiest thing to do. It's probably more useful to count each commit at most once. This will become especially important when we allow values from multiple sources in a future patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: match commit trailers with --groupLibravatar Jeff King1-0/+13
If a project uses commit trailers, this patch lets you use shortlog to see who is performing each action. For example, running: git shortlog -ns --group=trailer:reviewed-by in git.git shows who has reviewed. You can even use a custom format to see things like who has helped whom: git shortlog --format="...helped %an (%ad)" \ --group=trailer:helped-by Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-27shortlog: add grouping optionLibravatar Jeff King1-1/+8
In preparation for adding more grouping types, let's refactor the committer/author grouping code and add a user-facing option that binds them together. In particular: - the main option is now "--group", to make it clear that the various group types are mutually exclusive. The "--committer" option is an alias for "--group=committer". - we keep an enum rather than a binary flag, to prepare for more values - we prefer switch statements to ternary assignment, since other group types will need more custom code Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25Seventeenth batchLibravatar Junio C Hamano1-0/+19
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25Merge branch 'jx/proc-receive-hook'Libravatar Junio C Hamano4-6/+136
"git receive-pack" that accepts requests by "git push" learned to outsource most of the ref updates to the new "proc-receive" hook. * jx/proc-receive-hook: doc: add documentation for the proc-receive hook transport: parse report options for tracking refs t5411: test updates of remote-tracking branches receive-pack: new config receive.procReceiveRefs doc: add document for capability report-status-v2 New capability "report-status-v2" for git-push receive-pack: feed report options to post-receive receive-pack: add new proc-receive hook t5411: add basic test cases for proc-receive hook transport: not report a non-head push as a branch
2020-09-25Merge branch 'ds/maintenance-part-1'Libravatar Junio C Hamano5-5/+104
A "git gc"'s big brother has been introduced to take care of more repository maintenance tasks, not limited to the object database cleaning. * ds/maintenance-part-1: maintenance: add trace2 regions for task execution maintenance: add auto condition for commit-graph task maintenance: use pointers to check --auto maintenance: create maintenance.<task>.enabled config maintenance: take a lock on the objects directory maintenance: add --task option maintenance: add commit-graph task maintenance: initialize task array maintenance: replace run_auto_gc() maintenance: add --quiet option maintenance: create basic maintenance runner
2020-09-25protocol: re-enable v2 protocol by defaultLibravatar Jeff King2-6/+1
Protocol v2 became the default in v2.26.0 via 684ceae32d (fetch: default to protocol version 2, 2019-12-23). More widespread use turned up a regression in negotiation. That was fixed in v2.27.0 via 4fa3f00abb (fetch-pack: in protocol v2, in_vain only after ACK, 2020-04-27), but we also reverted the default to v0 as a precuation in 11c7f2a30b (Revert "fetch: default to protocol version 2", 2020-04-22). In v2.28.0, we re-enabled it for experimental users with 3697caf4b9 (config: let feature.experimental imply protocol.version=2, 2020-05-20) and haven't heard any complaints. v2.28 has only been out for 2 months, but I'd generally expect people turning on feature.experimental to also stay pretty up-to-date. So we're not likely to collect much more data by waiting. In addition, we have no further reports from people running v2.26.0, and of course some people have been setting protocol.version manually for ages. Let's move forward with v2 as the default again. It's possible there are still lurking bugs, but we won't know until it gets more widespread use. And we can find and squash them just like any other bug at this point. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-22Sixteenth batchLibravatar Junio C Hamano1-0/+31
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-22Merge branch 'al/ref-filter-merged-and-no-merged'Libravatar Junio C Hamano4-13/+28
"git for-each-ref" and friends that list refs used to allow only one --merged or --no-merged to filter them; they learned to take combination of both kind of filtering. * al/ref-filter-merged-and-no-merged: Doc: prefer more specific file name ref-filter: make internal reachable-filter API more precise ref-filter: allow merged and no-merged filters Doc: cover multiple contains/no-contains filters t3201: test multiple branch filter combinations
2020-09-22Merge branch 'cd/commit-graph-doc'Libravatar Junio C Hamano1-1/+1
Doc update. * cd/commit-graph-doc: commit-graph-format.txt: fix no-parent value
2020-09-22Merge branch 'ls/mergetool-meld-auto-merge'Libravatar Junio C Hamano1-0/+10
The 'meld' backend of the "git mergetool" learned to give the underlying 'meld' the '--auto-merge' option, which would help reduce the amount of text that requires manual merging. * ls/mergetool-meld-auto-merge: mergetool: allow auto-merge for meld to follow the vim-diff behavior
2020-09-22Merge branch 'hn/refs-trace-backend'Libravatar Junio C Hamano1-0/+4
Developer support. * hn/refs-trace-backend: refs: add GIT_TRACE_REFS debugging mechanism
2020-09-22Merge branch 'jt/threaded-index-pack'Libravatar Junio C Hamano1-1/+1
"git index-pack" learned to resolve deltified objects with greater parallelism. * jt/threaded-index-pack: index-pack: make quantum of work smaller index-pack: make resolve_delta() assume base data index-pack: calculate {ref,ofs}_{first,last} early index-pack: remove redundant child field index-pack: unify threaded and unthreaded code index-pack: remove redundant parameter Documentation: deltaBaseCacheLimit is per-thread
2020-09-20docs: explain how to deal with files that are always modifiedLibravatar brian m. carlson1-0/+33
Users frequently have problems where two filenames differ only in case, causing one of those files to show up consistently as being modified. Let's add a FAQ entry that explains how to deal with that. In addition, let's explain another common case where files are consistently modified, which is when files using a smudge or clean filter have not been run through that filter. Explain the way to fix this as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-20docs: explain why reverts are not always applied on mergeLibravatar brian m. carlson1-0/+21
A common scenario is for a user to apply a change to one branch and cherry-pick it into another, then later revert it in the first branch. This results in the change being present when the two branches are merged, which is confusing to many users. We already have documentation for how this works in `git merge`, but it is clear from the frequency with which this is asked that it's hard to grasp. We also don't explain to users that they are better off doing a rebase in this case, which will do what they intended. Let's add an entry to the FAQ telling users what's happening and advising them to use rebase here. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-20docs: explain why squash merges are broken with long-running branchesLibravatar brian m. carlson1-0/+32
In many projects, squash merges are commonly used, primarily to keep a tidy history in the face of developers who do not use logically independent, bisectable commits. As common as this is, this tends to cause significant problems when squash merges are used to merge long-running branches due to the lack of any new merge bases. Even very experienced developers may make this mistake, so let's add a FAQ entry explaining why this is problematic and explaining that regular merge commits should be used to merge two long-running branches. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-19archive: add --add-fileLibravatar René Scharfe1-0/+6
Allow users to append non-tracked files. This simplifies the generation of source packages with a few extra files, e.g. containing version information. They get the same access times and user information as tracked files. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18Fifteenth batchLibravatar Junio C Hamano1-1/+28
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18Merge branch 'es/wt-add-detach'Libravatar Junio C Hamano2-0/+14
"git worktree add" learns that the "-d" is a synonym to "--detach" option to create a new worktree without being on a branch. * es/wt-add-detach: git-worktree.txt: discuss branch-based vs. throwaway worktrees worktree: teach `add` to recognize -d as shorthand for --detach git-checkout.txt: document -d short option for --detach
2020-09-18Doc: prefer more specific file nameLibravatar Aaron Lipman4-3/+3
Change filters.txt to ref-reachability-filters.txt in order to avoid squatting on a file name that might be useful for another purpose. Signed-off-by: Aaron Lipman <alipman88@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18commit-graph: introduce 'commitGraph.maxNewFilters'Libravatar Taylor Blau2-1/+6
Introduce a configuration variable to specify a default value for the recently-introduce '--max-new-filters' option of 'git commit-graph write'. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-18builtin/commit-graph.c: introduce '--max-new-filters=<n>'Libravatar Taylor Blau1-0/+6
Introduce a command-line flag to specify the maximum number of new Bloom filters that a 'git commit-graph write' is willing to compute from scratch. Prior to this patch, a commit-graph write with '--changed-paths' would compute Bloom filters for all selected commits which haven't already been computed (i.e., by a previous commit-graph write with '--split' such that a roll-up or replacement is performed). This behavior can cause prohibitively-long commit-graph writes for a variety of reasons: * There may be lots of filters whose diffs take a long time to generate (for example, they have close to the maximum number of changes, diffing itself takes a long time, etc). * Old-style commit-graphs (which encode filters with too many entries as not having been computed at all) cause us to waste time recomputing filters that appear to have not been computed only to discover that they are too-large. This can make the upper-bound of the time it takes for 'git commit-graph write --changed-paths' to be rather unpredictable. To make this command behave more predictably, introduce '--max-new-filters=<n>' to allow computing at most '<n>' Bloom filters from scratch. This lets "computing" already-known filters proceed quickly, while bounding the number of slow tasks that Git is willing to do. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17bloom: encode out-of-bounds filters as non-emptyLibravatar Taylor Blau1-1/+1
When a changed-path Bloom filter has either zero, or more than a certain number (commonly 512) of entries, the commit-graph machinery encodes it as "missing". More specifically, it sets the indices adjacent in the BIDX chunk as equal to each other to indicate a "length 0" filter; that is, that the filter occupies zero bytes on disk. This has heretofore been fine, since the commit-graph machinery has no need to care about these filters with too few or too many changed paths. Both cases act like no filter has been generated at all, and so there is no need to store them. In a subsequent commit, however, the commit-graph machinery will learn to only compute Bloom filters for some commits in the current commit-graph layer. This is a change from the current implementation which computes Bloom filters for all commits that are in the layer being written. Critically for this patch, only computing some of the Bloom filters means adding a third state for length 0 Bloom filters: zero entries, too many entries, or "hasn't been computed". It will be important for that future patch to distinguish between "not representable" (i.e., zero or too-many changed paths), and "hasn't been computed". In particular, we don't want to waste time recomputing filters that have already been computed. To that end, change how we store Bloom filters in the "computed but not representable" category: - Bloom filters with no entries are stored as a single byte with all bits low (i.e., all queries to that Bloom filter will return "definitely not") - Bloom filters with too many entries are stored as a single byte with all bits set high (i.e., all queries to that Bloom filter will return "maybe"). These rules are sufficient to not incur a behavior change by changing the on-disk representation of these two classes. Likewise, no specification changes are necessary for the commit-graph format, either: - Filters that were previously empty will be recomputed and stored according to the new rules, and - old clients reading filters generated by new clients will interpret the filters correctly and be none the wiser to how they were generated. Clients will invoke the Bloom machinery in more cases than before, but this can be addressed by returning a NULL filter when all bits are set high. This can be addressed in a future patch. Note that this does increase the size of on-disk commit-graphs, but far less than other proposals. In particular, this is generally more efficient than storing a bitmap for which commits haven't computed their Bloom filters. Storing a bitmap incurs a penalty of one bit per commit, whereas storing explicit filters as above incurs a penalty of one byte per too-large or empty commit. In practice, these boundary commits likely occupy a small proportion of the overall number of commits, and so the size penalty is likely smaller than storing a bitmap for all commits. See, for example, these relative proportions of such boundary commits (collected by SZEDER Gábor): | Percentage of | commit-graph | | | commits modifying | file size | | ├────────┬──────────────┼───────────────────┤ pct. | | 0 path | >= 512 paths | before | after | change | ┌────────────────┼────────┼──────────────┼─────────┼─────────┼───────────┤ | android-base | 13.20% | 0.13% | 37.468M | 37.534M | +0.1741 % | | cmssw | 0.15% | 0.23% | 17.118M | 17.119M | +0.0091 % | | cpython | 3.07% | 0.01% | 7.967M | 7.971M | +0.0423 % | | elasticsearch | 0.70% | 1.00% | 8.833M | 8.835M | +0.0128 % | | gcc | 0.00% | 0.08% | 16.073M | 16.074M | +0.0030 % | | gecko-dev | 0.14% | 0.64% | 59.868M | 59.874M | +0.0105 % | | git | 0.11% | 0.02% | 3.895M | 3.895M | +0.0020 % | | glibc | 0.02% | 0.10% | 3.555M | 3.555M | +0.0021 % | | go | 0.00% | 0.07% | 3.186M | 3.186M | +0.0018 % | | homebrew-cask | 0.40% | 0.02% | 7.035M | 7.035M | +0.0065 % | | homebrew-core | 0.01% | 0.01% | 11.611M | 11.611M | +0.0002 % | | jdk | 0.26% | 5.64% | 5.537M | 5.540M | +0.0590 % | | linux | 0.01% | 0.51% | 63.735M | 63.740M | +0.0073 % | | llvm-project | 0.12% | 0.03% | 25.515M | 25.516M | +0.0050 % | | rails | 0.10% | 0.10% | 6.252M | 6.252M | +0.0027 % | | rust | 0.07% | 0.17% | 9.364M | 9.364M | +0.0033 % | | tensorflow | 0.09% | 1.02% | 7.009M | 7.010M | +0.0158 % | | webkit | 0.05% | 0.31% | 17.405M | 17.406M | +0.0047 % | (where the above increase is determined by computing a non-split commit-graph before and after this patch). Given that these projects are all "large" by commit count, the storage cost by writing these filters explicitly is negligible. In the most extreme example, android-base (which has 494,848 commits at the time of writing) would have its commit-graph increase by a modest 68.4 KB. Finally, a test to exercise filters which contain too many changed path entries will be introduced in a subsequent patch. Suggested-by: SZEDER Gábor <szeder.dev@gmail.com> Suggested-by: Jakub Narębski <jnareb@gmail.com> Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17maintenance: add auto condition for commit-graph taskLibravatar Derrick Stolee1-0/+10
Instead of writing a new commit-graph in every 'git maintenance run --auto' process (when maintenance.commit-graph.enalbed is configured to be true), only write when there are "enough" commits not in a commit-graph file. This count is controlled by the maintenance.commit-graph.auto config option. To compute the count, use a depth-first search starting at each ref, and leaving markers using the SEEN flag. If this count reaches the limit, then terminate early and start the task. Otherwise, this operation will peel every ref and parse the commit it points to. If these are all in the commit-graph, then this is typically a very fast operation. Users with many refs might feel a slow-down, and hence could consider updating their limit to be very small. A negative value will force the step to run every time. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-17maintenance: create maintenance.<task>.enabled configLibravatar Derrick Stolee3-5/+17
Currently, a normal run of "git maintenance run" will only run the 'gc' task, as it is the only one enabled. This is mostly for backwards- compatible reasons since "git maintenance run --auto" commands replaced previous "git gc --auto" commands after some Git processes. Users could manually run specific maintenance tasks by calling "git maintenance run --task=<task>" directly. Allow users to customize which steps are run automatically using config. The 'maintenance.<task>.enabled' option then can turn on these other tasks (or turn off the 'gc' task). Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>