summaryrefslogtreecommitdiff
path: root/Documentation
AgeCommit message (Collapse)AuthorFilesLines
2021-09-28builtin/repack.c: make largest pack preferredLibravatar Taylor Blau1-0/+4
When repacking into a geometric series and writing a multi-pack bitmap, it is beneficial to have the largest resulting pack be the preferred object source in the bitmap's MIDX, since selecting the large packs can lead to fewer broken delta chains and better compression. Teach 'git repack' to identify this pack and pass it to the MIDX write machinery in order to mark it as preferred. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: support writing a MIDX while repackingLibravatar Taylor Blau1-4/+10
Teach `git repack` a new `--write-midx` option for callers that wish to persist a multi-pack index in their repository while repacking. There are two existing alternatives to this new flag, but they don't cover our particular use-case. These alternatives are: - Call 'git multi-pack-index write' after running 'git repack', or - Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running 'git repack'. The former works, but introduces a gap in bitmap coverage between repacking and writing a new MIDX (since the repack may have deleted a pack included in the existing MIDX, invalidating it altogether). Setting the 'GIT_TEST_' environment variable is obviously unsupported. In fact, even if it were supported officially, it still wouldn't work, because it generates the MIDX *after* redundant packs have been dropped, leading to the same issue as above. Introduce a new option which eliminates this race by teaching `git repack` to generate the MIDX at the critical point: after the new packs have been written and moved into place, but before the redundant packs have been removed. This option is compatible with `git repack`'s '--bitmap' option (it changes the interpretation to be: "write a bitmap corresponding to the MIDX after one has been generated"). There is a little bit of additional noise in the patch below to avoid repeating ourselves when selecting which packs to delete. Instead of a single loop as before (where we iterate over 'existing_packs', decide if a pack is worth deleting, and if so, delete it), we have two loops (the first where we decide which ones are worth deleting, and the second where we actually do the deleting). This makes it so we have a single check we can make consistently when (1) telling the MIDX which packs we want to exclude, and (2) actually unlinking the redundant packs. There is also a tiny change to short-circuit the body of write_midx_included_packs() when no packs remain in the case of an empty repository. The MIDX code does not handle this, so avoid trying to generate a MIDX covering zero packs in the first place. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28midx: preliminary support for `--refs-snapshot`Libravatar Taylor Blau1-0/+15
To figure out which commits we can write a bitmap for, the multi-pack index/bitmap code does a reachability traversal, marking any commit which can be found in the MIDX as eligible to receive a bitmap. This approach will cause a problem when multi-pack bitmaps are able to be generated from `git repack`, since the reference tips can change during the repack. Even though we ignore commits that don't exist in the MIDX (when doing a scan of the ref tips), it's possible that a commit in the MIDX reaches something that isn't. This can happen when a multi-pack index contains some pack which refers to loose objects (e.g., if a pack was pushed after starting the repack but before generating the MIDX which depends on an object which is stored as loose in the repository, and by definition isn't included in the multi-pack index). By taking a snapshot of the references before we start repacking, we can close that race window. In the above scenario (where we have a packed object pointing at a loose one), we'll either (a) take a snapshot of the references before seeing the packed one, or (b) take it after, at which point we can guarantee that the loose object will be packed and included in the MIDX. This patch does just that. It writes a temporary "reference snapshot", which is a list of OIDs that are at the ref tips before writing a multi-pack bitmap. References that are "preferred" (i.e,. are a suffix of at least one value of the 'pack.preferBitmapTips' configuration) are marked with a special '+'. The format is simple: one line per commit at each tip, with an optional '+' at the beginning (for preferred references, as described above). When provided, the reference snapshot is used to drive bitmap selection instead of the MIDX code doing its own traversal. When it isn't provided, the usual traversal takes place instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/multi-pack-index.c: support `--stdin-packs` modeLibravatar Taylor Blau1-0/+4
To power a new `--write-midx` mode, `git repack` will want to write a multi-pack index containing a certain set of packs in the repository. This new option will be used by `git repack` to write a MIDX which contains only the packs which will survive after the repack (that is, it will exclude any packs which are about to be deleted). This patch effectively exposes the function implemented in the previous commit via the `git multi-pack-index` builtin. An alternative approach would have been to call that function from the `git repack` builtin directly, but this introduces awkward problems around closing and reopening the object store, so the MIDX will be written out-of-process. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01pack-bitmap: write multi-pack bitmapsLibravatar Taylor Blau1-1/+11
Write multi-pack bitmaps in the format described by Documentation/technical/bitmap-format.txt, inferring their presence with the absence of '--bitmap'. To write a multi-pack bitmap, this patch attempts to reuse as much of the existing machinery from pack-objects as possible. Specifically, the MIDX code prepares a packing_data struct that pretends as if a single packfile has been generated containing all of the objects contained within the MIDX. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01midx: avoid opening multiple MIDXs when writingLibravatar Taylor Blau1-0/+2
Opening multiple instance of the same MIDX can lead to problems like two separate packed_git structures which represent the same pack being added to the repository's object store. The above scenario can happen because prepare_midx_pack() checks if `m->packs[pack_int_id]` is NULL in order to determine if a pack has been opened and installed in the repository before. But a caller can construct two copies of the same MIDX by calling get_multi_pack_index() and load_multi_pack_index() since the former manipulates the object store directly but the latter is a lower-level routine which allocates a new MIDX for each call. So if prepare_midx_pack() is called on multiple MIDXs with the same pack_int_id, then that pack will be installed twice in the object store's packed_git pointer. This can lead to problems in, for e.g., the pack-bitmap code, which does something like the following (in pack-bitmap.c:open_pack_bitmap()): struct bitmap_index *bitmap_git = ...; for (p = get_all_packs(r); p; p = p->next) { if (open_pack_bitmap_1(bitmap_git, p) == 0) ret = 0; } which is a problem if two copies of the same pack exist in the packed_git list because pack-bitmap.c:open_pack_bitmap_1() contains a conditional like the following: if (bitmap_git->pack || bitmap_git->midx) { /* ignore extra bitmap file; we can only handle one */ warning("ignoring extra bitmap file: %s", packfile->pack_name); close(fd); return -1; } Avoid this scenario by not letting write_midx_internal() open a MIDX that isn't also pointed at by the object store. So long as this is the case, other routines should prefer to open MIDXs with get_multi_pack_index() or reprepare_packed_git() instead of creating instances on their own. Because get_multi_pack_index() returns `r->object_store->multi_pack_index` if it is non-NULL, we'll only have one instance of a MIDX open at one time, avoiding these problems. To encourage this, drop the `struct multi_pack_index *` parameter from `write_midx_internal()`, and rely instead on the `object_dir` to find (or initialize) the correct MIDX instance. Likewise, replace the call to `close_midx()` with `close_object_store()`, since we're about to replace the MIDX with a new one and should invalidate the object store's memory of any MIDX that might have existed beforehand. Note that this now forbids passing object directories that don't belong to alternate repositories over `--object-dir`, since before we would have happily opened a MIDX in any directory, but now restrict ourselves to only those reachable by `r->objects->multi_pack_index` (and alternate MIDXs that we can see by walking the `next` pointer). As far as I can tell, supporting arbitrary directories with `--object-dir` was a historical accident, since even the documentation says `<alt>` when referring to the value passed to this option. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-01midx: reject empty `--preferred-pack`'sLibravatar Taylor Blau1-3/+3
The soon-to-be-implemented multi-pack bitmap treats object in the first bit position specially by assuming that all objects in the pack it was selected from are also represented from that pack in the MIDX. In other words, the pack from which the first object was selected must also have all of its other objects selected from that same pack in the MIDX in case of any duplicates. But this assumption relies on the fact that there is at least one object in that pack to begin with; otherwise the object in the first bit position isn't from a preferred pack, in which case we can no longer assume that all objects in that pack were also selected from the same pack. Guard this assumption by checking the number of objects in the given preferred pack, and failing if the given pack is empty. To make sure we can safely perform this check, open any packs which are contained in an existing MIDX via prepare_midx_pack(). The same is done for new packs via the add_pack_to_midx() callback, but packs picked up from a previous MIDX will not yet have these opened. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24Documentation: describe MIDX-based bitmapsLibravatar Taylor Blau2-21/+60
Update the technical documentation to describe the multi-pack bitmap format. This patch merely introduces the new format, and describes its high-level ideas. Git does not yet know how to read nor write these multi-pack variants, and so the subsequent patches will: - Introduce code to interpret multi-pack bitmaps, according to this document. - Then, introduce code to write multi-pack bitmaps from the 'git multi-pack-index write' sub-command. Finally, the implementation will gain tests in subsequent patches (as opposed to inline with the patch teaching Git how to write multi-pack bitmaps) to avoid a cyclic dependency. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-16Git 2.33Libravatar Junio C Hamano1-4/+0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11Git 2.33-rc2Libravatar Junio C Hamano1-12/+0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-11Merge branch 'jn/log-m-does-not-imply-p'Libravatar Junio C Hamano1-4/+4
Earlier "git log -m" was changed to always produce patch output, which would break existing scripts, which has been reverted. * jn/log-m-does-not-imply-p: Revert 'diff-merges: let "-m" imply "-p"'
2021-08-09Revert 'diff-merges: let "-m" imply "-p"'Libravatar Jonathan Nieder1-4/+4
This reverts commit f5bfcc823ba242a46e20fb6f71c9fbf7ebb222fe, which made "git log -m" imply "--patch" by default. The logic was that "-m", which makes diff generation for merges perform a diff against each parent, has no use unless I am viewing the diff, so we could save the user some typing by turning on display of the resulting diff automatically. That wasn't expected to adversely affect scripts because scripts would either be using a command like "git diff-tree" that already emits diffs by default or would be combining -m with a diff generation option such as --name-status. By saving typing for interactive use without adversely affecting scripts in the wild, it would be a pure improvement. The problem is that although diff generation options are only relevant for the displayed diff, a script author can imagine them affecting path limiting. For example, I might run git log -w --format=%H -- README hoping to list commits that edited README, excluding whitespace-only changes. In fact, a whitespace-only change is not TREESAME so the use of -w here has no effect (since we don't apply these diff generation flags to the diff_options struct rev_info::pruning used for this purpose), but the documentation suggests that it should work Suppose you specified foo as the <paths>. We shall call commits that modify foo !TREESAME, and the rest TREESAME. (In a diff filtered for foo, they look different and equal, respectively.) and a script author who has not tested whitespace-only changes wouldn't notice. Similarly, a script author could include git log -m --first-parent --format=%H -- README to filter the first-parent history for commits that modified README. The -m is a no-op but it reflects the script author's intent. For example, until 1e20a407fe2 (stash list: stop passing "-m" to "git log", 2021-05-21), "git stash list" did this. As a result, we can't safely change "-m" to imply "-p" without fear of breaking such scripts. Restore the previous behavior. Noticed because Rust's src/bootstrap/bootstrap.py made use of this same construct: https://github.com/rust-lang/rust/pull/87513. That script has been updated to omit the unnecessary "-m" option, but we can expect other scripts in the wild to have similar expectations. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-04The eighth batchLibravatar Junio C Hamano1-1/+16
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-04Merge branch 'ar/doc-markup-fix'Libravatar Junio C Hamano2-2/+2
Doc mark-up fix. * ar/doc-markup-fix: Documentation: render special characters correctly
2021-08-04Merge branch 'pb/merge-autostash-more'Libravatar Junio C Hamano1-1/+2
The local changes stashed by "git merge --autostash" were lost when the merge failed in certain ways, which has been corrected. * pb/merge-autostash-more: merge: apply autostash if merge strategy fails merge: apply autostash if fast-forward fails Documentation: define 'MERGE_AUTOSTASH' merge: add missing word "strategy" to a message
2021-08-04Merge branch 'ab/update-submitting-patches'Libravatar Junio C Hamano1-111/+96
Reorganize and update the SubmitingPatches document. * ab/update-submitting-patches: SubmittingPatches: replace discussion of Travis with GitHub Actions SubmittingPatches: move discussion of Signed-off-by above "send"
2021-08-02Git 2.33-rc0Libravatar Junio C Hamano1-0/+17
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02Merge branch 'fc/pull-no-rebase-merges-theirs-into-ours'Libravatar Junio C Hamano1-1/+1
Documentation fix for "git pull --rebase=no". * fc/pull-no-rebase-merges-theirs-into-ours: doc: pull: fix rebase=false documentation
2021-08-02Merge branch 'jk/config-env-doc'Libravatar Junio C Hamano1-11/+17
Documentation around GIT_CONFIG has been updated. * jk/config-env-doc: doc/git-config: simplify "override" advice for FILES section doc/git-config: clarify GIT_CONFIG environment variable doc/git-config: explain --file instead of referring to GIT_CONFIG
2021-08-02Merge branch 'pb/submodule-recurse-doc'Libravatar Junio C Hamano1-2/+3
Doc update. * pb/submodule-recurse-doc: doc: clarify description of 'submodule.recurse'
2021-07-30Documentation: render special characters correctlyLibravatar Andrei Rybak2-2/+2
Three hyphens are rendered verbatim, so "--" has to be used to produce a dash. There is no double arrow ("<->" is rendered as "<→"), so a left and right arrow "<-->" have to be combined for that. So fix asciidoc output for special characters. This is similar to fixes in commit de82095a95 (doc hash-function-transition: fix asciidoc output, 2021-02-05). Signed-off-by: Andrei Rybak <rybak.a.v@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28The seventh batchLibravatar Junio C Hamano1-0/+43
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28Merge branch 'en/rename-limits-doc'Libravatar Junio C Hamano3-12/+21
Documentation on "git diff -l<n>" and diff.renameLimit have been updated, and the defaults for these limits have been raised. * en/rename-limits-doc: rename: bump limit defaults yet again diffcore-rename: treat a rename_limit of 0 as unlimited doc: clarify documentation for rename/copy limits diff: correct warning message when renameLimit exceeded
2021-07-28Merge branch 'ds/gender-neutral-doc-guidelines'Libravatar Junio C Hamano1-0/+45
A guideline for gender neutral documentation has been added. * ds/gender-neutral-doc-guidelines: CodingGuidelines: recommend gender-neutral description
2021-07-28Merge branch 'dl/diff-merge-base'Libravatar Junio C Hamano1-3/+7
"git diff --merge-base" documentation has been updated. * dl/diff-merge-base: git-diff: fix missing --merge-base docs
2021-07-28Merge branch 'sm/worktree-add-lock'Libravatar Junio C Hamano1-2/+2
"git worktree add --lock" learned to record why the worktree is locked with a custom message. * sm/worktree-add-lock: worktree: teach `add` to accept --reason <string> with --lock worktree: mark lock strings with `_()` for translation t2400: clean up '"add" worktree with lock' test
2021-07-23Documentation: define 'MERGE_AUTOSTASH'Libravatar Philippe Blain1-1/+2
The documentation for 'git merge --abort' and 'git merge --quit' both mention the special ref 'MERGE_AUTOSTASH', but this ref is not formally defined anywhere. Mention it in the description of the '--autostash' option for 'git merge'. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22The sixth batchLibravatar Junio C Hamano1-0/+10
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22Merge branch 'bc/rev-list-without-commit-line'Libravatar Junio C Hamano1-0/+8
"git rev-list" learns to omit the "commit <object-name>" header lines from the output with the `--no-commit-header` option. * bc/rev-list-without-commit-line: rev-list: add option for --pretty=format without header
2021-07-22Merge branch 'ab/gitignore-discovery-doc'Libravatar Junio C Hamano1-6/+5
Doc update. * ab/gitignore-discovery-doc: docs: .gitignore parsing is to the top of the repo
2021-07-22Merge branch 'ab/send-email-optim'Libravatar Junio C Hamano1-3/+0
"git send-email" optimization. * ab/send-email-optim: perl: nano-optimize by replacing Cwd::cwd() with Cwd::getcwd() send-email: move trivial config handling to Perl perl: lazily load some common Git.pm setup code send-email: lazily load modules for a big speedup send-email: get rid of indirect object syntax send-email: use function syntax instead of barewords send-email: lazily shell out to "git var" send-email: lazily load config for a big speedup send-email: copy "config_regxp" into git-send-email.perl send-email: refactor sendemail.smtpencryption config parsing send-email: remove non-working support for "sendemail.smtpssl" send-email tests: test for boolean variables without a value send-email tests: support GIT_TEST_PERL_FATAL_WARNINGS=true
2021-07-22Merge branch 'jk/typofix'Libravatar Junio C Hamano1-1/+1
Typofix. * jk/typofix: doc/rev-list-options: fix duplicate word typo
2021-07-22SubmittingPatches: replace discussion of Travis with GitHub ActionsLibravatar Ævar Arnfjörð Bjarmason1-32/+17
Replace the discussion of Travis CI added in 0e5d028a7a0 (Documentation: add setup instructions for Travis CI, 2016-05-02) with something that covers the GitHub Actions added in 889cacb6897 (ci: configure GitHub Actions for CI/PR, 2020-04-11). The setup is trivial compared to using Travis, and it even works on Windows (that "hopefully soon" comment was probably out-of-date on Travis as well). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22SubmittingPatches: move discussion of Signed-off-by above "send"Libravatar Ævar Arnfjörð Bjarmason1-79/+79
Move the section discussing the addition of a SOB trailer above the section that discusses generating the patch itself. This makes sense as we don't want someone to go through the process of "git format-patch", only to realize late that they should have used "git commit -s" or equivalent. This is a move-only change, no lines here are being altered, only moved around. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-21doc: pull: fix rebase=false documentationLibravatar Felipe Contreras1-1/+1
"git pull --rebase=false" means we merge their history into ours, but it has been described the other way around. Cc: Stephen Haberman <stephen@exigencecorp.com> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> [jc: updated the log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20doc: clarify description of 'submodule.recurse'Libravatar Philippe Blain1-2/+3
The doc for 'submodule.recurse' starts with "Specifies if commands recurse into submodles by default". This is not exactly true of all commands that have a '--recurse-submodules' option. For example, 'git pull --recurse-submodules' does not run 'git pull' in each submodule, but rather runs 'git submodule update --recursive' so that the submodule working trees after the pull matches the commits recorded in the superproject. Clarify that by just saying that it enables '--recurse-submodules'. Note that the way this setting interacts with 'fetch.recurseSubmodules' and 'push.recurseSubmodules', which can have other values than true or false, is already documented since 4da9e99e6e (doc: be more precise on (fetch|push).recurseSubmodules, 2020-04-06). Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20doc/git-config: simplify "override" advice for FILES sectionLibravatar Jeff King1-5/+4
At the end of the FILES section, we indicate that you can override the regular lookup rules with --global, etc. But: - we're missing the --local option - we point to GIT_CONFIG instead of --file, but the latter has much better documentation - we're vague about how the overrides work; the actual option descriptions are much better here So let's just mention the names and point people back to the OPTIONS section. We could perhaps even delete this paragraph entirely, but the presence of the names may give people reading FILES a clue about where to look for more information. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20doc/git-config: clarify GIT_CONFIG environment variableLibravatar Jeff King1-5/+6
The scope and utility of the GIT_CONFIG variable was drastically reduced by dc87183189 (Only use GIT_CONFIG in "git config", not other programs, 2008-06-30). But the documentation in git-config(1) predates that, which makes it rather misleading. These days it is really just another way to say "--file". So let's say that, and explicitly make it clear that it does not impact other Git commands (like GIT_CONFIG_SYSTEM, etc, would). I also bumped it to the bottom of the list of variables, and warned people off of using it. We don't have any plans for deprecation at this point, but there's little point in encouraging people to use it by putting it at the top of the list. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-20doc/git-config: explain --file instead of referring to GIT_CONFIGLibravatar Jeff King1-1/+7
The explanation for the --file option only refers to GIT_CONFIG. This redirection to an environment variable is confusing, but doubly so because the description of GIT_CONFIG is out of date. Let's describe --file from scratch, detailing both the reading and writing behavior as we do for other similar options like --system, etc. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-16The fifth batchLibravatar Junio C Hamano1-0/+29
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-16Merge branch 'ds/gender-neutral-doc'Libravatar Junio C Hamano3-6/+5
Update the documentation not to assume users are of certain gender and adds to guidelines to do so. * ds/gender-neutral-doc: *: fix typos comments: avoid using the gender of our users doc: avoid using the gender of other people
2021-07-16Merge branch 'ab/fetch-negotiate-segv-fix'Libravatar Junio C Hamano2-3/+13
Code recently added to support common ancestry negotiation during "git push" did not sanity check its arguments carefully enough. * ab/fetch-negotiate-segv-fix: fetch: fix segfault in --negotiate-only without --negotiation-tip=* fetch: document the --negotiate-only option send-pack.c: move "no refs in common" abort earlier
2021-07-16CodingGuidelines: recommend gender-neutral descriptionLibravatar Junio C Hamano1-0/+45
Technical writing seeks to convey information with minimal friction. One way that a reader can experience friction is if they encounter a description of "a user" that is later simplified using a gendered pronoun. If the reader does not consider that pronoun to apply to them, then they can experience cognitive dissonance that removes focus from the information. Give some basic tips to guide us avoid unnecessary uses of gendered description. Using a gendered pronoun is appropriate when referring to a specific person. There are acceptable existing uses of gendered pronouns within the Git codebase, such as: * References to real people (e.g. Linus Torvalds, "the Git maintainer"). Do not misgender real people. If there is any doubt to the gender of a person, then avoid using pronouns. * References to fictional people with clear genders (e.g. Alice and Bob). * Sample text used in test cases (e.g t3702, t6432). * The official text of the GPL license contains uses of "he or she", but using singular "they" (or modifying the text in some other way) is not within the scope of the Git project. * Literal email messages in Documentation/howto/ should not be edited for grammatical concerns such as this, unless we update the entire document to fit the standard documentation format. If such an effort is taken on, then the authorship would change and no longer refer to the exact mail message. * External projects consumed in contrib/ should not deviate solely for style reasons. Recommended edits should be contributed to those projects directly. Other cases within the Git project were cleaned up by the previous changes. Co-authored-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15rename: bump limit defaults yet againLibravatar Elijah Newren2-2/+2
These were last bumped in commit 92c57e5c1d29 (bump rename limit defaults (again), 2011-02-19), and were bumped both because processors had gotten faster, and because people were getting ugly merges that caused problems and reporting it to the mailing list (suggesting that folks were willing to spend more time waiting). Since that time: * Linus has continued recommending kernel folks to set diff.renameLimit=0 (maps to 32767, currently) * Folks with repositories with lots of renames were happy to set merge.renameLimit above 32767, once the code supported that, to get correct cherry-picks * Processors have gotten faster * It has been discovered that the timing methodology used last time probably used too large example files. The last point is probably worth explaining a bit more: * The "average" file size used appears to have been average blob size in the linux kernel history at the time (probably v2.6.25 or something close to it). * Since bigger files are modified more frequently, such a computation weights towards larger files. * Larger files may be more likely to be modified over time, but are not more likely to be renamed -- the mean and median blob size within a tree are a bit higher than the mean and median of blob sizes in the history leading up to that version for the linux kernel. * The mean blob size in v2.6.25 was half the average blob size in history leading to that point * The median blob size in v2.6.25 was about 40% of the mean blob size in v2.6.25. * Since the mean blob size is more than double the median blob size, any file as big as the mean will not be compared to any files of median size or less (because they'd be more than 50% dissimilar). * Since it is the number of files compared that provides the O(n^2) behavior, median-sized files should matter more than mean-sized ones. The combined effect of the above is that the file size used in past calculations was likely about 5x too large. Combine that with a CPU performance improvement of ~30%, and we can increase the limits by a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time limits. Keeping the same approximate time limit probably makes sense for diff.renameLimit (there is no progress feedback in e.g. git log -p), but the experience above suggests merge.renameLimit could be extended significantly. In fact, it probably would make sense to have an unlimited default setting for merge.renameLimit, but that would likely need to be coupled with changes to how progress is displayed. (See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/ for details in that area.) For now, let's just bump the approximate time limit from 10s to 1m. (Note: We do not want to use actual time limits, because getting results that depend on how loaded your system is that day feels bad, and because we don't discover that we won't get all the renames until after we've put in a lot of work rather than just upfront telling the user there are too many files involved.) Using the original time limit of 2s for diff.renameLimit, and bumping merge.renameLimit from 10s to 60s, I found the following timings using the simple script at the end of this commit message (on an AWS c5.xlarge which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"): N Timing 1300 1.995s 7100 59.973s So let's round down to nice even numbers and bump the limits from 400->1000, and from 1000->7000. Here is the measure_rename_perf script (adapted from https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/ in particular to avoid triggering the linear handling from basename-guided rename detection): #!/bin/bash n=$1; shift rm -rf repo mkdir repo && cd repo git init -q -b main mkdata() { mkdir $1 for i in `seq 1 $2`; do (sed "s/^/$i /" <../sample echo tag: $1 ) >$1/$i done } mkdata initial $n git add . git commit -q -m initial mkdata new $n git add . cd new for i in *; do git mv $i $i.renamed; done cd .. git rm -q -rf initial git commit -q -m new time git diff-tree -M -l0 --summary HEAD^ HEAD Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15diffcore-rename: treat a rename_limit of 0 as unlimitedLibravatar Elijah Newren1-0/+1
In commit 89973554b52c (diffcore-rename: make diff-tree -l0 mean -l<large>, 2017-11-29), -l0 was given a special magical "large" value, but one which was not large enough for some uses (as can be seen from commit 9f7e4bfa3b6d (diff: remove silent clamp of renameLimit, 2017-11-13). Make 0 (or a negative value) be treated as unlimited instead and update the documentation to mention this. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15doc: clarify documentation for rename/copy limitsLibravatar Elijah Newren3-12/+20
A few places in the docs implied that rename/copy detection is always quadratic or that all (unpaired) files were involved in the quadratic portion of rename/copy detection. The following two commits each introduced an exception to this: 9027f53cb505 (Do linear-time/space rename logic for exact renames, 2007-10-25) bd24aa2f97a0 (diffcore-rename: guide inexact rename detection based on basenames, 2021-02-14) (As a side note, for copy detection, the basename guided inexact rename detection is turned off and the exact renames will only result in sources (without the dests) being removed from the set of files used in quadratic detection. So, for copy detection, the documentation was closer to correct.) Avoid implying that all files involved in rename/copy detection are subject to the full quadratic algorithm. While at it, also note the default values for all these settings. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15worktree: teach `add` to accept --reason <string> with --lockLibravatar Stephen Manz1-2/+2
The default reason stored in the lock file, "added with --lock", is unlikely to be what the user would have given in a separate `git worktree lock` command. Allowing `--reason` to be specified along with `--lock` when adding a working tree gives the user control over the reason for locking without needing a second command. Signed-off-by: Stephen Manz <smanz@alum.mit.edu> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13The fourth batchLibravatar Junio C Hamano1-0/+26
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-13Merge branch 'bk/doc-commit-typofix'Libravatar Junio C Hamano1-1/+1
Doc typo/grammo-fix. * bk/doc-commit-typofix: Documentation: fix typo in the --patch option of the commit command
2021-07-13Merge branch 'fc/push-simple-updates'Libravatar Junio C Hamano1-7/+6
Some code and doc clarification around "git push". * fc/push-simple-updates: doc: push: explain default=simple correctly push: remove unused code in setup_push_upstream() push: simplify setup_push_simple() push: reorganize setup_push_simple() push: copy code to setup_push_simple() push: hedge code of default=simple push: rename !triangular to same_remote