summaryrefslogtreecommitdiff
path: root/Documentation/config/diff.txt
AgeCommit message (Collapse)AuthorFilesLines
2021-07-15rename: bump limit defaults yet againLibravatar Elijah Newren1-1/+1
These were last bumped in commit 92c57e5c1d29 (bump rename limit defaults (again), 2011-02-19), and were bumped both because processors had gotten faster, and because people were getting ugly merges that caused problems and reporting it to the mailing list (suggesting that folks were willing to spend more time waiting). Since that time: * Linus has continued recommending kernel folks to set diff.renameLimit=0 (maps to 32767, currently) * Folks with repositories with lots of renames were happy to set merge.renameLimit above 32767, once the code supported that, to get correct cherry-picks * Processors have gotten faster * It has been discovered that the timing methodology used last time probably used too large example files. The last point is probably worth explaining a bit more: * The "average" file size used appears to have been average blob size in the linux kernel history at the time (probably v2.6.25 or something close to it). * Since bigger files are modified more frequently, such a computation weights towards larger files. * Larger files may be more likely to be modified over time, but are not more likely to be renamed -- the mean and median blob size within a tree are a bit higher than the mean and median of blob sizes in the history leading up to that version for the linux kernel. * The mean blob size in v2.6.25 was half the average blob size in history leading to that point * The median blob size in v2.6.25 was about 40% of the mean blob size in v2.6.25. * Since the mean blob size is more than double the median blob size, any file as big as the mean will not be compared to any files of median size or less (because they'd be more than 50% dissimilar). * Since it is the number of files compared that provides the O(n^2) behavior, median-sized files should matter more than mean-sized ones. The combined effect of the above is that the file size used in past calculations was likely about 5x too large. Combine that with a CPU performance improvement of ~30%, and we can increase the limits by a factor of sqrt(5/(1-.3)) = 2.67, while keeping the original stated time limits. Keeping the same approximate time limit probably makes sense for diff.renameLimit (there is no progress feedback in e.g. git log -p), but the experience above suggests merge.renameLimit could be extended significantly. In fact, it probably would make sense to have an unlimited default setting for merge.renameLimit, but that would likely need to be coupled with changes to how progress is displayed. (See https://lore.kernel.org/git/YOx+Ok%2FEYvLqRMzJ@coredump.intra.peff.net/ for details in that area.) For now, let's just bump the approximate time limit from 10s to 1m. (Note: We do not want to use actual time limits, because getting results that depend on how loaded your system is that day feels bad, and because we don't discover that we won't get all the renames until after we've put in a lot of work rather than just upfront telling the user there are too many files involved.) Using the original time limit of 2s for diff.renameLimit, and bumping merge.renameLimit from 10s to 60s, I found the following timings using the simple script at the end of this commit message (on an AWS c5.xlarge which reports as "Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz"): N Timing 1300 1.995s 7100 59.973s So let's round down to nice even numbers and bump the limits from 400->1000, and from 1000->7000. Here is the measure_rename_perf script (adapted from https://lore.kernel.org/git/20080211113516.GB6344@coredump.intra.peff.net/ in particular to avoid triggering the linear handling from basename-guided rename detection): #!/bin/bash n=$1; shift rm -rf repo mkdir repo && cd repo git init -q -b main mkdata() { mkdir $1 for i in `seq 1 $2`; do (sed "s/^/$i /" <../sample echo tag: $1 ) >$1/$i done } mkdata initial $n git add . git commit -q -m initial mkdata new $n git add . cd new for i in *; do git mv $i $i.renamed; done cd .. git rm -q -rf initial git commit -q -m new time git diff-tree -M -l0 --summary HEAD^ HEAD Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-15doc: clarify documentation for rename/copy limitsLibravatar Elijah Newren1-3/+4
A few places in the docs implied that rename/copy detection is always quadratic or that all (unpaired) files were involved in the quadratic portion of rename/copy detection. The following two commits each introduced an exception to this: 9027f53cb505 (Do linear-time/space rename logic for exact renames, 2007-10-25) bd24aa2f97a0 (diffcore-rename: guide inexact rename detection based on basenames, 2021-02-14) (As a side note, for copy detection, the basename guided inexact rename detection is turned off and the exact renames will only result in sources (without the dests) being removed from the set of files used in quadratic detection. So, for copy detection, the documentation was closer to correct.) Avoid implying that all files involved in rename/copy detection are subject to the full quadratic algorithm. While at it, also note the default values for all these settings. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-08diff: do not show submodule with untracked files as "-dirty"Libravatar Sangeeta Jain1-0/+2
Git diff reports a submodule directory as -dirty even when there are only untracked files in the submodule directory. This is inconsistent with what `git describe --dirty` says when run in the submodule directory in that state. Make `--ignore-submodules=untracked` the default for `git diff` when there is no configuration variable or command line option, so that the command would not give '-dirty' suffix to a submodule whose working tree has untracked files, to make it consistent with `git describe --dirty` that is run in the submodule working tree. And also make `--ignore-submodules=none` the default for `git status` so that the user doesn't end up deleting a submodule that has uncommitted (untracked) files. Signed-off-by: Sangeeta Jain <sangunb09@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24diff: add config option relativeLibravatar Laurent Arnoud1-0/+4
The `diff.relative` boolean option set to `true` shows only changes in the current directory/value specified by the `path` argument of the `relative` option and shows pathnames relative to the aforementioned directory. Teach `--no-relative` to override earlier `--relative` Add for git-format-patch(1) options documentation `--relative` and `--no-relative` Signed-off-by: Laurent Arnoud <laurent@spkdev.net> Acked-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-09Merge branch 'sg/diff-indent-heuristic-non-experimental'Libravatar Junio C Hamano1-1/+1
We promoted the "indent heuristics" that decides where to split diff hunks from experimental to the default a few years ago, but some stale documentation still marked it as experimental, which has been corrected. * sg/diff-indent-heuristic-non-experimental: diff: 'diff.indentHeuristic' is no longer experimental
2019-08-15diff: 'diff.indentHeuristic' is no longer experimentalLibravatar SZEDER Gábor1-1/+1
The indent heuristic started out as experimental, but it's now our default diff heuristic since 33de716387 (diff: enable indent heuristic by default, 2017-05-08). Alas, that commit didn't update the documentation, and the description of the 'diff.indentHeuristic' configuration variable still implies that it's experimental and not the default. Update the description of 'diff.indentHeuristic' to make it clear that it's the default diff heuristic. The description of the related '--indent-heuristic' option has already been updated in bab76141da (diff: --indent-heuristic is no longer experimental, 2017-10-29). Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-09Merge branch 'nd/switch-and-restore'Libravatar Junio C Hamano1-1/+2
Two new commands "git switch" and "git restore" are introduced to split "checking out a branch to work on advancing its history" and "checking out paths out of the index and/or a tree-ish to work on advancing the current history" out of the single "git checkout" command. * nd/switch-and-restore: (46 commits) completion: disable dwim on "git switch -d" switch: allow to switch in the middle of bisect t2027: use test_must_be_empty Declare both git-switch and git-restore experimental help: move git-diff and git-reset to different groups doc: promote "git restore" user-manual.txt: prefer 'merge --abort' over 'reset --hard' completion: support restore t: add tests for restore restore: support --patch restore: replace --force with --ignore-unmerged restore: default to --source=HEAD when only --staged is specified restore: reject invalid combinations with --staged restore: add --worktree and --staged checkout: factor out worktree checkout code restore: disable overlay mode by default restore: make pathspec mandatory restore: take tree-ish from --source option instead checkout: split part of it to new command 'restore' doc: promote "git switch" ...
2019-04-02checkout: split part of it to new command 'switch'Libravatar Nguyễn Thái Ngọc Duy1-1/+2
"git checkout" doing too many things is a source of confusion for many users (and it even bites old timers sometimes). To remedy that, the command will be split into two new ones: switch and restore. The good old "git checkout" command is still here and will be until all (or most of users) are sick of it. See the new man page for the final design of switch. The actual implementation though is still pretty much the same as "git checkout" and not completely aligned with the man page. Following patches will adjust their behavior to match the man page. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07Documentation: turn middle-of-line tabs into spacesLibravatar Martin Ågren1-1/+1
These tabs happen to appear in columns where they don't stand out too much, so the diff here is non-obvious. Some of these are rendered differently by AsciiDoc and Asciidoctor (although the difference might be invisible!), which is how I found a few of them. The remainder were found using `git grep "[a-zA-Z.,)]$TAB[a-zA-Z]"`. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-07config/diff.txt: drop spurious backtickLibravatar Martin Ågren1-1/+1
Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-13Merge branch 'nd/config-split'Libravatar Junio C Hamano1-0/+230
Split the overly large Documentation/config.txt file into million little pieces. This potentially allows each individual piece included into the manual page of the command it affects more easily. * nd/config-split: (81 commits) config.txt: remove config/dummy.txt config.txt: move worktree.* to a separate file config.txt: move web.* to a separate file config.txt: move versionsort.* to a separate file config.txt: move user.* to a separate file config.txt: move url.* to a separate file config.txt: move uploadpack.* to a separate file config.txt: move uploadarchive.* to a separate file config.txt: move transfer.* to a separate file config.txt: move tag.* to a separate file config.txt: move submodule.* to a separate file config.txt: move stash.* to a separate file config.txt: move status.* to a separate file config.txt: move splitIndex.* to a separate file config.txt: move showBranch.* to a separate file config.txt: move sequencer.* to a separate file config.txt: move sendemail-config.txt to config/ config.txt: move reset.* to a separate file config.txt: move rerere.* to a separate file config.txt: move repack.* to a separate file ...
2018-10-29config.txt: move diff-config.txt to config/Libravatar Nguyễn Thái Ngọc Duy1-0/+222
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>