path: root/diff.c
Age | Commit message | Author | Files | Lines
2011-05-04 | Merge branch 'jh/dirstat' into maint | Junio C Hamano | 1 | -5/+37
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-04-12 | --dirstat: In case of renames, use target filename instead of source filename | Johan Herland | 1 | -1/+1
This changes --dirstat analysis to count "damage" toward the target
filename, rather than the source filename. For renames within a
directory, this won't matter to the final output, but when moving files
between directories, the output now lists the target directory rather
than the source directory.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
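A minimal sketch of the choice this commit describes, in C. The function name and the fall-back to the source path when there is no target (for example, a deletion) are assumptions made for illustration; the commit message only says that damage is now counted toward the target filename.

```c
/*
 * Hypothetical helper: pick the path that --dirstat "damage" is
 * attributed to. Prefer the target (post-image) path; fall back to
 * the source path when no target exists (assumed behavior).
 */
static const char *dirstat_name(const char *source_path, const char *target_path)
{
	return target_path ? target_path : source_path;
}
```

With a rename from "old/f.c" to "new/f.c", the sketch attributes the damage to "new/f.c", so the "new" directory is the one that shows up in the output.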
2011-04-11 | Teach --dirstat not to completely ignore rearranged lines within a file | Johan Herland | 1 | -1/+18
Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.

Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there is
no change.

Also, skip --dirstat analysis when the object names are the same (e.g.
for a pure file rename).

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
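The rule described above can be sketched in C as follows. This is not the code from show_dirstat(); the function name, the string object names, and the "removed + added" damage formula are simplifying assumptions used only to show the two special cases (identical object names, and zero computed damage).

```c
#include <string.h>

/*
 * Hypothetical sketch of the damage rule for one file pair.
 *   old_oid / new_oid: hex object names of the pre- and post-image
 *   removed / added:   amounts reported by a content comparison
 */
static unsigned long dirstat_damage(const char *old_oid, const char *new_oid,
				    unsigned long removed, unsigned long added)
{
	unsigned long damage;

	/* Same object name: contents are identical, so skip the analysis. */
	if (!strcmp(old_oid, new_oid))
		return 0;

	damage = removed + added;
	/* The object name changed, so never claim "no damage". */
	if (!damage)
		damage = 1;
	return damage;
}
```

A file whose lines were only rearranged would reach the last branch: the comparison finds nothing removed or added, yet the differing object names force a damage of 1.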
2011-04-11 | --dirstat-by-file: Make it faster and more correct | Johan Herland | 1 | -5/+20
Currently, when using --dirstat-by-file, it first does the full --dirstat
analysis (using diffcore_count_changes()), and then resets 'damage' to 1,
if any damage was found by diffcore_count_changes().

But --dirstat-by-file is not interested in the file damage per se. It only
cares if the file changed at all. In that sense it only cares if the blob
object for a file has changed. We therefore only need to compare the
object names of each file pair in the diff queue and we can skip the
entire --dirstat analysis and simply set 'damage' to 1 for each entry
where the object name has changed.

This makes --dirstat-by-file faster, and also bypasses --dirstat's practice
of ignoring rearranged lines within a file.

The patch also contains an added testcase verifying that --dirstat-by-file
now detects changes that only rearrange lines within a file.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
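As a rough illustration of the by-file shortcut, the C sketch below walks a queue of file pairs and assigns a damage of 1 to every pair whose object names differ, without looking at file contents. The struct and function names are hypothetical and stand in for git's real diff queue.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an entry in the diff queue. */
struct pair {
	const char *old_oid; /* object name of the pre-image */
	const char *new_oid; /* object name of the post-image */
};

/*
 * Sum the per-file damage for --dirstat-by-file: each pair whose
 * object name changed counts as exactly 1, everything else as 0.
 */
static unsigned long by_file_damage(const struct pair *queue, size_t n)
{
	unsigned long total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(queue[i].old_oid, queue[i].new_oid))
			total++;
	return total;
}
```

Because only object names are compared, a file with rearranged lines counts as changed, and no content comparison is run at all.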
2011-03-22 | Remove unused variables | Johannes Schindelin | 1 | -2/+1
Noticed by gcc 4.6.0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 | Fix sparse warnings | Stephen Boyd | 1 | -1/+1
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that
   cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797,
   builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
   builtin/merge-index.c:69, builtin/merge-recursive.c:22
   builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
   builtin/notes.c:822, builtin/pack-redundant.c:596,
   builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
   builtin/remote.c:1512, builtin/remote-ext.c:240,
   builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
   builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're
   only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
   submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
   unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
   url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
   usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL
   pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-19 | Merge branch 'jk/merge-rename-ux' | Junio C Hamano | 1 | -1/+1
* jk/merge-rename-ux:
  pull: propagate --progress to merge
  merge: enable progress reporting for rename detection
  add inexact rename detection progress infrastructure
  commit: stop setting rename limit
  bump rename limit defaults (again)
  merge: improve inexact rename limit warning
2011-03-16 | Merge branch 'jk/diffstat-binary' into maint | Junio C Hamano | 1 | -10/+21
* jk/diffstat-binary:
  diff: don't retrieve binary blobs for diffstat
  diff: handle diffstat of rewritten binary files
2011-03-16 | standardize brace placement in struct definitions | Jonathan Nieder | 1 | -4/+2
In struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:

 struct foo {
	int bar;
	char *baz;
 };

Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.

Linus sayeth:

 Heretic people all over the world have claimed that this inconsistency
 is ... well ... inconsistent, but all right-thinking people know that
 (a) K&R are _right_ and (b) K&R are right.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 | diff: don't retrieve binary blobs for diffstat | Jeff King | 1 | -4/+11
We only need the size, which is much cheaper to get, especially if it is
a big binary file.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 | diff: handle diffstat of rewritten binary files | Jeff King | 1 | -10/+14
The logic in builtin_diffstat assumes that a complete_rewrite pair should
have its lines counted. This is nonsensical for binary files and leads to
confusing things like:

  $ git diff --stat --summary HEAD^ HEAD
   foo.rand |  Bin 4096 -> 4096 bytes
   1 files changed, 0 insertions(+), 0 deletions(-)

  $ git diff --stat --summary -B HEAD^ HEAD
   foo.rand |   34 +++++++++++++++-------------------
   1 files changed, 15 insertions(+), 19 deletions(-)
   rewrite foo.rand (100%)

So let's reorder the function to handle binary files first (which from
diffstat's perspective look like complete rewrites anyway), then rewrites,
then actual diffstats.

There are two bonus prizes to this reorder:

  1. It gets rid of a now-superfluous goto.

  2. The binary case is at the top, which means we can further optimize
     it in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
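The reordering can be pictured with a small C sketch. The enum and function below are hypothetical names, not part of builtin_diffstat; they only show the order in which the three cases are tested, so that a binary file is never line-counted even when -B marks it as a complete rewrite.

```c
/* Hypothetical labels for the three ways a file pair can be counted. */
enum stat_kind {
	STAT_BINARY,  /* report byte sizes, never count lines */
	STAT_REWRITE, /* count every old line as deleted, every new one as added */
	STAT_TEXT     /* run the real line-by-line diff */
};

/* Check the binary case first, then rewrites, then ordinary diffs. */
static enum stat_kind classify_for_diffstat(int is_binary, int complete_rewrite)
{
	if (is_binary)
		return STAT_BINARY;
	if (complete_rewrite)
		return STAT_REWRITE;
	return STAT_TEXT;
}
```

In the confusing example from the commit message, foo.rand is both binary and a 100% rewrite; with this order it is classified as binary and keeps its "Bin 4096 -> 4096 bytes" line.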
2011-02-21 | bump rename limit defaults (again) | Jeff King | 1 | -1/+1
We did this once before in 5070591 (bump rename limit defaults,
2008-04-30). Back then, we were shooting for about 1 second for a
diff/log calculation, and 5 seconds for a merge.

There are a few new things to consider, though:

  1. Average processors are faster now.

  2. We've seen on the mailing list some ugly merges where not using
     inexact rename detection leads to many more conflicts. Merges of
     this size take a long time anyway, so users are probably happy to
     spend a little bit of time computing the renames.

Let's bump the diff/merge default limits from 200/500 to 400/1000. Those
are 2 seconds and 10 seconds respectively on my modern hardware.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
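The commit message gives the new limits but not how a limit is applied. As a hedged illustration only, the C sketch below assumes the limit bounds the number of source/destination pairs that inexact rename detection would have to compare (limit squared); the function name and that rule are assumptions, not something stated in this log.

```c
/*
 * Hypothetical check: give up on inexact rename detection when the
 * number of candidate pairs exceeds limit * limit. A non-positive
 * limit is treated as "no limit" in this sketch.
 */
static int too_many_rename_candidates(long long num_src, long long num_dst,
				      long long limit)
{
	if (limit <= 0)
		return 0;
	return num_src * num_dst > limit * limit;
}
```

Under that assumption, raising the diff limit from 200 to 400 raises the allowed pair count from 40,000 to 160,000, and raising the merge limit from 500 to 1000 raises it from 250,000 to 1,000,000.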
2010-12-21 | Merge branch 'ks/blame-worktree-textconv-cached' | Junio C Hamano | 1 | -2/+2
* ks/blame-worktree-textconv-cached:
  fill_textconv(): Don't get/put cache if sha1 is not valid
  t/t8006: Demonstrate blame is broken when cachetextconv is on
2010-12-19 | fill_textconv(): Don't get/put cache if sha1 is not valid | Kirill Smelkov | 1 | -2/+2
When blaming files in the working tree, the filespec is marked with
!sha1_valid, as we have not given the contents an object name yet. The
function to cache textconv results (keyed on the object name), however,
didn't check this condition, and ended up storing the cached result
under a random object name.

Cc: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Cc: Clément Poulain <clement.poulain@ensimag.imag.fr>
Cc: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Cc: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
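The guard this fix adds can be sketched in C as below. The function and parameter names are hypothetical; the point is only that a cache keyed on the object name must be bypassed, for both lookups and stores, whenever the contents have no valid object name yet.

```c
/*
 * Hypothetical guard: the textconv cache is keyed on the object name,
 * so it may only be read or written when the driver has a cache
 * configured AND the filespec carries a valid object name.
 */
static int may_use_textconv_cache(int driver_has_cache, int sha1_valid)
{
	return driver_has_cache && sha1_valid;
}
```

For a working-tree file being blamed, sha1_valid is false, so the sketch skips the cache and the conversion is simply run without storing its result.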
2010-12-16 | Merge branch 'kb/diff-C-M-synonym' | Junio C Hamano | 1 | -8/+8
* kb/diff-C-M-synonym:
  diff: use "find" instead of "detect" as prefix for long forms of -M and -C
  diff: add --detect-copies-harder as a synonym for --find-copies-harder
2010-12-10 | diff: use "find" instead of "detect" as prefix for long forms of -M and -C | Yann Dirson | 1 | -9/+9
It is more consistent with the existing --find-copies-harder; luckily the
"detect" variant has not appeared in any officially released version of
git.

Signed-off-by: Yann Dirson <ydirson@altern.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-09 | Merge branch 'np/diff-in-corrupt-repository' into maint | Junio C Hamano | 1 | -2/+6
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-12-09 | Merge branch 'cm/diff-check-at-eol' into maint | Junio C Hamano | 1 | -1/+1
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-12-08 | Merge branch 'jk/diff-CBM' | Junio C Hamano | 1 | -3/+3
* jk/diff-CBM:
  diff: report bogus input to -C/-M/-B
2010-11-29 | Merge branch 'np/diff-in-corrupt-repository' | Junio C Hamano | 1 | -2/+6
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-11-29 | Merge branch 'cm/diff-check-at-eol' | Junio C Hamano | 1 | -1/+1
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-11-29 | diff: add --detect-copies-harder as a synonym for --find-copies-harder | Kevin Ballard | 1 | -1/+1
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-24 | Merge branch 'cb/diff-fname-optim' into maint | Junio C Hamano | 1 | -1/+1
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-24 | Merge branch 'jk/no-textconv-symlink' into maint | Junio C Hamano | 1 | -3/+8
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-17 | Merge branch 'cb/diff-fname-optim' | Junio C Hamano | 1 | -1/+1
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-17 | Merge branch 'jk/no-textconv-symlink' | Junio C Hamano | 1 | -3/+8
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-10-26 | Merge branch 'kb/merge-recursive-rename-threshold' | Junio C Hamano | 1 | -6/+25
* kb/merge-recursive-rename-threshold:
  diff: add synonyms for -M, -C, -B
  merge-recursive: option to specify rename threshold

Conflicts:
	Documentation/diff-options.txt
	Documentation/merge-strategies.txt
2010-10-26 | Merge branch 'maint' | Junio C Hamano | 1 | -2/+2
* maint:
  Fix copy-pasted comments related to tree diff handling.
2010-10-25 | Fix copy-pasted comments related to tree diff handling. | Yann Dirson | 1 | -2/+2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 | diff: don't presume empty file when corresponding object is missing | Nicolas Pitre | 1 | -2/+6
The low-level diff code will happily produce totally bogus diff output
with a broken repository via format-patch and friends by treating missing
objects as empty files. Let's prevent that from happening any longer.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 | diff: report bogus input to -C/-M/-B | Jeff King | 1 | -3/+3
We already detect invalid input to these functions, but we simply exit
with an error code, never saying anything as simple as "your input was
wrong". Let's fix that.

Before:

  $ git diff -CM
  $ echo $?
  128

After:

  $ git diff -CM
  error: invalid argument to -C: M
  $ echo $?
  128

There should be no problems with having diff_opt_parse print to stderr,
as there is already precedent in complaining about bogus --color and
--output arguments.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
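The change in behavior can be sketched in C as follows. Both functions are hypothetical and much simpler than git's real option parsing (this parser only accepts an empty argument or a whole number from 0 to 100 with an optional trailing '%'); the sketch only shows the pattern of detecting a bad argument and then saying so on stderr instead of failing silently.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical parser: "" means "use the default"; -1 means junk. */
static int parse_score(const char *arg)
{
	char *end;
	long value;

	if (!*arg)
		return 0;
	value = strtol(arg, &end, 10);
	if (end == arg || value < 0 || value > 100)
		return -1;
	if (*end == '%')
		end++;
	return *end ? -1 : (int)value;
}

/* Parse the argument of -C/-M/-B and complain loudly when it is bogus. */
static int handle_score_opt(char opt, const char *arg)
{
	int score = parse_score(arg);

	if (score < 0)
		fprintf(stderr, "error: invalid argument to -%c: %s\n", opt, arg);
	return score;
}
```

Calling handle_score_opt('C', "M") in this sketch prints the same message as the "After" transcript and returns -1, which a caller could turn into the exit status shown there.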
2010-10-16 | diff --check: correct line numbers of new blank lines at EOF | Christoph Mallon | 1 | -1/+1
The whitespace check printed the value of the wrong variable, i.e. the
beginning of the block of blank lines at the EOF (possibly absent) in the
old file. As "git diff --check" is used by users to check their changes
before making a commit, we should point at the line number in the file
after the change.

Signed-off-by: Christoph Mallon <christoph.mallon@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 | Merge branch 'jc/pickaxe-grep' | Junio C Hamano | 1 | -3/+8
* jc/pickaxe-grep:
  diff/log -G<pattern>: tests
  git log/diff: add -G<regexp> that greps in the patch text
  diff: pass the entire diff-options to diffcore_pickaxe()
  gitdiffcore doc: update pickaxe description
2010-09-29 | diff: trivial fix for --output file error message | Matthieu Moy | 1 | -1/+1
The option argument is either after the equal sign in --output=... or in
the next command-line argument. optarg is the reliable way to access it.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 | diff: add synonyms for -M, -C, -B | Kevin Ballard | 1 | -3/+22
Add new long-form options --detect-renames[=<n>], --detect-copies[=<n>],
and --break-rewrites[=[<n>][/<m>]] as synonyms for the -M, -C, and -B
options (respectively).

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 | merge-recursive: option to specify rename threshold | Kevin Ballard | 1 | -3/+3
The recursive merge strategy turns on rename detection but leaves the
rename threshold at the default. Add a strategy option to allow the user
to specify a rename threshold to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 | do not search functions for patch ID | Clemens Buchacher | 1 | -1/+1
Visual aids, such as the function name in the hunk header, are not
necessary for the purposes of computing a patch ID.

This is a performance optimization.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 | diff: don't use pathname-based diff drivers for symlinks | Jeff King | 1 | -3/+8
When we're diffing symlinks, we consider the contents to be the pathname
that the symlink points to. When a user sets up a userdiff driver like
"*.pdf diff=pdf", their "diff.pdf.*" config generally tells us what to do
with the content of pdf files.

With the current code, we will actually process a symlink like "link.pdf"
using a configured pdf driver, meaning we are using contents which consist
of a pathname with configuration that is expecting contents that consist
of an actual pdf file.

The most noticeable example of this would have been textconv; however, it
was already protected in its own textconv-specific code path. We can still
see the breakage with something like "diff.*.binary", though. You could
also see it with diff.*.funcname, though it is a bit harder to trigger
accidentally there.

This patch adds a check for S_ISREG lower in the callstack than the
textconv-specific check, which should block use of any userdiff config for
non-regular files. We can drop the check in the textconv code, which is
now redundant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
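The effect of the new check can be sketched in C as below. The helper names and the "default" driver label are hypothetical; the mode constants are the conventional file-type bits (100644 for a regular file, 120000 for a symlink), spelled out here so the sketch does not depend on any system header.

```c
/* Conventional file-type bits of a mode, written out for the sketch. */
#define MODE_TYPE_MASK 0170000u
#define MODE_REGULAR   0100000u
#define MODE_SYMLINK   0120000u

/* Equivalent in spirit to S_ISREG(): is this a regular file? */
static int is_regular_file(unsigned mode)
{
	return (mode & MODE_TYPE_MASK) == MODE_REGULAR;
}

/*
 * Hypothetical driver lookup: a pathname-based driver (e.g. "pdf" for
 * "*.pdf diff=pdf") only applies to regular files; anything else,
 * such as a symlink named link.pdf, gets the built-in default.
 */
static const char *pick_diff_driver(unsigned mode, const char *attr_driver)
{
	if (!is_regular_file(mode) || !attr_driver)
		return "default";
	return attr_driver;
}
```

Putting the test in the lookup itself, rather than in each consumer, is what lets a single check cover textconv, binary, and funcname settings alike.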
2010-09-09 | Merge branch 'maint' | Junio C Hamano | 1 | -1/+4
* maint:
  xdiff-interface.c: always trim trailing space from xfuncname matches
  diff.c: call regfree to free memory allocated by regcomp when necessary
2010-09-09 | diff.c: call regfree to free memory allocated by regcomp when necessary | Brandon Casey | 1 | -1/+4
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 | Merge branch 'cb/binary-patch-id' | Junio C Hamano | 1 | -0/+7
* cb/binary-patch-id:
  hash binary sha1 into patch id
2010-08-31 | git log/diff: add -G<regexp> that greps in the patch text | Junio C Hamano | 1 | -2/+7
Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the
"git diff" family of commands. This limits the diff queue to filepairs
whose patch text actually has an added or a deleted line that matches the
given regexp. Unlike "-S<regexp>", changing other parts of the line that
has a substring that matches the given regexp IS counted as a change, as
such a change would appear as one deletion followed by one addition in a
patch text.

Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
that changes the number of occurrences of hits between the preimage and
the postimage to serve as a part of larger toolchain, this is meant to be
used as the top-level Porcelain feature.

The implementation unfortunately has to run "diff" twice if you are
running "log" family of commands to produce patches in the final output
(e.g. "git log -p" or "git format-patch"). I think we _could_ cache the
result in-core if we wanted to, but that would require larger surgery to
the diffcore machinery (i.e. adding an extra pointer in the filepair
structure to keep a pointer to a strbuf around, stuff the textual diff to
the strbuf inside diffgrep_consume(), and make use of it in later stages
when it is available) and it may not be worth it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
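The difference between the two searches can be illustrated with a C sketch. These functions are hypothetical and are not the diffcore implementation: the -S style check compares how often a fixed string occurs before and after, while the -G style check asks whether any added or removed line matches a POSIX extended regexp (the lines are assumed to be passed in without their leading '+' or '-').

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in hay. */
static unsigned count_hits(const char *hay, const char *needle)
{
	size_t len = strlen(needle);
	unsigned n = 0;

	if (!len)
		return 0;
	for (hay = strstr(hay, needle); hay; hay = strstr(hay + len, needle))
		n++;
	return n;
}

/* -S style: a hit only when the number of occurrences changes. */
static int pickaxe_hit(const char *pre, const char *post, const char *needle)
{
	return count_hits(pre, needle) != count_hits(post, needle);
}

/* -G style: a hit when any added or removed line matches the regexp. */
static int grep_hit(const char *const *changed_lines, size_t n,
		    const char *pattern)
{
	regex_t re;
	size_t i;
	int hit = 0;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return 0; /* this sketch treats a bad pattern as "no hit" */
	for (i = 0; i < n && !hit; i++)
		hit = !regexec(&re, changed_lines[i], 0, NULL, 0);
	regfree(&re);
	return hit;
}
```

Changing "return foo(1);" to "return foo(2);" leaves the count of "foo" at one on both sides, so pickaxe_hit() reports nothing, whereas grep_hit() sees "foo" on both the removed and the added line and reports a hit.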
2010-08-31 | diff: pass the entire diff-options to diffcore_pickaxe() | Junio C Hamano | 1 | -1/+1
That would make it easier to give enhanced features to the pickaxe
transformation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-21 | Merge branch 'mm/shortopt-detached' | Junio C Hamano | 1 | -43/+124
* mm/shortopt-detached:
  log: parse separate option for --glob
  log: parse separate options like git log --grep foo
  diff: parse separate options --stat-width n, --stat-name-width n
  diff: split off a function for --stat-* option parsing
  diff: parse separate options like -S foo

Conflicts:
	revision.c
2010-08-18 | Merge branch 'jc/maint-follow-rename-fix' | Junio C Hamano | 1 | -14/+13
* jc/maint-follow-rename-fix:
  log: test for regression introduced in v1.7.2-rc0~103^2~2
  diff --follow: do call diffcore_std() as necessary
  diff --follow: do not waste cycles while recursing
2010-08-18 | Merge branch 'jl/submodule-ignore-diff' | Junio C Hamano | 1 | -7/+34
* jl/submodule-ignore-diff:
  Add tests for the diff.ignoreSubmodules config option
  Add the 'diff.ignoreSubmodules' config setting
  Submodules: Use "ignore" settings from .gitmodules too for diff and status
  Submodules: Add the new "ignore" config option for diff and status

Conflicts:
	diff.c
2010-08-16 | hash binary sha1 into patch id | Clemens Buchacher | 1 | -0/+7
Since commit 2f82f760 (Take binary diffs into account for "git rebase"),
binary files are included in patch ID computation. Binary files are
diffed using the text diff algorithm, however, which has a huge impact
on performance. The following tests performance for a 50000 line file
marked as binary in .gitattributes.

  $ git format-patch --stdout --ignore-if-in-upstream master

  real    0m0.367s
  user    0m0.354s
  sys     0m0.010s

Instead of diffing the binary files, hash the pre- and post-image sha1,
which is just as unique. As a result, performance is much improved.

  $ git format-patch --stdout --ignore-if-in-upstream master

  real    0m0.016s
  user    0m0.015s
  sys     0m0.001s

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13 | diff --follow: do call diffcore_std() as necessary | Junio C Hamano | 1 | -14/+13
Usually, diff frontends populate the output queue with filepairs without
any rename information and call diffcore_std() to sort the renames out.
When --follow is in effect, however, diff-tree family of frontend has a
hack that looks like this:

    diff-tree frontend
    -> diff_tree_sha1()
       . populate diff_queued_diff
       . if --follow is in effect and there is only one change that
         creates the target path, then
         -> try_to_follow_renames()
            -> diff_tree_sha1() with no pathspec but with -C
            -> diffcore_std() to find renames
            . if rename is found, tweak diff_queued_diff and put a
              single filepair that records the found rename there
    -> diffcore_std()
       . tweak elements on diff_queued_diff by
         - rename detection
         - path ordering
         - pickaxe filtering

We need to skip parts of the second call to diffcore_std() that is related
to rename detection, and do so only when try_to_follow_renames() did find
a rename.

Earlier 1da6175 (Make diffcore_std only can run once before a diff_flush,
2010-05-06) tried to deal with this issue incorrectly; it unconditionally
disabled any second call to diffcore_std().

This hopefully fixes the breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 | diff: strip extra "/" when stripping prefix | Jakub Narebski | 1 | -2/+8
There are two ways a user might want to use "diff --relative":

  1. For a file in a directory, like "subdir/file", the user
     can use "--relative=subdir/" to strip the directory.

  2. To strip part of a filename, like "foo-10", they can use
     "--relative=foo-".

We currently handle both of those situations. However, if the user passes
"--relative=subdir" (without the trailing slash), we produce inconsistent
results. For the unified diff format, we collapse the double-slash of
"a//file" correctly into "a/file". But for other formats (raw, stat,
name-status), we end up with "/file".

We can do what the user means here and strip the extra "/" (and only a
slash). We are not hurting any existing users of (2) above with this
behavior change because the existing output for this case was
nonsensical.

Patch by Jakub, tests and commit message by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
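The stripping rule can be sketched in C as follows. The function name and the NULL return for a non-matching path are assumptions made for this illustration; the part taken from the commit message is that, after removing the prefix, exactly one leading "/" is also removed if present.

```c
#include <string.h>

/*
 * Hypothetical helper: return the part of path that follows prefix,
 * additionally dropping one '/' if the remainder starts with it.
 * Returns NULL when path does not start with prefix.
 */
static const char *strip_relative_prefix(const char *path, const char *prefix)
{
	size_t len = strlen(prefix);

	if (strncmp(path, prefix, len))
		return NULL;
	path += len;
	if (*path == '/')
		path++;
	return path;
}
```

With this rule, "--relative=subdir/" and "--relative=subdir" both turn "subdir/file" into "file", and "--relative=foo-" still turns "foo-10" into "10", because no slash follows that prefix.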
2010-08-09 | Add the 'diff.ignoreSubmodules' config setting | Johannes Schindelin | 1 | -1/+5
When you have a lot of submodules checked out, the time penalty to check
for dirty submodules can easily imply a multiplication of the total time
by the factor 20. This makes the difference between almost instantaneous
(< 2 seconds) and unbearably slow (> 50 seconds) here, since the disk
caches are constantly overloaded.

To this end, the submodule.*.ignore config option was introduced, but it
is per-submodule.

This commit introduces a global config setting to set a default
(porcelain) value for the --ignore-submodules option, keeping the
default at 'none'. It can be overridden by the submodule.*.ignore
setting and by the --ignore-submodules option.

Incidentally, this commit fixes an issue with the overriding logic:
multiple --ignore-submodules options would not clear the previously set
flags.

While at it, fix a typo in the documentation for submodule.*.ignore.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
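The fix to the overriding logic can be illustrated with a C sketch. The flag names and the function are hypothetical stand-ins for git's option flags; the sketch shows each new value first clearing whatever an earlier value set, so that the last setting wins regardless of whether it came from the config default or a repeated --ignore-submodules option.

```c
#include <string.h>

/* Hypothetical flag bits for the three "ignore" levels. */
#define IGNORE_UNTRACKED (1u << 0)
#define IGNORE_DIRTY     (1u << 1)
#define IGNORE_ALL       (1u << 2)
#define IGNORE_MASK      (IGNORE_UNTRACKED | IGNORE_DIRTY | IGNORE_ALL)

/*
 * Apply one "none" / "untracked" / "dirty" / "all" value on top of
 * the current flags, leaving unrelated bits untouched.
 */
static unsigned apply_ignore_submodules(unsigned flags, const char *arg)
{
	flags &= ~IGNORE_MASK; /* forget what an earlier setting chose */
	if (!strcmp(arg, "all"))
		flags |= IGNORE_ALL;
	else if (!strcmp(arg, "untracked"))
		flags |= IGNORE_UNTRACKED;
	else if (!strcmp(arg, "dirty"))
		flags |= IGNORE_DIRTY;
	/* "none" (and, in this sketch, anything else) leaves all bits clear */
	return flags;
}
```

Applying "all" and then "untracked" leaves only the untracked bit set; without the initial clearing step both bits would remain, which is the kind of stale state the commit message describes.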