summaryrefslogtreecommitdiff
path: root/diff.c
AgeCommit message (Collapse)AuthorFilesLines
2011-08-17Merge branch 'rc/histogram-diff'Libravatar Junio C Hamano1-0/+2
* rc/histogram-diff: xdiff/xhistogram: drop need for additional variable xdiff/xhistogram: rely on xdl_trim_ends() xdiff/xhistogram: rework handling of recursed results xdiff: do away with xdl_mmfile_next() Make test number unique xdiff/xprepare: use a smaller sample size for histogram diff xdiff/xprepare: skip classification teach --histogram to diff t4033-diff-patience: factor out tests xdiff/xpatience: factor out fall-back-diff function xdiff/xprepare: refactor abort cleanups xdiff/xprepare: use memset()
2011-07-19Merge branch 'jc/zlib-wrap'Libravatar Junio C Hamano1-5/+5
* jc/zlib-wrap: zlib: allow feeding more than 4GB in one go zlib: zlib can only process 4GB at a time zlib: wrap deflateBound() too zlib: wrap deflate side of the API zlib: wrap inflateInit2 used to accept only for gzip format zlib: wrap remaining calls to direct inflate/inflateEnd zlib wrapper: refactor error message formatter Conflicts: sha1_file.c
2011-07-12teach --histogram to diffLibravatar Tay Ray Chuan1-0/+2
Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show that it is faster than its --patience cousin, as well as the default Meyers algorithm. The implementation has been reworked to use structs and pointers, instead of bitmasks, thus doing away with JGit's 2^28 line limit. We also use xdiff's default hash table implementation (xdl_hash_bits() with XDL_HASHLONG()) for convenience. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-29Merge branch 'mg/diff-stat-count'Libravatar Junio C Hamano1-9/+45
* mg/diff-stat-count: diff --stat-count: finishing touches diff-options.txt: describe --stat-{width,name-width,count} diff: introduce --stat-lines to limit the stat lines diff.c: omit hidden entries from namelen calculation with --stat
2011-06-29Merge branch 'jk/combine-diff-binary-etc'Libravatar Junio C Hamano1-13/+1
* jk/combine-diff-binary-etc: combine-diff: respect textconv attributes refactor get_textconv to not require diff_filespec combine-diff: handle binary files as binary combine-diff: calculate mode_differs earlier combine-diff: split header printing into its own function
2011-06-10zlib: zlib can only process 4GB at a timeLibravatar Junio C Hamano1-1/+1
The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap deflateBound() tooLibravatar Junio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap deflate side of the APILibravatar Junio C Hamano1-3/+3
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-06Merge branch 'jk/diff-not-so-quick'Libravatar Junio C Hamano1-0/+7
* jk/diff-not-so-quick: diff: futureproof "stop feeding the backend early" logic diff_tree: disable QUICK optimization with diff filter Conflicts: diff.c
2011-05-31Merge branch 'jc/rename-degrade-cc-to-c' into maintLibravatar Junio C Hamano1-0/+26
* jc/rename-degrade-cc-to-c: diffcore-rename: fall back to -C when -C -C busts the rename limit diffcore-rename: record filepair for rename src diffcore-rename: refactor "too many candidates" logic builtin/diff.c: remove duplicated call to diff_result_code()
2011-05-31diff: futureproof "stop feeding the backend early" logicLibravatar Junio C Hamano1-0/+7
Refactor the "do not stop feeding the backend early" logic into a small helper function and use it in both run_diff_files() and diff_tree() that has the stop-early optimization. We may later add other types of diffcore transformation that require to look at the whole result like diff-filter does, and having the logic in a single place is essential for longer term maintainability. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27diff --stat-count: finishing touchesLibravatar Junio C Hamano1-2/+4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27diff: introduce --stat-lines to limit the stat linesLibravatar Michael J Gruber1-4/+34
Often one is interested in the full --stat output only for commits which change a few files, but not others, because larger restructuring gives a --stat which fills a few screens. Introduce a new option --stat-count=<count> which limits the --stat output to the first <count> lines, followed by a "..." line. It can also be given as the third parameter in --stat=<width>,<name-width>,<count>. Also, the unstuck form is supported analogous to the other two stat parameters. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27diff.c: omit hidden entries from namelen calculation with --statLibravatar Michael J Gruber1-5/+9
Currently, --stat calculates the longest name from all items but then drops some (mode changes) from the output later on. Instead, drop them from the namelen generation and calculation. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-26Merge branch 'jm/maint-diff-words-with-sbe' into maintLibravatar Junio C Hamano1-2/+10
* jm/maint-diff-words-with-sbe: do not read beyond end of malloc'd buffer
2011-05-23refactor get_textconv to not require diff_filespecLibravatar Jeff King1-13/+1
This function actually does two things: 1. Load the userdiff driver for the filespec. 2. Decide whether the driver has a textconv component, and initialize the textconv cache if applicable. Only part (1) requires the filespec object, and some callers may not have a filespec at all. So let's split them it into two functions, and put part (2) with the userdiff code, which is a better fit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-23Merge branch 'jm/maint-diff-words-with-sbe'Libravatar Junio C Hamano1-2/+10
* jm/maint-diff-words-with-sbe: do not read beyond end of malloc'd buffer
2011-05-20do not read beyond end of malloc'd bufferLibravatar Jim Meyering1-2/+10
With diff.suppress-blank-empty=true, "git diff --word-diff" would output data that had been read from uninitialized heap memory. The problem was that fn_out_consume did not account for the possibility of a line with length 1, i.e., the empty context line that diff.suppress-blank-empty=true converts from " \n" to "\n". Since it assumed there would always be a prefix character (the space), it decremented "len" unconditionally, thus passing len=0 to emit_line, which would then blindly call emit_line_0 with len=-1 which would pass that value on to fwrite as SIZE_MAX. Boom. Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-13Merge branch 'jh/dirstat-lines'Libravatar Junio C Hamano1-19/+152
* jh/dirstat-lines: Mark dirstat error messages for translation Improve error handling when parsing dirstat parameters New --dirstat=lines mode, doing dirstat analysis based on diffstat Allow specifying --dirstat cut-off percentage as a floating point number Add config variable for specifying default --dirstat behavior Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file Make --dirstat=0 output directories that contribute < 0.1% of changes Add several testcases for --dirstat and friends
2011-05-13Merge branch 'jc/fix-diff-files-unmerged' into maintLibravatar Junio C Hamano1-6/+6
* jc/fix-diff-files-unmerged: diff-files: show unmerged entries correctly diff: remove often unused parameters from diff_unmerge() diff.c: return filepair from diff_unmerge() test: use $_z40 from test-lib
2011-05-06Merge branch 'jc/fix-diff-files-unmerged'Libravatar Junio C Hamano1-6/+6
* jc/fix-diff-files-unmerged: diff-files: show unmerged entries correctly diff: remove often unused parameters from diff_unmerge() diff.c: return filepair from diff_unmerge() test: use $_z40 from test-lib
2011-05-04Merge branch 'jh/dirstat' into maintLibravatar Junio C Hamano1-5/+37
* jh/dirstat: --dirstat: In case of renames, use target filename instead of source filename Teach --dirstat not to completely ignore rearranged lines within a file --dirstat-by-file: Make it faster and more correct --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-04-29Mark dirstat error messages for translationLibravatar Johan Herland1-4/+4
Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29Improve error handling when parsing dirstat parametersLibravatar Johan Herland1-30/+41
When encountering errors or unknown tokens while parsing parameters to the --dirstat option, it makes sense to die() with an error message informing the user of which parameter did not make sense. However, when parsing the diff.dirstat config variable, we cannot simply die(), but should instead (after warning the user) ignore the erroneous or unrecognized parameter. After all, future Git versions might add more dirstat parameters, and using two different Git versions on the same repo should not cripple the older Git version just because of a parameter that is only understood by a more recent Git version. This patch fixes the issue by refactoring the dirstat parameter parsing so that parse_dirstat_params() keeps on parsing parameters, even if an earlier parameter was not recognized. When parsing has finished, it returns zero if all parameters were successfully parsed, and non-zero if one or more parameters were not recognized (with appropriate error messages appended to the 'errmsg' argument). The parse_dirstat_params() callers then decide (based on the return value from parse_dirstat_params()) whether to warn and ignore (in case of diff.dirstat), or to warn and die (in case of --dirstat). The patch also adds a couple of tests verifying the correct behavior of --dirstat and diff.dirstat in the face of unknown (possibly future) dirstat parameters. Suggested-by: Junio C Hamano <gitster@pobox.com> Improved-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29New --dirstat=lines mode, doing dirstat analysis based on diffstatLibravatar Johan Herland1-2/+59
This patch adds an alternative implementation of show_dirstat(), called show_dirstat_by_line(), which uses the more expensive diffstat analysis (as opposed to show_dirstat()'s own (relatively inexpensive) analysis) to derive the numbers from which the --dirstat output is computed. The alternative implementation is controlled by the new "lines" parameter to the --dirstat option (or the diff.dirstat config variable). For binary files, the diffstat analysis counts bytes instead of lines, so to prevent binary files from dominating the dirstat results, the byte counts for binary files are divided by 64 before being compared to their textual/line-based counterparts. This is a stupid and ugly - but very cheap - heuristic. In linux-2.6.git, running the three different --dirstat modes: time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null vs. time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null vs. time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null yields the following average runtimes on my machine: - "changes" (default): ~6.0 s - "lines": ~9.6 s - "files": ~0.1 s So, as expected, there's a considerable performance hit (~60%) by going through the full diffstat analysis as compared to the default "changes" analysis (obviously, "files" is much faster than both). As such, the "lines" mode is probably only useful if you really need the --dirstat numbers to be consistent with the numbers returned from the other --*stat options. The patch also includes documentation and tests for the new dirstat mode. Improved-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29Allow specifying --dirstat cut-off percentage as a floating point numberLibravatar Johan Herland1-10/+16
Only the first digit after the decimal point is kept, as the dirstat calculations all happen in permille. Selftests verifying floating-point percentage input has been added. Improved-by: Junio C Hamano <gitster@pobox.com> Improved-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29Add config variable for specifying default --dirstat behaviorLibravatar Johan Herland1-1/+9
The new diff.dirstat config variable takes the same arguments as '--dirstat=<args>', and specifies the default arguments for --dirstat. The config is obviously overridden by --dirstat arguments passed on the command line. When not specified, the --dirstat defaults are 'changes,noncumulative,3'. The patch also adds several tests verifying the interaction between the diff.dirstat config variable, and the --dirstat command line option. Improved-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-fileLibravatar Johan Herland1-9/+60
Instead of having multiple interconnected dirstat-related options, teach the --dirstat option itself to accept all behavior modifiers as parameters. - Preserve the current --dirstat=<limit> (where <limit> is an integer specifying a cut-off percentage) - Add --dirstat=cumulative, replacing --cumulative - Add --dirstat=files, replacing --dirstat-by-file - Also add --dirstat=changes and --dirstat=noncumulative for specifying the current default behavior. These allow the user to reset other --dirstat parameters (e.g. 'cumulative' and 'files') occuring earlier on the command line. The deprecated options (--cumulative and --dirstat-by-file) are still functional, although they have been removed from the documentation. Allow multiple parameters to be separated by commas, e.g.: --dirstat=files,10,cumulative Update the documentation accordingly, and add testcases verifying the behavior of the new syntax. Improved-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29Make --dirstat=0 output directories that contribute < 0.1% of changesLibravatar Johan Herland1-2/+2
The expected output from --dirstat=0, is to include any directory with changes, even if those changes contribute a minuscule portion of the total changes. However, currently, directories that contribute less than 0.1% are not included, since their 'permille' value is 0, and there is an 'if (permille)' check in gather_dirstat() that causes them to be ignored. This test is obviously intended to exclude directories that contribute no changes whatsoever, but in this case, it hits too broadly. The correct check is against 'this_dir' from which the permille is calculated. Only if this value is 0 does the directory truly contribute no changes, and should be skipped from the output. This patches fixes this issue, and updates corresponding testcases to expect the new behvaior. Signed-off-by: Johan Herland <johan@herland.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-28Merge branch 'jc/diff-irreversible-delete'Libravatar Junio C Hamano1-5/+14
* jc/diff-irreversible-delete: git diff -D: omit the preimage of deletes
2011-04-28Merge branch 'jc/rename-degrade-cc-to-c'Libravatar Junio C Hamano1-0/+26
* jc/rename-degrade-cc-to-c: diffcore-rename: fall back to -C when -C -C busts the rename limit diffcore-rename: record filepair for rename src diffcore-rename: refactor "too many candidates" logic builtin/diff.c: remove duplicated call to diff_result_code()
2011-04-28Merge branch 'jh/dirstat'Libravatar Junio C Hamano1-5/+37
* jh/dirstat: --dirstat: In case of renames, use target filename instead of source filename Teach --dirstat not to completely ignore rearranged lines within a file --dirstat-by-file: Make it faster and more correct --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-04-23diff: remove often unused parameters from diff_unmerge()Libravatar Junio C Hamano1-4/+1
e9c8409 (diff-index --cached --raw: show tree entry on the LHS for unmerged entries., 2007-01-05) added a <mode, object name> pair as parameters to this function, to store them in the pre-image side of an unmerged file pair. Now the function is fixed to return the filepair it queued, we can make the caller on the special case codepath to do so. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23diff.c: return filepair from diff_unmerge()Libravatar Junio C Hamano1-5/+8
The underlying diff_queue() returns diff_filepair so that the caller can further add information to it, and the helper function diff_unmerge() utilizes the feature itself, but does not expose it to its callers, which was kind of selfish. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-12--dirstat: In case of renames, use target filename instead of source filenameLibravatar Johan Herland1-1/+1
This changes --dirstat analysis to count "damage" toward the target filename, rather than the source filename. For renames within a directory, this won't matter to the final output, but when moving files between diretories, the output now lists the target directory rather than the source directory. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11Teach --dirstat not to completely ignore rearranged lines within a fileLibravatar Johan Herland1-1/+18
Currently, the --dirstat analysis ignores when lines within a file are rearranged, because the "damage" calculated by show_dirstat() is 0. However, if the object name has changed, we already know that there is some damage, and it is unintuitive to claim there is _no_ damage. Teach show_dirstat() to assign a minimum amount of damage (== 1) to entries for which the analysis otherwise yields zero damage, to still represent that these files are changed, instead of saying that there is no change. Also, skip --dirstat analysis when the object names are the same (e.g. for a pure file rename). Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11--dirstat-by-file: Make it faster and more correctLibravatar Johan Herland1-5/+20
Currently, when using --dirstat-by-file, it first does the full --dirstat analysis (using diffcore_count_changes()), and then resets 'damage' to 1, if any damage was found by diffcore_count_changes(). But --dirstat-by-file is not interested in the file damage per se. It only cares if the file changed at all. In that sense it only cares if the blob object for a file has changed. We therefore only need to compare the object names of each file pair in the diff queue and we can skip the entire --dirstat analysis and simply set 'damage' to 1 for each entry where the object name has changed. This makes --dirstat-by-file faster, and also bypasses --dirstat's practice of ignoring rearranged lines within a file. The patch also contains an added testcase verifying that --dirstat-by-file now detects changes that only rearrange lines within a file. Signed-off-by: Johan Herland <johan@herland.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-02git diff -D: omit the preimage of deletesLibravatar Junio C Hamano1-5/+14
When reviewing a patch while concentrating primarily on the text after then change, wading through pages of deleted text involves a cognitive burden. Introduce the -D option that omits the preimage text from the patch output for deleted files. When used with -B (represent total rewrite as a single wholesale deletion followed by a single wholesale addition), the preimage text is also omitted. To prevent such a patch from being applied by mistake, the output is designed not to be usable by "git apply" (or GNU "patch"); it is strictly for human consumption. It of course is possible to "apply" such a patch by hand, as a human can read the intention out of such a patch. It however is impossible to apply such a patch even manually in reverse, as the whole point of this option is to omit the information necessary to do so from the output. Initial request by Mart Sõmermaa, documentation and tests helped by Michael J Gruber. Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22diffcore-rename: fall back to -C when -C -C busts the rename limitLibravatar Junio C Hamano1-0/+26
When there are too many paths in the project, the number of rename source candidates "git diff -C -C" finds will exceed the rename detection limit, and no inexact rename detection is performed. We however could fall back to "git diff -C" if the number of modified paths is sufficiently small. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22Remove unused variablesLibravatar Johannes Schindelin1-2/+1
Noticed by gcc 4.6.0. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22Fix sparse warningsLibravatar Stephen Boyd1-1/+1
Fix warnings from 'make check'. - These files don't include 'builtin.h' causing sparse to complain that cmd_* isn't declared: builtin/clone.c:364, builtin/fetch-pack.c:797, builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78, builtin/merge-index.c:69, builtin/merge-recursive.c:22 builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426 builtin/notes.c:822, builtin/pack-redundant.c:596, builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149, builtin/remote.c:1512, builtin/remote-ext.c:240, builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384, builtin/unpack-file.c:25, builtin/var.c:75 - These files have symbols which should be marked static since they're only file scope: submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13, submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79, unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123, url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48 - These files redeclare symbols to be different types: builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571, usage.c:49, usage.c:58, usage.c:63, usage.c:72 - These files use a literal integer 0 when they really should use a NULL pointer: daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362 While we're in the area, clean up some unused #includes in builtin files (mostly exec_cmd.h). Signed-off-by: Stephen Boyd <bebarino@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-19Merge branch 'jk/merge-rename-ux'Libravatar Junio C Hamano1-1/+1
* jk/merge-rename-ux: pull: propagate --progress to merge merge: enable progress reporting for rename detection add inexact rename detection progress infrastructure commit: stop setting rename limit bump rename limit defaults (again) merge: improve inexact rename limit warning
2011-03-16Merge branch 'jk/diffstat-binary' into maintLibravatar Junio C Hamano1-10/+21
* jk/diffstat-binary: diff: don't retrieve binary blobs for diffstat diff: handle diffstat of rewritten binary files
2011-03-16standardize brace placement in struct definitionsLibravatar Jonathan Nieder1-4/+2
In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char *baz; }; Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22diff: don't retrieve binary blobs for diffstatLibravatar Jeff King1-4/+11
We only need the size, which is much cheaper to get, especially if it is a big binary file. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22diff: handle diffstat of rewritten binary filesLibravatar Jeff King1-10/+14
The logic in builtin_diffstat assumes that a complete_rewrite pair should have its lines counted. This is nonsensical for binary files and leads to confusing things like: $ git diff --stat --summary HEAD^ HEAD foo.rand | Bin 4096 -> 4096 bytes 1 files changed, 0 insertions(+), 0 deletions(-) $ git diff --stat --summary -B HEAD^ HEAD foo.rand | 34 +++++++++++++++------------------- 1 files changed, 15 insertions(+), 19 deletions(-) rewrite foo.rand (100%) So let's reorder the function to handle binary files first (which from diffstat's perspective look like complete rewrites anyway), then rewrites, then actual diffstats. There are two bonus prizes to this reorder: 1. It gets rid of a now-superfluous goto. 2. The binary case is at the top, which means we can further optimize it in the next patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-21bump rename limit defaults (again)Libravatar Jeff King1-1/+1
We did this once before in 5070591 (bump rename limit defaults, 2008-04-30). Back then, we were shooting for about 1 second for a diff/log calculation, and 5 seconds for a merge. There are a few new things to consider, though: 1. Average processors are faster now. 2. We've seen on the mailing list some ugly merges where not using inexact rename detection leads to many more conflicts. Merges of this size take a long time anyway, so users are probably happy to spend a little bit of time computing the renames. Let's bump the diff/merge default limits from 200/500 to 400/1000. Those are 2 seconds and 10 seconds respectively on my modern hardware. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-21Merge branch 'ks/blame-worktree-textconv-cached'Libravatar Junio C Hamano1-2/+2
* ks/blame-worktree-textconv-cached: fill_textconv(): Don't get/put cache if sha1 is not valid t/t8006: Demonstrate blame is broken when cachetextconv is on
2010-12-19fill_textconv(): Don't get/put cache if sha1 is not validLibravatar Kirill Smelkov1-2/+2
When blaming files in the working tree, the filespec is marked with !sha1_valid, as we have not given the contents an object name yet. The function to cache textconv results (keyed on the object name), however, didn't check this condition, and ended up on storing the cached result under a random object name. Cc: Axel Bonnet <axel.bonnet@ensimag.imag.fr> Cc: Clément Poulain <clement.poulain@ensimag.imag.fr> Cc: Diane Gasselin <diane.gasselin@ensimag.imag.fr> Cc: Jeff King <peff@peff.net> Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-16Merge branch 'kb/diff-C-M-synonym'Libravatar Junio C Hamano1-8/+8
* kb/diff-C-M-synonym: diff: use "find" instead of "detect" as prefix for long forms of -M and -C diff: add --detect-copies-harder as a synonym for --find-copies-harder