summaryrefslogtreecommitdiff
path: root/diffcore.h
AgeCommit message (Collapse)AuthorFilesLines
2020-04-07diff: restrict when prefetching occursLibravatar Jonathan Tan1-0/+21
Commit 7fbbcb21b1 ("diff: batch fetching of missing blobs", 2019-04-08) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched. In these cases, any command that uses the diff machinery will unnecessarily fetch blobs. diffcore_std() may read blobs when it calls the following functions: (1) diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex) (2) diffcore_break() and diffcore_merge_broken() (for break-rewrite detection) (3) diffcore_rename() (for rename detection) (4) diffcore_pickaxe() (for detecting addition/deletion of specified string) Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object. This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-07diff: make diff_populate_filespec_options structLibravatar Jonathan Tan1-3/+6
The behavior of diff_populate_filespec() currently can be customized through a bitflag, but a subsequent patch requires it to support a non-boolean option. Replace the bitflag with an options struct. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-18diff: move doc to diff.h and diffcore.hLibravatar Heba Waly1-0/+32
Move the documentation from Documentation/technical/api-diff.txt to both diff.h and diffcore.h as it's easier for the developers to find the usage information beside the code instead of looking for it in another doc file. Also documentation/technical/api-diff.txt is removed because the information it has is now redundant and it'll be hard to keep it up to date and synchronized with the documentation in the header files. There are three members documented in the doc file that weren't found in the header files, assuming the doc wasn't up to date and the members no longer exist: touched_flags, COLOR_DIFF_WORDS and QUIET. Signed-off-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21diff.c: reduce implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-6/+7
diff and textconv code has so widespread use that it's hard to simply update their api and all call sites at once because it would result in a big patch. For now reduce the_index references to two places: diff_setup() and fill_textconv(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'en/incl-forward-decl'Libravatar Junio C Hamano1-0/+4
Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations
2018-08-15Add missing includes and forward declarationsLibravatar Elijah Newren1-0/+4
I looped over the toplevel header files, creating a temporary two-line C program for each consisting of #include "git-compat-util.h" #include $HEADER This patch is the result of manually fixing errors in compiling those tiny programs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-03diffcore.h: drop extern from function declarationLibravatar Nguyễn Thái Ngọc Duy1-25/+25
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02diff: convert fill_filespec to struct object_idLibravatar Brandon Williams1-1/+1
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-17Merge branch 'tk/diffcore-delta-remove-unused'Libravatar Junio C Hamano1-1/+0
Code cleanup. * tk/diffcore-delta-remove-unused: diffcore-delta: remove unused parameter to diffcore_count_changes()
2016-11-14diffcore-delta: remove unused parameter to diffcore_count_changes()Libravatar Tobias Klauser1-1/+0
The delta_limit parameter to diffcore_count_changes() has been unused since commit ba23bbc8e ("diffcore-delta: make change counter to byte oriented again.", 2006-03-04). Remove the parameter and adjust all callers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28diff: rename struct diff_filespec's sha1_valid memberLibravatar brian m. carlson1-1/+1
Now that this struct's sha1 member is called "oid", update the comment and the sha1_valid member to be called "oid_valid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1_valid + o.oid_valid @@ struct diff_filespec *p; @@ - p->sha1_valid + p->oid_valid Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-28diff: convert struct diff_filespec to struct object_idLibravatar brian m. carlson1-1/+1
Convert struct diff_filespec's sha1 member to use a struct object_id called "oid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1 + o.oid.hash @@ struct diff_filespec *p; @@ - p->sha1 + p->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18diff --stat: mark any file larger than core.bigfilethreshold binaryLibravatar Nguyễn Thái Ngọc Duy1-0/+1
Too large files may lead to failure to allocate memory. If it happens here, it could impact quite a few commands that involve diff. Moreover, too large files are inefficient to compare anyway (and most likely non-text), so mark them binary and skip looking at their content. Noticed-by: Dale R. Worley <worley@alum.mit.edu> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18diff.c: allow to pass more flags to diff_populate_filespecLibravatar Nguyễn Thái Ngọc Duy1-1/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18Merge branch 'jk/diff-filespec-cleanup'Libravatar Junio C Hamano1-1/+1
Portability fix to a topic already in v1.9 * jk/diff-filespec-cleanup: diffcore.h: be explicit about the signedness of is_binary
2014-03-05Merge branch 'ks/combine-diff'Libravatar Junio C Hamano1-0/+14
Teach combine-diff to honour the path-output-order imposed by diffcore-order, and optimize how matching paths are found in the N-way diffs made with parents. * ks/combine-diff: tests: add checking that combine-diff emits only correct paths combine-diff: simplify intersect_paths() further combine-diff: combine_diff_path.len is not needed anymore combine-diff: optimize combine_diff_path sets intersection diff test: add tests for combine-diff with orderfile diffcore-order: export generic ordering interface
2014-02-27Merge branch 'nd/diff-quiet-stat-dirty'Libravatar Junio C Hamano1-0/+2
"git diff --quiet -- pathspec1 pathspec2" sometimes did not return correct status value. * nd/diff-quiet-stat-dirty: diff: do not quit early on stat-dirty files diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later
2014-02-24diff: do not quit early on stat-dirty filesLibravatar Nguyễn Thái Ngọc Duy1-0/+2
When QUICK is set (i.e. with --quiet) we try to do as little work as possible, stopping after seeing the first change. stat-dirty is considered a "change" but it may turn out not, if no actual content is changed. The actual content test is performed too late in the process and the shortcut may be taken prematurely, leading to incorrect return code. Assume we do "git diff --quiet". If we have a stat-dirty file "a" and a really dirty file "b". We break the loop in run_diff_files() and stop after "a" because we have got a "change". Later in diffcore_skip_stat_unmatch() we find out "a" is actually not changed. But there's nothing else in the diff queue, we incorrectly declare "no change", ignoring the fact that "b" is changed. This also happens to "git diff --quiet HEAD" when it hits diff_can_quit_early() in oneway_diff(). This patch does the content test earlier in order to keep going if "a" is unchanged. The test result is cached so that when diffcore_skip_stat_unmatch() is done in the end, we spend no cycles on re-testing "a". Reported-by: IWAMOTO Toshihiro <iwamoto@valinux.co.jp> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24diffcore-order: export generic ordering interfaceLibravatar Kirill Smelkov1-0/+14
diffcore_order() interface only accepts a queue of `struct diff_filepair`. In the next patches, we'll want to order `struct combine_diff_path` by path, so let's first rework diffcore-order to also provide generic low-level interface for ordering arbitrary objects, provided they have path accessors. The new interface is: - `struct obj_order` for describing objects to ordering routine, and - order_objects() for actually doing the ordering work. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24diffcore.h: be explicit about the signedness of is_binaryLibravatar Richard Lowe1-1/+1
Bitfields need to specify their signedness explicitly or the compiler is free to default as it sees fit. With compilers that default 'unsigned' (SUNWspro 12 seems to do this) the tri-state nature of is_binary vanishes and all files are treated as binary. Signed-off-by: Richard Lowe <richlowe@richlowe.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17diff_filespec: use only 2 bits for is_binary flagLibravatar Jeff King1-1/+1
The is_binary flag needs only three values: -1, 0, and 1. However, we use a whole 32-bit int for it on most systems (both 32- and 64- bit). Instead, we can mark it to use only 2 bits. On 32-bit systems, this lets it end up as part of the bitfield above (saving 4 bytes). On 64-bit systems, we don't see any change (because the savings end up as padding), but it does leave room for another "free" 32-bit value to be added later. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17diff_filespec: reorder is_binary fieldLibravatar Jeff King1-1/+1
The middle of the diff_filespec struct contains a mixture of ints, shorts, and bit-fields, followed by a pointer. On an x86-64 system with an LP64 or LLP64 data model (i.e., most of them), the integers and flags end up being padded out by 41 bits to put the pointer at an 8-byte boundary. After the pointer, we have the "int is_binary" field, which is only 32 bits. We end up wasting another 32 bits to pad the struct size up to a multiple of 64 bits. We can move the is_binary field before the pointer, which lets the compiler store it where we used to have padding. This shrinks the top padding to only 9 bits (from the bit-fields), and eliminates the bottom padding entirely, dropping the struct size from 88 to 80 bytes. On a 32-bit system, there is no benefit, but nor should there be any harm (we only need 4-byte alignment there, so we were already using only 9 bits of padding). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17diff_filespec: drop xfrm_flags fieldLibravatar Jeff King1-1/+0
The only mention of this field in the code is by some debugging code which prints it out (and it will always be zero, since we never touch it otherwise). It was obsoleted very early on by 25d5ea4 ([PATCH] Redo rename/copy detection logic., 2005-05-24). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17diff_filespec: drop funcname_pattern_ident fieldLibravatar Jeff King1-1/+0
This struct field was obsoleted by be58e70 (diff: unify external diff and funcname parsing code, 2008-10-05), but we forgot to remove it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17diff_filespec: reorder dirty_submodule macro definitionsLibravatar Jeff King1-1/+1
diff_filespec has a 2-bit "dirty_submodule" field and defines two flags as macros. Originally these were right next to each other, but a new field was accidentally added in between in commit 4682d85. This patch puts the field and its flags back together. Using an enum like: enum { DIRTY_SUBMODULE_UNTRACKED = 1, DIRTY_SUBMODULE_MODIFIED = 2 } dirty_submodule; would be more obvious, but it bloats the structure. Limiting the enum size like: } dirty_submodule : 2; might work, but it is not portable. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-27Merge branch 'jk/maint-null-in-trees'Libravatar Junio C Hamano1-1/+1
We do not want a link to 0{40} object stored anywhere in our objects. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value
2012-07-29diff: do not use null sha1 as a sentinel valueLibravatar Jeff King1-1/+1
The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-13Merge branch 'jc/refactor-diff-stdin'Libravatar Junio C Hamano1-0/+1
Due to the way "git diff --no-index" is bolted onto by touching the low level code that is shared with the rest of the "git diff" code, even though it has to work in a very different way, any comparison that involves a file "-" at the root level incorrectly tried to read from the standard input. This cleans up the no-index codepath further to remove code that reads from the standard input from the core side, which is never necessary when git is running its usual diff operation. * jc/refactor-diff-stdin: diff-index.c: "git diff" has no need to read blob from the standard input diff-index.c: unify handling of command line paths diff-index.c: do not pretend paths are pathspecs
2012-06-28diff-index.c: "git diff" has no need to read blob from the standard inputLibravatar Junio C Hamano1-0/+1
Only "diff --no-index -" does. Bolting the logic into the low-level function diff_populate_filespec() was a layering violation from day one. Move populate_from_stdin() function out of the generic diff.c to its only user, diff-index.c. Also make sure "-" from the command line stays a special token "read from the standard input", even if we later decide to sanitize the result from prefix_filename() function in a few obvious ways, e.g. removing unnecessary "./" prefix, duplicated slashes "//" in the middle, etc. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-20combine-diff: support format_callbackLibravatar Junio C Hamano1-1/+1
This teaches combine-diff machinery to feed a combined merge to a callback function when DIFF_FORMAT_CALLBACK is specified. So far, format callback functions are not used for anything but 2-way diffs. A callback is given a diff_queue_struct, which is an array of diff_filepair. As its name suggests, a diff_filepair is a _pair_ of diff_filespec that represents a single preimage and a single postimage. Since "diff -c" is to compare N parents with a single merge result and filter out any paths whose result match one (or more) of the parent(s), its output has to be able to represent N preimages and 1 postimage. For this reason, a callback function that inspects a diff_filepair that results from this new infrastructure can and is expected to view the preimage side (i.e. pair->one) as an array of diff_filespec. Each element in the array, except for the last one, is marked with "has_more_entries" bit, so that the same callback function can be used for 2-way diffs and combined diffs. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31diff: pass the entire diff-options to diffcore_pickaxe()Libravatar Junio C Hamano1-1/+1
That would make it easier to give enhanced feature to the pickaxe transformation. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13diff --follow: do call diffcore_std() as necessaryLibravatar Junio C Hamano1-2/+0
Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this: diff-tree frontend -> diff_tree_sha1() . populate diff_queued_diff . if --follow is in effect and there is only one change that creates the target path, then -> try_to_follow_renames() -> diff_tree_sha1() with no pathspec but with -C -> diffcore_std() to find renames . if rename is found, tweak diff_queued_diff and put a single filepair that records the found rename there -> diffcore_std() . tweak elements on diff_queued_diff by - rename detection - path ordering - pickaxe filtering We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier 1da6175 (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-12Standardize do { ... } while (0) styleLibravatar Jonathan Nieder1-4/+4
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09Document -B<n>[/<m>], -M<n> and -C<n> variants of -B, -M and -CLibravatar Matthieu Moy1-1/+1
These options take an optional argument, but this optional argument was not documented. Original patch by Matthieu Moy, but documentation for -B mostly copied from the explanations of Junio C Hamano. While we're there, fix a typo in a comment in diffcore.h. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07Make diffcore_std only can run once before a diff_flushLibravatar Bo Yang1-0/+2
When file renames/copies detection is turned on, the second diffcore_std will degrade a 'C' pair to a 'R' pair. And this may happen when we run 'git log --follow' with hard copies finding. That is, the try_to_follow_renames() will run diffcore_std to find the copies, and then 'git log' will issue another diffcore_std, which will reduce 'src->rename_used' and recognize this copy as a rename. This is not what we want. So, I think we really don't need to run diffcore_std more than one time. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07Add a macro DIFF_QUEUE_CLEAR.Libravatar Bo Yang1-0/+5
Refactor the diff_queue_struct code, this macro help to reset the structure. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-04git diff --submodule: Show detailed dirty status of submodulesLibravatar Jens Lehmann1-1/+3
When encountering a dirty submodule while doing "git diff --submodule" print an extra line for new untracked content and another for modified but already tracked content. And if the HEAD of the submodule is equal to the ref diffed against in the superproject, drop the output which would just show the same SHA1s and no commit message headlines. To achieve that, the dirty_submodule bitfield is expanded to two bits. The output of "git status" inside the submodule is parsed to set the according bits. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-18Performance optimization for detection of modified submodulesLibravatar Jens Lehmann1-0/+1
In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-02Merge branch 'maint'Libravatar Junio C Hamano1-1/+1
* maint: Add reference for status letters in documentation. Document that git-log takes --all-match. Update draft 1.6.0.4 release notes
2008-11-02Add reference for status letters in documentation.Libravatar Yann Dirson1-1/+1
Also fix error in diff_filepair::status documentation, and point to the in-code reference as well as the doc. Signed-off-by: Yann Dirson <ydirson@altern.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-18diff: introduce diff.<driver>.binaryLibravatar Jeff King1-2/+6
The "diff" gitattribute is somewhat overloaded right now. It can say one of three things: 1. this file is definitely binary, or definitely not (i.e., diff or !diff) 2. this file should use an external diff engine (i.e., diff=foo, diff.foo.command = custom-script) 3. this file should use particular funcname patterns (i.e., diff=foo, diff.foo.(x?)funcname = some-regex) Most of the time, there is no conflict between these uses, since using one implies that the other is irrelevant (e.g., an external diff engine will decide for itself whether the file is binary). However, there is at least one conflicting situation: there is no way to say "use the regular rules to determine whether this file is binary, but if we do diff it textually, use this funcname pattern." That is, currently setting diff=foo indicates that the file is definitely text. This patch introduces a "binary" config option for a diff driver, so that one can explicitly set diff.foo.binary. We default this value to "don't know". That is, setting a diff attribute to "foo" and using "diff.foo.funcname" will have no effect on the binaryness of a file. To get the current behavior, one can set diff.foo.binary to true. This patch also has one additional advantage: it cleans up the interface to the userdiff code a bit. Before, calling code had to know more about whether attributes were false, true, or unset to determine binaryness. Now that binaryness is a property of a driver, we can represent these situations just by passing back a driver struct. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-19Bust the ghost of long-defunct diffcore-pathspec.Libravatar Yann Dirson1-1/+0
This concept was retired by 77882f6 (Retire diffcore-pathspec., 2006-04-10), more than 2 years ago. Signed-off-by: Yann Dirson <ydirson@altern.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-26copy vs rename detection: avoid unnecessary O(n*m) loopsLibravatar Linus Torvalds1-1/+1
The core rename detection had some rather stupid code to check if a pathname was used by a later modification or rename, which basically walked the whole pathname space for all renames for each rename, in order to tell whether it was a pure rename (no remaining users) or should be considered a copy (other users of the source file remaining). That's really silly, since we can just keep a count of users around, and replace all those complex and expensive loops with just testing that simple counter (but this all depends on the previous commit that shared the diff_filespec data structure by using a separate reference count). Note that the reference count is not the same as the rename count: they behave otherwise rather similarly, but the reference count is tied to the allocation (and decremented at de-allocation, so that when it turns zero we can get rid of the memory), while the rename count is tied to the renames and is decremented when we find a rename (so that when it turns zero we know that it was a rename, not a copy). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-26Ref-count the filespecs used by diffcoreLibravatar Linus Torvalds1-0/+2
Rather than copy the filespecs when introducing new versions of them (for rename or copy detection), use a refcount and increment the count when reusing the diff_filespec. This avoids unnecessary allocations, but the real reason behind this is a future enhancement: we will want to track shared data across the copy/rename detection. In order to efficiently notice when a filespec is used by a rename, the rename machinery wants to keep track of a rename usage count which is shared across all different users of the filespec. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-02rename diff_free_filespec_data_large() to diff_free_filespec_blob()Libravatar Junio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-10-02diffcore-rename: cache file deltasLibravatar Jeff King1-0/+1
We find rename candidates by computing a fingerprint hash of each file, and then comparing those fingerprints. There are inherently O(n^2) comparisons, so it pays in CPU time to hoist the (rather expensive) computation of the fingerprint out of that loop (or to cache it once we have computed it once). Previously, we didn't keep the filespec information around because then we had the potential to consume a great deal of memory. However, instead of keeping all of the filespec data, we can instead just keep the fingerprint. This patch implements and uses diff_free_filespec_data_large to accomplish that goal. We also have to change estimate_similarity not to needlessly repopulate the filespec data when we already have the hash. Practical tests showed 4.5x speedup for a 10% memory usage increase. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-07Fix configuration syntax to specify customized hunk header patterns.Libravatar Junio C Hamano1-1/+1
This updates the hunk header customization syntax. The special case 'funcname' attribute is gone. You assign the name of the type of contents to path's "diff" attribute as a string value in .gitattributes like this: *.java diff=java *.perl diff=perl *.doc diff=doc If you supply "diff.<name>.funcname" variable via the configuration mechanism (e.g. in $HOME/.gitconfig), the value is used as the regexp set to find the line to use for the hunk header (the variable is called "funcname" because such a line typically is the one that has the name of the function in programming language source text). If there is no such configuration, built-in default is used, if any. Currently there are two default patterns: default and java. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-06Per-path attribute based hunk header selection.Libravatar Junio C Hamano1-0/+1
This makes"diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed to define a single regexp to be used for everything. The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in pattern (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file. (in .gitattributes) *.java funcname=java *.perl funcname=perl (in .git/config) [funcname] java = ... # ugly and complicated regexp to override the built-in one. perl = ... # another ugly and complicated regexp to define a new one. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-07-06Introduce diff_filespec_is_binary()Libravatar Junio C Hamano1-0/+2
This replaces an explicit initialization of filespec->is_binary field used for rename/break followed by direct access to that field with a wrapper function that lazily iniaitlizes and accesses the field. We would add more attribute accesses for the use of diff routines, and it would be better to make this abstraction earlier. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-06-30diffcore_filespec: add is_binaryLibravatar Junio C Hamano1-0/+1
diffcore-break and diffcore-rename would want to behave slightly differently depending on the binary-ness of the data, so add one bit to the filespec, as the structure is now passed down to diffcore_count_changes() function. Signed-off-by: Junio C Hamano <gitster@pobox.com>