summaryrefslogtreecommitdiff
path: root/diff.h
AgeCommit message (Collapse)AuthorFilesLines
2017-06-05Merge branch 'js/blame-lib'Libravatar Junio C Hamano1-0/+7
The internal logic used in "git blame" has been libified to make it easier to use by cgit. * js/blame-lib: (29 commits) blame: move entry prepend to libgit blame: move scoreboard setup to libgit blame: move scoreboard-related methods to libgit blame: move fake-commit-related methods to libgit blame: move origin-related methods to libgit blame: move core structures to header blame: create entry prepend function blame: create scoreboard setup function blame: create scoreboard init function blame: rework methods that determine 'final' commit blame: wrap blame_sort and compare_blame_final blame: move progress updates to a scoreboard callback blame: make sanity_check use a callback in scoreboard blame: move no_whole_file_rename flag to scoreboard blame: move xdl_opts flags to scoreboard blame: move show_root flag to scoreboard blame: move reverse flag to scoreboard blame: move contents_from to scoreboard blame: move copy/move thresholds to scoreboard blame: move stat counters to scoreboard ...
2017-05-24blame: move textconv_object with related functionsLibravatar Jeff Smith1-0/+7
textconv_object is used in places other than blame.c and should be moved to a more appropriate location. Other textconv related functions are located in diff.c so that seems as good a place as any. Signed-off-by: Jeff Smith <whydoubt@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08diff-lib: convert do_diff_cache to struct object_idLibravatar brian m. carlson1-1/+1
This is needed to convert parse_tree_indirect. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-31Rename sha1_array to oid_arrayLibravatar brian m. carlson1-2/+2
Since this structure handles an array of object IDs, rename it to struct oid_array. Also rename the accessor functions and the initialization constant. This commit was produced mechanically by providing non-Documentation files to the following Perl one-liners: perl -pi -E 's/struct sha1_array/struct oid_array/g' perl -pi -E 's/\bsha1_array_/oid_array_/g' perl -pi -E 's/SHA1_ARRAY_INIT/OID_ARRAY_INIT/g' Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-27Merge branch 'nd/ita-empty-commit'Libravatar Junio C Hamano1-1/+2
When new paths were added by "git add -N" to the index, it was enough to circumvent the check by "git commit" to refrain from making an empty commit without "--allow-empty". The same logic prevented "git status" to show such a path as "new file" in the "Changes not staged for commit" section. * nd/ita-empty-commit: commit: don't be fooled by ita entries when creating initial commit commit: fix empty commit creation when there's no changes but ita entries diff: add --ita-[in]visible-in-index diff-lib: allow ita entries treated as "not yet exist in index"
2016-10-26diff_aligned_abbrev: use "struct oid"Libravatar Jeff King1-1/+1
Since we're modifying this function anyway, it's a good time to update it to the more modern "struct oid". We can also drop some of the magic numbers in favor of GIT_SHA1_HEXSZ, along with some descriptive comments. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26diff_unique_abbrev: rename to diff_aligned_abbrevLibravatar Jeff King1-1/+5
The word "align" describes how the function actually differs from find_unique_abbrev, and will make it less confusing when we add more diff-specific abbrevation functions that do not do this alignment. Since this is a globally available function, let's also move its descriptive comment to the header file, where we typically document function interfaces. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24commit: fix empty commit creation when there's no changes but ita entriesLibravatar Nguyễn Thái Ngọc Duy1-1/+1
If i-t-a entries are present and there is no change between the index and HEAD i-t-a entries, index_differs_from() still returns "dirty, new entries" (aka, the resulting commit is not empty), but cache-tree will skip i-t-a entries and produce the exact same tree of current commit. index_differs_from() is supposed to catch this so we can abort git-commit (unless --no-empty is specified). Update it to optionally ignore i-t-a entries when doing a diff between the index and HEAD so that it would return "no change" in this case and abort commit. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-24diff-lib: allow ita entries treated as "not yet exist in index"Libravatar Nguyễn Thái Ngọc Duy1-0/+1
When comparing the index and the working tree to show which paths are new, and comparing the tree recorded in the HEAD and the index to see if committing the contents recorded in the index would result in an empty commit, we would want the former comparison to say "these are new paths" and the latter to say "there is no change" for paths that are marked as intent-to-add. We made a similar attempt at d95d728a ("diff-lib.c: adjust position of i-t-a entries in diff", 2015-03-16), which redefined the semantics of these two comparison modes globally, which was a disaster and had to be reverted at 78cc1a54 ("Revert "diff-lib.c: adjust position of i-t-a entries in diff"", 2015-06-23). To make sure we do not repeat the same mistake, introduce a new internal diffopt option so that this different semantics can be asked for only by callers that ask it, while making sure other unaudited callers will get the same comparison result. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-26Merge branch 'mh/diff-indent-heuristic'Libravatar Junio C Hamano1-0/+1
Output from "git diff" can be made easier to read by selecting which lines are common and which lines are added/deleted intelligently when the lines before and after the changed section are the same. A command line option is added to help with the experiment to find a good heuristics. * mh/diff-indent-heuristic: blame: honor the diff heuristic options and config parse-options: add parse_opt_unknown_cb() diff: improve positioning of add/delete blocks in diffs xdl_change_compact(): introduce the concept of a change group recs_match(): take two xrecord_t pointers as arguments is_blank_line(): take a single xrecord_t as argument xdl_change_compact(): only use heuristic if group can't be matched xdl_change_compact(): fix compaction heuristic to adjust ixo
2016-09-19blame: honor the diff heuristic options and configLibravatar Michael Haggerty1-0/+1
Teach "git blame" and "git annotate" the --compaction-heuristic and --indent-heuristic options that are now supported by "git diff". Also teach them to honor the `diff.compactionHeuristic` and `diff.indentHeuristic` configuration options. It would be conceivable to introduce separate configuration options for "blame" and "annotate"; for example `blame.compactionHeuristic` and `blame.indentHeuristic`. But it would be confusing to users if blame output is inconsistent with diff output, so it makes more sense for them to respect the same configuration. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-12Merge branch 'jk/diff-submodule-diff-inline'Libravatar Junio C Hamano1-2/+9
The "git diff --submodule={short,log}" mechanism has been enhanced to allow "--submodule=diff" to show the patch between the submodule commits bound to the superproject. * jk/diff-submodule-diff-inline: diff: teach diff to display submodule difference with an inline diff submodule: refactor show_submodule_summary with helper function submodule: convert show_submodule_summary to use struct object_id * allow do_submodule_path to work even if submodule isn't checked out diff: prepare for additional submodule formats graph: add support for --line-prefix on all graph-aware output diff.c: remove output_prefix_length field cache: add empty_tree_oid object and helper function
2016-08-31diff: teach diff to display submodule difference with an inline diffLibravatar Jacob Keller1-1/+2
Teach git-diff and friends a new format for displaying the difference of a submodule. The new format is an inline diff of the contents of the submodule between the commit range of the update. This allows the user to see the actual code change caused by a submodule update. Add tests for the new format and option. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31diff: prepare for additional submodule formatsLibravatar Jacob Keller1-1/+6
A future patch will add a new format for displaying the difference of a submodule. Make it easier by changing how we store the current selected format. Replace the DIFF_OPT flag with an enumeration, as each format will be mutually exclusive. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31graph: add support for --line-prefix on all graph-aware outputLibravatar Jacob Keller1-0/+2
Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31diff.c: remove output_prefix_length fieldLibravatar Junio C Hamano1-1/+0
"diff/log --stat" has a logic that determines the display columns available for the diffstat part of the output and apportions it for pathnames and diffstat graph automatically. 5e71a84a (Add output_prefix_length to diff_options, 2012-04-16) added the output_prefix_length field to diff_options structure to allow this logic to subtract the display columns used for the history graph part from the total "terminal width"; this matters when the "git log --graph -p" option is in use. The field must be set to the number of display columns needed to show the output from the output_prefix() callback, which is error prone. As there is only one user of the field, and the user has the actual value of the prefix string, let's get rid of the field and have the user count the display width itself. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-29patch-ids: add flag to create the diff patch id using header only dataLibravatar Kevin Willford1-1/+1
This will allow a diff patch id to be created using only the header data so that the contents of the file will not have to be loaded. Signed-off-by: Kevin Willford <kcwillford@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-03Merge branch 'mm/diff-renames-default'Libravatar Junio C Hamano1-0/+1
The end-user facing Porcelain level commands like "diff" and "log" now enables the rename detection by default. * mm/diff-renames-default: diff: activate diff.renames by default log: introduce init_log_defaults() t: add tests for diff.renames (true/false/unset) t4001-diff-rename: wrap file creations in a test Documentation/diff-config: fix description of diff.renames
2016-02-26Merge branch 'jk/tighten-alloc'Libravatar Junio C Hamano1-2/+2
Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...
2016-02-26Merge branch 'jk/more-comments-on-textconv'Libravatar Junio C Hamano1-0/+16
The memory ownership rule of fill_textconv() API, which was a bit tricky, has been documented a bit better. * jk/more-comments-on-textconv: diff: clarify textconv interface
2016-02-25diff: activate diff.renames by defaultLibravatar Matthieu Moy1-0/+1
Rename detection is a very convenient feature, and new users shouldn't have to dig in the documentation to benefit from it. Potential objections to activating rename detection are that it sometimes fail, and it is sometimes slow. But rename detection is already activated by default in several cases like "git status" and "git merge", so activating diff.renames does not fundamentally change the situation. When the rename detection fails, it now fails consistently between "git diff" and "git status". This setting does not affect plumbing commands, hence well-written scripts will not be affected. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22diff: clarify textconv interfaceLibravatar Jeff King1-0/+16
The memory allocation scheme for the textconv interface is a bit tricky, and not well documented. It was originally designed as an internal part of diff.c (matching fill_mmfile), but gradually was made public. Refactoring it is difficult, but we can at least improve the situation by documenting the intended flow and enforcing it with an in-code assertion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-19tree-diff: catch integer overflow in combine_diff_path allocationLibravatar Jeff King1-2/+2
A combine_diff_path struct has two "flex" members allocated alongside the struct: a string to hold the pathname, and an array of parent pointers. We use an "int" to compute this, meaning we may easily overflow it if the pathname is extremely long. We can fix this by using size_t, and checking for overflow with the st_add helper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-05Merge branch 'nd/diff-with-path-params' into maintLibravatar Junio C Hamano1-2/+2
A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument
2016-02-03Merge branch 'nd/diff-with-path-params'Libravatar Junio C Hamano1-2/+2
A few options of "git diff" did not work well when the command was run from a subdirectory. * nd/diff-with-path-params: diff: make -O and --output work in subdirectory diff-no-index: do not take a redundant prefix argument
2016-01-21diff: make -O and --output work in subdirectoryLibravatar Duy Nguyen1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-21diff-no-index: do not take a redundant prefix argumentLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Prefix is already set up in "revs". The same prefix should be used for all options parsing. So kill the last argument. This patch does not actually change anything because the only caller does use the same prefix for init_revisions() and diff_no_index(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-04avoid shifting signed integers 31 bitsLibravatar Jeff King1-1/+1
We sometimes use 32-bit unsigned integers as bit-fields. It's fine to access the MSB, because it's unsigned. However, doing so as "1 << 31" is wrong, because the constant "1" is a signed int, and we shift into the sign bit, causing undefined behavior. We can fix this by using "1U" as the constant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-09log: add "log.follow" configuration variableLibravatar David Turner1-0/+1
People who work on projects with mostly linear history with frequent whole file renames may want to always use "git log --follow" when inspecting the life of the content that live in a single path. Teach the command to behave as if "--follow" was given from the command line when log.follow configuration variable is set *and* there is one (and only one) path on the command line. Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-11Merge branch 'jk/color-diff-plain-is-context'Libravatar Junio C Hamano1-1/+1
"color.diff.plain" was a misnomer; give it 'color.diff.context' as a more logical synonym. * jk/color-diff-plain-is-context: diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT diff: accept color.diff.context as a synonym for "plain"
2015-06-11Merge branch 'jc/diff-ws-error-highlight'Libravatar Junio C Hamano1-0/+5
Allow whitespace breakages in deleted and context lines to be also painted in the output. * jc/diff-ws-error-highlight: diff.c: --ws-error-highlight=<kind> option diff.c: add emit_del_line() and emit_context_line() t4015: separate common setup and per-test expectation t4015: modernise style
2015-05-27diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXTLibravatar Jeff King1-1/+1
The latter is a much more descriptive name (and we support "color.diff.context" now). This also updates the name of any local variables which were used to store the color. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-26diff.c: --ws-error-highlight=<kind> optionLibravatar Junio C Hamano1-0/+5
Traditionally, we only cared about whitespace breakages introduced in new lines. Some people want to paint whitespace breakages on old lines, too. When they see a whitespace breakage on a new line, they can spot the same kind of whitespace breakage on the corresponding old line and want to say "Ah, those breakages are there but they were inherited from the original, so let's not touch them for now." Introduce `--ws-error-highlight=<kind>` option, that lets them pass a comma separated list of `old`, `new`, and `context` to specify what lines to highlight whitespace errors on. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13diff: convert struct combine_diff_path to object_idLibravatar brian m. carlson1-2/+3
Also, convert a constant to GIT_SHA1_HEXSZ. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-03Merge branch 'ks/tree-diff-nway'Libravatar Junio C Hamano1-2/+9
Instead of running N pair-wise diff-trees when inspecting a N-parent merge, find the set of paths that were touched by walking N+1 trees in parallel. These set of paths can then be turned into N pair-wise diff-tree results to be processed through rename detections and such. And N=2 case nicely degenerates to the usual 2-way diff-tree, which is very nice. * ks/tree-diff-nway: mingw: activate alloca combine-diff: speed it up, by using multiparent diff tree-walker directly tree-diff: rework diff_tree() to generate diffs for multiparent cases as well Portable alloca for Git tree-diff: reuse base str(buf) memory on sub-tree recursion tree-diff: no need to call "full" diff_tree_sha1 from show_path() tree-diff: rework diff_tree interface to be sha1 based tree-diff: diff_tree() should now be static tree-diff: remove special-case diff-emitting code for empty-tree cases tree-diff: simplify tree_entry_pathcmp tree-diff: show_path prototype is not needed anymore tree-diff: rename compare_tree_entry -> tree_entry_pathcmp tree-diff: move all action-taking code out of compare_tree_entry() tree-diff: don't assume compare_tree_entry() returns -1,0,1 tree-diff: consolidate code for emitting diffs and recursion in one place tree-diff: show_tree() is not needed tree-diff: no need to pass match to skip_uninteresting() tree-diff: no need to manually verify that there is no mode change for a path combine-diff: move changed-paths scanning logic into its own function combine-diff: move show_log_first logic/action out of paths scanning
2014-04-07tree-diff: rework diff_tree() to generate diffs for multiparent cases as wellLibravatar Kirill Smelkov1-0/+9
Previously diff_tree(), which is now named ll_diff_tree_sha1(), was generating diff_filepair(s) for two trees t1 and t2, and that was usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes a commit introduces. In Git, however, we have fundamentally built flexibility in that a commit can have many parents - 1 for a plain commit, 2 for a simple merge, but also more than 2 for merging several heads at once. For merges there is a so called combine-diff, which shows diff, a merge introduces by itself, omitting changes done by any parent. That works through first finding paths, that are different to all parents, and then showing generalized diff, with separate columns for +/- for each parent. The code lives in combine-diff.c . There is an impedance mismatch, however, in that a commit could generally have any number of parents, and that while diffing trees, we divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there is no special casing for multiple parents commits in e.g. revision-walker . That impedance mismatch *hurts* *performance* *badly* for generating combined diffs - in "combine-diff: optimize combine_diff_path sets intersection" I've already removed some slowness from it, but from the timings provided there, it could be seen, that combined diffs still cost more than an order of magnitude more cpu time, compared to diff for usual commits, and that would only be an optimistic estimate, if we take into account that for e.g. linux.git there is only one merge for several dozens of plain commits. That slowness comes from the fact that currently, while generating combined diff, a lot of time is spent computing diff(commit,commit^2) just to only then intersect that huge diff to almost small set of files from diff(commit,commit^1). That's because at present, to compute combine-diff, for first finding paths, that "every parent touches", we use the following combine-diff property/definition: D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn) (w.r.t. paths) where D(A,P1...Pn) is combined diff between commit A, and parents Pi and D(A,Pi) is usual two-tree diff Pi..A So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n 1-parent diffs and intersecting results will be slow. And usually, for linux.git and other topic-based workflows, that D(A,P2) is huge, because, if merge-base of A and P2, is several dozens of merges (from A, via first parent) below, that D(A,P2) will be diffing sum of merges from several subsystems to 1 subsystem. The solution is to avoid computing n 1-parent diffs, and to find changed-to-all-parents paths via scanning A's and all Pi's trees simultaneously, at each step comparing their entries, and based on that comparison, populate paths result, and deduce we could *skip* *recursing* into subdirectories, if at least for 1 parent, sha1 of that dir tree is the same as in A. That would save us from doing significant amount of needless work. Such approach is very similar to what diff_tree() does, only there we deal with scanning only 2 trees simultaneously, and for n+1 tree, the logic is a bit more complex: D(T,P1...Pn) calculation scheme ------------------------------- D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn) (regarding resulting paths set) D(T,Pj) - diff between T..Pj D(T,P1...Pn) - combined diff from T to parents P1,...,Pn We start from all trees, which are sorted, and compare their entries in lock-step: T P1 Pn - - - |t| |p1| |pn| |-| |--| ... |--| imin = argmin(p1...pn) | | | | | | |-| |--| |--| |.| |. | |. | . . . . . . at any time there could be 3 cases: 1) t < p[imin]; 2) t > p[imin]; 3) t = p[imin]. Schematic deduction of what every case means, and what to do, follows: 1) t < p[imin] -> ∀j t ∉ Pj -> "+t" ∈ D(T,Pj) -> D += "+t"; t↓ 2) t > p[imin] 2.1) ∃j: pj > p[imin] -> "-p[imin]" ∉ D(T,Pj) -> D += ø; ∀ pi=p[imin] pi↓ 2.2) ∀i pi = p[imin] -> pi ∉ T -> "-pi" ∈ D(T,Pi) -> D += "-p[imin]"; ∀i pi↓ 3) t = p[imin] 3.1) ∃j: pj > p[imin] -> "+t" ∈ D(T,Pj) -> only pi=p[imin] remains to investigate 3.2) pi = p[imin] -> investigate δ(t,pi) | | v 3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø -> ⎧δ(t,pi) - if pi=p[imin] -> D += ⎨ ⎩"+t" - if pi>p[imin] in any case t↓ ∀ pi=p[imin] pi↓ ~ For comparison, here is how diff_tree() works: D(A,B) calculation scheme ------------------------- A B - - |a| |b| a < b -> a ∉ B -> D(A,B) += +a a↓ |-| |-| a > b -> b ∉ A -> D(A,B) += -b b↓ | | | | a = b -> investigate δ(a,b) a↓ b↓ |-| |-| |.| |.| . . . . ~~~~~~~~ This patch generalizes diff tree-walker to work with arbitrary number of parents as described above - i.e. now there is a resulting tree t, and some parents trees tp[i] i=[0..nparent). The generalization builds on the fact that usual diff D(A,B) is by definition the same as combined diff D(A,[B]), so if we could rework the code for common case and make it be not slower for nparent=1 case, usual diff(t1,t2) generation will not be slower, and multiparent diff tree-walker would greatly benefit generating combine-diff. What we do is as follows: 1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be a paths generator (new name diff_tree_paths()), with each generated path being `struct combine_diff_path` with info for path, new sha1,mode and for every parent which sha1,mode it was in it. 2) From that info, we can still generate usual diff queue with struct diff_filepairs, via "exporting" generated combine_diff_path, if we know we run for nparent=1 case. (see emit_diff() which is now named emit_diff_first_parent_only()) 3) In order for diff_can_quit_early(), which checks DIFF_OPT_TST(opt, HAS_CHANGES)) to work, that exporting have to be happening not in bulk, but incrementally, one diff path at a time. For such consumers, there is a new callback in diff_options introduced: ->pathchange(opt, struct combine_diff_path *) which, if set to !NULL, is called for every generated path. (see new compat ll_diff_tree_sha1() wrapper around new paths generator for setup) 4) The paths generation itself, is reworked from previous ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation scheme" provided above: On the start we allocate [nparent] arrays in place what was earlier just for one parent tree. then we just generalize loops, and comparison according to the algorithm. Some notes(*): 1) alloca(), for small arrays, is used for "runs not slower for nparent=1 case than before" goal - if we change it to xmalloc()/free() the timings get ~1% worse. For alloca() we use just-introduced xalloca/xalloca_free compatibility wrappers, so it should not be a portability problem. 2) For every parent tree, we need to keep a tag, whether entry from that parent equals to entry from minimal parent. For performance reasons I'm keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ. Not doing so, we'd need to alloca another [nparent] array, which hurts performance. 3) For emitted paths, memory could be reused, if we know the path was processed via callback and will not be needed later. We use efficient hand-made realloc-style path_appendnew(), that saves us from ~1-1.5% of potential additional slowdown. 4) goto(s) are used in several places, as the code executes a little bit faster with lowered register pressure. Also - we should now check for FIND_COPIES_HARDER not only when two entries names are the same, and their hashes are equal, but also for a case, when a path was removed from some of all parents having it. The reason is, if we don't, that path won't be emitted at all (see "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants all paths - with diff or without - to be emitted, to be later analyzed for being copies sources. The new check is only necessary for nparent >1, as for nparent=1 case xmin_eqtotal always =1 =nparent, and a path is always added to diff as removal. ~~~~~~~~ Timings for # without -c, i.e. testing only nparent=1 case `git log --raw --no-abbrev --no-renames` before and after the patch are as follows: navy.git linux.git v3.10..v3.11 before 0.611s 1.889s after 0.619s 1.907s slowdown 1.3% 0.9% This timings show we did no harm to usual diff(tree1,tree2) generation. From the table we can see that we actually did ~1% slowdown, but I think I've "earned" that 1% in the previous patch ("tree-diff: reuse base str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case, net timings stays approximately the same. The output also stayed the same. (*) If we revert 1)-4) to more usual techniques, for nparent=1 case, we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as "do no harm for nparent=1 case" rule. For linux.git, combined diff will run an order of magnitude faster and appropriate timings will be provided in the next commit, as we'll be taking advantage of the new diff tree-walker for combined-diff generation there. P.S. and combined diff is not some exotic/for-play-only stuff - for example for a program I write to represent Git archives as readonly filesystem, there is initial scan with `git log --reverse --raw --no-abbrev --no-renames -c` to extract log of what was created/changed when, as a result building a map {} sha1 -> in which commit (and date) a content was added that `-c` means also show combined diff for merges, and without them, if a merge is non-trivial (merges changes from two parents with both having separate changes to a file), or an evil one, the map will not be full, i.e. some valid sha1 would be absent from it. That case was my initial motivation for combined diffs speedup. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-26tree-diff: diff_tree() should now be staticLibravatar Kirill Smelkov1-2/+0
We reworked all its users to use the functionality through diff_tree_sha1 variant in recent patches (see "tree-diff: allow diff_tree_sha1 to accept NULL sha1" and what comes next). diff_tree() is now not used outside tree-diff.c - make it static. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-07Merge branch 'jc/hold-diff-remove-q-synonym-for-no-deletion'Libravatar Junio C Hamano1-2/+0
Remove a confusing and deprecated "-q" option from "git diff-files"; "git diff-files --diff-filter=d" can be used instead.
2014-02-24combine-diff: combine_diff_path.len is not needed anymoreLibravatar Kirill Smelkov1-1/+0
The field was used in order to speed-up name comparison and also to mark removed paths by setting it to 0. Because the updated code does significantly less strcmp and also just removes paths from the list and free right after we know a path will not be needed, it is not needed anymore. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-27Merge branch 'tg/diff-no-index-refactor'Libravatar Junio C Hamano1-1/+1
"git diff ../else/where/A ../else/where/B" when ../else/where is clearly outside the repository, and "git diff --no-index A B", do not have to look at the index at all, but we used to read the index unconditionally. * tg/diff-no-index-refactor: diff: avoid some nesting diff: add test for --no-index executed outside repo diff: don't read index when --no-index is given diff: move no-index detection to builtin/diff.c
2013-12-27Merge branch 'zk/difftool-counts'Libravatar Junio C Hamano1-0/+2
Show the total number of paths and the number of paths shown so far when "git difftool" prompts to launch an external diff tool, which would give users some sense of progress. * zk/difftool-counts: diff.c: fix some recent whitespace style violations difftool: display the number of files in the diff queue in the prompt
2013-12-12diff: move no-index detection to builtin/diff.cLibravatar Thomas Gummerer1-1/+1
Currently the --no-index option is parsed in diff_no_index(). Move the detection if a no-index diff should be executed to builtin/diff.c, where we can use it for executing diff_no_index() conditionally. This will also allow us to execute other operations conditionally, which will be done in the next patch. There are no functional changes. Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06difftool: display the number of files in the diff queue in the promptLibravatar Zoltan Klinger1-0/+2
When --prompt option is set, git-difftool displays a prompt for each modified file to be viewed in an external diff program. At that point, it could be useful to display a counter and the total number of files in the diff queue. Below is the current difftool prompt for the first of 5 modified files: Viewing: 'diff.c' Launch 'vimdiff' [Y/n]: Consider the modified prompt: Viewing (1/5): 'diff.c' Launch 'vimdiff' [Y/n]: The current GIT_EXTERNAL_DIFF mechanism does not tell the number of paths in the diff queue nor the current counter. To make this "counter/total" info available for GIT_EXTERNAL_DIFF programs without breaking existing ones by doing the following: - Keep track of the number of paths shown so far in diff_options; - Export two new environment variables from run_external_diff() to show the total number of paths (from diff_queue_struct) and the current value of the counter (from diff_options); and - Update git-difftool--helper to use these two environment variables. Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-31Use the word 'stuck' instead of 'sticked'Libravatar Nicolas Vigier1-1/+1
The past participle of 'stick' is 'stuck'. Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-23Merge branch 'mg/more-textconv'Libravatar Junio C Hamano1-2/+6
Make "git grep" and "git show" pay attention to --textconv when dealing with blob objects. * mg/more-textconv: grep: honor --textconv for the case rev:path grep: allow to use textconv filters t7008: demonstrate behavior of grep with textconv cat-file: do not die on --textconv without textconv filters show: honor --textconv for blobs diff_opt: track whether flags have been set explicitly t4030: demonstrate behavior of show with textconv
2013-09-09Merge branch 'jl/submodule-mv'Libravatar Junio C Hamano1-2/+1
"git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...
2013-07-19diff: remove "diff-files -q" in a version of Git in a distant futureLibravatar Junio C Hamano1-2/+0
This was inherited from "show-diff -q" that was invented to tell comparison between the index and the working tree to ignore only removals in 2005. These days, it is spelled as "--diff-filter=d". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-19diff: deprecate -q option to diff-filesLibravatar Junio C Hamano1-0/+2
This reimplements the ancient "-q" option to "git diff-files" that was inherited from "show-diff -q" in terms of "--diff-filter=d". We will be deprecating the "-q" option, so let's issue a warning when we do so. Incidentally this also tentatively fixes "git diff --no-index" to honor "-q" and hide deletions; the use will get the same warning. We should remove the support for "-q" in a future version but it is not that urgent. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17diff: preparse --diff-filter string argumentLibravatar Junio C Hamano1-1/+4
Instead of running strchr() on the list of status characters over and over again, parse the --diff-filter option into bitfields and use the bits to see if the change to the filepair matches the status requested. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15remove diff_tree_{setup,release}_pathsLibravatar Nguyễn Thái Ngọc Duy1-2/+0
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>