summaryrefslogtreecommitdiff
path: root/diff.c
AgeCommit message (Collapse)AuthorFilesLines
2005-05-24[PATCH] Redo rename/copy detection logic.Libravatar Junio C Hamano1-19/+98
Earlier implementation had a major screw-up in the memory management area. Rename/copy logic sometimes borrowed a pointer to a structure without any provision for downstream to determine which pointer is shared and which is not. This resulted in the later clean-up code to sometimes double free such structure, resulting in a segfault. This made -M and -C useless. Another problem the earlier implementation had was that it reordered the patches, and forced the logic to differentiate renames and copies to depend on that particular order. This problem was fixed by teaching rename/copy detection logic not to do any reordering, and rename-copy differentiator not to depend on the order of the patches. The diffs will leave rename/copy detector in the same destination path order as the patch that was fed into it. Some test vectors have been reordered to accommodate this change. It also adds a sanity check logic to the human-readable diff-raw output to detect paths with embedded TAB and LF characters, which cannot be expressed with that format. This idea came up during a discussion with Chris Wedgwood. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23[PATCH] Fix diff-pruning logic which was running prune too early.Libravatar Junio C Hamano1-92/+64
For later stages to reorder patches, pruning logic and rename detection logic should not decide which delete to discard (because another entry said it will take over the file as a rename) until the very end. Also fix some tests that were assuming the earlier "last one is rename or keep everything else is copy" semantics of diff-raw format, which no longer is true. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23[PATCH] diff-raw format update take #2.Libravatar Junio C Hamano1-48/+76
This changes the diff-raw format again, following the mailing list discussion. The new format explicitly expresses which one is a rename and which one is a copy. The documentation and tests are updated to match this change. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-23[PATCH] Rename/copy detection fix.Libravatar Junio C Hamano1-47/+98
The rename/copy detection logic in earlier round was only good enough to show patch output and discussion on the mailing list about the diff-raw format updates revealed many problems with it. This patch fixes all the ones known to me, without making things I want to do later impossible, mostly related to patch reordering. (1) Earlier rename/copy detector determined which one is rename and which one is copy too early, which made it impossible to later introduce diffcore transformers to reorder patches. This patch fixes it by moving that logic to the very end of the processing. (2) Earlier output routine diff_flush() was pruning all the "no-change" entries indiscriminatingly. This was done due to my false assumption that one of the requirements in the diff-raw output was not to show such an entry (which resulted in my incorrect comment about "diff-helper never being able to be equivalent to built-in diff driver"). My special thanks go to Linus for correcting me about this. When we produce diff-raw output, for the downstream to be able to tell renames from copies, sometimes it _is_ necessary to output "no-change" entries, and this patch adds diffcore_prune() function for doing it. (3) Earlier diff_filepair structure was trying to be not too specific about rename/copy operations, but the purpose of the structure was to record one or two paths, which _was_ indeed about rename/copy. This patch discards xfrm_msg field which was trying to be generic for this wrong reason, and introduces a couple of fields (rename_score and rename_rank) that are explicitly specific to rename/copy logic. One thing to note is that the information in a single diff_filepair structure _still_ does not distinguish renames from copies, and it is deliberately so. This is to allow patches to be reordered in later stages. (4) This patch also adds some tests about diff-raw format output and makes sure that necessary "no-change" entries appear on the output. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-22Some more sparse warning fixesLibravatar Linus Torvalds1-4/+4
Proper function declarations and NULL pointer usage.
2005-05-22Include file cleanups..Libravatar Linus Torvalds1-1/+0
Add <limits.h> to the include files handled by "cache.h", and remove extraneous #include directives from various .c files. The rule is that "cache.h" gets all the basic stuff, so that we'll have as few system dependencies as possible.
2005-05-22[PATCH] Diffcore updates.Libravatar Junio C Hamano1-82/+90
This moves the path selection logic from individual programs to a new diffcore transformer (diff-tree still needs to have its own for performance reasons). Also the header printing code in diff-tree was tweaked not to produce anything when pickaxe is in effect and there is nothing interesting to report. An interesting example is the following in the GIT archive itself: $ git-whatchanged -p -C -S'or something in a real script' Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21[PATCH] The diff-raw format updates.Libravatar Junio C Hamano1-58/+74
Update the diff-raw format as Linus and I discussed, except that it does not use sequence of underscore '_' letters to express nonexistence. All '0' mode is used for that purpose instead. The new diff-raw format can express rename/copy, and the earlier restriction that -M and -C _must_ be used with the patch format output is no longer necessary. The patch makes -M and -C flags independent of -p flag, so you need to say git-whatchanged -M -p to get the diff/patch format. Updated are both documentations and tests. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21[PATCH] Prepare diffcore interface for diff-tree header supression.Libravatar Junio C Hamano1-21/+22
This does not actually supress the extra headers when pickaxe is used, but prepares enough support for diff-tree to implement it. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21[PATCH] Constness fix for pickaxe option.Libravatar Junio C Hamano1-1/+1
Constness fix for pickaxe option. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-21[PATCH] Introducing software archaeologist's tool "pickaxe".Libravatar Junio C Hamano1-9/+14
This steals the "pickaxe" feature from JIT and make it available to the bare Plumbing layer. From the command line, the user gives a string he is intersted in. Using the diff-core infrastructure previously introduced, it filters the differences to limit the output only to the diffs between <src> and <dst> where the string appears only in one but not in the other. For example: $ ./git-rev-list HEAD | ./git-diff-tree -Sdiff-tree-helper --stdin -M would show the diffs that touch the string "diff-tree-helper". In real software-archaeologist application, you would typically look for a few to several lines of code and see where that code came from. The "pickaxe" module runs after "rename/copy detection" module, so it even crosses the file rename boundary, as the above example demonstrates. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-21[PATCH] Diff overhaul, adding half of copy detection.Libravatar Junio C Hamano1-413/+286
This introduces the diff-core, the layer between the diff-tree family and the external diff interface engine. The calls to the interface diff-tree family uses (diff_change and diff_addremove) have not changed and will not change. The purpose of the diff-core layer is to provide an infrastructure to transform the set of differences sent from the applications, before sending them to the external diff interface. The recently introduced rename detection code has been rewritten to use the diff-core facility. When applications send in separate creates and deletes, matching ones are transformed into a single rename-and-edit diff, and sent out to the external diff interface as such. This patch also enhances the rename detection code further to be able to detect copies. Currently this happens only as long as copy sources appear as part of the modified files, but there already is enough provision for callers to report unmodified files to diff-core, so that they can be also used as copy source candidates. Extending the callers this way will be done in a separate patch. Please see and marvel at how well this works by trying out the newly added t/t4003-diff-rename-1.sh test script. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-20sparse cleanupLibravatar Linus Torvalds1-3/+3
Fix various things that sparse complains about: - use NULL instead of 0 - make sure we declare everything properly, or mark it static - use proper function declarations ("fn(void)" instead of "fn()") Sparse is always right.
2005-05-20[PATCH] Simplify "reverse-diff" logic in the diff core.Libravatar Junio C Hamano1-23/+15
Instead of swapping the arguments just before output, this patch makes the swapping happen on the input side of the diff core, when "reverse-diff" is in effect. This greatly simplifies the logic, but more importantly it is necessary for upcoming "copy detection" work. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19[PATCH] diff overhaulLibravatar Junio C Hamano1-29/+113
This cleans up the way calls are made into the diff core from diff-tree family and diff-helper. Earlier, these programs had "if (generating_patch)" sprinkled all over the place, but those ugliness are gone and handled uniformly from the diff core, even when not generating patch format. This also allowed diff-cache and diff-files to acquire -R (reverse) option to generate diff in reverse. Users of diff-tree can swap two trees easily so I did not add -R there. [ Linus' note: I'll add -R to "diff-tree" too, since a "commit diff" doesn't have another tree to switch around: the other tree is always the parent(s) of the commit ] Also -M<digits-as-mantissa> suggestion made by Linus has been implemented. Documentation updates are also included. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19[PATCH] Declare stacked variables before the first statement.Libravatar Thomas Glanzmann1-2/+2
Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19[PATCH] Detect renames in diff family.Libravatar Junio C Hamano1-4/+17
A bit of clean-up of diff.c which fixes up some comments and removes a memory leak. This also re-introduces the rename score debugging fprintf(), but leaves it #idef'ed it out for normal use. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-19diff.c: remove left-over scoring debug messageLibravatar Linus Torvalds1-3/+0
It may be wonderful for rating the scoring, but it's not appropriate for actual use ;)
2005-05-19[PATCH] Detect renames in diff family.Libravatar Junio C Hamano1-32/+378
This rips out the rename detection engine from diff-helper and moves it to the diff core, and updates the internal calling convention used by diff-tree family into the diff core. In order to give the same option name to diff-tree family as well as to diff-helper, I've changed the earlier diff-helper '-r' option to '-M' (stands for Move; sorry but the natural abbreviation 'r' for 'rename' is already taken for 'recursive'). Although I did a fair amount of test with the git-diff-tree with existing rename commits in the core GIT repository, this should still be considered beta (preview) release. This patch depends on the diff-delta infrastructure just committed. This implements almost everything I wanted to see in this series of patch, except a few minor cleanups in the calling convention into diff core, but that will be a separate cleanup patch. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-18[PATCH] Diff-helper updateLibravatar Junio C Hamano1-12/+22
This patch adds a framework and a stub implementation of rename detection to diff-helper program. The current stub code is just enough to detect pure renames in diff-tree output and not fancier. The plan is perhaps to use the same delta code when Nico's delta storage patch is merged for similarity evaluation purposes. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-18[PATCH] Fix diff output take #4.Libravatar Junio C Hamano1-7/+7
This implements the output format suggested by Linus in <Pine.LNX.4.58.0505161556260.18337@ppc970.osdl.org>, except the imaginary diff option is spelled "diff --git" with double dashes as suggested by Matthias Urlichs. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-15Rename cache_match_stat() to ce_match_stat()Libravatar Brad Roberts1-1/+1
Signed-off-by: Brad Roberts <braddr@puremagic.com> Signed-off-by: Petr Baudis <pasky@ucw.cz>
2005-05-15[PATCH 1/3] Update mode-change strings in diff output.Libravatar Junio C Hamano1-10/+12
This updates the mode change strings to be a bit more machine friendly. Although this might go against the spirit of readability for human consumption, these mode bits strings are shown only when unusual things (mode change, file creation and deletion) happens, output normalized for machine consumption would be permissible. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Petr Baudis <pasky@ucw.cz>
2005-05-09Rename environment variables.Libravatar Junio C Hamano1-5/+5
H. Peter Anvin mentioned that using SHA1_whatever as an environment variable name is not nice and we should instead use names starting with "GIT_" prefix to avoid conflicts. Here is what this patch does: * Renames the following environment variables: New name Old Name GIT_AUTHOR_DATE AUTHOR_DATE GIT_AUTHOR_EMAIL AUTHOR_EMAIL GIT_AUTHOR_NAME AUTHOR_NAME GIT_COMMITTER_EMAIL COMMIT_AUTHOR_EMAIL GIT_COMMITTER_NAME COMMIT_AUTHOR_NAME GIT_ALTERNATE_OBJECT_DIRECTORIES SHA1_FILE_DIRECTORIES GIT_OBJECT_DIRECTORY SHA1_FILE_DIRECTORY * Introduces a compatibility macro, gitenv(), which does an getenv() and if it fails calls gitenv_bc(), which in turn picks up the value from old name while giving a warning about using an old name. * Changes all users of the environment variable to fetch environment variable with the new name using gitenv(). * Updates the documentation and scripts shipped with Linus GIT distribution. The transition plan is as follows: * We will keep the backward compatibility list used by gitenv() for now, so the current scripts and user environments continue to work as before. The users will get warnings when they have old name but not new name in their environment to the stderr. * The Porcelain layers should start using new names. However, just in case it ends up calling old Plumbing layer implementation, they should also export old names, taking values from the corresponding new names, during the transition period. * After a transition period, we would drop the compatibility support and drop gitenv(). Revert the callers to directly call getenv() but keep using the new names. The last part is probably optional and the transition duration needs to be set to a reasonable value. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-07[PATCH] Add #include <limits.h> so that git compiles under SolarisLibravatar Thomas Glanzmann1-0/+1
<JC> Editorial Note. We may want to include standard headers in one of those headers everybody includes, e.g. cache.h, to reduce clutters, but this commit is as Thomas posted to the GIT list. Date: Sat, 7 May 2005 10:41:41 +0200 Signed-off-by: Thomas Glanzmann <sithglan@stud.uni-erlangen.de> Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-05Update diff engine for symlinks stored in the cache.Libravatar Junio C Hamano1-22/+56
This patch updates the external diff interface engine for the change to store the symbolic links in the cache, recently done by Kay Sievers. The main thing it does is when comparing with the work tree, it prepares the counterpart to the blob being compared by doing a readlink followed by sending that result to a temporary file to be diffed. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-04Optimize diff-cache -p --cachedLibravatar Junio C Hamano1-13/+54
This patch optimizes "diff-cache -p --cached" by avoiding to inflate blobs into temporary files when the blob recorded in the cache matches the corresponding file in the work tree. The file in the work tree is passed as the comparison source in such a case instead. This optimization kicks in only when we have already read the cache this optimization and this is deliberate. Especially, diff-tree does not use this code, because changes are contained in small number of files relative to the project size most of the time, and reading cache is so expensive for a large project that the cost of reading it outweighs the savings by not inflating blobs. Also this patch cleans up the structure passed from diff clients by removing one unused structure member. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-04Terminate diff-* on non-zero exit from GIT_EXTERNAL_DIFFLibravatar Junio C Hamano1-8/+12
(slightly updated from the version posted to the GIT mailing list with small bugfixes). This patch changes the git-apply-patch-script to exit non-zero when the patch cannot be applied. Previously, the external diff driver deliberately ignored the exit status of GIT_EXTERNAL_DIFF command, which was a design mistake. It now stops the processing when GIT_EXTERNAL_DIFF exits non-zero, so the damages from running git-diff-* with git-apply-patch-script between two wrong trees can be contained. The "diff" command line generated by the built-in driver is changed to always exit 0 in order to match this new behaviour. I know Pasky does not use GIT_EXTERNAL_DIFF yet, so this change should not break Cogito, either. Signed-off-by: Junio C Hamano <junkio@cox.net>
2005-05-01Change the prefix for builtin diff generation.Libravatar Linus Torvalds1-1/+1
It's silly, and it shouldn't matter, but every time I look at the diffs, I ended up just worrying why "l/" and "k/" as the prefixes. Junio says it's a tribute to linux-kernel, but graciously also said I can change it to something else. So make it "a/" and "b/" until somebody else complains ;)
2005-05-01[PATCH] Rework built-in diff to make its output more dense.Libravatar Junio C Hamano1-11/+15
Linus says, The fewer lines there are that don't usually tell a human anything, the better. Dense is good. This patch makes the default diff output more dense. This removes the previous misguided attempt to be cg-patch compatible. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-28[PATCH] diff.c: clean temporary filesLibravatar Junio C Hamano1-5/+24
When diff-cache -p and friends are interrupted, they can leave their temporary files behind. Also when the external diff program is killed instead of exiting (this usually happens when piping the output to a pager, which can cause SIGPIPE when the user quits viewing the diff early), they incorrectly died without cleaning their temporary file. This fixes these problems. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-28[PATCH] Make diff-cache and friends output more cg-patch friendly.Libravatar Junio C Hamano1-20/+36
This changes the way the default arguments to diff are built when diff-cache and friends are invoked with -p and there is no GIT_EXTERNAL_DIFF environment variable. It attempts to be more cg-patch friendly by: - Showing diffs against /dev/null to denote added or removed files; - Showing file modes for existing files as a comment after the diff label. Unfortunately with this change GIT_DIFF_CMD customization cannot be supported easily anymore, so it has been dropped. GIT_DIFF_OPTS customization to change diffs from unified to context is still there, though. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-27diff.c: don't add extra '/' to pathnameLibravatar Linus Torvalds1-2/+0
The "base" string already contains any finishing "/", so the way to get the full pathname is to just concatenate the base and path directly, with no extra slashes in between.
2005-04-27[PATCH] Reworked external diff interface.Libravatar Junio C Hamano1-35/+58
This introduces three public functions for diff-cache and friends can use to call out to the GIT_EXTERNAL_DIFF program when they wish to. A normal "add/remove/change" entry is turned into 7-parameter process invocation of GIT_EXTERNAL_DIFF program as before. In addition, the program can now be called with a single parameter when diff-cache and friends want to report an unmerged path. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-26[PATCH] introduce xmalloc and xreallocLibravatar Christopher Li1-3/+2
Introduce xmalloc and xrealloc to die gracefully with a descriptive message when out of memory, rather than taking a SIGSEGV. Signed-off-by: Christopher Li<chrislgit@chrisli.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-26[PATCH] Diff-tree-helper take two.Libravatar Junio C Hamano1-43/+205
This reworks the diff-tree-helper and show-diff to further make external diff command interface simpler. These commands now honor GIT_EXTERNAL_DIFF environment variable which can point at an arbitrary program that takes 7 parameters: name file1 file1-sha1 file1-mode file2 file2-sha1 file2-mode The parameters for an external diff command are as follows: name this invocation of the command is to emit diff for the named cache/tree entry. file1 pathname that holds the contents of the first file. This can be a file inside the working tree, or a temporary file created from the blob object, or /dev/null. The command should not attempt to unlink it -- the temporary is unlinked by the caller. file1-sha1 sha1 hash if file1 is a blob object, or "." otherwise. file1-mode mode bits for file1, or "." for a deleted file. If GIT_EXTERNAL_DIFF environment variable is not set, the default is to invoke diff with the set of parameters old show-diff used to use. This built-in implementation honors the GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as before. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-25[PATCH] Split external diff command interface to a separate file.Libravatar Junio C Hamano1-0/+106
With this patch, the non-core'ish part of show-diff command that invokes an external "diff" comand to obtain patches is split into a separate file. The next patch will introduce a new command, diff-tree-helper, which uses this common diff interface to format diff-tree and diff-cache output into a patch form. Signed-off-by: Junio C Hamano <junkio@cox.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>