summaryrefslogtreecommitdiff
path: root/contrib/diff-highlight
AgeCommit message (Collapse)AuthorFilesLines
2018-03-21diff-highlight: detect --graph by indentLibravatar Jeff King2-15/+91
This patch fixes a corner case where diff-highlight may scramble some diffs when combined with --graph. Commit 7e4ffb4c17 (diff-highlight: add support for --graph output, 2016-08-29) taught diff-highlight to skip past the graph characters at the start of each line with this regex: ($COLOR?\|$COLOR?\s+)* I.e., any series of pipes separated by and followed by arbitrary whitespace. We need to match more than just a single space because the commit in question may be indented to accommodate other parts of the graph drawing. E.g.: * commit 1234abcd | ... | diff --git ... has only a single space, but for the last commit before a fork: | | | | * | commit 1234abcd | |/ ... | | diff --git the diff lines have more spaces between the pipes and the start of the diff. However, when we soak up all of those spaces with the $GRAPH regex, we may accidentally include the leading space for a context line. That means we may consider the actual contents of a context line as part of the diff syntax. In other words, something like this: normal context line -old line +new line -this is a context line with a leading dash would cause us to see that final context line as a removal line, and we'd end up showing the hunk in the wrong order: normal context line -old line -this is a context line with a leading dash +new line Instead, let's a be a little more clever about parsing the graph. We'll look for the actual "*" line that marks the start of a commit, and record the indentation we see there. Then we can skip past that indentation when checking whether the line is a hunk header, removal, addition, etc. There is one tricky thing: the indentation in bytes may be different for various lines of the graph due to coloring. E.g., the "*" on a commit line is generally shown without color, but on the actual diff lines, it will be replaced with a colorized "|" character, adding several bytes. We work around this here by counting "visible" bytes. This is unfortunately a bit more expensive, making us about twice as slow to handle --graph output. But since this is meant to be used interactively anyway, it's tolerably fast (and the non-graph case is unaffected). One alternative would be to search for hunk header lines and use their indentation (since they'd have the same colors as the diff lines which follow). But that just opens up different corner cases. If we see: | | @@ 1,2 1,3 @@ we cannot know if this is a real diff that has been indented due to the graph, or if it's a context line that happens to look like a diff header. We can only be sure of the indent on the "*" lines, since we know those don't contain arbitrary data (technically the user could include a bunch of extra indentation via --format, but that's rare enough to disregard). Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-21diff-highlight: use flush() helper consistentlyLibravatar Jeff King1-4/+3
The current flush() helper only shows the queued diff but does not clear the queue. This is conceptually a bug, but it works because we only call it once at the end of the program. Let's teach it to clear the queue, which will let us use it in more places (one for now, but more in future patches). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-21diff-highlight: test graphs with --colorLibravatar Jeff King1-0/+9
Our tests send git's output directly to files or pipes, so there will never be any color. Let's do at least one --color test to make sure that we can handle this case (which we currently can, but will be an easy thing to mess up when we touch the graph code in a future patch). We'll just cover the --graph case, since this is much more complex than the earlier cases (i.e., if it manages to highlight, then the non-graph case definitely would). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-21diff-highlight: test interleaved parallel lines of historyLibravatar Jeff King1-7/+15
The graph test in t9400 covers the case of two simultaneous branches, but all of the commits during this time are on the right-hand branch. So we test a graph structure like: | | | * commit ... | | but we never see the reverse, a commit on the left-hand branch: | | * | commit ... | | Since this is an easy thing to get wrong when touching the graph-matching code, let's cover it by adding one more commit with its timestamp interleaved with the other branch. Note that we need to pass --date-order to convince Git to show it this way (since --topo-order tries to keep lines of history separate). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-21diff-highlight: prefer "echo" to "cat" in testsLibravatar Jeff King1-8/+4
We generate a bunch of one-line files whose contents match their names, and then generate our commits by cat-ing those files. Let's just echo the contents directly, which saves some processes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-21diff-highlight: use test_tick in graph testLibravatar Jeff King1-7/+11
The exact ordering output by Git may depend on the commit timestamps, so let's make sure they're actually monotonically increasing, and not all the same (or worse, subject to how long the test script takes to run). Let's use test_tick to make sure this is stable. Note that we actually have to rearrange the order of the branches to match the expected graph structure (which means that previously we might racily have been testing a slightly different output, though the test is written in such a way that we'd still pass). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-21diff-highlight: correct test graph diagramLibravatar Jeff King1-2/+2
We actually branch "A" off of "D". The sample "--graph" output is right, but the left-to-right diagram is misleading. Let's fix it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06diff-highlight: add clean target to MakefileLibravatar Daniel Watkins1-0/+3
Now that `make` produces a file, we should have a clean target to remove it. Signed-off-by: Daniel Watkins <daniel@daniel-watkins.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15diff-highlight: split code into moduleLibravatar Jeff King5-19/+82
The diff-so-fancy project is also written in perl, and most of its users pipe diffs through both diff-highlight and diff-so-fancy. It would be nice if this could be done in a single script. So let's pull most of diff-highlight's code into its own module which can be used by diff-so-fancy. In addition, we'll abstract a few basic items like reading from stdio so that a script using the module can do more processing before or after diff-highlight handles the lines. See the README update for more details. One small downside is that the diff-highlight script must now be built using the Makefile. There are ways around this, but it quickly gets into perl arcana. Let's go with the simple solution. As a bonus, our Makefile now respects the PERL_PATH variable if it is set. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31diff-highlight: avoid highlighting combined diffsLibravatar Jeff King2-1/+38
The algorithm in diff-highlight only understands how to look at two sides of a diff; it cannot correctly handle combined diffs with multiple preimages. Often highlighting does not trigger at all for these diffs because the line counts do not match up. E.g., if we see: - ours -theirs ++resolved we would not bother highlighting; it naively looks like a single line went away, and then a separate hunk added another single line. But of course there are exceptions. E.g., if the other side deleted the line, we might see: - ours ++resolved which looks like we dropped " ours" and added "+resolved". This is only a small highlighting glitch (we highlight the space and the "+" along with the content), but it's also the tip of the iceberg. Even if we learned to find the true content here (by noticing we are in a 3-way combined diff and marking _two_ characters from the front of the line as uninteresting), there are other more complicated cases where we really do need to handle a 3-way hunk. Let's just punt for now; we can recognize combined diffs by the presence of extra "@" symbols in the hunk header, and treat them as non-diff content. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31diff-highlight: add multi-byte testsLibravatar Jeff King1-1/+35
Now that we have a test suite for diff highlight, we can show off the improvements from 8d00662 (diff-highlight: do not split multibyte characters, 2015-04-03). While we're at it, we can also add another case that _doesn't_ work: combining code points are treated as their own unit, which means that we may stick colors between them and the character they are modifying (with the result that the color is not shown in an xterm, though it's possible that other terminals err the other way, and show the color but not the accent). There's no fix here, but let's document it as a failure. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31diff-highlight: ignore test cruftLibravatar Jeff King1-0/+2
These are the same as in the normal t/.gitignore, with the exception of ".prove", as our Makefile does not support it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-29diff-highlight: add support for --graph outputLibravatar Brian Henderson2-7/+14
Signed-off-by: Brian Henderson <henderson.bj@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-29diff-highlight: add failing test for handling --graph outputLibravatar Brian Henderson1-0/+62
Signed-off-by: Brian Henderson <henderson.bj@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-29diff-highlight: add some testsLibravatar Brian Henderson3-0/+190
Signed-off-by: Brian Henderson <henderson.bj@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-14Merge branch 'jk/colors'Libravatar Junio C Hamano1-2/+7
"diff-highlight" (in contrib/) used to show byte-by-byte differences, which meant that multi-byte characters can be chopped in the middle. It learned to pay attention to character boundaries (assuming the UTF-8 payload). * jk/colors: diff-highlight: do not split multibyte characters
2015-04-04diff-highlight: do not split multibyte charactersLibravatar Kyle J. McKay1-2/+7
When the input is UTF-8 and Perl is operating on bytes instead of characters, a diff that changes one multibyte character to another that shares an initial byte sequence will result in a broken diff display as the common byte sequence prefix will be separated from the rest of the bytes in the multibyte character. For example, if a single line contains only the unicode character U+C9C4 (encoded as UTF-8 0xEC, 0xA7, 0x84) and that line is then changed to the unicode character U+C9C0 (encoded as UTF-8 0xEC, 0xA7, 0x80), when operating on bytes diff-highlight will show only the single byte change from 0x84 to 0x80 thus creating invalid UTF-8 and a broken diff display. Fix this by putting Perl into character mode when splitting the line and then back into byte mode after the split is finished. The utf8::xxx functions require Perl 5.8 so we require that as well. Also, since we are mucking with code in the split_line function, we change a '*' quantifier to a '+' quantifier when matching the $COLOR expression which has the side effect of speeding everything up while eliminating useless '' elements in the returned array. Reported-by: Yi EungJun <semtlenori@gmail.com> Signed-off-by: Kyle J. McKay <mackyle@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'jk/colors'Libravatar Junio C Hamano2-13/+90
"diff-highlight" filter (in contrib/) allows its color output to be customized via configuration variables. * jk/colors: parse_color: drop COLOR_BACKGROUND macro diff-highlight: allow configurable colors parse_color: recognize "no$foo" to clear the $foo attribute parse_color: support 24-bit RGB values parse_color: refactor color storage
2014-11-20diff-highlight: allow configurable colorsLibravatar Jeff King2-13/+90
Until now, the highlighting colors were hard-coded in the script (as "reverse" and "noreverse"), and you had to edit the script to change them. This patch teaches diff-highlight to read from color.diff-highlight.* to set them. In addition, it expands the possiblities considerably by adding two features: 1. Old/new lines can be colored independently (so you can use a color scheme that complements existing line coloring). 2. Normal, unhighlighted parts of the lines can be colored, too. Technically this can be done by separately configuring color.diff.old/new and matching it to your diff-highlight colors. But you may want a different look for your highlighted diffs versus your regular diffs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-04diff-highlight: exit when a pipe is brokenLibravatar John Szakmeister1-0/+4
While using diff-highlight with other tools, I have discovered that Python ignores SIGPIPE by default. Unfortunately, this also means that tools attempting to launch a pager under Python--and don't realize this is happening--means that the subprocess inherits this setting. In this case, it means diff-highlight will be launched with SIGPIPE being ignored. Let's work with those broken scripts by restoring the default SIGPIPE handler. Signed-off-by: John Szakmeister <john@szakmeister.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13diff-highlight: document some non-optimal casesLibravatar Jeff King1-0/+93
The diff-highlight script works on heuristics, so it can be wrong. Let's document some of the wrong-ness in case somebody feels like working on it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13diff-highlight: match multi-line hunksLibravatar Jeff King2-34/+52
Currently we only bother highlighting single-line hunks. The rationale was that the purpose of highlighting is to point out small changes between two similar lines that are otherwise hard to see. However, that meant we missed similar cases where two lines were changed together, like: -foo(buf); -bar(buf); +foo(obj->buf); +bar(obj->buf); Each of those changes is simple, and would benefit from highlighting (the "obj->" parts in this case). This patch considers whole hunks at a time. For now, we consider only the case where the hunk has the same number of removed and added lines, and assume that the lines from each segment correspond one-to-one. While this is just a heuristic, in practice it seems to generate sensible results (especially because we now omit highlighting on completely-changed lines, so when our heuristic is wrong, we tend to avoid highlighting at all). Based on an original idea and implementation by Michał Kiedrowicz. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13diff-highlight: refactor to prepare for multi-line hunksLibravatar Jeff King1-8/+14
The current code structure assumes that we will only look at a pair of lines at any given time, and that the end result should always be to output that pair. However, we want to eventually handle multi-line hunks, which will involve collating pairs of removed/added lines. Let's refactor the code to return highlighted pairs instead of printing them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13diff-highlight: don't highlight whole linesLibravatar Jeff King1-2/+26
If you have a change like: -foo +bar we end up highlighting the entirety of both lines (since the whole thing is changed). But the point of diff highlighting is to pinpoint the specific change in a pair of lines that are mostly identical. In this case, the highlighting is just noise, since there is nothing to pinpoint, and we are better off doing nothing. The implementation looks for "interesting" pairs by checking to see whether they actually have a matching prefix or suffix that does not simply consist of colorization and whitespace. However, the implementation makes it easy to plug in other heuristics, too, like: 1. Depending on the source material, the set of "boring" characters could be tweaked to include language-specific stuff (like braces or semicolons for C). 2. Instead of saying "an interesting line has at least one character of prefix or suffix", we could require that less than N percent of the line be highlighted. The simple "ignore whitespace, and highlight if there are any matched characters" implemented by this patch seems to give good results on git.git. I'll leave experimentation with other heuristics to somebody who has a dataset that does not look good with the current code. Based on an original idea and implementation by Michał Kiedrowicz. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13diff-highlight: make perl strict and warnings fatalLibravatar Jeff King1-0/+3
These perl features can catch bugs, and we shouldn't be violating any of the strict rules or creating any warnings, so let's turn them on. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-18contrib: add diff highlight scriptLibravatar Jeff King2-0/+181
This is a simple and stupid script for highlighting differing parts of lines in a unified diff. See the README for a discussion of the limitations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>