From 0324e8fc6b297c9e61745dc4e7d110780334157d Mon Sep 17 00:00:00 2001
From: Phillip Wood
Date: Tue, 4 May 2021 09:27:34 +0000
Subject: word diff: handle zero length matches

If find_word_boundaries() encounters a zero length match (which can be
caused by matching a newline or using '*' instead of '+' in the regex)
we stop splitting the input into words, which generates an inaccurate
diff. To fix this, increment the start point when there is a zero
length match and try a new match. This is safe as POSIX regular
expressions always return the longest available match, so a zero length
match means there are no longer matches available from the current
position.

Commit bf82940dbf1 (color-words: enable REG_NEWLINE to help user,
2009-01-17) prevented matching newlines in negated character classes,
but it is still possible for the user to have an explicit newline match
in the regex, which could cause a zero length match.

One could argue that having explicit newline matches or using '*'
rather than '+' are user errors, but it seems to be better to work
round them than produce inaccurate diffs.

Signed-off-by: Phillip Wood
Signed-off-by: Junio C Hamano
---
 t/t4034-diff-words.sh | 5 +++++
 1 file changed, 5 insertions(+)
 (limited to 't')

diff --git a/t/t4034-diff-words.sh b/t/t4034-diff-words.sh
index 56f1e62a97..17ceba9f61 100755
--- a/t/t4034-diff-words.sh
+++ b/t/t4034-diff-words.sh
@@ -184,6 +184,11 @@ test_expect_success 'word diff with a regular expression' '
 	word_diff --color-words="[a-z]+"
 '
 
+test_expect_success 'word diff with zero length matches' '
+	cp expect.letter-runs-are-words expect &&
+	word_diff --color-words="[a-z${LF}]*"
+'
+
 test_expect_success 'set up a diff driver' '
 	git config diff.testdriver.wordRegex "[^[:space:]]" &&
 	cat <<-\EOF >.gitattributes
-- 
cgit v1.2.3
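
The C side of this change is not part of the diff shown here (it is limited to 't'), so the following is only a minimal sketch of the technique the commit message describes, not git's actual find_word_boundaries(). The helper name count_words() is hypothetical; it uses plain POSIX regcomp()/regexec() on a NUL-terminated string and, on a zero length match, advances the start point by one byte and tries again instead of giving up.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: count the non-empty
 * matches of `pattern` in `text`. A zero length match does not end
 * the scan; the start point is moved forward one byte and a new
 * match is attempted. Returns -1 if the pattern does not compile.
 */
static int count_words(const char *pattern, const char *text)
{
	regex_t re;
	regmatch_t m;
	size_t pos = 0, len;
	int words = 0;

	assert(pattern && text);
	len = strlen(text);

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NEWLINE))
		return -1;

	while (pos < len && !regexec(&re, text + pos, 1, &m, 0)) {
		if (m.rm_so == m.rm_eo) {
			/*
			 * Zero length match: POSIX matching is
			 * leftmost-longest, so nothing longer starts
			 * here. Step past this position and retry.
			 */
			pos += m.rm_so + 1;
			continue;
		}
		words++;
		pos += m.rm_eo;	/* resume after the matched word */
	}

	regfree(&re);
	return words;
}
```

Under these assumptions, "[a-z]*" and "[a-z]+" find the same two words in "ab 12 cd", which mirrors why the new test can reuse expect.letter-runs-are-words. The sketch ignores details the real code has to handle, such as buffers that are not NUL-terminated and cutting a match at an embedded newline.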