Age | Commit message (Collapse) | Author | Files | Lines |
|
The implementations in diff.c to detect moved lines needs to compare
strings and hash strings, which is implemented in that file, as well
as in the xdiff library.
Remove the rather recent implementation in diff.c and rely on the well
exercised code in the xdiff lib.
With this change the hash used for bucketing the strings for the moved
line detection changes from FNV32 (that is provided via the hashmaps
memhash) to DJB2 (which is used internally in xdiff). Benchmarks found
on the web[1] do not indicate that these hashes are different in
performance for readable strings.
[1] https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This will turn out to be useful in a later patch.
xdl_recmatch is exported in xdiff/xutils.h, to be used by various
xdiff/*.c files, but not outside of xdiff/. This one makes it available
to the outside, too.
While at it, add documentation.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For computing moved lines, we feed the characters of each
line into a hash. When we've been asked to ignore
whitespace, then we pick each character using next_byte(),
which returns -1 on end-of-string, which it determines using
the start/end pointers we feed it.
However our check of its return value treats "0" the same as
"-1", meaning we'd quit if the string has an embedded NUL.
This is unlikely to ever come up in practice since our line
boundaries generally come from calling strlen() in the first
place.
But it was a bit surprising to me as a reader of the
next_byte() code. And it's possible that we may one day feed
this function with more exotic input, which otherwise works
with arbitrary ptr/len pairs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The code for handling whitespace with --color-moved
represents partial strings as a pair of pointers. There are
two possible conventions for the end pointer:
1. It points to the byte right after the end of the
string.
2. It points to the final byte of the string.
But we seem to use both conventions in the code:
a. we assign the initial pointers from the NUL-terminated
string using (1)
b. we eat trailing whitespace by checking the second
pointer for isspace(), which needs (2)
c. the next_byte() function checks for end-of-string with
"if (cp > endp)", which is (2)
d. in next_byte() we skip past internal whitespace with
"while (cp < end)", which is (1)
This creates fewer bugs than you might think, because there
are some subtle interactions. Because of (a) and (c), we
always return the NUL-terminator from next_byte(). But all
of the callers of next_byte() happen to handle that
gracefully.
Because of the mismatch between (d) and (c), next_byte()
could accidentally return a whitespace character right at
endp. But because of the interaction of (a) and (b), we fail
to actually chomp trailing whitespace, meaning our endp
_always_ points to a NUL, canceling out the problem.
But that does leave (b) as a real bug: when ignoring
whitespace only at the end-of-line, we don't correctly trim
it, and fail to match up lines.
We can fix the whole thing by moving consistently to one
convention. Since convention (1) is idiomatic in our code
base, we'll pick that one.
The existing "-w" and "-b" tests continue to pass, and a new
"--ignore-space-at-eol" shows off the breakage we're fixing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit fa5ba2c1dd (diff: fix infinite loop with
--color-moved --ignore-space-change, 2017-10-12) added a
test to make sure that "--color-moved -b" doesn't run
forever, but the test in question doesn't actually have any
moved lines in it.
Let's scrap that test and add a variant of the existing
"--color-moved -w" test, but this time we'll check that we
find the move with whitespace changes, but not arbitrary
whitespace additions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We test that lines with whitespace changes are not found by
"--color-moved" by default, but are found if "-w" is added.
Let's add one more twist: a line that has non-whitespace
changes should not be marked as a pure move.
This is perhaps an obvious case for us to get right (and we
do), but as we add more whitespace tests, they will form a
pattern of "make sure this case is a move and this other
case is not".
Note that we have to add a line to our moved block, since
having a too-small block doesn't trigger the "moved"
heuristics. And we also add a line of context to ensure
that there's more context lines than moved lines (so the
diff shows us moving the lines up, rather than moving the
context down).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In preparation for testing several different whitespace
options, let's split out the setup and cleanup steps of the
whitespace test.
While we're here, let's also switch to using "<<-" to indent
our here-documents properly, and use q_to_tab to more
explicitly mark where we expect whitespace to appear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Doc update.
* jc/branch-force-doc-readability-fix:
branch doc: sprinkle a few commas for readability
|
|
Update the documentation for "git filter-branch" so that the filter
options are listed in the same order as they are applied, as
described in an earlier part of the doc.
* dg/filter-branch-filter-order-doc:
doc: list filter-branch subdirectory-filter first
|
|
"git fetch <there> <src>:<dst>" allows an object name on the <src>
side when the other side accepts such a request since Git v2.5, but
the documentation was left stale.
* jc/fetch-refspec-doc-update:
fetch doc: src side of refspec could be full SHA-1
|
|
Doc updates.
* wk/merge-options-gpg-sign-doc:
Documentation/merge-options.txt: describe -S/--gpg-sign for 'pull'
|
|
* maint:
Prepare for 2.14.3
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is the "theoretically more correct" approach of simply
stepping back to the state before plumbing commands started paying
attention to "color.ui" configuration variable.
* jk/ref-filter-colors-fix:
tag: respect color.ui config
Revert "color: check color.ui in git_default_config()"
Revert "t6006: drop "always" color config tests"
Revert "color: make "always" the same as "auto" in config"
color: make "always" the same as "auto" in config
provide --color option for all ref-filter users
t3205: use --color instead of color.branch=always
t3203: drop "always" color test
t6006: drop "always" color config tests
t7502: use diff.noprefix for --verbose test
t7508: use test_terminal for color output
t3701: use test-terminal to collect color output
t4015: prefer --color to -c color.diff=always
test-terminal: set TERM=vt100
|
|
Doc update.
* jc/doc-checkout:
checkout doc: clarify command line args for "checkout paths" mode
|
|
Docfix.
* tb/complete-describe:
completion: add --broken and --dirty to describe
|
|
* rs/rs-mailmap:
.mailmap: normalize name for René Scharfe
|
|
Improve behaviour of "git fsck" upon finding a missing object.
* rs/fsck-null-return-from-lookup:
fsck: handle NULL return of lookup_blob() and lookup_tree()
|
|
Leakfix and futureproofing.
* jk/sha1-loose-object-info-fix:
sha1_loose_object_info: handle errors from unpack_sha1_rest
|
|
* sb/branch-avoid-repeated-strbuf-release:
branch: reset instead of release a strbuf
|
|
* rs/qsort-s:
test-stringlist: avoid buffer underrun when sorting nothing
|
|
* jn/strbuf-doc-re-reuse:
strbuf doc: reuse after strbuf_release is fine
|
|
Code clean-up.
* rs/run-command-use-alloc-array:
run-command: use ALLOC_ARRAY
|
|
Code clean-up.
* rs/tag-null-pointer-arith-fix:
tag: avoid NULL pointer arithmetic
|
|
Code clean-up.
* rs/cocci-de-paren-call-params:
coccinelle: remove parentheses that become unnecessary
|
|
Docfix.
* ad/doc-markup-fix:
doc: correct command formatting
|
|
Doc updates.
* mr/doc-negative-pathspec:
docs: improve discoverability of exclude pathspec
|
|
Code clean-up.
* jk/validate-headref-fix:
validate_headref: use get_oid_hex for detached HEADs
validate_headref: use skip_prefix for symref parsing
validate_headref: NUL-terminate HEAD buffer
|
|
Doc update.
* ks/doc-use-camelcase-for-config-name:
doc: camelCase the config variables to improve readability
|
|
A docfix.
* jk/doc-read-tree-table-asciidoctor-fix:
doc: put literal block delimiter around table
|
|
* hn/typofix:
submodule.h: typofix
|
|
Doc updates.
* ks/test-readme-phrasofix:
t/README: fix typo and grammatically improve a sentence
|
|
Typofix.
* ez/doc-duplicated-words-fix:
doc: fix minor typos (extra/duplicated words)
|
|
Doc update.
* kd/doc-for-each-ref:
doc/for-each-ref: explicitly specify option names
doc/for-each-ref: consistently use '=' to between argument names and values
|
|
Finishing touches to a topic already in 'master'.
* cc/subprocess-handshake-missing-capabilities:
subprocess: loudly die when subprocess asks for an unsupported capability
|
|
Code clean-up.
* jk/system-path-cleanup:
git_extract_argv0_path: do nothing without RUNTIME_PREFIX
system_path: move RUNTIME_PREFIX to a sub-function
|
|
Doc update.
* bb/doc-eol-dirty:
Documentation: mention that `eol` can change the dirty status of paths
|
|
A mismerge fix.
* mg/timestamp-t-fix:
name-rev: change ULONG_MAX to TIME_MAX
|
|
A leakfix.
* ma/pkt-line-leakfix:
pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
|
|
A leakfix.
* jk/config-lockfile-leak-fix:
config: use a static lock_file struct
|
|
Build clean-up.
* dw/diff-highlight-makefile-fix:
diff-highlight: add clean target to Makefile
|
|
Code clean-up.
* jk/drop-sha1-entry-pos:
sha1-lookup: remove sha1_entry_pos() from header file
sha1_file: drop experimental GIT_USE_LOOKUP search
|
|
In the "--format=..." option of the "git for-each-ref" command (and
its friends, i.e. the listing mode of "git branch/tag"), "%(atom:)"
(e.g. "%(refname:)", "%(body:)" used to error out. Instead, treat
them as if the colon and an empty string that follows it were not
there.
* tb/ref-filter-empty-modifier:
ref-filter.c: pass empty-string as NULL to atom parsers
|
|
Backports a moral equivalent of 2015 fix to the poll emulation from
the upstream gnulib to fix occasional breakages on HPE NonStop.
* rb/compat-poll-fix:
poll.c: always set revents, even if to zero
|
|
Fixes for a handful memory access issues identified by valgrind.
* tg/memfixes:
sub-process: use child_process.args instead of child_process.argv
http-push: fix construction of hex value from path
path.c: fix uninitialized memory access
|
|
Spell the name of our system as "Git" in the output from
request-pull script.
* ar/request-pull-phrasofix:
request-pull: capitalise "Git" to make it a proper noun
|
|
The documentation for '-X<option>' for merges was misleadingly
written to suggest that "-s theirs" exists, which is not the case.
* jc/merge-x-theirs-docfix:
merge-strategies: avoid implying that "-s theirs" exists
|
|
"git mailinfo" was loose in decoding quoted printable and produced
garbage when the two letters after the equal sign are not
hexadecimal. This has been fixed.
* rs/mailinfo-qp-decode-fix:
mailinfo: don't decode invalid =XY quoted-printable sequences
|
|
The built-in pattern to detect the "function header" for HTML did
not match <H1>..<H6> elements without any attributes, which has
been fixed.
* ik/userdiff-html-h-element-fix:
userdiff: fix HTML hunk header regexp
|