summaryrefslogtreecommitdiff
path: root/xdiff/xhistogram.c
AgeCommit message (Collapse)AuthorFilesLines
2021-12-04xdiff: drop xpparam_t parameter from histogram cmp_recs()Libravatar Jeff King1-3/+2
Since 663c5ad035 (diff histogram: intern strings, 2021-11-17), our cmp_recs() does not call xdl_recmatch(), and thus no longer needs an xpparam_t struct from which to get the flags. We can drop the unused parameter from the function, as well as the macro which wraps it. There's no functional change here; it's just simplifying things (and making -Wunused-parameter happier). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-04xdiff: drop CMP_ENV macro from xhistogramLibravatar Jeff King1-3/+0
This macro has been unused since 43ca7530df (xdiff/xhistogram: rely on xdl_trim_ends(), 2011-08-01). The function that called it went away there, and its use in the CMP() macro was inlined. It probably should have been deleted then, but nobody noticed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-18diff histogram: intern stringsLibravatar Phillip Wood1-3/+2
Histogram is the only diff algorithm not to call xdl_classify_record(). xdl_classify_record() ensures that the hash values of two strings that are not equal differ which means that it is not necessary to use xdl_recmatch() when comparing lines, all that is necessary is to compare the hash values. This gives a 7% reduction in the runtime of "git log --patch" when using the histogram diff algorithm. Test HEAD^ HEAD ----------------------------------------------------------------------------- 4000.1: log -3000 (baseline) 0.18(0.14+0.04) 0.19(0.17+0.02) +5.6% 4000.2: log --raw -3000 (tree-only) 0.99(0.77+0.21) 0.98(0.78+0.20) -1.0% 4000.3: log -p -3000 (Myers) 4.84(4.31+0.51) 4.81(4.15+0.64) -0.6% 4000.4: log -p -3000 --histogram 6.34(5.86+0.46) 5.87(5.19+0.66) -7.4% 4000.5: log -p -3000 --patience 5.39(4.60+0.76) 5.35(4.60+0.73) -0.7% Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-20merge-base, xdiff: zero out xpparam_t structuresLibravatar Michał Kępień1-0/+2
xpparam_t structures are usually zero-initialized before their specific fields are assigned to, but there are three locations in the tree where that does not happen. Add the missing memset() calls in order to make initialization of xpparam_t structures consistent tree-wide and to prevent stack garbage from being used as field values. Signed-off-by: Michał Kępień <michal@isc.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-28xdiff: remove duplicate headers from xhistogram.cLibravatar Carlo Marcelo Arenas Belón1-2/+0
8c912eea94 ("teach --histogram to diff", 2011-07-12) included them, but were already part of xinclude.h Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23xdiff/histogram: remove tail recursionLibravatar Stefan Beller1-6/+14
When running the same reproduction script as the previous patch, it turns out the stack is too small, which can be easily avoided. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19xdiff/xhistogram: move index allocation into find_lcsLibravatar Stefan Beller1-43/+53
This fixes a memory issue when recursing a lot, which can be reproduced as seq 1 100000 >one seq 1 4 100000 >two git diff --no-index --histogram one two Before this patch, histogram_diff would call itself recursively before calling free_index, which would mean a lot of memory is allocated during the recursion and only freed afterwards. By moving the memory allocation (and its free call) into find_lcs, the memory is free'd before we recurse, such that memory is reused in the next step of the recursion instead of using new memory. This addresses only the memory pressure, not the run time complexity, that is also awful for the corner case outlined above. Helpful in understanding the code (in addition to the sparse history of this file), was https://stackoverflow.com/a/32367597 which reproduces most of the code comments of the JGit implementation. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19xdiff/xhistogram: factor out memory cleanup into free_index()Libravatar Stefan Beller1-4/+9
This will be useful in the next patch as we'll introduce multiple callers. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-19xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diffLibravatar Stefan Beller1-11/+11
By passing the 'xpp' and 'env' argument directly to the function 'fall_back_to_classic_diff', we eliminate an occurrence of the 'index' in histogram_diff, which will prove useful in a bit. While at it, move it up in the file. This will make the diff of one of the next patches more legible. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12Correct common spelling mistakes in comments and testsLibravatar Stefano Lattarini1-1/+1
Most of these were found using Lucas De Marchi's codespell tool. Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-19xdiff: PATIENCE/HISTOGRAM are not independent option bitsLibravatar Junio C Hamano1-1/+1
Because the default Myers, patience and histogram algorithms cannot be in effect at the same time, XDL_PATIENCE_DIFF and XDL_HISTOGRAM_DIFF are not independent bits. Instead of wasting one bit per algorithm, define a few macros to access the few bits they occupy and update the code that access them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08xdiff/xhistogram: drop need for additional variableLibravatar Tay Ray Chuan1-5/+4
Having an additional variable (ptr) instead of changing line(1|2) and count(1|2) was for debugging purposes. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08xdiff/xhistogram: rely on xdl_trim_ends()Libravatar Tay Ray Chuan1-27/+4
Do away with reduce_common_start_end() and use xdf->dstart and xdf->dend set by xdl_trim_ends() that similarly tells us where the first unmatched line from the start and end occurs. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-08xdiff/xhistogram: rework handling of recursed resultsLibravatar Tay Ray Chuan1-6/+9
Previously we were over-complicating matters by trying to combine the recursed results. Now, terminate immediately if a recursive call failed and return its result. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-12teach --histogram to diffLibravatar Tay Ray Chuan1-0/+384
Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show that it is faster than its --patience cousin, as well as the default Meyers algorithm. The implementation has been reworked to use structs and pointers, instead of bitmasks, thus doing away with JGit's 2^28 line limit. We also use xdiff's default hash table implementation (xdl_hash_bits() with XDL_HASHLONG()) for convenience. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>