summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-05-26grep: don't redundantly compile throwaway patterns under threadingLibravatar Ævar Arnfjörð Bjarmason1-3/+11
Change the pattern compilation logic under threading so that grep doesn't compile a pattern it never ends up using on the non-threaded code path, only to compile it again N times for N threads which will each use their own copy, ignoring the initially compiled pattern. This redundant compilation dates back to the initial introduction of the threaded grep in commit 5b594f457a ("Threaded grep", 2010-01-25). There was never any reason for doing this redundant work other than an oversight in the initial commit. Jeff King suggested on-list in <20170414212325.fefrl3qdjigwyitd@sigill.intra.peff.net> that this might be needed to check the pattern for sanity before threaded execution commences. That's not the case. The pattern is compiled under threading in start_threads() before any concurrent execution has started by calling pthread_create(), so if the pattern contains an error we still do the right thing. I.e. die with one error before any threaded execution has commenced, instead of e.g. spewing out an error for each N threads, which could be a regression a change like this might inadvertently introduce. This change is not meant as an optimization, any performance gains from this are in the hundreds to thousands of nanoseconds at most. If we wanted more performance here we could just re-use the compiled patterns in multiple threads (regcomp(3) is thread-safe), or partially re-use them and the associated structures in the case of later PCRE JIT changes. Rather, it's just to make the code easier to reason about. It's confusing to debug this under threading & non-threading when the threading codepaths redundantly compile a pattern which is never used. The reason the patterns are recompiled is as a side-effect of duplicating the whole grep_opt structure, which is not thread safe, writable, and munged during execution. The grep_opt structure then points to the grep_pat structure where pattern or patterns are stored. I looked into e.g. splitting the API into some "do & alloc threadsafe stuff", "spawn thread", "do and alloc non-threadsafe stuff", but the execution time of grep_opt_dup() & pattern compilation is trivial compared to actually executing the grep, so there was no point. Even with the more expensive JIT changes to follow the most expensive PCRE patterns take something like 0.0X milliseconds to compile at most[1]. The undocumented --debug mode added in commit 17bf35a3c7 ("grep: teach --debug option to dump the parse tree", 2012-09-13) still works properly with this change. It only emits debugging info during pattern compilation, which is now dumped by the pattern compiled just before the first thread is started. 1. http://sljit.sourceforge.net/pcre.html Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: assert that threading is enabled when calling grep_{lock,unlock}Libravatar Ævar Arnfjörð Bjarmason1-4/+4
Change the grep_{lock,unlock} functions to assert that num_threads is true, instead of only locking & unlocking the pthread mutex lock when it is. These functions are never called when num_threads isn't true, this logic has gone through multiple iterations since the initial introduction of grep threading in commit 5b594f457a ("Threaded grep", 2010-01-25), but ever since then they'd only be called if num_threads was true, so this check made the code confusing to read. Replace the check with an assertion, so that it's clear to the reader that this code path is never taken unless we're spawning threads. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: given --threads with NO_PTHREADS=YesPlease, warnLibravatar Ævar Arnfjörð Bjarmason2-0/+31
Add a warning about missing thread support when grep.threads or --threads is set to a non 0 (default) or 1 (no parallelism) value under NO_PTHREADS=YesPlease. This is for consistency with the index-pack & pack-objects commands, which also take a --threads option & are configurable via pack.threads, and have long warned about the same under NO_PTHREADS=YesPlease. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26pack-objects: fix buggy warning about threadsLibravatar Ævar Arnfjörð Bjarmason2-2/+4
Fix a buggy warning about threads under NO_PTHREADS=YesPlease. Due to re-using the delta_search_threads variable for both the state of the "pack.threads" config & the --threads option, setting "pack.threads" but not supplying --threads would trigger the warning for both "pack.threads" & --threads. Solve this bug by resetting the delta_search_threads variable in git_pack_config(), it might then be set by --threads again and be subsequently warned about, as the test I'm changing here asserts. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26pack-objects & index-pack: add test for --threads warningLibravatar Ævar Arnfjörð Bjarmason1-0/+36
Add a test for the warning that's emitted when --threads or pack.threads is provided under NO_PTHREADS=YesPlease. This uses the new PTHREADS prerequisite. The assertion for C_LOCALE_OUTPUT in the latter test is currently redundant, since unlike index-pack the pack-objects warnings aren't i18n'd. However they might be changed to be i18n'd in the future, and there's no harm in future-proofing the test. There's an existing bug in the implementation of pack-objects which this test currently tests for as-is. Details about the bug & the fix are included in a follow-up change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26test-lib: add a PTHREADS prerequisiteLibravatar Ævar Arnfjörð Bjarmason3-0/+6
Add a PTHREADS prerequisite which is false when git is compiled with NO_PTHREADS=YesPlease. There's lots of custom code that runs when threading isn't available, but before this prerequisite there was no way to test it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: move is_fixed() earlier to avoid forward declarationLibravatar Ævar Arnfjörð Bjarmason1-12/+12
Move the is_fixed() function which are currently only used in compile_regexp() earlier so it can be used in the PCRE family of functions in a later change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: change internal *pcre* variable & function names to be *pcre1*Libravatar Ævar Arnfjörð Bjarmason2-30/+30
Change the internal PCRE variable & function names to have a "1" suffix. This is for preparation for libpcre2 support, where having non-versioned names would be confusing. An earlier change in this series ("grep: change the internal PCRE macro names to be PCRE1", 2017-04-07) elaborates on the motivations behind this change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: change the internal PCRE macro names to be PCRE1Libravatar Ævar Arnfjörð Bjarmason4-7/+7
Change the internal USE_LIBPCRE define, & build options flag to use a naming convention ending in PCRE1, without changing the long-standing USE_LIBPCRE Makefile flag which enables this code. This is for preparation for libpcre2 support where having things like USE_LIBPCRE and USE_LIBPCRE2 in any more places than we absolutely need to for backwards compatibility with old Makefile arguments would be confusing. In some ways it would be better to change everything that now uses USE_LIBPCRE to use USE_LIBPCRE1, and to make specifying USE_LIBPCRE (or --with-pcre) an error. This would impose a one-time burden on packagers of git to s/USE_LIBPCRE/USE_LIBPCRE1/ in their build scripts. However I'd like to leave the door open to making USE_LIBPCRE=YesPlease eventually mean USE_LIBPCRE2=YesPlease, i.e. once PCRE v2 is ubiquitous enough that it makes sense to make it the default. This code and the USE_LIBPCRE Makefile argument was added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09). At the time there was no indication that the PCRE project would release an entirely new & incompatible API around 3 years later. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: factor test for \0 in grep patterns into a functionLibravatar Ævar Arnfjörð Bjarmason1-7/+15
Factor the test for \0 in grep patterns into a function. Since commit 9eceddeec6 ("Use kwset in grep", 2011-08-21) any pattern containing a \0 is considered fixed as regcomp() can't handle it. This change makes later changes that make use of either has_null() or is_fixed() (but not both) smaller. While I'm at it make the comment conform to the style guide, i.e. add an opening "/*\n". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: remove redundant regflags assignmentsLibravatar Ævar Arnfjörð Bjarmason1-5/+1
Remove redundant assignments to the "regflags" variable. This variable is only used set under GREP_PATTERN_TYPE_ERE, so there's no need to un-set it under GREP_PATTERN_TYPE_{FIXED,BRE,PCRE}. Back in 5010cb5fcc[1], we did do "opt.regflags &= ~REG_EXTENDED" upon seeing "-G" on the command line and flipped the bit on upon seeing "-E", but I think that was perfectly sensible and it would have been a bug if we didn't. They were part of the command line parsing that could have seen "-E" on the command line earlier. When cca2c172 ("git-grep: do not die upon -F/-P when grep.extendedRegexp is set.", 2011-05-09) switched the command line parsing to "read into a 'tentatively this is what we saw the last' variable and then finally commit just once", we didn't touch opt.regflags for PCRE and FIXED, but we still had to flip regflags between BRE and ERE, because parsing of grep.extendedregexp configuration variable directly touched opt.regflags back then, which was done by b22520a3 ("grep: allow -E and -n to be turned on by default via configuration", 2011-03-30). When 84befcd0 ("grep: add a grep.patternType configuration setting", 2012-08-03) introduced extended_regexp_option field, we stopped flipping regflags while reading the configuration, and that was when we should have noticed and stopped dropping REG_EXTENDED bit in the "now we can commit what type to use" helper function. There is no reason to do this anymore, so stop doing it, more to reduce "wait this is used under fixed/BRE/PCRE how?" confusion when reading the code, than to to save ourselves trivial CPU cycles by removing one assignment. 1. "built-in "git grep"", 2006-04-30. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26grep: catch a missing enum in switch statementLibravatar Ævar Arnfjörð Bjarmason1-0/+2
Add a die(...) to a default case for the switch statement selecting between grep pattern types under --recurse-submodules. Normally this would be caught by -Wswitch, but the grep_pattern_type type is converted to int by going through parse_options(). Changing the argument type passed to compile_submodule_options() won't work, the value will just get coerced. The -Wswitch-default warning will warn about it, but that produces a lot of noise across the codebase, this potential issue would be drowned in that noise. Thus catching this at runtime is the least bad option. This won't ever trigger in practice, but if a new pattern type were to be added this catches an otherwise silent bug during development. See commit 0281e487fd ("grep: optionally recurse into submodules", 2016-12-16) for the initial addition of this code. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of log --grep regex engines with -FLibravatar Ævar Arnfjörð Bjarmason1-0/+44
Add a performance comparison test of log --grepgrep regex engines given fixed strings. See the preceding fixed-string t/perf change ("perf: add a comparison test of grep regex engines with -F", 2017-04-21) for notes about this, in particular this mostly tests exactly the same codepath now, but might not in the future: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p4221-log-grep-engines-fixed.sh [...] Test this tree -------------------------------------------------------- 4221.1: fixed log --grep='int' 5.99(5.55+0.40) 4221.2: basic log --grep='int' 5.92(5.56+0.31) 4221.3: extended log --grep='int' 6.01(5.51+0.45) 4221.4: perl log --grep='int' 5.99(5.56+0.38) 4221.6: fixed log --grep='uncommon' 5.06(4.76+0.27) 4221.7: basic log --grep='uncommon' 5.02(4.78+0.21) 4221.8: extended log --grep='uncommon' 4.99(4.78+0.20) 4221.9: perl log --grep='uncommon' 5.00(4.72+0.26) 4221.11: fixed log --grep='æ' 5.35(5.12+0.20) 4221.12: basic log --grep='æ' 5.34(5.11+0.20) 4221.13: extended log --grep='æ' 5.39(5.10+0.22) 4221.14: perl log --grep='æ' 5.44(5.16+0.23) Only the non-ASCII -i case is different: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_4221_LOG_OPTS=' -i' ./run p4221-log-grep-engines-fixed.sh [...] Test this tree ----------------------------------------------------------- 4221.1: fixed log -i --grep='int' 6.17(5.77+0.35) 4221.2: basic log -i --grep='int' 6.16(5.59+0.39) 4221.3: extended log -i --grep='int' 6.15(5.70+0.39) 4221.4: perl log -i --grep='int' 6.15(5.69+0.38) 4221.6: fixed log -i --grep='uncommon' 5.10(4.88+0.21) 4221.7: basic log -i --grep='uncommon' 5.04(4.76+0.25) 4221.8: extended log -i --grep='uncommon' 5.07(4.82+0.23) 4221.9: perl log -i --grep='uncommon' 5.03(4.78+0.22) 4221.11: fixed log -i --grep='æ' 5.93(5.65+0.25) 4221.12: basic log -i --grep='æ' 5.88(5.62+0.25) 4221.13: extended log -i --grep='æ' 6.02(5.69+0.29) 4221.14: perl log -i --grep='æ' 5.36(5.06+0.29) See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of log --grep regex enginesLibravatar Ævar Arnfjörð Bjarmason1-0/+53
Add a very basic performance comparison test comparing the POSIX basic, extended and perl engines with patterns matching log messages via --grep=<pattern>. $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p4220-log-grep-engines.sh [...] Test this tree --------------------------------------------------------------------- 4220.1: basic log --grep='how.to' 6.22(6.00+0.21) 4220.2: extended log --grep='how.to' 6.23(5.98+0.23) 4220.3: perl log --grep='how.to' 6.07(5.79+0.25) 4220.5: basic log --grep='^how to' 6.19(5.93+0.22) 4220.6: extended log --grep='^how to' 6.19(5.93+0.23) 4220.7: perl log --grep='^how to' 6.14(5.88+0.24) 4220.9: basic log --grep='[how] to' 6.96(6.65+0.28) 4220.10: extended log --grep='[how] to' 6.96(6.69+0.24) 4220.11: perl log --grep='[how] to' 6.95(6.58+0.33) 4220.13: basic log --grep='\(e.t[^ ]*\|v.ry\) rare' 7.10(6.80+0.27) 4220.14: extended log --grep='(e.t[^ ]*|v.ry) rare' 7.07(6.80+0.26) 4220.15: perl log --grep='(e.t[^ ]*|v.ry) rare' 7.70(7.46+0.22) 4220.17: basic log --grep='m\(ú\|u\)lt.b\(æ\|y\)te' 6.12(5.87+0.24) 4220.18: extended log --grep='m(ú|u)lt.b(æ|y)te' 6.14(5.84+0.26) 4220.19: perl log --grep='m(ú|u)lt.b(æ|y)te' 6.16(5.93+0.20) With -i: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_4220_LOG_OPTS=' -i' ./run p4220-log-grep-engines.sh [...] Test this tree ------------------------------------------------------------------------ 4220.1: basic log -i --grep='how.to' 6.74(6.41+0.32) 4220.2: extended log -i --grep='how.to' 6.78(6.55+0.22) 4220.3: perl log -i --grep='how.to' 6.06(5.77+0.28) 4220.5: basic log -i --grep='^how to' 6.80(6.57+0.22) 4220.6: extended log -i --grep='^how to' 6.83(6.52+0.29) 4220.7: perl log -i --grep='^how to' 6.16(5.94+0.20) 4220.9: basic log -i --grep='[how] to' 7.87(7.61+0.24) 4220.10: extended log -i --grep='[how] to' 7.85(7.57+0.27) 4220.11: perl log -i --grep='[how] to' 7.03(6.75+0.25) 4220.13: basic log -i --grep='\(e.t[^ ]*\|v.ry\) rare' 8.68(8.41+0.25) 4220.14: extended log -i --grep='(e.t[^ ]*|v.ry) rare' 8.80(8.44+0.28) 4220.15: perl log -i --grep='(e.t[^ ]*|v.ry) rare' 7.85(7.56+0.26) 4220.17: basic log -i --grep='m\(ú\|u\)lt.b\(æ\|y\)te' 6.94(6.68+0.24) 4220.18: extended log -i --grep='m(ú|u)lt.b(æ|y)te' 7.04(6.76+0.24) 4220.19: perl log -i --grep='m(ú|u)lt.b(æ|y)te' 6.26(5.92+0.29) See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Before commit ("log: make --regexp-ignore-case work with --perl-regexp", 2017-05-20) this test will almost definitely fail (depending on the repo) if passed the -i option, since it wasn't properly supported under PCRE. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of grep regex engines with -FLibravatar Ævar Arnfjörð Bjarmason1-0/+41
Add a performance comparison test of grep regex engines given fixed strings. The current logic in compile_regexp() ignores the engine parameter and uses kwset() to search for these, so this test shows no difference between engines right now: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7821-grep-engines-fixed.sh [...] Test this tree ------------------------------------------------ 7821.1: fixed grep int 0.56(1.67+0.68) 7821.2: basic grep int 0.57(1.70+0.57) 7821.3: extended grep int 0.59(1.76+0.51) 7821.4: perl grep int 1.08(1.71+0.55) 7821.6: fixed grep uncommon 0.23(0.55+0.50) 7821.7: basic grep uncommon 0.24(0.55+0.50) 7821.8: extended grep uncommon 0.26(0.55+0.52) 7821.9: perl grep uncommon 0.24(0.58+0.47) 7821.11: fixed grep æ 0.36(1.30+0.42) 7821.12: basic grep æ 0.36(1.32+0.40) 7821.13: extended grep æ 0.38(1.30+0.42) 7821.14: perl grep æ 0.35(1.24+0.48) Only when run with -i via GIT_PERF_7821_GREP_OPTS=' -i' do we avoid avoid going through the same kwset.[ch] codepath, see the "Even when -F..." comment in grep.c. This only kicks for the non-ASCII case: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7821_GREP_OPTS=' -i' ./run p7821-grep-engines-fixed.sh [...] Test this tree --------------------------------------------------- 7821.1: fixed grep -i int 0.62(2.10+0.57) 7821.2: basic grep -i int 0.68(1.90+0.61) 7821.3: extended grep -i int 0.78(1.94+0.57) 7821.4: perl grep -i int 0.98(1.78+0.74) 7821.6: fixed grep -i uncommon 0.24(0.44+0.64) 7821.7: basic grep -i uncommon 0.25(0.56+0.54) 7821.8: extended grep -i uncommon 0.27(0.62+0.45) 7821.9: perl grep -i uncommon 0.24(0.59+0.49) 7821.11: fixed grep -i æ 0.30(0.96+0.39) 7821.12: basic grep -i æ 0.27(0.92+0.44) 7821.13: extended grep -i æ 0.28(0.90+0.46) 7821.14: perl grep -i æ 0.28(0.74+0.49) I'm planning to change how fixed-string searching happens. This test gives a baseline for comparing performance before & after any such change. See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26perf: add a comparison test of grep regex enginesLibravatar Ævar Arnfjörð Bjarmason1-0/+56
Add a very basic performance comparison test comparing the POSIX basic, extended and perl engines. In theory the "basic" and "extended" engines should be implemented using the same underlying code with a slightly different pattern parser, but some implementations may not do this. Jump through some slight hoops to test both, which is worthwhile since "basic" is the default. Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a checkout of linux.git & latest upstream PCRE, both PCRE and git compiled with -O3 using gcc 7.1.1: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh [...] Test this tree --------------------------------------------------------------- 7820.1: basic grep 'how.to' 0.34(1.24+0.53) 7820.2: extended grep 'how.to' 0.33(1.23+0.45) 7820.3: perl grep 'how.to' 0.31(1.05+0.56) 7820.5: basic grep '^how to' 0.32(1.24+0.42) 7820.6: extended grep '^how to' 0.33(1.20+0.44) 7820.7: perl grep '^how to' 0.57(2.67+0.42) 7820.9: basic grep '[how] to' 0.51(2.16+0.45) 7820.10: extended grep '[how] to' 0.49(2.20+0.43) 7820.11: perl grep '[how] to' 0.56(2.60+0.43) 7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare' 0.66(3.25+0.40) 7820.14: extended grep '(e.t[^ ]*|v.ry) rare' 0.65(3.19+0.46) 7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 1.05(5.74+0.34) 7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te' 0.34(1.28+0.47) 7820.18: extended grep 'm(ú|u)lt.b(æ|y)te' 0.34(1.38+0.38) 7820.19: perl grep 'm(ú|u)lt.b(æ|y)te' 0.39(1.56+0.44) Options can also be passed to git-grep via the GIT_PERF_7820_GREP_OPTS environment variable. There are various modes such as "-v" that have very different performance profiles, but handling the combinatorial explosion of testing all those options would make this script much more complex and harder to maintain. Instead just add the ability to do one-shot runs with arbitrary options, e.g.: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7820_GREP_OPTS=" -i" ./run p7820-grep-engines.sh [...] Test this tree ------------------------------------------------------------------ 7820.1: basic grep -i 'how.to' 0.49(1.72+0.38) 7820.2: extended grep -i 'how.to' 0.46(1.64+0.42) 7820.3: perl grep -i 'how.to' 0.44(1.45+0.45) 7820.5: basic grep -i '^how to' 0.47(1.76+0.38) 7820.6: extended grep -i '^how to' 0.47(1.70+0.42) 7820.7: perl grep -i '^how to' 0.65(2.72+0.37) 7820.9: basic grep -i '[how] to' 0.86(3.64+0.42) 7820.10: extended grep -i '[how] to' 0.84(3.62+0.46) 7820.11: perl grep -i '[how] to' 0.73(3.06+0.39) 7820.13: basic grep -i '\(e.t[^ ]*\|v.ry\) rare' 1.63(8.13+0.36) 7820.14: extended grep -i '(e.t[^ ]*|v.ry) rare' 1.64(8.01+0.44) 7820.15: perl grep -i '(e.t[^ ]*|v.ry) rare' 1.44(6.88+0.44) 7820.17: basic grep -i 'm\(ú\|u\)lt.b\(æ\|y\)te' 0.66(2.67+0.44) 7820.18: extended grep -i 'm(ú|u)lt.b(æ|y)te' 0.66(2.67+0.43) 7820.19: perl grep -i 'm(ú|u)lt.b(æ|y)te' 0.59(2.31+0.37) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21perf: emit progress output when unpacking & buildingLibravatar Ævar Arnfjörð Bjarmason1-0/+2
Amend the t/perf/run output so that in addition to the "Running N tests" heading currently being emitted, it also emits "Unpacking $rev" and "Building $rev" when setting up the build/$rev directory & when building it, respectively. This makes it easier to see what's going on and what revision is being tested as the output scrolls by. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't doLibravatar Ævar Arnfjörð Bjarmason3-3/+28
Add a git GIT_PERF_MAKE_COMMAND variable to compliment the existing GIT_PERF_MAKE_OPTS facility. This allows specifying an arbitrary shell command to execute instead of 'make'. This is useful e.g. in cases where the name, semantics or defaults of a Makefile flag have changed over time. It can even be used to change the contents of the tree, useful for monkeypatching ancient versions of git to get them to build. This opens Pandora's box in some ways, it's now possible to "jailbreak" the perf environment and e.g. modify the source tree via this arbitrary instead of just issuing a custom "make" command, such a command has to be re-entrant in the sense that subsequent perf runs will re-use the possibly modified tree. It would be pointless to try to mitigate or work around that caveat in a tool purely aimed at Git developers, so this change makes no attempt to do so. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: add tests to fix blind spots with \0 patternsLibravatar Ævar Arnfjörð Bjarmason1-0/+71
Address a big blind spot in the tests for patterns containing \0. The is_fixed() function considers any string that contains \0 fixed, even if it contains regular expression metacharacters, those patterns are currently matched with kwset. Before this change removing that memchr(s, 0, len) check from is_fixed() wouldn't change the result of any of the tests, since regcomp() will happily match the part before the \0. The kwset path is dependent on whether the the -i flag is on, and whether the pattern has any non-ASCII characters, but none of this was tested for. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: prepare for testing binary regexes containing rx metacharactersLibravatar Ævar Arnfjörð Bjarmason1-3/+3
Add setup code needed for testing regexes that contain both binary data and regex metacharacters. The POSIX regcomp() function inherently can't support that, because it takes a \0-delimited char *, but other regex engines APIs like PCRE v2 take a pattern/length pair, and are thus able to handle \0s in patterns as well as any other character. When kwset was imported in commit 9eceddeec6 ("Use kwset in grep", 2011-08-21) this limitation was fixed, but at the expense of introducing the undocumented limitation that any pattern containing \0 implicitly becomes a fixed match (equivalent to -F having been provided). That's not something we'd like to keep in the future. The inability to match patterns containing \0 is a leaky implementation detail. So add tests as a first step towards changing that. In order to test that \0-patterns can properly match as regexes the test string needs to have some regex metacharacters in it. There were other blind spots in the tests. The code around kwset specially handles case-insensitive & non-ASCII data, but there were no tests for this. Fix all of that by amending the text being matched to contain both regex metacharacters & non-ASCII data. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: add a test helper function for less verbose -f \0 testsLibravatar Ævar Arnfjörð Bjarmason1-29/+29
Add a helper function to make the tests which check for patterns with \0 in them more succinct. Right now this isn't a big win, but subsequent commits will add a lot more of these tests. The helper is based on the match() function in t3070-wildmatch.sh. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: add tests for grep pattern types being passed to submodulesLibravatar Ævar Arnfjörð Bjarmason1-0/+49
Add testing for grep pattern types being correctly passed to submodules. The pattern "(.|.)[\d]" matches differently under fixed (not at all), and then matches different lines under basic/extended & perl regular expressions, so this change asserts that the pattern type is passed along correctly. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: amend submodule recursion test for regex engine testingLibravatar Ævar Arnfjörð Bjarmason1-83/+83
Amend the submodule recursion test to prepare it for subsequent tests of whether it passes along the grep.patternType to the submodule greps. This is the result of searching & replacing: foobar -> (1|2)d(3|4) foo -> (1|2) bar -> (3|4) Currently there's no tests for whether e.g. -P or -E is correctly passed along, tests for that will be added in a follow-up change, but first add content to the tests which will match differently under different regex engines. Reuse the pattern established in an earlier commit of mine in this series ("log: add exhaustive tests for pattern style options & config", 2017-04-07). The pattern "(.|.)[\d]" will match this content differently under fixed/basic/extended & perl. This test code was originally added in commit 0281e487fd ("grep: optionally recurse into submodules", 2016-12-16). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: add tests for --threads=N and grep.threadsLibravatar Ævar Arnfjörð Bjarmason1-0/+16
Add tests for --threads=N being supplied on the command-line, or when grep.threads=N being supplied in the configuration. When the threading support was made run-time configurable in commit 89f09dd34e ("grep: add --threads=<num> option and grep.threads configuration", 2015-12-15) no tests were added for it. In developing a change to the grep code I was able to make '--threads=1 <pat>` segfault, while the test suite still passed. This change fixes that blind spot in the tests. In addition to asserting that asking for N threads shouldn't segfault, test that the grep output given any N is the same. The choice to test only 1..10 as opposed to 1..8 or 1..16 or whatever is arbitrary. Testing 1..1024 works locally for me (but gets noticeably slower as more threads are spawned). Given the structure of the code there's no reason to test an arbitrary number of threads, only 0, 1 and >=2 are special modes of operation. A later patch introduces a PTHREADS test prerequisite which is true under NO_PTHREADS=UnfortunatelyYes, but even under NO_PTHREADS it's fine to test --threads=N, we'll just ignore it and not use threading. So these tests also make sense under that mode to assert that --threads=N without pthreads still returns expected results. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: change non-ASCII -i test to stop using --debugLibravatar Ævar Arnfjörð Bjarmason1-20/+5
Change a non-ASCII case-insensitive test case to stop using --debug, and instead simply test for the expected results. The test coverage remains the same with this change, but the test won't break due to internal refactoring. This test was added in commit 793dc676e0 ("grep/icase: avoid kwsset when -F is specified", 2016-06-25). It was asserting that the regex must be compiled with compile_fixed_regexp(), instead test for the expected results, allowing the underlying implementation to change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: add a test for backreferences in PCRE patternsLibravatar Ævar Arnfjörð Bjarmason1-0/+7
Add a test for backreferences such as (.)\1 in PCRE patterns. This test ensures that the PCRE_NO_AUTO_CAPTURE option isn't turned on. Before this change turning it on would break these sort of patterns, but wouldn't break any tests. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep: add a test asserting that --perl-regexp dies when !PCRELibravatar Ævar Arnfjörð Bjarmason2-1/+15
Add a test asserting that when --perl-regexp (and -P for grep) is given to git-grep & git-log that we die with an error. In developing the PCRE v2 series I introduced a regression where -P would (through control-flow fall-through) become synonymous with basic POSIX matching. I.e. 'git grep -P '[\d]' would match "d" instead of digits. The entire test suite would still pass with this serious regression, since everything that tested for --perl-regexp would be guarded by the PCRE prerequisite, fix that blind-spot by adding tests under !PCRE asserting that git must die when given --perl-regexp or -P. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21log: make --regexp-ignore-case work with --perl-regexpLibravatar Ævar Arnfjörð Bjarmason2-5/+56
Make the --regexp-ignore-case option work with --perl-regexp. This never worked, and there was no test for this. Fix the bug and add a test. When PCRE support was added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) compile_pcre_regexp() would only check opt->ignore_case, but when the --perl-regexp option was added in commit 727b6fc3ed ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03) the code didn't set the opt->ignore_case. Change the test suite to test for -i and --invert-regexp with basic/extended/perl patterns in addition to fixed, which was the only patternType that was tested for before in combination with those options. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21log: add exhaustive tests for pattern style options & configLibravatar Ævar Arnfjörð Bjarmason1-1/+97
Add exhaustive tests for how the different grep.patternType options & the corresponding command-line options affect git-log. Before this change it was possible to patch revision.c so that the --basic-regexp option was synonymous with --extended-regexp, and --perl-regexp wasn't recognized at all, and still have 100% of the test suite pass. This was because the first test being modified here, added in commit 34a4ae55b2 ("log --grep: use the same helper to set -E/-F options as "git grep"", 2012-10-03), didn't actually check whether we'd enabled extended regular expressions as distinct from re-toggling non-fixed string support. Fix that by changing the pattern to a pattern that'll only match if --extended-regexp option is provided, but won't match under the default --basic-regexp option. Other potential regressions were possible since there were no tests for the rest of the combinations of grep.patternType configuration toggles & corresponding git-log command-line options. Add exhaustive tests for those. The patterns being passed to fixed/basic/extended/PCRE are carefully crafted to return the wrong thing if the grep engine were to pick any other matching method than the one it's told to use. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21test-lib: rename the LIBPCRE prerequisite to PCRELibravatar Ævar Arnfjörð Bjarmason5-20/+20
Rename the LIBPCRE prerequisite to PCRE. This is for preparation for libpcre2 support, where having just "LIBPCRE" would be confusing as it implies v1 of the library. None of these tests are incompatible between versions 1 & 2 of libpcre, it's less confusing to give them a more general name to make it clear that they work on both library versions. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21grep & rev-list doc: stop promising libpcre for --perl-regexpLibravatar Ævar Arnfjörð Bjarmason2-4/+11
Stop promising in our grep & rev-list options documentation that we're always going to be using libpcre when given the --perl-regexp option. Instead talk about using "Perl-compatible regular expressions" and using these types of patterns using "a compile-time dependency". Saying "libpcre" means that we're talking about libpcre.so, which is always going to be v1. This change is part of an ongoing saga to add support for libpcre2, which comes with PCRE v2. In the future we might use some completely unrelated library to provide perl-compatible regular expression support. By wording the documentation differently and not promising any specific version of PCRE or even PCRE at all we have more wiggle room to change the implementation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-21Makefile & configure: reword inaccurate comment about PCRELibravatar Ævar Arnfjörð Bjarmason2-6/+12
Reword an outdated & inaccurate comment which suggests that only git-grep can use PCRE. This comment was added back when PCRE support was initially added in commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09), and was true at the time. It hasn't been telling the full truth since git-log learned to use PCRE with --grep in commit 727b6fc3ed ("log --grep: accept --basic-regexp and --perl-regexp", 2012-10-03), and more importantly is likely to get more inaccurate over time as more use is made of PCRE in other areas. Reword it to be more future-proof, and to more clearly explain that this enables user-initiated runtime behavior. Copy/pasting this so much in configure.ac is lame, these Makefile-like flags aren't even used by autoconf, just the corresponding --with[out]-* options. But copy/pasting the comments that make sense for the Makefile to configure.ac where they make less sense is the pattern everything else follows in that file. I'm not going to war against that as part of this change, just following the existing pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-09Git 2.13Libravatar Junio C Hamano3-1/+14
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-09Merge tag 'l10n-2.13.0-rnd2.1' of git://github.com/git-l10n/git-poLibravatar Junio C Hamano9-19174/+23732
l10n for Git 2.13.0 round 2.1 * tag 'l10n-2.13.0-rnd2.1' of git://github.com/git-l10n/git-po: l10n: zh_CN: for git v2.13.0 l10n round 2 l10n: sv.po: Update Swedish translation (3195t0f0u) l10n: zh_CN: review for git v2.13.0 l10n round 1 l10n: Update Catalan translation l10n: bg.po: Updated Bulgarian translation (3195t) l10n: fr.po v2.13 rnd 2 l10n: de.po: translate 4 new messages l10n: de.po: update German translation l10n: de.po: lower case after semi-colon l10n: vi.po(3195t): Update translation for v2.13.0 round 2 l10n: git.pot: v2.13.0 round 2 (4 new, 7 removed) l10n: zh_CN: for git v2.13.0 l10n round 1 l10n: fr.po v2.13 round 1 l10n: pt_PT: update Portuguese translation l10n: bg.po: Updated Bulgarian translation (3201t) l10n: vi.po(3198t): Updated Vietnamese translation for v2.13.0-rc0 l10n: sv.po: Update Swedish translation (3199t0f0u) l10n: git.pot: v2.13.0 round 1 (96 new, 37 removed)
2017-05-09Merge branch 'master' of git://github.com/nafmo/git-l10n-svLibravatar Jiang Xin1-434/+439
* 'master' of git://github.com/nafmo/git-l10n-sv: l10n: sv.po: Update Swedish translation (3195t0f0u)
2017-05-09l10n: zh_CN: for git v2.13.0 l10n round 2Libravatar Jiang Xin1-429/+436
Translate 4 messages (3195t0f0u) for git v2.13.0-rc2. Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
2017-05-09l10n: sv.po: Update Swedish translation (3195t0f0u)Libravatar Peter Krefting1-434/+439
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
2017-05-08Sync with v2.12.3Libravatar Junio C Hamano11-11/+116
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-09Merge branch 'jh/verify-index-checksum-only-in-fsck'Libravatar Junio C Hamano1-8/+26
* jh/verify-index-checksum-only-in-fsck: t1450: avoid use of "sed" on the index, which is a binary file
2017-05-09l10n: zh_CN: review for git v2.13.0 l10n round 1Libravatar Ray Chen1-17/+17
Signed-off-by: Ray Chen <oldsharp@gmail.com>
2017-05-09Merge branch 'master' of https://github.com/vnwildman/gitLibravatar Jiang Xin1-431/+438
* 'master' of https://github.com/vnwildman/git: l10n: vi.po(3195t): Update translation for v2.13.0 round 2
2017-05-09l10n: Update Catalan translationLibravatar Jordi Mas1-2126/+2491
Signed-off-by: Jordi Mas <jmas@softcatala.org>
2017-05-09l10n: bg.po: Updated Bulgarian translation (3195t)Libravatar Alexander Shopov1-85/+61
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
2017-05-09Merge branch 'fr_l10n_v2.13_rnd2' of git://github.com/jnavila/gitLibravatar Jiang Xin1-448/+478
* 'fr_l10n_v2.13_rnd2' of git://github.com/jnavila/git: l10n: fr.po v2.13 rnd 2
2017-05-06l10n: fr.po v2.13 rnd 2Libravatar Jean-Noel Avila1-448/+478
Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
2017-05-05l10n: de.po: translate 4 new messagesLibravatar Ralf Thielow1-440/+465
Translate 4 new messages came from git.pot update in 28e1aaa48 (l10n: git.pot: v2.13.0 round 2 (4 new, 7 removed)). Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Acked-by: Matthias Rüster <matthias.ruester@gmail.com>
2017-05-05l10n: de.po: update German translationLibravatar Ralf Thielow1-1989/+2381
Translate 96 new messages came from git.pot update in dfc182b (l10n: git.pot: v2.13.0 round 1 (96 new, 37 removed)). Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Acked-by: Matthias Rüster <matthias.ruester@gmail.com>
2017-05-05l10n: de.po: lower case after semi-colonLibravatar Michael J Gruber1-1/+1
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
2017-05-05l10n: vi.po(3195t): Update translation for v2.13.0 round 2Libravatar Tran Ngoc Quan1-431/+438
Signed-off-by: Tran Ngoc Quan <vnwildman@gmail.com>
2017-05-05Git 2.12.3Libravatar Junio C Hamano3-4/+12
Signed-off-by: Junio C Hamano <gitster@pobox.com>