tgif.git - Terin's Improved Git Fork

Age	Commit message (Collapse)	Author	Files	Lines
2019-04-25	Merge branch 'jk/revision-rewritten-parents-in-prio-queue'	Junio C Hamano	1	-0/+18
	Performance fix for "rev-list --parents -- pathspec". * jk/revision-rewritten-parents-in-prio-queue: revision: use a prio_queue to hold rewritten parents
2019-04-04	revision: use a prio_queue to hold rewritten parents	Jeff King	1	-0/+18
	This patch fixes a quadratic list insertion in rewrite_one() when pathspec limiting is combined with --parents. What happens is something like this: 1. We see that some commit X touches the path, so we try to rewrite its parents. 2. rewrite_one() loops forever, rewriting parents, until it finds a relevant parent (or hits the root and decides there are none). The heavy lifting is done by process_parent(), which uses try_to_simplify_commit() to drop parents. 3. process_parent() puts any intermediate parents into the &revs->commits list, inserting by commit date as usual. So if commit X is recent, and then there's a large chunk of history that doesn't touch the path, we may add a lot of commits to &revs->commits. And insertion by commit date is O(n) in the worst case, making the whole thing quadratic. We tried to deal with this long ago in fce87ae538 (Fix quadratic performance in rewrite_one., 2008-07-12). In that scheme, we cache the oldest commit in the list; if the new commit to be added is older, we can start our linear traversal there. This often works well in practice because parents are older than their descendants, and thus we tend to add older and older commits as we traverse. But this isn't guaranteed, and in fact there's a simple case where it is not: merges. Imagine we look at the first parent of a merge and see a very old commit (let's say 3 years old). And on the second parent, as we go back 3 years in history, we might have many commits. That one first-parent commit has polluted our oldest-commit cache; it will remain the oldest while we traverse a huge chunk of history, during which we have to fall back to the slow, linear method of adding to the list. Naively, one might imagine that instead of caching the oldest commit, we'd start at the last-added one. But that just makes some cases faster while making others slower (and indeed, while it made a real-world test case much faster, it does quite poorly in the perf test include here). Fundamentally, these are just heuristics; our worst case is still quadratic, and some cases will approach that. Instead, let's use a data structure with better worst-case performance. Swapping out revs->commits for something else would have repercussions all over the code base, but we can take advantage of one fact: for the rewrite_one() case, nobody actually needs to see those commits in revs->commits until we've finished generating the whole list. That leaves us with two obvious options: 1. We can generate the list _unordered_, which should be O(n), and then sort it afterwards, which would be O(n log n) total. This is "sort-after" below. 2. We can insert the commits into a separate data structure, like a priority queue. This is "prio-queue" below. I expected that sort-after would be the fastest (since it saves us the extra step of copying the items into the linked list), but surprisingly the prio-queue seems to be a bit faster. Here are timings for the new p0001.6 for all three techniques across a few repositories, as compared to master: master cache-last sort-after prio-queue -------------------------------------------------------------------------------------------- GIT_PERF_REPO=git.git 0.52(0.50+0.02) 0.53(0.51+0.02) +1.9% 0.37(0.33+0.03) -28.8% 0.37(0.32+0.04) -28.8% GIT_PERF_REPO=linux.git 20.81(20.74+0.07) 20.31(20.24+0.07) -2.4% 0.94(0.86+0.07) -95.5% 0.91(0.82+0.09) -95.6% GIT_PERF_REPO=llvm-project.git 83.67(83.57+0.09) 4.23(4.15+0.08) -94.9% 3.21(3.15+0.06) -96.2% 2.98(2.91+0.07) -96.4% A few items to note: - the cache-list tweak does improve the bad case for llvm-project.git that started my digging into this problem. But it performs terribly on linux.git, barely helping at all. - the sort-after and prio-queue techniques work well. They approach the timing for running without --parents at all, which is what you'd expect (see below for more data). - prio-queue just barely outperforms sort-after. As I said, I'm not really sure why this is the case, but it is. You can see it even more prominently in this real-world case on llvm-project.git: git rev-list --parents 07ef786652e7 -- llvm/test/CodeGen/Generic/bswap.ll where prio-queue routinely outperforms sort-after by about 7%. One guess is that the prio-queue may just be more efficient because it uses a compact array. There are three new perf tests: - "rev-list --parents" gives us a baseline for running with --parents. This isn't sped up meaningfully here, because the bad case is triggered only with simplification. But it's good to make sure we don't screw it up (now, or in the future). - "rev-list -- dummy" gives us a baseline for just traversing with pathspec limiting. This gives a lower bound for the next test (and it's also a good thing for us to be checking in general for regressions, since we don't seem to have any existing tests). - "rev-list --parents -- dummy" shows off the problem (and our fix) Here are the timings for those three on llvm-project.git, before and after the fix: Test master prio-queue ------------------------------------------------------------------------------ 0001.3: rev-list --parents 2.24(2.12+0.12) 2.22(2.11+0.11) -0.9% 0001.5: rev-list -- dummy 2.89(2.82+0.07) 2.92(2.89+0.03) +1.0% 0001.6: rev-list --parents -- dummy 83.67(83.57+0.09) 2.98(2.91+0.07) -96.4% Changes in the first two are basically noise, and you can see we approach our lower bound in the final one. Note that we can't fully get rid of the list argument from process_parents(). Other callers do have lists, and it would be hard to convert them. They also don't seem to have this problem (probably because they actually remove items from the list as they loop, meaning it doesn't grow so large in the first place). So this basically just drops the "cache_ptr" parameter (which was used only by the one caller we're fixing here) and replaces it with a prio_queue. Callers are free to use either data structure, depending on what they're prepared to handle. Reported-by: Björn Pettersson A <bjorn.a.pettersson@ericsson.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-18	perf-lib.sh: rely on test-lib.sh for --tee handling	Jeff King	1	-23/+11
	Since its inception, the perf-lib.sh script has manually handled the "--tee" option (and other options which imply it, like "--valgrind") with a cut-and-pasted block from test-lib.sh. That block has grown stale over the years, and has at least three problems: 1. It uses $SHELL to re-exec the script, whereas the version in test-lib.sh learned to use $TEST_SHELL_PATH. 2. It does an ad-hoc search of the "$*" string, whereas test-lib.sh learned to carefully parse the arguments left to right. 3. It never learned about --verbose-log (which also implies --tee), so it would not trigger for that option. This last one was especially annoying, because t/perf/run uses the GIT_TEST_OPTS from your config.mak to run the perf scripts. So if you've set, say, "-x --verbose-log" there, it will be passed as part of most perf runs. And while this script doesn't recognize the option, the test-lib.sh that we source _does_, and the behavior ends up being much more annoying: - as the comment at the top of the block says, we have to run this tee code early, before we start munging variables (it says GIT_BUILD_DIR, but the problematic variable is actually GIT_TEST_INSTALLED). - since we don't recognize --verbose-log, we don't trigger the block. We go on to munge GIT_TEST_INSTALLED, converting it from a relative to an absolute path. - then we source test-lib.sh, which _does_ recognize --verbose-log. It re-execs the script, which runs again. But this time with an absolute version of GIT_TEST_INSTALLED. - As a result, we copy the absolute version of GIT_TEST_INSTALLED into perf_results_prefix. Instead of writing our results to the expected "test-results/build_1234abcd.p1234-whatever.times", we instead write them to "test-results/_full_path_to_repo_t_perf_build_1234...". The aggregate.perl script doesn't expect this, and so it prints "<missing>" for each result (even though it spent considerable time running the tests!). We can solve all of these in one blow by just deleting our custom handling, and relying on the inclusion of test-lib.sh to handle --tee, --verbose-log, etc. There's one catch, though. We want to handle GIT_TEST_INSTALLED after we've included test-lib.sh, since we want it un-munged in the re-exec'd version of the script. But if we want to convert it from a relative to an absolute path, we must do so before we load test-lib.sh, since it will change our working directory. So we compute the absolute directory first, store it away, then include test-lib.sh, and finally assign to GIT_TEST_INSTALLED as appropriate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14	prune: use bitmaps for reachability traversal	Jeff King	1	-0/+11
	Pruning generally has to traverse the whole commit graph in order to see which objects are reachable. This is the exact problem that reachability bitmaps were meant to solve, so let's use them (if they're available, of course). Here are timings on git.git: Test HEAD^ HEAD ------------------------------------------------------------------------ 5304.6: prune with bitmaps 3.65(3.56+0.09) 1.01(0.92+0.08) -72.3% And on linux.git: Test HEAD^ HEAD -------------------------------------------------------------------------- 5304.6: prune with bitmaps 35.05(34.79+0.23) 3.00(2.78+0.21) -91.4% The tests show a pretty optimal case, as we'll have just repacked and should have pretty good coverage of all refs with our bitmaps. But that's actually pretty realistic: normally prune is run via "gc" right after repacking. A few notes on the implementation: - the change is actually in reachable.c, so it would improve reachability traversals by "reflog expire --stale-fix", as well. Those aren't performed regularly, though (a normal "git gc" doesn't use --stale-fix), so they're not really worth measuring. There's a low chance of regressing that caller, since the use of bitmaps is totally transparent from the caller's perspective. - The bitmap case could actually get away without creating a "struct object", and instead the caller could just look up each object id in the bitmap result. However, this would be a marginal improvement in runtime, and it would make the callers much more complicated. They'd have to handle both the bitmap and non-bitmap cases separately, and in the case of git-prune, we'd also have to tweak prune_shallow(), which relies on our SEEN flags. - Because we do create real object structs, we go through a few contortions to create ones of the right type. This isn't strictly necessary (lookup_unknown_object() would suffice), but it's more memory efficient to use the correct types, since we already know them. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14	prune: lazily perform reachability traversal	Jeff King	1	-0/+24
	The general strategy of "git prune" is to do a full reachability walk, then for each loose object see if we found it in our walk. But if we don't have any loose objects, we don't need to do the expensive walk in the first place. This patch postpones that walk until the first time we need to see its results. Note that this is really a specific case of a more general optimization, which is that we could traverse only far enough to find the object under consideration (i.e., stop the traversal when we find it, then pick up again when asked about the next object, etc). That could save us in some instances from having to do a full walk. But it's actually a bit tricky to do with our traversal code, and you'd need to do a full walk anyway if you have even a single unreachable object (which you generally do, if any objects are actually left after running git-repack). So in practice this lazy-load of the full walk catches one easy but common case (i.e., you've just repacked via git-gc, and there's nothing unreachable). The perf script is fairly contrived, but it does show off the improvement: Test HEAD^ HEAD ------------------------------------------------------------------------- 5304.4: prune with no objects 3.66(3.60+0.05) 0.00(0.00+0.00) -100.0% and would let us know if we accidentally regress this optimization. Note also that we need to take special care with prune_shallow(), which relies on us having performed the traversal. So this optimization can only kick in for a non-shallow repository. Since this is easy to get wrong and is not covered by existing tests, let's add an extra test to t5304 that covers this case explicitly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-20	tests: send "bug in the test script" errors to the script's stderr	SZEDER Gábor	1	-2/+2
	Some of the functions in our test library check that they were invoked properly with conditions like this: test "$#" = 2 \|\| error "bug in the test script: not 2 parameters to test-expect-success" If this particular condition is triggered, then 'error' will abort the whole test script with a bold red error message [1] right away. However, under certain circumstances the test script will be aborted completely silently, namely if: - a similar condition in a test helper function like 'test_line_count' is triggered, - which is invoked from the test script's "main" shell [2], - and the test script is run manually (i.e. './t1234-foo.sh' as opposed to 'make t1234-foo.sh' or 'make test') [3] - and without the '--verbose' option, because the error message is printed from within 'test_eval_', where standard output is redirected either to /dev/null or to a log file. The only indication that something is wrong is that not all tests in the script are executed and at the end of the test script's output there is no "# passed all N tests" message, which are subtle and can easily go unnoticed, as I had to experience myself. Send these "bug in the test script" error messages directly to the test scripts standard error and thus to the terminal, so those bugs will be much harder to overlook. Instead of updating all ~20 such 'error' calls with a redirection, let's add a BUG() function to 'test-lib.sh', wrapping an 'error' call with the proper redirection and also including the common prefix of those error messages, and convert all those call sites [4] to use this new BUG() function instead. [1] That particular error message from 'test_expect_success' is printed in color only when running with or without '--verbose'; with '--tee' or '--verbose-log' the error is printed without color, but it is printed to the terminal nonetheless. [2] If such a condition is triggered in a subshell of a test, then 'error' won't be able to abort the whole test script, but only the subshell, which in turn causes the test to fail in the usual way, indicating loudly and clearly that something is wrong. [3] Well, 'error' aborts the test script the same way when run manually or by 'make' or 'prove', but both 'make' and 'prove' pay attention to the test script's exit status, and even a silently aborted test script would then trigger those tools' usual noticable error messages. [4] Strictly speaking, not all those 'error' calls need that redirection to send their output to the terminal, see e.g. 'test_expect_success' in the opening example, but I think it's better to be consistent. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-12	p3400: replace calls to `git checkout -b' by `git checkout -B'	Alban Gruin	1	-5/+5
	p3400 makes a copy of the current repository to test git-rebase performance, and creates new branches in the copy with `git checkout -b'. If the original repository has branches with the same name as the script is trying to create, this operation will fail. This replaces these calls by `git checkout -B' to force the creation and update of these branches. Signed-off-by: Alban Gruin <alban.gruin@gmail.com> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-10	Merge branch 'ab/fsck-skiplist'	Junio C Hamano	2	-0/+53
	Update fsck.skipList implementation and documentation. * ab/fsck-skiplist: fsck: support comments & empty lines in skipList fsck: use oidset instead of oid_array for skipList fsck: use strbuf_getline() to read skiplist file fsck: add a performance test for skipList fsck: add a performance test fsck: document that skipList input must be unabbreviated fsck: document and test commented & empty line skipList input fsck: document and test sorted skipList input fsck tests: add a test for no skipList input fsck tests: setup of bogus commit object
2018-09-12	fsck: add a performance test for skipList	René Scharfe	1	-0/+40
	Create a performance test to see how the skipList implementation performs. First we setup N bad commits, then we see how progressively working our way up to 0..N in increments of 10x does. I.e. the needle(s) in the haystack get progressively more numerous. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-12	fsck: add a performance test	Ævar Arnfjörð Bjarmason	1	-0/+13
	Add a plain performance test for "fsck". This test will not be used to / referred to in any upcoming commit of mine in this series, but having a simple test for fsck performance is valuable, so let's add it while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20	t/perf: add perf tests for fetches from a bitmapped server	Jeff King	1	-0/+45
	A server with bitmapped packs can serve a clone very quickly. However, fetches are not necessarily made any faster, because we spend a lot less time in object traversal (which is what bitmaps help with) and more time finding deltas (because we may have to throw out on-disk deltas if the client does not have the base). As a first step to making this faster, this patch introduces a new perf script to measure fetches into a repo of various ages from a fully-bitmapped server. We separately measure the work done by the server (in pack-objects) and that done by the client (in index-pack). Furthermore, we measure the size of the resulting pack. Breaking it down like this (instead of just doing a regular "git fetch") lets us see how much each side benefits from any changes. And since we know the pack size, if we estimate the network speed, then one could calculate a complete wall-clock time for the operation (though the script does not do this automatically). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20	t/perf: add infrastructure for measuring sizes	Jeff King	3	-5/+81
	The main objective of scripts in the perf framework is to run "test_perf", which measures the time it takes to run some operation. However, it can also be interesting to see the change in the output size of certain operations. This patch introduces test_size, which records a single numeric output from the test and shows it in the aggregated output (with pretty printing and relative size comparison). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20	t/perf: factor out percent calculations	Jeff King	1	-9/+12
	This will let us reuse the code when we add new values to aggregate besides times. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20	t/perf: factor boilerplate out of test_perf	Jeff King	1	-26/+35
	About half of test_perf() is boilerplate preparing to run _any_ test, and the other half is specifically running a timing test. Let's split it into two functions, so that we can reuse the boilerplate in future commits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-23	Merge branch 'cc/perf-bisect'	Junio C Hamano	1	-0/+6
	Performance test updates. * cc/perf-bisect: perf/bisect_run_script: disable codespeed
2018-05-06	perf/bisect_run_script: disable codespeed	Christian Couder	1	-0/+6
	When bisecting a performance regression using a config file, `./bisect_regression --config my_perf.conf` for example, the config file can contain Codespeed configuration which would instruct the 'aggregate.perl' script called by the 'run' script to output results in the Codespeed format and maybe to try to send this output to a Codespeed server. This is unfortunate because the 'bisect_run_script' relies on the regular output from 'aggregate.perl' to mesure performance, so let's disable Codespeed output and sending results to a Codespeed server. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-26	perf/aggregate: use Getopt::Long for option parsing	Christian Couder	1	-36/+26
	When passing an option '--foo' that it does not recognize, the aggregate.perl script should die with an helpful error message like: Unknown option: foo ./aggregate.perl [options] [--] [<dir_or_rev>...] [--] \ [<test_script>...] > Options: --codespeed * Format output for Codespeed --reponame <str> * Send given reponame to codespeed --sort-by <str> * Sort output (only "regression" \ criteria is supported) rather than: fatal: Needed a single revision rev-parse --verify --foo: command returned error: 128 To implement that let's use Getopt::Long for option parsing instead of the current manual and sloppy parsing. This should save some code and make option parsing simpler, tighter and safer. This will avoid something like 'foo--sort-by=regression' to be handled as if '--sort-by=regression' had been used, for example. As Getopt::Long eats '--' at the end of options, this changes a bit the way '--' is handled as we can now have '--' both after the options and before the scripts. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-25	Merge branch 'cc/perf-bisect'	Junio C Hamano	3	-10/+166
	Performance measuring framework in t/perf learned to help bisecting performance regressions. * cc/perf-bisect: t/perf: add scripts to bisect performance regressions perf/run: add --subsection option
2018-04-11	t/perf: add scripts to bisect performance regressions	Christian Couder	2	-0/+120
	The new bisect_regression script can be used to automatically bisect performance regressions. It will pass the new bisect_run_script to `git bisect run`. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11	perf/run: add --subsection option	Christian Couder	1	-10/+46
	This new option makes it possible to run perf tests as defined in only one subsection of a config file. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11	Merge branch 'nd/combined-test-helper'	Junio C Hamano	5	-13/+13
	Small test-helper programs have been consolidated into a single binary. * nd/combined-test-helper: (36 commits) t/helper: merge test-write-cache into test-tool t/helper: merge test-wildmatch into test-tool t/helper: merge test-urlmatch-normalization into test-tool t/helper: merge test-subprocess into test-tool t/helper: merge test-submodule-config into test-tool t/helper: merge test-string-list into test-tool t/helper: merge test-strcmp-offset into test-tool t/helper: merge test-sigchain into test-tool t/helper: merge test-sha1-array into test-tool t/helper: merge test-scrap-cache-tree into test-tool t/helper: merge test-run-command into test-tool t/helper: merge test-revision-walking into test-tool t/helper: merge test-regex into test-tool t/helper: merge test-ref-store into test-tool t/helper: merge test-read-cache into test-tool t/helper: merge test-prio-queue into test-tool t/helper: merge test-path-utils into test-tool t/helper: merge test-online-cpus into test-tool t/helper: merge test-mktemp into test-tool t/helper: merge (unused) test-mergesort into test-tool ...
2018-03-27	perf/aggregate: add --sort-by=regression option	Christian Couder	1	-1/+58
	One of the most interesting thing one can be interested in when looking at performance test results is possible performance regressions. This new option makes it easy to spot such possible regressions. This new option is named '--sort-by=regression' to make it possible and easy to add other ways to sort the results, like for example '--sort-by=utime'. If we would like to sort according to how much the stime regressed we could also add a new option called '--sort-by=regression:stime'. Then '--sort-by=regression' could become a synonym for '--sort-by=regression:rtime'. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27	perf/aggregate: add display_dir()	Christian Couder	1	-4/+7
	This new helper function will be reused in a subsequent commit. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27	t/helper: merge test-write-cache into test-tool	Nguyễn Thái Ngọc Duy	1	-1/+1
	Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27	t/helper: merge test-string-list into test-tool	Nguyễn Thái Ngọc Duy	1	-1/+1
	Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27	t/helper: merge test-read-cache into test-tool	Nguyễn Thái Ngọc Duy	1	-1/+1
	Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27	t/helper: merge test-drop-caches into test-tool	Nguyễn Thái Ngọc Duy	1	-6/+6
	Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-27	t/helper: merge test-lazy-init-name-hash into test-tool	Nguyễn Thái Ngọc Duy	1	-4/+4
	Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-27	perf: use GIT_PERF_REPEAT_COUNT=3 by default even without config file	René Scharfe	1	-5/+3
	9ba95ed23c (perf/run: update get_var_from_env_or_config() for subsections) stopped setting a default value for GIT_PERF_REPEAT_COUNT if no perf config file is present, because get_var_from_env_or_config returns early in that case. Fix it by setting the default value after calling this function. Its fifth parameter is not used for any other variable, so remove the associated code. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-15	Merge branch 'cc/perf-aggregate'	Junio C Hamano	1	-11/+37
	"make perf" enhancement. * cc/perf-aggregate: perf/aggregate: sort JSON fields in output perf/aggregate: add --reponame option perf/aggregate: add --subsection option
2018-02-13	Merge branch 'ab/simplify-perl-makefile'	Junio C Hamano	1	-1/+1
	The build procedure for perl/ part has been greatly simplified by weaning ourselves off of MakeMaker. * ab/simplify-perl-makefile: perl: treat PERLLIB_EXTRA as an extra path again perl: avoid *.pmc and fix Error.pm further Makefile: replace perl/Makefile.PL with simple make rules
2018-02-02	perf/aggregate: sort JSON fields in output	Christian Couder	1	-1/+1
	It is much easier to diff the output against a previous one when the fields are sorted. Helped-by: Philip Oakley <philipoakley@iee.org> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02	perf/aggregate: add --reponame option	Christian Couder	1	-2/+13
	This makes it easier to use the aggregate script on the command line when one wants to get the "environment" fields set in the codespeed output. Previously setting GIT_REPO_NAME was needed for this purpose. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02	perf/aggregate: add --subsection option	Christian Couder	1	-9/+24
	This makes it easier to use the aggregate script on the command line, to get results from subsections. Previously setting GIT_PERF_SUBSECTION was needed for this purpose. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-23	Merge branch 'cc/codespeed'	Junio C Hamano	2	-54/+137
	"perf" test output can be sent to codespeed server. * cc/codespeed: perf/run: read GIT_PERF_REPO_NAME from perf.repoName perf/run: learn to send output to codespeed server perf/run: learn about perf.codespeedOutput perf/run: add conf_opts argument to get_var_from_env_or_config() perf/aggregate: implement codespeed JSON output perf/aggregate: refactor printing results perf/aggregate: fix checking ENV{GIT_PERF_SUBSECTION}
2018-01-05	perf/run: read GIT_PERF_REPO_NAME from perf.repoName	Christian Couder	1	-0/+3
	The GIT_PERF_REPO_NAME env variable is used in the `aggregate.perl` script to set the 'environment' field in the JSON Codespeed output. Let's make it easy to set this variable by setting it in a config file. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05	perf/run: learn to send output to codespeed server	Christian Couder	1	-1/+11
	Let's make it possible to set in a config file the URL of a codespeed server. And then let's make the `run` script send the perf test results to this URL at the end of the tests. This should make is possible to easily automate the process of running perf tests and having their results available in Codespeed. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05	perf/run: learn about perf.codespeedOutput	Christian Couder	1	-1/+6
	Let's make it possible to set in a config file the output format (regular or codespeed) of the perf tests. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05	perf/run: add conf_opts argument to get_var_from_env_or_config()	Christian Couder	1	-5/+6
	Let's make it possible to use `git config` type specifiers like `--int` or `--bool`, so that config values are converted to the canonical form and easier to use. This additional argument is now the fourth argument of get_var_from_env_or_config() instead of the fifth because we want the default value argument to be unset if it is not passed, and this is simpler if it is the last argument. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05	perf/aggregate: implement codespeed JSON output	Christian Couder	1	-2/+62
	Codespeed (https://github.com/tobami/codespeed/) is an open source project that can be used to track how some software performs over time. It stores performance test results in a database and can show nice graphs and charts on a web interface. As it can be interesting to use Codespeed to see how Git performance evolves over time and releases, let's implement a Codespeed output in "perf/aggregate.perl". Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05	perf/aggregate: refactor printing results	Christian Couder	1	-46/+50
	As we want to implement another kind of output than the current output for the perf test results, let's refactor the existing code that outputs the results in its own print_default_results() function. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05	perf/aggregate: fix checking ENV{GIT_PERF_SUBSECTION}	Christian Couder	1	-1/+1
	The way we check ENV{GIT_PERF_SUBSECTION} could trigger comparison between undef and "" that may be flagged by use of strict & warnings. Let's fix that. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-04	perf: amend the grep tests to test grep.threads	Ævar Arnfjörð Bjarmason	2	-21/+86
	Ever since 5b594f457a ("Threaded grep", 2010-01-25) the number of threads git-grep uses under PTHREADS has been hardcoded to 8, but there's no performance test to check whether this is an optimal setting. Amend the existing tests for the grep engines to support a mode where this can be tested, e.g.: GIT_PERF_GREP_THREADS='1 8 16' GIT_PERF_LARGE_REPO=~/g/linux ./run p782* Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28	Merge branch 'bp/fsmonitor'	Junio C Hamano	1	-2/+1
	Test fix. * bp/fsmonitor: p7519: improve check for prerequisite WATCHMAN
2017-12-18	p7519: improve check for prerequisite WATCHMAN	René Scharfe	1	-2/+1
	The return code of command -v with a non-existing command is 1 in bash and 127 in dash. Use that return code directly to allow the script to work with dash and without watchman (e.g. on Debian). While at it stop redirecting the output. stderr is redirected to /dev/null by test_lazy_prereq already, and stdout can actually be useful -- the path of the found watchman executable is sent there, but it's shown only if the script was run with --verbose. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-13	Merge branch 'ds/for-each-file-in-obj-micro-optim'	Junio C Hamano	1	-0/+4
	The code to iterate over loose object files got optimized. * ds/for-each-file-in-obj-micro-optim: sha1_file: use strbuf_add() instead of strbuf_addf()
2017-12-11	Makefile: replace perl/Makefile.PL with simple make rules	Ævar Arnfjörð Bjarmason	1	-1/+1
	Replace the perl/Makefile.PL and the fallback perl/Makefile used under NO_PERL_MAKEMAKER=NoThanks with a much simpler implementation heavily inspired by how the i18n infrastructure's build process works[1]. The reason for having the Makefile.PL in the first place is that it was initially[2] building a perl C binding to interface with libgit, this functionality, that was removed[3] before Git.pm ever made it to the master branch. We've since since started maintaining a fallback perl/Makefile, as MakeMaker wouldn't work on some platforms[4]. That's just the tip of the iceberg. We have the PM.stamp hack in the top-level Makefile[5] to detect whether we need to regenerate the perl/perl.mak, which I fixed just recently to deal with issues like the perl version changing from under us[6]. There is absolutely no reason for why this needs to be so complex anymore. All we're getting out of this elaborate Rube Goldberg machine was copying perl/* to perl/blib/* as we do a string-replacement on the .pm files to hardcode @@LOCALEDIR@@ in the source, as well as pod2man-ing Git.pm & friends. So replace the whole thing with something that's pretty much a copy of how we generate po/build/.mo from po/.po, just with a small sed(1) command instead of msgfmt. As that's being done rename the files from .pm to .pmc just to indicate that they're generated (see "perldoc -f require"). While I'm at it, change the fallback for Error.pm from being something where we'll ship our own Error.pm if one doesn't exist at build time to one where we just use a Git::Error wrapper that'll always prefer the system-wide Error.pm, only falling back to our own copy if it really doesn't exist at runtime. It's now shipped as Git::FromCPAN::Error, making it easy to add other modules to Git::FromCPAN::* in the future if that's needed. Functional changes: * This will not always install into perl's idea of its global "installsitelib". This only potentially matters for packagers that need to expose Git.pm for non-git use, and as explained in the INSTALL file there's a trivial workaround. * The scripts themselves will 'use lib' the target directory, but if INSTLIBDIR is set it overrides it. It doesn't have to be this way, it could be set in addition to INSTLIBDIR, but my reading of [7] is that this is the desired behavior. * We don't build man pages for all of the perl modules as we used to, only Git(3pm). As discussed on-list[8] that we were building installed manpages for purely internal APIs like Git::I18N or private-Error.pm was always a bug anyway, and all the Git::SVN::* ones say they're internal APIs. There are apparently external users of Git.pm, but I don't expect there to be any of the others. As a side-effect of these general changes the perl documentation now only installed by install-{doc,man}, not a mere "install" as before. 1. 5e9637c629 ("i18n: add infrastructure for translating Git with gettext", 2011-11-18) 2. b1edc53d06 ("Introduce Git.pm (v4)", 2006-06-24) 3. 18b0fc1ce1 ("Git.pm: Kill Git.xs for now", 2006-09-23) 4. f848718a69 ("Make perl/ build procedure ActiveState friendly.", 2006-12-04) 5. ee9be06770 ("perl: detect new files in MakeMaker builds", 2012-07-27) 6. c59c4939c2 ("perl: regenerate perl.mak if perl -V changes", 2017-03-29) 7. 0386dd37b1 ("Makefile: add PERLLIB_EXTRA variable that adds to default perl path", 2013-11-15) 8. 87bmjjv1pu.fsf@evledraar.booking.com ("Re: [PATCH] Makefile: replace perl/Makefile.PL with simple make rules" Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-06	Merge branch 'jk/fewer-pack-rescan'	Junio C Hamano	3	-23/+82
	Internaly we use 0{40} as a placeholder object name to signal the codepath that there is no such object (e.g. the fast-forward check while "git fetch" stores a new remote-tracking ref says "we know there is no 'old' thing pointed at by the ref, as we are creating it anew" by passing 0{40} for the 'old' side), and expect that a codepath to locate an in-core object to return NULL as a sign that the object does not exist. A look-up for an object that does not exist however is quite costly with a repository with large number of packfiles. This access pattern has been optimized. * jk/fewer-pack-rescan: sha1_file: fast-path null sha1 as a missing object everything_local: use "quick" object existence check p5551: add a script to test fetch pack-dir rescans t/perf/lib-pack: use fast-import checkpoint to create packs p5550: factor out nonsense-pack creation
2017-12-06	Merge branch 'cc/perf-run-config'	Junio C Hamano	3	-15/+89
	* cc/perf-run-config: perf: store subsection results in "test-results/$GIT_PERF_SUBSECTION/" perf/run: show name of rev being built perf/run: add run_subsection() perf/run: update get_var_from_env_or_config() for subsections perf/run: add get_subsections() perf/run: add calls to get_var_from_env_or_config() perf/run: add GIT_PERF_DIRS_OR_REVS perf/run: add get_var_from_env_or_config() perf/run: add '--config' option to the 'run' script
2017-12-04	sha1_file: use strbuf_add() instead of strbuf_addf()	Derrick Stolee	1	-0/+4
	Replace use of strbuf_addf() with strbuf_add() when enumerating loose objects in for_each_file_in_obj_subdir(). Since we already check the length and hex-values of the string before consuming the path, we can prevent extra computation by using the lower- level method. One consumer of for_each_file_in_obj_subdir() is the abbreviation code. OID abbreviations use a cached list of loose objects (per object subdirectory) to make repeated queries fast, but there is significant cache load time when there are many loose objects. Most repositories do not have many loose objects before repacking, but in the GVFS case the repos can grow to have millions of loose objects. Profiling 'git log' performance in GitForWindows on a GVFS-enabled repo with ~2.5 million loose objects revealed 12% of the CPU time was spent in strbuf_addf(). Add a new performance test to p4211-line-log.sh that is more sensitive to this cache-loading. By limiting to 1000 commits, we more closely resemble user wait time when reading history into a pager. For a copy of the Linux repo with two ~512 MB packfiles and ~572K loose objects, running 'git log --oneline --parents --raw -1000' had the following performance: HEAD~1 HEAD ---------------------------------------- 7.70(7.15+0.54) 7.44(7.09+0.29) -3.4% Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>