summaryrefslogtreecommitdiff
path: root/t/perf
AgeCommit message (Collapse)AuthorFilesLines
2013-12-30pack-bitmap: implement optional name_hash cacheLibravatar Vicent Marti1-1/+2
When we use pack bitmaps rather than walking the object graph, we end up with the list of objects to include in the packfile, but we do not know the path at which any tree or blob objects would be found. In a recently packed repository, this is fine. A fetch would use the paths only as a heuristic in the delta compression phase, and a fully packed repository should not need to do much delta compression. As time passes, though, we may acquire more objects on top of our large bitmapped pack. If clients fetch frequently, then they never even look at the bitmapped history, and all works as usual. However, a client who has not fetched since the last bitmap repack will have "have" tips in the bitmapped history, but "want" newer objects. The bitmaps themselves degrade gracefully in this circumstance. We manually walk the more recent bits of history, and then use bitmaps when we hit them. But we would also like to perform delta compression between the newer objects and the bitmapped objects (both to delta against what we know the user already has, but also between "new" and "old" objects that the user is fetching). The lack of pathnames makes our delta heuristics much less effective. This patch adds an optional cache of the 32-bit name_hash values to the end of the bitmap file. If present, a reader can use it to match bitmapped and non-bitmapped names during delta compression. Here are perf results for p5310: Test origin/master HEAD^ HEAD ------------------------------------------------------------------------------------------------- 5310.2: repack to disk 36.81(37.82+1.43) 47.70(48.74+1.41) +29.6% 47.75(48.70+1.51) +29.7% 5310.3: simulated clone 30.78(29.70+2.14) 1.08(0.97+0.10) -96.5% 1.07(0.94+0.12) -96.5% 5310.4: simulated fetch 3.16(6.10+0.08) 3.54(10.65+0.06) +12.0% 1.70(3.07+0.06) -46.2% 5310.6: partial bitmap 36.76(43.19+1.81) 6.71(11.25+0.76) -81.7% 4.08(6.26+0.46) -88.9% You can see that the time spent on an incremental fetch goes down, as our delta heuristics are able to do their work. And we save time on the partial bitmap clone for the same reason. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30t/perf: add tests for pack bitmapsLibravatar Jeff King1-0/+56
This adds a few basic perf tests for the pack bitmap code to show off its improvements. The tests are: 1. How long does it take to do a repack (it gets slower with bitmaps, since we have to do extra work)? 2. How long does it take to do a clone (it gets faster with bitmaps)? 3. How does a small fetch perform when we've just repacked? 4. How does a clone perform when we haven't repacked since a week of pushes? Here are results against linux.git: Test origin/master this tree ----------------------------------------------------------------------- 5310.2: repack to disk 33.64(32.64+2.04) 67.67(66.75+1.84) +101.2% 5310.3: simulated clone 30.49(29.47+2.05) 1.20(1.10+0.10) -96.1% 5310.4: simulated fetch 3.49(6.79+0.06) 5.57(22.35+0.07) +59.6% 5310.6: partial bitmap 36.70(43.87+1.81) 8.18(21.92+0.73) -77.7% You can see that we do take longer to repack, but we do way better for further clones. A small fetch performs a bit worse, as we spend way more time on delta compression (note the heavy user CPU time, as we have 8 threads) due to the lack of name hashes for the bitmapped objects. The final test shows how the bitmaps degrade over time between packs. There's still a significant speedup over the non-bitmap case, but we don't do quite as well (we have to spend time accessing the "new" objects the old fashioned way, including delta compression). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01Merge branch 'lf/echo-n-is-not-portable'Libravatar Junio C Hamano1-2/+2
* lf/echo-n-is-not-portable: Avoid using `echo -n` anywhere
2013-07-29Avoid using `echo -n` anywhereLibravatar Lukas Fleischer1-2/+2
`echo -n` is non-portable. The POSIX specification says: Conforming applications that wish to do prompting without <newline> characters or that could possibly be expecting to echo a -n, should use the printf utility derived from the Ninth Edition system. Since all of the affected shell scripts use a POSIX shell shebang, replace `echo -n` invocations with printf. Signed-off-by: Lukas Fleischer <git@cryptocrack.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-05Merge branch 'tr/test-v-and-v-subtest-only'Libravatar Junio C Hamano1-1/+2
Allows N instances of tests run in parallel, each running 1/N parts of the test suite under Valgrind, to speed things up. * tr/test-v-and-v-subtest-only: perf-lib: fix start/stop of perf tests test-lib: support running tests under valgrind in parallel test-lib: allow prefixing a custom string before "ok N" etc. test-lib: valgrind for only tests matching a pattern test-lib: verbose mode for only tests matching a pattern test-lib: self-test that --verbose works test-lib: rearrange start/end of test_expect_* and test_skip test-lib: refactor $GIT_SKIP_TESTS matching test-lib: enable MALLOC_* for the actual tests
2013-06-29perf-lib: fix start/stop of perf testsLibravatar Thomas Gummerer1-1/+2
ae75342 test-lib: rearrange start/end of test_expect_* and test_skip changed the way tests are started/stopped, but did not update the perf tests. They were therefore giving the wrong output, because of the wrong test count. Fix this by starting and stopping the tests correctly. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Acked-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-22Documentation: Update 'linux-2.6.git' -> 'linux.git'Libravatar W. Trevor King1-1/+1
The 3.x tree has been out for a while now. The -2.6 repository name survived the initial release [1], but kernel.org now only lists 'linux.git' (for aegl as well as torvalds) [2]. [1]: http://article.gmane.org/gmane.linux.kernel/1147422 On 2011-05-30 01:47:57 GMT, Linus Torvalds wrote: > ... yes, that means that my git tree is still called > "linux-2.6.git" on kernel.org. [2]: http://git.kernel.org/cgit/ Signed-off-by: W. Trevor King <wking@tremily.us> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-20Merge branch 'rs/discard-index-discard-array'Libravatar Junio C Hamano1-0/+14
* rs/discard-index-discard-array: read-cache: free cache in discard_index read-cache: add simple performance test
2013-06-09read-cache: add simple performance testLibravatar René Scharfe1-0/+14
Add the helper test-read-cache, which can be used to call read_cache and discard_cache in a loop as well as a performance check based on it. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02Merge branch 'tr/line-log'Libravatar Junio C Hamano1-0/+34
* tr/line-log: git-log(1): remove --full-line-diff description line-log: fix documentation formatting log -L: improve comments in process_all_files() log -L: store the path instead of a diff_filespec log -L: test merge of parallel modify/rename t4211: pass -M to 'git log -M -L...' test log -L: fix overlapping input ranges log -L: check range set invariants when we look it up Speed up log -L... -M log -L: :pattern:file syntax to find by funcname Implement line-history search (git log -L) Export rewrite_parents() for 'log -L' Refactor parse_loc
2013-03-28Implement line-history search (git log -L)Libravatar Thomas Rast1-0/+34
This is a rewrite of much of Bo's work, mainly in an effort to split it into smaller, easier to understand routines. The algorithm is built around the struct range_set, which encodes a series of line ranges as intervals [a,b). This is used in two contexts: * A set of lines we are tracking (which will change as we dig through history). * To encode diffs, as pairs of ranges. The main routine is range_set_map_across_diff(). It processes the diff between a commit C and some parent P. It determines which diff hunks are relevant to the ranges tracked in C, and computes the new ranges for P. The algorithm is then simply to process history in topological order from newest to oldest, computing ranges and (partial) diffs. At branch points, we need to merge the ranges we are watching. We will find that many commits do not affect the chosen ranges, and mark them TREESAME (in addition to those already filtered by pathspec limiting). Another pass of history simplification then gets rid of such commits. This is wired as an extra filtering pass in the log machinery. This currently only reduces code duplication, but should allow for other simplifications and options to be used. Finally, we hook a diff printer into the output chain. Ideally we would wire directly into the diff logic, to optionally use features like word diff. However, that will require some major reworking of the diff chain, so we completely replace the output with our own diff for now. As this was a GSoC project, and has quite some history by now, many people have helped. In no particular order, thanks go to Jakub Narebski <jnareb@gmail.com> Jens Lehmann <Jens.Lehmann@web.de> Jonathan Nieder <jrnieder@gmail.com> Junio C Hamano <gitster@pobox.com> Ramsay Jones <ramsay@ramsay1.demon.co.uk> Will Palmer <wmpalmer@gmail.com> Apologies to everyone I forgot. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-09perf: update documentation of GIT_PERF_REPEAT_COUNTLibravatar Antoine Pelisse1-1/+1
Currently the documentation of GIT_PERF_REPEAT_COUNT says the default is five while "perf-lib.sh" uses a value of three as a default. Update the documentation so that it is consistent with the code. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-01Merge branch 'ep/malloc-check-perturb'Libravatar Junio C Hamano1-1/+1
Fixes a brown-paper bag bug. * ep/malloc-check-perturb: MALLOC_CHECK: enable it, unless disabled explicitly
2012-09-26MALLOC_CHECK: enable it, unless disabled explicitlyLibravatar René Scharfe1-1/+1
The malloc checks in tests are currently disabled. Actually evaluate the variable for turning them off and enable them if it's unset. Also use this opportunity to give it the more descriptive and consistent name TEST_NO_MALLOC_CHECK. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-25Merge branch 'ep/malloc-check-perturb'Libravatar Junio C Hamano1-0/+1
Run our test scripts with MALLOC_CHECK_ and MALLOC_PERTURB_, the built-in memory access checking facility GNU libc has. * ep/malloc-check-perturb: MALLOC_CHECK: various clean-ups Add MALLOC_CHECK_ and MALLOC_PERTURB_ libc env to the test suite for detecting heap corruption
2012-09-17MALLOC_CHECK: various clean-upsLibravatar Junio C Hamano1-0/+1
The most important in this change is to avoid affecting anything when test-lib is used from perf-lib. It also limits the effect of the MALLOC_CHECK only to what is run inside the actual test, and uses a fixed MALLOC_PERTURB_ in order to avoid hurting repeatability of the tests. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-17t/perf: add "trash directory" to .gitignoreLibravatar Ramkumar Ramachandra1-2/+3
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-04tests: Introduce test_seqLibravatar Michał Kiedrowicz1-1/+1
Jeff King wrote: The seq command is GNU-ism, and is missing at least in older BSD releases and their derivatives, not to mention antique commercial Unixes. We already purged it in b3431bc (Don't use seq in tests, not everyone has it, 2007-05-02), but a few new instances have crept in. They went unnoticed because they are in scripts that are not run by default. Replace them with test_seq that is implemented with a Perl snippet (proposed by Jeff). This is better than inlining this snippet everywhere it's needed because it's easier to read and it's easier to change the implementation (e.g. to C) if we ever decide to remove Perl from the test suite. Note that test_seq is not a complete replacement for seq(1). It just has what we need now, in addition that it makes it possible for us to do something like "test_seq a m" if we wanted to in the future. There are also many places that do `for i in 1 2 3 ...` but I'm not sure if it's worth converting them to test_seq. That would introduce running more processes of Perl. Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-14Merge branch 'nd/threaded-index-pack'Libravatar Junio C Hamano1-0/+40
Enables threading in index-pack to resolve base data in parallel. By Nguyễn Thái Ngọc Duy (3) and Ramsay Jones (1) * nd/threaded-index-pack: index-pack: disable threading if NO_PREAD is defined index-pack: support multithreaded delta resolving index-pack: restructure pack processing into three main functions compat/win32/pthread.h: Add an pthread_key_delete() implementation
2012-05-07index-pack: support multithreaded delta resolvingLibravatar Nguyễn Thái Ngọc Duy1-0/+40
This puts delta resolving on each base on a separate thread, one base cache per thread. Per-thread data is grouped in struct thread_local. When running with nr_threads == 1, no pthreads calls are made. The system essentially runs in non-thread mode. An experiment on a Xeon 24 core machine with git.git shows that performance does not increase proportional to the number of cores. So by default, we use maximum 3 cores. Some numbers with --threads from 1 to 16: 1..4 real 0m8.003s 0m5.307s 0m4.321s 0m3.830s user 0m7.720s 0m8.009s 0m8.133s 0m8.305s sys 0m0.224s 0m0.372s 0m0.360s 0m0.360s 5..8 real 0m3.727s 0m3.604s 0m3.332s 0m3.369s user 0m9.361s 0m9.817s 0m9.525s 0m9.769s sys 0m0.584s 0m0.624s 0m0.540s 0m0.560s 9..12 real 0m3.036s 0m3.139s 0m3.177s 0m2.961s user 0m8.977s 0m10.205s 0m9.737s 0m10.073s sys 0m0.596s 0m0.680s 0m0.684s 0m0.680s 13..16 real 0m2.985s 0m2.894s 0m2.975s 0m2.971s user 0m9.825s 0m10.573s 0m10.833s 0m11.361s sys 0m0.788s 0m0.732s 0m0.904s 0m1.016s On an Intel dual core and linux-2.6.git 1..4 real 2m37.789s 2m7.963s 2m0.920s 1m58.213s user 2m28.415s 2m52.325s 2m50.176s 2m41.187s sys 0m7.808s 0m11.181s 0m11.224s 0m10.731s Thanks Ramsay Jones for troubleshooting and support on MinGW platform. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-09p4000: use -3000 when promising -3000Libravatar Thomas Rast1-1/+1
The 'log -3000 (baseline)' test accidentally still used -1000 from an earlier version. Noticed-by: Lawrence Holding <Lawrence.Holding@cubic.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-08perf: export some important test-lib variablesLibravatar Thomas Rast2-1/+14
The only bug right now is that $GIT_TEST_CMP is needed for test_cmp to work. However, we also export the three most important paths for tests: TEST_DIRECTORY TRASH_DIRECTORY GIT_BUILD_DIR Since they are available within test_expect_success, a future test writer may expect them to also be defined in test_perf. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-08perf: load test-lib-functions from the correct directoryLibravatar Thomas Rast2-1/+6
Loading it in the subshells still referred to $TEST_DIRECTORY/.., which was only correct in preliminary versions of perf-lib.sh Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-06perf: compare diff algorithmsLibravatar Thomas Rast1-0/+29
8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff was faster than both Myers and patience. We have since incorporated a performance testing framework, so add a test that compares the various diff tasks performed in a real 'log -p' workload. This does indeed show that histogram diff slightly beats Myers, while patience is much slower than the others. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-17Add a performance test for git-grepLibravatar Thomas Rast1-0/+23
The only catch is that we don't really know what our repo contains, so we have to ignore any possible "not found" status from git-grep. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-17Introduce a performance testing frameworkLibravatar Thomas Rast9-0/+688
This introduces a performance testing framework under t/perf/. It tries to be as close to the test-lib.sh infrastructure as possible, and thus should be easy to get used to for git developers. The following points were considered for the implementation: 1. You usually want to compare arbitrary revisions/build trees against each other. They may not have the performance test under consideration, or even the perf-lib.sh infrastructure. To cope with this, the 'run' script lets you specify arbitrary build dirs and revisions. It even automatically builds the revisions if it doesn't have them at hand yet. 2. Usually you would not want to run all tests. It would take too long anyway. The 'run' script lets you specify which tests to run; or you can also do it manually. There is a Makefile for discoverability and 'make clean', but it is not meant for real-world use. 3. Creating test repos from scratch in every test is extremely time-consuming, and shipping or downloading such large/weird repos is out of the question. We leave this decision to the user. Two different sizes of test repos can be configured, and the scripts just copy one or more of those (using hardlinks for the object store). By default it tries to use the build tree's git.git repository. This is fairly fast and versatile. Using a copy instead of a clone preserves many properties that the user may want to test for, such as lots of loose objects, unpacked refs, etc. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>