tgif.git - Terin's Improved Git Fork

Age	Commit message	Author	Files	Lines
2014-02-27	Merge branch 'jk/pack-bitmap'	Junio C Hamano	1	-0/+57
	Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history.

	* jk/pack-bitmap: (26 commits)
	  ewah: unconditionally ntohll ewah data
	  ewah: support platforms that require aligned reads
	  read-cache: use get_be32 instead of hand-rolled ntoh_l
	  block-sha1: factor out get_be and put_be wrappers
	  do not discard revindex when re-preparing packfiles
	  pack-bitmap: implement optional name_hash cache
	  t/perf: add tests for pack bitmaps
	  t: add basic bitmap functionality tests
	  count-objects: recognize .bitmap in garbage-checking
	  repack: consider bitmaps when performing repacks
	  repack: handle optional files created by pack-objects
	  repack: turn exts array into array-of-struct
	  repack: stop using magic number for ARRAY_SIZE(exts)
	  pack-objects: implement bitmap writing
	  rev-list: add bitmap mode to speed up object lists
	  pack-objects: use bitmaps when packing objects
	  pack-objects: split add_object_entry
	  pack-bitmap: add support for bitmap indexes
	  documentation: add documentation for the bitmap format
	  ewah: compressed bitmap implementation
	  ...
2014-01-27	Merge branch 'jk/mark-edges-uninteresting'	Junio C Hamano	1	-0/+12
	Fix performance regression in v1.8.4.x and later.

	* jk/mark-edges-uninteresting:
	  list-objects: only look at cmdline trees with edge_hint
	  t/perf: time rev-list with UNINTERESTING commits
2014-01-21	t/perf: time rev-list with UNINTERESTING commits	Jeff King	1	-0/+12
	We time a straight "rev-list --all" and its "--objects" counterpart, both going all the way to the root. However, we do not time a partial history walk. This patch adds an extreme case: a walk over a very small slice of history, but with a very large set of UNINTERESTING tips. This is similar to the connectivity check run by git on a small fetch, or the walk done by any pre-receive hooks that want to check incoming commits.

	This test reveals a performance regression in git v1.8.4.2, caused by fbd4a70 (list-objects: mark more commits as edges in mark_edges_uninteresting, 2013-08-16):

	Test                                            fbd4a703^         fbd4a703
	------------------------------------------------------------------------------------------
	0001.1: rev-list --all                          0.69(0.67+0.02)   0.69(0.68+0.01) +0.0%
	0001.2: rev-list --all --objects                3.47(3.44+0.02)   3.48(3.44+0.03) +0.3%
	0001.4: rev-list $commit --not --all            0.04(0.04+0.00)   0.04(0.04+0.00) +0.0%
	0001.5: rev-list --objects $commit --not --all  0.04(0.03+0.00)   0.27(0.24+0.02) +575.0%

	Signed-off-by: Jeff King <peff@peff.net>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
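The percentage column in the perf table above appears to be the relative change in elapsed time (the figure outside the parentheses) against the first column. A minimal sketch of that arithmetic, assuming this reading is right; `pct_change` is a hypothetical helper and not part of t/perf:

```sh
#!/bin/sh
# Hypothetical helper (not part of t/perf): relative change of a new
# elapsed time against a baseline, formatted like the table column.
pct_change() {
	# $1 = baseline elapsed seconds, $2 = new elapsed seconds
	awk -v old="$1" -v new="$2" \
		'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

# Elapsed times for test 0001.5 from the table above.
pct_change 0.04 0.27
```

With the 0001.5 figures this prints +575.0%, matching the regression reported in the table.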
2013-12-30	pack-bitmap: implement optional name_hash cache	Vicent Marti	1	-1/+2
	When we use pack bitmaps rather than walking the object graph, we end up with the list of objects to include in the packfile, but we do not know the path at which any tree or blob objects would be found.

	In a recently packed repository, this is fine. A fetch would use the paths only as a heuristic in the delta compression phase, and a fully packed repository should not need to do much delta compression.

	As time passes, though, we may acquire more objects on top of our large bitmapped pack. If clients fetch frequently, then they never even look at the bitmapped history, and all works as usual. However, a client who has not fetched since the last bitmap repack will have "have" tips in the bitmapped history, but "want" newer objects.

	The bitmaps themselves degrade gracefully in this circumstance. We manually walk the more recent bits of history, and then use bitmaps when we hit them. But we would also like to perform delta compression between the newer objects and the bitmapped objects (both to delta against what we know the user already has, but also between "new" and "old" objects that the user is fetching). The lack of pathnames makes our delta heuristics much less effective.

	This patch adds an optional cache of the 32-bit name_hash values to the end of the bitmap file. If present, a reader can use it to match bitmapped and non-bitmapped names during delta compression.

	Here are perf results for p5310:

	Test                     origin/master      HEAD^                     HEAD
	-------------------------------------------------------------------------------------------------
	5310.2: repack to disk   36.81(37.82+1.43)  47.70(48.74+1.41) +29.6%  47.75(48.70+1.51) +29.7%
	5310.3: simulated clone  30.78(29.70+2.14)  1.08(0.97+0.10) -96.5%    1.07(0.94+0.12) -96.5%
	5310.4: simulated fetch  3.16(6.10+0.08)    3.54(10.65+0.06) +12.0%   1.70(3.07+0.06) -46.2%
	5310.6: partial bitmap   36.76(43.19+1.81)  6.71(11.25+0.76) -81.7%   4.08(6.26+0.46) -88.9%

	You can see that the time spent on an incremental fetch goes down, as our delta heuristics are able to do their work. And we save time on the partial bitmap clone for the same reason.

	Signed-off-by: Vicent Marti <tanoku@gmail.com>
	Signed-off-by: Jeff King <peff@peff.net>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30	t/perf: add tests for pack bitmaps	Jeff King	1	-0/+56
	This adds a few basic perf tests for the pack bitmap code to show off its improvements. The tests are:

	  1. How long does it take to do a repack (it gets slower with bitmaps, since we have to do extra work)?
	  2. How long does it take to do a clone (it gets faster with bitmaps)?
	  3. How does a small fetch perform when we've just repacked?
	  4. How does a clone perform when we haven't repacked since a week of pushes?

	Here are results against linux.git:

	Test                     origin/master      this tree
	-----------------------------------------------------------------------
	5310.2: repack to disk   33.64(32.64+2.04)  67.67(66.75+1.84) +101.2%
	5310.3: simulated clone  30.49(29.47+2.05)  1.20(1.10+0.10) -96.1%
	5310.4: simulated fetch  3.49(6.79+0.06)    5.57(22.35+0.07) +59.6%
	5310.6: partial bitmap   36.70(43.87+1.81)  8.18(21.92+0.73) -77.7%

	You can see that we do take longer to repack, but we do way better for further clones. A small fetch performs a bit worse, as we spend way more time on delta compression (note the heavy user CPU time, as we have 8 threads) due to the lack of name hashes for the bitmapped objects.

	The final test shows how the bitmaps degrade over time between packs. There's still a significant speedup over the non-bitmap case, but we don't do quite as well (we have to spend time accessing the "new" objects the old fashioned way, including delta compression).

	Signed-off-by: Jeff King <peff@peff.net>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-27	Merge branch 'tg/diff-no-index-refactor'	Junio C Hamano	1	-0/+22
	"git diff ../else/where/A ../else/where/B" when ../else/where is clearly outside the repository, and "git diff --no-index A B", do not have to look at the index at all, but we used to read the index unconditionally.

	* tg/diff-no-index-refactor:
	  diff: avoid some nesting
	  diff: add test for --no-index executed outside repo
	  diff: don't read index when --no-index is given
	  diff: move no-index detection to builtin/diff.c
2013-12-12	diff: don't read index when --no-index is given	Thomas Gummerer	1	-0/+22
	git diff --no-index ... currently reads the index, during setup, when calling gitmodules_config(). This results in worse performance when the index is not actually needed. This patch avoids calling gitmodules_config() when the --no-index option is given. The times for executing "git diff --no-index" in the WebKit repository are improved as follows:

	Test                     HEAD~3            HEAD
	------------------------------------------------------------------
	4001.1: diff --no-index  0.24(0.15+0.09)   0.01(0.00+0.00) -95.8%

	An additional improvement of this patch is that "git diff --no-index" no longer breaks when the index file is corrupt, which makes it possible to use it for investigating the broken repository.

	To make it more useful as an investigation tool for broken repositories, setup_git_directory_gently() is also not called when the --no-index option is given.

	Also add a test to guard against future breakages, and a performance test to show the improvements.

	Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-26	test: replace shebangs with descriptions in shell libraries	Jonathan Nieder	1	-1/+3
	A #! line in these files is misleading, since these scriptlets are meant to be sourced with '.' (using whatever shell sources them) instead of run directly using the interpreter named on the #! line. Removing the #! line shouldn't hurt syntax highlighting since these files have filenames ending with '.sh'. For documentation, add a brief description of how the files are meant to be used in place of the shebang line. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-01	Merge branch 'lf/echo-n-is-not-portable'	Junio C Hamano	1	-2/+2
	* lf/echo-n-is-not-portable:
	  Avoid using `echo -n` anywhere
2013-07-29	Avoid using `echo -n` anywhere	Lukas Fleischer	1	-2/+2
	`echo -n` is non-portable. The POSIX specification says:

	  Conforming applications that wish to do prompting without <newline> characters or that could possibly be expecting to echo a -n, should use the printf utility derived from the Ninth Edition system.

	Since all of the affected shell scripts use a POSIX shell shebang, replace `echo -n` invocations with printf.

	Signed-off-by: Lukas Fleischer <git@cryptocrack.de>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
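The replacement described above can be sketched as follows. `prompt` is a hypothetical helper, not taken from the affected scripts; the point is only that `printf '%s'` writes its argument without a trailing newline in any POSIX shell, whereas the handling of `echo -n` differs between implementations.

```sh
#!/bin/sh
# Hypothetical helper: print a prompt with no trailing newline.
# The text is passed as an argument to '%s' rather than used as the
# format string, so any '%' or '\' in it is printed literally.
prompt() {
	printf '%s' "$1"
}

prompt 'Continue? [y/n] '
```

Keeping the text out of the format string is a defensive choice: a prompt such as "100% sure?" would otherwise be misread as a conversion specification.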
2013-07-05	Merge branch 'tr/test-v-and-v-subtest-only'	Junio C Hamano	1	-1/+2
	Allows N instances of tests run in parallel, each running 1/N parts of the test suite under Valgrind, to speed things up.

	* tr/test-v-and-v-subtest-only:
	  perf-lib: fix start/stop of perf tests
	  test-lib: support running tests under valgrind in parallel
	  test-lib: allow prefixing a custom string before "ok N" etc.
	  test-lib: valgrind for only tests matching a pattern
	  test-lib: verbose mode for only tests matching a pattern
	  test-lib: self-test that --verbose works
	  test-lib: rearrange start/end of test_expect_* and test_skip
	  test-lib: refactor $GIT_SKIP_TESTS matching
	  test-lib: enable MALLOC_* for the actual tests
2013-06-29	perf-lib: fix start/stop of perf tests	Thomas Gummerer	1	-1/+2
	ae75342 (test-lib: rearrange start/end of test_expect_* and test_skip) changed the way tests are started and stopped, but did not update the perf tests. They therefore produced wrong output, because the test count was wrong. Fix this by starting and stopping the tests correctly. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Acked-by: Thomas Rast <trast@inf.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-22	Documentation: Update 'linux-2.6.git' -> 'linux.git'	W. Trevor King	1	-1/+1
	The 3.x tree has been out for a while now. The -2.6 repository name survived the initial release [1], but kernel.org now only lists 'linux.git' (for aegl as well as torvalds) [2].

	[1]: http://article.gmane.org/gmane.linux.kernel/1147422
	     On 2011-05-30 01:47:57 GMT, Linus Torvalds wrote:
	     > ... yes, that means that my git tree is still called
	     > "linux-2.6.git" on kernel.org.
	[2]: http://git.kernel.org/cgit/

	Signed-off-by: W. Trevor King <wking@tremily.us>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-20	Merge branch 'rs/discard-index-discard-array'	Junio C Hamano	1	-0/+14
	* rs/discard-index-discard-array:
	  read-cache: free cache in discard_index
	  read-cache: add simple performance test
2013-06-09	read-cache: add simple performance test	René Scharfe	1	-0/+14
	Add the helper test-read-cache, which can be used to call read_cache and discard_cache in a loop as well as a performance check based on it. Signed-off-by: René Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-02	Merge branch 'tr/line-log'	Junio C Hamano	1	-0/+34
	* tr/line-log:
	  git-log(1): remove --full-line-diff description
	  line-log: fix documentation formatting
	  log -L: improve comments in process_all_files()
	  log -L: store the path instead of a diff_filespec
	  log -L: test merge of parallel modify/rename
	  t4211: pass -M to 'git log -M -L...' test
	  log -L: fix overlapping input ranges
	  log -L: check range set invariants when we look it up
	  Speed up log -L... -M
	  log -L: :pattern:file syntax to find by funcname
	  Implement line-history search (git log -L)
	  Export rewrite_parents() for 'log -L'
	  Refactor parse_loc
2013-03-28	Implement line-history search (git log -L)	Thomas Rast	1	-0/+34
	This is a rewrite of much of Bo's work, mainly in an effort to split it into smaller, easier to understand routines.

	The algorithm is built around the struct range_set, which encodes a series of line ranges as intervals [a,b). This is used in two contexts:

	  * A set of lines we are tracking (which will change as we dig through history).
	  * To encode diffs, as pairs of ranges.

	The main routine is range_set_map_across_diff(). It processes the diff between a commit C and some parent P. It determines which diff hunks are relevant to the ranges tracked in C, and computes the new ranges for P.

	The algorithm is then simply to process history in topological order from newest to oldest, computing ranges and (partial) diffs. At branch points, we need to merge the ranges we are watching. We will find that many commits do not affect the chosen ranges, and mark them TREESAME (in addition to those already filtered by pathspec limiting). Another pass of history simplification then gets rid of such commits.

	This is wired as an extra filtering pass in the log machinery. This currently only reduces code duplication, but should allow for other simplifications and options to be used.

	Finally, we hook a diff printer into the output chain. Ideally we would wire directly into the diff logic, to optionally use features like word diff. However, that will require some major reworking of the diff chain, so we completely replace the output with our own diff for now.

	As this was a GSoC project, and has quite some history by now, many people have helped. In no particular order, thanks go to

	  Jakub Narebski <jnareb@gmail.com>
	  Jens Lehmann <Jens.Lehmann@web.de>
	  Jonathan Nieder <jrnieder@gmail.com>
	  Junio C Hamano <gitster@pobox.com>
	  Ramsay Jones <ramsay@ramsay1.demon.co.uk>
	  Will Palmer <wmpalmer@gmail.com>

	Apologies to everyone I forgot.

	Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
	Signed-off-by: Thomas Rast <trast@student.ethz.ch>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
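To make the half-open convention concrete: with intervals written as [a,b), two ranges overlap exactly when each starts before the other ends, so ranges that merely touch do not overlap. The sketch below shows only that test, the kind of check needed when deciding whether a diff hunk is relevant to a tracked range. It is not the logic of range_set_map_across_diff(), and `ranges_overlap` is a hypothetical name.

```sh
#!/bin/sh
# Hypothetical helper: succeed if [a1,b1) and [a2,b2) share a line.
# Arguments: a1 b1 a2 b2 (integers, with a < b in each pair).
ranges_overlap() {
	[ "$1" -lt "$4" ] && [ "$3" -lt "$2" ]
}

# A tracked range [10,20) against the hunks [15,25) and [20,30).
ranges_overlap 10 20 15 25 && echo "hunk [15,25) overlaps the tracked range"
ranges_overlap 10 20 20 30 || echo "hunk [20,30) does not overlap it"
```

Because the end of each interval is exclusive, [10,20) and [20,30) are adjacent but disjoint, which is why the second check fails.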
2013-03-09	perf: update documentation of GIT_PERF_REPEAT_COUNT	Antoine Pelisse	1	-1/+1
	Currently the documentation of GIT_PERF_REPEAT_COUNT says the default is five while "perf-lib.sh" uses a value of three as a default. Update the documentation so that it is consistent with the code. Signed-off-by: Antoine Pelisse <apelisse@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
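The default-and-override behavior being documented can be sketched as below. This illustrates the pattern only and is not the code in perf-lib.sh; `effective_repeat_count` is a hypothetical name, and the default of three is the value stated in the commit message above.

```sh
#!/bin/sh
# Hypothetical helper: report the repeat count a run would use,
# falling back to 3 when GIT_PERF_REPEAT_COUNT is unset or empty.
effective_repeat_count() {
	echo "${GIT_PERF_REPEAT_COUNT:-3}"
}

effective_repeat_count
```

Setting the variable in the environment before the call, for example to 5, makes the helper report that value instead of the default.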
2012-10-01	Merge branch 'ep/malloc-check-perturb'	Junio C Hamano	1	-1/+1
	Fixes a brown-paper bag bug.

	* ep/malloc-check-perturb:
	  MALLOC_CHECK: enable it, unless disabled explicitly
2012-09-26	MALLOC_CHECK: enable it, unless disabled explicitly	René Scharfe	1	-1/+1
	The malloc checks in tests are currently disabled. Actually evaluate the variable for turning them off and enable them if it's unset. Also use this opportunity to give it the more descriptive and consistent name TEST_NO_MALLOC_CHECK. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
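The corrected logic amounts to: enable the glibc checks unless TEST_NO_MALLOC_CHECK is set. A minimal sketch under stated assumptions, not the actual test-lib.sh code: the function name and the values 3 and 165 are illustrative choices.

```sh
#!/bin/sh
# Sketch only: the function name and the two values are assumptions.
setup_malloc_check_sketch() {
	if [ -n "$TEST_NO_MALLOC_CHECK" ]; then
		# Explicitly disabled: leave the environment alone.
		return 0
	fi
	# Ask glibc to check heap consistency, and to perturb allocated
	# and freed memory using a fixed byte so runs stay repeatable.
	MALLOC_CHECK_=3
	MALLOC_PERTURB_=165
	export MALLOC_CHECK_ MALLOC_PERTURB_
}
```

Testing the variable for non-emptiness, rather than ignoring it, is the step the commit message says was previously missing.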
2012-09-25	Merge branch 'ep/malloc-check-perturb'	Junio C Hamano	1	-0/+1
	Run our test scripts with MALLOC_CHECK_ and MALLOC_PERTURB_, the built-in memory access checking facility GNU libc has.

	* ep/malloc-check-perturb:
	  MALLOC_CHECK: various clean-ups
	  Add MALLOC_CHECK_ and MALLOC_PERTURB_ libc env to the test suite for detecting heap corruption
2012-09-17	MALLOC_CHECK: various clean-ups	Junio C Hamano	1	-0/+1
	The most important part of this change is to avoid affecting anything when test-lib is used from perf-lib. It also limits the effect of MALLOC_CHECK_ to what is run inside the actual test, and uses a fixed MALLOC_PERTURB_ value so that the tests stay repeatable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-17	t/perf: add "trash directory" to .gitignore	Ramkumar Ramachandra	1	-2/+3
	Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-04	tests: Introduce test_seq	Michał Kiedrowicz	1	-1/+1
	Jeff King wrote:

	> The seq command is GNU-ism, and is missing at least in older BSD releases and their derivatives, not to mention antique commercial Unixes.
	>
	> We already purged it in b3431bc (Don't use seq in tests, not everyone has it, 2007-05-02), but a few new instances have crept in. They went unnoticed because they are in scripts that are not run by default.

	Replace them with test_seq, which is implemented with a Perl snippet (proposed by Jeff). This is better than inlining the snippet everywhere it is needed, because it is easier to read and easier to reimplement (e.g. in C) if we ever decide to remove Perl from the test suite.

	Note that test_seq is not a complete replacement for seq(1). It has only what we need now, plus the ability to do something like "test_seq a m" should we want that in the future.

	There are also many places that do `for i in 1 2 3 ...`, but I'm not sure it is worth converting them to test_seq. That would mean spawning more Perl processes.

	Signed-off-by: Michał Kiedrowicz <michal.kiedrowicz@gmail.com>
	Acked-by: Jeff King <peff@peff.net>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
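For illustration, a seq-like helper can be sketched in plain shell, assuming the one-argument (count from 1) and two-argument (first and last) forms of seq(1). Note that this deliberately swaps the Perl snippet for a while loop, so unlike the approach in the commit it handles integers only and could not grow into something like "test_seq a m". `seq_sketch` is a hypothetical name.

```sh
#!/bin/sh
# Hypothetical stand-in for a seq-like helper: integers only.
# Usage: seq_sketch LAST   or   seq_sketch FIRST LAST
seq_sketch() {
	case $# in
	1) set -- 1 "$1" ;;
	2) ;;
	*) echo "usage: seq_sketch [first] last" >&2; return 1 ;;
	esac
	i=$1
	while [ "$i" -le "$2" ]; do
		echo "$i"
		i=$((i + 1))
	done
}

seq_sketch 3
```

The loop runs entirely inside the shell, which avoids an extra process per call but gives up the string ranges a Perl-based version could offer.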
2012-05-14	Merge branch 'nd/threaded-index-pack'	Junio C Hamano	1	-0/+40
	Enables threading in index-pack to resolve base data in parallel.

	By Nguyễn Thái Ngọc Duy (3) and Ramsay Jones (1)

	* nd/threaded-index-pack:
	  index-pack: disable threading if NO_PREAD is defined
	  index-pack: support multithreaded delta resolving
	  index-pack: restructure pack processing into three main functions
	  compat/win32/pthread.h: Add an pthread_key_delete() implementation
2012-05-07	index-pack: support multithreaded delta resolving	Nguyễn Thái Ngọc Duy	1	-0/+40
	This puts delta resolving on each base on a separate thread, one base cache per thread. Per-thread data is grouped in struct thread_local. When running with nr_threads == 1, no pthreads calls are made. The system essentially runs in non-thread mode.

	An experiment on a Xeon 24 core machine with git.git shows that performance does not increase in proportion to the number of cores. So by default, we use a maximum of 3 cores. Some numbers with --threads from 1 to 16:

	1..4
	real    0m8.003s   0m5.307s   0m4.321s   0m3.830s
	user    0m7.720s   0m8.009s   0m8.133s   0m8.305s
	sys     0m0.224s   0m0.372s   0m0.360s   0m0.360s

	5..8
	real    0m3.727s   0m3.604s   0m3.332s   0m3.369s
	user    0m9.361s   0m9.817s   0m9.525s   0m9.769s
	sys     0m0.584s   0m0.624s   0m0.540s   0m0.560s

	9..12
	real    0m3.036s   0m3.139s   0m3.177s   0m2.961s
	user    0m8.977s   0m10.205s  0m9.737s   0m10.073s
	sys     0m0.596s   0m0.680s   0m0.684s   0m0.680s

	13..16
	real    0m2.985s   0m2.894s   0m2.975s   0m2.971s
	user    0m9.825s   0m10.573s  0m10.833s  0m11.361s
	sys     0m0.788s   0m0.732s   0m0.904s   0m1.016s

	On an Intel dual core and linux-2.6.git:

	1..4
	real    2m37.789s  2m7.963s   2m0.920s   1m58.213s
	user    2m28.415s  2m52.325s  2m50.176s  2m41.187s
	sys     0m7.808s   0m11.181s  0m11.224s  0m10.731s

	Thanks to Ramsay Jones for troubleshooting and support on the MinGW platform.

	Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-09	p4000: use -3000 when promising -3000	Thomas Rast	1	-1/+1
	The 'log -3000 (baseline)' test accidentally still used -1000 from an earlier version. Noticed-by: Lawrence Holding <Lawrence.Holding@cubic.com> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-08	perf: export some important test-lib variables	Thomas Rast	2	-1/+14
	The only bug right now is that $GIT_TEST_CMP is needed for test_cmp to work. However, we also export the three most important paths for tests:

	  TEST_DIRECTORY
	  TRASH_DIRECTORY
	  GIT_BUILD_DIR

	Since they are available within test_expect_success, a future test writer may expect them to also be defined in test_perf.

	Signed-off-by: Thomas Rast <trast@student.ethz.ch>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-08	perf: load test-lib-functions from the correct directory	Thomas Rast	2	-1/+6
	Loading it in the subshells still referred to $TEST_DIRECTORY/.., which was only correct in preliminary versions of perf-lib.sh Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-06	perf: compare diff algorithms	Thomas Rast	1	-0/+29
	8c912ee (teach --histogram to diff, 2011-07-12) claimed histogram diff was faster than both Myers and patience. We have since incorporated a performance testing framework, so add a test that compares the various diff tasks performed in a real 'log -p' workload. This does indeed show that histogram diff slightly beats Myers, while patience is much slower than the others. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-17	Add a performance test for git-grep	Thomas Rast	1	-0/+23
	The only catch is that we don't really know what our repo contains, so we have to ignore any possible "not found" status from git-grep. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-17	Introduce a performance testing framework	Thomas Rast	9	-0/+688
	This introduces a performance testing framework under t/perf/. It tries to be as close to the test-lib.sh infrastructure as possible, and thus should be easy to get used to for git developers.

	The following points were considered for the implementation:

	1. You usually want to compare arbitrary revisions/build trees against each other. They may not have the performance test under consideration, or even the perf-lib.sh infrastructure.

	   To cope with this, the 'run' script lets you specify arbitrary build dirs and revisions. It even automatically builds the revisions if it doesn't have them at hand yet.

	2. Usually you would not want to run all tests. It would take too long anyway. The 'run' script lets you specify which tests to run; or you can also do it manually. There is a Makefile for discoverability and 'make clean', but it is not meant for real-world use.

	3. Creating test repos from scratch in every test is extremely time-consuming, and shipping or downloading such large/weird repos is out of the question.

	   We leave this decision to the user. Two different sizes of test repos can be configured, and the scripts just copy one or more of those (using hardlinks for the object store). By default it tries to use the build tree's git.git repository.

	   This is fairly fast and versatile. Using a copy instead of a clone preserves many properties that the user may want to test for, such as lots of loose objects, unpacked refs, etc.

	Signed-off-by: Thomas Rast <trast@student.ethz.ch>
	Signed-off-by: Junio C Hamano <gitster@pobox.com>