tgif.git - Terin's Improved Git Fork

Age	Commit message (Collapse)	Author	Files	Lines
2014-07-23	Merge branch 'jk/tag-sort'	Junio C Hamano	1	-0/+5
	* jk/tag-sort: tag: support configuring --sort via .gitconfig tag: fix --sort tests to use cat<<-\EOF format
2014-07-17	tag: support configuring --sort via .gitconfig	Jacob Keller	1	-0/+5
	Add support for configuring default sort ordering for git tags. Command line option will override this configured value, using the exact same syntax. Cc: Jeff King <peff@peff.net> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-25	Merge branch 'jk/repack-pack-writebitmaps-config'	Junio C Hamano	1	-7/+10
	* jk/repack-pack-writebitmaps-config: t7700: drop explicit --no-pack-kept-objects from .keep test repack: introduce repack.writeBitmaps config option repack: simplify handling of --write-bitmap-index pack-objects: stop respecting pack.writebitmaps
2014-06-16	Merge branch 'sh/enable-preloadindex'	Junio C Hamano	1	-2/+2
	* sh/enable-preloadindex: environment.c: enable core.preloadindex by default
2014-06-16	Merge branch 'jm/format-patch-mail-sig'	Junio C Hamano	1	-0/+4
	* jm/format-patch-mail-sig: format-patch: add "--signature-file=<file>" option format-patch: make newline after signature conditional
2014-06-16	Merge branch 'jl/status-added-submodule-is-never-ignored'	Junio C Hamano	1	-2/+6
	submodule..ignore and diff.ignoresubmodules are used to ignore all submodule changes in "diff" output, but it can be confusing to apply these configuration values to status and commit. This is a backward-incompatible change, but should be so in a good way (aka bugfix). jl/status-added-submodule-is-never-ignored: commit -m: commit staged submodules regardless of ignore config status/commit: show staged submodules regardless of ignore config
2014-06-10	repack: introduce repack.writeBitmaps config option	Jeff King	1	-7/+10
	We currently have pack.writeBitmaps, which originally operated at the pack-objects level. This should really have been a repack.* option from day one. Let's give it the more sensible name, but keep the old version as a deprecated synonym. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-06	Merge branch 'jj/command-line-adjective'	Junio C Hamano	1	-5/+5
	* jj/command-line-adjective: Documentation: use "command-line" when used as a compound adjective, and fix other minor grammatical issues
2014-06-06	Merge branch 'nd/status-auto-comment-char'	Junio C Hamano	1	-0/+3
	* nd/status-auto-comment-char: commit: allow core.commentChar=auto for character auto selection config: be strict on core.commentChar
2014-06-06	Merge branch 'dk/raise-core-deltabasecachelimit'	Junio C Hamano	1	-1/+1
	The `core.deltabasecachelimit` used to default to 16 MiB , but this proved to be too small, and has been bumped to 96 MiB. * dk/raise-core-deltabasecachelimit: Bump core.deltaBaseCacheLimit to 96m
2014-06-06	Merge branch 'mm/pager-less-sans-S'	Junio C Hamano	1	-5/+10
	Since the very beginning of Git, we gave the LESS environment a default value "FRSX" when we spawn "less" as the pager. "S" (chop long lines instead of wrapping) has been removed from this default set of options, because it is more or less a personal taste thing, as opposed to others that have good justifications (i.e. "R" is very much justified because many kinds of output we produce are colored and "FX" is justified because output we produce is often shorter than a page). Existing users who prefer not to see line-wrapped output may want to set $ git config core.pager "less -S" to restore the traditional behaviour. It is expected that people find output from the most subcommands easier to read with the new default, except for "blame" which tends to produce really long lines. To override the new default only for "git blame", you can do this: $ git config pager.blame "less -S" * mm/pager-less-sans-S: pager: remove 'S' from $LESS by default
2014-06-03	environment.c: enable core.preloadindex by default	Steve Hoelzer	1	-2/+2
	Many people are on filesystems with horrible stat latency (not limited to Windows but also NFS), which core.preloadindex was designed to help. We discussed enabling it by default early in 2013 but didn't. Per http://thread.gmane.org/gmane.comp.version-control.git/219273/focus=219322 let's enable the setting by default, with the original choice of max 20 threads / min 500 paths per thread parameters. Signed-off-by: Steve Hoelzer <shoelzer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27	format-patch: add "--signature-file=<file>" option	Jeremiah Mahler	1	-0/+4
	Add an option to format-patch for reading a signature from a file. $ git format-patch -1 --signature-file=$HOME/.signature The config variable `format.signaturefile` can also be used to make this the default. $ git config format.signaturefile $HOME/.signature $ git format-patch -1 Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-21	Documentation: use "command-line" when used as a compound adjective, and fix ↵	Jason St. John	1	-5/+5
	other minor grammatical issues Signed-off-by: Jason St. John <jstjohn@purdue.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-19	commit: allow core.commentChar=auto for character auto selection	Nguyễn Thái Ngọc Duy	1	-0/+3
	When core.commentChar is "auto", the comment char starts with '#' as in default but if it's already in the prepared message, find another char in a small subset. This should stop surprises because git strips some lines unexpectedly. Note that git is not smart enough to recognize '#' as the comment char in custom templates and convert it if the final comment char is different. It thinks '#' lines in custom templates as part of the commit message. So don't use this with custom templates. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-07	pager: remove 'S' from $LESS by default	Matthieu Moy	1	-5/+10
	By default, Git used to set $LESS to -FRSX if $LESS was not set by the user. The FRX flags actually make sense for Git (F and X because sometimes the output Git pipes to less is short, and R because Git pipes colored output). The S flag (chop long lines), on the other hand, is not related to Git and is a matter of user preference. Git should not decide for the user to change LESS's default. More specifically, the S flag harms users who review untrusted code within a pager, since a patch looking like: -old code; +new good code; [... lots of tabs ...] malicious code; would appear identical to: -old code; +new good code; Users who prefer the old behavior can still set the $LESS environment variable to -FRSX explicitly, or set core.pager to 'less -S'. The documentation in config.txt is made a bit longer to keep both an example setting the 'S' flag (needed to recover the old behavior) and an example showing how to unset a flag set by Git. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-06	Bump core.deltaBaseCacheLimit to 96m	David Kastrup	1	-1/+1
	The default of 16m causes serious thrashing for large delta chains combined with large files. Here are some benchmarks (pu variant of git blame): time git blame -C src/xdisp.c >/dev/null for a repository of Emacs repacked with git gc --aggressive (v1.9, resulting in a window size of 250) located on an SSD drive. The file in question has about 30000 lines, 1Mb of size, and a history with about 2500 commits. 16m (previous default): real 3m33.936s user 2m15.396s sys 1m17.352s 32m: real 3m1.319s user 2m8.660s sys 0m51.904s 64m: real 2m20.636s user 1m55.780s sys 0m23.964s 96m: real 2m5.668s user 1m50.784s sys 0m14.288s 128m: real 2m4.337s user 1m50.764s sys 0m12.832s 192m: real 2m3.567s user 1m49.508s sys 0m13.312s Signed-off-by: David Kastrup <dak@gnu.org> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21	Documentation: git-gui: describe gui.displayuntracked	Max Kirillov	1	-0/+4
	Signed-off-by: Max Kirillov <max@max630.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-08	Merge branch 'jl/nor-or-nand-and'	Junio C Hamano	1	-3/+3
	Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"
2014-04-07	status/commit: show staged submodules regardless of ignore config	Jens Lehmann	1	-2/+6
	Currently setting submodule.<name>.ignore and/or diff.ignoreSubmodules to "all" suppresses all output of submodule changes for the diff family, status and commit. For status and commit this is really confusing, as it even when the user chooses to record a new commit for an ignored submodule by adding it manually this change won't show up under the to-be-committed changes. To add insult to injury, a later "git commit" will error out with "nothing to commit" when only ignored submodules are staged. Fix that by making wt_status always print staged submodule changes, no matter what ignore settings are configured. The only exception is when the user explicitly uses the "--ignore-submodules=all" command line option, in that case the submodule output is still suppressed. This also makes "git commit" work again when only modifications of ignored submodules are staged, as that command uses the "commitable" member of the wt_status struct to determine if staged changes are present. But this only happens when the commit command uses the wt_status* functions to produce status output for human consumption (when forking an editor or with --dry-run), in all other cases (e.g. when run in a script with '-m') another code path is taken which uses index_differs_from() to determine if any changes are staged which still ignores submodules according to their configuration. This will be fixed in a follow-up commit. Change t7508 to reflect this new behavior and add three new tests to show that a single staged submodule configured to be ignored will be committed when the status output is generated and won't be if not. Also update the documentation of the ignore config options accordingly. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-03	Merge branch 'nd/gc-aggressive'	Junio C Hamano	1	-0/+5
	Allow tweaking the maximum length of the delta-chain produced by "gc --aggressive". * nd/gc-aggressive: environment.c: fix constness for odb_pack_keep() gc --aggressive: make --depth configurable
2014-03-31	Merge branch 'ca/doc-config-third-party'	Junio C Hamano	1	-2/+7
	* ca/doc-config-third-party: config.txt: third-party tools may and do use their own variables
2014-03-31	Documentation: fix misuses of "nor"	Justin Lebar	1	-3/+3
	Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31	gc --aggressive: make --depth configurable	Nguyễn Thái Ngọc Duy	1	-0/+5
	When 1c192f3 (gc --aggressive: make it really aggressive - 2007-12-06) made --depth=250 the default value, it didn't really explain the reason behind, especially the pros and cons of --depth=250. An old mail from Linus below explains it at length. Long story short, --depth=250 is a disk saver and a performance killer. Not everybody agrees on that aggressiveness. Let the user configure it. From: Linus Torvalds <torvalds@linux-foundation.org> Subject: Re: [PATCH] gc --aggressive: make it really aggressive Date: Thu, 6 Dec 2007 08:19:24 -0800 (PST) Message-ID: <alpine.LFD.0.9999.0712060803430.13796@woody.linux-foundation.org> Gmane-URL: http://article.gmane.org/gmane.comp.gcc.devel/94637 On Thu, 6 Dec 2007, Harvey Harrison wrote: > > 7:41:25elapsed 86%CPU Heh. And this is why you want to do it exactly once, and then just export the end result for others ;) > -r--r--r-- 1 hharrison hharrison 324094684 2007-12-06 07:26 pack-1d46...pack But yeah, especially if you allow longer delta chains, the end result can be much smaller (and what makes the one-time repack more expensive is the window size, not the delta chain - you could make the delta chains longer with no cost overhead at packing time) HOWEVER. The longer delta chains do make it potentially much more expensive to then use old history. So there's a trade-off. And quite frankly, a delta depth of 250 is likely going to cause overflows in the delta cache (which is only 256 entries in size and it's a hash, so it's going to start having hash conflicts long before hitting the 250 depth limit). So when I said "--depth=250 --window=250", I chose those numbers more as an example of extremely aggressive packing, and I'm not at all sure that the end result is necessarily wonderfully usable. It's going to save disk space (and network bandwidth - the delta's will be re-used for the network protocol too!), but there are definitely downsides too, and using long delta chains may simply not be worth it in practice. (And some of it might just want to have git tuning, ie if people think that long deltas are worth it, we could easily just expand on the delta hash, at the cost of some more memory used!) That said, the good news is that working with new history will not be affected negatively, and if you want to be _really_ sneaky, there are ways to say "create a pack that contains the history up to a version one year ago, and be very aggressive about those old versions that we still want to have around, but do a separate pack for newer stuff using less aggressive parameters" So this is something that can be tweaked, although we don't really have any really nice interfaces for stuff like that (ie the git delta cache size is hardcoded in the sources and cannot be set in the config file, and the "pack old history more aggressively" involves some manual scripting and knowing how "git pack-objects" works rather than any nice simple command line switch). So the thing to take away from this is: - git is certainly flexible as hell - .. but to get the full power you may need to tweak things - .. happily you really only need to have one person to do the tweaking, and the tweaked end results will be available to others that do not need to know/care. And whether the difference between 320MB and 500MB is worth any really involved tweaking (considering the potential downsides), I really don't know. Only testing will tell. Linus Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-21	config.txt: third-party tools may and do use their own variables	Chris Angelico	1	-2/+7
	Signed-off-by: Chris Angelico <rosuav@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18	Merge branch 'jk/repack-pack-keep-objects'	Junio C Hamano	1	-0/+7
	* jk/repack-pack-keep-objects: repack: add `repack.packKeptObjects` config var
2014-03-14	Merge branch 'sr/add--interactive-term-readkey'	Junio C Hamano	1	-1/+1
	* sr/add--interactive-term-readkey: git-add--interactive: warn if module for interactive.singlekey is missing git-config: document interactive.singlekey requires Term::ReadKey
2014-03-14	Merge branch 'sg/archive-restrict-remote'	Junio C Hamano	1	-0/+7
	Allow loosening remote "git archive" invocation security check that refuses to serve tree-ish not at the tip of any ref. * sg/archive-restrict-remote: add uploadarchive.allowUnreachable option docs: clarify remote restrictions for git-upload-archive
2014-03-14	Merge branch 'tg/index-v4-format'	Junio C Hamano	1	-0/+4
	* tg/index-v4-format: read-cache: add index.version config variable test-lib: allow setting the index format version introduce GIT_INDEX_VERSION environment variable
2014-03-07	Merge branch 'jc/push-2.0-default-to-simple'	Junio C Hamano	1	-10/+4
	Finally update the "git push" default behaviour to "simple".
2014-03-05	Merge branch 'nd/daemonize-gc'	Junio C Hamano	1	-0/+4
	Allow running "gc --auto" in the background. * nd/daemonize-gc: gc: config option for running --auto in background daemon: move daemonize() to libgit.a
2014-03-03	git-config: document interactive.singlekey requires Term::ReadKey	Simon Ruderich	1	-1/+1
	Most distributions don't require Term::ReadKey as dependency, leaving the user to wonder why the setting doesn't work. Signed-off-by: Simon Ruderich <simon@ruderich.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03	repack: add `repack.packKeptObjects` config var	Jeff King	1	-0/+7
	The git-repack command always passes `--honor-pack-keep` to pack-objects. This has traditionally been a good thing, as we do not want to duplicate those objects in a new pack, and we are not going to delete the old pack. However, when bitmaps are in use, it is important for a full repack to include all reachable objects, even if they may be duplicated in a .keep pack. Otherwise, we cannot generate the bitmaps, as the on-disk format requires the set of objects in the pack to be fully closed. Even if the repository does not generally have .keep files, a simultaneous push could cause a race condition in which a .keep file exists at the moment of a repack. The repack may try to include those objects in one of two situations: 1. The pushed .keep pack contains objects that were already in the repository (e.g., blobs due to a revert of an old commit). 2. Receive-pack updates the refs, making the objects reachable, but before it removes the .keep file, the repack runs. In either case, we may prefer to duplicate some objects in the new, full pack, and let the next repack (after the .keep file is cleaned up) take care of removing them. This patch introduces both a command-line and config option to disable the `--honor-pack-keep` option. By default, it is triggered when pack.writeBitmaps (or `--write-bitmap-index` is turned on), but specifying it explicitly can override the behavior (e.g., in cases where you prefer .keep files to bitmaps, but only when they are present). Note that this option just disables the pack-objects behavior. We still leave packs with a .keep in place, as we do not necessarily know that we have duplicated all of their objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-28	add uploadarchive.allowUnreachable option	Scott J. Goldman	1	-0/+7
	In commit ee27ca4, we started restricting remote git-archive invocations to only accessing reachable commits. This matches what upload-pack allows, but does restrict some useful cases (e.g., HEAD:foo). We loosened this in 0f544ee, which allows `foo:bar` as long as `foo` is a ref tip. However, that still doesn't allow many useful things, like: 1. Commits accessible from a ref, like `foo^:bar`, which are reachable 2. Arbitrary sha1s, even if they are reachable. We can do a full object-reachability check for these cases, but it can be quite expensive if the client has sent us the sha1 of a tree; we have to visit every sub-tree of every commit in the worst case. Let's instead give site admins an escape hatch, in case they prefer the more liberal behavior. For many sites, the full object database is public anyway (e.g., if you allow dumb walker access), or the site admin may simply decide the security/convenience tradeoff is not worth it. This patch adds a new config option to disable the restrictions added in ee27ca4. It defaults to off, meaning there is no change in behavior by default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27	Merge branch 'jk/pack-bitmap'	Junio C Hamano	1	-0/+25
	Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...
2014-02-27	Merge branch 'da/pull-ff-configuration'	Junio C Hamano	1	-0/+10
	"git pull" learned to pay attention to pull.ff configuration variable. * da/pull-ff-configuration: pull: add --ff-only to the help text pull: add pull.ff configuration
2014-02-27	Merge branch 'nv/commit-gpgsign-config'	Junio C Hamano	1	-0/+8
	Introduce commit.gpgsign configuration variable to force every commit to be GPG signed. The variable cannot be overriden from the command line of some of the commands that create commits except for "git commit" and "git commit-tree", but I am not convinced that it is a good idea to sprinkle support for --no-gpg-sign everywhere, which in turn means that this configuration variable may not be such a good idea. * nv/commit-gpgsign-config: test the commit.gpgsign config option commit-tree: add and document --no-gpg-sign commit-tree: add the commit.gpgsign option to sign all commits
2014-02-24	commit-tree: add the commit.gpgsign option to sign all commits	Nicolas Vigier	1	-0/+8
	If you want to GPG sign all your commits, you have to add the -S option all the time. The commit.gpgsign config option allows to sign all commits automatically. Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24	read-cache: add index.version config variable	Thomas Gummerer	1	-0/+4
	Add a config variable that allows setting the default index version when initializing a new index file. Similar to the GIT_INDEX_VERSION environment variable this only affects new index files. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10	gc: config option for running --auto in background	Nguyễn Thái Ngọc Duy	1	-0/+4
	`gc --auto` takes time and can block the user temporarily (but not any less annoyingly). Make it run in background on systems that support it. The only thing lost with running in background is printouts. But gc output is not really interesting. You can keep it in foreground by changing gc.autodetach. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17	Merge branch 'nd/shallow-clone'	Junio C Hamano	1	-0/+4
	Fetching from a shallow-cloned repository used to be forbidden, primarily because the codepaths involved were not carefully vetted and we did not bother supporting such usage. This attempts to allow object transfer out of a shallow-cloned repository in a controlled way (i.e. the receiver become a shallow repository with truncated history). * nd/shallow-clone: (31 commits) t5537: fix incorrect expectation in test case 10 shallow: remove unused code send-pack.c: mark a file-local function static git-clone.txt: remove shallow clone limitations prune: clean .git/shallow after pruning objects clone: use git protocol for cloning shallow repo locally send-pack: support pushing from a shallow clone via http receive-pack: support pushing to a shallow clone via http smart-http: support shallow fetch/clone remote-curl: pass ref SHA-1 to fetch-pack as well send-pack: support pushing to a shallow clone receive-pack: allow pushes that update .git/shallow connected.c: add new variant that runs with --shallow-file add GIT_SHALLOW_FILE to propagate --shallow-file to subprocesses receive/send-pack: support pushing from a shallow clone receive-pack: reorder some code in unpack() fetch: add --update-shallow to accept refs that update .git/shallow upload-pack: make sure deepening preserves shallow roots fetch: support fetching from a shallow repository clone: support remote shallow repository ...
2014-01-15	pull: add pull.ff configuration	David Aguilar	1	-0/+10
	Add a `pull.ff` configuration option that is analogous to the `merge.ff` option. This allows us to control the fast-forward behavior for pull-initiated merges only. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-13	Merge branch 'jn/pager-lv-default-env'	Junio C Hamano	1	-0/+4
	Just like we give a reasonable default for "less" via the LESS environment variable, specify a reasonable default for "lv" via the "LV" environment variable when spawning the pager. * jn/pager-lv-default-env: pager: set LV=-c alongside LESS=FRSX
2014-01-07	pager: set LV=-c alongside LESS=FRSX	Jonathan Nieder	1	-0/+4
	On systems with lv configured as the preferred pager (i.e., DEFAULT_PAGER=lv at build time, or PAGER=lv exported in the environment) git commands that use color show control codes instead of color in the pager: $ git diff ^[[1mdiff --git a/.mailfilter b/.mailfilter^[[m ^[[1mindex aa4f0b2..17e113e 100644^[[m ^[[1m--- a/.mailfilter^[[m ^[[1m+++ b/.mailfilter^[[m ^[[36m@@ -1,11 +1,58 @@^[[m "less" avoids this problem because git uses the LESS environment variable to pass the -R option ('output ANSI color escapes in raw form') by default. Use the LV environment variable to pass 'lv' the -c option ('allow ANSI escape sequences for text decoration / color') to fix it for lv, too. Noticed when the default value for color.ui flipped to 'auto' in v1.8.4-rc0~36^2~1 (2013-06-10). Reported-by: Olaf Meeuwissen <olaf.meeuwissen@avasys.jp> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30	pack-bitmap: implement optional name_hash cache	Vicent Marti	1	-0/+11
	When we use pack bitmaps rather than walking the object graph, we end up with the list of objects to include in the packfile, but we do not know the path at which any tree or blob objects would be found. In a recently packed repository, this is fine. A fetch would use the paths only as a heuristic in the delta compression phase, and a fully packed repository should not need to do much delta compression. As time passes, though, we may acquire more objects on top of our large bitmapped pack. If clients fetch frequently, then they never even look at the bitmapped history, and all works as usual. However, a client who has not fetched since the last bitmap repack will have "have" tips in the bitmapped history, but "want" newer objects. The bitmaps themselves degrade gracefully in this circumstance. We manually walk the more recent bits of history, and then use bitmaps when we hit them. But we would also like to perform delta compression between the newer objects and the bitmapped objects (both to delta against what we know the user already has, but also between "new" and "old" objects that the user is fetching). The lack of pathnames makes our delta heuristics much less effective. This patch adds an optional cache of the 32-bit name_hash values to the end of the bitmap file. If present, a reader can use it to match bitmapped and non-bitmapped names during delta compression. Here are perf results for p5310: Test origin/master HEAD^ HEAD ------------------------------------------------------------------------------------------------- 5310.2: repack to disk 36.81(37.82+1.43) 47.70(48.74+1.41) +29.6% 47.75(48.70+1.51) +29.7% 5310.3: simulated clone 30.78(29.70+2.14) 1.08(0.97+0.10) -96.5% 1.07(0.94+0.12) -96.5% 5310.4: simulated fetch 3.16(6.10+0.08) 3.54(10.65+0.06) +12.0% 1.70(3.07+0.06) -46.2% 5310.6: partial bitmap 36.76(43.19+1.81) 6.71(11.25+0.76) -81.7% 4.08(6.26+0.46) -88.9% You can see that the time spent on an incremental fetch goes down, as our delta heuristics are able to do their work. And we save time on the partial bitmap clone for the same reason. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30	pack-objects: implement bitmap writing	Vicent Marti	1	-0/+8
	This commit extends more the functionality of `pack-objects` by allowing it to write out a `.bitmap` index next to any written packs, together with the `.idx` index that currently gets written. If bitmap writing is enabled for a given repository (either by calling `pack-objects` with the `--write-bitmap-index` flag or by having `pack.writebitmaps` set to `true` in the config) and pack-objects is writing a packfile that would normally be indexed (i.e. not piping to stdout), we will attempt to write the corresponding bitmap index for the packfile. Bitmap index writing happens after the packfile and its index has been successfully written to disk (`finish_tmp_packfile`). The process is performed in several steps: 1. `bitmap_writer_set_checksum`: this call stores the partial checksum for the packfile being written; the checksum will be written in the resulting bitmap index to verify its integrity 2. `bitmap_writer_build_type_index`: this call uses the array of `struct object_entry` that has just been sorted when writing out the actual packfile index to disk to generate 4 type-index bitmaps (one for each object type). These bitmaps have their nth bit set if the given object is of the bitmap's type. E.g. the nth bit of the Commits bitmap will be 1 if the nth object in the packfile index is a commit. This is a very cheap operation because the bitmap writing code has access to the metadata stored in the `struct object_entry` array, and hence the real type for each object in the packfile. 3. `bitmap_writer_reuse_bitmaps`: if there exists an existing bitmap index for one of the packfiles we're trying to repack, this call will efficiently rebuild the existing bitmaps so they can be reused on the new index. All the existing bitmaps will be stored in a `reuse` hash table, and the commit selection phase will prioritize these when selecting, as they can be written directly to the new index without having to perform a revision walk to fill the bitmap. This can greatly speed up the repack of a repository that already has bitmaps. 4. `bitmap_writer_select_commits`: if bitmap writing is enabled for a given `pack-objects` run, the sequence of commits generated during the Counting Objects phase will be stored in an array. We then use that array to build up the list of selected commits. Writing a bitmap in the index for each object in the repository would be cost-prohibitive, so we use a simple heuristic to pick the commits that will be indexed with bitmaps. The current heuristics are a simplified version of JGit's original implementation. We select a higher density of commits depending on their age: the 100 most recent commits are always selected, after that we pick 1 commit of each 100, and the gap increases as the commits grow older. On top of that, we make sure that every single branch that has not been merged (all the tips that would be required from a clone) gets their own bitmap, and when selecting commits between a gap, we tend to prioritize the commit with the most parents. Do note that there is no right/wrong way to perform commit selection; different selection algorithms will result in different commits being selected, but there's no such thing as "missing a commit". The bitmap walker algorithm implemented in `prepare_bitmap_walk` is able to adapt to missing bitmaps by performing manual walks that complete the bitmap: the ideal selection algorithm, however, would select the commits that are more likely to be used as roots for a walk in the future (e.g. the tips of each branch, and so on) to ensure a bitmap for them is always available. 5. `bitmap_writer_build`: this is the computationally expensive part of bitmap generation. Based on the list of commits that were selected in the previous step, we perform several incremental walks to generate the bitmap for each commit. The walks begin from the oldest commit, and are built up incrementally for each branch. E.g. consider this dag where A, B, C, D, E, F are the selected commits, and a, b, c, e are a chunk of simplified history that will not receive bitmaps. A---a---B--b--C--c--D \ E--e--F We start by building the bitmap for A, using A as the root for a revision walk and marking all the objects that are reachable until the walk is over. Once this bitmap is stored, we reuse the bitmap walker to perform the walk for B, assuming that once we reach A again, the walk will be terminated because A has already been SEEN on the previous walk. This process is repeated for C, and D, but when we try to generate the bitmaps for E, we can reuse neither the current walk nor the bitmap we have generated so far. What we do now is resetting both the walk and clearing the bitmap, and performing the walk from scratch using E as the origin. This new walk, however, does not need to be completed. Once we hit B, we can lookup the bitmap we have already stored for that commit and OR it with the existing bitmap we've composed so far, allowing us to limit the walk early. After all the bitmaps have been generated, another iteration through the list of commits is performed to find the best XOR offsets for compression before writing them to disk. Because of the incremental nature of these bitmaps, XORing one of them with its predecesor results in a minimal "bitmap delta" most of the time. We can write this delta to the on-disk bitmap index, and then re-compose the original bitmaps by XORing them again when loaded. This is a phase very similar to pack-object's `find_delta` (using bitmaps instead of objects, of course), except the heuristics have been greatly simplified: we only check the 10 bitmaps before any given one to find best compressing one. This gives good results in practice, because there is locality in the ordering of the objects (and therefore bitmaps) in the packfile. 6. `bitmap_writer_finish`: the last step in the process is serializing to disk all the bitmap data that has been generated in the two previous steps. The bitmap is written to a tmp file and then moved atomically to its final destination, using the same process as `pack-write.c:write_idx_file`. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30	pack-objects: use bitmaps when packing objects	Vicent Marti	1	-0/+6
	In this patch, we use the bitmap API to perform the `Counting Objects` phase in pack-objects, rather than a traditional walk through the object graph. For a reasonably-packed large repo, the time to fetch and clone is often dominated by the full-object revision walk during the Counting Objects phase. Using bitmaps can reduce the CPU time required on the server (and therefore start sending the actual pack data with less delay). For bitmaps to be used, the following must be true: 1. We must be packing to stdout (as a normal `pack-objects` from `upload-pack` would do). 2. There must be a .bitmap index containing at least one of the "have" objects that the client is asking for. 3. Bitmaps must be enabled (they are enabled by default, but can be disabled by setting `pack.usebitmaps` to false, or by using `--no-use-bitmap-index` on the command-line). If any of these is not true, we fall back to doing a normal walk of the object graph. Here are some sample timings from a full pack of `torvalds/linux` (i.e. something very similar to what would be generated for a clone of the repository) that show the speedup produced by various methods: [existing graph traversal] $ time git pack-objects --all --stdout --no-use-bitmap-index \ </dev/null >/dev/null Counting objects: 3237103, done. Compressing objects: 100% (508752/508752), done. Total 3237103 (delta 2699584), reused 3237103 (delta 2699584) real 0m44.111s user 0m42.396s sys 0m3.544s [bitmaps only, without partial pack reuse; note that pack reuse is automatic, so timing this required a patch to disable it] $ time git pack-objects --all --stdout </dev/null >/dev/null Counting objects: 3237103, done. Compressing objects: 100% (508752/508752), done. Total 3237103 (delta 2699584), reused 3237103 (delta 2699584) real 0m5.413s user 0m5.604s sys 0m1.804s [bitmaps with pack reuse (what you get with this patch)] $ time git pack-objects --all --stdout </dev/null >/dev/null Reusing existing pack: 3237103, done. Total 3237103 (delta 0), reused 0 (delta 0) real 0m1.636s user 0m1.460s sys 0m0.172s Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-10	receive-pack: allow pushes that update .git/shallow	Nguyễn Thái Ngọc Duy	1	-0/+4
	The basic 8 steps to update .git/shallow does not fully apply here because the user may choose to accept just a few refs (while fetch always accepts all refs). The steps are modified a bit. 1-6. same as before. After calling assign_shallow_commits_to_refs at step 6, each shallow commit has a bitmap that marks all refs that require it. 7. mark all "ours" shallow commits that are reachable from any refs. We will need to do the original step 7 on them later. 8. go over all shallow commit bitmaps, mark refs that require new shallow commits. 9. setup a strict temporary shallow file to plug all the holes, even if it may cut some of our history short. This file is used by all hooks. The hooks could use --shallow-file=$GIT_DIR/shallow to overcome this and reach everything in current repo. 10. go over the new refs one by one. For each ref, do the reachability test if it needs a shallow commit on the list from step 7. Remove it if it's reachable from our refs. Gather all required shallow commits, run check_everything_connected() with the new ref, then install them to .git/shallow. This mode is disabled by default and can be turned on with receive.shallowupdate Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-30	fetch --prune: prune only based on explicit refspecs	Michael Haggerty	1	-2/+2
	The old behavior of "fetch --prune" was to prune whatever was being fetched. In particular, "fetch --prune --tags" caused tags not only to be fetched, but also to be pruned. This is inappropriate because there is only one tags namespace that is shared among the local repository and all remotes. Therefore, if the user defines a local tag and then runs "git fetch --prune --tags", then the local tag is deleted. Moreover, "--prune" and "--tags" can also be configured via fetch.prune / remote.<name>.prune and remote.<name>.tagopt, making it even less obvious that an invocation of "git fetch" could result in tag lossage. Since the command "git remote update" invokes "git fetch", it had the same problem. The command "git remote prune", on the other hand, disregarded the setting of remote.<name>.tagopt, and so its behavior was inconsistent with that of the other commands. So the old behavior made it too easy to lose tags. To fix this problem, change "fetch --prune" to prune references based only on refspecs specified explicitly by the user, either on the command line or via remote.<name>.fetch. Thus, tags are no longer made subject to pruning by the --tags option or the remote.<name>.tagopt setting. However, tags are still subject to pruning if they are fetched as part of a refspec, and that is good. For example: * On the command line, git fetch --prune 'refs/tags/:refs/tags/' causes tags, and only tags, to be fetched and pruned, and is therefore a simple way for the user to get the equivalent of the old behavior of "--prune --tag". * For a remote that was configured with the "--mirror" option, the configuration is set to include [remote "name"] fetch = +refs/:refs/ , which causes tags to be subject to pruning along with all other references. This is the behavior that will typically be desired for a mirror. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-23	Merge branch 'po/dot-url'	Junio C Hamano	1	-2/+4
	Explain how '.' can be used to refer to the "current repository" in the documentation. * po/dot-url: doc/cli: make "dot repository" an independent bullet point config doc: update dot-repository notes doc: command line interface (cli) dot-repository dwimmery