Age | Commit message | Author | Files | Lines
2013-07-12 | pack-revindex: radix-sort the revindex | Jeff King | 1 | -5/+95
The pack revindex stores the offsets of the objects in the pack in sorted order, allowing us to easily find the on-disk size of each object. To compute it, we populate an array with the offsets from the sha1-sorted idx file, and then use qsort to order it by offsets. That does O(n log n) offset comparisons, and profiling shows that we spend most of our time in cmp_offset. However, since we are sorting on a simple off_t, we can use numeric sorts that perform better. A radix sort can run in O(k*n), where k is the number of "digits" in our number. For a 64-bit off_t, using 16-bit "digits" gives us k=4.

On the linux.git repo, with about 3M objects to sort, this yields about a 4x speedup. Here are the best-of-five numbers for running

    echo HEAD | git cat-file --batch-check="%(objectsize:disk)"

on a fully packed repository, which is dominated by time spent building the pack revindex:

              before     after
    real    0m0.834s  0m0.204s
    user    0m0.788s  0m0.164s
    sys     0m0.040s  0m0.036s

This matches our algorithmic expectations. log2(3M) is ~21.5, so a traditional sort is ~21.5n. Our radix sort runs in k*n, where k is the number of radix digits. In the worst case, this is k=4 for a 64-bit off_t, but we can quit early when the largest value to be sorted is smaller. For any repository under 4G, k=2. Our algorithm makes two passes over the list per radix digit, so we end up with 4n. That should yield ~5.3x speedup. We see 4x here; the difference is probably due to the extra bucket book-keeping the radix sort has to do.

On a smaller repo, the difference is less impressive, as log(n) is smaller. For git.git, with 173K objects (but still k=2), we see a 2.7x improvement:

              before     after
    real    0m0.046s  0m0.017s
    user    0m0.036s  0m0.012s
    sys     0m0.008s  0m0.000s

On even tinier repos (e.g., a few hundred objects), the speedup goes away entirely, as the small advantage of the radix sort gets erased by the book-keeping costs (and at those sizes, the cost to generate the rev-index gets lost in the noise anyway).
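The approach described in this message can be sketched in Python. This is only an illustration under stated assumptions, not the C code from the patch: the function name `radix_sort_offsets` and its parameters are hypothetical, and the bucket layout is one common way to write a least-significant-digit radix sort. It uses 16-bit digits, makes two passes over the data per digit, and stops early once the remaining high digits of the largest value are all zero.

```python
def radix_sort_offsets(offsets, digit_bits=16, key_bits=64):
    """Sort non-negative integers with an LSD radix sort (hypothetical sketch)."""
    if not offsets:
        return []
    mask = (1 << digit_bits) - 1
    largest = max(offsets)
    src = list(offsets)
    shift = 0
    # Quit early: once largest >> shift is zero, every remaining digit is zero.
    while shift < key_bits and (largest >> shift) > 0:
        # Pass 1: count how many keys fall into each bucket for this digit.
        counts = [0] * (mask + 1)
        for value in src:
            counts[(value >> shift) & mask] += 1
        # Convert counts into starting positions (exclusive prefix sum).
        position = 0
        for bucket, count in enumerate(counts):
            counts[bucket] = position
            position += count
        # Pass 2: move each key to its slot; scanning in order keeps it stable.
        dst = [0] * len(src)
        for value in src:
            bucket = (value >> shift) & mask
            dst[counts[bucket]] = value
            counts[bucket] += 1
        src = dst
        shift += digit_bits
    return src
```

For offsets below 4G the loop body runs twice (k=2), which corresponds to the 4n estimate given in the message.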
Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Brandon Casey <drafnel@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12 | pack-revindex: use unsigned to store number of objects | Jeff King | 1 | -4/+4
A packfile may have up to 2^32-1 objects in it, so the "right" data type to use is uint32_t. We currently use a signed int, which means that we may behave incorrectly for packfiles with more than 2^31-1 objects on 32-bit systems. Nobody has noticed because having 2^31 objects is pretty insane. The linux.git repo has on the order of 2^22 objects, which is hundreds of times smaller than necessary to trigger the bug. Let's bump this up to an "unsigned". On 32-bit systems, this gives us the correct data-type, and on 64-bit systems, it is probably more efficient to use the native "unsigned" than a true uint32_t. While we're at it, we can fix the binary search not to overflow in such a case if our unsigned is 32 bits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
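The binary-search overflow mentioned here can be illustrated with a short Python sketch. Python integers do not wrap, so the code masks the sum to 32 bits to imitate a 32-bit C unsigned; the function names are hypothetical, and `lo + (hi - lo) / 2` is shown as one common overflow-free midpoint, not necessarily the exact form used in the patch.

```python
U32_MASK = 0xFFFFFFFF  # imitate arithmetic in a 32-bit unsigned integer


def naive_mid(lo, hi):
    # (lo + hi) / 2, with the sum truncated to 32 bits as C would do.
    return ((lo + hi) & U32_MASK) // 2


def safe_mid(lo, hi):
    # hi - lo cannot wrap when lo <= hi, and the result never exceeds hi.
    return lo + (hi - lo) // 2


lo, hi = 0x80000000, 0xFFFFFFFF
print(hex(naive_mid(lo, hi)))  # wrapped sum gives a midpoint below lo
print(hex(safe_mid(lo, hi)))   # stays within [lo, hi]
```

With small indices both functions agree; they only diverge once `lo + hi` no longer fits in 32 bits.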
2013-07-12 | cat-file: split --batch input lines on whitespace | Jeff King | 3 | -3/+34
If we get an input line to --batch or --batch-check that looks like "HEAD foo bar", we will currently feed the whole thing to get_sha1(). This means that to use --batch-check with `rev-list --objects`, one must pre-process the input, like:

    git rev-list --objects HEAD |
    cut -d' ' -f1 |
    git cat-file --batch-check

Besides being more typing and slightly less efficient to invoke `cut`, the result loses information: we no longer know which path each object was found at.

This patch teaches cat-file to split input lines at the first whitespace. Everything to the left of the whitespace is considered an object name, and everything to the right is made available as the %(rest) atom. So you can now do:

    git rev-list --objects HEAD |
    git cat-file --batch-check='%(objectsize) %(rest)'

to collect object sizes at particular paths.

Even if %(rest) is not used, we always do the whitespace split (which means you can simply eliminate the `cut` command from the first example above).

This whitespace split is backwards compatible for any reasonable input. Object names cannot contain spaces, so any input with spaces would have resulted in a "missing" line. The only input hurt is if somebody really expected input of the form "HEAD is a fine-looking ref!" to fail; it will now parse HEAD, and make "is a fine-looking ref!" available as %(rest).

Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
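The split described in this message can be sketched in Python as follows. The function name `split_batch_line` is hypothetical, and treating a run of spaces or tabs as a single separator is an assumption of this sketch rather than something the message states.

```python
def split_batch_line(line):
    """Split an input line into (object name, rest) at the first whitespace."""
    line = line.rstrip("\n")
    for index, char in enumerate(line):
        if char in " \t":
            # Everything left of the whitespace is the object name; the
            # remainder (minus the separating whitespace) is the "rest".
            return line[:index], line[index:].lstrip(" \t")
    # No whitespace at all: the whole line is the object name.
    return line, ""


name, rest = split_batch_line("HEAD is a fine-looking ref!")
print(name)  # HEAD
print(rest)  # is a fine-looking ref!
```

A line produced by `rev-list --objects` would split the same way, with the path ending up in the second element.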
2013-07-12 | cat-file: add %(objectsize:disk) format atom | Jeff King | 2 | -0/+24
This atom is just like %(objectsize), except that it shows the on-disk size of the object rather than the object's true size. In other words, it makes the "disk_size" query of sha1_object_info_extended available via the command-line. This can be used for rough attribution of disk usage to particular refs, though see the caveats in the documentation. This patch does not include any tests, as the exact numbers returned are volatile and subject to zlib and packing decisions. We cannot even reliably guarantee that the on-disk size is smaller than the object content (though in general this should be the case for non-trivial objects). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12 | cat-file: add --batch-check=<format> | Jeff King | 3 | -26/+142
The `cat-file --batch-check` command can be used to quickly get information about a large number of objects. However, it provides a fixed set of information. This patch adds an optional <format> option to --batch-check to allow a caller to specify which items they are interested in, and in which order to output them. This is not very exciting for now, since we provide the same limited set that you could already get. However, it opens the door to adding new format items in the future without breaking backwards compatibility (or forcing callers to pay the cost to calculate uninteresting items). Since the --batch option shares code with --batch-check, it receives the same feature, though it is less likely to be of interest there. The format atom names are chosen to match their counterparts in for-each-ref. Though we do not (yet) share any code with for-each-ref's formatter, this keeps the interface as consistent as possible, and may help later on if the implementations are unified. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
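The idea of a caller-supplied format made of `%(atom)` placeholders can be sketched in Python. This is a simplified illustration, not git's formatter: `expand_format` and the `values` dictionary are hypothetical, and a real implementation would compute each value lazily so that unused atoms cost nothing.

```python
import re

# Matches placeholders such as %(objectsize) or %(objectsize:disk).
ATOM = re.compile(r"%\(([^)]*)\)")


def expand_format(fmt, values):
    """Replace each %(name) in fmt with values[name] (hypothetical helper)."""
    def replace(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"unknown format atom: {name}")
        return str(values[name])

    return ATOM.sub(replace, fmt)


print(expand_format("%(objectsize) %(rest)", {"objectsize": 42, "rest": "Makefile"}))
```

Because the caller picks both the atoms and their order, new atoms can be added later without changing the output seen by existing callers.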
2013-07-11 | cat-file: refactor --batch option parsing | Jeff King | 1 | -18/+38
We currently use an int to tell us whether --batch parsing is on, and if so, whether we should print the full object contents. Let's instead factor this into a struct, filled in by callback, which will make further batch-related options easy to add. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-11 | cat-file: teach --batch to stream blob objects | Jeff King | 1 | -13/+28
The regular "git cat-file -p" and "git cat-file blob" code paths already learned to stream large blobs. Let's do the same here. Note that this means we look up the type and size before making a decision of whether to load the object into memory or stream (just like the "-p" code path does). That can lead to extra work, but it should be dwarfed by the cost of actually accessing the object itself. In my measurements, there was a 1-2% slowdown when using "--batch" on a large number of objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-11 | t1006: modernize output comparisons | Jeff King | 1 | -43/+18
In modern tests, we typically put output into a file and compare it with test_cmp. This is nicer than just comparing via "test", and much shorter than comparing via "test" and printing a custom message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-07 | teach sha1_object_info_extended a "disk_size" query | Jeff King | 2 | -4/+17
Using sha1_object_info_extended, a caller can find out the type of an object, its size, and information about where it is stored. In addition to the object's "true" size, it can also be useful to know the size that the object takes on disk (e.g., to generate statistics about which refs consume space). This patch adds a "disk_sizep" field to "struct object_info", and fills it in during sha1_object_info_extended if it is non-NULL. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-07 | zero-initialize object_info structs | Jeff King | 2 | -2/+2
The sha1_object_info_extended function expects the caller to provide a "struct object_info" which contains pointers to "query" items that will be filled in. The purpose of providing pointers rather than storing the response directly in the struct is so that callers can choose not to incur the expense in finding particular fields that they do not care about. Right now the only query item is "sizep", and all callers set it explicitly to choose whether or not to query it; they can then leave the rest of the struct uninitialized. However, as we add new query items, each caller will have to be updated to explicitly turn off the new ones (by setting them to NULL). Instead, let's teach each caller to zero-initialize the struct, so that they do not have to learn about each new query item added. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
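The "opt-in query" pattern described here can be mimicked in Python. This is a loose analogy under stated assumptions, not git's C API: the class `ObjectInfoQuery`, the function `object_info_extended`, and the dictionary-based object are hypothetical, and a one-element list stands in for a C output pointer. Fields default to `None`, which plays the role of a zero-initialized (NULL) pointer.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectInfoQuery:
    # Each field is an output slot (a one-element list) or None, meaning
    # "do not compute this". Defaults mimic a zero-initialized struct.
    sizep: Optional[list] = None
    disk_sizep: Optional[list] = None


def object_info_extended(obj, query):
    """Fill in only the requested slots; return the names that were computed."""
    computed = []
    if query.sizep is not None:
        query.sizep[0] = len(obj["content"])
        computed.append("size")
    if query.disk_sizep is not None:
        query.disk_sizep[0] = obj["disk_size"]
        computed.append("disk_size")
    return computed


# A caller written before disk_sizep existed keeps working unchanged.
query = ObjectInfoQuery(sizep=[None])
print(object_info_extended({"content": b"hello", "disk_size": 14}, query))
print(query.sizep[0])
```

The point of the defaults is that adding a new query field does not require touching callers that never ask for it.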
2013-05-24 | Git 1.8.3 | Junio C Hamano | 1 | -11/+11
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-21 | remote-hg: fix order of configuration comments | Felipe Contreras | 1 | -3/+3
The other configurations were added in the wrong place. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-21 | remote-hg: trivial configuration note cleanup | Felipe Contreras | 1 | -1/+1
Follow the style of the previous configurations. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-21 | completion: regression fix for zsh | Felipe Contreras | 1 | -1/+1
The zsh completion wrapper doesn't reimplement __gitcompadd(). Although it should be trivial to do that, let's use __gitcomp_nl(), which achieves exactly the same thing, especially since the suffix ($4) has to be empty. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-20 | Merge git://git.bogomips.org/git-svn | Junio C Hamano | 3 | -6/+97
* git://git.bogomips.org/git-svn:
  git-svn: introduce --parents parameter for commands branch and tag
  git-svn: clarify explanation of --destination argument
  git-svn: multiple fetch/branches/tags keys are supported
2013-05-20 | git-svn: introduce --parents parameter for commands branch and tag | Tobias Schulte | 3 | -1/+71
This parameter is equivalent to the parameter --parents on svn cp commands and is useful for non-standard repository layouts. Signed-off-by: Tobias Schulte <tobias.schulte@gliderpilot.de> Signed-off-by: Eric Wong <normalperson@yhbt.net>
2013-05-20 | git-svn: clarify explanation of --destination argument | Jonathan Nieder | 1 | -5/+14
The existing documentation for "-d" does not make it obvious whether its argument is supposed to be a full svn path, a partial svn path, the glob from the config file, or what. Clarify the text and add an example to get the reader started. Reported-by: Nathan Gray <n8gray@n8gray.org> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
2013-05-20 | git-svn: multiple fetch/branches/tags keys are supported | Nathan Gray | 1 | -0/+12
"git svn" can be configured to use multiple fetch, branches, and tags refspecs by passing multiple --branches or --tags options at init time or editing the configuration file later, which can be handy when working with messy Subversion repositories. Add a note to the configuration section documenting how this works. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Eric Wong <normalperson@yhbt.net>
2013-05-20 | remote-hg: set stdout to binary mode on win32 | Amit Bakshi | 1 | -0/+4
git clone hangs on windows, and file.write would return errno 22 inside of mercurial's windows.winstdout wrapper class. This patch sets stdout's mode to binary, fixing both issues. [fc: cleaned up] Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 | Git 1.8.3-rc3 | Junio C Hamano | 1 | -1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 | Merge branch 'fc/doc-style' | Junio C Hamano | 20 | -47/+37
* fc/doc-style: documentation: trivial style cleanups
2013-05-17 | Merge branch 'dw/asciidoc-sources-are-dot-txt-files' | Junio C Hamano | 1 | -2/+4
* dw/asciidoc-sources-are-dot-txt-files: CodingGuidelines: Documentation/*.txt are the sources
2013-05-17 | documentation: trivial style cleanups | Felipe Contreras | 20 | -47/+37
White-spaces, missing braces, standardize --[no-]foo. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 | Merge git://ozlabs.org/~paulus/gitk | Junio C Hamano | 1 | -295/+338
* git://ozlabs.org/~paulus/gitk: gitk: Update Swedish translation (304t)
2013-05-17 | difftool: fix dir-diff when file does not exist in working tree | John Keeping | 2 | -0/+13
Commit 02c5631 (difftool --dir-diff: symlink all files matching the working tree, 2013-03-14) does not handle the case where a file that is being compared does not exist in the working tree. Fix this by checking for existence explicitly before running git-hash-object. Reported-by: Kevin Bracey <kevin@bracey.fi> Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 | remote-bzr: fixes for older versions of bzr | Felipe Contreras | 1 | -2/+4
Down to v2.0, by using older but still valid interfaces. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 | remote-bzr: fix old organization destroy | Sandor Bodo-Merle | 1 | -0/+2
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-17 | gitk: Update Swedish translation (304t) | Peter Krefting | 1 | -295/+338
Signed-off-by: Peter Krefting <peter@softwolves.pp.se> Signed-off-by: Paul Mackerras <paulus@samba.org>
2013-05-16 | Revert "remote-hg: update bookmarks when pulling" | Felipe Contreras | 1 | -3/+0
This reverts commit 24317ef32ac3111ed00792f9b2921dc19dd28fe2. Different versions of Mercurial have different arguments for bookmarks.updatefromremote(). While it should be possible to call the right function with the right arguments depending on the version, it's safer to restore the old behavior for now. Reported by Rodney Lorrimar. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 | git-submodule.txt: Clarify 'init' and 'add' subcommands. | Dale R. Worley | 1 | -2/+6
Describe how 'add' sets the submodule's logical name, which is used in the configuration entry names. Clarify that 'init' only sets up the configuration entries for submodules that have already been added elsewhere. Describe that <path> arguments limit the submodules that are configured. Signed-off-by: Dale Worley <worley@ariadne.com> Acked-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-16 | remote-bzr: fix cloning of non-listable repos | Felipe Contreras | 1 | -0/+3
Commit 95b0c60 (remote-bzr: add support for bzr repos) introduced a regression by assuming all bzr remote repos are listable, but they are not. If they are not listable they are basically useless, so let's assume there is no bzr repo. Reported-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | Merge branch 'fc/remote-hg' (early part) | Junio C Hamano | 2 | -24/+22
* 'fc/remote-hg' (early part):
  remote-hg: update bookmarks when pulling
  remote-hg: don't push fake 'master' bookmark
  remote-hg: disable forced push by default
  remote-hg: fix new branch creation
  remote-hg: add new get_config_bool() helper
  remote-hg: enable track-branches in hg-git mode
  remote-hg: get rid of unused exception checks
  remote-hg: trivial cleanups
2013-05-15 | remote-hg: update bookmarks when pulling | Felipe Contreras | 1 | -0/+3
Otherwise, the user would never ever see new bookmarks, only the ones that (s)he initially cloned. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: don't push fake 'master' bookmark | Felipe Contreras | 1 | -1/+2
We skip it locally, but not for the remote, so let's do so. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: disable forced push by default | Felipe Contreras | 1 | -1/+1
In certain situations we might end up pushing garbage revisions (e.g. in a rebase), and the patches to deal with that haven't been merged yet. So let's disable forced pushes by default. We are essentially reverting back to the old v1.8.2 behavior, to minimize the possibility of regressions, but in a way the user can configure. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: fix new branch creation | Felipe Contreras | 1 | -1/+1
When a user creates a new branch with git:

    % git checkout -b branches/devel

and then pushes this branch

    % git push origin branches/devel

which is the way to push new mercurial branches, we do want to create a branch, but the command would fail without newbranch=True.

This only matters when force_push=False, but setting newbranch=True unconditionally does not hurt.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: add new get_config_bool() helper | Felipe Contreras | 1 | -11/+13
No functional changes. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: enable track-branches in hg-git mode | Felipe Contreras | 2 | -1/+1
The user can turn this off. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: get rid of unused exception checks | Felipe Contreras | 1 | -15/+9
Remove try/except check because we are no longer calling check_output(), which may throw an exception. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-15 | remote-hg: trivial cleanups | Felipe Contreras | 2 | -3/+1
Drop unused "global", and remove redundant comparison of two files. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-14 | remote-bzr: update old organization | Felipe Contreras | 1 | -0/+7
If a clone exists with the old organization (v1.8.2) it will prevent the new shared bzr repository organization from working, so let's remove this repository, which is not used any more. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-13 | Git 1.8.3-rc2 | Junio C Hamano | 2 | -4/+9
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-13 | Merge git://ozlabs.org/~paulus/gitk | Junio C Hamano | 1 | -55/+199
* git://ozlabs.org/~paulus/gitk:
  gitk: On OSX, bring the gitk window to front
  gitk: Add support for -G'regex' pickaxe variant
  gitk: Add menu item for reverting commits
  gitk: Simplify file filtering
  gitk: Display the date of a tag in a human-friendly way
  gitk: Improve behaviour of drop-down lists
  gitk: Move hard-coded colors to .gitk
2013-05-13 | gitk: On OSX, bring the gitk window to front | Tair Sabirgaliev | 1 | -0/+9
On OSX, Tcl/Tk application windows are created behind all the applications down the stack of windows. This is very annoying, because once a gitk window appears, it's the bottommost window and switching to it is a pain. The fix: if we are on OSX, use osascript to bring the current Wish process window to front. Signed-off-by: Tair Sabirgaliev <tair.sabirgaliev@gmail.com> Thanks-to: Stefan Haller <lists@haller-berlin.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2013-05-13 | gitk: Add support for -G'regex' pickaxe variant | Martin Langhoff | 1 | -1/+4
git log -G'regex' is a very useful alternative to the classic pickaxe. Minimal patch to make it usable from gitk. [zj: reword message] [paulus@samba.org: reword droplist item] Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Signed-off-by: Paul Mackerras <paulus@samba.org>
2013-05-11 | test-bzr: do not use unportable sed '\+' | Torsten Bögershausen | 1 | -1/+1
Using sed -e '/[0-9]\+//' to find "one or more digits" is not portable. Use the Basic Regular Expression '/[0-9][0-9]*//' instead. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
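The equivalence this change relies on (one digit followed by zero or more digits matches the same text as "one or more digits") can be checked outside sed. The sketch below uses Python's `re` syntax, not sed's Basic Regular Expressions, and the function names are hypothetical; it replaces only the first match, which is an assumption about how the sed command is used in the test.

```python
import re


def strip_first_number_portable(text):
    # Spelled the way the portable pattern does it: one digit, then any more.
    return re.sub(r"[0-9][0-9]*", "", text, count=1)


def strip_first_number_plus(text):
    # Same match using the "+" quantifier.
    return re.sub(r"[0-9]+", "", text, count=1)


for sample in ["rev 1234 done", "no digits", "7", "a1b22c333"]:
    assert strip_first_number_portable(sample) == strip_first_number_plus(sample)
```

Both patterns are greedy, so they consume the same run of digits at the same position.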
2013-05-11 | Merge git://git.bogomips.org/git-svn | Junio C Hamano | 9 | -7/+196
* git://git.bogomips.org/git-svn:
  git-svn: added an --include-path flag
  Git::SVN::*: add missing "NAME" section to perldoc
  git-svn: avoid self-referencing mergeinfo
2013-05-11 | gitk: Add menu item for reverting commits | Knut Franke | 1 | -0/+62
Sometimes it's helpful (at least psychologically) to have this feature easily accessible. Code borrows heavily from cherrypick. Signed-off-by: Knut Franke <Knut.Franke@gmx.de> Signed-off-by: Paul Mackerras <paulus@samba.org>
2013-05-11 | gitk: Simplify file filtering | Felipe Contreras | 1 | -13/+7
git diff is perfectly able to do this with '-- files', no need for manual filtering. This makes gettreediffs consistent with getblobdiffs. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Paul Mackerras <paulus@samba.org>
2013-05-11 | gitk: Display the date of a tag in a human-friendly way | Anand Kumria | 1 | -1/+1
By selecting a tag within gitk you can display information about it. This information is output by using the command

    'git cat-file tag <tagid>'

This outputs the *raw* information from the tag, amongst which is the time, in seconds since the epoch. As useful as that value is, I find it a lot easier to read and process time when it is something like:

    "Mon Dec 31 14:26:11 2012 -0800"

This change will modify the display of tags in gitk like so:

    @@ -1,7 +1,7 @@
     object 5d417842efeafb6e109db7574196901c4e95d273
     type commit
     tag v1.8.1
    -tagger Junio C Hamano <gitster@pobox.com> 1356992771 -0800
    +tagger Junio C Hamano <gitster@pobox.com> Mon Dec 31 14:26:11 2012 -0800

     Git 1.8.1
     -----BEGIN PGP SIGNATURE-----

Signed-off-by: Anand Kumria <wildfire@progsoc.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
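The conversion shown in this message can be reproduced with a short Python sketch. gitk itself is written in Tcl, so this is only an illustration of the date arithmetic, not the code from the patch; the function name `humanize_tagger_date` is hypothetical, and the weekday and month names assume the default C locale.

```python
from datetime import datetime, timedelta, timezone


def humanize_tagger_date(raw):
    """Turn '<epoch seconds> <+hhmm|-hhmm>' into a readable date string."""
    seconds, offset = raw.split()
    sign = -1 if offset[0] == "-" else 1
    delta = timedelta(hours=int(offset[1:3]), minutes=int(offset[3:5]))
    # Interpret the timestamp in the tagger's own timezone offset.
    when = datetime.fromtimestamp(int(seconds), timezone(sign * delta))
    # Day of month is written without zero padding.
    return f"{when:%a %b} {when.day} {when:%H:%M:%S %Y} {offset}"


print(humanize_tagger_date("1356992771 -0800"))  # Mon Dec 31 14:26:11 2012 -0800
```

The raw value and the expected readable form are the ones quoted in the commit message above.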