summaryrefslogtreecommitdiff
path: root/builtin
AgeCommit message (Collapse)AuthorFilesLines
2011-08-08Merge branch 'ms/reflog-show-is-default'Libravatar Junio C Hamano1-2/+1
* ms/reflog-show-is-default: reflog: actually default to subcommand 'show'
2011-08-08Merge branch 'js/ls-tree-error'Libravatar Junio C Hamano1-3/+1
* js/ls-tree-error: Ensure git ls-tree exits with a non-zero exit code if read_tree_recursive fails. Add a test to check that git ls-tree sets non-zero exit code on error.
2011-08-08Merge branch 'jk/reset-reflog-message-fix'Libravatar Junio C Hamano1-33/+16
* jk/reset-reflog-message-fix: reset: give better reflog messages
2011-08-05Merge branch 'jc/pack-order-tweak'Libravatar Junio C Hamano1-1/+137
* jc/pack-order-tweak: pack-objects: optimize "recency order" core: log offset pack data accesses happened
2011-08-01Merge branch 'jk/clone-detached'Libravatar Junio C Hamano1-3/+7
* jk/clone-detached: clone: always fetch remote HEAD make copy_ref globally available consider only branches in guess_remote_head t: add tests for cloning remotes with detached HEAD
2011-08-01Merge branch 'sr/transport-helper-fix'Libravatar Junio C Hamano1-0/+9
* sr/transport-helper-fix: (21 commits) transport-helper: die early on encountering deleted refs transport-helper: implement marks location as capability transport-helper: Use capname for refspec capability too transport-helper: change import semantics transport-helper: update ref status after push with export transport-helper: use the new done feature where possible transport-helper: check status code of finish_command transport-helper: factor out push_update_refs_status fast-export: support done feature fast-import: introduce 'done' command git-remote-testgit: fix error handling git-remote-testgit: only push for non-local repositories remote-curl: accept empty line as terminator remote-helpers: export GIT_DIR variable to helpers git_remote_helpers: push all refs during a non-local export transport-helper: don't feed bogus refs to export push git-remote-testgit: import non-HEAD refs t5800: document some non-functional parts of remote helpers t5800: use skip_all instead of prereq t5800: factor out some ref tests ...
2011-08-01Merge branch 'jc/maint-reset-unmerged-path'Libravatar Junio C Hamano1-1/+1
* jc/maint-reset-unmerged-path: reset [<commit>] paths...: do not mishandle unmerged paths
2011-08-01reflog: actually default to subcommand 'show'Libravatar Michael Schubert1-2/+1
The reflog manpage says: git reflog [show] [log-options] [<ref>] the subcommand 'show' is the default "in the absence of any subcommands". Currently this is only true if the user provided either at least one option or no additional argument at all. For example: git reflog master won't work. Change this by actually calling cmd_log_reflog in absence of any subcommand. Signed-off-by: Michael Schubert <mschub@elegosoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-25Ensure git ls-tree exits with a non-zero exit code if read_tree_recursive fails.Libravatar Jon Seymour1-3/+1
In the case of a corrupt repository, git ls-tree may report an error but presently it exits with a code of 0. This change uses the return code of read_tree_recursive instead. Improved-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Jon Seymour <jon.seymour@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-22Merge branch 'jk/tag-contains-ab'Libravatar Junio C Hamano1-1/+45
* jk/tag-contains-ab: Revert clock-skew based attempt to optimize tag --contains traversal git skew: a tool to find how big a clock skew exists in the history default core.clockskew variable to one day limit "contains" traversals based on commit timestamp tag: speed up --contains calculation
2011-07-22Merge branch 'jc/checkout-reflog-fix'Libravatar Junio C Hamano1-2/+5
* jc/checkout-reflog-fix: checkout: do not write bogus reflog entry out
2011-07-22reset: give better reflog messagesLibravatar Jeff King1-33/+16
The reset command creates its reflog entry from argv. However, it does so after having run parse_options, which means the only thing left in argv is any non-option arguments. Thus you would end up with confusing reflog entries like: $ git reset --hard HEAD^ $ git reset --soft HEAD@{1} $ git log -2 -g --oneline 8e46cad HEAD@{0}: HEAD@{1}: updating HEAD 1eb9486 HEAD@{1}: HEAD^: updating HEAD However, we must also consider that some scripts may set GIT_REFLOG_ACTION before calling reset, and we need to show their reflog action (with our text appended). For example: rebase -i (squash): updating HEAD On top of that, we also set the ORIG_HEAD reflog action (even though it doesn't generally exist). In that case, the reset argument is somewhat meaningless, as it has nothing to do with what's in ORIG_HEAD. This patch changes the reset reflog code to show: $GIT_REFLOG_ACTION: updating {HEAD,ORIG_HEAD} as before, but only if GIT_REFLOG_ACTION is set. Otherwise, show: reset: moving to $rev for HEAD, and: reset: updating ORIG_HEAD for ORIG_HEAD (this is still somewhat superfluous, since we are in the ORIG_HEAD reflog, obviously, but at least we now mention which command was used to update it). While we're at it, we can clean up the code a bit: - Use strbufs to make the message. - Use the "rev" parameter instead of showing all options. This makes more sense, since it is the only thing impacting the writing of the ref. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-19fast-export: support done featureLibravatar Sverre Rabbelier1-0/+9
If fast-export is being used to generate a fast-import stream that will be used afterwards it is desirable to indicate the end of the stream with the new 'done' command. Add a flag that causes fast-export to end with 'done'. Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-19Merge branch 'jc/index-pack'Libravatar Junio C Hamano3-158/+244
* jc/index-pack: verify-pack: use index-pack --verify index-pack: show histogram when emulating "verify-pack -v" index-pack: start learning to emulate "verify-pack -v" index-pack: a miniscule refactor index-pack --verify: read anomalous offsets from v2 idx file write_idx_file: need_large_offset() helper function index-pack: --verify write_idx_file: introduce a struct to hold idx customization options index-pack: group the delta-base array entries also by type Conflicts: builtin/verify-pack.c cache.h sha1_file.c
2011-07-19Merge branch 'jk/archive-tar-filter'Libravatar Junio C Hamano2-36/+17
* jk/archive-tar-filter: upload-archive: allow user to turn off filters archive: provide builtin .tar.gz filter archive: implement configurable tar filters archive: refactor file extension format-guessing archive: move file extension format-guessing lower archive: pass archiver struct to write_archive callback archive: refactor list of archive formats archive-tar: don't reload default config options archive: reorder option parsing and config reading
2011-07-19Merge branch 'jk/clone-cmdline-config'Libravatar Junio C Hamano2-13/+22
* jk/clone-cmdline-config: clone: accept config options on the command line config: make git_config_parse_parameter a public function remote: use new OPT_STRING_LIST parse-options: add OPT_STRING_LIST helper
2011-07-19Merge branch 'jk/tag-list-multiple-patterns'Libravatar Junio C Hamano1-9/+17
* jk/tag-list-multiple-patterns: tag: accept multiple patterns for --list
2011-07-19Merge branch 'jc/zlib-wrap'Libravatar Junio C Hamano4-17/+17
* jc/zlib-wrap: zlib: allow feeding more than 4GB in one go zlib: zlib can only process 4GB at a time zlib: wrap deflateBound() too zlib: wrap deflate side of the API zlib: wrap inflateInit2 used to accept only for gzip format zlib: wrap remaining calls to direct inflate/inflateEnd zlib wrapper: refactor error message formatter Conflicts: sha1_file.c
2011-07-14Revert clock-skew based attempt to optimize tag --contains traversalLibravatar Junio C Hamano2-83/+3
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-13reset [<commit>] paths...: do not mishandle unmerged pathsLibravatar Junio C Hamano1-1/+1
Because "diff --cached HEAD" showed an incorrect blob object name on the LHS of the diff, we ended up updating the index entry with bogus value, not what we read from the tree. Noticed by John Nowak. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-08pack-objects: optimize "recency order"Libravatar Junio C Hamano1-1/+137
This optimizes the "recency order" (see pack-heuristics.txt in Documentation/technical/ directory) used to order objects within a packfile in three ways: - Commits at the tip of tags are written together, in the hope that revision traversal done in incremental fetch (which starts by putting them in a revision queue marked as UNINTERESTING) will see a better locality of these objects; - In the original recency order, trees and blobs are intermixed. Write trees together before blobs, in the hope that this will improve locality when running pathspec-limited revision traversal, i.e. "git log paths..."; - When writing blob objects out, write the whole family of blobs that use the same delta base object together, by starting from the root of the delta chain, and writing its immediate children in a width-first manner, in the hope that this will again improve locality when reading blobs that belong to the same path, which are likely to be deltified against each other. I tried various workloads in the Linux kernel repositories (HEAD at v3.0-rc6-71-g4dd1b49) packed with v1.7.6 and with this patch, counting how large seeks are needed between adjacent accesses to objects in the pack, and the result looks promising. The history has 2072052 objects, weighing some 490MiB. * Simple commit-only log. $ git log >/dev/null There are 254656 commits in total. v1.7.6 with patch Total number of access : 258,031 258,032 0.0% percentile : 12 12 10.0% percentile : 259 259 20.0% percentile : 294 294 30.0% percentile : 326 326 40.0% percentile : 363 363 50.0% percentile : 415 415 60.0% percentile : 513 513 70.0% percentile : 857 858 80.0% percentile : 10,434 10,441 90.0% percentile : 91,985 91,996 95.0% percentile : 260,852 260,885 99.0% percentile : 1,150,680 1,152,811 99.9% percentile : 3,148,435 3,148,435 Less than 2MiB seek: 99.70% 99.69% 95% of the pack accesses look at data that is no further than 260kB from the previous location we accessed. The patch does not change the order of commit objects very much, and the result is very similar. * Pathspec-limited log. $ git log drivers/net >/dev/null The path is touched by 26551 commits and merges (among 254656 total). v1.7.6 with patch Total number of access : 559,511 558,663 0.0% percentile : 0 0 10.0% percentile : 182 167 20.0% percentile : 259 233 30.0% percentile : 357 304 40.0% percentile : 714 485 50.0% percentile : 5,046 3,976 60.0% percentile : 688,671 443,578 70.0% percentile : 319,574,732 110,370,100 80.0% percentile : 361,647,599 123,707,229 90.0% percentile : 393,195,669 128,947,636 95.0% percentile : 405,496,875 131,609,321 99.0% percentile : 412,942,470 133,078,115 99.5% percentile : 413,172,266 133,163,349 99.9% percentile : 413,354,356 133,240,445 Less than 2MiB seek: 61.71% 62.87% With the current pack heuristics, more than 30% of accesses have to seek further than 300MB; the updated pack heuristics ensures that less than 0.1% of accesses have to seek further than 135MB. This is largely due to the fact that the updated heuristics does not mix blobs and trees together. * Blame. $ git blame drivers/net/ne.c >/dev/null The path is touched by 34 commits and merges. v1.7.6 with patch Total number of access : 178,147 178,166 0.0% percentile : 0 0 10.0% percentile : 142 139 20.0% percentile : 222 194 30.0% percentile : 373 300 40.0% percentile : 1,168 837 50.0% percentile : 11,248 7,334 60.0% percentile : 305,121,284 106,850,130 70.0% percentile : 361,427,854 123,709,715 80.0% percentile : 388,127,343 128,171,047 90.0% percentile : 399,987,762 130,200,707 95.0% percentile : 408,230,673 132,174,308 99.0% percentile : 412,947,017 133,181,160 99.5% percentile : 413,312,798 133,220,425 99.9% percentile : 413,352,366 133,269,051 Less than 2MiB seek: 56.47% 56.83% The result is very similar to the pathspec-limited log above, which only looks at the tree objects. * Packing recent history. $ (git for-each-ref --format='^%(refname)' refs/tags; echo HEAD) | git pack-objects --revs --stdout >/dev/null This should pack data worth 71 commits. v1.7.6 with patch Total number of access : 11,511 11,514 0.0% percentile : 0 0 10.0% percentile : 48 47 20.0% percentile : 134 98 30.0% percentile : 332 178 40.0% percentile : 1,386 293 50.0% percentile : 8,030 478 60.0% percentile : 33,676 1,195 70.0% percentile : 147,268 26,216 80.0% percentile : 9,178,662 464,598 90.0% percentile : 67,922,665 965,782 95.0% percentile : 87,773,251 1,226,102 99.0% percentile : 98,011,763 1,932,377 99.5% percentile : 100,074,427 33,642,128 99.9% percentile : 105,336,398 275,772,650 Less than 2MiB seek: 77.09% 99.04% The long-tail part of the result looks worse with the patch, but the change helps majority of the access. 99.04% of the accesses need less than 2MiB of seeking, compared to 77.09% with the current packing heuristics. * Index pack. $ git index-pack -v .git/objects/pack/pack*.pack v1.7.6 with patch Total number of access : 2,791,228 2,788,802 0.0% percentile : 9 9 10.0% percentile : 140 89 20.0% percentile : 233 167 30.0% percentile : 322 235 40.0% percentile : 464 310 50.0% percentile : 862 423 60.0% percentile : 2,566 686 70.0% percentile : 25,827 1,498 80.0% percentile : 1,317,862 4,971 90.0% percentile : 11,926,385 119,398 95.0% percentile : 41,304,149 952,519 99.0% percentile : 227,613,070 6,709,650 99.5% percentile : 321,265,121 11,734,871 99.9% percentile : 382,919,785 33,155,191 Less than 2MiB seek: 81.73% 96.92% As the index-pack command already walks objects in the delta chain order, writing the blobs out in the delta chain order seems to drastically improve the locality of access. Note that a half-a-gigabyte packfile comfortably fits in the buffer cache, and you would unlikely to see much performance difference on a modern and reasonably beefy machine with enough memory and local disks. Benchmarking with cold cache (or over NFS) would be interesting. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-06Merge commit 'v1.7.6' into jc/checkout-reflog-fixLibravatar Junio C Hamano92-0/+45831
* commit 'v1.7.6': (3211 commits) Git 1.7.6 completion: replace core.abbrevguard to core.abbrev Git 1.7.6-rc3 Documentation: git diff --check respects core.whitespace gitweb: 'pickaxe' and 'grep' features requires 'search' to be enabled t7810: avoid unportable use of "echo" plug a few coverity-spotted leaks builtin/gc.c: add missing newline in message tests: link shell libraries into valgrind directory t/Makefile: pass test opts to valgrind target properly sh-i18n--envsubst.c: do not #include getopt.h Fix typo: existant->existent Git 1.7.6-rc2 gitweb: do not misparse nonnumeric content tag files that contain a digit Git 1.7.6-rc1 fetch: do not leak a refspec t3703: skip more tests using colons in file names on Windows gitweb: Fix usability of $prevent_xss gitweb: Move "Requirements" up in gitweb/INSTALL gitweb: Describe CSSMIN and JSMIN in gitweb/INSTALL ...
2011-06-30git skew: a tool to find how big a clock skew exists in the historyLibravatar Jeff King1-0/+50
> As you probably guessed from the specificity of the number, I wrote a > short program to actually traverse and find the worst skew. It takes > about 5 seconds to run (unsurprisingly, since it is doing the same full > traversal that we end up doing in the above numbers). So we could > "autoskew" by setting up the configuration on clone, and then > periodically updating it as part of "git gc". This patch doesn't implement auto-detection of skew, but is the program I used to calculate, and would provide the basis for such auto-detection. It would be interesting to see average skew numbers for popular repositories. You can run it as "git skew --all". Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-29Merge branch 'jc/streaming' into nextLibravatar Junio C Hamano1-2/+2
* jc/streaming: sha1_file: use the correct type (ssize_t, not size_t) for read-style function streaming: read loose objects incrementally sha1_file.c: expose helpers to read loose objects streaming: read non-delta incrementally from a pack streaming_write_entry(): support files with holes convert: CRLF_INPUT is a no-op in the output codepath streaming_write_entry(): use streaming API in write_entry() streaming: a new API to read from the object store write_entry(): separate two helper functions out unpack_object_header(): make it public sha1_object_info_extended(): hint about objects in delta-base cache sha1_object_info_extended(): expose a bit more info packed_object_info_detail(): do not return a string
2011-06-29Merge branch 'rs/grep-color'Libravatar Junio C Hamano1-8/+21
* rs/grep-color: grep: add --heading grep: add --break grep: fix coloring of hunk marks between files
2011-06-29Merge branch 'jc/maint-1.7.3-checkout-describe'Libravatar Junio C Hamano1-1/+1
* jc/maint-1.7.3-checkout-describe: checkout -b <name>: correctly detect existing branch
2011-06-29Merge branch 'jc/advice-about-to-lose-commit'Libravatar Junio C Hamano1-10/+11
* jc/advice-about-to-lose-commit: checkout: make advice when reattaching the HEAD less loud Conflicts: builtin/checkout.c
2011-06-29Merge branch 'maint-1.7.5' into maintLibravatar Junio C Hamano1-1/+1
* maint-1.7.5: test: skip clean-up when running under --immediate mode "branch -d" can remove more than one branches
2011-06-29"branch -d" can remove more than one branchesLibravatar Junio C Hamano1-1/+1
Since 03feddd (git-check-ref-format: reject funny ref names, 2005-10-13), "git branch -d" can take more than one branch names to remove. The documentation was correct, but the usage string was not. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22clone: accept config options on the command lineLibravatar Jeff King1-1/+20
Clone does all of init, "remote add", fetch, and checkout without giving the user a chance to intervene and set any configuration. This patch allows you to set config options in the newly created repository after the clone, but before we do any other operations. In many cases, this is a minor convenience over something like: git clone git://... git config core.whatever true But in some cases, it can bring extra efficiency by changing how the fetch or checkout work. For example, setting line-ending config before the checkout avoids having to re-checkout all of the contents with the correct line endings. It also provides a mechanism for passing information to remote helpers during a clone; the helpers may read the git config to influence how they operate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22remote: use new OPT_STRING_LISTLibravatar Jeff King1-12/+2
This saves us having our own callback function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22upload-archive: allow user to turn off filtersLibravatar Jeff King2-2/+2
Some tar filters may be very expensive to run, so sites do not want to expose them via upload-archive. This patch lets users configure tar.<filter>.remote to turn them off. By default, gzip filters are left on, as they are about as expensive as creating zip archives. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22archive: move file extension format-guessing lowerLibravatar Jeff King2-36/+17
The process for guessing an archive output format based on the filename is something like this: a. parse --output in cmd_archive; check the filename against a static set of mapping heuristics (right now it just matches ".zip" for zip files). b. if found, stick a fake "--format=zip" at the beginning of the arguments list (if the user did specify a --format manually, the later option will override our fake one) c. if it's a remote call, ship the arguments to the remote (including the fake), which will call write_archive on their end d. if it's local, ship the arguments to write_archive locally There are two problems: 1. The set of mappings is static and at too high a level. The write_archive level is going to check config for user-defined formats, some of which will specify extensions. We need to delay lookup until those are parsed, so we can match against them. 2. For a remote archive call, our set of mappings (or formats) may not match the remote side's. This is OK in practice right now, because all versions of git understand "zip" and "tar". But as new formats are added, there is going to be a mismatch between what the client can do and what the remote server can do. To fix (1), this patch refactors the location guessing to happen at the write_archive level, instead of the cmd_archive level. So instead of sticking a fake --format field in the argv list, we actually pass a "name hint" down the callchain; this hint is used at the appropriate time to guess the format (if one hasn't been given already). This patch leaves (2) unfixed. The name_hint is converted to a "--format" option as before, and passed to the remote. This means the local side's idea of how extensions map to formats will take precedence. Another option would be to pass the name hint to the remote side and let the remote choose. This isn't a good idea for two reasons: 1. There's no room in the protocol for passing that information. We can pass a new argument, but older versions of git on the server will choke on it. 2. Letting the remote side decide creates a silent inconsistency in user experience. Consider the case that the locally installed git knows about the "tar.gz" format, but a remote server doesn't. Running "git archive -o foo.tar.gz" will use the tar.gz format. If we use --remote, and the local side chooses the format, then we send "--format=tar.gz" to the remote, which will complain about the unknown format. But if we let the remote side choose the format, then it will realize that it doesn't know about "tar.gz" and output uncompressed tar without even issuing a warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-20plug a few coverity-spotted leaksLibravatar Jim Meyering3-1/+6
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-20tag: accept multiple patterns for --listLibravatar Jeff King1-9/+17
Until now, "git tag -l foo* bar*" would silently ignore the second argument, showing only refs starting with "foo". It's not just unfriendly not to take a second pattern; we actually generated subtly wrong results (from the user's perspective) because some of the requested tags were omitted. This patch allows an arbitrary number of patterns on the command line; if any of them matches, the ref is shown. While we're tweaking the documentation, let's also make it clear that the pattern is fnmatch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-19Merge branch 'maint'Libravatar Junio C Hamano1-1/+1
* maint: builtin/gc.c: add missing newline in message
2011-06-19builtin/gc.c: add missing newline in messageLibravatar Andreas Schwab1-1/+1
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-11default core.clockskew variable to one dayLibravatar Jeff King1-1/+1
This is the slop value used by name-rev, so presumably is a reasonable default. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-11limit "contains" traversals based on commit timestampLibravatar Jeff King1-3/+33
When looking for commits that contain other commits (e.g., via "git tag --contains"), we can end up traversing useless portions of the graph. For example, if I am looking for a tag that contains a commit made last week, there is not much point in traversing portions of the history graph made five years ago. This optimization can provide massive speedups. For example, doing "git tag --contains HEAD~200" in the linux-2.6 repository goes from: real 0m5.302s user 0m5.116s sys 0m0.184s to: real 0m0.030s user 0m0.020s sys 0m0.008s The downside is that we will no longer find some answers in the face of extreme clock skew, as we will stop the traversal early when seeing commits skewed too far into the past. Name-rev already implements a similar optimization, using a "slop" of one day to allow for a certain amount of clock skew in commit timestamps. This patch introduces a "core.clockskew" variable, which allows specifying the allowable amount of clock skew in seconds. For safety, it defaults to "none", causing a full traversal (i.e., no change in behavior from previous versions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-11tag: speed up --contains calculationLibravatar Jeff King1-1/+45
When we want to know if commit A contains commit B (or any one of a set of commits, B through Z), we generally calculate the merge bases and see if B is a merge base of A (or for a set, if any of the commits B through Z have that property). When we are going to check a series of commits A1 through An to see whether each contains B (e.g., because we are deciding which tags to show with "git tag --contains"), we do a series of merge base calculations. This can be very expensive, as we repeat a lot of traversal work. Instead, let's leverage the fact that we are going to use the same --contains list for each tag, and mark areas of the commit graph is definitely containing those commits, or definitely not containing those commits. Later tags can then stop traversing as soon as they see a previously calculated answer. This sped up "git tag --contains HEAD~200" in the linux-2.6 repository from: real 0m15.417s user 0m15.197s sys 0m0.220s to: real 0m5.329s user 0m5.144s sys 0m0.184s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: zlib can only process 4GB at a timeLibravatar Junio C Hamano4-10/+10
The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap deflateBound() tooLibravatar Junio C Hamano1-1/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap deflate side of the APILibravatar Junio C Hamano2-6/+6
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-08Merge branch 'maint'Libravatar Junio C Hamano1-2/+4
* maint: fetch: do not leak a refspec
2011-06-08fetch: do not leak a refspecLibravatar Jim Meyering1-2/+4
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-07clone: always fetch remote HEADLibravatar Jeff King1-3/+7
In most cases, fetching the remote HEAD explicitly is unnecessary. It's just a symref pointing to a branch which we are already fetching, so we will already ask for its sha1. However, if the remote has a detached HEAD, things are less certain. We do not ask for HEAD's sha1, but we do try to write it into a local detached HEAD. In most cases this is fine, as the remote HEAD is pointing to some part of the history graph that we will fetch via the refs. But if the remote HEAD points to an "orphan" commit (one which was is not an ancestor of any refs), then we will not have the object, and update_ref will complain when we try to write the detached HEAD, aborting the whole clone. This patch makes clone always explicitly ask the remote for the sha1 of its HEAD commit. In the non-detached case, this is a no-op, as we were going to ask for that sha1 anyway. In the regular detached case, this will add an extra "want" to the protocol negotiation, but will not change the history that gets sent. And in the detached orphan case, we will fetch the orphaned history so that we can write it into our local detached HEAD. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-06Merge branch 'bc/maint-status-z-to-use-porcelain'Libravatar Junio C Hamano1-3/+4
* bc/maint-status-z-to-use-porcelain: builtin/commit.c: set status_format _after_ option parsing t7508: demonstrate status's failure to use --porcelain format with -z Conflicts: builtin/commit.c
2011-06-05verify-pack: use index-pack --verifyLibravatar Junio C Hamano1-107/+25
This finally gets rid of the inefficient verify-pack implementation that walks objects in the packfile in their object name order and replaces it with a call to index-pack --verify. As a side effect, it also removes packed_object_info_detail() API which is rather expensive. As this changes the way errors are reported (verify-pack used to rely on the usual runtime error detection routine unpack_entry() to diagnose the CRC errors in an entry in the *.idx file; index-pack --verify checks the whole *.idx file in one go), update a test that expected the string "CRC" to appear in the error message. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05index-pack: show histogram when emulating "verify-pack -v"Libravatar Junio C Hamano1-3/+23
The histogram produced by "verify-pack -v" always had an artificial limit of 50, but index-pack knows what the maximum delta depth is, so we do not have to limit it. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05index-pack: start learning to emulate "verify-pack -v"Libravatar Junio C Hamano1-3/+40
The "index-pack" machinery already has almost enough knowledge to produce the same output as "verify-pack -v". Fill small gaps in its bookkeeping, and teach it to show what it knows. Add a few more command line options that do not have to be advertised to the end users. They will be used internally when verify-pack calls this. The eventual goal is to remove verify-pack implementation and redo it as a thin wrapper around the index-pack, so that we can remove the rather expensive packed_object_info_detail() API. This still does not do the delta-chain-depth histogram yet but that part is easy. Signed-off-by: Junio C Hamano <gitster@pobox.com>