path: root/builtin
Age | Commit message | Author | Files | Lines
2018-09-04 | Merge branch 'en/directory-renames-nothanks' | Junio C Hamano | 1 | -0/+1
Recent addition of "directory rename" heuristics to the merge-recursive backend makes the command susceptible to false positives and false negatives. In the context of "git am -3", which does not know about surrounding unmodified paths and thus cannot inform the merge machinery about the full trees involved, this risk is particularly severe. As such, the heuristic is disabled for "git am -3" to keep the machinery "more stupid but predictable". * en/directory-renames-nothanks: am: avoid directory rename detection when calling recursive merge machinery merge-recursive: add ability to turn off directory rename detection t3401: add another directory rename testcase for rebase and am
2018-08-30 | am: avoid directory rename detection when calling recursive merge machinery | Elijah Newren | 1 | -0/+1
Let's say you have the following three trees, where Base is from one commit behind either master or branch:

  Base  : bar_v1, foo/{file1, file2, file3}
  branch: bar_v2, foo/{file1, file2}, goo/file3
  master: bar_v3, foo/{file1, file2, file3}

Using git-am (or am-based rebase) to apply the changes from branch onto master results in the following tree:

  Result: bar_merged, goo/{file1, file2, file3}

This is not what users want; they did not rename foo/ -> goo/, they only renamed one file within that directory. The reason this happens is am constructs fake trees (via build_fake_ancestor()) of the following form:

  Base_bfa  : bar_v1, foo/file3
  branch_bfa: bar_v2, goo/file3

Combining these two trees with master's tree:

  master: bar_v3, foo/{file1, file2, file3},

You can see that merge_recursive_generic() would see branch_bfa as renaming foo/ -> goo/, and master as just adding both foo/file1 and foo/file2. As such, it ends up with goo/{file1, file2, file3}

The core problem is that am does not have access to the original trees; it can only construct trees using the blobs involved in the patch. As such, it is not safe to perform directory rename detection within am -3.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
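The false positive described above can be reproduced with a toy model. The sketch below is not the merge-recursive implementation: it assumes a deliberately crude heuristic (a directory counts as renamed when a file moved out of it and the directory no longer exists on that side), matches renames by basename only, and ignores file contents. The function names are hypothetical.

```python
from posixpath import basename, dirname

def detect_dir_renames(base, side):
    """Toy heuristic: map old_dir -> new_dir when a file left old_dir
    and old_dir has no remaining files on `side`."""
    renames = {}
    side_dirs = {dirname(p) for p in side}
    for old in base - side:  # paths that disappeared on this side
        # crude rename detection: same basename under a new directory
        targets = [p for p in side - base if basename(p) == basename(old)]
        if targets and dirname(old) not in side_dirs:
            renames[dirname(old)] = dirname(targets[0])
    return renames

def apply_dir_renames(paths, renames):
    """Move every path that lives in a renamed directory."""
    moved = set()
    for p in paths:
        d = dirname(p)
        moved.add(renames[d] + "/" + basename(p) if d in renames else p)
    return moved

master = {"bar", "foo/file1", "foo/file2", "foo/file3"}

# Full trees: foo/ still exists on branch, so no directory rename is inferred.
full = detect_dir_renames(
    {"bar", "foo/file1", "foo/file2", "foo/file3"},
    {"bar", "foo/file1", "foo/file2", "goo/file3"},
)

# Fake ancestor trees: only the blobs touched by the patch are present,
# so foo/ appears to have vanished entirely.
fake = detect_dir_renames({"bar", "foo/file3"}, {"bar", "goo/file3"})

print(full)
print(fake)
print(sorted(apply_dir_renames(master, fake)))
```

With the fake trees, the model moves all of master's foo/ files into goo/, which is the unwanted Result tree from the commit message.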
2018-08-27 | Merge branch 'ja/i18n-message-fixes' | Junio C Hamano | 1 | -1/+1
Messages fix. * ja/i18n-message-fixes: i18n: fix mistakes in translated strings
2018-08-27 | Merge branch 'js/range-diff' | Junio C Hamano | 1 | -1/+1
Finishing touches to help string. * js/range-diff: range-diff: update stale summary of --no-dual-color
2018-08-27 | Merge branch 'rs/opt-updates' | Junio C Hamano | 5 | -6/+6
"git cmd -h" updates. * rs/opt-updates: parseopt: group literal string alternatives in argument help remote: improve argument help for add --mirror checkout-index: improve argument help for --stage
2018-08-27 | Merge branch 'ep/worktree-quiet-option' | Junio C Hamano | 1 | -3/+13
"git worktree" command learned "--quiet" option to make it less verbose. * ep/worktree-quiet-option: worktree: add --quiet option
2018-08-27 | Merge branch 'sm/branch-sort-config' | Junio C Hamano | 1 | -1/+9
"git branch --list" learned to take the default sort order from the 'branch.sort' configuration variable, just like "git tag --list" pays attention to 'tag.sort'. * sm/branch-sort-config: branch: support configuring --sort via .gitconfig
2018-08-27 | range-diff: update stale summary of --no-dual-color | Kyle Meyer | 1 | -1/+1
275267937b (range-diff: make dual-color the default mode, 2018-08-13) replaced --dual-color with --no-dual-color but left the option's summary untouched. Rewrite the summary to describe --no-dual-color rather than dual-color. Helped-by: Jonathan Nieder <jrnieder@gmail.com> Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-23 | i18n: fix mistakes in translated strings | Jean-Noël Avila | 1 | -1/+1
Fix typos and convert a question that does not expect a reply into simple advice. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-22 | Merge branch 'nd/pack-deltify-regression-fix' | Junio C Hamano | 1 | -4/+1
In a recent update in the 2.18 era, "git pack-objects" started producing larger than necessary packfiles by missing opportunities to use large deltas. * nd/pack-deltify-regression-fix: pack-objects: fix performance issues on packing large deltas
2018-08-21 | parseopt: group literal string alternatives in argument help | René Scharfe | 3 | -4/+4
This formally clarifies that the "--option=" part is the same for all alternatives. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-21 | remote: improve argument help for add --mirror | René Scharfe | 1 | -1/+1
Group the possible values using a pair of parentheses and don't mark them for translation, as they are literal strings that have to be used as-is in any locale. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-21 | checkout-index: improve argument help for --stage | René Scharfe | 1 | -1/+1
Spell out all alternatives and avoid using a numerical range operator, as it is not mentioned in CodingGuidelines and the resulting string is still concise. Wrap them in parentheses to document clearly that the "--stage=" part is common among them. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20 | Merge branch 'nd/cherry-pick-quit-fix' | Junio C Hamano | 1 | -2/+7
"git cherry-pick --quit" failed to remove CHERRY_PICK_HEAD even though we won't be in a cherry-pick session after it returns, which has been corrected. * nd/cherry-pick-quit-fix: cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD
2018-08-20 | Merge branch 'sb/submodule-cleanup' | Junio C Hamano | 1 | -1/+0
A few preliminary minor clean-ups in the area around submodules. * sb/submodule-cleanup: builtin/submodule--helper: remove stray new line t7410: update to new style
2018-08-20 | Merge branch 'jt/repack-promisor-packs' | Junio C Hamano | 1 | -48/+135
After a partial clone, repeated fetches from promisor remote would have accumulated many packfiles marked with .promisor bit without getting them coalesced into fewer packfiles, hurting performance. "git repack" now learned to repack them. * jt/repack-promisor-packs: repack: repack promisor objects if -a or -A is set repack: refactor setup of pack-objects cmd
2018-08-20 | Merge branch 'js/range-diff' | Junio C Hamano | 1 | -0/+116
"git tbdiff" that lets us compare individual patches in two iterations of a topic has been rewritten and made into a built-in command. * js/range-diff: (21 commits) range-diff: use dim/bold cues to improve dual color mode range-diff: make --dual-color the default mode range-diff: left-pad patch numbers completion: support `git range-diff` range-diff: populate the man page range-diff --dual-color: skip white-space warnings range-diff: offer to dual-color the diffs diff: add an internal option to dual-color diffs of diffs color: add the meta color GIT_COLOR_REVERSE range-diff: use color for the commit pairs range-diff: add tests range-diff: do not show "function names" in hunk headers range-diff: adjust the output of the commit pairs range-diff: suppress the diff headers range-diff: indent the diffs just like tbdiff range-diff: right-trim commit messages range-diff: also show the diff between patches range-diff: improve the order of the shown commits range-diff: first rudimentary implementation Introduce `range-diff` to compare iterations of a topic branch ...
2018-08-20 | Merge branch 'nd/no-the-index' | Junio C Hamano | 19 | -32/+38
The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-*.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags* to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...
2018-08-20 | Merge branch 'jk/for-each-object-iteration' | Junio C Hamano | 2 | -28/+82
The API to iterate over all objects learned to optionally list objects in the order they appear in packfiles, which helps locality of access if the caller accesses these objects while the objects are enumerated. * jk/for-each-object-iteration: for_each_*_object: move declarations to object-store.h cat-file: use a single strbuf for all output cat-file: split batch "buf" into two variables cat-file: use oidset check-and-insert cat-file: support "unordered" output for --batch-all-objects cat-file: rename batch_{loose,packed}_object callbacks t1006: test cat-file --batch-all-objects with duplicates for_each_packed_object: support iterating in pack-order for_each_*_object: give more comprehensive docstrings for_each_*_object: take flag arguments as enum for_each_*_object: store flag definitions in a single location
2018-08-17 | worktree: add --quiet option | Elia Pinto | 1 | -3/+13
Add the '--quiet' option to git worktree, as for the other git commands. 'add' is the only command affected by it since all other commands, except 'list', are currently silent by default. [jc: applied trivial fix-up to keep the tests from touching outside the scratch area] Helped-by: Martin Ågren <martin.agren@gmail.com> Helped-by: Duy Nguyen <pclouds@gmail.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-17 | Merge branch 'js/pull-rebase-type-shorthand' | Junio C Hamano | 1 | -3/+3
"git pull --rebase=interactive" learned "i" as a short-hand for "interactive". * js/pull-rebase-type-shorthand: pull --rebase=<type>: allow single-letter abbreviations for the type
2018-08-17 | Merge branch 'rs/parse-opt-lithelp' | Junio C Hamano | 10 | -16/+17
The parse-options machinery learned to refrain from enclosing placeholder string inside a "<bra" and "ket>" pair automatically without PARSE_OPT_LITERAL_ARGHELP. Existing help text for option arguments that are not formatted correctly have been identified and fixed. * rs/parse-opt-lithelp: parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELP shortlog: correct option help for -w send-pack: specify --force-with-lease argument help explicitly pack-objects: specify --index-version argument help explicitly difftool: remove angular brackets from argument help add, update-index: fix --chmod argument help push: use PARSE_OPT_LITERAL_ARGHELP instead of unbalanced brackets
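The inference described above can be pictured with a small model. This is only a sketch, not the parse-options code: it assumes that the help string is treated as literal whenever it contains one of the characters `()<>[]|`, and the function name, the keyword argument, and the sample strings are all hypothetical.

```python
# Assumed trigger characters; an argument help string containing any of
# them is taken to be literal text rather than a simple placeholder.
LITERAL_CHARS = set("()<>[]|")

def format_arghelp(argh, literal_flag=False):
    """Return the help text for an option argument.

    literal_flag stands in for an explicitly set PARSE_OPT_LITERAL_ARGHELP.
    """
    if literal_flag or any(c in LITERAL_CHARS for c in argh):
        return argh           # shown as-is
    return f"<{argh}>"        # plain placeholder gets wrapped automatically

print(format_arghelp("n"))
print(format_arghelp("(1|2|3|all)"))
```

Under this model an option author only needs the explicit flag for a literal string that happens to contain none of the special characters.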
2018-08-16 | branch: support configuring --sort via .gitconfig | Samuel Maftoul | 1 | -1/+9
Add support for configuring default sort ordering for git branches. Command line option will override this configured value, using the exact same syntax. Signed-off-by: Samuel Maftoul <samuel.maftoul@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 | builtin/submodule--helper: remove stray new line | Stefan Beller | 1 | -1/+0
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 | cherry-pick: fix --quit not deleting CHERRY_PICK_HEAD | Nguyễn Thái Ngọc Duy | 1 | -2/+7
--quit is supposed to be --abort but without restoring HEAD. Leaving CHERRY_PICK_HEAD behind could make other commands mistakenly think that a cherry-pick is still ongoing (e.g. "git commit --amend" will refuse to work). Clean it too. For --abort, this job of deleting CHERRY_PICK_HEAD is on "git reset" so we don't need to do anything else. But let's add extra checks in --abort tests to confirm. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15 | Merge branch 'jt/connectivity-check-after-unshallow' | Junio C Hamano | 2 | -26/+6
"git fetch" sometimes failed to update the remote-tracking refs, which has been corrected. * jt/connectivity-check-after-unshallow: fetch-pack: unify ref in and out param
2018-08-15 | Merge branch 'rs/remote-mv-leakfix' | Junio C Hamano | 1 | -2/+3
Leakfix. * rs/remote-mv-leakfix: remote: clear string_list after use in mv()
2018-08-15 | Merge branch 'nd/pack-objects-threading-doc' | Junio C Hamano | 1 | -0/+19
Doc fix. * nd/pack-objects-threading-doc: pack-objects: document about thread synchronization
2018-08-15 | Merge branch 'jt/tag-following-with-proto-v2-fix' | Junio C Hamano | 1 | -1/+1
The wire-protocol v2 relies on the client to send "ref prefixes" to limit the bandwidth spent on the initial ref advertisement. "git fetch $remote branch:branch", which asks for tags that point into the history leading to the "branch" to be followed automatically, sent too narrow a prefix and broke the tag following, which has been fixed. * jt/tag-following-with-proto-v2-fix: fetch: send "refs/tags/" prefix upon CLI refspecs t5702: test fetch with multiple refspecs at a time
2018-08-15 | Merge branch 'jk/size-t' | Junio C Hamano | 1 | -1/+2
Code clean-up to use size_t/ssize_t when they are the right type. * jk/size-t: strbuf_humanise: use unsigned variables pass st.st_size as hint for strbuf_readlink() strbuf_readlink: use ssize_t strbuf: use size_t for length in intermediate variables reencode_string: use size_t for string lengths reencode_string: use st_add/st_mult helpers
2018-08-15 | Merge branch 'nd/i18n' | Junio C Hamano | 12 | -158/+169
Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...
2018-08-15 | Merge branch 'bw/clone-ref-prefixes' | Junio C Hamano | 1 | -5/+15
The wire-protocol v2 relies on the client to send "ref prefixes" to limit the bandwidth spent on the initial ref advertisement. When "git clone" learned to speak v2, it forgot to do so, which has been corrected. * bw/clone-ref-prefixes: clone: send ref-prefixes when using protocol v2
2018-08-15 | Merge branch 'jk/core-use-replace-refs' | Junio C Hamano | 7 | -7/+7
A new configuration variable core.usereplacerefs has been added, primarily to help server installations that want to ignore the replace mechanism altogether. * jk/core-use-replace-refs: add core.usereplacerefs config option check_replace_refs: rename to read_replace_refs check_replace_refs: fix outdated comment
2018-08-14 | for_each_*_object: move declarations to object-store.h | Jeff King | 1 | -0/+1
The for_each_loose_object() and for_each_packed_object() functions are meant to be part of a unified interface: they use the same set of for_each_object_flags, and it's not inconceivable that we might one day add a single for_each_object() wrapper around them. Let's put them together in a single file, so we can avoid awkwardness like saying "the flags for this function are over in cache.h". Moving the loose functions to packfile.h is silly. Moving the packed functions to cache.h works, but makes the "cache.h is a kitchen sink" problem worse. The best place is the recently-created object-store.h, since these are quite obviously related to object storage. The for_each_*_in_objdir() functions do not use the same flags, but they are logically part of the same interface as for_each_loose_object(), and share callback signatures. So we'll move those, as well, as they also make sense in object-store.h. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 | cat-file: use a single strbuf for all output | Jeff King | 1 | -11/+17
When we're in batch mode, we end up in batch_object_write() for each object, which allocates its own strbuf for each call. Instead, we can provide a single "scratch" buffer that gets reused for each output.

When running:

  git cat-file --batch-all-objects --batch-check='%(objectname)'

on git.git, my best-of-five time drops from:

  real 0m0.171s
  user 0m0.159s
  sys 0m0.012s

to:

  real 0m0.133s
  user 0m0.121s
  sys 0m0.012s

Note that we could do this just by putting the "scratch" pointer into "struct expand_data", but I chose instead to add an extra parameter to the callstack. That's more verbose, but it makes it a bit more obvious what is going on, which in turn makes it easy to see where we need to be releasing the string in the caller (right after the loop which uses it in each case).

Based-on-a-patch-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 | cat-file: split batch "buf" into two variables | Jeff King | 1 | -6/+8
We use the "buf" strbuf for two things: to read incoming lines, and as a scratch space for test-expanding the user-provided format. Let's split this into two variables with descriptive names, which makes their purpose and lifetime more clear. It will also help in a future patch when we start using the "output" buffer for more expansions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14 | cat-file: use oidset check-and-insert | Jeff King | 1 | -2/+1
We don't need to check if the oidset has our object before we insert it; that's done as part of the insertion. We can just rely on the return value from oidset_insert(), which saves one hash lookup per object. The measurable speedup is tiny and within the run-to-run noise, but the result is simpler to read, too. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
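The check-and-insert pattern can be illustrated in a few lines. This is a model in Python rather than the oidset API: `insert` is a hypothetical helper, and it assumes the convention that the insert call reports whether the element was already present.

```python
def insert(seen, oid):
    """Add oid to the set and report whether it was already there.

    Comparing the size before and after avoids a separate membership
    test, so each object costs one hash lookup instead of two.
    """
    before = len(seen)
    seen.add(oid)
    return len(seen) == before

def unique(oids):
    """Yield each object name once, in first-seen order."""
    seen = set()
    for oid in oids:
        if insert(seen, oid):
            continue          # duplicate: already collected
        yield oid

print(list(unique(["aa11", "bb22", "aa11", "cc33", "bb22"])))
```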
2018-08-13 | blame.c: remove implicit dependency on the_index | Nguyễn Thái Ngọc Duy | 1 | -0/+1
Side note, since we gain access to the right repository, we can stop relying on the_repository in this code as well. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | apply.c: make init_apply_state() take a struct repository | Nguyễn Thái Ngọc Duy | 2 | -2/+2
We're moving away from the_index in this code. "struct index_state *" could be added to struct apply_state. But let's aim long term and put struct repository here instead so that we could even avoid more global states in the future. The index will be available via apply_state->repo->index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
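The design choice can be sketched outside of C. The classes below are hypothetical stand-ins for the structs named in the message, not their real layout; the point is only that state reached through a passed-in repository lets two instances coexist, which a global singleton cannot do.

```python
from dataclasses import dataclass, field

@dataclass
class IndexState:
    entries: list = field(default_factory=list)

@dataclass
class Repository:
    index: IndexState = field(default_factory=IndexState)

@dataclass
class ApplyState:
    repo: Repository

def init_apply_state(repo):
    """Bind the apply state to the caller's repository, not a global."""
    return ApplyState(repo=repo)

def add_path(state, path):
    # The index is reached through the state, mirroring
    # apply_state->repo->index in the message above.
    state.repo.index.entries.append(path)

first = init_apply_state(Repository())
second = init_apply_state(Repository())
add_path(first, "builtin/apply.c")

print(first.repo.index.entries, second.repo.index.entries)
```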
2018-08-13 | archive.c: avoid access to the_index | Nguyễn Thái Ngọc Duy | 2 | -2/+3
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | grep: use the right index instead of the_index | Nguyễn Thái Ngọc Duy | 1 | -2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | attr: remove index from git_attr_set_direction() | Nguyễn Thái Ngọc Duy | 1 | -1/+1
Since the attr checking API now takes the index, there's no need to set an index in advance with this call. Most call sites are straightforward because they either pass the_index or NULL (which defaults back to the_index previously). There's only one suspicious call site in unpack-trees.c where it sets a different index. This code in unpack-trees is about to check out entries from the new/temporary index after merging is done in it. The attributes will be used by entry.c code to do crlf conversion if needed. Since entry.c now respects struct checkout's istate field, and this field is correctly set in unpack-trees.c, there should be no regression from this change. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | entry.c: use the right index instead of the_index | Nguyễn Thái Ngọc Duy | 1 | -0/+1
checkout-index.c needs an update because if checkout->istate is NULL, ie_match_stat() will crash. Previously this was ie_match_stat(&the_index, ..), so it would not crash, but it was not technically correct either. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | ls-files: correct index argument to get_convert_attr_ascii() | Nguyễn Thái Ngọc Duy | 1 | -8/+9
write_eolinfo() does take an istate as function argument and it should be used instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | dir.c: remove an implicit dependency on the_index in pathspec code | Nguyễn Thái Ngọc Duy | 9 | -15/+15
Make the match_pathspec API and friends take an index_state instead of assuming the_index in dir.c. All external call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | convert.c: remove an implicit dependency on the_index | Nguyễn Thái Ngọc Duy | 2 | -2/+2
Make the convert API take an index_state instead of assuming the_index in convert.c. All external call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | attr: remove an implicit dependency on the_index | Nguyễn Thái Ngọc Duy | 2 | -3/+3
Make the attr API take an index_state instead of assuming the_index in attr code. All call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. There is one ugly temporary workaround added in attr.c that needs some more explanation. Commit c24f3abace (apply: file commited with CRLF should roundtrip diff and apply - 2017-08-19) forces one convert_to_git() call to NOT read the index at all. But what do you know, we read it anyway by falling back to the_index. When "istate" from convert_to_git is now propagated down to read_attr_from_array() we will hit a segfault somewhere inside read_blob_data_from_index. The right way of dealing with this is to kill the "use_index" variable and only follow "istate" but at this stage we are not ready for that: while most git_attr_set_direction() calls just pass the_index to be assigned to use_index, unpack-trees passes a different one which is used by entry.c code, which has no way to know what index to use if we delete use_index. So this has to be done later. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | diff.c: move read_index() code back to the caller | Nguyễn Thái Ngọc Duy | 1 | -3/+5
This code is only needed for diff-tree (since f0c6b2a2fd ([PATCH] Optimize diff-tree -[CM] --stdin - 2005-05-27)). Let the caller do the preparation instead and avoid read_index() in diff.c code. read_index() should be avoided (in addition to the_index) because it uses get_index_file() underneath to get the path $GIT_DIR/index. This effectively pulls the_repository in and may become the only reason to pull a 'struct repository *' in diff.c. Let's keep the dependencies as few as possible and kick it back to diff-tree.c Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 | cat-file: support "unordered" output for --batch-all-objects | Jeff King | 1 | -5/+51
If you're going to access the contents of every object in a packfile, it's generally much more efficient to do so in pack order, rather than in hash order. That increases the locality of access within the packfile, which in turn is friendlier to the delta base cache, since the packfile puts related deltas next to each other. By contrast, hash order is effectively random, since the sha1 has no discernible relationship to the content.

This patch introduces an "--unordered" option to cat-file which iterates over packs in pack-order under the hood. You can see the results when dumping all of the file content:

  $ time ./git cat-file --batch-all-objects --buffer --batch | wc -c
  6883195596

  real 0m44.491s
  user 0m42.902s
  sys 0m5.230s

  $ time ./git cat-file --unordered \
      --batch-all-objects --buffer --batch | wc -c
  6883195596

  real 0m6.075s
  user 0m4.774s
  sys 0m3.548s

Same output, different order, way faster. The same speed-up applies even if you end up accessing the object content in a different process, like:

  git cat-file --batch-all-objects --buffer --batch-check |
  grep blob |
  git cat-file --batch='%(objectname) %(rest)' |
  wc -c

Adding "--unordered" to the first command drops the runtime in git.git from 24s to 3.5s.

Side note: there are actually further speedups available for doing it all in-process now. Since we are outputting the object content during the actual pack iteration, we know where to find the object and could skip the extra lookup done by oid_object_info(). This patch stops short of that optimization since the underlying API isn't ready for us to make those sorts of direct requests.

So if --unordered is so much better, why not make it the default? Two reasons:

  1. We've promised in the documentation that --batch-all-objects outputs in hash order. Since cat-file is plumbing, people may be relying on that default, and we can't change it.

  2. It's actually _slower_ for some cases. We have to compute the pack revindex to walk in pack order. And our de-duplication step uses an oidset, rather than a sort-and-dedup, which can end up being more expensive.

If we're just accessing the type and size of each object, for example, like:

  git cat-file --batch-all-objects --buffer --batch-check

my best-of-five warm cache timings go from 900ms to 1100ms using --unordered. Though it's possible in a cold-cache or under memory pressure that we could do better, since we'd have better locality within the packfile.

And one final question: why is it "--unordered" and not "--pack-order"? The answer is again two-fold:

  1. "pack order" isn't a well-defined thing across the whole set of objects. We're hitting loose objects, as well as objects in multiple packs, and the only ordering we're promising is _within_ a single pack. The rest is apparently random.

  2. The point here is optimization. So we don't want to promise any particular ordering, but only to say that we will choose an ordering which is likely to be efficient for accessing the object content. That leaves the door open for further changes in the future without having to add another compatibility option.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
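The locality argument can be made concrete with a toy model. The sketch below invents a pack of 1000 objects laid out back to back at 100-byte intervals (both numbers are arbitrary assumptions) and uses total seek distance as a crude stand-in for cache friendliness. It says nothing about real packfile layout or the delta base cache.

```python
import hashlib

# Hypothetical pack: (object name, offset) pairs, stored in creation order.
objects = [(hashlib.sha1(f"blob {i}".encode()).hexdigest(), i * 100)
           for i in range(1000)]

def seek_distance(ordered):
    """Sum of the jumps between consecutively visited offsets."""
    offsets = [offset for _oid, offset in ordered]
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))

hash_order = sorted(objects)                      # sorted by object name
pack_order = sorted(objects, key=lambda o: o[1])  # sorted by pack offset

print("hash order:", seek_distance(hash_order))
print("pack order:", seek_distance(pack_order))
```

Walking in pack order visits each offset once in sequence, so the total distance is just the span of the pack; hash order bounces across the file because the names carry no information about position.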
2018-08-13 | cat-file: rename batch_{loose,packed}_object callbacks | Jeff King | 1 | -9/+9
We're not really doing the batch-show operation in these callbacks, but just collecting the set of objects. That distinction will become more important in a future patch, so let's rename them now to avoid cluttering that diff. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>