Age | Commit message | Author | Files | Lines
2018-08-18 | unpack-trees: reuse (still valid) cache-tree from src_index | Nguyễn Thái Ngọc Duy | 2 | -1/+3
We do n-way merge by walking the source index and n trees at the same time and add merge results to a new temporary index called o->result. The merge result for any given path could be either

- keep_entry(): same old index entry in o->src_index is reused
- merged_entry(): either a new entry is added, or an existing one updated
- deleted_entry(): one entry from o->src_index is removed

For some reason [1] we keep making sure that the source index's cache-tree is still valid if used by o->result: for all those merged/deleted entries, we invalidate the same path in o->src_index, so only cache-trees covering the "keep_entry" parts remain good.

Because of this, the cache-tree from o->src_index can be reused as-is in o->result. And in fact we already rely on this logic to reuse untracked cache in edf3b90553 (unpack-trees: preserve index extensions - 2017-05-08).

Move the cache-tree to o->result before doing cache_tree_update() to reduce hashing cost, since cache_tree_update() has become one of the most expensive parts of unpack_trees() after the last few patches.

This does help reduce unpack_trees() time significantly (on webkit.git):

    before       after
    --------------------------------------------------------------------
    0.080394752  0.051258167 s: read cache .git/index
    0.216010838  0.212106298 s: preload index
    0.008534301  0.280521764 s: refresh index
    0.251992198  0.218160442 s: traverse_trees
    0.377031383  0.374948191 s: check_updates
    0.372768105  0.037040114 s: cache_tree_update
    1.045887251  0.672031609 s: unpack_trees
    0.314983512  0.317456290 s: write index, changed mask = 2e
    0.062572653  0.038382654 s: traverse_trees
    0.000022544  0.000042731 s: check_updates
    0.073795585  0.050930053 s: unpack_trees
    0.073807557  0.051099735 s: diff-index
    1.938191592  1.614241153 s: git command: git checkout -

[1] I'm pretty sure the reason is an oversight in 34110cd4e3 (Make 'unpack_trees()' have a separate source and destination index - 2008-03-06). That patch aims to _not_ update the source index at all. The invalidation should have been done on o->result in that patch. But there was no cache-tree on o->result back then, so doing it there would have been pointless.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
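The reuse argument above can be illustrated with a toy model. This is a Python sketch, not git's C code: the dict-based cache-tree, `invalidate_path()`, `merge()` and the path names are all hypothetical, and the only behaviour taken from the commit message is that merged and deleted entries invalidate their path in the source cache-tree while kept entries do not.

```python
# Toy model (assumption: a cache-tree is a dict mapping a directory to a
# tree hash, and an entry is "valid" while it is present in the dict).

def invalidate_path(cache_tree, path):
    """Drop the cache-tree entry of every directory that contains `path`."""
    parts = path.split("/")[:-1]
    cache_tree.pop("", None)  # the root directory always contains the path
    for i in range(1, len(parts) + 1):
        cache_tree.pop("/".join(parts[:i]), None)


def merge(src_cache_tree, decisions):
    """`decisions` maps path -> "keep" | "merged" | "deleted"."""
    for path, what in decisions.items():
        if what != "keep":
            invalidate_path(src_cache_tree, path)
    # Whatever survives only covers directories made of kept entries, so
    # the result index can adopt it instead of rehashing those directories.
    return src_cache_tree
```

In this model a change to `docs/readme` invalidates the root and `docs`, while `lib` and `lib/util` stay valid and need no rehashing.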
2018-08-18 | unpack-trees: reduce malloc in cache-tree walk | Nguyễn Thái Ngọc Duy | 1 | -9/+20
This is a micro optimization that probably only shines on repos with a deep directory structure. Instead of allocating and freeing a new cache_entry in every iteration, we reuse the last one and only update the parts that are new each iteration.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-18 | unpack-trees: optimize walking same trees with cache-tree | Nguyễn Thái Ngọc Duy | 1 | -0/+127
In order to merge one or many trees with the index, unpack-trees code walks multiple trees in parallel with the index and performs n-way merge. If we find out at the start of a directory that all trees are the same (by comparing OID) and cache-tree happens to be available for that directory as well, we could avoid walking the trees because we already know what these trees contain: it's flattened in what's called "the index".

The upside is of course a lot less I/O since we can potentially skip lots of trees (think subtrees). We also save CPU because we don't have to inflate and apply the deltas. The downside is of course more fragile code since the logic in some functions is now duplicated elsewhere.

"checkout -" with this patch on webkit.git (275k files):

    baseline     new
    --------------------------------------------------------------------
    0.056651714  0.080394752 s: read cache .git/index
    0.183101080  0.216010838 s: preload index
    0.008584433  0.008534301 s: refresh index
    0.633767589  0.251992198 s: traverse_trees
    0.340265448  0.377031383 s: check_updates
    0.381884638  0.372768105 s: cache_tree_update
    1.401562947  1.045887251 s: unpack_trees
    0.338687914  0.314983512 s: write index, changed mask = 2e
    0.411927922  0.062572653 s: traverse_trees
    0.000023335  0.000022544 s: check_updates
    0.423697246  0.073795585 s: unpack_trees
    0.423708360  0.073807557 s: diff-index
    2.559524127  1.938191592 s: git command: git checkout -

Another measurement from Ben's running "git checkout" with over 500k trees (on the whole series):

    baseline      new
    ----------------------------------------------------------------------
    0.535510167   0.556558733 s: read cache .git/index
    0.3057373     0.3147105   s: initialize name hash
    0.0184082     0.023558433 s: preload index
    0.086910967   0.089085967 s: refresh index
    7.889590767   2.191554433 s: unpack trees
    0.120760833   0.131941267 s: update worktree after a merge
    2.2583504     2.572663167 s: repair cache-tree
    0.8916137     0.959495233 s: write index, changed mask = 28
    3.405199233   0.2710663   s: unpack trees
    0.000999667   0.0021554   s: update worktree after a merge
    3.4063306     0.273318333 s: diff-index
    16.9524923    9.462943133 s: git command: git.exe checkout

This command calls unpack_trees() twice, the first time on 2way merge and the second on 1way merge. Both times, "unpack trees" time is reduced to one third. Overall time reduction is not that impressive of course because index operations take a big chunk. And there's that repair cache-tree line.

PS. A note about cache-tree invalidation and the use of it in this code. We do invalidate cache-tree in the _source_ index when we add new entries to the (temporary) "result" index. But we also use the cache-tree from the source index in this optimization. Does this mean we end up having no cache-tree in the source index to activate this optimization? The answer is twisted: the order of finding a good cache-tree and invalidating it matters. In this case we check for a good cache-tree first in all_trees_same_as_cache_tree(), then we start to merge things and potentially invalidate that same cache-tree in the process. Since cache-tree invalidation happens after the optimization kicks in, we're still good. But we may lose that cache-tree at the very first call_unpack_fn() call in traverse_by_cache_tree().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
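The shortcut described in this commit can be sketched as a toy model. This is Python rather than git's C, and `can_skip_tree_walk()`, `unpack_directory()` and the data shapes are hypothetical. It only shows the decision: when every tree being merged has the same OID as a valid cache-tree entry for the directory, the entries are taken from the flattened index instead of walking tree objects.

```python
def can_skip_tree_walk(tree_oids, cached_oid):
    """True when every tree being merged equals the valid cache-tree entry."""
    if cached_oid is None:  # no valid cache-tree for this directory
        return False
    return all(oid == cached_oid for oid in tree_oids)


def unpack_directory(path, tree_oids, cache_tree, index_entries):
    """Return ("index", entries) for the fast path, ("walk", []) otherwise."""
    if can_skip_tree_walk(tree_oids, cache_tree.get(path)):
        # The directory's content is already flattened in the index.
        prefix = path + "/" if path else ""
        return "index", [e for e in index_entries if e.startswith(prefix)]
    # Fall back to reading and walking the tree objects.
    return "walk", []
```

The check runs before any merging of the directory starts, which matches the ordering note in the PS above: the cache-tree is consulted first and may be invalidated afterwards.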
2018-08-18 | unpack-trees: add performance tracing | Nguyễn Thái Ngọc Duy | 2 | -1/+10
We're going to optimize unpack_trees() a bit in the following patches. Let's add some tracing to measure how long it takes before and after. This is the baseline ("git checkout -" on webkit.git, 275k files on worktree):

    performance: 0.056651714 s: read cache .git/index
    performance: 0.183101080 s: preload index
    performance: 0.008584433 s: refresh index
    performance: 0.633767589 s: traverse_trees
    performance: 0.340265448 s: check_updates
    performance: 0.381884638 s: cache_tree_update
    performance: 1.401562947 s: unpack_trees
    performance: 0.338687914 s: write index, changed mask = 2e
    performance: 0.411927922 s: traverse_trees
    performance: 0.000023335 s: check_updates
    performance: 0.423697246 s: unpack_trees
    performance: 0.423708360 s: diff-index
    performance: 2.559524127 s: git command: git checkout -

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-18 | trace.h: support nested performance tracing | Nguyễn Thái Ngọc Duy | 7 | -20/+96
Performance measurements are listed right now as a flat list, which is fine when we measure big blocks. But when we start adding more and more measurements, some of them could be just part of a bigger measurement, and a flat list gives the wrong impression that they are executed at the same level instead of nested.

Add trace_performance_enter() and trace_performance_leave() to allow indenting these nested measurements. For now it does not help much because the only nested thing is (lazy) name hash initialization (e.g. called in diff-index from "git status"). It will help more once I add some more tracing that's actually nested.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
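The enter/leave idea can be sketched in a few lines. This is a Python model, not the trace.h API: the `Tracer` class and its `region()` method are hypothetical stand-ins for the trace_performance_enter() / trace_performance_leave() pair, and the exact output layout (indentation placed before the label) is an assumption.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Collects timing lines, indenting regions that ran inside another region."""

    def __init__(self):
        self.depth = 0
        self.lines = []

    @contextmanager
    def region(self, label):
        start = time.perf_counter()
        self.depth += 1  # "enter": anything measured from here on is nested
        try:
            yield
        finally:
            self.depth -= 1  # "leave": report at the depth of the caller
            elapsed = time.perf_counter() - start
            indent = "  " * self.depth
            self.lines.append(f"{elapsed:.9f} s: {indent}{label}")


tracer = Tracer()
with tracer.region("unpack_trees"):
    with tracer.region("traverse_trees"):
        pass
```

A line is emitted when its region ends, so the inner measurement is listed before the outer one that contains it. The tables in the neighbouring commit messages show the same order: traverse_trees comes before unpack_trees.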
2018-08-17 | Seventh batch for 2.19 cycle | Junio C Hamano | 1 | -0/+64
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-17 | Merge branch 'nd/complete-config-vars' | Junio C Hamano | 1 | -1/+1
Build fix.

* nd/complete-config-vars:
  Makefile: add missing dependency for command-list.h
2018-08-17 | Merge branch 'ar/t4150-am-scissors-test-fix' | Junio C Hamano | 1 | -19/+20
Test fix.

* ar/t4150-am-scissors-test-fix:
  t4150: fix broken test for am --scissors
2018-08-17 | Merge branch 'js/pull-rebase-type-shorthand' | Junio C Hamano | 2 | -3/+15
"git pull --rebase=interactive" learned "i" as a short-hand for "interactive".

* js/pull-rebase-type-shorthand:
  pull --rebase=<type>: allow single-letter abbreviations for the type
2018-08-17 | Merge branch 'jk/diff-rendered-docs' | Junio C Hamano | 2 | -0/+110
The end result of a documentation update can now be inspected more easily, to help developers.

* jk/diff-rendered-docs:
  add a script to diff rendered documentation
2018-08-17 | Merge branch 'hn/config-in-code-comment' | Junio C Hamano | 1 | -1/+6
Header update.

* hn/config-in-code-comment:
  config: document git config getter return value
2018-08-17 | Merge branch 'nd/config-blame-sort' | Junio C Hamano | 1 | -34/+34
Doc fix.

* nd/config-blame-sort:
  config.txt: reorder blame stuff to keep config keys sorted
2018-08-17 | Merge branch 'en/t3031-title-fix' | Junio C Hamano | 1 | -1/+1
Test fix.

* en/t3031-title-fix:
  t3031: update test description to mention desired behavior
2018-08-17 | Merge branch 'sb/indent-heuristic-optim' | Junio C Hamano | 1 | -1/+11
"git diff --indent-heuristic" had a bad corner case performance.

* sb/indent-heuristic-optim:
  xdiff: reduce indent heuristic overhead
2018-08-17 | Merge branch 'en/abort-df-conflict-fixes' | Junio C Hamano | 4 | -9/+131
"git merge --abort" etc. did not clean things up properly when there were conflicted entries in the index in a certain order that are involved in D/F conflicts. This has been corrected.

* en/abort-df-conflict-fixes:
  read-cache: fix directory/file conflict handling in read_index_unmerged()
  t1015: demonstrate directory/file conflict recovery failures
2018-08-17 | Merge branch 'mk/http-backend-content-length' | Junio C Hamano | 6 | -15/+282
The http-backend (used for smart-http transport) used to slurp the whole input until EOF, without paying attention to CONTENT_LENGTH that is supplied in the environment and instead expecting the Web server to close the input stream. This has been fixed.

* mk/http-backend-content-length:
  t5562: avoid non-portable "export FOO=bar" construct
  http-backend: respect CONTENT_LENGTH for receive-pack
  http-backend: respect CONTENT_LENGTH as specified by rfc3875
  http-backend: cleanup writing to child process
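The difference between slurping until EOF and honouring CONTENT_LENGTH can be shown with a small sketch. This is Python, not http-backend's C code, and `read_request_body()` is a hypothetical helper. Treating an unset CONTENT_LENGTH as "no body" follows RFC 3875 and is an assumption of this sketch, not a claim about every case http-backend handles.

```python
import io


def read_request_body(stream, environ):
    """Read exactly CONTENT_LENGTH bytes instead of waiting for EOF."""
    length = environ.get("CONTENT_LENGTH", "")
    if not length:
        return b""  # assumption: no CONTENT_LENGTH means no request body
    remaining = int(length)
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("request body shorter than CONTENT_LENGTH")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# The server may keep the stream open; only the first 5 bytes are the body.
body = read_request_body(io.BytesIO(b"helloEXTRA"), {"CONTENT_LENGTH": "5"})
```

Because the reader stops after the declared length, it no longer depends on the Web server closing the input stream.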
2018-08-17 | Merge branch 'ot/ref-filter-object-info' | Junio C Hamano | 1 | -88/+138
A few atoms like %(objecttype) and %(objectsize) in the format specifier of "for-each-ref --format=<format>" can be filled without getting the full contents of the object, but just with the object header. These cases have been optimized by calling oid_object_info() API (instead of reading and inspecting the data).

* ot/ref-filter-object-info:
  ref-filter: use oid_object_info() to get object
  ref-filter: merge get_obj and get_object
  ref-filter: initialize eaten variable
  ref-filter: fill empty fields with empty values
  ref-filter: add info_source to valid_atom
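Why the header alone is enough for type and size can be seen from the loose object format, which is a zlib stream of "<type> <size>\0<content>". The Python sketch below is not oid_object_info(), which also handles packed objects. `loose_object_header()` is a hypothetical helper that inflates only the first few bytes of a loose object.

```python
import zlib


def loose_object_header(compressed, max_header=64):
    """Return (type, size) by inflating only the start of a loose object."""
    inflater = zlib.decompressobj()
    # max_length caps the output, so a large blob is never fully inflated.
    head = inflater.decompress(compressed, max_header)
    header, sep, _ = head.partition(b"\0")
    if not sep:
        raise ValueError("object header not found")
    obj_type, size = header.split(b" ")
    return obj_type.decode(), int(size)


# Build a loose-object-style payload for illustration.
content = b"x" * 100000
raw = zlib.compress(b"blob %d\0" % len(content) + content)
info = loose_object_header(raw)
```

Only up to 64 bytes of output are produced here, however large the object is, which is the kind of saving the merged topic is after.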
2018-08-17 | Merge branch 'nd/no-extern' | Junio C Hamano | 12 | -266/+269
Noiseword "extern" has been removed from function decls in the header files.

* nd/no-extern:
  submodule.h: drop extern from function declaration
  revision.h: drop extern from function declaration
  repository.h: drop extern from function declaration
  rerere.h: drop extern from function declaration
  line-range.h: drop extern from function declaration
  diff.h: remove extern from function declaration
  diffcore.h: drop extern from function declaration
  convert.h: drop 'extern' from function declaration
  cache-tree.h: drop extern from function declaration
  blame.h: drop extern on func declaration
  attr.h: drop extern from function declaration
  apply.h: drop extern on func declaration
2018-08-17 | Merge branch 'es/want-color-fd-defensive' | Junio C Hamano | 1 | -0/+3
Futureproofing a helper function that can easily be misused.

* es/want-color-fd-defensive:
  color: protect against out-of-bounds reads and writes
2018-08-17 | Merge branch 'ab/sha1dc' | Junio C Hamano | 2 | -1/+11
AIX portability update for the SHA1DC hash, imported from upstream.

* ab/sha1dc:
  sha1dc: update from upstream
2018-08-17 | Merge branch 'rs/parse-opt-lithelp' | Junio C Hamano | 11 | -17/+19
The parse-options machinery learned to refrain from enclosing placeholder string inside a "<bra" and "ket>" pair automatically without PARSE_OPT_LITERAL_ARGHELP. Existing help texts for option arguments that were not formatted correctly have been identified and fixed.

* rs/parse-opt-lithelp:
  parse-options: automatically infer PARSE_OPT_LITERAL_ARGHELP
  shortlog: correct option help for -w
  send-pack: specify --force-with-lease argument help explicitly
  pack-objects: specify --index-version argument help explicitly
  difftool: remove angular brackets from argument help
  add, update-index: fix --chmod argument help
  push: use PARSE_OPT_LITERAL_ARGHELP instead of unbalanced brackets
2018-08-17 | Merge branch 'ab/fetch-nego' | Junio C Hamano | 4 | -4/+39
Update to a few other topics around 'git fetch'.

* ab/fetch-nego:
  fetch doc: cross-link two new negotiation options
  negotiator: unknown fetch.negotiationAlgorithm should error out
2018-08-17 | Merge branch 'jt/refspec-dwim-precedence-fix' | Junio C Hamano | 3 | -8/+58
"git fetch $there refs/heads/s" ought to fetch the tip of the branch 's', but when "refs/heads/refs/heads/s", i.e. a branch whose name is "refs/heads/s", exists at the same time, it fetched that one instead by mistake. This has been corrected to honor the usual disambiguation rules for abbreviated refnames.

* jt/refspec-dwim-precedence-fix:
  remote: make refspec follow the same disambiguation rule as local refs
2018-08-17 | Merge branch 'jk/merge-subtree-heuristics' | Junio C Hamano | 2 | -17/+54
The automatic tree-matching in "git merge -s subtree" was broken 5 years ago and nobody has noticed since then, which is now fixed.

* jk/merge-subtree-heuristics:
  score_trees(): fix iteration over trees with missing entries
2018-08-17 | Merge branch 'ab/test-must-be-empty' | Junio C Hamano | 2 | -4/+2
Test updates.

* ab/test-must-be-empty:
  tests: make use of the test_must_be_empty function
2018-08-17 | Merge branch 'es/rebase-i-author-script-fix' | Junio C Hamano | 2 | -16/+33
The "author-script" file "git rebase -i" creates got broken when we started to move the command away from shell script, which is getting fixed now.

* es/rebase-i-author-script-fix:
  sequencer: don't die() on bogus user-edited timestamp
  sequencer: fix "rebase -i --root" corrupting author header timestamp
  sequencer: fix "rebase -i --root" corrupting author header timezone
  sequencer: fix "rebase -i --root" corrupting author header
2018-08-17 | Merge branch 'ab/fsck-transfer-updates' | Junio C Hamano | 3 | -41/+255
The test performed at the receiving end of "git push" to prevent bad objects from entering repository can be customized via receive.fsck.* configuration variables; we now have gained a counterpart to do the same on the "git fetch" side, with fetch.fsck.* configuration variables.

* ab/fsck-transfer-updates:
  fsck: test and document unknown fsck.<msg-id> values
  fsck: add stress tests for fsck.skipList
  fsck: test & document {fetch,receive}.fsck.* config fallback
  fetch: implement fetch.fsck.*
  transfer.fsckObjects tests: untangle confusing setup
  config doc: elaborate on fetch.fsckObjects security
  config doc: elaborate on what transfer.fsckObjects does
  config doc: unify the description of fsck.* and receive.fsck.*
  config doc: don't describe *.fetchObjects twice
  receive.fsck.<msg-id> tests: remove dead code
2018-08-15 | Sixth batch for 2.19 cycle | Junio C Hamano | 1 | -0/+77
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15 | Merge branch 'jt/connectivity-check-after-unshallow' | Junio C Hamano | 9 | -84/+50
"git fetch" sometimes failed to update the remote-tracking refs, which has been corrected.

* jt/connectivity-check-after-unshallow:
  fetch-pack: unify ref in and out param
2018-08-15 | Merge branch 'sg/travis-retrieve-trash-upon-failure' | Junio C Hamano | 3 | -3/+104
The Travis CI scripts were taught to ship back the test data from failed tests.

* sg/travis-retrieve-trash-upon-failure:
  travis-ci: include the trash directories of failed tests in the trace log
2018-08-15 | Merge branch 'rs/remote-mv-leakfix' | Junio C Hamano | 1 | -2/+3
Leakfix.

* rs/remote-mv-leakfix:
  remote: clear string_list after use in mv()
2018-08-15 | Merge branch 'es/mw-to-git-chain-fix' | Junio C Hamano | 1 | -1/+1
Test fix.

* es/mw-to-git-chain-fix:
  mw-to-git/t9360: fix broken &&-chain
2018-08-15 | Merge branch 'ms/http-proto-doc' | Junio C Hamano | 1 | -2/+2
Doc fix.

* ms/http-proto-doc:
  doc: fix want-capability separator
2018-08-15 | Merge branch 'nd/pack-objects-threading-doc' | Junio C Hamano | 1 | -0/+19
Doc fix.

* nd/pack-objects-threading-doc:
  pack-objects: document about thread synchronization
2018-08-15 | Merge branch 'jn/subtree-test-fixes' | Junio C Hamano | 1 | -90/+31
Test fix.

* jn/subtree-test-fixes:
  subtree test: simplify preparation of expected results
  subtree test: add missing && to &&-chain
2018-08-15 | Merge branch 'cb/p4-pre-submit-hook' | Junio C Hamano | 4 | -1/+59
"git p4 submit" learns to ask its own pre-submit hook if it should continue with submitting.

* cb/p4-pre-submit-hook:
  git-p4: add the `p4-pre-submit` hook
2018-08-15 | Merge branch 'js/vscode' | Junio C Hamano | 6 | -12/+405
Add a script (in contrib/) to help users of VSCode work better with our codebase.

* js/vscode:
  vscode: let cSpell work on commit messages, too
  vscode: add a dictionary for cSpell
  vscode: use 8-space tabs, no trailing ws, etc for Git's source code
  vscode: wrap commit messages at column 72 by default
  vscode: only overwrite C/C++ settings
  mingw: define WIN32 explicitly
  cache.h: extract enum declaration from inside a struct declaration
  vscode: hard-code a couple defines
  contrib: add a script to initialize VS Code configuration
2018-08-15 | Merge branch 'bb/redecl-enum-fix' | Junio C Hamano | 1 | -1/+1
Compilation fix.

* bb/redecl-enum-fix:
  packfile: ensure that enum object_type is defined
2018-08-15 | Merge branch 'jk/banned-function' | Junio C Hamano | 2 | -0/+36
It is too easy to misuse system API functions such as strcat(); these selected functions are now forbidden in this codebase and will cause a compilation failure.

* jk/banned-function:
  banned.h: mark strncpy() as banned
  banned.h: mark sprintf() as banned
  banned.h: mark strcat() as banned
  automatically ban strcpy()
2018-08-15 | Merge branch 'en/merge-recursive-skip-fix' | Junio C Hamano | 2 | -0/+29
When the sparse checkout feature is in use, "git cherry-pick" and other mergy operations lost the skip_worktree bit when a path that is excluded from checkout requires content level merge, which is resolved as the same as the HEAD version, without materializing the merge result in the working tree, which made the path appear as deleted. This has been corrected by preserving the skip_worktree bit (and not materializing the file in the working tree).

* en/merge-recursive-skip-fix:
  merge-recursive: preserve skip_worktree bit when necessary
  t3507: add a testcase showing failure with sparse checkout
2018-08-15 | Merge branch 'jt/tag-following-with-proto-v2-fix' | Junio C Hamano | 2 | -4/+69
The wire-protocol v2 relies on the client to send "ref prefixes" to limit the bandwidth spent on the initial ref advertisement. "git fetch $remote branch:branch", which asks for tags that point into the history leading to the "branch" to be followed automatically, sent too narrow a prefix and broke the tag following. This has been fixed.

* jt/tag-following-with-proto-v2-fix:
  fetch: send "refs/tags/" prefix upon CLI refspecs
  t5702: test fetch with multiple refspecs at a time
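How a too-narrow prefix list hides tags can be shown with a toy model of the ref advertisement. This is a Python sketch, and `advertise()` and the ref names are hypothetical. The only protocol behaviour assumed is that the server lists refs matching any client-supplied prefix, and lists everything when no prefix is sent. The single prefix used for "branch" is a simplification of what a real client sends.

```python
def advertise(server_refs, ref_prefixes):
    """Return the refs a server would list for the given ref-prefix arguments."""
    if not ref_prefixes:
        return sorted(server_refs)  # no prefixes: everything is advertised
    return sorted(
        ref for ref in server_refs
        if any(ref.startswith(prefix) for prefix in ref_prefixes)
    )


refs = {"refs/heads/branch", "refs/heads/other", "refs/tags/v1.0"}

# Prefixes derived only from the command-line refspec: tags are invisible,
# so there is nothing for tag following to work with.
narrow = advertise(refs, ["refs/heads/branch"])

# Also sending "refs/tags/" (the fix named in the commit subject) brings
# the tags back into the advertisement.
fixed = advertise(refs, ["refs/heads/branch", "refs/tags/"])
```

With the narrow list the client never learns that `refs/tags/v1.0` exists, which is how tag following broke.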
2018-08-15 | Merge branch 'jk/size-t' | Junio C Hamano | 7 | -26/+27
Code clean-up to use size_t/ssize_t when they are the right type.

* jk/size-t:
  strbuf_humanise: use unsigned variables
  pass st.st_size as hint for strbuf_readlink()
  strbuf_readlink: use ssize_t
  strbuf: use size_t for length in intermediate variables
  reencode_string: use size_t for string lengths
  reencode_string: use st_add/st_mult helpers
2018-08-15 | Merge branch 'sg/coccicheck-updates' | Junio C Hamano | 1 | -7/+17
Update the way we use Coccinelle to find out-of-style code that needs to be modernised.

* sg/coccicheck-updates:
  coccinelle: extract dedicated make target to clean Coccinelle's results
  coccinelle: put sane filenames into output patches
  coccinelle: exclude sha1dc source files from static analysis
  coccinelle: use $(addsuffix) in 'coccicheck' make target
  coccinelle: mark the 'coccicheck' make target as .PHONY
2018-08-15 | Merge branch 'sb/histogram-less-memory' | Junio C Hamano | 1 | -55/+78
"git diff --histogram" had a bad memory usage pattern, which has been rearranged to reduce the peak usage.

* sb/histogram-less-memory:
  xdiff/histogram: remove tail recursion
  xdiff/xhistogram: move index allocation into find_lcs
  xdiff/xhistogram: factor out memory cleanup into free_index()
  xdiff/xhistogram: pass arguments directly to fall_back_to_classic_diff
2018-08-15 | Merge branch 'nd/i18n' | Junio C Hamano | 47 | -483/+502
Many more strings are prepared for l10n.

* nd/i18n: (23 commits)
  transport-helper.c: mark more strings for translation
  transport.c: mark more strings for translation
  sha1-file.c: mark more strings for translation
  sequencer.c: mark more strings for translation
  replace-object.c: mark more strings for translation
  refspec.c: mark more strings for translation
  refs.c: mark more strings for translation
  pkt-line.c: mark more strings for translation
  object.c: mark more strings for translation
  exec-cmd.c: mark more strings for translation
  environment.c: mark more strings for translation
  dir.c: mark more strings for translation
  convert.c: mark more strings for translation
  connect.c: mark more strings for translation
  config.c: mark more strings for translation
  commit-graph.c: mark more strings for translation
  builtin/replace.c: mark more strings for translation
  builtin/pack-objects.c: mark more strings for translation
  builtin/grep.c: mark strings for translation
  builtin/config.c: mark more strings for translation
  ...
2018-08-15 | Merge branch 'hs/gpgsm' | Junio C Hamano | 10 | -23/+285
Teach "git tag -s" etc. a few configuration variables (gpg.format that can be set to "openpgp" or "x509", and gpg.<format>.program that is used to specify what program to use to deal with the format) to allow x.509 certs with CMS via "gpgsm" to be used instead of openpgp via "gnupg".

* hs/gpgsm:
  gpg-interface t: extend the existing GPG tests with GPGSM
  gpg-interface: introduce new signature format "x509" using gpgsm
  gpg-interface: introduce new config to select per gpg format program
  gpg-interface: do not hardcode the key string len anymore
  gpg-interface: introduce an abstraction for multiple gpg formats
  t/t7510: check the validation of the new config gpg.format
  gpg-interface: add new config to select how to sign a commit
2018-08-15 | Merge branch 'bw/clone-ref-prefixes' | Junio C Hamano | 2 | -6/+21
The wire-protocol v2 relies on the client to send "ref prefixes" to limit the bandwidth spent on the initial ref advertisement. When "git clone" learned to speak v2, it forgot to do so, which has been corrected.

* bw/clone-ref-prefixes:
  clone: send ref-prefixes when using protocol v2
2018-08-15 | Merge branch 'jk/core-use-replace-refs' | Junio C Hamano | 16 | -17/+31
A new configuration variable core.usereplacerefs has been added, primarily to help server installations that want to ignore the replace mechanism altogether.

* jk/core-use-replace-refs:
  add core.usereplacerefs config option
  check_replace_refs: rename to read_replace_refs
  check_replace_refs: fix outdated comment
2018-08-15 | Merge branch 'jh/json-writer' | Junio C Hamano | 8 | -0/+1471
Preparatory code to later add json output for telemetry data.

* jh/json-writer:
  json_writer: new routines to create JSON data
2018-08-15 | Merge branch 'bb/make-developer-pedantic' | Junio C Hamano | 2 | -0/+10
"make DEVELOPER=1 DEVOPTS=pedantic" allows developers to compile with -pedantic option, which may catch more problematic program constructs and potential bugs.

* bb/make-developer-pedantic:
  Makefile: add a DEVOPTS flag to get pedantic compilation