summaryrefslogtreecommitdiff
path: root/builtin
AgeCommit message (Collapse)AuthorFilesLines
2021-10-18Merge branch 'jk/cat-file-batch-all-wo-replace'Libravatar Junio C Hamano1-14/+40
"git cat-file --batch" with the "--batch-all-objects" option is supposed to iterate over all the objects found in a repository, but it used to translate these object names using the replace mechanism, which defeats the point of enumerating all objects in the repository. This has been corrected. * jk/cat-file-batch-all-wo-replace: cat-file: use packed_object_info() for --batch-all-objects cat-file: split ordered/unordered batch-all-objects callbacks cat-file: disable refs/replace with --batch-all-objects cat-file: mention --unordered along with --batch-all-objects t1006: clean up broken objects
2021-10-18Merge branch 'ab/designated-initializers-more'Libravatar Junio C Hamano3-74/+69
Code clean-up. * ab/designated-initializers-more: builtin/remote.c: add and use SHOW_INFO_INIT builtin/remote.c: add and use a REF_STATES_INIT urlmatch.[ch]: add and use URLMATCH_CONFIG_INIT builtin/blame.c: refactor commit_info_init() to COMMIT_INFO_INIT macro daemon.c: refactor hostinfo_init() to HOSTINFO_INIT macro
2021-10-18Merge branch 'tb/repack-write-midx'Libravatar Junio C Hamano2-36/+282
"git repack" has been taught to generate multi-pack reachability bitmaps. * tb/repack-write-midx: test-read-midx: fix leak of bitmap_index struct builtin/repack.c: pass `--refs-snapshot` when writing bitmaps builtin/repack.c: make largest pack preferred builtin/repack.c: support writing a MIDX while repacking builtin/repack.c: extract showing progress to a variable builtin/repack.c: rename variables that deal with non-kept packs builtin/repack.c: keep track of existing packs unconditionally midx: preliminary support for `--refs-snapshot` builtin/multi-pack-index.c: support `--stdin-packs` mode midx: expose `write_midx_file_only()` publicly
2021-10-18Merge branch 'js/retire-preserve-merges'Libravatar Junio C Hamano3-330/+20
The "--preserve-merges" option of "git rebase" has been removed. * js/retire-preserve-merges: sequencer: restrict scope of a formerly public function rebase: remove a no-longer-used function rebase: stop mentioning the -p option in comments rebase: remove obsolete code comment rebase: drop the internal `rebase--interactive` command git-svn: drop support for `--preserve-merges` rebase: drop support for `--preserve-merges` pull: remove support for `--rebase=preserve` tests: stop testing `git rebase --preserve-merges` remote: warn about unhandled branch.<name>.rebase values t5520: do not use `pull.rebase=preserve`
2021-10-13Merge branch 'ab/align-parse-options-help'Libravatar Junio C Hamano5-12/+12
When "git cmd -h" shows more than one line of usage text (e.g. the cmd subcommand may take sub-sub-command), parse-options API learned to align these lines, even across i18n/l10n. * ab/align-parse-options-help: parse-options: properly align continued usage output git rev-parse --parseopt tests: add more usagestr tests send-pack: properly use parse_options() API for usage string parse-options API users: align usage output in C-strings
2021-10-13Merge branch 'ab/help-config-vars'Libravatar Junio C Hamano1-48/+83
Teach "git help -c" into helping the command line completion of configuration variables. * ab/help-config-vars: help: move column config discovery to help.c library help / completion: make "git help" do the hard work help tests: test --config-for-completion option & output help: simplify by moving to OPT_CMDMODE() help: correct logic error in combining --all and --guides help: correct logic error in combining --all and --config help tests: add test for --config output help: correct usage & behavior of "git help --guides" help: correct the usage string in -h and documentation
2021-10-13Merge branch 'ab/config-based-hooks-1'Libravatar Junio C Hamano6-40/+17
Mostly preliminary clean-up in the hook API. * ab/config-based-hooks-1: hook-list.h: add a generated list of hooks, like config-list.h hook.c users: use "hook_exists()" instead of "find_hook()" hook.c: add a hook_exists() wrapper and use it in bugreport.c hook.[ch]: move find_hook() from run-command.c to hook.c Makefile: remove an out-of-date comment Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H) Makefile: stop hardcoding {command,config}-list.h Makefile: mark "check" target as .PHONY
2021-10-13Merge branch 'en/removing-untracked-fixes'Libravatar Junio C Hamano8-25/+35
Various fixes in code paths that move untracked files away to make room. * en/removing-untracked-fixes: Documentation: call out commands that nuke untracked files/directories Comment important codepaths regarding nuking untracked files/dirs unpack-trees: avoid nuking untracked dir in way of locally deleted file unpack-trees: avoid nuking untracked dir in way of unmerged file Change unpack_trees' 'reset' flag into an enum Remove ignored files by default when they are in the way unpack-trees: make dir an internal-only struct unpack-trees: introduce preserve_ignored to unpack_trees_options read-tree, merge-recursive: overwrite ignored files by default checkout, read-tree: fix leak of unpack_trees_options.dir t2500: add various tests for nuking untracked files
2021-10-13Merge branch 'ds/add-rm-with-sparse-index'Libravatar Junio C Hamano3-14/+80
"git add", "git mv", and "git rm" have been adjusted to avoid updating paths outside of the sparse-checkout definition unless the user specifies a "--sparse" option. * ds/add-rm-with-sparse-index: advice: update message to suggest '--sparse' mv: refuse to move sparse paths rm: skip sparse paths with missing SKIP_WORKTREE rm: add --sparse option add: update --renormalize to skip sparse paths add: update --chmod to skip sparse paths add: implement the --sparse option add: skip tracked paths outside sparse-checkout cone add: fail when adding an untracked sparse file dir: fix pattern matching on dirs dir: select directories correctly t1092: behavior for adding sparse files t3705: test that 'sparse_entry' is unstaged
2021-10-11Merge branch 'mr/bisect-in-c-4'Libravatar Junio C Hamano1-1/+1
Message fix. * mr/bisect-in-c-4: bisect--helper: add space between colon and following sentence
2021-10-11Merge branch 'da/difftool'Libravatar Junio C Hamano1-51/+53
Code clean-up in "git difftool". * da/difftool: difftool: add a missing space to the run_dir_diff() comments difftool: remove an unnecessary call to strbuf_release() difftool: refactor dir-diff to write files using helper functions difftool: create a tmpdir path without repeated slashes
2021-10-11Merge branch 'ab/designated-initializers'Libravatar Junio C Hamano1-10/+11
Code clean-up. * ab/designated-initializers: cbtree.h: define cb_init() in terms of CBTREE_INIT *.h: move some *_INIT to designated initializers *.h _INIT macros: don't specify fields equal to 0 *.[ch] *_INIT macros: use { 0 } for a "zero out" idiom submodule-config.h: remove unused SUBMODULE_INIT macro
2021-10-11Merge branch 'jk/ref-paranoia'Libravatar Junio C Hamano5-8/+4
The ref iteration code used to optionally allow dangling refs to be shown, which has been tightened up. * jk/ref-paranoia: refs: drop "broken" flag from for_each_fullref_in() ref-filter: drop broken-ref code entirely ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN repack, prune: drop GIT_REF_PARANOIA settings refs: turn on GIT_REF_PARANOIA by default refs: omit dangling symrefs when using GIT_REF_PARANOIA refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag refs-internal.h: reorganize DO_FOR_EACH_* flag documentation refs-internal.h: move DO_FOR_EACH_* flags next to each other t5312: be more assertive about command failure t5312: test non-destructive repack t5312: create bogus ref as necessary t5312: drop "verbose" helper t5600: provide detached HEAD for corruption failures t5516: don't use HEAD ref for invalid ref-deletion tests t7900: clean up some more broken refs
2021-10-11Merge branch 'tb/midx-write-propagate-namehash'Libravatar Junio C Hamano1-0/+21
"git multi-pack-index write --bitmap" learns to propagate the hashcache from original bitmap to resulting bitmap. * tb/midx-write-propagate-namehash: t5326: test propagating hashcache values p5326: generate pack bitmaps before writing the MIDX bitmap p5326: don't set core.multiPackIndex unnecessarily p5326: create missing 'perf-tag' tag midx.c: respect 'pack.writeBitmapHashcache' when writing bitmaps pack-bitmap.c: propagate namehash values from existing bitmaps t/helper/test-bitmap.c: add 'dump-hashes' mode
2021-10-08cat-file: use packed_object_info() for --batch-all-objectsLibravatar Jeff King1-14/+36
When "cat-file --batch-all-objects" iterates over each object, it knows where to find each one. But when we look up details of the object, we don't use that information at all. This patch teaches it to use the pack/offset pair when we're iterating over objects in a pack. This yields a measurable speed improvement (timings on a fully packed clone of linux.git): Benchmark #1: ./git.old cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)" Time (mean ± σ): 8.128 s ± 0.118 s [User: 7.968 s, System: 0.156 s] Range (min … max): 8.007 s … 8.301 s 10 runs Benchmark #2: ./git.new cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)" Time (mean ± σ): 4.294 s ± 0.064 s [User: 4.167 s, System: 0.125 s] Range (min … max): 4.227 s … 4.457 s 10 runs Summary './git.new cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)"' ran 1.89 ± 0.04 times faster than './git.old cat-file --batch-all-objects --unordered --batch-check="%(objecttype) %(objectname)" The implementation is pretty simple: we just call packed_object_info() instead of oid_object_info_extended() when we can. Most of the changes are just plumbing the pack/offset pair through the callstack. There is one subtlety: replace lookups are not handled by packed_object_info(). But since those are disabled for --batch-all-objects, and since we'll only have pack info when that option is in effect, we don't have to worry about that. There are a few limitations to this optimization which we could address with further work: - I didn't bother recording when we found an object loose. Technically this could save us doing a fruitless lookup in the pack index. But opening and mmap-ing a loose object is so expensive in the first place that this doesn't matter much. And if your repository is large enough to care about per-object performance, most objects are going to be packed anyway. - This works only in --unordered mode. For the sorted mode, we'd have to record the pack/offset pair as part of our oid-collection. That's more code, plus at least 16 extra bytes of heap per object. It would probably still be a net win in runtime, but we'd need to measure. - For --batch, this still helps us with getting the object metadata, but we still do a from-scratch lookup for the object contents. This probably doesn't matter that much, because the lookup cost will be much smaller relative to the cost of actually unpacking and printing the objects. For small objects, we could probably swap out read_object_file() for using packed_object_info() with a "object_info.contentp" to get the contents. But we'd still need to deal with streaming for larger objects. A better path forward here is to teach the initial oid_object_info_extended() / packed_object_info() calls to retrieve the contents of smaller objects while they are already being accessed. That would save the extra lookup entirely. But it's a non-trivial feature to add to the object_info code, so I left it for now. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08cat-file: split ordered/unordered batch-all-objects callbacksLibravatar Jeff King1-1/+3
When we originally added --batch-all-objects, it stuffed everything into an oid_array(), and then iterated over that array with a callback to write the actual output. When we later added --unordered, that code path writes immediately as we discover each object, but just calls the same batch_object_cb() as our entry point to the writing code. That callback has a narrow interface; it only receives the oid, but we know much more about each object in the unordered write (which we'll make use of in the next patch). So let's just call batch_object_write() directly. The callback wasn't saving us much effort. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08cat-file: disable refs/replace with --batch-all-objectsLibravatar Jeff King1-0/+2
When we're enumerating all objects in the object database, it doesn't make sense to respect refs/replace. The point of this option is to enumerate all of the objects in the database at a low level. By definition we'd already show the replacement object's contents (under its real oid), and showing those contents under another oid is almost certainly working against what the user is trying to do. Note that you could make the same argument for something like: git show-index <foo.idx | awk '{print $2}' | git cat-file --batch but there we can't know in cat-file exactly what the user intended, because we don't know the source of the input. They could be trying to do low-level debugging, or they could be doing something more high-level (e.g., imagine a porcelain built around cat-file for its object accesses). So in those cases, we'll have to rely on the user specifying "git --no-replace-objects" to tell us what to do. One _could_ make an argument that "cat-file --batch" is sufficiently low-level plumbing that it should not respect replace-objects at all (and the caller should do any replacement if they want it). But we have been doing so for some time. The history is a little tangled: - looking back as far as v1.6.6, we would not respect replace refs for --batch-check, but would for --batch (because the former used sha1_object_info(), and the replace mechanism only affected actual object reads) - this discrepancy was made even weirder by 98e2092b50 (cat-file: teach --batch to stream blob objects, 2013-07-10), where we always output the header using the --batch-check code, and then printed the object separately. This could lead to "cat-file --batch" dying (when it notices the size or type changed for a non-blob object) or even producing bogus output (in streaming mode, we didn't notice that we wrote the wrong number of bytes). - that persisted until 1f7117ef7a (sha1_file: perform object replacement in sha1_object_info_extended(), 2013-12-11), which then respected replace refs for both forms. So it has worked reliably this way for over 7 years, and we should make sure it continues to do so. That could also be an argument that --batch-all-objects should not change behavior (which this patch is doing), but I really consider the current behavior to be an unintended bug. It's a side effect of how the code is implemented (feeding the oids back into oid_object_info() rather than looking at what we found while reading the loose and packed object storage). The implementation is straight-forward: we just disable the global read_replace_refs flag when we're in --batch-all-objects mode. It would perhaps be a little cleaner to change the flag we pass to oid_object_info_extended(), but that's not enough. We also read objects via read_object_file() and stream_blob_to_fd(). The former could switch to its _extended() form, but the streaming code has no mechanism for disabling replace refs. Setting the global flag works, and as a bonus, it's impossible to have any "oops, we're sometimes replacing the object and sometimes not" bugs in the output (like the ones caused by 98e2092b50 above). The tests here cover the regular-input and --batch-all-objects cases, for both --batch-check and --batch. There is a test in t6050 that covers the regular-input case with --batch already, but this new one goes much further in actually verifying the output (plus covering --batch-check explicitly). This is perhaps a little overkill and the tests would be simpler just covering --batch-check, but I wanted to make sure we're checking that --batch output is consistent between the header and the content. The global-flag technique used here makes that easy to get right, but this is future-proofing us against regressions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06Merge branch 'tb/commit-graph-usage-fix'Libravatar Junio C Hamano2-7/+30
Regression in "git commit-graph" command line parsing has been corrected. * tb/commit-graph-usage-fix: builtin/multi-pack-index.c: disable top-level --[no-]progress builtin/commit-graph.c: don't accept common --[no-]progress
2021-10-06Merge branch 'pw/rebase-of-a-tag-fix'Libravatar Junio C Hamano1-28/+21
"git rebase <upstream> <tag>" failed when aborted in the middle, as it mistakenly tried to write the tag object instead of peeling it to HEAD. * pw/rebase-of-a-tag-fix: rebase: dereference tags rebase: use lookup_commit_reference_by_name() rebase: use our standard error return value t3407: rework rebase --quit tests t3407: strengthen rebase --abort tests t3407: use test_path_is_missing t3407: rename a variable t3407: use test_cmp_rev t3407: use test_commit t3407: run tests in $TEST_DIRECTORY
2021-10-06Merge branch 'jt/add-submodule-odb-clean-up'Libravatar Junio C Hamano3-13/+3
More code paths that use the hack to add submodule's object database to the set of alternate object store have been cleaned up. * jt/add-submodule-odb-clean-up: revision: remove "submodule" from opt struct repository: support unabsorbed in repo_submodule_init submodule: remove unnecessary unabsorbed fallback
2021-10-03Merge branch 'bs/difftool-msg-tweak'Libravatar Junio C Hamano1-3/+3
Message tweak. * bs/difftool-msg-tweak: difftool: fix word spacing in the usage strings
2021-10-03Merge branch 'ab/bundle-remove-verbose-option'Libravatar Junio C Hamano1-3/+0
Doc update. * ab/bundle-remove-verbose-option: bundle: remove ignored & undocumented "--verbose" flag
2021-10-03Merge branch 'bs/ls-files-opt-help-text-update'Libravatar Junio C Hamano1-2/+2
Help text for "ls-files" options have been updated. * bs/ls-files-opt-help-text-update: ls-files: use imperative mood for -X and -z option description
2021-10-03Merge branch 'da/difftool-dir-diff-symlink-fix'Libravatar Junio C Hamano1-0/+2
"git difftool --dir-diff" mishandled symbolic links. * da/difftool-dir-diff-symlink-fix: difftool: fix symlink-file writing in dir-diff mode
2021-10-03Merge branch 'ab/refs-files-cleanup'Libravatar Junio C Hamano1-7/+6
Continued work on top of the hn/refs-errno-cleanup topic. * ab/refs-files-cleanup: refs/files: remove unused "errno != ENOTDIR" condition refs/files: remove unused "errno == EISDIR" code refs/files: remove unused "oid" in lock_ref_oid_basic() refs API: remove OID argument to reflog_expire() reflog expire: don't lock reflogs using previously seen OID refs/files: add a comment about refs_reflog_exists() call refs: make repo_dwim_log() accept a NULL oid refs/debug: re-indent argument list for "prepare" refs/files: remove unused "skip" in lock_raw_ref() too refs/files: remove unused "extras/skip" in lock_ref_oid_basic() refs: drop unused "flags" parameter to lock_ref_oid_basic() refs/files: remove unused REF_DELETING in lock_ref_oid_basic() refs/packet: add missing BUG() invocations to reflog callbacks
2021-10-03Merge branch 'jk/clone-unborn-head-in-bare'Libravatar Junio C Hamano1-16/+17
"git clone" from a repository whose HEAD is unborn into a bare repository didn't follow the branch name the other side used, which is corrected. * jk/clone-unborn-head-in-bare: clone: handle unborn branch in bare repos
2021-10-03Merge branch 'en/stash-df-fix'Libravatar Junio C Hamano1-3/+17
"git stash", where the tentative change involves changing a directory to a file (or vice versa), was confused, which has been corrected. * en/stash-df-fix: stash: restore untracked files AFTER restoring tracked files stash: avoid feeding directories to update-index t3903: document a pair of directory/file bugs
2021-10-01builtin/repack.c: pass `--refs-snapshot` when writing bitmapsLibravatar Taylor Blau1-0/+82
To prevent the race described in an earlier patch, generate and pass a reference snapshot to the multi-pack bitmap code, if we are writing one from `git repack`. This patch is mostly limited to creating a temporary file, and then calling for_each_ref(). Except we try to minimize duplicates, since doing so can drastically reduce the size in network-of-forks style repositories. In the kernel's fork network (the repository containing all objects from the kernel and all its forks), deduplicating the references drops the snapshot size from 934 MB to just 12 MB. But since we're handling duplicates in this way, we have to make sure that we preferred references (those listed in pack.preferBitmapTips) before non-preferred ones (to avoid recording an object which is pointed at by a preferred tip as non-preferred). We accomplish this by doing separate passes over the references: first visiting each prefix in pack.preferBitmapTips, and then over the rest of the references. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01bisect--helper: add space between colon and following sentenceLibravatar Bagas Sanjaya1-1/+1
Add missing space between colon sentence (`bisect-run failed:`) and the following sentence (`git bisect--helper --bisect-state`). Fixes: d1bbbe45df (bisect--helper: reimplement `bisect_run` shell function in C, 2021-09-13) Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01builtin/remote.c: add and use SHOW_INFO_INITLibravatar Ævar Arnfjörð Bjarmason1-45/+45
In the preceding commit we introduced REF_STATES_INIT, but did not change the "struct show_info" to have a corresponding initializer. Let's do that, and make it use "REF_STATES_INIT" and "STRING_LIST_INIT_DUP", doing that requires changing "list" and "states" away from being pointers. The resulting end-state is simpler since we omit the local "info_list" and "states" variables in show() as well as the memset(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01builtin/remote.c: add and use a REF_STATES_INITLibravatar Ævar Arnfjörð Bjarmason1-12/+11
Use a new REF_STATES_INIT designated initializer instead of assigning to the "strdup_strings" member of the previously memzero()'d version of this struct. The pattern of assigning to "strdup_strings" dates back to 211c89682ee (Make git-remote a builtin, 2008-02-29) (when it was "strdup_paths"), i.e. long before we used anything like our current established *_INIT patterns consistently. Then in e61e0cc6b70 (builtin-remote: teach show to display remote HEAD, 2009-02-25) and e5dcbfd9ab7 (builtin-remote: new show output style for push refspecs, 2009-02-25) we added some more of these. As it turns out we only initialized this struct three times, all the other uses were of pointers to those initialized structs. So let's initialize it in those three places, skip the memset(), and pass those structs down appropriately. This would be a behavior change if we had codepaths that relied say on implicitly having had "new_refs" initialized to STRING_LIST_INIT_NODUP with the memset(), but only set the "strdup_strings" on some other struct, but then called string_list_append() on "new_refs". There isn't any such codepath, all of the late assignments to "strdup_strings" assigned to those structs that we'd use for those codepaths. So just initializing them all up-front makes for easier to understand code, i.e. in the pre-image it looked as though we had that tricky edge case, but we didn't. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-01urlmatch.[ch]: add and use URLMATCH_CONFIG_INITLibravatar Ævar Arnfjörð Bjarmason1-1/+1
Change the initialization pattern of "struct urlmatch_config" to use an *_INIT macro and designated initializers. Right now there's no other "struct" member of "struct urlmatch_config" which would require its own *_INIT, but it's good practice not to assume that. Let's also change this to a designated initializer while we're at it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30difftool: add a missing space to the run_dir_diff() commentsLibravatar David Aguilar1-1/+1
Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30difftool: remove an unnecessary call to strbuf_release()Libravatar David Aguilar1-2/+0
The `buf` strbuf is reused again later in the same function, so there is no benefit to calling strbuf_release(). The subsequent usage is already using strbuf_reset() to reset the buffer, so releasing it early is only going to lead to a wasteful reallocation. Remove the early call to strbuf_release(). The same strbuf is already cleaned up in the "finish:" section so nothing is leaked, either. Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30difftool: refactor dir-diff to write files using helper functionsLibravatar David Aguilar1-22/+28
Add a helpers function to handle the unlinking and writing of the dir-diff submodule and symlink stand-in files. Use the helpers to implement the guts of the hashmap loops. This eliminate duplicate code and safeguards the submodules hashmap loop against the symlink-chasing behavior that 5bafb3576a (difftool: fix symlink-file writing in dir-diff mode, 2021-09-22) addressed. The submodules loop should not strictly require the unlink() call that this is introducing to them, but it does not necessarily hurt them either beyond the cost of the extra unlink(). Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-30difftool: create a tmpdir path without repeated slashesLibravatar David Aguilar1-26/+24
The paths generated by difftool are passed to user-facing diff tools. Using paths with repeated slashes in them is a cosmetic blemish that is exposed to users and can be avoided. Use a strbuf to create the buffer used for the dir-diff tmpdir. Strip trailing slashes from the value read from TMPDIR to avoid repeated slashes in the generated paths. Adjust the error handling to avoid leaking strbufs and to avoid returning -1 to cmd_main(). Signed-off-by: David Aguilar <davvid@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: make largest pack preferredLibravatar Taylor Blau1-1/+26
When repacking into a geometric series and writing a multi-pack bitmap, it is beneficial to have the largest resulting pack be the preferred object source in the bitmap's MIDX, since selecting the large packs can lead to fewer broken delta chains and better compression. Teach 'git repack' to identify this pack and pass it to the MIDX write machinery in order to mark it as preferred. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: support writing a MIDX while repackingLibravatar Taylor Blau1-19/+119
Teach `git repack` a new `--write-midx` option for callers that wish to persist a multi-pack index in their repository while repacking. There are two existing alternatives to this new flag, but they don't cover our particular use-case. These alternatives are: - Call 'git multi-pack-index write' after running 'git repack', or - Set 'GIT_TEST_MULTI_PACK_INDEX=1' in your environment when running 'git repack'. The former works, but introduces a gap in bitmap coverage between repacking and writing a new MIDX (since the repack may have deleted a pack included in the existing MIDX, invalidating it altogether). Setting the 'GIT_TEST_' environment variable is obviously unsupported. In fact, even if it were supported officially, it still wouldn't work, because it generates the MIDX *after* redundant packs have been dropped, leading to the same issue as above. Introduce a new option which eliminates this race by teaching `git repack` to generate the MIDX at the critical point: after the new packs have been written and moved into place, but before the redundant packs have been removed. This option is compatible with `git repack`'s '--bitmap' option (it changes the interpretation to be: "write a bitmap corresponding to the MIDX after one has been generated"). There is a little bit of additional noise in the patch below to avoid repeating ourselves when selecting which packs to delete. Instead of a single loop as before (where we iterate over 'existing_packs', decide if a pack is worth deleting, and if so, delete it), we have two loops (the first where we decide which ones are worth deleting, and the second where we actually do the deleting). This makes it so we have a single check we can make consistently when (1) telling the MIDX which packs we want to exclude, and (2) actually unlinking the redundant packs. There is also a tiny change to short-circuit the body of write_midx_included_packs() when no packs remain in the case of an empty repository. The MIDX code does not handle this, so avoid trying to generate a MIDX covering zero packs in the first place. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: extract showing progress to a variableLibravatar Taylor Blau1-1/+2
We only ask whether stderr is a tty before calling 'prune_packed_objects()', but the subsequent patch will add another use. Extract this check into a variable so that both can use it without having to call 'isatty()' twice. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: rename variables that deal with non-kept packsLibravatar Taylor Blau1-11/+11
The new variable `existing_kept_packs` (and corresponding parameter `fname_kept_list`) added by the previous patch make it seem like `existing_packs` and `fname_list` are each subsets of the other two respectively. In reality, each pair is disjoint: one stores the packs without .keep files, and the other stores the packs with .keep files. Rename each to more clearly reflect this. Suggested-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/repack.c: keep track of existing packs unconditionallyLibravatar Taylor Blau1-25/+31
In order to be able to write a multi-pack index during repacking, `git repack` must keep track of which packs it wants to write into the MIDX. This set is the union of existing packs which will not be deleted, new pack(s) generated as a result of the repack, and .keep packs. Prior to this patch, `git repack` populated the list of existing packs only when repacking all-into-one (i.e., with `-A` or `-a`), but we will soon need to know this list when repacking when writing a MIDX without a-i-o. Populate the list of existing packs unconditionally, and guard removing packs from that list only when repacking a-i-o. Additionally, keep track of filenames of kept packs separately, since this, too, will be used in an upcoming patch. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28midx: preliminary support for `--refs-snapshot`Libravatar Taylor Blau2-4/+9
To figure out which commits we can write a bitmap for, the multi-pack index/bitmap code does a reachability traversal, marking any commit which can be found in the MIDX as eligible to receive a bitmap. This approach will cause a problem when multi-pack bitmaps are able to be generated from `git repack`, since the reference tips can change during the repack. Even though we ignore commits that don't exist in the MIDX (when doing a scan of the ref tips), it's possible that a commit in the MIDX reaches something that isn't. This can happen when a multi-pack index contains some pack which refers to loose objects (e.g., if a pack was pushed after starting the repack but before generating the MIDX which depends on an object which is stored as loose in the repository, and by definition isn't included in the multi-pack index). By taking a snapshot of the references before we start repacking, we can close that race window. In the above scenario (where we have a packed object pointing at a loose one), we'll either (a) take a snapshot of the references before seeing the packed one, or (b) take it after, at which point we can guarantee that the loose object will be packed and included in the MIDX. This patch does just that. It writes a temporary "reference snapshot", which is a list of OIDs that are at the ref tips before writing a multi-pack bitmap. References that are "preferred" (i.e,. are a suffix of at least one value of the 'pack.preferBitmapTips' configuration) are marked with a special '+'. The format is simple: one line per commit at each tip, with an optional '+' at the beginning (for preferred references, as described above). When provided, the reference snapshot is used to drive bitmap selection instead of the MIDX code doing its own traversal. When it isn't provided, the usual traversal takes place instead. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28builtin/multi-pack-index.c: support `--stdin-packs` modeLibravatar Taylor Blau1-0/+27
To power a new `--write-midx` mode, `git repack` will want to write a multi-pack index containing a certain set of packs in the repository. This new option will be used by `git repack` to write a MIDX which contains only the packs which will survive after the repack (that is, it will exclude any packs which are about to be deleted). This patch effectively exposes the function implemented in the previous commit via the `git multi-pack-index` builtin. An alternative approach would have been to call that function from the `git repack` builtin directly, but this introduces awkward problems around closing and reopening the object store, so the MIDX will be written out-of-process. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28mv: refuse to move sparse pathsLibravatar Derrick Stolee1-9/+43
Since cmd_mv() does not operate on cache entries and instead directly checks the filesystem, we can only use path_in_sparse_checkout() as a mechanism for seeing if a path is sparse or not. Be sure to skip returning a failure if '-k' is specified. To ensure that the advice around sparse paths is the only reason a move failed, be sure to check this as the very last thing before inserting into the src_for_dst list. The tests cover a variety of cases such as whether the target is tracked or untracked, and whether the source or destination are in or outside of the sparse-checkout definition. Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28rm: skip sparse paths with missing SKIP_WORKTREELibravatar Derrick Stolee1-1/+3
If a path does not match the sparse-checkout cone but is somehow missing the SKIP_WORKTREE bit, then 'git rm' currently succeeds in removing the file. One reason a user might be in this situation is a merge conflict outside of the sparse-checkout cone. Removing such a file might be problematic for users who are not sure what they are doing. Add a check to path_in_sparse_checkout() when 'git rm' is checking if a path should be considered for deletion. Of course, this check is ignored if the '--sparse' option is specified, allowing users who accept the risks to continue with the removal. This also removes a confusing behavior where a user asks for a directory to be removed, but only the entries that are within the sparse-checkout definition are removed. Now, 'git rm <dir>' will fail without '--sparse' and will succeed in removing all contained paths with '--sparse'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28rm: add --sparse optionLibravatar Derrick Stolee1-2/+6
As we did previously in 'git add', add a '--sparse' option to 'git rm' that allows modifying paths outside of the sparse-checkout definition. The existing checks in 'git rm' are restricted to tracked files that have the SKIP_WORKTREE bit in the current index. Future changes will cause 'git rm' to reject removing paths outside of the sparse-checkout definition, even if they are untracked or do not have the SKIP_WORKTREE bit. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28add: update --renormalize to skip sparse pathsLibravatar Derrick Stolee1-1/+3
We added checks for path_in_sparse_checkout() to portions of 'git add' that add warnings and prevent stagins a modification, but we skipped the --renormalize mode. Update renormalize_tracked_files() to ignore cache entries whose path is outside of the sparse-checkout cone (unless --sparse is provided). Add a test in t3705. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28add: update --chmod to skip sparse pathsLibravatar Derrick Stolee1-1/+3
We added checks for path_in_sparse_checkout() to portions of 'git add' that add warnings and prevent staging a modification, but we skipped the --chmod mode. Update chmod_pathspec() to ignore cache entries whose path is outside of the sparse-checkout cone (unless --sparse is provided). Add a test in t3705. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28add: implement the --sparse optionLibravatar Derrick Stolee1-4/+8
We previously modified 'git add' to refuse updating index entries outside of the sparse-checkout cone. This is justified to prevent users from accidentally getting into a confusing state when Git removes those files from the working tree at some later point. Unfortunately, this caused some workflows that were previously possible to become impossible, especially around merge conflicts outside of the sparse-checkout cone. These were documented in tests within t1092. We now re-enable these workflows using a new '--sparse' option to 'git add'. This allows users to signal "Yes, I do know what I'm doing with these files," and accept the consequences of the files leaving the worktree later. We delay updating the advice message until implementing a similar option in 'git rm' and 'git mv'. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28add: skip tracked paths outside sparse-checkout coneLibravatar Derrick Stolee1-0/+4
When 'git add' adds a tracked file that is outside of the sparse-checkout cone, it checks the SKIP_WORKTREE bit to see if the file exists outside of the sparse-checkout cone. This is usually correct, except in the case of a merge conflict outside of the cone. Modify add_pathspec_matched_against_index() to be more careful about paths by checking the sparse-checkout patterns in addition to the SKIP_WORKTREE bit. This causes 'git add' to no longer allow files outside of the cone that removed the SKIP_WORKTREE bit due to a merge conflict. With only this change, users will only be able to add the file after adding the file to the sparse-checkout cone. A later change will allow users to force adding even though the file is outside of the sparse-checkout cone. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>