summaryrefslogtreecommitdiff
path: root/builtin/clone.c
AgeCommit message (Collapse)AuthorFilesLines
2022-02-25Merge branch 'js/apply-partial-clone-filters-recursively'Libravatar Junio C Hamano1-2/+34
"git clone --filter=... --recurse-submodules" only makes the top-level a partial clone, while submodules are fully cloned. This behaviour is changed to pass the same filter down to the submodules. * js/apply-partial-clone-filters-recursively: clone, submodule: pass partial clone filters to submodules
2022-02-18Merge branch 'ab/release-transport-ls-refs-options'Libravatar Junio C Hamano1-7/+6
* ab/release-transport-ls-refs-options: ls-remote & transport API: release "struct transport_ls_refs_options"
2022-02-09clone, submodule: pass partial clone filters to submodulesLibravatar Josh Steadmon1-2/+34
When cloning a repo with a --filter and with --recurse-submodules enabled, the partial clone filter only applies to the top-level repo. This can lead to unexpected bandwidth and disk usage for projects which include large submodules. For example, a user might wish to make a partial clone of Gerrit and would run: `git clone --recurse-submodules --filter=blob:5k https://gerrit.googlesource.com/gerrit`. However, only the superproject would be a partial clone; all the submodules would have all blobs downloaded regardless of their size. With this change, the same filter can also be applied to submodules, meaning the expected bandwidth and disk savings apply consistently. To avoid changing default behavior, add a new clone flag, `--also-filter-submodules`. When this is set along with `--filter` and `--recurse-submodules`, the filter spec is passed along to git-submodule and git-submodule--helper, such that submodule clones also have the filter applied. This applies the same filter to the superproject and all submodules. Users who need to customize the filter per-submodule would need to clone with `--no-recurse-submodules` and then manually initialize each submodule with the proper filter. Applying filters to submodules should be safe thanks to Jonathan Tan's recent work [1, 2, 3] eliminating the use of alternates as a method of accessing submodule objects, so any submodule object access now triggers a lazy fetch from the submodule's promisor remote if the accessed object is missing. This patch is a reworked version of [4], which was created prior to Jonathan Tan's work. [1]: 8721e2e (Merge branch 'jt/partial-clone-submodule-1', 2021-07-16) [2]: 11e5d0a (Merge branch 'jt/grep-wo-submodule-odb-as-alternate', 2021-09-20) [3]: 162a13b (Merge branch 'jt/no-abuse-alternate-odb-for-submodules', 2021-10-25) [4]: https://lore.kernel.org/git/52bf9d45b8e2b72ff32aa773f2415bf7b2b86da2.1563322192.git.steadmon@google.com/ Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-09Merge branch 'jt/clone-not-quite-empty'Libravatar Junio C Hamano1-6/+6
Cloning from a repository that does not yet have any branches or tags but has other refs resulted in a "remote transport reported error", which has been corrected. * jt/clone-not-quite-empty: clone: support unusual remote ref configurations
2022-02-09Merge branch 'ab/config-based-hooks-2'Libravatar Junio C Hamano1-1/+2
More "config-based hooks". * ab/config-based-hooks-2: run-command: remove old run_hook_{le,ve}() hook API receive-pack: convert push-to-checkout hook to hook.h read-cache: convert post-index-change to use hook.h commit: convert {pre-commit,prepare-commit-msg} hook to hook.h git-p4: use 'git hook' to run hooks send-email: use 'git hook run' for 'sendemail-validate' git hook run: add an --ignore-missing flag hooks: convert worktree 'post-checkout' hook to hook library hooks: convert non-worktree 'post-checkout' hook to hook library merge: convert post-merge to use hook.h am: convert applypatch-msg to use hook.h rebase: convert pre-rebase to use hook.h hook API: add a run_hooks_l() wrapper am: convert {pre,post}-applypatch to use hook.h gc: use hook library for pre-auto-gc hook hook API: add a run_hooks() wrapper hook: add 'run' subcommand
2022-02-06ls-remote & transport API: release "struct transport_ls_refs_options"Libravatar Ævar Arnfjörð Bjarmason1-7/+6
Fix a memory leak in codepaths that use the "struct transport_ls_refs_options" API. Since the introduction of the struct in 39835409d10 (connect, transport: encapsulate arg in struct, 2021-02-05) the caller has been responsible for freeing it. That commit in turn migrated code originally added in 402c47d9391 (clone: send ref-prefixes when using protocol v2, 2018-07-20) and b4be74105fe (ls-remote: pass ref prefixes when requesting a remote's refs, 2018-03-15). Only some of those codepaths were releasing the allocated resources of the struct, now all of them will. Mark the "t/t5511-refspec.sh" test as passing when git is compiled with SANITIZE=leak. They'll now be listed as running under the "GIT_TEST_PASSING_SANITIZE_LEAK=true" test mode (the "linux-leaks" CI target). Previously 24/47 tests would fail. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-26clone: support unusual remote ref configurationsLibravatar Jonathan Tan1-6/+6
When cloning a branchless and tagless but not refless remote using protocol v0 or v1, Git calls transport_fetch_refs() with an empty ref list. This makes the clone fail with the message "remote transport reported error". Git should have refrained from calling transport_fetch_refs(), just like it does in the case that the remote is refless. Therefore, teach Git to do this. In protocol v2, this does not happen because the client passes ref-prefix arguments that filter out non-branches and non-tags in the ref advertisement, making the remote appear empty. Note that this bug concerns logic in builtin/clone.c and only affects cloning, not fetching. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-12Merge branch 'ps/lockfile-cleanup-fix'Libravatar Junio C Hamano1-1/+1
Some lockfile code called free() in signal-death code path, which has been corrected. * ps/lockfile-cleanup-fix: fetch: fix deadlock when cleaning up lockfiles in async signals
2022-01-10Merge branch 'ja/i18n-similar-messages'Libravatar Junio C Hamano1-3/+3
Similar message templates have been consolidated so that translators need to work on fewer number of messages. * ja/i18n-similar-messages: i18n: turn even more messages into "cannot be used together" ones i18n: ref-filter: factorize "%(foo) atom used without %(bar) atom" i18n: factorize "--foo outside a repository" i18n: refactor "unrecognized %(foo) argument" strings i18n: factorize "no directory given for --foo" i18n: factorize "--foo requires --bar" and the like i18n: tag.c factorize i18n strings i18n: standardize "cannot open" and "cannot read" i18n: turn "options are incompatible" into "cannot be used together" i18n: refactor "%s, %s and %s are mutually exclusive" i18n: refactor "foo and bar are mutually exclusive"
2022-01-07hooks: convert non-worktree 'post-checkout' hook to hook libraryLibravatar Emily Shaffer1-1/+2
Move the running of the 'post-checkout' hook away from run-command.h to the new hook.h library, except in the case of builtin/worktree.c. That special-case will be handled in a subsequent commit. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-07fetch: fix deadlock when cleaning up lockfiles in async signalsLibravatar Patrick Steinhardt1-1/+1
When fetching packfiles, we write a bunch of lockfiles for the packfiles we're writing into the repository. In order to not leave behind any cruft in case we exit or receive a signal, we register both an exit handler as well as signal handlers for common signals like SIGINT. These handlers will then unlink the locks and free the data structure tracking them. We have observed a deadlock in this logic though: (gdb) bt #0 __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95 #1 0x00007f4932bea2cd in _int_free (av=0x7f4932f2eb20 <main_arena>, p=0x3e3e4200, have_lock=0) at malloc.c:3969 #2 0x00007f4932bee58c in __GI___libc_free (mem=<optimized out>) at malloc.c:2975 #3 0x0000000000662ab1 in string_list_clear () #4 0x000000000044f5bc in unlock_pack_on_signal () #5 <signal handler called> #6 _int_free (av=0x7f4932f2eb20 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:4024 #7 0x00007f4932bee58c in __GI___libc_free (mem=<optimized out>) at malloc.c:2975 #8 0x000000000065afd5 in strbuf_release () #9 0x000000000066ddb9 in delete_tempfile () #10 0x0000000000610d0b in files_transaction_cleanup.isra () #11 0x0000000000611718 in files_transaction_abort () #12 0x000000000060d2ef in ref_transaction_abort () #13 0x000000000060d441 in ref_transaction_prepare () #14 0x000000000060e0b5 in ref_transaction_commit () #15 0x00000000004511c2 in fetch_and_consume_refs () #16 0x000000000045279a in cmd_fetch () #17 0x0000000000407c48 in handle_builtin () #18 0x0000000000408df2 in cmd_main () #19 0x00000000004078b5 in main () The process was killed with a signal, which caused the signal handler to kick in and try free the data structures after we have unlinked the locks. It then deadlocks while calling free(3P). The root cause of this is that it is not allowed to call certain functions in async-signal handlers, as specified by signal-safety(7). Next to most I/O functions, this list of disallowed functions also includes memory-handling functions like malloc(3P) and free(3P) because they may not be reentrant. As a result, if we execute such functions in the signal handler, then they may operate on inconistent state and fail in unexpected ways. Fix this bug by not calling non-async-signal-safe functions when running in the signal handler. We're about to re-raise the signal anyway and will thus exit, so it's not much of a problem to keep the string list of lockfiles untouched. Note that it's fine though to call unlink(2), so we'll still clean up the lockfiles correctly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-05i18n: turn even more messages into "cannot be used together" onesLibravatar Jean-Noël Avila1-2/+2
Even if some of these messages are not subject to gettext i18n, this helps bring a single style of message for a given error type. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-05i18n: turn "options are incompatible" into "cannot be used together"Libravatar Jean-Noël Avila1-1/+1
Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15clone: avoid using deprecated `sparse-checkout init`Libravatar Elijah Newren1-1/+1
The previous commits marked `sparse-checkout init` as deprecated; we can just use `set` instead here and pass it no paths. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-23clone: fix a memory leak of the "git_dir" variableLibravatar Ævar Arnfjörð Bjarmason1-1/+3
At this point in cmd_clone the "git_dir" is always either an xstrdup()'d string, or something we got from mkpathdup(). Let's free() it before we clobber it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-13Merge branch 'en/removing-untracked-fixes'Libravatar Junio C Hamano1-0/+1
Various fixes in code paths that move untracked files away to make room. * en/removing-untracked-fixes: Documentation: call out commands that nuke untracked files/directories Comment important codepaths regarding nuking untracked files/dirs unpack-trees: avoid nuking untracked dir in way of locally deleted file unpack-trees: avoid nuking untracked dir in way of unmerged file Change unpack_trees' 'reset' flag into an enum Remove ignored files by default when they are in the way unpack-trees: make dir an internal-only struct unpack-trees: introduce preserve_ignored to unpack_trees_options read-tree, merge-recursive: overwrite ignored files by default checkout, read-tree: fix leak of unpack_trees_options.dir t2500: add various tests for nuking untracked files
2021-10-03Merge branch 'jk/clone-unborn-head-in-bare'Libravatar Junio C Hamano1-16/+17
"git clone" from a repository whose HEAD is unborn into a bare repository didn't follow the branch name the other side used, which is corrected. * jk/clone-unborn-head-in-bare: clone: handle unborn branch in bare repos
2021-09-27Remove ignored files by default when they are in the wayLibravatar Elijah Newren1-2/+1
Change several commands to remove ignored files by default when they are in the way. Since some commands (checkout, merge) take a --no-overwrite-ignore option to allow the user to configure this, and it may make sense to add that option to more commands (and in the case of merge, actually plumb that configuration option through to more of the backends than just the fast-forwarding special case), add little comments about where such flags would be used. Incidentally, this fixes a test failure in t7112. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27unpack-trees: introduce preserve_ignored to unpack_trees_optionsLibravatar Elijah Newren1-0/+2
Currently, every caller of unpack_trees() that wants to ensure ignored files are overwritten by default needs to: * allocate unpack_trees_options.dir * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags * call setup_standard_excludes AND then after the call to unpack_trees() needs to * call dir_clear() * deallocate unpack_trees_options.dir That's a fair amount of boilerplate, and every caller uses identical code. Make this easier by instead introducing a new boolean value where the default value (0) does what we want so that new callers of unpack_trees() automatically get the appropriate behavior. And move all the handling of unpack_trees_options.dir into unpack_trees() itself. While preserve_ignored = 0 is the behavior we feel is the appropriate default, we defer fixing commands to use the appropriate default until a later commit. So, this commit introduces several locations where we manually set preserve_ignored=1. This makes it clear where code paths were previously preserving ignored files when they should not have been; a future commit will flip these to instead use a value of 0 to get the behavior we want. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20Merge branch 'ar/submodule-add-more'Libravatar Junio C Hamano1-116/+2
More parts of "git submodule add" has been rewritten in C. * ar/submodule-add-more: submodule--helper: rename compute_submodule_clone_url() submodule--helper: remove resolve-relative-url subcommand submodule--helper: remove add-config subcommand submodule--helper: remove add-clone subcommand submodule--helper: convert the bulk of cmd_add() to C dir: libify and export helper functions from clone.c submodule--helper: remove repeated code in sync_submodule() submodule--helper: refactor resolve_relative_url() helper submodule--helper: add options for compute_submodule_clone_url()
2021-09-20Merge branch 'ps/fetch-optim'Libravatar Junio C Hamano1-5/+3
Optimize code that handles large number of refs in the "git fetch" code path. * ps/fetch-optim: fetch: avoid second connectivity check if we already have all objects fetch: merge fetching and consuming refs fetch: refactor fetch refs to be more extendable fetch-pack: optimize loading of refs via commit graph connected: refactor iterator to return next object ID directly fetch: avoid unpacking headers in object existence check fetch: speed up lookup of want refs via commit-graph
2021-09-20clone: handle unborn branch in bare reposLibravatar Jeff King1-16/+17
When cloning a repository with an unborn HEAD, we'll set the local HEAD to match it only if the local repository is non-bare. This is inconsistent with all other combinations: remote HEAD | local repo | local HEAD ----------------------------------------------- points to commit | non-bare | same as remote points to commit | bare | same as remote unborn | non-bare | same as remote unborn | bare | local default So I don't think this is some clever or subtle behavior, but just a bug in 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05). And it's easy to see how we ended up there. Before that commit, the code to set up the HEAD for an empty repo was guarded by "if (!option_bare)". That's because the only thing it did was call install_branch_config(), and we don't want to do so for a bare repository (unborn HEAD or not). That commit put the handling of unborn HEADs into the same block, since those also need to call install_branch_config(). But the unborn case has an additional side effect of calling create_symref(), and we want that to happen whether we are bare or not. This patch just pulls all of the "figure out the default branch" code out of the "!option_bare" block. Only the actual config installation is kept there. Note that this does mean we might allocate "ref" and not use it (if the remote is empty but did not advertise an unborn HEAD). But that's not really a big deal since this isn't a hot code path, and it keeps the code simple. The alternative would be handling unborn_head_target separately, but that gets confusing since its memory ownership is tangled up with the "ref" variable. There's just one new test, for the case we're fixing. The other ones in the table are handled elsewhere (the unborn non-bare case just above, and the actually-born cases in t5601, t5606, and t5609, as they do not require v2's "unborn" protocol extension). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10Merge branch 'ab/retire-advice-config'Libravatar Junio C Hamano1-1/+1
Code clean up to migrate callers from older advice_config[] based API to newer advice_if_enabled() and advice_enabled() API. * ab/retire-advice-config: advice: move advice.graftFileDeprecated squashing to commit.[ch] advice: remove use of global advice_add_embedded_repo advice: remove read uses of most global `advice_` variables advice: add enum variants for missing advice variables
2021-09-01connected: refactor iterator to return next object ID directlyLibravatar Patrick Steinhardt1-5/+3
The object ID iterator used by the connectivity checks returns the next object ID via an out-parameter and then uses a return code to indicate whether an item was found. This is a bit roundabout: instead of a separate error code, we can just return the next object ID directly and use `NULL` pointers as indicator that the iterator got no items left. Furthermore, this avoids a copy of the object ID. Refactor the iterator and all its implementations to return object IDs directly. This brings a tiny performance improvement when doing a mirror-fetch of a repository with about 2.3M refs: Benchmark #1: 328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch Time (mean ± σ): 30.110 s ± 0.148 s [User: 27.161 s, System: 5.075 s] Range (min … max): 29.934 s … 30.406 s 10 runs Benchmark #2: 328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch Time (mean ± σ): 29.899 s ± 0.109 s [User: 26.916 s, System: 5.104 s] Range (min … max): 29.696 s … 29.996 s 10 runs Summary '328dc58b49919c43897240f2eabfa30be2ce32a4: git-fetch' ran 1.01 ± 0.01 times faster than '328dc58b49919c43897240f2eabfa30be2ce32a4~: git-fetch' While this 1% speedup could be labelled as statistically insignificant, the speedup is consistent on my machine. Furthermore, this is an end to end test, so it is expected that the improvement in the connectivity check itself is more significant. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-30clone: set submodule.recurse=true if submodule.stickyRecursiveClone enabledLibravatar Mahi Kolla1-0/+5
Based on current experience, when running git clone --recurse-submodules, developers do not expect other commands such as pull or checkout to run recursively into active submodules. However, setting submodule.recurse=true at this step could make for a simpler workflow by eliminating the need for the --recurse-submodules option in subsequent commands. To collect more data on developers' preference in regards to making submodule.recurse=true a default config value in the future, deploy this feature under the opt in submodule.stickyRecursiveClone flag. Signed-off-by: Mahi Kolla <mkolla2@illinois.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25advice: remove read uses of most global `advice_` variablesLibravatar Ben Boeckel1-1/+1
In c4a09cc9ccb (Merge branch 'hw/advise-ng', 2020-03-25), a new API for accessing advice variables was introduced and deprecated `advice_config` in favor of a new array, `advice_setting`. This patch ports all but two uses which read the status of the global `advice_` variables over to the new `advice_enabled` API. We'll deal with advice_add_embedded_repo and advice_graft_file_deprecated separately. Signed-off-by: Ben Boeckel <mathstuf@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-10dir: libify and export helper functions from clone.cLibravatar Atharva Raykar1-116/+2
These functions can be useful to other parts of Git. Let's move them to dir.c, while renaming them to be make their functionality more explicit. Signed-off-by: Atharva Raykar <raykar.ath@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Shourya Shukla <periperidip@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-14Merge branch 'jk/clone-clean-upon-transport-error'Libravatar Junio C Hamano1-7/+4
Recent "git clone" left a temporary directory behind when the transport layer returned an failure. * jk/clone-clean-upon-transport-error: clone: clean up directory after transport_fetch_refs() failure
2021-05-19clone: clean up directory after transport_fetch_refs() failureLibravatar Jeff King1-7/+4
git-clone started respecting errors from the transport subsystem in aab179d937 (builtin/clone.c: don't ignore transport_fetch_refs() errors, 2020-12-03). However, that commit didn't handle the cleanup of the filesystem quite right. The cleanup of the directory that cmd_clone() creates is done by an atexit() handler, which we control with a flag. It starts as JUNK_LEAVE_NONE ("clean up everything"), then progresses to JUNK_LEAVE_REPO when we know we have a valid repo but not working tree, and then finally JUNK_LEAVE_ALL when we have a successful checkout. Most errors cause us to die(), which then triggers the handler to do the right thing based on how far into cmd_clone() we got. But the checks added by aab179d937 instead set the "err" variable and then jump to a new "cleanup" label, which then returns our non-zero status. However, the code after the cleanup label includes setting the flag to JUNK_LEAVE_ALL, and so we accidentally leave the repository and working tree in place. One obvious option to fix this is to reorder the end of the function to set the flag first, before cleanup code, and put the label between them. But we can observe another small bug: the error return from transport_fetch_refs() is generally "-1", and we propagate that to the return value of cmd_clone(), which ultimately becomes the exit code of the process. And we try to avoid transmitting negative values via exit codes (only the low 8 bits are passed along as an unsigned value, though in practice for "-1" this at least retains the property that it's non-zero). Instead, let's just die(). That makes us consistent with rest of the code in the function. It does add a new "fatal:" line to the output, but I'd argue that's a good thing: - in the rare case that the transport code didn't say anything, now the user gets _some_ error message - even if the transport code said something like "error: ssh died of signal 9", it's nice to also say "fatal" to indicate that we considered that to be a show-stopper. Triggering this in the test suite turns out to be surprisingly difficult. Almost every error we'd encounter, including ones deep inside the transport code, cause us to just die() right there! However, one way is to put a fake wrapper around git-upload-pack that sends the complete packfile but exits with a failure code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27hash: provide per-algorithm null OIDsLibravatar brian m. carlson1-1/+1
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08Merge branch 'll/clone-reject-shallow'Libravatar Junio C Hamano1-0/+21
"git clone --reject-shallow" option fails the clone as soon as we notice that we are cloning from a shallow repository. * ll/clone-reject-shallow: builtin/clone.c: add --reject-shallow option
2021-04-01builtin/clone.c: add --reject-shallow optionLibravatar Li Linchao1-0/+21
In some scenarios, users may want more history than the repository offered for cloning, which happens to be a shallow repository, can give them. But because users don't know it is a shallow repository until they download it to local, we may want to refuse to clone this kind of repository, without creating any unnecessary files. The '--depth=x' option cannot be used as a solution; the source may be deep enough to give us 'x' commits when cloned, but the user may later need to deepen the history to arbitrary depth. Teach '--reject-shallow' option to "git clone" to abort as soon as we find out that we are cloning from a shallow repository. Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-14clone: free or UNLEAK further pointers when finishedLibravatar Andrzej Hunt1-4/+10
Most of these pointers can safely be freed when cmd_clone() completes, therefore we make sure to free them. The one exception is that we have to UNLEAK(repo) because it can point either to argv[0], or a malloc'd string returned by absolute_pathdup(). We also have to free(path) in the middle of cmd_clone(): later during cmd_clone(), path is unconditionally overwritten with a different path, triggering a leak. Freeing the first path immediately after use (but only in the case where it contains data) seems like the cleanest solution, as opposed to freeing it unconditionally before path is reused for another path. This leak appears to have been introduced in: f38aa83f9a (use local cloning if insteadOf makes a local URL, 2014-07-17) These leaks were found when running t0001 with LSAN, see also an excerpt of the LSAN output below (the full list is omitted because it's far too long, and mostly consists of indirect leakage of members of the refs we are freeing). Direct leak of 178 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10 #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6fc4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6f9a in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce266 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x51e9bd in wanted_peer_refs /home/ahunt/oss-fuzz/git/builtin/clone.c:574:21 #5 0x51cfe1 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1284:17 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c42e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f8fef0c2349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 178 byte(s) in 1 object(s) allocated from: #0 0x49a53d in malloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 #1 0x9a6ff4 in do_xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:41:8 #2 0x9a6fca in xmalloc /home/ahunt/oss-fuzz/git/wrapper.c:62:9 #3 0x8ce296 in copy_ref /home/ahunt/oss-fuzz/git/remote.c:885:8 #4 0x8d2ebd in guess_remote_head /home/ahunt/oss-fuzz/git/remote.c:2215:10 #5 0x51d0c5 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1308:4 #6 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #7 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #8 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #9 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #10 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #11 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8 #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20 #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9 #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8 #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8 #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4 #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9 #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4 #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9 #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Direct leak of 105 byte(s) in 1 object(s) allocated from: #0 0x49a859 in realloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:164:3 #1 0x9a71f6 in xrealloc /home/ahunt/oss-fuzz/git/wrapper.c:126:8 #2 0x93622d in strbuf_grow /home/ahunt/oss-fuzz/git/strbuf.c:98:2 #3 0x937a73 in strbuf_addch /home/ahunt/oss-fuzz/git/./strbuf.h:231:3 #4 0x939fcd in strbuf_add_absolute_path /home/ahunt/oss-fuzz/git/strbuf.c:911:4 #5 0x69d3ce in absolute_pathdup /home/ahunt/oss-fuzz/git/abspath.c:261:2 #6 0x51c688 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1021:10 #7 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #8 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #9 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #10 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #11 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #12 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-17Merge branch 'jt/clone-unborn-head'Libravatar Junio C Hamano1-9/+25
"git clone" tries to locally check out the branch pointed at by HEAD of the remote repository after it is done, but the protocol did not convey the information necessary to do so when copying an empty repository. The protocol v2 learned how to do so. * jt/clone-unborn-head: clone: respect remote unborn HEAD connect, transport: encapsulate arg in struct ls-refs: report unborn targets of symrefs
2021-02-05clone: respect remote unborn HEADLibravatar Jonathan Tan1-2/+14
Teach Git to use the "unborn" feature introduced in a previous patch as follows: Git will always send the "unborn" argument if it is supported by the server. During "git clone", if cloning an empty repository, Git will use the new information to determine the local branch to create. In all other cases, Git will ignore it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05connect, transport: encapsulate arg in structLibravatar Jonathan Tan1-7/+11
In a future patch we plan to return the name of an unborn current branch from deep in the callchain to a caller via a new pointer parameter that points at a variable in the caller when the caller calls get_remote_refs() and transport_get_remote_refs(). In preparation for that, encapsulate the existing ref_prefixes parameter into a struct. The aforementioned unborn current branch will go into this new struct in the future patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-18Merge branch 'js/init-defaultbranch-advice'Libravatar Junio C Hamano1-1/+1
Our users are going to be trained to prepare for future change of init.defaultBranch configuration variable. * js/init-defaultbranch-advice: init: provide useful advice about init.defaultBranch get_default_branch_name(): prepare for showing some advice branch -m: allow renaming a yet-unborn branch init: document `init.defaultBranch` better
2020-12-13get_default_branch_name(): prepare for showing some adviceLibravatar Johannes Schindelin1-1/+1
We are about to introduce a message giving users running `git init` some advice about `init.defaultBranch`. This will necessarily be done in `repo_default_branch_name()`. Not all code paths want to show that advice, though. In particular, the `git clone` codepath _specifically_ asks for `init_db()` to be quiet, via the `INIT_DB_QUIET` flag. In preparation for showing users above-mentioned advice, let's change the function signature of `get_default_branch_name()` to accept the parameter `quiet`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-03builtin/clone.c: don't ignore transport_fetch_refs() errorsLibravatar Taylor Blau1-4/+11
If 'git clone' couldn't execute 'transport_fetch_refs()' (e.g., because of an error on the remote's side in 'git upload-pack'), then it will silently ignore it. Even though this has been the case at least since clone was ported to C (way back in 8434c2f1af (Build in clone, 2008-04-27)), 'git fetch' doesn't ignore these and reports any failures it sees. That suggests that ignoring the return value in 'git clone' is simply an oversight that should be corrected. That's exactly what this patch does. (Noticing and fixing this is no coincidence, we'll want it in the next patch in order to demonstrate a regression in 'git upload-pack' via a 'git clone'.) There's no additional logging here, but that matches how 'git fetch' handles the same case. An assumption there is that whichever part of transport_fetch_refs() fails will complain loudly, so any additional logging here is redundant. Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-27Merge branch 'sb/clone-origin'Libravatar Junio C Hamano1-15/+56
"git clone" learned clone.defaultremotename configuration variable to customize what nickname to use to call the remote the repository was cloned from. * sb/clone-origin: clone: allow configurable default for `-o`/`--origin` clone: read new remote name from remote_name instead of option_origin clone: validate --origin option before use refs: consolidate remote name validation remote: add tests for add and rename with invalid names clone: use more conventional config/option layering clone: add tests for --template and some disallowed option pairs
2020-09-30clone: allow configurable default for `-o`/`--origin`Libravatar Sean Barag1-7/+19
While the default remote name of "origin" can be changed at clone-time with `git clone`'s `--origin` option, it was previously not possible to specify a default value for the name of that remote. Add support for a new `clone.defaultRemoteName` config, with the newly-created remote name resolved in priority order: 1. (Highest priority) A remote name passed directly to `git clone -o` 2. A `clone.defaultRemoteName=new_name` in config `git clone -c` 3. A `clone.defaultRemoteName` value set in `/path/to/template/config`, where `--template=/path/to/template` is provided 4. A `clone.defaultRemoteName` value set in a non-template config file 5. The default value of `origin` Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Helped-by: Derrick Stolee <stolee@gmail.com> Helped-by: Andrei Rybak <rybak.a.v@gmail.com> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30clone: read new remote name from remote_name instead of option_originLibravatar Sean Barag1-15/+16
In a future patch, the name of the remote created by `git clone` may come from multiple sources. To avoid confusion, convert most uses of option_origin to remote_name, leaving option_origin to exclusively represent the -o/--origin option. Helped-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30clone: validate --origin option before useLibravatar Sean Barag1-0/+3
Providing a bad origin name to `git clone` currently reports an 'invalid refspec' error instead of a more explicit message explaining that the `--origin` option was malformed. This behavior dates back to since 8434c2f1 (Build in clone, 2008-04-27). Reintroduce validation for the provided `--origin` option, but notably _don't_ include a multi-level check (e.g. "foo/bar") that was present in the original `git-clone.sh`. `git remote` allows multi-level remote names since at least 46220ca100 (remote.c: Fix overtight refspec validation, 2008-03-20), so that appears to be the desired behavior. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Derrick Stolee <stolee@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-30clone: use more conventional config/option layeringLibravatar Sean Barag1-1/+26
Parsing command-line options before reading from config required careful handling to ensure CLI options were treated with higher priority. Read config first to let parsed CLI naively overwrite matching config values. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Sean Barag <sean@barag.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-22builtin/clone: avoid failure with GIT_DEFAULT_HASHLibravatar brian m. carlson1-1/+1
If a user is cloning a SHA-1 repository with GIT_DEFAULT_HASH set to "sha256", then we can end up with a repository where the repository format version is 0 but the extensions.objectformat key is set to "sha256". This is both wrong (the user has a SHA-1 repository) and nonfunctional (because the extension cannot be used in a v0 repository). This happens because in a clone, we initially set up the repository, and then change its algorithm based on what the remote side tells us it's using. We've initially set up the repository as SHA-256 in this case, and then later on reset the repository version without clearing the extension. We could just always set the extension in this case, but that would mean that our SHA-1 repositories weren't compatible with older Git versions, even though there's no reason why they shouldn't be. And we also don't want to initialize the repository as SHA-1 initially, since that means if we're cloning an empty repository, we'll have failed to honor the GIT_DEFAULT_HASH variable and will end up with a SHA-1 repository, not a SHA-256 repository. Neither of those are appealing, so let's tell the repository initialization code if we're doing a reinit like this, and if so, to clear the extension if we're using SHA-1. This makes sure we produce a valid and functional repository and doesn't break any of our other use cases. Reported-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-06refspec: add and use refspec_appendf()Libravatar René Scharfe1-5/+2
Add a function for building a refspec using printf-style formatting. It frees callers from managing their own buffer. Use it throughout the tree to shorten and simplify its callers. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-10Merge branch 'jk/strvec'Libravatar Junio C Hamano1-19/+19
The argv_array API is useful for not just managing argv but any "vector" (NULL-terminated array) of strings, and has seen adoption to a certain degree. It has been renamed to "strvec" to reduce the barrier to adoption. * jk/strvec: strvec: rename struct fields strvec: drop argv_array compatibility layer strvec: update documention to avoid argv_array strvec: fix indentation in renamed calls strvec: convert remaining callers away from argv_array name strvec: convert more callers away from argv_array name strvec: convert builtin/ callers away from argv_array name quote: rename sq_dequote_to_argv_array to mention strvec strvec: rename files from argv-array to strvec argv-array: rename to strvec argv-array: use size_t for count and alloc
2020-07-30strvec: rename struct fieldsLibravatar Jeff King1-2/+2
The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-30Merge branch 'bw/fail-cloning-into-non-empty' into masterLibravatar Junio C Hamano1-2/+10
"git clone --separate-git-dir=$elsewhere" used to stomp on the contents of the existing directory $elsewhere, which has been taught to fail when $elsewhere is not an empty directory. * bw/fail-cloning-into-non-empty: git clone: don't clone into non-empty directory
2020-07-28strvec: convert builtin/ callers away from argv_array nameLibravatar Jeff King1-17/+17
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the files in builtin/ to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add builtin/". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>