summaryrefslogtreecommitdiff
path: root/transport.c
AgeCommit message (Collapse)AuthorFilesLines
2021-07-16Merge branch 'ab/struct-init'Libravatar Junio C Hamano1-1/+1
Code cleanup around struct_type_init() functions. * ab/struct-init: string-list.h users: change to use *_{nodup,dup}() string-list.[ch]: add a string_list_init_{nodup,dup}() dir.[ch]: replace dir_init() with DIR_INIT *.c *_init(): define in terms of corresponding *_INIT macro *.h: move some *_INIT to designated initializers
2021-07-06bundle: remove "ref_list" in favor of string-list.c APILibravatar Ævar Arnfjörð Bjarmason1-3/+5
Move away from the "struct ref_list" in bundle.c in favor of the almost identical string-list.c API. That API fits this use-case perfectly, but did not exist in its current form when this code was added in 2e0afafebd (Add git-bundle: move objects and references by archive, 2007-02-22), with hindsight we could have used the path-list API, which later got renamed to string-list. See 8fd2cb4069 (Extract helper bits from c-merge-recursive work, 2006-07-25) We need to change "name" to "string" and "oid" to "util" to make this conversion, but other than that the APIs are pretty much identical for what bundle.c made use of. Let's also replace the memset(..,0,...) pattern with a more idiomatic "INIT" macro, and finally add a *_release() function so to free the allocated memory. Before this the add_to_ref_list() would leak memory, now e.g. "bundle list-heads" reports no memory leaks at all under valgrind. In the bundle_header_init() function we're using a clever trick to memcpy() what we'd get from the corresponding BUNDLE_HEADER_INIT. There is a concurrent series to make use of that pattern more generally, see [1]. 1. https://lore.kernel.org/git/cover-0.5-00000000000-20210701T104855Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-06bundle.c: use a temporary variable for OIDs and namesLibravatar Ævar Arnfjörð Bjarmason1-2/+4
In preparation for moving away from accessing the OID and name via the "oid" and "name" slots in a subsequent commit, change the code that accesses it to use named variables. This makes the subsequent change smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-01string-list.h users: change to use *_{nodup,dup}()Libravatar Ævar Arnfjörð Bjarmason1-1/+1
Change all in-tree users of the string_list_init(LIST, BOOL) API to use string_list_init_{nodup,dup}(LIST) instead. As noted in the preceding commit let's leave the now-unused string_list_init() wrapper in-place for any in-flight users, it can be removed at some later date. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-20fetch-pack: signal v2 server that we are done making requestsLibravatar Jeff King1-2/+4
When fetching with the v0 protocol over ssh (or a local upload-pack with pipes), the server closes the connection as soon as it is finished sending the pack. So even though the client may still be operating on the data via index-pack (e.g., resolving deltas, checking connectivity, etc), the server has released all resources. With the v2 protocol, however, the server considers the ssh session only as a transport, with individual requests coming over it. After sending the pack, it goes back to its main loop, waiting for another request to come from the client. As a result, the ssh session hangs around until the client process ends, which may be much later (because resolving deltas, etc, may consume a lot of CPU). This is bad for two reasons: - it's consuming resources on the server to leave open a connection that won't see any more use - if something bad happens to the ssh connection in the meantime (say, it gets killed by the network because it's idle, as happened in a real-world report), then ssh will exit non-zero, and we'll propagate the error up the stack. The server is correct here not to hang up after serving the pack. The v2 protocol's design is meant to allow multiple requests like this, and hanging up would be the wrong thing for a hypothetical client which was planning to make more requests (though in practice, the git.git client never would, and I doubt any other implementations would either). The right thing is instead for the client to signal to the server that it's not interested in making more requests. We can do that by closing the pipe descriptor we use to write to ssh. This will propagate to the server upload-pack as an EOF when it tries to read the next request (and then it will close its half, and the whole connection will go away). It's important to do this "half duplex" shutdown, because we have to do it _before_ we actually receive the pack. This is an artifact of the way fetch-pack and index-pack (or unpack-objects) interact. We hand the connection off to index-pack (really, a sideband demuxer which feeds it), and then wait until it returns. And it doesn't do that until it has resolved all of the deltas in the pack, even though it was done reading from the server long before. So just closing the connection fully after index-pack returns would be too late; we'd have held it open much longer than was necessary. And teaching index-pack to close the connection is awkward. It's not even seeing the whole conversation (the sideband demuxer is, but it doesn't actually know what's in the packets, or when the end comes). Note that this close() is happening deep within the transport code. It's possible that a caller would want to perform other operations over the same ssh transport after receiving the pack. But as of the current code, none of the callers do, and there haven't been discussions of any plans to change this. If we need to support that later, we can probably do so by passing down a flag for "you're the last request on the transport; it's OK to close" instead of the code just assuming that's true. The description above all discusses v2 ssh, so it's worth thinking about how this interacts with other protocols: - in v0 protocols, we could do the same half-duplex shutdown (it just goes into the v0 do_fetch_pack() instead). This does work, but since it doesn't have the same persistence problem in the first place, there's little reason to change it at this point. - local fetches against git-upload-pack on the same machine will behave the same as ssh (they are talking over two pipes, and see EOF on their input pipe) - fetches against git-daemon will run this same code, and close one of the descriptors. In practice, this won't do anything, since there our two descriptors are dups of each other, and not part of a half-duplex pair. The right thing would probably be to call shutdown(SHUT_WR) on it. I didn't bother with that here. It doesn't face the same error-code problem (since it's just a TCP connection), so it's really only an optimization problem. And git:// is not that widely used these days, and has less impact on server resources than an ssh termination. - v2 http doesn't suffer from this problem in the first place, as our pipes terminate at a local git-remote-https, which is passing data along as individual requests via curl. Probably curl is keeping the TCP/TLS connection open for more requests, and we might be able to tell it manually "hey, we are done making requests now". But I think that's much less important. It again doesn't suffer from the error-code problem, and HTTP keepalive is pretty well understood (importantly, the timeouts can be set low, because clients like curl know how to reconnect for subsequent requests if necessary). So it's probably not worth figuring out how to tell curl that we're done (though if we do, this patch is probably the first step anyway; fetch-pack closes the pipe back to remote-https, which would be the signal that it should tell curl we're done). The code is pretty straightforward. We close the pipe at the right moment, and set it to -1 to mark it as invalid. I modified the later cleanup code to avoid calling close(-1). That's not strictly necessary, since close(-1) is a noop, but hopefully makes things a bit more obvious to a reader. I suspect that trying to call more transport functions after the close() (e.g., calling transport_fetch_refs() again) would fail, as it's not smart enough to realize we need to re-open the ssh connection. But that's already true when v0 is in use. And no current callers want to do that (and again, the solution is probably a flag in the transport code to keep things open, which can be added later). There's no test here, as the situation it covers is inherently racy (the question is when upload-pack exits, compared to when index-pack finishes resolving deltas and exits). The rather gross shell snippet below does recreate the problematic situation; when run on a sufficiently-large repository (git.git works fine), it kills an "idle" upload-pack while the client is resolving deltas, leading to a failed clone. ( git clone --no-local --progress . foo.git 2>&1 echo >&2 "clone exit code=$?" ) | tr '\r' '\n' | while read line do case "$done,$line" in ,Resolving*) echo "hit resolving deltas; killing upload-pack" killall -9 git-upload-pack done=t ;; esac done Reported-by: Greg Pflaum <greg.pflaum@pnp-hcl.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-16Merge branch 'jt/push-negotiation'Libravatar Junio C Hamano1-5/+25
"git push" learns to discover common ancestor with the receiving end over protocol v2. * jt/push-negotiation: send-pack: support push negotiation fetch: teach independent negotiation (no packfile) fetch-pack: refactor command and capability write fetch-pack: refactor add_haves() fetch-pack: refactor process_acks()
2021-05-05fetch: teach independent negotiation (no packfile)Libravatar Jonathan Tan1-5/+25
Currently, the packfile negotiation step within a Git fetch cannot be done independent of sending the packfile, even though there is at least one application wherein this is useful. Therefore, make it possible for this negotiation step to be done independently. A subsequent commit will use this for one such application - push negotiation. This feature is for protocol v2 only. (An implementation for protocol v0 would require a separate implementation in the fetch, transport, and transport helper code.) In the protocol, the main hindrance towards independent negotiation is that the server can unilaterally decide to send the packfile. This is solved by a "wait-for-done" argument: the server will then wait for the client to say "done". In practice, the client will never say it; instead it will cease requests once it is satisfied. In the client, the main change lies in the transport and transport helper code. fetch_refs_via_pack() performs everything needed - protocol version and capability checks, and the negotiation itself. There are 2 code paths that do not go through fetch_refs_via_pack() that needed to be individually excluded: the bundle transport (excluded through requiring smart_options, which the bundle transport doesn't support) and transport helpers that do not support takeover. If or when we support independent negotiation for protocol v0, we will need to modify these 2 code paths to support it. But for now, report failure if independent negotiation is requested in these cases. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-30Merge branch 'ow/push-quiet-set-upstream'Libravatar Junio C Hamano1-5/+5
"git push --quiet --set-upstream" was not quiet when setting the upstream branch configuration, which has been corrected. * ow/push-quiet-set-upstream: transport: respect verbosity when setting upstream
2021-04-15transport: respect verbosity when setting upstreamLibravatar Øystein Walle1-5/+5
A command such as `git push -qu origin feature` will print "Branch 'feature' set up to track remote branch 'feature' from 'origin'." even when --quiet is passed. In this case it's because install_branch_config() is always called with BRANCH_CONFIG_VERBOSE. struct transport keeps track of the desired verbosity. Fix the above issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that. Signed-off-by: Øystein Walle <oystwa@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-08Merge branch 'll/clone-reject-shallow'Libravatar Junio C Hamano1-0/+4
"git clone --reject-shallow" option fails the clone as soon as we notice that we are cloning from a shallow repository. * ll/clone-reject-shallow: builtin/clone.c: add --reject-shallow option
2021-04-07Merge branch 'ah/plugleaks'Libravatar Junio C Hamano1-0/+2
Plug or annotate remaining leaks that trigger while running the very basic set of tests. * ah/plugleaks: transport: also free remote_refs in transport_disconnect() parse-options: don't leak alias help messages parse-options: convert bitfield values to use binary shift init-db: silence template_dir leak when converting to absolute path init: remove git_init_db_config() while fixing leaks worktree: fix leak in dwim_branch() clone: free or UNLEAK further pointers when finished reset: free instead of leaking unneeded ref symbolic-ref: don't leak shortened refname in check_symref()
2021-04-01builtin/clone.c: add --reject-shallow optionLibravatar Li Linchao1-0/+4
In some scenarios, users may want more history than the repository offered for cloning, which happens to be a shallow repository, can give them. But because users don't know it is a shallow repository until they download it to local, we may want to refuse to clone this kind of repository, without creating any unnecessary files. The '--depth=x' option cannot be used as a solution; the source may be deep enough to give us 'x' commits when cloned, but the user may later need to deepen the history to arbitrary depth. Teach '--reject-shallow' option to "git clone" to abort as soon as we find out that we are cloning from a shallow repository. Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-21transport: also free remote_refs in transport_disconnect()Libravatar Andrzej Hunt1-0/+2
transport_get_remote_refs() can populate the transport struct's remote_refs. transport_disconnect() is already responsible for most of transport's cleanup - therefore we also take care of freeing remote_refs there. There are 2 locations where transport_disconnect() is called before we're done using the returned remote_refs. This patch changes those callsites to only call transport_disconnect() after the returned refs are no longer being used - which is necessary to safely be able to free remote_refs during transport_disconnect(). This commit fixes the following leak which was found while running t0000, but is expected to also fix the same pattern of leak in all locations that use transport_get_remote_refs(): Direct leak of 165 byte(s) in 1 object(s) allocated from: #0 0x49a6b2 in calloc /home/abuild/rpmbuild/BUILD/llvm-11.0.0.src/build/../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3 #1 0x9a72f2 in xcalloc /home/ahunt/oss-fuzz/git/wrapper.c:140:8 #2 0x8ce203 in alloc_ref_with_prefix /home/ahunt/oss-fuzz/git/remote.c:867:20 #3 0x8ce1a2 in alloc_ref /home/ahunt/oss-fuzz/git/remote.c:875:9 #4 0x72f63e in process_ref_v2 /home/ahunt/oss-fuzz/git/connect.c:426:8 #5 0x72f21a in get_remote_refs /home/ahunt/oss-fuzz/git/connect.c:525:8 #6 0x979ab7 in handshake /home/ahunt/oss-fuzz/git/transport.c:305:4 #7 0x97872d in get_refs_via_connect /home/ahunt/oss-fuzz/git/transport.c:339:9 #8 0x9774b5 in transport_get_remote_refs /home/ahunt/oss-fuzz/git/transport.c:1388:4 #9 0x51cf80 in cmd_clone /home/ahunt/oss-fuzz/git/builtin/clone.c:1271:9 #10 0x4cd60d in run_builtin /home/ahunt/oss-fuzz/git/git.c:453:11 #11 0x4cb2da in handle_builtin /home/ahunt/oss-fuzz/git/git.c:704:3 #12 0x4ccc37 in run_argv /home/ahunt/oss-fuzz/git/git.c:771:4 #13 0x4cac29 in cmd_main /home/ahunt/oss-fuzz/git/git.c:902:19 #14 0x69c45e in main /home/ahunt/oss-fuzz/git/common-main.c:52:11 #15 0x7f6a459d5349 in __libc_start_main (/lib64/libc.so.6+0x24349) Signed-off-by: Andrzej Hunt <ajrhunt@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13use CALLOC_ARRAYLibravatar René Scharfe1-1/+1
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-05connect, transport: encapsulate arg in structLibravatar Jonathan Tan1-11/+12
In a future patch we plan to return the name of an unborn current branch from deep in the callchain to a caller via a new pointer parameter that points at a variable in the caller when the caller calls get_remote_refs() and transport_get_remote_refs(). In preparation for that, encapsulate the existing ref_prefixes parameter into a struct. The aforementioned unborn current branch will go into this new struct in the future patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-11transport: log received server session IDLibravatar Josh Steadmon1-0/+10
When a client receives a session-id capability from a protocol v0, v1, or v2 server, log the received session ID via a trace2 data event. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-03push: parse and set flag for "--force-if-includes"Libravatar Srinidhi Kaushik1-0/+2
The previous commit added the necessary machinery to implement the "--force-if-includes" protection, when "--force-with-lease" is used without giving exact object the remote still ought to have. Surface the feature by adding a command line option and a configuration variable to enable it. - Add a flag: "TRANSPORT_PUSH_FORCE_IF_INCLUDES" to indicate that the new option was passed from the command line of via configuration settings; update command line and configuration parsers to set the new flag accordingly. - Introduce a new configuration option "push.useForceIfIncludes", which is equivalent to setting "--force-if-includes" in the command line. - Update "remote-curl" to recognize and pass this option to "send-pack" when enabled. - Update "advise" to catch the reject reason "REJECT_REF_NEEDS_UPDATE", set when the ref status is "REF_STATUS_REJECT_REMOTE_UPDATED" and (optionally) print a help message when the push fails. - The new option is a "no-op" in the following scenarios: * When used without "--force-with-lease". * When used with "--force-with-lease", and if the expected commit on the remote side is specified as an argument. Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-03push: add reflog check for "--force-if-includes"Libravatar Srinidhi Kaushik1-0/+6
Add a check to verify if the remote-tracking ref of the local branch is reachable from one of its "reflog" entries. The check iterates through the local ref's reflog to see if there is an entry for the remote-tracking ref and collecting any commits that are seen, into a list; the iteration stops if an entry in the reflog matches the remote ref or if the entry timestamp is older the latest entry of the remote ref's "reflog". If there wasn't an entry found for the remote ref, "in_merge_bases_many()" is called to check if it is reachable from the list of collected commits. When a local branch that is based on a remote ref, has been rewound and is to be force pushed on the remote, "--force-if-includes" runs a check that ensures any updates to the remote-tracking ref that may have happened (by push from another repository) in-between the time of the last update to the local branch (via "git-pull", for instance) and right before the time of push, have been integrated locally before allowing a forced update. If the new option is passed without specifying "--force-with-lease", or specified along with "--force-with-lease=<refname>:<expect>" it is a "no-op". Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-25Merge branch 'jx/proc-receive-hook'Libravatar Junio C Hamano1-36/+120
"git receive-pack" that accepts requests by "git push" learned to outsource most of the ref updates to the new "proc-receive" hook. * jx/proc-receive-hook: doc: add documentation for the proc-receive hook transport: parse report options for tracking refs t5411: test updates of remote-tracking branches receive-pack: new config receive.procReceiveRefs doc: add document for capability report-status-v2 New capability "report-status-v2" for git-push receive-pack: feed report options to post-receive receive-pack: add new proc-receive hook t5411: add basic test cases for proc-receive hook transport: not report a non-head push as a branch
2020-09-03Merge branch 'jt/lazy-fetch'Libravatar Junio C Hamano1-4/+0
Updates to on-demand fetching code in lazily cloned repositories. * jt/lazy-fetch: fetch: no FETCH_HEAD display if --no-write-fetch-head fetch-pack: remove no_dependents code promisor-remote: lazy-fetch objects in subprocess fetch-pack: do not lazy-fetch during ref iteration fetch: only populate existing_refs if needed fetch: avoid reading submodule config until needed fetch: allow refspecs specified through stdin negotiator/noop: add noop fetch negotiator
2020-08-27transport: parse report options for tracking refsLibravatar Jiang Xin1-8/+29
When pushing a pseudo reference (such as "refs/for/master/topic"), may create or update one or more references. The real names of the references will be stored in the report options. Parse report options to create or update remote-tracking branches properly. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27New capability "report-status-v2" for git-pushLibravatar Jiang Xin1-28/+88
The new introduced "proc-receive" hook may handle a command for a pseudo-reference with a zero-old as its old-oid, while the hook may create or update a reference with different name, different new-oid, and different old-oid (the reference may exist already with a non-zero old-oid). Current "report-status" protocol cannot report the status for such reference rewrite. Add new capability "report-status-v2" and new report protocol which is not backward compatible for report of git-push. If a user pushes to a pseudo-reference "refs/for/master/topic", and "receive-pack" creates two new references "refs/changes/23/123/1" and "refs/changes/24/124/1", for client without the knowledge of "report-status-v2", "receive-pack" will only send "ok/ng" directives in the report, such as: ok ref/for/master/topic But for client which has the knowledge of "report-status-v2", "receive-pack" will use "option" directives to report more attributes for the reference given by the above "ok/ng" directive. ok refs/for/master/topic option refname refs/changes/23/123/1 option new-oid <new-oid> ok refs/for/master/topic option refname refs/changes/24/124/1 option new-oid <new-oid> The client will report two new created references to the end user. Suggested-by: Junio C Hamano <gitster@pobox.com> Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-27transport: not report a non-head push as a branchLibravatar Jiang Xin1-3/+6
When pushing a new reference (not a head or tag), report it as a new reference instead of a new branch. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-18fetch-pack: remove no_dependents codeLibravatar Jonathan Tan1-4/+0
Now that Git has switched to using a subprocess to lazy-fetch missing objects, remove the no_dependents code as it is no longer used. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-17refspec: make sure stack refspec_item variables are zeroedLibravatar Jacob Keller1-0/+1
A couple of functions that used struct refspec_item did not zero out the structure memory. This can result in unexpected behavior, especially if additional parameters are ever added to refspec_item in the future. Use memset to ensure that unset structure members are zero. It may make sense to convert most of these uses of struct refspec_item to use either struct initializers or refspec_item_init_or_die. However, other similar code uses memset. Converting all of these uses has been left as a future exercise. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28strvec: convert remaining callers away from argv_array nameLibravatar Jeff King1-6/+6
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the remaining files, as the resulting diff is reasonably sized. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-06Merge branch 'bc/sha-256-part-2'Libravatar Junio C Hamano1-2/+16
SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...
2020-06-25Merge branch 'jt/cdn-offload'Libravatar Junio C Hamano1-5/+7
The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack
2020-06-19bundle: detect hash algorithm when reading refsLibravatar brian m. carlson1-2/+8
Much like with the dumb HTTP transport, there isn't a way to explicitly specify the hash algorithm when dealing with a bundle, so detect the algorithm based on the length of the object IDs in the prerequisites and ref advertisements. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-10fetch-pack: support more than one pack lockfileLibravatar Jonathan Tan1-6/+8
Whenever a fetch results in a packfile being downloaded, a .keep file is generated, so that the packfile can be preserved (from, say, a running "git repack") until refs are written referring to the contents of the packfile. In a subsequent patch, a successful fetch using protocol v2 may result in more than one .keep file being generated. Therefore, teach fetch_pack() and the transport mechanism to support multiple .keep files. Implementation notes: - builtin/fetch-pack.c normally does not generate .keep files, and thus is unaffected by this or future changes. However, it has an undocumented "--lock-pack" feature, used by remote-curl.c when implementing the "fetch" remote helper command. In keeping with the remote helper protocol, only one "lock" line will ever be written; the rest will result in warnings to stderr. However, in practice, warnings will never be written because the remote-curl.c "fetch" is only used for protocol v0/v1 (which will not generate multiple .keep files). (Protocol v2 uses the "stateless-connect" command, not the "fetch" command.) - connected.c has an optimization in that connectivity checks on a ref need not be done if the target object is in a pack known to be self-contained and connected. If there are multiple packfiles, this optimization can no longer be done. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-27transport: add a hash algorithm memberLibravatar brian m. carlson1-0/+8
When connecting to a remote system, we need to know what hash algorithm it will be using to talk to us. Add a hash_algo member to struct transport and add a function to read this data from the transport object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-24stateless-connect: send response end packetLibravatar Denton Liu1-1/+2
Currently, remote-curl acts as a proxy and blindly forwards packets between an HTTP server and fetch-pack. In the case of a stateless RPC connection where the connection is terminated before the transaction is complete, remote-curl will blindly forward the packets before waiting on more input from fetch-pack. Meanwhile, fetch-pack will read the transaction and continue reading, expecting more input to continue the transaction. This results in a deadlock between the two processes. This can be seen in the following command which does not terminate: $ git -c protocol.version=2 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... whereas the v1 version does terminate as expected: $ git -c protocol.version=1 clone https://github.com/git/git.git --shallow-since=20151012 Cloning into 'git'... fatal: the remote end hung up unexpectedly Instead of blindly forwarding packets, make remote-curl insert a response end packet after proxying the responses from the remote server when using stateless_connect(). On the RPC client side, ensure that each response ends as described. A separate control packet is chosen because we need to be able to differentiate between what the remote server sends and remote-curl's control packets. By ensuring in the remote-curl code that a server cannot send response end packets, we prevent a malicious server from being able to perform a denial of service attack in which they spoof a response end packet and cause the described deadlock to happen. Reported-by: Force Charlie <charlieio@outlook.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-19transport: extract common fetch_pack() callLibravatar Denton Liu1-17/+8
In the switch statement, the difference between the `protocol_v2` and `protocol_v{1,0}` arms is a preparatory call to die_if_server_options() in the latter. The fetch_pack() call is identical in both arms. However, since this fetch_pack() call has so many parameters, it is not immediately obvious that the call is identical in both cases. Extract the common fetch_pack() call out of the switch statement so that code duplication is reduced and the logic is more clear for future readers. While we're at it, rewrite the switch statement as an if-else tower for increased clarity. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-28Merge branch 'jx/atomic-push'Libravatar Junio C Hamano1-15/+9
"git push --atomic" used to show failures for refs that weren't even pushed, which has been corrected. * jx/atomic-push: transport-helper: new method reject_atomic_push() transport-helper: mark failure for atomic push send-pack: mark failure of atomic push properly t5543: never report what we do not push send-pack: fix inconsistent porcelain output
2020-04-17send-pack: mark failure of atomic push properlyLibravatar Jiang Xin1-14/+0
When pushing with SSH or other smart protocol, references are validated by function `check_to_send_update()` before they are sent in commands to `send_pack()` of "receve-pack". For atomic push, if a reference is rejected after the validation, only references pushed by user should be marked as failure, instead of report failure on all remote references. Commit v2.22.0-1-g3bca1e7f9f (transport-helper: enforce atomic in push_refs_with_push, 2019-07-11) wanted to fix report issue of HTTP protocol, but marked all remote references failure for atomic push. In order to fix the issue of status report for SSH or other built-in smart protocol, revert part of that commit and add additional status for function `atomic_push_failure()`. The additional status for it except the "REF_STATUS_EXPECTING_REPORT" status are: - REF_STATUS_NONE : Not marked as "REF_STATUS_EXPECTING_REPORT" yet. - REF_STATUS_OK : Assume OK for dryrun or status_report is disabled. This fix won't resolve the issue of status report in transport-helper for HTTP or other protocols, and breaks test case in t5541. Will fix it in additional commit. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-17send-pack: fix inconsistent porcelain outputLibravatar Jiang Xin1-1/+9
The porcelain output of a failed `git-push` command is inconsistent for different protocols. For example, the following `git-push` command may fail due to the failure of the `pre-receive` hook. git push --porcelain origin HEAD:refs/heads/master For SSH protocol, the porcelain output does not end with a "Done" message: To <URL/of/upstream.git> ! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined) While for HTTP protocol, the porcelain output does end with a "Done" message: To <URL/of/upstream.git> ! HEAD:refs/heads/master [remote rejected] (pre-receive hook declined) Done The following code at the end of function `send_pack()` indicates that `send_pack()` should not return an error if some references are rejected in porcelain mode. int send_pack(...) ... ... if (args->porcelain) return 0; for (ref = remote_refs; ref; ref = ref->next) { switch (ref->status) { case REF_STATUS_NONE: case REF_STATUS_UPTODATE: case REF_STATUS_OK: break; default: return -1; } } return 0; } So if atomic push failed, must check the porcelain mode before return an error. And `receive_status()` should not return an error for a failed updated reference, because `send_pack()` will check them instead. Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30oid_array: rename source file from sha1-arrayLibravatar Jeff King1-1/+1
We renamed the actual data structure in 910650d2f8 (Rename sha1_array to oid_array, 2017-03-31), but the file is still called sha1-array. Besides being slightly confusing, it makes it more annoying to grep for leftover occurrences of "sha1" in various files, because the header is included in so many places. Let's complete the transition by renaming the source and header files (and fixing up a few comment references). I kept the "-" in the name, as that seems to be our style; cf. fc1395f4a4 (sha1_file.c: rename to use dash in file name, 2018-04-10). We also have oidmap.h and oidset.h without any punctuation, but those are "struct oidmap" and "struct oidset" in the code. We _could_ make this "oidarray" to match, but somehow it looks uglier to me because of the length of "array" (plus it would be a very invasive patch for little gain). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-22Merge branch 'jk/no-flush-upon-disconnecting-slrpc-transport'Libravatar Junio C Hamano1-1/+1
Reduce unnecessary round-trip when running "ls-remote" over the stateless RPC mechanism. * jk/no-flush-upon-disconnecting-slrpc-transport: transport: don't flush when disconnecting stateless-rpc helper
2020-01-08transport: don't flush when disconnecting stateless-rpc helperLibravatar Jeff King1-1/+1
Since ba227857d2 (Reduce the number of connects when fetching, 2008-02-04), when we disconnect a git transport, we send a final flush packet. This cleanly tells the other side that we're done, and avoids the other side complaining "the remote end hung up unexpectedly" (though we'd only see that for transports that pass along the server stderr, like ssh or local-host). But when we've initiated a v2 stateless-connect session over a transport helper, there's no point in sending this flush packet. Each operation we've performed is self-contained, and the other side is fine with us hanging up between operations. But much worse, by sending the flush packet we may cause the helper to issue an entirely new request _just_ to send the flush packet. So we can incur an extra network request just to say "by the way, we have nothing more to send". Let's drop this extra flush packet. As the test shows, this reduces the number of POSTs required for a v2 ls-remote over http from 2 to 1. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-08transport: push codepath can take arbitrary repositoryLibravatar Junio C Hamano1-10/+10
The previous step added annotations with "the_repository" to various functions in the push codepath in the transport layer, but they all can take arbitrary repository pointer, and may be working on a repository that is not the_repository. Fix them. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-03push: add trace2 instrumentationLibravatar Josh Steadmon1-2/+12
Add trace2 regions in transport.c and builtin/push.c to better track time spent in various phases of pushing: * Listing refs * Checking submodules * Pushing submodules * Pushing refs Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-18Merge branch 'jt/avoid-ls-refs-with-http'Libravatar Junio C Hamano1-12/+6
The http transport lacked some optimization the native transports learned to avoid unnecessary ref advertisement, which has been corrected. * jt/avoid-ls-refs-with-http: transport: teach all vtables to allow fetch first transport-helper: skip ls-refs if unnecessary
2019-09-18Merge branch 'md/list-objects-filter-combo'Libravatar Junio C Hamano1-0/+1
The list-objects-filter API (used to create a sparse/lazy clone) learned to take a combined filter specification. * md/list-objects-filter-combo: list-objects-filter-options: make parser void list-objects-filter-options: clean up use of ALLOC_GROW list-objects-filter-options: allow mult. --filter strbuf: give URL-encoding API a char predicate fn list-objects-filter-options: make filter_spec a string_list list-objects-filter-options: move error check up list-objects-filter: implement composite filters list-objects-filter-options: always supply *errbuf list-objects-filter: put omits set in filter struct list-objects-filter: encapsulate filter components
2019-08-22transport: teach all vtables to allow fetch firstLibravatar Jonathan Tan1-12/+6
The only transport that does not allow fetch() to be called before get_refs_list() is the bundle transport. Clean up the code by teaching the bundle transport the ability to do this, and removing support for transports that don't support this order of invocation. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-25Merge branch 'es/local-atomic-push-failure-with-http'Libravatar Junio C Hamano1-0/+14
"git push --atomic" that goes over the transport-helper (namely, the smart http transport) failed to prevent refs to be pushed when it can locally tell that one of the ref update will fail without having to consult the other end, which has been corrected. * es/local-atomic-push-failure-with-http: transport-helper: avoid var decl in for () loop control transport-helper: enforce atomic in push_refs_with_push
2019-07-16transport-helper: avoid var decl in for () loop controlLibravatar Junio C Hamano1-1/+2
We do allow a few selected C99 constructs in our codebase these days, but this is not among them (yet). Reported-by: Carlo Arenas <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-12transport-helper: enforce atomic in push_refs_with_pushLibravatar Emily Shaffer1-0/+13
Teach transport-helper how to notice if skipping a ref during push would violate atomicity on the client side. We notice that a ref would be rejected, and choose not to send it, but don't notice that if the client has asked for --atomic we are violating atomicity if all the other pushes we are sending would succeed. Asking the server end to uphold atomicity wouldn't work here as the server doesn't have any idea that we tried to update a ref that's broken. The added test-case is a succinct way to reproduce this issue that fails today. The same steps work fine when we aren't using a transport-helper to get to the upstream, i.e. when we've added a local repository as a remote: git remote add ~/upstream upstream Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-01object-store.h: move for_each_alternate_ref() from transport.hLibravatar Jeff King1-97/+0
There's nothing inherently transport-related about enumerating the alternate ref tips. The code has lived in transport.[ch] because the only use so far had been advertising available tips during transport. But it could be used for more, and a future patch will teach rev-list to access these refs. Let's move it alongside the other alt-odb code, declaring it in object-store.h with the implementation in sha1-file.c. This lets us drop the inclusion of transport.h from receive-pack, which perhaps shows how it was misplaced (though receive-pack is about transporting objects, transport.h is mostly about the client side). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-28list-objects-filter-options: allow mult. --filterLibravatar Matthew DeVore1-0/+1
Allow combining of multiple filters by simply repeating the --filter flag. Before this patch, the user had to combine them in a single flag somewhat awkwardly (e.g. --filter=combine:FOO+BAR), including URL-encoding the individual filters. To make this work, in the --filter flag parsing callback, rather than error out when we detect that the filter_options struct is already populated, we modify it in-place to contain the added sub-filter. The existing sub-filter becomes the lhs of the combined filter, and the next sub-filter becomes the rhs. We also have to URL-encode the LHS and RHS sub-filters. We can simplify the operation if the LHS is already a combine: filter. In that case, we just append the URL-encoded RHS sub-filter to the LHS spec to get the new spec. Helped-by: Emily Shaffer <emilyshaffer@google.com> Helped-by: Jeff Hostetler <git@jeffhostetler.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-09Merge branch 'jt/clone-server-option'Libravatar Junio C Hamano1-0/+10
"git clone" learned a new --server-option option when talking over the protocol version 2. * jt/clone-server-option: clone: send server options when using protocol v2 transport: die if server options are unsupported