summaryrefslogtreecommitdiff
path: root/http.h
AgeCommit message (Collapse)AuthorFilesLines
2021-02-22http: allow custom index-pack argsLibravatar Jonathan Tan1-5/+5
Currently, when fetching, packfiles referenced by URIs are run through index-pack without any arguments other than --stdin and --keep, no matter what arguments are used for the packfile that is inline in the fetch response. As a preparation for ensuring that all packs (whether inline or not) use the same index-pack arguments, teach the http subsystem to allow custom index-pack arguments. http-fetch has been updated to use the new API. For now, it passes --keep alone instead of --keep with a process ID, but this is only temporary because http-fetch itself will be taught to accept index-pack parameters (instead of using a hardcoded constant) in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25Merge branch 'jt/cdn-offload'Libravatar Junio C Hamano1-3/+21
The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack
2020-06-10http-fetch: support fetching packfiles by URLLibravatar Jonathan Tan1-0/+11
Teach http-fetch the ability to download packfiles directly, given a URL, and to verify them. The http_pack_request suite has been augmented with a function that takes a URL directly. With this function, the hash is only used to determine the name of the temporary file. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-10http: refactor finish_http_pack_request()Libravatar Jonathan Tan1-3/+10
finish_http_pack_request() does multiple tasks, including some housekeeping on a struct packed_git - (1) closing its index, (2) removing it from a list, and (3) installing it. These concerns are independent of fetching a pack through HTTP: they are there only because (1) the calling code opens the pack's index before deciding to fetch it, (2) the calling code maintains a list of packfiles that can be fetched, and (3) the calling code fetches it in order to make use of its objects in the same process. In preparation for a subsequent commit, which adds a feature that does not need any of this housekeeping, remove (1), (2), and (3) from finish_http_pack_request(). (2) and (3) are now done by a helper function, and (1) is the responsibility of the caller (in this patch, done closer to the point where the pack index is opened). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-05-11http, imap-send: stop using CURLOPT_VERBOSELibravatar Jonathan Tan1-0/+7
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting CURLOPT_VERBOSE. This is to prevent inadvertent revelation of sensitive data. In particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header nor any cookies specified by GIT_REDACT_COOKIES. Unifying the tracing mechanism also has the future benefit that any improvements to the tracing mechanism will benefit both users of GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to implement any improvement twice. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-15http: use xmalloc with cURLLibravatar Carlo Marcelo Arenas Belón1-0/+4
f0ed8226c9 (Add custom memory allocator to MinGW and MacOS builds, 2009-05-31) never told cURL about it. Correct that by using the cURL initializer available since version 7.12 to point to xmalloc and friends for consistency which then will pass the allocation requests along when USE_NED_ALLOCATOR=YesPlease is used (most likely in Windows) Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-13Merge branch 'dl/no-extern-in-func-decl'Libravatar Junio C Hamano1-31/+31
Mechanically and systematically drop "extern" from function declarlation. * dl/no-extern-in-func-decl: *.[ch]: manually align parameter lists *.[ch]: remove extern from function declarations using sed *.[ch]: remove extern from function declarations using spatch
2019-05-05*.[ch]: manually align parameter listsLibravatar Denton Liu1-5/+5
In previous patches, extern was mechanically removed from function declarations without care to formatting, causing parameter lists to be misaligned. Manually format changed sections such that the parameter lists should be realigned. Viewing this patch with 'git diff -w' should produce no output. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-05*.[ch]: remove extern from function declarations using spatchLibravatar Denton Liu1-26/+26
There has been a push to remove extern from function declarations. Remove some instances of "extern" for function declarations which are caught by Coccinelle. Note that Coccinelle has some difficulty with processing functions with `__attribute__` or varargs so some `extern` declarations are left behind to be dealt with in a future patch. This was the Coccinelle patch used: @@ type T; identifier f; @@ - extern T f(...); and it was run with: $ git ls-files \*.{c,h} | grep -v ^compat/ | xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-25Merge branch 'bc/hash-transition-16'Libravatar Junio C Hamano1-1/+1
Conversion from unsigned char[20] to struct object_id continues. * bc/hash-transition-16: (35 commits) gitweb: make hash size independent Git.pm: make hash size independent read-cache: read data in a hash-independent way dir: make untracked cache extension hash size independent builtin/difftool: use parse_oid_hex refspec: make hash size independent archive: convert struct archiver_args to object_id builtin/get-tar-commit-id: make hash size independent get-tar-commit-id: parse comment record hash: add a function to lookup hash algorithm by length remote-curl: make hash size independent http: replace sha1_to_hex http: compute hash of downloaded objects using the_hash_algo http: replace hard-coded constant with the_hash_algo http-walker: replace sha1_to_hex http-push: remove remaining uses of sha1_to_hex http-backend: allow 64-character hex names http-push: convert to use the_hash_algo builtin/pull: make hash-size independent builtin/am: make hash size independent ...
2019-04-01http: compute hash of downloaded objects using the_hash_algoLibravatar brian m. carlson1-1/+1
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-24http: factor out curl result code normalizationLibravatar Jeff King1-0/+9
We make some requests with CURLOPT_FAILONERROR and some without, and then handle_curl_result() normalizes any failures to a uniform CURLcode. There are some other code paths in the dumb-http walker which don't use handle_curl_result(); let's pull the normalization into its own function so it can be reused. Arguably those code paths would benefit from the rest of handle_curl_result(), notably the auth handling. But retro-fitting it now would be a lot of work, and in practice it doesn't matter too much (whatever authentication we needed to make the initial contact with the server is generally sufficient for the rest of the dumb-http requests). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-06Merge branch 'jk/loose-object-cache-oid'Libravatar Junio C Hamano1-3/+3
Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references
2019-01-10http: enable keep_error for HTTP requestsLibravatar Masaya Suzuki1-1/+0
curl stops parsing a response when it sees a bad HTTP status code and it has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP headers on error. keep_error is an option to receive the HTTP response body for those error responses. By enabling this option, curl will process the HTTP response headers, and they're shown if GIT_CURL_VERBOSE is set. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08http: use struct object_id instead of bare sha1Libravatar Jeff King1-3/+3
The dumb-http walker code still passes around and stores object ids as "unsigned char *sha1". Let's modernize it. There's probably still more work to be done to handle dumb-http fetches with a new, larger hash. But that can wait; this is enough that we can now convert some of the low-level object routines that we call into from here (and in fact, some of the "oid.hash" references added here will be further improved in the next patch). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'jk/snprintf-truncation'Libravatar Junio C Hamano1-2/+2
Avoid unchecked snprintf() to make future code auditing easier. * jk/snprintf-truncation: fmt_with_err: add a comment that truncation is OK shorten_unambiguous_ref: use xsnprintf fsmonitor: use internal argv_array of struct child_process log_write_email_headers: use strbufs http: use strbufs instead of fixed buffers
2018-05-21http: use strbufs instead of fixed buffersLibravatar Jeff King1-2/+2
We keep the names of incoming packs and objects in fixed PATH_MAX-size buffers, and snprintf() into them. This is unlikely to end up with truncated filenames, but it is possible (especially on systems where PATH_MAX is shorter than actual paths can be). Let's switch to using strbufs, which makes the question go away entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15http: allow providing extra headers for http requestsLibravatar Brandon Williams1-0/+7
Add a way for callers to request that extra headers be included when making http requests. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13http.postbuffer: allow full range of ssize_t valuesLibravatar David Turner1-1/+1
Unfortunately, in order to push some large repos where a server does not support chunked encoding, the http postbuffer must sometimes exceed two gigabytes. On a 64-bit system, this is OK: we just malloc a larger buffer. This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the buffer size. Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-19Merge branch 'jk/http-walker-limit-redirect-2.9'Libravatar Junio C Hamano1-1/+9
Transport with dumb http can be fooled into following foreign URLs that the end user does not intend to, especially with the server side redirects and http-alternates mechanism, which can lead to security issues. Tighten the redirection and make it more obvious to the end user when it happens. * jk/http-walker-limit-redirect-2.9: http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect
2016-12-06http: make redirects more obviousLibravatar Jeff King1-1/+9
We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since c93c92f30 (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06Merge branch 'ep/http-curl-trace'Libravatar Junio C Hamano1-0/+2
HTTP transport gained an option to produce more detailed debugging trace. * ep/http-curl-trace: imap-send.c: introduce the GIT_TRACE_CURL enviroment variable http.c: implement the GIT_TRACE_CURL environment variable
2016-05-24http.c: implement the GIT_TRACE_CURL environment variableLibravatar Elia Pinto1-0/+2
Implement the GIT_TRACE_CURL environment variable to allow a greater degree of detail of GIT_CURL_VERBOSE, in particular the complete transport header and all the data payload exchanged. It might be useful if a particular situation could require a more thorough debugging analysis. Document the new GIT_TRACE_CURL environment variable. Helped-by: Torsten Bögershausen <tboegi@web.de> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-27http: support sending custom HTTP headersLibravatar Johannes Schindelin1-0/+1
We introduce a way to send custom HTTP headers with all requests. This allows us, for example, to send an extra token from build agents for temporary access to private repositories. (This is the use case that triggered this patch.) This feature can be used like this: git -c http.extraheader='Secret: sssh!' fetch $URL $REF Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only a single list, overriding any previous call. This means we have to collect _all_ of the headers we want to use into a single list, and feed it to cURL in one shot. Since we already unconditionally set a "pragma" header when initializing the curl handles, we can add our new headers to that list. For callers which override the default header list (like probe_rpc), we provide `http_copy_default_headers()` so they can do the same trick. Big thanks to Jeff King and Junio Hamano for their outstanding help and patient reviews. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-24Merge branch 'ew/force-ipv4'Libravatar Junio C Hamano1-0/+1
"git fetch" and friends that make network connections can now be told to only use ipv4 (or ipv6). * ew/force-ipv4: connect & http: support -4 and -6 switches for remote operations
2016-02-12connect & http: support -4 and -6 switches for remote operationsLibravatar Eric Wong1-0/+1
Sometimes it is necessary to force IPv4-only or IPv6-only operation on networks where name lookups may return a non-routable address and stall remote operations. The ssh(1) command has an equivalent switches which we may pass when we run them. There may be old ssh(1) implementations out there which do not support these switches; they should report the appropriate error in that case. rsync support is untouched for now since it is deprecated and scheduled to be removed. Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26http: use credential API to handle proxy authenticationLibravatar Knut Franke1-0/+1
Currently, the only way to pass proxy credentials to curl is by including them in the proxy URL. Usually, this means they will end up on disk unencrypted, one way or another (by inclusion in ~/.gitconfig, shell profile or history). Since proxy authentication often uses a domain user, credentials can be security sensitive; therefore, a safer way of passing credentials is desirable. If the configured proxy contains a username but not a password, query the credential API for one. Also, make sure we approve/reject proxy credentials properly. For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy environment variables, which would otherwise be evaluated as a fallback by curl. Without this, we would have different semantics for git configuration and environment variables. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Knut Franke <k.franke@science-computing.de> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02http.c: use CURLOPT_RANGE for range requestsLibravatar David Turner1-1/+0
A HTTP server is permitted to return a non-range response to a HTTP range request (and Apache httpd in fact does this in some cases). While libcurl knows how to correctly handle this (by skipping bytes before and after the requested range), it only turns on this handling if it is aware that a range request is being made. By manually setting the range header instead of using CURLOPT_RANGE, we were hiding the fact that this was a range request from libcurl. This could cause corruption. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-15http.c: make finish_active_slot() and handle_curl_result() staticLibravatar Junio C Hamano1-2/+0
They used to be used directly by remote-curl.c for the smart-http protocol. But they got wrapped by run_one_slot() in beed336 (http: never use curl_easy_perform, 2014-02-18). Any future users are expected to follow that route. Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27http: optionally extract charset parameter from content-typeLibravatar Jeff King1-0/+7
Since the previous commit, we now give a sanitized, shortened version of the content-type header to any callers who ask for it. This patch adds back a way for them to cleanly access specific parameters to the type. We could easily extract all parameters and make them available via a string_list, but: 1. That complicates the interface and memory management. 2. In practice, no planned callers care about anything except the charset. This patch therefore goes with the simplest thing, and we can expand or change the interface later if it becomes necessary. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-08Merge branch 'jl/nor-or-nand-and'Libravatar Junio C Hamano1-2/+1
Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"
2014-03-31comments: fix misuses of "nor"Libravatar Justin Lebar1-2/+1
Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18http: never use curl_easy_performLibravatar Jeff King1-0/+9
We currently don't reuse http connections when fetching via the smart-http protocol. This is bad because the TCP handshake introduces latency, and especially because SSL connection setup may be non-trivial. We can fix it by consistently using curl's "multi" interface. The reason is rather complicated: Our http code has two ways of being used: queuing many "slots" to be fetched in parallel, or fetching a single request in a blocking manner. The parallel code is built on curl's "multi" interface. Most of the single-request code uses http_request, which is built on top of the parallel code (we just feed it one slot, and wait until it finishes). However, one could also accomplish the single-request scheme by avoiding curl's multi interface entirely and just using curl_easy_perform. This is simpler, and is used by post_rpc in the smart-http protocol. It does work to use the same curl handle in both contexts, as long as it is not at the same time. However, internally curl may not share all of the cached resources between both contexts. In particular, a connection formed using the "multi" code will go into a reuse pool connected to the "multi" object. Further requests using the "easy" interface will not be able to reuse that connection. The smart http protocol does ref discovery via http_request, which uses the "multi" interface, and then follows up with the "easy" interface for its rpc calls. As a result, we make two HTTP connections rather than reusing a single one. We could teach the ref discovery to use the "easy" interface. But it is only once we have done this discovery that we know whether the protocol will be smart or dumb. If it is dumb, then our further requests, which want to fetch objects in parallel, will not be able to reuse the same connection. Instead, this patch switches post_rpc to build on the parallel interface, which means that we use it consistently everywhere. It's a little more complicated to use, but since we have the infrastructure already, it doesn't add any code; we can just factor out the relevant bits from http_request. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05Merge branch 'bc/http-100-continue'Libravatar Junio C Hamano1-0/+1
Issue "100 Continue" responses to help use of GSS-Negotiate authentication scheme over HTTP transport when needed. * bc/http-100-continue: remote-curl: fix large pushes with GSSAPI remote-curl: pass curl slot_results back through run_slot http: return curl's AUTHAVAIL via slot_results
2013-10-31http: return curl's AUTHAVAIL via slot_resultsLibravatar Jeff King1-0/+1
Callers of the http code may want to know which auth types were available for the previous request. But after finishing with the curl slot, they are not supposed to look at the curl handle again. We already handle returning other information via the slot_results struct; let's add a flag to check the available auth. Note that older versions of curl did not support this, so we simply return 0 (something like "-1" would be worse, as the value is a bitflag and we might accidentally set a flag). This is sufficient for the callers planned in this series, who only trigger some optional behavior if particular bits are set, and can live with a fake "no bits" answer. Signed-off-by: Jeff King <peff@peff.net>
2013-10-14http: update base URLs when we see redirectsLibravatar Jeff King1-0/+8
If a caller asks the http_get_* functions to go to a particular URL and we end up elsewhere due to a redirect, the effective_url field can tell us where we went. It would be nice to remember this redirect and short-cut further requests for two reasons: 1. It's more efficient. Otherwise we spend an extra http round-trip to the server for each subsequent request, just to get redirected. 2. If we end up with an http 401 and are going to ask for credentials, it is to feed them to the redirect target. If the redirect is an http->https upgrade, this means our credentials may be provided on the http leg, just to end up redirected to https. And if the redirect crosses server boundaries, then curl will drop the credentials entirely as it follows the redirect. However, it, it is not enough to simply record the effective URL we saw and use that for subsequent requests. We were originally fed a "base" url like: http://example.com/foo.git and we want to figure out what the new base is, even though the URLs we see may be: original: http://example.com/foo.git/info/refs effective: http://example.com/bar.git/info/refs Subsequent requests will not be for "info/refs", but for other paths relative to the base. We must ask the caller to pass in the original base, and we must pass the redirected base back to the caller (so that it can generate more URLs from it). Furthermore, we need to feed the new base to the credential code, so that requests to credential helpers (or to the user) match the URL we will be requesting. This patch teaches http_request_reauth to do this munging. Since it is the caller who cares about making more URLs, it seems at first glance that callers could simply check effective_url themselves and handle it. However, since we need to update the credential struct before the second re-auth request, we have to do it inside http_request_reauth. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14http: provide effective url to callersLibravatar Jeff King1-0/+6
When we ask curl to access a URL, it may follow one or more redirects to reach the final location. We have no idea this has happened, as curl takes care of the details and simply returns the final content to us. The final URL that we ended up with can be accessed via CURLINFO_EFFECTIVE_URL. Let's make that optionally available to callers of http_get_*, so that they can make further decisions based on the redirection. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14http: hoist credential request out of handle_curl_resultLibravatar Jeff King1-0/+1
When we are handling a curl response code in http_request or in the remote-curl RPC code, we use the handle_curl_result helper to translate curl's response into an easy-to-use code. When we see an HTTP 401, we do one of two things: 1. If we already had a filled-in credential, we mark it as rejected, and then return HTTP_NOAUTH to indicate to the caller that we failed. 2. If we didn't, then we ask for a new credential and tell the caller HTTP_REAUTH to indicate that they may want to try again. Rejecting in the first case makes sense; it is the natural result of the request we just made. However, prompting for more credentials in the second step does not always make sense. We do not know for sure that the caller is going to make a second request, and nor are we sure that it will be to the same URL. Logically, the prompt belongs not to the request we just finished, but to the request we are (maybe) about to make. In practice, it is very hard to trigger any bad behavior. Currently, if we make a second request, it will always be to the same URL (even in the face of redirects, because curl handles the redirects internally). And we almost always retry on HTTP_REAUTH these days. The one exception is if we are streaming a large RPC request to the server (e.g., a pushed packfile), in which case we cannot restart. It's extremely unlikely to see a 401 response at this stage, though, as we would typically have seen it when we sent a probe request, before streaming the data. This patch drops the automatic prompt out of case 2, and instead requires the caller to do it. This is a few extra lines of code, and the bug it fixes is unlikely to come up in practice. But it is conceptually cleaner, and paves the way for better handling of credentials across redirects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-30http: refactor options to http_get_*Libravatar Jeff King1-5/+10
Over time, the http_get_strbuf function has grown several optional parameters. We now have a bitfield with multiple boolean options, as well as an optional strbuf for returning the content-type of the response. And a future patch in this series is going to add another strbuf option. Treating these as separate arguments has a few downsides: 1. Most call sites need to add extra NULLs and 0s for the options they aren't interested in. 2. The http_get_* functions are actually wrappers around 2 layers of low-level implementation functions. We have to pass these options through individually. 3. The http_get_strbuf wrapper learned these options, but nobody bothered to do so for http_get_file, even though it is backed by the same function that does understand the options. Let's consolidate the options into a single struct. For the common case of the default options, we'll allow callers to simply pass a NULL for the options struct. The resulting code is often a few lines longer, but it ends up being easier to read (and to change as we add new options, since we do not need to update each call site). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-04-19Merge branch 'mv/ssl-ftp-curl'Libravatar Junio C Hamano1-0/+9
Does anybody really use commit walkers over (s)ftp? * mv/ssl-ftp-curl: Support FTP-over-SSL/TLS for regular FTP
2013-04-12Support FTP-over-SSL/TLS for regular FTPLibravatar Modestas Vainius1-0/+9
Add a boolean http.sslTry option which allows to enable AUTH SSL/TLS and encrypted data transfers when connecting via regular FTP protocol. Default is false since it might trigger certificate verification errors on misconfigured servers. Signed-off-by: Modestas Vainius <modestas@vainius.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06http: drop http_error functionLibravatar Jeff King1-5/+0
This function is a single-liner and is only called from one place. Just inline it, which makes the code more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06http: simplify http_error helper functionLibravatar Jeff King1-3/+2
This helper function should really be a one-liner that prints an error message, but it has ended up unnecessarily complicated: 1. We call error() directly when we fail to start the curl request, so we must later avoid printing a duplicate error in http_error(). It would be much simpler in this case to just stuff the error message into our usual curl_errorstr buffer rather than printing it ourselves. This means that http_error does not even have to care about curl's exit value (the interesting part is in the errorstr buffer already). 2. We return the "ret" value passed in to us, but none of the callers actually cares about our return value. We can just drop this entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06http: add HTTP_KEEP_ERROR optionLibravatar Jeff King1-0/+1
We currently set curl's FAILONERROR option, which means that any http failures are reported as curl errors, and the http body content from the server is thrown away. This patch introduces a new option to http_get_strbuf which specifies that the body content from a failed http response should be placed in the destination strbuf, where it can be accessed by the caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-04Verify Content-Type from smart HTTP serversLibravatar Shawn Pearce1-1/+1
Before parsing a suspected smart-HTTP response verify the returned Content-Type matches the standard. This protects a client from attempting to process a payload that smells like a smart-HTTP server response. JGit has been doing this check on all responses since the dawn of time. I mistakenly failed to include it in git-core when smart HTTP was introduced. At the time I didn't know how to get the Content-Type from libcurl. I punted, meant to circle back and fix this, and just plain forgot about it. Signed-off-by: Shawn Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12http: do not set up curl auth after a 401Libravatar Jeff King1-2/+1
When we get an http 401, we prompt for credentials and put them in our global credential struct. We also feed them to the curl handle that produced the 401, with the intent that they will be used on a retry. When the code was originally introduced in commit 42653c0, this was a necessary step. However, since dfa1725, we always feed our global credential into every curl handle when we initialize the slot with get_active_slot. So every further request already feeds the credential to curl. Moreover, accessing the slot here is somewhat dubious. After the slot has produced a response, we don't actually control it any more. If we are using curl_multi, it may even have been re-initialized to handle a different request. It just so happens that we will reuse the curl handle within the slot in such a case, and that because we only keep one global credential, it will be the one we want. So the current code is not buggy, but it is misleading. By cleaning it up, we can remove the slot argument entirely from handle_curl_result, making it much more obvious that slots should not be accessed after they are marked as finished. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12http: fix segfault in handle_curl_resultLibravatar Jeff King1-1/+2
When we create an http active_request_slot, we can set its "results" pointer back to local storage. The http code will fill in the details of how the request went, and we can access those details even after the slot has been cleaned up. Commit 8809703 (http: factor out http error code handling) switched us from accessing our local results struct directly to accessing it via the "results" pointer of the slot. That means we're accessing the slot after it has been marked as finished, defeating the whole purpose of keeping the results storage separate. Most of the time this doesn't matter, as finishing the slot does not actually clean up the pointer. However, when using curl's multi interface with the dumb-http revision walker, we might actually start a new request before handing control back to the original caller. In that case, we may reuse the slot, zeroing its results pointer, and leading the original caller to segfault while looking for its results inside the slot. Instead, we need to pass a pointer to our local results storage to the handle_curl_result function, rather than relying on the pointer in the slot struct. This matches what the original code did before the refactoring (which did not use a separate function, and therefore just accessed the results struct directly). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-27http: factor out http error code handlingLibravatar Jeff King1-0/+1
Most of our http requests go through the http_request() interface, which does some nice post-processing on the results. In particular, it handles prompting for missing credentials as well as approving and rejecting valid or invalid credentials. Unfortunately, it only handles GET requests. Making it handle POSTs would be quite complex, so let's pull result handling code into its own function so that it can be reused from the POST code paths. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-28correct spelling: an URL -> a URLLibravatar Jim Meyering1-1/+1
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-19Merge branch 'jk/maint-push-over-dav'Libravatar Junio C Hamano1-1/+2
* jk/maint-push-over-dav: http-push: enable "proactive auth" t5540: test DAV push with authentication Conflicts: http.c