summaryrefslogtreecommitdiff
path: root/http.h
AgeCommit message (Collapse)AuthorFilesLines
2017-01-17Merge branch 'jk/http-walker-limit-redirect' into maintLibravatar Junio C Hamano1-1/+9
Update the error messages from the dumb-http client when it fails to obtain loose objects; we used to give sensible error message only upon 404 but we now forbid unexpected redirects that needs to be reported with something sensible. * jk/http-walker-limit-redirect: http-walker: complain about non-404 loose object errors http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect
2016-12-06http: make redirects more obviousLibravatar Jeff King1-1/+9
We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since c93c92f30 (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-06Merge branch 'ep/http-curl-trace'Libravatar Junio C Hamano1-0/+2
HTTP transport gained an option to produce more detailed debugging trace. * ep/http-curl-trace: imap-send.c: introduce the GIT_TRACE_CURL enviroment variable http.c: implement the GIT_TRACE_CURL environment variable
2016-05-24http.c: implement the GIT_TRACE_CURL environment variableLibravatar Elia Pinto1-0/+2
Implement the GIT_TRACE_CURL environment variable to allow a greater degree of detail of GIT_CURL_VERBOSE, in particular the complete transport header and all the data payload exchanged. It might be useful if a particular situation could require a more thorough debugging analysis. Document the new GIT_TRACE_CURL environment variable. Helped-by: Torsten Bögershausen <tboegi@web.de> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-27http: support sending custom HTTP headersLibravatar Johannes Schindelin1-0/+1
We introduce a way to send custom HTTP headers with all requests. This allows us, for example, to send an extra token from build agents for temporary access to private repositories. (This is the use case that triggered this patch.) This feature can be used like this: git -c http.extraheader='Secret: sssh!' fetch $URL $REF Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only a single list, overriding any previous call. This means we have to collect _all_ of the headers we want to use into a single list, and feed it to cURL in one shot. Since we already unconditionally set a "pragma" header when initializing the curl handles, we can add our new headers to that list. For callers which override the default header list (like probe_rpc), we provide `http_copy_default_headers()` so they can do the same trick. Big thanks to Jeff King and Junio Hamano for their outstanding help and patient reviews. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-24Merge branch 'ew/force-ipv4'Libravatar Junio C Hamano1-0/+1
"git fetch" and friends that make network connections can now be told to only use ipv4 (or ipv6). * ew/force-ipv4: connect & http: support -4 and -6 switches for remote operations
2016-02-12connect & http: support -4 and -6 switches for remote operationsLibravatar Eric Wong1-0/+1
Sometimes it is necessary to force IPv4-only or IPv6-only operation on networks where name lookups may return a non-routable address and stall remote operations. The ssh(1) command has an equivalent switches which we may pass when we run them. There may be old ssh(1) implementations out there which do not support these switches; they should report the appropriate error in that case. rsync support is untouched for now since it is deprecated and scheduled to be removed. Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26http: use credential API to handle proxy authenticationLibravatar Knut Franke1-0/+1
Currently, the only way to pass proxy credentials to curl is by including them in the proxy URL. Usually, this means they will end up on disk unencrypted, one way or another (by inclusion in ~/.gitconfig, shell profile or history). Since proxy authentication often uses a domain user, credentials can be security sensitive; therefore, a safer way of passing credentials is desirable. If the configured proxy contains a username but not a password, query the credential API for one. Also, make sure we approve/reject proxy credentials properly. For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy environment variables, which would otherwise be evaluated as a fallback by curl. Without this, we would have different semantics for git configuration and environment variables. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Knut Franke <k.franke@science-computing.de> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02http.c: use CURLOPT_RANGE for range requestsLibravatar David Turner1-1/+0
A HTTP server is permitted to return a non-range response to a HTTP range request (and Apache httpd in fact does this in some cases). While libcurl knows how to correctly handle this (by skipping bytes before and after the requested range), it only turns on this handling if it is aware that a range request is being made. By manually setting the range header instead of using CURLOPT_RANGE, we were hiding the fact that this was a range request from libcurl. This could cause corruption. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-15http.c: make finish_active_slot() and handle_curl_result() staticLibravatar Junio C Hamano1-2/+0
They used to be used directly by remote-curl.c for the smart-http protocol. But they got wrapped by run_one_slot() in beed336 (http: never use curl_easy_perform, 2014-02-18). Any future users are expected to follow that route. Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27http: optionally extract charset parameter from content-typeLibravatar Jeff King1-0/+7
Since the previous commit, we now give a sanitized, shortened version of the content-type header to any callers who ask for it. This patch adds back a way for them to cleanly access specific parameters to the type. We could easily extract all parameters and make them available via a string_list, but: 1. That complicates the interface and memory management. 2. In practice, no planned callers care about anything except the charset. This patch therefore goes with the simplest thing, and we can expand or change the interface later if it becomes necessary. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-08Merge branch 'jl/nor-or-nand-and'Libravatar Junio C Hamano1-2/+1
Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"
2014-03-31comments: fix misuses of "nor"Libravatar Justin Lebar1-2/+1
Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18http: never use curl_easy_performLibravatar Jeff King1-0/+9
We currently don't reuse http connections when fetching via the smart-http protocol. This is bad because the TCP handshake introduces latency, and especially because SSL connection setup may be non-trivial. We can fix it by consistently using curl's "multi" interface. The reason is rather complicated: Our http code has two ways of being used: queuing many "slots" to be fetched in parallel, or fetching a single request in a blocking manner. The parallel code is built on curl's "multi" interface. Most of the single-request code uses http_request, which is built on top of the parallel code (we just feed it one slot, and wait until it finishes). However, one could also accomplish the single-request scheme by avoiding curl's multi interface entirely and just using curl_easy_perform. This is simpler, and is used by post_rpc in the smart-http protocol. It does work to use the same curl handle in both contexts, as long as it is not at the same time. However, internally curl may not share all of the cached resources between both contexts. In particular, a connection formed using the "multi" code will go into a reuse pool connected to the "multi" object. Further requests using the "easy" interface will not be able to reuse that connection. The smart http protocol does ref discovery via http_request, which uses the "multi" interface, and then follows up with the "easy" interface for its rpc calls. As a result, we make two HTTP connections rather than reusing a single one. We could teach the ref discovery to use the "easy" interface. But it is only once we have done this discovery that we know whether the protocol will be smart or dumb. If it is dumb, then our further requests, which want to fetch objects in parallel, will not be able to reuse the same connection. Instead, this patch switches post_rpc to build on the parallel interface, which means that we use it consistently everywhere. It's a little more complicated to use, but since we have the infrastructure already, it doesn't add any code; we can just factor out the relevant bits from http_request. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05Merge branch 'bc/http-100-continue'Libravatar Junio C Hamano1-0/+1
Issue "100 Continue" responses to help use of GSS-Negotiate authentication scheme over HTTP transport when needed. * bc/http-100-continue: remote-curl: fix large pushes with GSSAPI remote-curl: pass curl slot_results back through run_slot http: return curl's AUTHAVAIL via slot_results
2013-10-31http: return curl's AUTHAVAIL via slot_resultsLibravatar Jeff King1-0/+1
Callers of the http code may want to know which auth types were available for the previous request. But after finishing with the curl slot, they are not supposed to look at the curl handle again. We already handle returning other information via the slot_results struct; let's add a flag to check the available auth. Note that older versions of curl did not support this, so we simply return 0 (something like "-1" would be worse, as the value is a bitflag and we might accidentally set a flag). This is sufficient for the callers planned in this series, who only trigger some optional behavior if particular bits are set, and can live with a fake "no bits" answer. Signed-off-by: Jeff King <peff@peff.net>
2013-10-14http: update base URLs when we see redirectsLibravatar Jeff King1-0/+8
If a caller asks the http_get_* functions to go to a particular URL and we end up elsewhere due to a redirect, the effective_url field can tell us where we went. It would be nice to remember this redirect and short-cut further requests for two reasons: 1. It's more efficient. Otherwise we spend an extra http round-trip to the server for each subsequent request, just to get redirected. 2. If we end up with an http 401 and are going to ask for credentials, it is to feed them to the redirect target. If the redirect is an http->https upgrade, this means our credentials may be provided on the http leg, just to end up redirected to https. And if the redirect crosses server boundaries, then curl will drop the credentials entirely as it follows the redirect. However, it, it is not enough to simply record the effective URL we saw and use that for subsequent requests. We were originally fed a "base" url like: http://example.com/foo.git and we want to figure out what the new base is, even though the URLs we see may be: original: http://example.com/foo.git/info/refs effective: http://example.com/bar.git/info/refs Subsequent requests will not be for "info/refs", but for other paths relative to the base. We must ask the caller to pass in the original base, and we must pass the redirected base back to the caller (so that it can generate more URLs from it). Furthermore, we need to feed the new base to the credential code, so that requests to credential helpers (or to the user) match the URL we will be requesting. This patch teaches http_request_reauth to do this munging. Since it is the caller who cares about making more URLs, it seems at first glance that callers could simply check effective_url themselves and handle it. However, since we need to update the credential struct before the second re-auth request, we have to do it inside http_request_reauth. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14http: provide effective url to callersLibravatar Jeff King1-0/+6
When we ask curl to access a URL, it may follow one or more redirects to reach the final location. We have no idea this has happened, as curl takes care of the details and simply returns the final content to us. The final URL that we ended up with can be accessed via CURLINFO_EFFECTIVE_URL. Let's make that optionally available to callers of http_get_*, so that they can make further decisions based on the redirection. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-10-14http: hoist credential request out of handle_curl_resultLibravatar Jeff King1-0/+1
When we are handling a curl response code in http_request or in the remote-curl RPC code, we use the handle_curl_result helper to translate curl's response into an easy-to-use code. When we see an HTTP 401, we do one of two things: 1. If we already had a filled-in credential, we mark it as rejected, and then return HTTP_NOAUTH to indicate to the caller that we failed. 2. If we didn't, then we ask for a new credential and tell the caller HTTP_REAUTH to indicate that they may want to try again. Rejecting in the first case makes sense; it is the natural result of the request we just made. However, prompting for more credentials in the second step does not always make sense. We do not know for sure that the caller is going to make a second request, and nor are we sure that it will be to the same URL. Logically, the prompt belongs not to the request we just finished, but to the request we are (maybe) about to make. In practice, it is very hard to trigger any bad behavior. Currently, if we make a second request, it will always be to the same URL (even in the face of redirects, because curl handles the redirects internally). And we almost always retry on HTTP_REAUTH these days. The one exception is if we are streaming a large RPC request to the server (e.g., a pushed packfile), in which case we cannot restart. It's extremely unlikely to see a 401 response at this stage, though, as we would typically have seen it when we sent a probe request, before streaming the data. This patch drops the automatic prompt out of case 2, and instead requires the caller to do it. This is a few extra lines of code, and the bug it fixes is unlikely to come up in practice. But it is conceptually cleaner, and paves the way for better handling of credentials across redirects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-30http: refactor options to http_get_*Libravatar Jeff King1-5/+10
Over time, the http_get_strbuf function has grown several optional parameters. We now have a bitfield with multiple boolean options, as well as an optional strbuf for returning the content-type of the response. And a future patch in this series is going to add another strbuf option. Treating these as separate arguments has a few downsides: 1. Most call sites need to add extra NULLs and 0s for the options they aren't interested in. 2. The http_get_* functions are actually wrappers around 2 layers of low-level implementation functions. We have to pass these options through individually. 3. The http_get_strbuf wrapper learned these options, but nobody bothered to do so for http_get_file, even though it is backed by the same function that does understand the options. Let's consolidate the options into a single struct. For the common case of the default options, we'll allow callers to simply pass a NULL for the options struct. The resulting code is often a few lines longer, but it ends up being easier to read (and to change as we add new options, since we do not need to update each call site). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-04-19Merge branch 'mv/ssl-ftp-curl'Libravatar Junio C Hamano1-0/+9
Does anybody really use commit walkers over (s)ftp? * mv/ssl-ftp-curl: Support FTP-over-SSL/TLS for regular FTP
2013-04-12Support FTP-over-SSL/TLS for regular FTPLibravatar Modestas Vainius1-0/+9
Add a boolean http.sslTry option which allows to enable AUTH SSL/TLS and encrypted data transfers when connecting via regular FTP protocol. Default is false since it might trigger certificate verification errors on misconfigured servers. Signed-off-by: Modestas Vainius <modestas@vainius.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06http: drop http_error functionLibravatar Jeff King1-5/+0
This function is a single-liner and is only called from one place. Just inline it, which makes the code more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06http: simplify http_error helper functionLibravatar Jeff King1-3/+2
This helper function should really be a one-liner that prints an error message, but it has ended up unnecessarily complicated: 1. We call error() directly when we fail to start the curl request, so we must later avoid printing a duplicate error in http_error(). It would be much simpler in this case to just stuff the error message into our usual curl_errorstr buffer rather than printing it ourselves. This means that http_error does not even have to care about curl's exit value (the interesting part is in the errorstr buffer already). 2. We return the "ret" value passed in to us, but none of the callers actually cares about our return value. We can just drop this entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-06http: add HTTP_KEEP_ERROR optionLibravatar Jeff King1-0/+1
We currently set curl's FAILONERROR option, which means that any http failures are reported as curl errors, and the http body content from the server is thrown away. This patch introduces a new option to http_get_strbuf which specifies that the body content from a failed http response should be placed in the destination strbuf, where it can be accessed by the caller. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-04Verify Content-Type from smart HTTP serversLibravatar Shawn Pearce1-1/+1
Before parsing a suspected smart-HTTP response verify the returned Content-Type matches the standard. This protects a client from attempting to process a payload that smells like a smart-HTTP server response. JGit has been doing this check on all responses since the dawn of time. I mistakenly failed to include it in git-core when smart HTTP was introduced. At the time I didn't know how to get the Content-Type from libcurl. I punted, meant to circle back and fix this, and just plain forgot about it. Signed-off-by: Shawn Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12http: do not set up curl auth after a 401Libravatar Jeff King1-2/+1
When we get an http 401, we prompt for credentials and put them in our global credential struct. We also feed them to the curl handle that produced the 401, with the intent that they will be used on a retry. When the code was originally introduced in commit 42653c0, this was a necessary step. However, since dfa1725, we always feed our global credential into every curl handle when we initialize the slot with get_active_slot. So every further request already feeds the credential to curl. Moreover, accessing the slot here is somewhat dubious. After the slot has produced a response, we don't actually control it any more. If we are using curl_multi, it may even have been re-initialized to handle a different request. It just so happens that we will reuse the curl handle within the slot in such a case, and that because we only keep one global credential, it will be the one we want. So the current code is not buggy, but it is misleading. By cleaning it up, we can remove the slot argument entirely from handle_curl_result, making it much more obvious that slots should not be accessed after they are marked as finished. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-12http: fix segfault in handle_curl_resultLibravatar Jeff King1-1/+2
When we create an http active_request_slot, we can set its "results" pointer back to local storage. The http code will fill in the details of how the request went, and we can access those details even after the slot has been cleaned up. Commit 8809703 (http: factor out http error code handling) switched us from accessing our local results struct directly to accessing it via the "results" pointer of the slot. That means we're accessing the slot after it has been marked as finished, defeating the whole purpose of keeping the results storage separate. Most of the time this doesn't matter, as finishing the slot does not actually clean up the pointer. However, when using curl's multi interface with the dumb-http revision walker, we might actually start a new request before handing control back to the original caller. In that case, we may reuse the slot, zeroing its results pointer, and leading the original caller to segfault while looking for its results inside the slot. Instead, we need to pass a pointer to our local results storage to the handle_curl_result function, rather than relying on the pointer in the slot struct. This matches what the original code did before the refactoring (which did not use a separate function, and therefore just accessed the results struct directly). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-27http: factor out http error code handlingLibravatar Jeff King1-0/+1
Most of our http requests go through the http_request() interface, which does some nice post-processing on the results. In particular, it handles prompting for missing credentials as well as approving and rejecting valid or invalid credentials. Unfortunately, it only handles GET requests. Making it handle POSTs would be quite complex, so let's pull result handling code into its own function so that it can be reused from the POST code paths. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-28correct spelling: an URL -> a URLLibravatar Jim Meyering1-1/+1
Signed-off-by: Jim Meyering <meyering@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-19Merge branch 'jk/maint-push-over-dav'Libravatar Junio C Hamano1-1/+2
* jk/maint-push-over-dav: http-push: enable "proactive auth" t5540: test DAV push with authentication Conflicts: http.c
2011-12-13http-push: enable "proactive auth"Libravatar Jeff King1-1/+2
Before commit 986bbc08, git was proactive about asking for http passwords. It assumed that if you had a username in your URL, you would also want a password, and asked for it before making any http requests. However, this could interfere with the use of .netrc (see 986bbc08 for details). And it was also unnecessary, since the http fetching code had learned to recognize an HTTP 401 and prompt the user then. Furthermore, the proactive prompt could interfere with the usage of .netrc (see 986bbc08 for details). Unfortunately, the http push-over-DAV code never learned to recognize HTTP 401, and so was broken by this change. This patch does a quick fix of re-enabling the "proactive auth" strategy only for http-push, leaving the dumb http fetch and smart-http as-is. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-05Merge branch 'mf/curl-select-fdset'Libravatar Junio C Hamano1-2/+0
* mf/curl-select-fdset: http: drop "local" member from request struct http.c: Rely on select instead of tracking whether data was received http.c: Use timeout suggested by curl instead of fixed 50ms timeout http.c: Use curl_multi_fdset to select on curl fds instead of just sleeping
2011-11-04http: drop "local" member from request structLibravatar Jeff King1-1/+0
This is a FILE pointer in the case that we are sending our output to a file. We originally used it to run ftell() to determine whether data had been written to our file during our last call to curl. However, as of the last patch, we no longer care about that flag anymore. All uses of this struct member are now just book-keeping that can go away. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-11-04http.c: Rely on select instead of tracking whether data was receivedLibravatar Mika Fischer1-1/+0
Since now select is used with the file descriptors of the http connections, tracking whether data was received recently (and trying to read more in that case) is no longer necessary. Instead, always call select and rely on it to return as soon as new data can be read. Signed-off-by: Mika Fischer <mika.fischer@zoopnet.de> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-15http_init: accept separate URL parameterLibravatar Jeff King1-1/+1
The http_init function takes a "struct remote". Part of its initialization procedure is to look at the remote's url and grab some auth-related parameters. However, using the url included in the remote is: - wrong; the remote-curl helper may have a separate, unrelated URL (e.g., from remote.*.pushurl). Looking at the remote's configured url is incorrect. - incomplete; http-fetch doesn't have a remote, so passes NULL. So http_init never gets to see the URL we are actually going to use. - cumbersome; http-push has a similar problem to http-fetch, but actually builds a fake remote just to pass in the URL. Instead, let's just add a separate URL parameter to http_init, and all three callsites can pass in the appropriate information. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-19Merge branch 'jc/zlib-wrap'Libravatar Junio C Hamano1-1/+1
* jc/zlib-wrap: zlib: allow feeding more than 4GB in one go zlib: zlib can only process 4GB at a time zlib: wrap deflateBound() too zlib: wrap deflate side of the API zlib: wrap inflateInit2 used to accept only for gzip format zlib: wrap remaining calls to direct inflate/inflateEnd zlib wrapper: refactor error message formatter Conflicts: sha1_file.c
2011-06-10zlib: zlib can only process 4GB at a timeLibravatar Junio C Hamano1-1/+1
The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-04http: make curl callbacks match contracts from curl headerLibravatar Dan McGee1-3/+3
Yes, these don't match perfectly with the void* first parameter of the fread/fwrite in the standard library, but they do match the curl expected method signature. This is needed when a refactor passes a curl_write_callback around, which would otherwise give incorrect parameter warnings. Signed-off-by: Dan McGee <dpmcgee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16standardize brace placement in struct definitionsLibravatar Jonathan Nieder1-10/+5
In a struct definitions, unlike functions, the prevailing style is for the opening brace to go on the same line as the struct name, like so: struct foo { int bar; char *baz; }; Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many matches as 'struct [a-z_]*$'. Linus sayeth: Heretic people all over the world have claimed that this inconsistency is ... well ... inconsistent, but all right-thinking people know that (a) K&R are _right_ and (b) K&R are right. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-26shift end_url_with_slash() from http.[ch] to url.[ch]Libravatar Tay Ray Chuan1-1/+1
This allows non-http/curl users to access it too (eg. http-backend.c). Update include headers in end_url_with_slash() users too. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-12Standardize do { ... } while (0) styleLibravatar Jonathan Nieder1-2/+2
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-21Merge branch 'sp/maint-dumb-http-pack-reidx'Libravatar Junio C Hamano1-2/+0
* sp/maint-dumb-http-pack-reidx: http.c::new_http_pack_request: do away with the temp variable filename http-fetch: Use temporary files for pack-*.idx until verified http-fetch: Use index-pack rather than verify-pack to check packs Allow parse_pack_index on temporary files Extract verify_pack_index for reuse from verify_pack Introduce close_pack_index to permit replacement http.c: Remove unnecessary strdup of sha1_to_hex result http.c: Don't store destination name in request structures http.c: Drop useless != NULL test in finish_http_pack_request http.c: Tiny refactoring of finish_http_pack_request t5550-http-fetch: Use subshell for repository operations http.c: Remove bad free of static block
2010-05-08Merge branch 'rc/maint-curl-helper'Libravatar Junio C Hamano1-0/+1
* rc/maint-curl-helper: remote-curl: ensure that URLs have a trailing slash http: make end_url_with_slash() public t5541-http-push: add test for URLs with trailing slash Conflicts: remote-curl.c
2010-04-17http.c: Don't store destination name in request structuresLibravatar Shawn O. Pearce1-2/+0
The destination name within the object store is easily computed on demand, reusing a static buffer held by sha1_file.c. We don't need to copy the entire path into the request structure for safe keeping, when it can be easily reformatted after the download has been completed. This reduces the size of the per-request structure, and removes yet another PATH_MAX based limit. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-09http: make end_url_with_slash() publicLibravatar Tay Ray Chuan1-0/+1
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-01Prompt for a username when an HTTP request 401sLibravatar Scott Chacon1-0/+2
When an HTTP request returns a 401, Git will currently fail with a confusing message saying that it got a 401, which is not very descriptive. Currently if a user wants to use Git over HTTP, they have to use one URL with the username in the URL (e.g. "http://user@host.com/repo.git") for write access and another without the username for unauthenticated read access (unless they want to be prompted for the password each time). However, since the HTTP servers will return a 401 if an action requires authentication, we can prompt for username and password if we see this, allowing us to use a single URL for both purposes. This patch changes http_request to prompt for the username and password, then return HTTP_REAUTH so http_get_strbuf can try again. If it gets a 401 even when a user/pass is supplied, http_request will now return HTTP_NOAUTH which remote_curl can then use to display a more intelligent error message that is less confusing. Signed-off-by: Scott Chacon <schacon@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-12http.c: mark file-local functions staticLibravatar Junio C Hamano1-9/+0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-04Smart push over HTTP: client sideLibravatar Shawn O. Pearce1-0/+2
The git-remote-curl backend detects if the remote server supports the git-receive-pack service, and if so, runs git-send-pack in a pipe to dump the command and pack data as a single POST request. The advertisements from the server that were obtained during the discovery are passed into git-send-pack before the POST request starts. This permits git-send-pack to operate largely unmodified. For smaller packs (those under 1 MiB) a HTTP/1.0 POST with a Content-Length is used, permitting interaction with any server. The 1 MiB limit is arbitrary, but is sufficent to fit most deltas created by human authors against text sources with the occasional small binary file (e.g. few KiB icon image). The configuration option http.postBuffer can be used to increase (or shink) this buffer if the default is not sufficient. For larger packs which cannot be spooled entirely into the helper's memory space (due to http.postBuffer being too small), the POST request requires HTTP/1.1 and sets "Transfer-Encoding: chunked". This permits the client to upload an unknown amount of data in one HTTP transaction without needing to pregenerate the entire pack file locally. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> CC: Daniel Barkalow <barkalow@iabervon.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06http*: add helper methods for fetching objects (loose)Libravatar Tay Ray Chuan1-4/+33
The code handling the fetching of loose objects in http-push.c and http-walker.c have been refactored into new methods and a new struct (object_http_request) in http.c. They are not meant to be invoked elsewhere. The new methods in http.c are - new_http_object_request - process_http_object_request - finish_http_object_request - abort_http_object_request - release_http_object_request and the new struct is http_object_request. RANGER_HEADER_SIZE and no_pragma_header is no longer made available outside of http.c, since after the above changes, there are no other instances of usage outside of http.c. Remove members of the transfer_request struct in http-push.c and http-walker.c, including filename, real_sha1 and zret, as they are used no longer used. Move the methods append_remote_object_url() and get_remote_object_url() from http-push.c to http.c. Additionally, get_remote_object_url() is no longer defined only when USE_CURL_MULTI is defined, since non-USE_CURL_MULTI code in http.c uses it (namely, in new_http_object_request()). Refactor code from http-push.c::start_fetch_loose() and http-walker.c::start_object_fetch_request() that deals with the details of coming up with the filename to store the retrieved object, resuming a previously aborted request, and making a new curl request, into a new function, new_http_object_request(). Refactor code from http-walker.c::process_object_request() into the function, process_http_object_request(). Refactor code from http-push.c::finish_request() and http-walker.c::finish_object_request() into a new function, finish_http_object_request(). It returns the result of the move_temp_to_file() invocation. Add a function, release_http_object_request(), which cleans up object request data. http-push.c and http-walker.c invoke this function separately; http-push.c::release_request() and http-walker.c::release_object_request() do not invoke this function. Add a function, abort_http_object_request(), which unlink()s the object file and invokes release_http_object_request(). Update http-walker.c::abort_object_request() to use this. Signed-off-by: Tay Ray Chuan <rctay89@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>