summaryrefslogtreecommitdiff
path: root/http.c
AgeCommit message (Collapse)AuthorFilesLines
2018-03-26sha1_file: add repository argument to sha1_file_nameLibravatar Stefan Beller1-3/+2
Add a repository argument to allow sha1_file_name callers to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. While at it, move the declaration to object-store.h, where it should be easier to find. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26object-store: move packed_git and packed_git_mru to object storeLibravatar Stefan Beller1-0/+1
In a process with multiple repositories open, packfile accessors should be associated to a single repository and not shared globally. Move packed_git and packed_git_mru into the_repository and adjust callers to reflect this. [nd: while at there, wrap access to these two fields in get_packed_git() and get_packed_git_mru(). This allows us to lazily initialize these fields without caller doing that explicitly] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-13Merge branch 'jt/http-redact-cookies'Libravatar Junio C Hamano1-8/+74
The http tracing code, often used to debug connection issues, learned to redact potentially sensitive information from its output so that it can be more safely sharable. * jt/http-redact-cookies: http: support omitting data from traces http: support cookie redaction when tracing
2018-02-13Merge branch 'cc/sha1-file-name'Libravatar Junio C Hamano1-6/+10
Code clean-up. * cc/sha1-file-name: sha1_file: improve sha1_file_name() perfs sha1_file: remove static strbuf from sha1_file_name()
2018-01-19http: support omitting data from tracesLibravatar Jonathan Tan1-8/+19
GIT_TRACE_CURL provides a way to debug what is being sent and received over HTTP, with automatic redaction of sensitive information. But it also logs data transmissions, which significantly increases the log file size, sometimes unnecessarily. Add an option "GIT_TRACE_CURL_NO_DATA" to allow the user to omit such data transmissions. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-19http: support cookie redaction when tracingLibravatar Jonathan Tan1-0/+55
When using GIT_TRACE_CURL, Git already redacts the "Authorization:" and "Proxy-Authorization:" HTTP headers. Extend this redaction to a user-specified list of cookies, specified through the "GIT_REDACT_COOKIES" environment variable. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-17sha1_file: remove static strbuf from sha1_file_name()Libravatar Christian Couder1-6/+10
Using a static buffer in sha1_file_name() is error prone and the performance improvements it gives are not needed in many of the callers. So let's get rid of this static buffer and, if necessary or helpful, let's use one in the caller. Suggested-by: Jeff Hostetler <git@jeffhostetler.com> Helped-by: Kevin Daudt <me@ikke.info> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-05Merge branch 'rs/use-argv-array-in-child-process'Libravatar Junio C Hamano1-8/+3
Code cleanup. * rs/use-argv-array-in-child-process: send-pack: use internal argv_array of struct child_process http: use internal argv_array of struct child_process
2017-12-28Merge branch 'ws/curl-http-proxy-over-https'Libravatar Junio C Hamano1-0/+5
Git has been taught to support an https:// URL used for http.proxy when using recent versions of libcurl. * ws/curl-http-proxy-over-https: http: support CURLPROXY_HTTPS
2017-12-22http: use internal argv_array of struct child_processLibravatar René Scharfe1-8/+3
Avoid a strangely magic array size (it's slightly too big) and explicit index numbers by building the command line for index-pack using the embedded argv_array of the child_process. Add the flag -o and its argument with argv_array_pushl() to make it obvious that they belong together. The resulting code is shorter and easier to extend. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-19http: support CURLPROXY_HTTPSLibravatar Wei Shuyu1-0/+5
HTTP proxy over SSL is supported by curl since 7.52.0. This is very useful for networks with protocol whitelist. Signed-off-by: Wei Shuyu <wsy@dogben.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-06Merge branch 'bw/protocol-v1'Libravatar Junio C Hamano1-0/+18
A new mechanism to upgrade the wire protocol in place is proposed and demonstrated that it works with the older versions of Git without harming them. * bw/protocol-v1: Documentation: document Extra Parameters ssh: introduce a 'simple' ssh variant i5700: add interop test for protocol transition http: tell server that the client understands v1 connect: tell server that the client understands v1 connect: teach client to recognize v1 server response upload-pack, receive-pack: introduce protocol version 1 daemon: recognize hidden request arguments protocol: introduce protocol extension mechanisms pkt-line: add packet_write function connect: in ref advertisement, shallows are last
2017-10-17http: tell server that the client understands v1Libravatar Brandon Williams1-0/+18
Tell a server that protocol v1 can be used by sending the http header 'Git-Protocol' with 'version=1' indicating this. Also teach the apache http server to pass through the 'Git-Protocol' header as an environment variable 'GIT_PROTOCOL'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-22curl_trace(): eliminate switch fallthroughLibravatar Jeff King1-3/+4
Our trace handler is called by curl with a curl_infotype variable to interpret its data field. For most types we print the data and then break out of the switch. But for CURLINFO_TEXT, we print data and then fall through to the "default" case, which does the exact same thing (nothing!) that breaking out of the switch would. This is probably a leftover from an early iteration of the patch where the code after the switch _did_ do something interesting that was unique to the non-text case arms. But in its current form, this fallthrough is merely confusing (and causes gcc's -Wimplicit-fallthrough to complain). Let's make CURLINFO_TEXT like the other case arms, and push the default arm to the end where it's more obviously a catch-all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-26Merge branch 'jt/packmigrate'Libravatar Junio C Hamano1-0/+1
Code movement to make it easier to hack later. * jt/packmigrate: (23 commits) pack: move for_each_packed_object() pack: move has_pack_index() pack: move has_sha1_pack() pack: move find_pack_entry() and make it global pack: move find_sha1_pack() pack: move find_pack_entry_one(), is_pack_valid() pack: move check_pack_index_ptr(), nth_packed_object_offset() pack: move nth_packed_object_{sha1,oid} pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() pack: move unpack_object_header() pack: move get_size_from_delta() pack: move unpack_object_header_buffer() pack: move {,re}prepare_packed_git and approximate_object_count pack: move install_packed_git() pack: move add_packed_git() pack: move unuse_pack() pack: move use_pack() pack: move pack-closing functions pack: move release_pack_memory() pack: move open_pack_index(), parse_pack_index() ...
2017-08-24Merge branch 'tc/curl-with-backports'Libravatar Junio C Hamano1-4/+6
Updates to the HTTP layer we made recently unconditionally used features of libCurl without checking the existence of them, causing compilation errors, which has been fixed. Also migrate the code to check feature macros, not version numbers, to cope better with libCurl that vendor ships with backported features. * tc/curl-with-backports: http: use a feature check to enable GSSAPI delegation control http: fix handling of missing CURLPROTO_*
2017-08-23pack: move pack name-related functionsLibravatar Jonathan Tan1-0/+1
Currently, sha1_file.c and cache.h contain many functions, both related to and unrelated to packfiles. This makes both files very large and causes an unclear separation of concerns. Create a new file, packfile.c, to hold all packfile-related functions currently in sha1_file.c. It has a corresponding header packfile.h. In this commit, the pack name-related functions are moved. Subsequent commits will move the other functions. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11http: use a feature check to enable GSSAPI delegation controlLibravatar Tom G. Christensen1-3/+3
Turn the version check into a feature check to ensure this functionality is also enabled with vendor supported curl versions where the feature may have been backported. Signed-off-by: Tom G. Christensen <tgc@jupiterrise.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11http: fix handling of missing CURLPROTO_*Libravatar Tom G. Christensen1-1/+3
Commit aeae4db1 refactored the handling of the curl protocol restriction support into a function but failed to add a version check for older versions of curl that lack CURLPROTO_* support. Add the missing check and at the same time convert it to a feature check instead of a version based check. This is done to ensure that vendor supported curl versions that have had CURLPROTO_* support backported are handled correctly. Signed-off-by: Tom G. Christensen <tgc@jupiterrise.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11Merge branch 'jc/http-sslkey-and-ssl-cert-are-paths'Libravatar Junio C Hamano1-2/+2
The http.{sslkey,sslCert} configuration variables are to be interpreted as a pathname that honors "~[username]/" prefix, but weren't, which has been fixed. * jc/http-sslkey-and-ssl-cert-are-paths: http.c: http.sslcert and http.sslkey are both pathnames
2017-07-20http.c: http.sslcert and http.sslkey are both pathnamesLibravatar Junio C Hamano1-2/+2
Back when the modern http_options() codepath was created to parse various http.* options at 29508e1e ("Isolate shared HTTP request functionality", 2005-11-18), and then later was corrected for interation between the multiple configuration files in 7059cd99 ("http_init(): Fix config file parsing", 2009-03-09), we parsed configuration variables like http.sslkey, http.sslcert as plain vanilla strings, because git_config_pathname() that understands "~[username]/" prefix did not exist. Later, we converted some of them (namely, http.sslCAPath and http.sslCAInfo) to use the function, and added variables like http.cookeyFile http.pinnedpubkey to use the function from the beginning. Because of that, these variables all understand "~[username]/" prefix. Make the remaining two variables, http.sslcert and http.sslkey, also aware of the convention, as they are both clearly pathnames to files. Noticed-by: Victor Toni <victor.toni@gmail.com> Helped-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'ab/free-and-null'Libravatar Junio C Hamano1-10/+5
A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-16coccinelle: make use of the "type" FREE_AND_NULL() ruleLibravatar Ævar Arnfjörð Bjarmason1-10/+5
Apply the result of the just-added coccinelle rule. This manually excludes a few occurrences, mostly things that resulted in many FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15config: don't include config.h by defaultLibravatar Brandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23Merge branch 'dt/http-postbuffer-can-be-large'Libravatar Junio C Hamano1-2/+4
Allow the http.postbuffer configuration variable to be set to a size that can be expressed in size_t, which can be larger than ulong on some platforms. * dt/http-postbuffer-can-be-large: http.postbuffer: allow full range of ssize_t values
2017-04-23Merge branch 'sr/http-proxy-configuration-fix'Libravatar Junio C Hamano1-2/+11
"http.proxy" set to an empty string is used to disable the usage of proxy. We broke this early last year. * sr/http-proxy-configuration-fix: http: fix the silent ignoring of proxy misconfiguraion http: honor empty http.proxy option to bypass proxy
2017-04-13http.postbuffer: allow full range of ssize_t valuesLibravatar David Turner1-2/+4
Unfortunately, in order to push some large repos where a server does not support chunked encoding, the http postbuffer must sometimes exceed two gigabytes. On a 64-bit system, this is OK: we just malloc a larger buffer. This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the buffer size. Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13http: fix the silent ignoring of proxy misconfiguraionLibravatar Sergey Ryazanov1-0/+3
Earlier, the whole http.proxy option string was passed to curl without any preprocessing so curl could complain about the invalid proxy configuration. After the commit 372370f167 ("http: use credential API to handle proxy authentication", 2016-01-26), if the user specified an invalid HTTP proxy option in the configuration, then the option parsing silently fails and NULL will be passed to curl as a proxy. This forces curl to fall back to detecting the proxy configuration from the environment, causing the http.proxy option ignoring. Fix this issue by checking the proxy option parsing result. If parsing failed then print an error message and die. Such behaviour allows the user to quickly figure the proxy misconfiguration and correct it. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13http: honor empty http.proxy option to bypass proxyLibravatar Sergey Ryazanov1-2/+8
Curl distinguishes between an empty proxy address and a NULL proxy address. In the first case it completely disables proxy usage, but if the proxy address option is NULL then curl attempts to determine the proxy address from the http_proxy environment variable. According to the documentation, if the http.proxy option is set to an empty string, git should bypass proxy and connect to the server directly: export http_proxy=http://network-proxy/ cd ~/foobar-project git config remote.origin.proxy "" git fetch Previously, proxy host was configured by one line: curl_easy_setopt(result, CURLOPT_PROXY, curl_http_proxy); Commit 372370f167 ("http: use credential API to handle proxy authentication", 2016-01-26) parses the proxy option, then extracts the proxy host address and updates the curl configuration, making the previous call a noop: credential_from_url(&proxy_auth, curl_http_proxy); curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host); But if the proxy option is empty then the proxy host field becomes NULL. This forces curl to fall back to detecting the proxy configuration from the environment, causing the http.proxy option to not work anymore. Fix this issue by explicitly handling http.proxy being set the empty string. This also makes the code a bit more clear and should help us avoid such regressions in the future. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-30convert unchecked snprintf into xsnprintfLibravatar Jeff King1-5/+5
These calls to snprintf should always succeed, because their input is small and fixed. Let's use xsnprintf to make sure this is the case (and to make auditing for actual truncation easier). These could be candidates for turning into heap buffers, but they fall into a few broad categories that make it not worth doing: - formatting single numbers is simple enough that we can see the result should fit - the size of a sha1 is likewise well-known, and I didn't want to cause unnecessary conflicts with the ongoing process to convert these constants to GIT_MAX_HEXSZ - the interface for curl_errorstr is dictated by curl Signed-off-by: Jeff King <peff@peff.net>
2017-03-16Merge branch 'jt/http-base-url-update-upon-redirect' into maintLibravatar Junio C Hamano1-0/+3
When a redirected http transport gets an error during the redirected request, we ignored the error we got from the server, and ended up giving a not-so-useful error message. * jt/http-base-url-update-upon-redirect: http: attempt updating base URL only if no error
2017-03-16Merge branch 'jk/http-auth' into maintLibravatar Junio C Hamano1-4/+46
Reduce authentication round-trip over HTTP when the server supports just a single authentication method. * jk/http-auth: http: add an "auto" mode for http.emptyauth http: restrict auth methods to what the server advertises
2017-03-10Merge branch 'jt/http-base-url-update-upon-redirect'Libravatar Junio C Hamano1-0/+3
When a redirected http transport gets an error during the redirected request, we ignored the error we got from the server, and ended up giving a not-so-useful error message. * jt/http-base-url-update-upon-redirect: http: attempt updating base URL only if no error
2017-03-10Merge branch 'jk/http-auth'Libravatar Junio C Hamano1-4/+46
Reduce authentication round-trip over HTTP when the server supports just a single authentication method. * jk/http-auth: http: add an "auto" mode for http.emptyauth http: restrict auth methods to what the server advertises
2017-02-28http: attempt updating base URL only if no errorLibravatar Jonathan Tan1-0/+3
http.c supports HTTP redirects of the form http://foo/info/refs?service=git-upload-pack -> http://anything -> http://bar/info/refs?service=git-upload-pack (that is to say, as long as the Git part of the path and the query string is preserved in the final redirect destination, the intermediate steps can have any URL). However, if one of the intermediate steps results in an HTTP exception, a confusing "unable to update url base from redirection" message is printed instead of a Curl error message with the HTTP exception code. This was introduced by 2 commits. Commit c93c92f ("http: update base URLs when we see redirects", 2013-09-28) introduced a best-effort optimization that required checking if only the "base" part of the URL differed between the initial request and the final redirect destination, but it performed the check before any HTTP status checking was done. If something went wrong, the normal code path was still followed, so this did not cause any confusing error messages until commit 6628eb4 ("http: always update the base URL for redirects", 2016-12-06), which taught http to die if the non-"base" part of the URL differed. Therefore, teach http to check the HTTP status before attempting to check if only the "base" part of the URL differed. This commit teaches http_request_reauth to return early without updating options->base_url upon an error; the only invoker of this function that passes a non-NULL "options" is remote-curl.c (through "http_get_strbuf"), which only uses options->base_url for an informational message in the situations that this commit cares about (that is, when the return value is not HTTP_OK). The included test checks that the redirect scheme at the beginning of this commit message works, and that returning a 502 in the middle of the redirect scheme produces the correct result. Note that this is different from the test in commit 6628eb4 ("http: always update the base URL for redirects", 2016-12-06) in that this commit tests that a Git-shaped URL (http://.../info/refs?service=git-upload-pack) works, whereas commit 6628eb4 tests that a non-Git-shaped URL (http://.../info/refs/foo?service=git-upload-pack) does not work (even though Git is processing that URL) and is an error that is fatal, not silently swallowed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-27http: add an "auto" mode for http.emptyauthLibravatar Jeff King1-5/+45
This variable needs to be specified to make some types of non-basic authentication work, but ideally this would just work out of the box for everyone. However, simply setting it to "1" by default introduces an extra round-trip for cases where it _isn't_ useful. We end up sending a bogus empty credential that the server rejects. Instead, let's introduce an automatic mode, that works like this: 1. We won't try to send the bogus credential on the first request. We'll wait to get an HTTP 401, as usual. 2. After seeing an HTTP 401, the empty-auth hack will kick in only when we know there is an auth method available that might make use of it (i.e., something besides "Basic" or "Digest"). That should make it work out of the box, without incurring any extra round-trips for people hitting Basic-only servers. This _does_ incur an extra round-trip if you really want to use "Basic" but your server advertises other methods (the emptyauth hack will kick in but fail, and then Git will actually ask for a password). The auto mode may incur an extra round-trip over setting http.emptyauth=true, because part of the emptyauth hack is to feed this blank password to curl even before we've made a single request. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23http: restrict auth methods to what the server advertisesLibravatar Jeff King1-0/+2
By default, we tell curl to use CURLAUTH_ANY, which does not limit its set of auth methods. However, this results in an extra round-trip to the server when authentication is required. After we've fed the credential to curl, it wants to probe the server to find its list of available methods before sending an Authorization header. We can shortcut this by limiting our http_auth_methods by what the server told us it supports. In some cases (such as when the server only supports Basic), that lets curl skip the extra probe request. The end result should look the same to the user, but you can use GIT_TRACE_CURL to verify the sequence of requests: GIT_TRACE_CURL=1 \ git ls-remote https://example.com/repo.git \ 2>&1 >/dev/null | egrep '(Send|Recv) header: (GET|HTTP|Auth)' Before this patch, hitting a Basic-only server like github.com results in: Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1 Recv header: HTTP/1.1 401 Authorization Required Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1 Recv header: HTTP/1.1 401 Authorization Required Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1 Send header: Authorization: Basic <redacted> Recv header: HTTP/1.1 200 OK And after: Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1 Recv header: HTTP/1.1 401 Authorization Required Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1 Send header: Authorization: Basic <redacted> Recv header: HTTP/1.1 200 OK The possible downsides are: - This only helps for a Basic-only server; for a server with multiple auth options, curl may still send a probe request to see which ones are available (IOW, there's no way to say "don't probe, I already know what the server will say"). - The http_auth_methods variable is global, so this will apply to all further requests. That's acceptable for Git's usage of curl, though, which also treats the credentials as global. I.e., in any given program invocation we hit only one conceptual server (we may be redirected at the outset, but in that case that's whose auth_avail field we'd see). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-17Merge branch 'jk/http-walker-limit-redirect' into maintLibravatar Junio C Hamano1-13/+43
Update the error messages from the dumb-http client when it fails to obtain loose objects; we used to give sensible error message only upon 404 but we now forbid unexpected redirects that needs to be reported with something sensible. * jk/http-walker-limit-redirect: http-walker: complain about non-404 loose object errors http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect
2016-12-27Merge branch 'bw/transport-protocol-policy'Libravatar Junio C Hamano1-14/+21
Finer-grained control of what protocols are allowed for transports during clone/fetch/push have been enabled via a new configuration mechanism. * bw/transport-protocol-policy: http: respect protocol.*.allow=user for http-alternates transport: add from_user parameter to is_transport_allowed http: create function to get curl allowed protocols transport: add protocol policy config option http: always warn if libcurl version is too old lib-proto-disable: variable name fix
2016-12-19Merge branch 'jk/http-walker-limit-redirect'Libravatar Junio C Hamano1-1/+1
Update the error messages from the dumb-http client when it fails to obtain loose objects; we used to give sensible error message only upon 404 but we now forbid unexpected redirects that needs to be reported with something sensible. * jk/http-walker-limit-redirect: http-walker: complain about non-404 loose object errors
2016-12-19Merge branch 'jk/http-walker-limit-redirect-2.9'Libravatar Junio C Hamano1-12/+42
Transport with dumb http can be fooled into following foreign URLs that the end user does not intend to, especially with the server side redirects and http-alternates mechanism, which can lead to security issues. Tighten the redirection and make it more obvious to the end user when it happens. * jk/http-walker-limit-redirect-2.9: http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect
2016-12-15transport: add from_user parameter to is_transport_allowedLibravatar Brandon Williams1-7/+7
Add a from_user parameter to is_transport_allowed() to allow http to be able to distinguish between protocol restrictions for redirects versus initial requests. CURLOPT_REDIR_PROTOCOLS can now be set differently from CURLOPT_PROTOCOLS to disallow use of protocols with the "user" policy in redirects. This change allows callers to query if a transport protocol is allowed, given that the caller knows that the protocol is coming from the user (1) or not from the user (0) such as redirects in libcurl. If unknown a -1 should be provided which falls back to reading `GIT_PROTOCOL_FROM_USER` to determine if the protocol came from the user. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15http: create function to get curl allowed protocolsLibravatar Brandon Williams1-11/+20
Move the creation of an allowed protocols whitelist to a helper function. This will be useful when we need to compute the set of allowed protocols differently for normal and redirect cases. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15http: always warn if libcurl version is too oldLibravatar Brandon Williams1-3/+2
Always warn if libcurl version is too old because: 1. Even without a protocol whitelist, newer versions of curl have all non-standard protocols disabled by default. 2. A future patch will introduce default "known-good" and "known-bad" protocols which are allowed/disallowed by 'is_transport_allowed' which older version of libcurl can't respect. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06http-walker: complain about non-404 loose object errorsLibravatar Jeff King1-1/+1
Since commit 17966c0a6 (http: avoid disconnecting on 404s for loose objects, 2016-07-11), we turn off curl's FAILONERROR option and instead manually deal with failing HTTP codes. However, the logic to do so only recognizes HTTP 404 as a failure. This is probably the most common result, but if we were to get another code, the curl result remains CURLE_OK, and we treat it as success. We still end up detecting the failure when we try to zlib-inflate the object (which will fail), but instead of reporting the HTTP error, we just claim that the object is corrupt. Instead, let's catch anything in the 300's or above as an error (300's are redirects which are not an error at the HTTP level, but are an indication that we've explicitly disabled redirects, so we should treat them as such; we certainly don't have the resulting object content). Note that we also fill in req->errorstr, which we didn't do before. Without FAILONERROR, curl will not have filled this in, and it will remain a blank string. This never mattered for the 404 case, because in the logic below we hit the "missing_target()" branch and print nothing. But for other errors, we'd want to say _something_, if only to fill in the blank slot in the error message. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06Merge branch 'ew/http-walker' into jk/http-walker-limit-redirectLibravatar Junio C Hamano1-2/+14
* ew/http-walker: list: avoid incompatibility with *BSD sys/queue.h http-walker: reduce O(n) ops with doubly-linked list http: avoid disconnecting on 404s for loose objects http-walker: remove unused parameter from fetch_object
2016-12-06http: treat http-alternates like redirectsLibravatar Jeff King1-0/+1
The previous commit made HTTP redirects more obvious and tightened up the default behavior. However, there's another way for a server to ask a git client to fetch arbitrary content: by having an http-alternates file (or a regular alternates file, which is used as a backup). Similar to the HTTP redirect case, a malicious server can claim to have refs pointing at object X, return a 404 when the client asks for X, but point to some other URL via http-alternates, which the client will transparently fetch. The end result is that it looks from the user's perspective like the objects came from the malicious server, as the other URL is not mentioned at all. Worse, because we feed the new URL to curl ourselves, the usual protocol restrictions do not kick in (neither curl's default of disallowing file://, nor the protocol whitelisting in f4113cac0 (http: limit redirection to protocol-whitelist, 2015-09-22). Let's apply the same rules here as we do for HTTP redirects. Namely: - unless http.followRedirects is set to "always", we will not follow remote redirects from http-alternates (or alternates) at all - set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS restrict ourselves to a known-safe set and respect any user-provided whitelist. - mention alternate object stores on stderr so that the user is aware another source of objects may be involved The first item may prove to be too restrictive. The most common use of alternates is to point to another path on the same server. While it's possible for a single-server redirect to be an attack, it takes a fairly obscure setup (victim and evil repository on the same host, host speaks dumb http, and evil repository has access to edit its own http-alternates file). So we could make the checks more specific, and only cover cross-server redirects. But that means parsing the URLs ourselves, rather than letting curl handle them. This patch goes for the simpler approach. Given that they are only used with dumb http, http-alternates are probably pretty rare. And there's an escape hatch: the user can allow redirects on a specific server by setting http.<url>.followRedirects to "always". Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06http: make redirects more obviousLibravatar Jeff King1-2/+29
We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since c93c92f30 (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06http: always update the base URL for redirectsLibravatar Jeff King1-4/+8
If a malicious server redirects the initial ref advertisement, it may be able to leak sha1s from other, unrelated servers that the client has access to. For example, imagine that Alice is a git user, she has access to a private repository on a server hosted by Bob, and Mallory runs a malicious server and wants to find out about Bob's private repository. Mallory asks Alice to clone an unrelated repository from her over HTTP. When Alice's client contacts Mallory's server for the initial ref advertisement, the server issues an HTTP redirect for Bob's server. Alice contacts Bob's server and gets the ref advertisement for the private repository. If there is anything to fetch, she then follows up by asking the server for one or more sha1 objects. But who is the server? If it is still Mallory's server, then Alice will leak the existence of those sha1s to her. Since commit c93c92f30 (http: update base URLs when we see redirects, 2013-09-28), the client usually rewrites the base URL such that all further requests will go to Bob's server. But this is done by textually matching the URL. If we were originally looking for "http://mallory/repo.git/info/refs", and we got pointed at "http://bob/other.git/info/refs", then we know that the right root is "http://bob/other.git". If the redirect appears to change more than just the root, we punt and continue to use the original server. E.g., imagine the redirect adds a URL component that Bob's server will ignore, like "http://bob/other.git/info/refs?dummy=1". We can solve this by aborting in this case rather than silently continuing to use Mallory's server. In addition to protecting from sha1 leakage, it's arguably safer and more sane to refuse a confusing redirect like that in general. For example, part of the motivation in c93c92f30 is avoiding accidentally sending credentials over clear http, just to get a response that says "try again over https". So even in a non-malicious case, we'd prefer to err on the side of caution. The downside is that it's possible this will break a legitimate but complicated server-side redirection scheme. The setup given in the newly added test does work, but it's convoluted enough that we don't need to care about it. A more plausible case would be a server which redirects a request for "info/refs?service=git-upload-pack" to just "info/refs" (because it does not do smart HTTP, and for some reason really dislikes query parameters). Right now we would transparently downgrade to dumb-http, but with this patch, we'd complain (and the user would have to set GIT_SMART_HTTP=0 to fetch). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06http: simplify update_url_from_redirectLibravatar Jeff King1-6/+4
This function looks for a common tail between what we asked for and where we were redirected to, but it open-codes the comparison. We can avoid some confusing subtractions by using strip_suffix_mem(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>