author | Jeff King <peff@peff.net> | 2021-10-12 17:12:26 -0400 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2021-10-12 18:29:25 -0700 |
commit | e4c497a1944f035b3e4e947d9518d947d42d40a1 (patch) | |
tree | 60f56689aa8b6c5b5ede2feb5c72851964706f25 /urlmatch.c | |
parent | af6d1d602a8f64164b266364339c4e936d5bbc33 (diff) |
urlmatch: add underscore to URL_HOST_CHARS
When parsing a URL to normalize it, we allow hostnames to contain only dot (".") or dash ("-"), plus brackets and colons for IPv6 literals. This matches the old URL standard in RFC 1738, which says:

    host           = hostname | hostnumber
    hostname       = *[ domainlabel "." ] toplabel
    domainlabel    = alphadigit | alphadigit *[ alphadigit | "-" ] alphadigit

But this was later updated by RFC 3986, which is more liberal:

    host        = IP-literal / IPv4address / reg-name
    reg-name    = *( unreserved / pct-encoded / sub-delims )
    unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

While names with underscore in them are not common and possibly violate some DNS rules, they do work in practice, and we will happily contact them over http://, git://, or ssh://. It seems odd to ignore them for purposes of URL matching, especially when the URL RFC seems to allow them.

There shouldn't be any downside here. It's not a syntactically significant character in a URL, so we won't be confused about parsing; we'd have simply rejected such a URL previously (the test here checks the url code directly, but the obvious user-visible effect would be failing to match credential.http://foo_bar.example.com.helper, or similar config in http.<url>.*).

Arguably we'd want to allow tilde ("~") here, too. There's likewise probably no downside, but I didn't add it simply because it seems like an even less likely character to appear in a hostname.

Reported-by: Alex Waite <alex@waite.eu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'urlmatch.c')
-rw-r--r-- | urlmatch.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/urlmatch.c b/urlmatch.c
index 33a2ccd306..03ad3f30a9 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -5,7 +5,7 @@
 #define URL_DIGIT "0123456789"
 #define URL_ALPHADIGIT URL_ALPHA URL_DIGIT
 #define URL_SCHEME_CHARS URL_ALPHADIGIT "+.-"
-#define URL_HOST_CHARS URL_ALPHADIGIT ".-[:]" /* IPv6 literals need [:] */
+#define URL_HOST_CHARS URL_ALPHADIGIT ".-_[:]" /* IPv6 literals need [:] */
 #define URL_UNSAFE_CHARS " <>\"%{}|\\^`" /* plus 0x00-0x1F,0x7F-0xFF */
 #define URL_GEN_RESERVED ":/?#[]@"
 #define URL_SUB_RESERVED "!$&'()*+,;="
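
To illustrate the effect of the one-character change, here is a minimal standalone sketch of a character-set check over a hostname. It is not the code in urlmatch.c: the helper names host_chars_ok() and check_examples() are hypothetical, the URL_ALPHA definition is assumed (it is not shown in the diff), and OLD_HOST_CHARS / NEW_HOST_CHARS are illustrative names for the before and after values of URL_HOST_CHARS. Using strspn() for the check is likewise an assumption about one reasonable way to apply such a set.

```c
#include <assert.h>
#include <string.h>

/* URL_ALPHA is assumed here; the diff only shows the macros below it. */
#define URL_ALPHA "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
#define URL_DIGIT "0123456789"
#define URL_ALPHADIGIT URL_ALPHA URL_DIGIT

/* Illustrative names for the value of URL_HOST_CHARS before and after the patch. */
#define OLD_HOST_CHARS URL_ALPHADIGIT ".-[:]"
#define NEW_HOST_CHARS URL_ALPHADIGIT ".-_[:]"

/*
 * Hypothetical helper: returns 1 if host is non-empty and consists
 * only of bytes from allowed, 0 otherwise.
 */
static int host_chars_ok(const char *host, const char *allowed)
{
	return *host && strspn(host, allowed) == strlen(host);
}

/* Hypothetical self-check exercising the old and new character sets. */
static void check_examples(void)
{
	/* An underscore is rejected by the old set and accepted by the new one. */
	assert(!host_chars_ok("foo_bar.example.com", OLD_HOST_CHARS));
	assert(host_chars_ok("foo_bar.example.com", NEW_HOST_CHARS));

	/* IPv6 literals pass under both sets thanks to "[:]". */
	assert(host_chars_ok("[::1]", OLD_HOST_CHARS));
	assert(host_chars_ok("[::1]", NEW_HOST_CHARS));

	/* Tilde was deliberately left out of the new set. */
	assert(!host_chars_ok("foo~bar.example.com", NEW_HOST_CHARS));
}
```

Calling check_examples() from a small main() exercises the cases the commit message describes: foo_bar.example.com only passes with the new set, and a hostname containing "~" is still rejected.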