From 69897bc2b8b49c09190cce065c027612b21c2d97 Mon Sep 17 00:00:00 2001
From: Jeff King
Date: Fri, 28 Feb 2014 05:01:29 -0500
Subject: docs: clarify remote restrictions for git-upload-archive

Commits ee27ca4 and 0f544ee introduced rules by which
git-upload-archive would restrict clients from accessing
unreachable objects. However, we never documented those rules
anywhere, nor their reason for being. Let's do so now.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 Documentation/git-archive.txt        |  5 ++++-
 Documentation/git-upload-archive.txt | 26 ++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index b97aaab4ed..cfa1e4ebe4 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -65,7 +65,10 @@ OPTIONS
 
 --remote=::
 	Instead of making a tar archive from the local repository,
-	retrieve a tar archive from a remote repository.
+	retrieve a tar archive from a remote repository. Note that the
+	remote repository may place restrictions on which sha1
+	expressions may be allowed in ``. See
+	linkgit:git-upload-archive[1] for details.
 
 --exec=::
 	Used with --remote to specify the path to the
diff --git a/Documentation/git-upload-archive.txt b/Documentation/git-upload-archive.txt
index d09bbb52b1..8ae65d80c4 100644
--- a/Documentation/git-upload-archive.txt
+++ b/Documentation/git-upload-archive.txt
@@ -20,6 +20,32 @@ This command is usually not invoked directly by the end user. The UI
 for the protocol is on the 'git archive' side, and the program pair
 is meant to be used to get an archive from a remote repository.
 
+SECURITY
+--------
+
+In order to protect the privacy of objects that have been removed from
+history but may not yet have been pruned, `git-upload-archive` avoids
+serving archives for commits and trees that are not reachable from the
+repository's refs. However, because calculating object reachability is
+computationally expensive, `git-upload-archive` implements a stricter
+but easier-to-check set of rules:
+
+  1. Clients may request a commit or tree that is pointed to directly by
+     a ref. E.g., `git archive --remote=origin v1.0`.
+
+  2. Clients may request a sub-tree within a commit or tree using the
+     `ref:path` syntax. E.g., `git archive --remote=origin v1.0:Documentation`.
+
+  3. Clients may _not_ use other sha1 expressions, even if the end
+     result is reachable. E.g., neither a relative commit like `master^`
+     nor a literal sha1 like `abcd1234` is allowed, even if the result
+     is reachable from the refs.
+
+Note that rule 3 disallows many cases that do not have any privacy
+implications. These rules are subject to change in future versions of
+git, and the server accessed by `git archive --remote` may or may not
+follow these exact rules.
+
 OPTIONS
 -------
 ::
-- 
cgit v1.2.3

From 7671b63211712e5163ed46d4c93d0b75680c886c Mon Sep 17 00:00:00 2001
From: "Scott J. Goldman"
Date: Fri, 28 Feb 2014 05:04:19 -0500
Subject: add uploadarchive.allowUnreachable option

In commit ee27ca4, we started restricting remote git-archive
invocations to only accessing reachable commits. This
matches what upload-pack allows, but does restrict some
useful cases (e.g., HEAD:foo). We loosened this in 0f544ee,
which allows `foo:bar` as long as `foo` is a ref tip.
However, that still doesn't allow many useful things, like:

  1. Commits accessible from a ref, like `foo^:bar`, which
     are reachable

  2. Arbitrary sha1s, even if they are reachable.

We can do a full object-reachability check for these cases,
but it can be quite expensive if the client has sent us the
sha1 of a tree; we have to visit every sub-tree of every
commit in the worst case.

Let's instead give site admins an escape hatch, in case they
prefer the more liberal behavior. For many sites, the full
object database is public anyway (e.g., if you allow dumb
walker access), or the site admin may simply decide the
security/convenience tradeoff is not worth it.

This patch adds a new config option to disable the
restrictions added in ee27ca4. It defaults to off, meaning
there is no change in behavior by default.

Signed-off-by: Jeff King
Signed-off-by: Junio C Hamano
---
 Documentation/config.txt             |  7 +++++++
 Documentation/git-upload-archive.txt |  6 ++++++
 archive.c                            | 13 +++++++++++--
 t/t5000-tar-tree.sh                  |  9 +++++++++
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 5f4d7939ed..64b69eeb6b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2291,6 +2291,13 @@ transfer.unpackLimit::
 	not set, the value of this variable is used instead.
 	The default value is 100.
 
+uploadarchive.allowUnreachable::
+	If true, allow clients to use `git archive --remote` to request
+	any tree, whether reachable from the ref tips or not. See the
+	discussion in the `SECURITY` section of
+	linkgit:git-upload-archive[1] for more details. Defaults to
+	`false`.
+
 uploadpack.hiderefs::
 	String(s) `upload-pack` uses to decide which refs to omit from its initial
 	advertisement. Use more than one
diff --git a/Documentation/git-upload-archive.txt b/Documentation/git-upload-archive.txt
index 8ae65d80c4..cbef61ba88 100644
--- a/Documentation/git-upload-archive.txt
+++ b/Documentation/git-upload-archive.txt
@@ -46,6 +46,12 @@ implications. These rules are subject to change in future versions of
 git, and the server accessed by `git archive --remote` may or may not
 follow these exact rules.
 
+If the config option `uploadArchive.allowUnreachable` is true, these
+rules are ignored, and clients may use arbitrary sha1 expressions.
+This is useful if you do not care about the privacy of unreachable
+objects, or if your object database is already publicly available for
+access via non-smart-http.
+
 OPTIONS
 -------
 ::
diff --git a/archive.c b/archive.c
index 346f3b2f1a..7d0976fe55 100644
--- a/archive.c
+++ b/archive.c
@@ -17,6 +17,7 @@ static char const * const archive_usage[] = {
 static const struct archiver **archivers;
 static int nr_archivers;
 static int alloc_archivers;
+static int remote_allow_unreachable;
 
 void register_archiver(struct archiver *ar)
 {
@@ -257,7 +258,7 @@ static void parse_treeish_arg(const char **argv,
 	unsigned char sha1[20];
 
 	/* Remotes are only allowed to fetch actual refs */
-	if (remote) {
+	if (remote && !remote_allow_unreachable) {
 		char *ref = NULL;
 		const char *colon = strchr(name, ':');
 		int refnamelen = colon ? colon - name : strlen(name);
@@ -401,6 +402,14 @@ static int parse_archive_args(int argc, const char **argv,
 	return argc;
 }
 
+static int git_default_archive_config(const char *var, const char *value,
+				      void *cb)
+{
+	if (!strcmp(var, "uploadarchive.allowunreachable"))
+		remote_allow_unreachable = git_config_bool(var, value);
+	return git_default_config(var, value, cb);
+}
+
 int write_archive(int argc, const char **argv, const char *prefix,
 		  int setup_prefix, const char *name_hint, int remote)
 {
@@ -411,7 +420,7 @@ int write_archive(int argc, const char **argv, const char *prefix,
 	if (setup_prefix && prefix == NULL)
 		prefix = setup_git_directory_gently(&nongit);
 
-	git_config(git_default_config, NULL);
+	git_config(git_default_archive_config, NULL);
 	init_tar_archiver();
 	init_zip_archiver();
 
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 05f011d38e..1cf0a4e103 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -213,6 +213,15 @@ test_expect_success 'clients cannot access unreachable commits' '
 	test_must_fail git archive --remote=. $sha1 >remote.tar
 '
 
+test_expect_success 'upload-archive can allow unreachable commits' '
+	test_commit unreachable1 &&
+	sha1=`git rev-parse HEAD` &&
+	git reset --hard HEAD^ &&
+	git archive $sha1 >remote.tar &&
+	test_config uploadarchive.allowUnreachable true &&
+	git archive --remote=. $sha1 >remote.tar
+'
+
 test_expect_success 'setup tar filters' '
 	git config tar.tar.foo.command "tr ab ba" &&
 	git config tar.bar.command "tr ab ba" &&
-- 
cgit v1.2.3
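
For anyone who wants to try the new option outside the test suite, here is a rough sketch that mirrors the added t5000 test in a throwaway repository. It is not part of the series. The scratch directory, file name, commit messages, identity, and the variable names `default_result` and `relaxed_result` are illustrative assumptions. Only `uploadarchive.allowUnreachable` and the `git archive --remote=.` invocation come from the patches above. It requests a parent commit by its literal sha1, which rule 3 in the SECURITY section refuses even though the commit is reachable.

```shell
#!/bin/sh
# Sketch only: exercise uploadarchive.allowUnreachable in a scratch repo.
set -e

repo=$(mktemp -d)            # hypothetical scratch location
cd "$repo"
git init -q .
git config user.name "Example"               # identity for scratch commits only
git config user.email "example@example.com"

echo one >file && git add file && git commit -q -m first
echo two >file && git commit -q -a -m second
old=$(git rev-parse HEAD^)   # literal sha1 of a commit that is not a ref tip

# With the default (false), the serving side should refuse a literal sha1.
if git archive --remote=. "$old" >default.tar 2>/dev/null
then default_result=allowed
else default_result=refused
fi

# Opt in on the serving repository, then repeat the same request.
git config uploadarchive.allowUnreachable true
if git archive --remote=. "$old" >relaxed.tar 2>/dev/null
then relaxed_result=allowed
else relaxed_result=refused
fi

echo "default: $default_result"
echo "relaxed: $relaxed_result"
```

Because `--remote=.` makes the same repository act as both client and server, the config is set in the one place that matters here. On a real server the option would need to be set in the repository being served, not on the client.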