From f94804c1f2626831c6bdf8cc269a571324e3f2f2 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 11:19:18 -0400 Subject: t9300: drop some useless uses of cat These waste a process, and make the line longer than it needs to be. Signed-off-by: Jeff King --- t/t9300-fast-import.sh | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index d47560b634..1d2a7516fd 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2125,12 +2125,12 @@ test_expect_success 'R: export-marks feature results in a marks file being creat EOF - cat input | git fast-import && + git fast-import output && + git fast-import 2>output Date: Thu, 29 Aug 2019 13:43:23 -0400 Subject: t9300: create marks files for double-import-marks test Our tests confirm that providing two "import-marks" options in a fast-import stream is an error. However, the invoked command would fail even without covering this case, because the marks files themselves do not actually exist. Let's create the files to make sure we fail for the right reason (we actually do, because the option parsing happens before we open anything, but this future-proofs our test). Signed-off-by: Jeff King --- t/t9300-fast-import.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 1d2a7516fd..c0d04ec3ee 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2107,6 +2107,8 @@ test_expect_success 'R: abort on receiving feature after data command' ' ' test_expect_success 'R: only one import-marks feature allowed per stream' ' + >git.marks && + >git2.marks && cat >input <<-EOF && feature import-marks=git.marks feature import-marks=git2.marks -- cgit v1.2.3 From 11e934d56e46875b24d8a047d44b45ff243f6715 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 11:25:45 -0400 Subject: fast-import: tighten parsing of boolean command line options We parse options like "--max-pack-size=" using skip_prefix(), which makes sense to get at the bytes after the "=". However, we also parse "--quiet" and "--stats" with skip_prefix(), which allows things like "--quiet-nonsense" to behave like "--quiet". This was a mistaken conversion in 0f6927c229 (fast-import: put option parsing code in separate functions, 2009-12-04). Let's tighten this to an exact match, which was the original intent. Signed-off-by: Jeff King --- fast-import.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fast-import.c b/fast-import.c index 32ac5323f6..1d0a06ccfd 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3312,9 +3312,9 @@ static int parse_one_option(const char *option) option_active_branches(option); } else if (skip_prefix(option, "export-pack-edges=", &option)) { option_export_pack_edges(option); - } else if (starts_with(option, "quiet")) { + } else if (!strcmp(option, "quiet")) { show_stats = 0; - } else if (starts_with(option, "stats")) { + } else if (!strcmp(option, "stats")) { show_stats = 1; } else { return 0; -- cgit v1.2.3 From e075dba3723875f478654068609f69b2a5af8566 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 13:07:04 -0400 Subject: fast-import: stop creating leading directories for import-marks When asked to import marks from "subdir/file.marks", we create the leading directory "subdir" if it doesn't exist. This makes no sense for importing marks, where we only ever open the path for reading. Most of the time this would be a noop, since if the marks file exists, then the leading directories exist, too. But if it doesn't (e.g., because --import-marks-if-exists was used), then we'd create the useless directory. This dates back to 580d5f83e7 (fast-import: always create marks_file directories, 2010-03-29). Even then it was useless, so it seems to have been added in error alongside the --export-marks case (which _is_ helpful). Signed-off-by: Jeff King --- fast-import.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fast-import.c b/fast-import.c index 1d0a06ccfd..a5b7685e84 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3228,7 +3228,6 @@ static void option_import_marks(const char *marks, } import_marks_file = make_fast_import_path(marks); - safe_create_leading_directories_const(import_marks_file); import_marks_file_from_stream = from_stream; import_marks_file_ignore_missing = ignore_missing; } -- cgit v1.2.3 From 019683025f1b14d7cb671312ab01f7330e9b33e7 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 13:33:48 -0400 Subject: fast-import: delay creating leading directories for export-marks When we parse the --export-marks option, we don't immediately open the file, but we do create any leading directories. This can be especially confusing when a command-line option overrides an in-stream one, in which case we'd create the leading directory for the in-stream file, even though we never actually write the file. Let's instead create the directories just before opening the file, which means we'll create only useful directories. Note that this could change the handling of relative paths if we chdir() in between, but we don't actually do so; the only permanent chdir is from setup_git_directory() which runs before either code path (potentially we should take the pre-setup dir into account to avoid surprising the user, but that's an orthogonal change). The test just adapts the existing "override" test to use paths with leading directories. This checks both that the correct directory is created (which worked before but was not tested), and that the overridden one is not (our new fix here). While we're here, let's also check the error result of safe_create_leading_directories(). We'd presumably notice any failure immediately after when we try to open the file itself, but we can give a more specific error message in this case. Signed-off-by: Jeff King --- fast-import.c | 7 ++++++- t/t9300-fast-import.sh | 13 +++++++++++-- 2 files changed, 17 insertions(+), 3 deletions(-) diff --git a/fast-import.c b/fast-import.c index a5b7685e84..d64609b083 100644 --- a/fast-import.c +++ b/fast-import.c @@ -1861,6 +1861,12 @@ static void dump_marks(void) if (!export_marks_file || (import_marks_file && !import_marks_file_done)) return; + if (safe_create_leading_directories_const(export_marks_file)) { + failure |= error_errno("unable to create leading directories of %s", + export_marks_file); + return; + } + if (hold_lock_file_for_update(&mark_lock, export_marks_file, 0) < 0) { failure |= error_errno("Unable to write marks file %s", export_marks_file); @@ -3268,7 +3274,6 @@ static void option_active_branches(const char *branches) static void option_export_marks(const char *marks) { export_marks_file = make_fast_import_path(marks); - safe_create_leading_directories_const(export_marks_file); } static void option_cat_blob_fd(const char *fd) diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index c0d04ec3ee..1ba20c1f1a 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2132,8 +2132,17 @@ test_expect_success 'R: export-marks feature results in a marks file being creat ' test_expect_success 'R: export-marks options can be overridden by commandline options' ' - git fast-import --export-marks=other.marks input <<-\EOF && + feature export-marks=feature-sub/git.marks + blob + mark :1 + data 3 + hi + + EOF + git fast-import --export-marks=cmdline-sub/other.marks Date: Thu, 29 Aug 2019 14:37:26 -0400 Subject: fast-import: disallow "feature export-marks" by default The fast-import stream command "feature export-marks=" lets the stream write marks to an arbitrary path. This may be surprising if you are running fast-import against an untrusted input (which otherwise cannot do anything except update Git objects and refs). Let's disallow the use of this feature by default, and provide a command-line option to re-enable it (you can always just use the command-line --export-marks as well, but the in-stream version provides an easy way for exporters to control the process). This is a backwards-incompatible change, since the default is flipping to the new, safer behavior. However, since the main users of the in-stream versions would be import/export-based remote helpers, and since we trust remote helpers already (which are already running arbitrary code), we'll pass the new option by default when reading a remote helper's stream. This should minimize the impact. Note that the implementation isn't totally simple, as we have to work around the fact that fast-import doesn't parse its command-line options until after it has read any "feature" lines from the stream. This is how it lets command-line options override in-stream. But in our case, it's important to parse the new --allow-unsafe-features first. There are three options for resolving this: 1. Do a separate "early" pass over the options. This is easy for us to do because there are no command-line options that allow the "unstuck" form (so there's no chance of us mistaking an argument for an option), though it does introduce a risk of incorrect parsing later (e.g,. if we convert to parse-options). 2. Move the option parsing phase back to the start of the program, but teach the stream-reading code never to override an existing value. This is tricky, because stream "feature" lines override each other (meaning we'd have to start tracking the source for every option). 3. Accept that we might parse a "feature export-marks" line that is forbidden, as long we don't _act_ on it until after we've parsed the command line options. This would, in fact, work with the current code, but only because the previous patch fixed the export-marks parser to avoid touching the filesystem. So while it works, it does carry risk of somebody getting it wrong in the future in a rather subtle and unsafe way. I've gone with option (1) here as simple, safe, and unlikely to cause regressions. This fixes CVE-2019-1348. Signed-off-by: Jeff King --- Documentation/git-fast-import.txt | 14 ++++++++++++++ fast-import.c | 25 +++++++++++++++++++++++++ t/t9300-fast-import.sh | 23 +++++++++++++++-------- transport-helper.c | 1 + 4 files changed, 55 insertions(+), 8 deletions(-) diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index 3d3d219e58..fbb3f914f2 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -50,6 +50,20 @@ OPTIONS memory used by fast-import during this run. Showing this output is currently the default, but can be disabled with --quiet. +--allow-unsafe-features:: + Many command-line options can be provided as part of the + fast-import stream itself by using the `feature` or `option` + commands. However, some of these options are unsafe (e.g., + allowing fast-import to access the filesystem outside of the + repository). These options are disabled by default, but can be + allowed by providing this option on the command line. This + currently impacts only the `feature export-marks` command. ++ + Only enable this option if you trust the program generating the + fast-import stream! This option is enabled automatically for + remote-helpers that use the `import` capability, as they are + already trusted to run their own code. + Options for Frontends ~~~~~~~~~~~~~~~~~~~~~ diff --git a/fast-import.c b/fast-import.c index d64609b083..967077ad0b 100644 --- a/fast-import.c +++ b/fast-import.c @@ -366,6 +366,7 @@ static uintmax_t next_mark; static struct strbuf new_data = STRBUF_INIT; static int seen_data_command; static int require_explicit_termination; +static int allow_unsafe_features; /* Signal handling */ static volatile sig_atomic_t checkpoint_requested; @@ -3320,6 +3321,8 @@ static int parse_one_option(const char *option) show_stats = 0; } else if (!strcmp(option, "stats")) { show_stats = 1; + } else if (!strcmp(option, "allow-unsafe-features")) { + ; /* already handled during early option parsing */ } else { return 0; } @@ -3327,6 +3330,13 @@ static int parse_one_option(const char *option) return 1; } +static void check_unsafe_feature(const char *feature, int from_stream) +{ + if (from_stream && !allow_unsafe_features) + die(_("feature '%s' forbidden in input without --allow-unsafe-features"), + feature); +} + static int parse_one_feature(const char *feature, int from_stream) { const char *arg; @@ -3338,6 +3348,7 @@ static int parse_one_feature(const char *feature, int from_stream) } else if (skip_prefix(feature, "import-marks-if-exists=", &arg)) { option_import_marks(arg, from_stream, 1); } else if (skip_prefix(feature, "export-marks=", &arg)) { + check_unsafe_feature(feature, from_stream); option_export_marks(arg); } else if (!strcmp(feature, "get-mark")) { ; /* Don't die - this feature is supported */ @@ -3464,6 +3475,20 @@ int cmd_main(int argc, const char **argv) avail_tree_table = xcalloc(avail_tree_table_sz, sizeof(struct avail_tree_content*)); marks = pool_calloc(1, sizeof(struct mark_set)); + /* + * We don't parse most options until after we've seen the set of + * "feature" lines at the start of the stream (which allows the command + * line to override stream data). But we must do an early parse of any + * command-line options that impact how we interpret the feature lines. + */ + for (i = 1; i < argc; i++) { + const char *arg = argv[i]; + if (*arg != '-' || !strcmp(arg, "--")) + break; + if (!strcmp(arg, "--allow-unsafe-features")) + allow_unsafe_features = 1; + } + global_argc = argc; global_argv = argv; diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index 1ba20c1f1a..ba5a35c32c 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2117,6 +2117,11 @@ test_expect_success 'R: only one import-marks feature allowed per stream' ' test_must_fail git fast-import input && + test_must_fail git fast-import input <<-EOF && feature export-marks=git.marks @@ -2127,7 +2132,7 @@ test_expect_success 'R: export-marks feature results in a marks file being creat EOF - git fast-import one.marks && tail -n +3 marks.out > two.marks && - git fast-import --import-marks=one.marks --import-marks=two.marks in = helper->out; argv_array_push(&fastimport->args, "fast-import"); + argv_array_push(&fastimport->args, "--allow-unsafe-features"); argv_array_push(&fastimport->args, debug ? "--stats" : "--quiet"); if (data->bidi_import) { -- cgit v1.2.3 From a52ed76142f6e8d993bb4c50938a408966eb2b7c Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 29 Aug 2019 15:08:42 -0400 Subject: fast-import: disallow "feature import-marks" by default As with export-marks in the previous commit, import-marks can access the filesystem. This is significantly less dangerous than export-marks because it only involves reading from arbitrary paths, rather than writing them. However, it could still be surprising and have security implications (e.g., exfiltrating data from a service that accepts fast-import streams). Let's lump it (and its "if-exists" counterpart) in with export-marks, and enable the in-stream version only if --allow-unsafe-features is set. Signed-off-by: Jeff King --- Documentation/git-fast-import.txt | 3 ++- fast-import.c | 2 ++ t/t9300-fast-import.sh | 22 +++++++++++++++++----- 3 files changed, 21 insertions(+), 6 deletions(-) diff --git a/Documentation/git-fast-import.txt b/Documentation/git-fast-import.txt index fbb3f914f2..ff71fc2962 100644 --- a/Documentation/git-fast-import.txt +++ b/Documentation/git-fast-import.txt @@ -57,7 +57,8 @@ OPTIONS allowing fast-import to access the filesystem outside of the repository). These options are disabled by default, but can be allowed by providing this option on the command line. This - currently impacts only the `feature export-marks` command. + currently impacts only the `export-marks`, `import-marks`, and + `import-marks-if-exists` feature commands. + Only enable this option if you trust the program generating the fast-import stream! This option is enabled automatically for diff --git a/fast-import.c b/fast-import.c index 967077ad0b..93c3838254 100644 --- a/fast-import.c +++ b/fast-import.c @@ -3344,8 +3344,10 @@ static int parse_one_feature(const char *feature, int from_stream) if (skip_prefix(feature, "date-format=", &arg)) { option_date_format(arg); } else if (skip_prefix(feature, "import-marks=", &arg)) { + check_unsafe_feature("import-marks", from_stream); option_import_marks(arg, from_stream, 0); } else if (skip_prefix(feature, "import-marks-if-exists=", &arg)) { + check_unsafe_feature("import-marks-if-exists", from_stream); option_import_marks(arg, from_stream, 1); } else if (skip_prefix(feature, "export-marks=", &arg)) { check_unsafe_feature(feature, from_stream); diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh index ba5a35c32c..77104f9daa 100755 --- a/t/t9300-fast-import.sh +++ b/t/t9300-fast-import.sh @@ -2106,6 +2106,14 @@ test_expect_success 'R: abort on receiving feature after data command' ' test_must_fail git fast-import git.marks && + echo "feature import-marks=git.marks" >input && + test_must_fail git fast-import input && + test_must_fail git fast-import git.marks && >git2.marks && @@ -2114,7 +2122,7 @@ test_expect_success 'R: only one import-marks feature allowed per stream' ' feature import-marks=git2.marks EOF - test_must_fail git fast-import expect && - git fast-import --export-marks=io.marks <<-\EOF && + git fast-import --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=not_io.marks EOF test_cmp expect io.marks && @@ -2221,7 +2230,8 @@ test_expect_success 'R: feature import-marks-if-exists' ' echo ":1 $blob" >expect && echo ":2 $blob" >>expect && - git fast-import --export-marks=io.marks <<-\EOF && + git fast-import --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=io.marks blob mark :2 @@ -2234,7 +2244,8 @@ test_expect_success 'R: feature import-marks-if-exists' ' echo ":3 $blob" >>expect && git fast-import --import-marks=io.marks \ - --export-marks=io.marks <<-\EOF && + --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=not_io.marks blob mark :3 @@ -2247,7 +2258,8 @@ test_expect_success 'R: feature import-marks-if-exists' ' >expect && git fast-import --import-marks-if-exists=not_io.marks \ - --export-marks=io.marks <<-\EOF && + --export-marks=io.marks \ + --allow-unsafe-features <<-\EOF && feature import-marks-if-exists=io.marks EOF test_cmp expect io.marks -- cgit v1.2.3 From 0060fd1511b94c918928fa3708f69a3f33895a4a Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 12 Sep 2019 14:20:39 +0200 Subject: clone --recurse-submodules: prevent name squatting on Windows In addition to preventing `.git` from being tracked by Git, on Windows we also have to prevent `git~1` from being tracked, as the default NTFS short name (also known as the "8.3 filename") for the file name `.git` is `git~1`, otherwise it would be possible for malicious repositories to write directly into the `.git/` directory, e.g. a `post-checkout` hook that would then be executed _during_ a recursive clone. When we implemented appropriate protections in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), we had analyzed carefully that the `.git` directory or file would be guaranteed to be the first directory entry to be written. Otherwise it would be possible e.g. for a file named `..git` to be assigned the short name `git~1` and subsequently, the short name generated for `.git` would be `git~2`. Or `git~3`. Or even `~9999999` (for a detailed explanation of the lengths we have to go to protect `.gitmodules`, see the commit message of e7cb0b4455c (is_ntfs_dotgit: match other .git files, 2018-05-11)). However, by exploiting two issues (that will be addressed in a related patch series close by), it is currently possible to clone a submodule into a non-empty directory: - On Windows, file names cannot end in a space or a period (for historical reasons: the period separating the base name from the file extension was not actually written to disk, and the base name/file extension was space-padded to the full 8/3 characters, respectively). Helpfully, when creating a directory under the name, say, `sub.`, that trailing period is trimmed automatically and the actual name on disk is `sub`. This means that while Git thinks that the submodule names `sub` and `sub.` are different, they both access `.git/modules/sub/`. - While the backslash character is a valid file name character on Linux, it is not so on Windows. As Git tries to be cross-platform, it therefore allows backslash characters in the file names stored in tree objects. Which means that it is totally possible that a submodule `c` sits next to a file `c\..git`, and on Windows, during recursive clone a file called `..git` will be written into `c/`, of course _before_ the submodule is cloned. Note that the actual exploit is not quite as simple as having a submodule `c` next to a file `c\..git`, as we have to make sure that the directory `.git/modules/b` already exists when the submodule is checked out, otherwise a different code path is taken in `module_clone()` that does _not_ allow a non-empty submodule directory to exist already. Even if we will address both issues nearby (the next commit will disallow backslash characters in tree entries' file names on Windows, and another patch will disallow creating directories/files with trailing spaces or periods), it is a wise idea to defend in depth against this sort of attack vector: when submodules are cloned recursively, we now _require_ the directory to be empty, addressing CVE-2019-1349. Note: the code path we patch is shared with the code path of `git submodule update --init`, which must not expect, in general, that the directory is empty. Hence we have to introduce the new option `--force-init` and hand it all the way down from `git submodule` to the actual `git submodule--helper` process that performs the initial clone. Reported-by: Nicolas Joly Signed-off-by: Johannes Schindelin --- builtin/clone.c | 2 +- builtin/submodule--helper.c | 13 ++++++++++++- git-submodule.sh | 6 ++++++ t/t7415-submodule-names.sh | 31 +++++++++++++++++++++++++++++++ 4 files changed, 50 insertions(+), 2 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index f7e17d2295..44b716923f 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -757,7 +757,7 @@ static int checkout(int submodule_progress) if (!err && (option_recurse_submodules.nr > 0)) { struct argv_array args = ARGV_ARRAY_INIT; - argv_array_pushl(&args, "submodule", "update", "--init", "--recursive", NULL); + argv_array_pushl(&args, "submodule", "update", "--require-init", "--recursive", NULL); if (option_shallow_submodules == 1) argv_array_push(&args, "--depth=1"); diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index 676cfed770..79156fac45 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -13,6 +13,7 @@ #include "remote.h" #include "refs.h" #include "connect.h" +#include "dir.h" static char *get_default_remote(void) { @@ -623,6 +624,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) char *p, *path = NULL, *sm_gitdir; struct strbuf sb = STRBUF_INIT; struct string_list reference = STRING_LIST_INIT_NODUP; + int require_init = 0; char *sm_alternate = NULL, *error_strategy = NULL; struct option module_clone_options[] = { @@ -647,6 +649,8 @@ static int module_clone(int argc, const char **argv, const char *prefix) OPT__QUIET(&quiet, "Suppress output for cloning a submodule"), OPT_BOOL(0, "progress", &progress, N_("force cloning progress")), + OPT_BOOL(0, "require-init", &require_init, + N_("disallow cloning into non-empty directory")), OPT_END() }; @@ -685,6 +689,8 @@ static int module_clone(int argc, const char **argv, const char *prefix) die(_("clone of '%s' into submodule path '%s' failed"), url, path); } else { + if (require_init && !access(path, X_OK) && !is_empty_dir(path)) + die(_("directory not empty: '%s'"), path); if (safe_create_leading_directories_const(path) < 0) die(_("could not create directory '%s'"), path); strbuf_addf(&sb, "%s/index", sm_gitdir); @@ -733,6 +739,7 @@ struct submodule_update_clone { int quiet; int recommend_shallow; struct string_list references; + unsigned require_init; const char *depth; const char *recursive_prefix; const char *prefix; @@ -748,7 +755,7 @@ struct submodule_update_clone { int failed_clones_nr, failed_clones_alloc; }; #define SUBMODULE_UPDATE_CLONE_INIT {0, MODULE_LIST_INIT, 0, \ - SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, \ + SUBMODULE_UPDATE_STRATEGY_INIT, 0, 0, -1, STRING_LIST_INIT_DUP, 0, \ NULL, NULL, NULL, \ STRING_LIST_INIT_DUP, 0, NULL, 0, 0} @@ -850,6 +857,8 @@ static int prepare_to_clone_next_submodule(const struct cache_entry *ce, argv_array_pushl(&child->args, "--prefix", suc->prefix, NULL); if (suc->recommend_shallow && sub->recommend_shallow == 1) argv_array_push(&child->args, "--depth=1"); + if (suc->require_init) + argv_array_push(&child->args, "--require-init"); argv_array_pushl(&child->args, "--path", sub->path, NULL); argv_array_pushl(&child->args, "--name", sub->name, NULL); argv_array_pushl(&child->args, "--url", sub->url, NULL); @@ -992,6 +1001,8 @@ static int update_clone(int argc, const char **argv, const char *prefix) OPT__QUIET(&suc.quiet, N_("don't print cloning progress")), OPT_BOOL(0, "progress", &suc.progress, N_("force cloning progress")), + OPT_BOOL(0, "require-init", &suc.require_init, + N_("disallow cloning into non-empty directory")), OPT_END() }; diff --git a/git-submodule.sh b/git-submodule.sh index 8f260fbd9c..e4843a5874 100755 --- a/git-submodule.sh +++ b/git-submodule.sh @@ -34,6 +34,7 @@ reference= cached= recursive= init= +require_init= files= remote= nofetch= @@ -528,6 +529,10 @@ cmd_update() -i|--init) init=1 ;; + --require-init) + init=1 + require_init=1 + ;; --remote) remote=1 ;; @@ -606,6 +611,7 @@ cmd_update() ${update:+--update "$update"} \ ${reference:+"$reference"} \ ${depth:+--depth "$depth"} \ + ${require_init:+--require-init} \ ${recommend_shallow:+"$recommend_shallow"} \ ${jobs:+$jobs} \ "$@" || echo "#unmatched" $? diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index 75fa071c6d..e1cd0a3545 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -73,4 +73,35 @@ test_expect_success 'clone evil superproject' ' ! grep "RUNNING POST CHECKOUT" output ' +test_expect_success MINGW 'prevent git~1 squatting on Windows' ' + git init squatting && + ( + cd squatting && + mkdir a && + touch a/..git && + git add a/..git && + test_tick && + git commit -m initial && + + modules="$(test_write_lines \ + "[submodule \"b.\"]" "url = ." "path = c" \ + "[submodule \"b\"]" "url = ." "path = d\\\\a" | + git hash-object -w --stdin)" && + rev="$(git rev-parse --verify HEAD)" && + hash="$(echo x | git hash-object -w --stdin)" && + git update-index --add \ + --cacheinfo 100644,$modules,.gitmodules \ + --cacheinfo 160000,$rev,c \ + --cacheinfo 160000,$rev,d\\a \ + --cacheinfo 100644,$hash,d./a/x \ + --cacheinfo 100644,$hash,d./a/..git && + test_tick && + git commit -m "module" + ) && + test_must_fail git \ + clone --recurse-submodules squatting squatting-clone 2>err && + test_i18ngrep "directory not empty" err && + ! grep gitdir squatting-clone/d/a/git~2 +' + test_done -- cgit v1.2.3 From e1d911dd4c7b76a5a8cec0f5c8de15981e34da83 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Thu, 12 Sep 2019 14:54:05 +0200 Subject: mingw: disallow backslash characters in tree objects' file names The backslash character is not a valid part of a file name on Windows. Hence it is dangerous to allow writing files that were unpacked from tree objects, when the stored file name contains a backslash character: it will be misinterpreted as directory separator. This not only causes ambiguity when a tree contains a blob `a\b` and a tree `a` that contains a blob `b`, but it also can be used as part of an attack vector to side-step the careful protections against writing into the `.git/` directory during a clone of a maliciously-crafted repository. Let's prevent that, addressing CVE-2019-1354. Note: we guard against backslash characters in tree objects' file names _only_ on Windows (because on other platforms, even on those where NTFS volumes can be mounted, the backslash character is _not_ a directory separator), and _only_ when `core.protectNTFS = true` (because users might need to generate tree objects for other platforms, of course without touching the worktree, e.g. using `git update-index --cacheinfo`). Signed-off-by: Johannes Schindelin --- t/t1450-fsck.sh | 1 + t/t7415-submodule-names.sh | 8 +++++--- t/t9350-fast-export.sh | 1 + tree-walk.c | 6 ++++++ 4 files changed, 13 insertions(+), 3 deletions(-) diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index cb4b66e29d..33c955f912 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -419,6 +419,7 @@ while read name path pretty; do ( git init $name-$type && cd $name-$type && + git config core.protectNTFS false && echo content >file && git add file && git commit -m base && diff --git a/t/t7415-submodule-names.sh b/t/t7415-submodule-names.sh index e1cd0a3545..7c65e7a35c 100755 --- a/t/t7415-submodule-names.sh +++ b/t/t7415-submodule-names.sh @@ -89,16 +89,18 @@ test_expect_success MINGW 'prevent git~1 squatting on Windows' ' git hash-object -w --stdin)" && rev="$(git rev-parse --verify HEAD)" && hash="$(echo x | git hash-object -w --stdin)" && - git update-index --add \ + git -c core.protectNTFS=false update-index --add \ --cacheinfo 100644,$modules,.gitmodules \ --cacheinfo 160000,$rev,c \ --cacheinfo 160000,$rev,d\\a \ --cacheinfo 100644,$hash,d./a/x \ --cacheinfo 100644,$hash,d./a/..git && test_tick && - git commit -m "module" + git -c core.protectNTFS=false commit -m "module" && + test_must_fail git show HEAD: 2>err && + test_i18ngrep backslash err ) && - test_must_fail git \ + test_must_fail git -c core.protectNTFS=false \ clone --recurse-submodules squatting squatting-clone 2>err && test_i18ngrep "directory not empty" err && ! grep gitdir squatting-clone/d/a/git~2 diff --git a/t/t9350-fast-export.sh b/t/t9350-fast-export.sh index 866ddf6058..e6062071e6 100755 --- a/t/t9350-fast-export.sh +++ b/t/t9350-fast-export.sh @@ -421,6 +421,7 @@ test_expect_success 'directory becomes symlink' ' test_expect_success 'fast-export quotes pathnames' ' git init crazy-paths && + test_config -C crazy-paths core.protectNTFS false && (cd crazy-paths && blob=$(echo foo | git hash-object -w --stdin) && git update-index --add \ diff --git a/tree-walk.c b/tree-walk.c index d459feda23..54ff959d7f 100644 --- a/tree-walk.c +++ b/tree-walk.c @@ -41,6 +41,12 @@ static int decode_tree_entry(struct tree_desc *desc, const char *buf, unsigned l strbuf_addstr(err, _("empty filename in tree entry")); return -1; } +#ifdef GIT_WINDOWS_NATIVE + if (protect_ntfs && strchr(path, '\\')) { + strbuf_addf(err, _("filename in tree entry contains backslash: '%s'"), path); + return -1; + } +#endif len = strlen(path) + 1; /* Initialize the descriptor entry */ -- cgit v1.2.3 From 525e7fba7854c23ee3530d0bf88d75f106f14c95 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 16 Sep 2019 20:44:31 +0200 Subject: path.c: document the purpose of `is_ntfs_dotgit()` Previously, this function was completely undocumented. It is worth, though, to explain what is going on, as it is not really obvious at all. Suggested-by: Garima Singh Signed-off-by: Johannes Schindelin --- path.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/path.c b/path.c index 9ac0531a29..22bd0b6f52 100644 --- a/path.c +++ b/path.c @@ -1302,6 +1302,34 @@ static int only_spaces_and_periods(const char *path, size_t len, size_t skip) return 1; } +/* + * On NTFS, we need to be careful to disallow certain synonyms of the `.git/` + * directory: + * + * - For historical reasons, file names that end in spaces or periods are + * automatically trimmed. Therefore, `.git . . ./` is a valid way to refer + * to `.git/`. + * + * - For other historical reasons, file names that do not conform to the 8.3 + * format (up to eight characters for the basename, three for the file + * extension, certain characters not allowed such as `+`, etc) are associated + * with a so-called "short name", at least on the `C:` drive by default. + * Which means that `git~1/` is a valid way to refer to `.git/`. + * + * Note: Technically, `.git/` could receive the short name `git~2` if the + * short name `git~1` were already used. In Git, however, we guarantee that + * `.git` is the first item in a directory, therefore it will be associated + * with the short name `git~1` (unless short names are disabled). + * + * When this function returns 1, it indicates that the specified file/directory + * name refers to a `.git` file or directory, or to any of these synonyms, and + * Git should therefore not track it. + * + * This function is intended to be used by `git fsck` even on platforms where + * the backslash is a regular filename character, therefore it needs to handle + * backlash characters in the provided `name` specially: they are interpreted + * as directory separators. + */ int is_ntfs_dotgit(const char *name) { size_t len; -- cgit v1.2.3 From a62f9d1ace8c6556cbc1bb7df69eff0a0bb9e774 Mon Sep 17 00:00:00 2001 From: Garima Singh Date: Wed, 4 Sep 2019 13:36:39 -0400 Subject: test-path-utils: offer to run a protectNTFS/protectHFS benchmark In preparation to flipping the default on `core.protectNTFS`, let's have some way to measure the speed impact of this config setting reliably (and for comparison, the `core.protectHFS` config setting). For now, this is a manual performance benchmark: ./t/helper/test-path-utils protect_ntfs_hfs [arguments...] where the arguments are an optional number of file names to test with, optionally followed by minimum and maximum length of the random file names. The default values are one million, 3 and 20, respectively. Just like `sqrti()` in `bisect.c`, we introduce a very simple function to approximation the square root of a given value, in order to avoid having to introduce the first user of `` in Git's source code. Note: this is _not_ implemented as a Unix shell script in t/perf/ because we really care about _very_ precise timings here, and Unix shell scripts are simply unsuited for precise and consistent benchmarking. Signed-off-by: Garima Singh Signed-off-by: Johannes Schindelin --- t/helper/test-path-utils.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) diff --git a/t/helper/test-path-utils.c b/t/helper/test-path-utils.c index 94846550f7..16d8e689c8 100644 --- a/t/helper/test-path-utils.c +++ b/t/helper/test-path-utils.c @@ -176,6 +176,99 @@ static int is_dotgitmodules(const char *path) return is_hfs_dotgitmodules(path) || is_ntfs_dotgitmodules(path); } +/* + * A very simple, reproducible pseudo-random generator. Copied from + * `test-genrandom.c`. + */ +static uint64_t my_random_value = 1234; + +static uint64_t my_random(void) +{ + my_random_value = my_random_value * 1103515245 + 12345; + return my_random_value; +} + +/* + * A fast approximation of the square root, without requiring math.h. + * + * It uses Newton's method to approximate the solution of 0 = x^2 - value. + */ +static double my_sqrt(double value) +{ + const double epsilon = 1e-6; + double x = value; + + if (value == 0) + return 0; + + for (;;) { + double delta = (value / x - x) / 2; + if (delta < epsilon && delta > -epsilon) + return x + delta; + x += delta; + } +} + +static int protect_ntfs_hfs_benchmark(int argc, const char **argv) +{ + size_t i, j, nr, min_len = 3, max_len = 20; + char **names; + int repetitions = 15, file_mode = 0100644; + uint64_t begin, end; + double m[3][2], v[3][2]; + uint64_t cumul; + double cumul2; + + if (argc > 1 && !strcmp(argv[1], "--with-symlink-mode")) { + file_mode = 0120000; + argc--; + argv++; + } + + nr = argc > 1 ? strtoul(argv[1], NULL, 0) : 1000000; + ALLOC_ARRAY(names, nr); + + if (argc > 2) { + min_len = strtoul(argv[2], NULL, 0); + if (argc > 3) + max_len = strtoul(argv[3], NULL, 0); + if (min_len > max_len) + die("min_len > max_len"); + } + + for (i = 0; i < nr; i++) { + size_t len = min_len + (my_random() % (max_len + 1 - min_len)); + + names[i] = xmallocz(len); + while (len > 0) + names[i][--len] = (char)(' ' + (my_random() % ('\x7f' - ' '))); + } + + for (protect_ntfs = 0; protect_ntfs < 2; protect_ntfs++) + for (protect_hfs = 0; protect_hfs < 2; protect_hfs++) { + cumul = 0; + cumul2 = 0; + for (i = 0; i < repetitions; i++) { + begin = getnanotime(); + for (j = 0; j < nr; j++) + verify_path(names[j], file_mode); + end = getnanotime(); + printf("protect_ntfs = %d, protect_hfs = %d: %lfms\n", protect_ntfs, protect_hfs, (end-begin) / (double)1e6); + cumul += end - begin; + cumul2 += (end - begin) * (end - begin); + } + m[protect_ntfs][protect_hfs] = cumul / (double)repetitions; + v[protect_ntfs][protect_hfs] = my_sqrt(cumul2 / (double)repetitions - m[protect_ntfs][protect_hfs] * m[protect_ntfs][protect_hfs]); + printf("mean: %lfms, stddev: %lfms\n", m[protect_ntfs][protect_hfs] / (double)1e6, v[protect_ntfs][protect_hfs] / (double)1e6); + } + + for (protect_ntfs = 0; protect_ntfs < 2; protect_ntfs++) + for (protect_hfs = 0; protect_hfs < 2; protect_hfs++) + printf("ntfs=%d/hfs=%d: %lf%% slower\n", protect_ntfs, protect_hfs, (m[protect_ntfs][protect_hfs] - m[0][0]) * 100 / m[0][0]); + + return 0; +} + int cmd_main(int argc, const char **argv) { if (argc == 3 && !strcmp(argv[1], "normalize_path_copy")) { @@ -290,6 +383,9 @@ int cmd_main(int argc, const char **argv) return !!res; } + if (argc > 1 && !strcmp(argv[1], "protect_ntfs_hfs")) + return !!protect_ntfs_hfs_benchmark(argc - 1, argv + 1); + fprintf(stderr, "%s: unknown function name: %s\n", argv[0], argv[1] ? argv[1] : "(there was none)"); return 1; -- cgit v1.2.3 From 288a74bcd28229a00c3632f18cba92dbfdf73ee9 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 23 Sep 2019 08:58:11 +0200 Subject: is_ntfs_dotgit(): only verify the leading segment The config setting `core.protectNTFS` is specifically designed to work not only on Windows, but anywhere, to allow for repositories hosted on, say, Linux servers to be protected against NTFS-specific attack vectors. As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated paths (but does not do the same for paths separated by forward slashes), under the assumption that the backslash might not be a valid directory separator on the _current_ Operating System. However, the two callers, `verify_path()` and `fsck_tree()`, are supposed to feed only individual path segments to the `is_ntfs_dotgit()` function. This causes a lot of duplicate scanning (and very inefficient scanning, too, as the inner loop of `is_ntfs_dotgit()` was optimized for readability rather than for speed. Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of splitting the paths by backslashes as directory separators on the callers of said function. Consequently, the `verify_path()` function, which already splits the path by directory separators, now treats backslashes as directory separators _explicitly_ when `core.protectNTFS` is turned on, even on platforms where the backslash is _not_ a directory separator. Note that we have to repeat some code in `verify_path()`: if the backslash is not a directory separator on the current Operating System, we want to allow file names like `\`, but we _do_ want to disallow paths that are clearly intended to cause harm when the repository is cloned on Windows. The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now needs to look for backslashes in tree entries' names specifically when `core.protectNTFS` is turned on. While it would be tempting to completely disallow backslashes in that case (much like `fsck` reports names containing forward slashes as "full paths"), this would be overzealous: when `core.protectNTFS` is turned on in a non-Windows setup, backslashes are perfectly valid characters in file names while we _still_ want to disallow tree entries that are clearly designed to exploit NTFS-specific behavior. This simplification will make subsequent changes easier to implement, such as turning `core.protectNTFS` on by default (not only on Windows) or protecting against attack vectors involving NTFS Alternate Data Streams. Incidentally, this change allows for catching malicious repositories that contain tree entries of the form `dir\.gitmodules` already on the server side rather than only on the client side (and previously only on Windows): in contrast to `is_ntfs_dotgit()`, the `is_ntfs_dotgitmodules()` function already expects the caller to split the paths by directory separators. Signed-off-by: Johannes Schindelin --- fsck.c | 11 ++++++++++- path.c | 5 +---- read-cache.c | 8 ++++++++ 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/fsck.c b/fsck.c index b1579c7e28..d80a96f4be 100644 --- a/fsck.c +++ b/fsck.c @@ -551,7 +551,7 @@ static int fsck_tree(struct tree *item, struct fsck_options *options) while (desc.size) { unsigned mode; - const char *name; + const char *name, *backslash; const struct object_id *oid; oid = tree_entry_extract(&desc, &name, &mode); @@ -565,6 +565,15 @@ static int fsck_tree(struct tree *item, struct fsck_options *options) is_hfs_dotgit(name) || is_ntfs_dotgit(name)); has_zero_pad |= *(char *)desc.buffer == '0'; + + if ((backslash = strchr(name, '\\'))) { + while (backslash) { + backslash++; + has_dotgit |= is_ntfs_dotgit(backslash); + backslash = strchr(backslash, '\\'); + } + } + if (update_tree_entry_gently(&desc)) { retval += report(options, &item->object, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree"); break; diff --git a/path.c b/path.c index 22bd0b6f52..f62a37d5f5 100644 --- a/path.c +++ b/path.c @@ -1342,10 +1342,7 @@ int is_ntfs_dotgit(const char *name) if (only_spaces_and_periods(name, len, 5) && !strncasecmp(name, "git~1", 5)) return 1; - if (name[len] != '\\') - return 0; - name += len + 1; - len = -1; + return 0; } } diff --git a/read-cache.c b/read-cache.c index 5b57b369e8..bde1e70c51 100644 --- a/read-cache.c +++ b/read-cache.c @@ -874,7 +874,15 @@ inside: if ((c == '.' && !verify_dotfile(path, mode)) || is_dir_sep(c) || c == '\0') return 0; + } else if (c == '\\' && protect_ntfs) { + if (is_ntfs_dotgit(path)) + return 0; + if (S_ISLNK(mode)) { + if (is_ntfs_dotgitmodules(path)) + return 0; + } } + c = *path++; } } -- cgit v1.2.3 From 7c3745fc6185495d5765628b4dfe1bd2c25a2981 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 28 Aug 2019 12:22:17 +0200 Subject: path: safeguard `.git` against NTFS Alternate Streams Accesses Probably inspired by HFS' resource streams, NTFS supports "Alternate Data Streams": by appending `:` to the file name, information in addition to the file contents can be written and read, information that is copied together with the file (unless copied to a non-NTFS location). These Alternate Data Streams are typically used for things like marking an executable as having just been downloaded from the internet (and hence not necessarily being trustworthy). In addition to a stream name, a stream type can be appended, like so: `::`. Unless specified, the default stream type is `$DATA` for files and `$INDEX_ALLOCATION` for directories. In other words, `.git::$INDEX_ALLOCATION` is a valid way to reference the `.git` directory! In our work in Git v2.2.1 to protect Git on NTFS drives under `core.protectNTFS`, we focused exclusively on NTFS short names, unaware of the fact that NTFS Alternate Data Streams offer a similar attack vector. Let's fix this. Seeing as it is better to be safe than sorry, we simply disallow paths referring to *any* NTFS Alternate Data Stream of `.git`, not just `::$INDEX_ALLOCATION`. This also simplifies the implementation. This closes CVE-2019-1352. Further reading about NTFS Alternate Data Streams: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3 Reported-by: Nicolas Joly Signed-off-by: Johannes Schindelin --- path.c | 12 +++++++++++- t/t1014-read-tree-confusing.sh | 1 + 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/path.c b/path.c index f62a37d5f5..e39ecf4689 100644 --- a/path.c +++ b/path.c @@ -1321,10 +1321,19 @@ static int only_spaces_and_periods(const char *path, size_t len, size_t skip) * `.git` is the first item in a directory, therefore it will be associated * with the short name `git~1` (unless short names are disabled). * + * - For yet other historical reasons, NTFS supports so-called "Alternate Data + * Streams", i.e. metadata associated with a given file, referred to via + * `::`. There exists a default stream + * type for directories, allowing `.git/` to be accessed via + * `.git::$INDEX_ALLOCATION/`. + * * When this function returns 1, it indicates that the specified file/directory * name refers to a `.git` file or directory, or to any of these synonyms, and * Git should therefore not track it. * + * For performance reasons, _all_ Alternate Data Streams of `.git/` are + * forbidden, not just `::$INDEX_ALLOCATION`. + * * This function is intended to be used by `git fsck` even on platforms where * the backslash is a regular filename character, therefore it needs to handle * backlash characters in the provided `name` specially: they are interpreted @@ -1335,7 +1344,8 @@ int is_ntfs_dotgit(const char *name) size_t len; for (len = 0; ; len++) - if (!name[len] || name[len] == '\\' || is_dir_sep(name[len])) { + if (!name[len] || name[len] == '\\' || is_dir_sep(name[len]) || + name[len] == ':') { if (only_spaces_and_periods(name, len, 4) && !strncasecmp(name, ".git", 4)) return 1; diff --git a/t/t1014-read-tree-confusing.sh b/t/t1014-read-tree-confusing.sh index 2f5a25d503..da3376b3bb 100755 --- a/t/t1014-read-tree-confusing.sh +++ b/t/t1014-read-tree-confusing.sh @@ -49,6 +49,7 @@ git~1 .git.SPACE .git.{space} .\\\\.GIT\\\\foobar backslashes .git\\\\foobar backslashes2 +.git...:alternate-stream EOF test_expect_success 'utf-8 paths allowed with core.protectHFS off' ' -- cgit v1.2.3 From 3a85dc7d534fc2d410ddc0c771c963b20d1b4857 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 6 Sep 2019 21:09:35 +0200 Subject: is_ntfs_dotgit(): speed it up Previously, this function was written without focusing on speed, intending to make reviewing the code as easy as possible, to avoid any bugs in this critical code. Turns out: we can do much better on both accounts. With this patch, we make it as fast as this developer can make it go: - We avoid the call to `is_dir_sep()` and make all the character comparisons explicit. - We avoid the cost of calling `strncasecmp()` and unroll the test for `.git` and `git~1`, not even using `tolower()` because it is faster to compare against two constant values. - We look for `.git` and `.git~1` first thing, and return early if not found. - We also avoid calling a separate function for detecting chains of spaces and periods. Each of these improvements has a noticeable impact on the speed of `is_ntfs_dotgit()`. Signed-off-by: Johannes Schindelin --- path.c | 55 ++++++++++++++++++++++++++++++------------------------- 1 file changed, 30 insertions(+), 25 deletions(-) diff --git a/path.c b/path.c index 2037e2d8c1..43b16aabd4 100644 --- a/path.c +++ b/path.c @@ -1288,20 +1288,6 @@ int daemon_avoid_alias(const char *p) } } -static int only_spaces_and_periods(const char *path, size_t len, size_t skip) -{ - if (len < skip) - return 0; - len -= skip; - path += skip; - while (len-- > 0) { - char c = *(path++); - if (c != ' ' && c != '.') - return 0; - } - return 1; -} - /* * On NTFS, we need to be careful to disallow certain synonyms of the `.git/` * directory: @@ -1341,19 +1327,38 @@ static int only_spaces_and_periods(const char *path, size_t len, size_t skip) */ int is_ntfs_dotgit(const char *name) { - size_t len; + char c; - for (len = 0; ; len++) - if (!name[len] || name[len] == '\\' || is_dir_sep(name[len]) || - name[len] == ':') { - if (only_spaces_and_periods(name, len, 4) && - !strncasecmp(name, ".git", 4)) - return 1; - if (only_spaces_and_periods(name, len, 5) && - !strncasecmp(name, "git~1", 5)) - return 1; + /* + * Note that when we don't find `.git` or `git~1` we end up with `name` + * advanced partway through the string. That's okay, though, as we + * return immediately in those cases, without looking at `name` any + * further. + */ + c = *(name++); + if (c == '.') { + /* .git */ + if (((c = *(name++)) != 'g' && c != 'G') || + ((c = *(name++)) != 'i' && c != 'I') || + ((c = *(name++)) != 't' && c != 'T')) return 0; - } + } else if (c == 'g' || c == 'G') { + /* git ~1 */ + if (((c = *(name++)) != 'i' && c != 'I') || + ((c = *(name++)) != 't' && c != 'T') || + *(name++) != '~' || + *(name++) != '1') + return 0; + } else + return 0; + + for (;;) { + c = *(name++); + if (!c || c == '\\' || c == '/' || c == ':') + return 1; + if (c != '.' && c != ' ') + return 0; + } } static int is_ntfs_dot_generic(const char *name, -- cgit v1.2.3 From 6d8684161ee9c03bed5cb69ae76dfdddb85a0003 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Fri, 13 Sep 2019 16:32:43 +0200 Subject: mingw: fix quoting of arguments We need to be careful to follow proper quoting rules. For example, if an argument contains spaces, we have to quote them. Double-quotes need to be escaped. Backslashes need to be escaped, but only if they are followed by a double-quote character. We need to be _extra_ careful to consider the case where an argument ends in a backslash _and_ needs to be quoted: in this case, we append a double-quote character, i.e. the backslash now has to be escaped! The current code, however, fails to recognize that, and therefore can turn an argument that ends in a single backslash into a quoted argument that now ends in an escaped double-quote character. This allows subsequent command-line parameters to be split and part of them being mistaken for command-line options, e.g. through a maliciously-crafted submodule URL during a recursive clone. Technically, we would not need to quote _all_ arguments which end in a backslash _unless_ the argument needs to be quoted anyway. For example, `test\` would not need to be quoted, while `test \` would need to be. To keep the code simple, however, and therefore easier to reason about and ensure its correctness, we now _always_ quote an argument that ends in a backslash. This addresses CVE-2019-1350. Signed-off-by: Johannes Schindelin --- compat/mingw.c | 9 ++++++--- t/t7416-submodule-dash-url.sh | 14 ++++++++++++++ 2 files changed, 20 insertions(+), 3 deletions(-) diff --git a/compat/mingw.c b/compat/mingw.c index 8b6fa0db44..459ee20df6 100644 --- a/compat/mingw.c +++ b/compat/mingw.c @@ -872,7 +872,7 @@ static const char *quote_arg(const char *arg) p++; len++; } - if (*p == '"') + if (*p == '"' || !*p) n += count*2 + 1; continue; } @@ -894,16 +894,19 @@ static const char *quote_arg(const char *arg) count++; *d++ = *arg++; } - if (*arg == '"') { + if (*arg == '"' || !*arg) { while (count-- > 0) *d++ = '\\'; + /* don't escape the surrounding end quote */ + if (!*arg) + break; *d++ = '\\'; } } *d++ = *arg++; } *d++ = '"'; - *d++ = 0; + *d++ = '\0'; return q; } diff --git a/t/t7416-submodule-dash-url.sh b/t/t7416-submodule-dash-url.sh index 459193c976..2966e93071 100755 --- a/t/t7416-submodule-dash-url.sh +++ b/t/t7416-submodule-dash-url.sh @@ -31,4 +31,18 @@ test_expect_success 'clone rejects unprotected dash' ' test_i18ngrep ignoring err ' +test_expect_success 'trailing backslash is handled correctly' ' + git init testmodule && + test_commit -C testmodule c && + git submodule add ./testmodule && + : ensure that the name ends in a double backslash && + sed -e "s|\\(submodule \"testmodule\\)\"|\\1\\\\\\\\\"|" \ + -e "s|url = .*|url = \" --should-not-be-an-option\"|" \ + <.gitmodules >.new && + mv .new .gitmodules && + git commit -am "Add testmodule" && + test_must_fail git clone --verbose --recurse-submodules . dolly 2>err && + test_i18ngrep ! "unknown option" err +' + test_done -- cgit v1.2.3 From 91bd46588e6959e6903e275f78b10bd07830d547 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Wed, 28 Aug 2019 12:22:17 +0200 Subject: path: also guard `.gitmodules` against NTFS Alternate Data Streams We just safe-guarded `.git` against NTFS Alternate Data Stream-related attack vectors, and now it is time to do the same for `.gitmodules`. Note: In the added regression test, we refrain from verifying all kinds of variations between short names and NTFS Alternate Data Streams: as the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it is enough to test one in order to know that all of them are guarded against. Signed-off-by: Johannes Schindelin --- path.c | 2 +- t/t0060-path-utils.sh | 7 ++++++- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/path.c b/path.c index e39ecf4689..2037e2d8c1 100644 --- a/path.c +++ b/path.c @@ -1369,7 +1369,7 @@ static int is_ntfs_dot_generic(const char *name, only_spaces_and_periods: for (;;) { char c = name[i++]; - if (!c) + if (!c || c == ':') return 1; if (c != ' ' && c != '.') return 0; diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh index 3f3357ed9f..2b8589e921 100755 --- a/t/t0060-path-utils.sh +++ b/t/t0060-path-utils.sh @@ -408,6 +408,9 @@ test_expect_success 'match .gitmodules' ' ~1000000 \ ~9999999 \ \ + .gitmodules:\$DATA \ + "gitmod~4 . :\$DATA" \ + \ --not \ ".gitmodules x" \ ".gitmodules .x" \ @@ -432,7 +435,9 @@ test_expect_success 'match .gitmodules' ' \ GI7EB~1 \ GI7EB~01 \ - GI7EB~1X + GI7EB~1X \ + \ + .gitmodules,:\$DATA ' test_done -- cgit v1.2.3 From 9102f958ee5254b10c0be72672aa3305bf4f4704 Mon Sep 17 00:00:00 2001 From: Johannes Schindelin Date: Mon, 9 Sep 2019 21:04:41 +0200 Subject: protect_ntfs: turn on NTFS protection by default Back in the DOS days, in the FAT file system, file names always consisted of a base name of length 8 plus a file extension of length 3. Shorter file names were simply padded with spaces to the full 8.3 format. Later, the FAT file system was taught to support _also_ longer names, with an 8.3 "short name" as primary file name. While at it, the same facility allowed formerly illegal file names, such as `.git` (empty base names were not allowed), which would have the "short name" `git~1` associated with it. For backwards-compatibility, NTFS supports alternative 8.3 short filenames, too, even if starting with Windows Vista, they are only generated on the system drive by default. We addressed the problem that the `.git/` directory can _also_ be accessed via `git~1/` (when short names are enabled) in 2b4c6efc821 (read-cache: optionally disallow NTFS .git variants, 2014-12-16), i.e. since Git v1.9.5, by introducing the config setting `core.protectNTFS` and enabling it by default on Windows. In the meantime, Windows 10 introduced the "Windows Subsystem for Linux" (short: WSL), i.e. a way to run Linux applications/distributions in a thinly-isolated subsystem on Windows (giving rise to many a "2016 is the Year of Linux on the Desktop" jokes). WSL is getting increasingly popular, also due to the painless way Linux application can operate directly ("natively") on files on Windows' file system: the Windows drives are mounted automatically (e.g. `C:` as `/mnt/c/`). Taken together, this means that we now have to enable the safe-guards of Git v1.9.5 also in WSL: it is possible to access a `.git` directory inside `/mnt/c/` via the 8.3 name `git~1` (unless short name generation was disabled manually). Since regular Linux distributions run in WSL, this means we have to enable `core.protectNTFS` at least on Linux, too. To enable Services for Macintosh in Windows NT to store so-called resource forks, NTFS introduced "Alternate Data Streams". Essentially, these constitute additional metadata that are connected to (and copied with) their associated files, and they are accessed via pseudo file names of the form `filename::`. In a recent patch, we extended `core.protectNTFS` to also protect against accesses via NTFS Alternate Data Streams, e.g. to prevent contents of the `.git/` directory to be "tracked" via yet another alternative file name. While it is not possible (at least by default) to access files via NTFS Alternate Data Streams from within WSL, the defaults on macOS when mounting network shares via SMB _do_ allow accessing files and directories in that way. Therefore, we need to enable `core.protectNTFS` on macOS by default, too, and really, on any Operating System that can mount network shares via SMB/CIFS. A couple of approaches were considered for fixing this: 1. We could perform a dynamic NTFS check similar to the `core.symlinks` check in `init`/`clone`: instead of trying to create a symbolic link in the `.git/` directory, we could create a test file and try to access `.git/config` via 8.3 name and/or Alternate Data Stream. 2. We could simply "flip the switch" on `core.protectNTFS`, to make it "on by default". The obvious downside of 1. is that it won't protect worktrees that were clone with a vulnerable Git version already. We considered patching code paths that check out files to check whether we're running on an NTFS system dynamically and persist the result in the repository-local config setting `core.protectNTFS`, but in the end decided that this solution would be too fragile, and too involved. The obvious downside of 2. is that everybody will have to "suffer" the performance penalty incurred from calling `is_ntfs_dotgit()` on every path, even in setups where. After the recent work to accelerate `is_ntfs_dotgit()` in