From 5180dd2e9ffb46eb02ee24439f2cd24a534d2954 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:12 +0000 Subject: config doc: don't describe *.fetchObjects twice MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Refer readers of fetch.fsckObjects and receive.fsckObjects to transfer.fsckObjects instead of repeating the description at each location. I don't think this description of them makes much sense, but for now I'm just moving the existing documentation around. Making it better will be done in a later patch. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 43b2de7b5f..6b99cf8d71 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1463,10 +1463,9 @@ fetch.recurseSubmodules:: fetch.fsckObjects:: If it is set to true, git-fetch-pack will check all fetched - objects. It will abort in the case of a malformed object or a - broken link. The result of an abort are only dangling objects. - Defaults to false. If not set, the value of `transfer.fsckObjects` - is used instead. + objects. See `transfer.fsckObjects` for what's + checked. Defaults to false. If not set, the value of + `transfer.fsckObjects` is used instead. fetch.unpackLimit:: If the number of objects fetched over the Git native @@ -2889,10 +2888,9 @@ receive.certNonceSlop:: receive.fsckObjects:: If it is set to true, git-receive-pack will check all received - objects. It will abort in the case of a malformed object or a - broken link. The result of an abort are only dangling objects. - Defaults to false. If not set, the value of `transfer.fsckObjects` - is used instead. + objects. See `transfer.fsckObjects` for what's checked. + Defaults to false. If not set, the value of + `transfer.fsckObjects` is used instead. receive.fsck.:: When `receive.fsckObjects` is set to true, errors can be switched @@ -3389,6 +3387,10 @@ transfer.fsckObjects:: When `fetch.fsckObjects` or `receive.fsckObjects` are not set, the value of this variable is used instead. Defaults to false. ++ +When set, the fetch or receive will abort in the case of a malformed +object or a broken link. The result of an abort are only dangling +objects. transfer.hideRefs:: String(s) `receive-pack` and `upload-pack` use to decide which -- cgit v1.2.3 From b2558abdc471758ae8b30c1198e833a7a1c6616c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:13 +0000 Subject: config doc: unify the description of fsck.* and receive.fsck.* MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The documentation for the fsck. and receive.fsck. variables was mostly duplicated in two places, with fsck. making no mention of the corresponding receive.fsck., and the same for fsck.skipList. I spent quite a lot of time today wondering why setting the fsck. variant wasn't working to clone a legacy repository (not that that would have worked anyway, but a subsequent patch implements fetch.fsck.). Rectify this situation by describing the feature in general terms under the fsck.* documentation, and make the receive.fsck.* documentation refer to those variables instead. This documentation was initially added in 2becf00ff7 ("fsck: support demoting errors to warnings", 2015-06-22) and 4b55b9b479 ("fsck: document the new receive.fsck. options", 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 62 +++++++++++++++++++++++++++--------------------- 1 file changed, 35 insertions(+), 27 deletions(-) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 6b99cf8d71..8d08250a5b 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1595,15 +1595,30 @@ filter..smudge:: linkgit:gitattributes[5] for details. fsck.:: - Allows overriding the message type (error, warn or ignore) of a - specific message ID such as `missingEmail`. -+ -For convenience, fsck prefixes the error/warning with the message ID, -e.g. "missingEmail: invalid author/committer line - missing email" means -that setting `fsck.missingEmail = ignore` will hide that issue. -+ -This feature is intended to support working with legacy repositories -which cannot be repaired without disruptive changes. + During fsck git may find issues with legacy data which + wouldn't be generated by current versions of git, and which + wouldn't be sent over the wire if `transfer.fsckObjects` was + set. This feature is intended to support working with legacy + repositories containing such data. ++ +Setting `fsck.` will be picked up by linkgit:git-fsck[1], but +to accept pushes of such data set `receive.fsck.` instead. ++ +The rest of the documentation discusses `fsck.*` for brevity, but the +same applies for the corresponding `receive.fsck.*` variables. ++ +When `fsck.` is set, errors can be switched to warnings and +vice versa by configuring the `fsck.` setting where the +`` is the fsck message ID and the value is one of `error`, +`warn` or `ignore`. For convenience, fsck prefixes the error/warning +with the message ID, e.g. "missingEmail: invalid author/committer line +- missing email" means that setting `fsck.missingEmail = ignore` will +hide that issue. ++ +In general, it is better to enumerate existing objects with problems +with `fsck.skipList`, instead of listing the kind of breakages these +problematic objects share to be ignored, as doing the latter will +allow new instances of the same breakages go unnoticed. fsck.skipList:: The path to a sorted list of object names (i.e. one SHA-1 per @@ -1612,6 +1627,9 @@ fsck.skipList:: should be accepted despite early commits containing errors that can be safely ignored such as invalid committer email addresses. Note: corrupt objects cannot be skipped with this setting. ++ +Like `fsck.` this variable has a corresponding +`receive.fsck.skipList` variant. gc.aggressiveDepth:: The depth parameter used in the delta compression @@ -2893,26 +2911,16 @@ receive.fsckObjects:: `transfer.fsckObjects` is used instead. receive.fsck.:: - When `receive.fsckObjects` is set to true, errors can be switched - to warnings and vice versa by configuring the `receive.fsck.` - setting where the `` is the fsck message ID and the value - is one of `error`, `warn` or `ignore`. For convenience, fsck prefixes - the error/warning with the message ID, e.g. "missingEmail: invalid - author/committer line - missing email" means that setting - `receive.fsck.missingEmail = ignore` will hide that issue. -+ -This feature is intended to support working with legacy repositories -which would not pass pushing when `receive.fsckObjects = true`, allowing -the host to accept repositories with certain known issues but still catch -other issues. + Acts like `fsck.`, but is used by + linkgit:git-receive-pack[1] instead of + linkgit:git-fsck[1]. See the `fsck.` documentation for + details. receive.fsck.skipList:: - The path to a sorted list of object names (i.e. one SHA-1 per - line) that are known to be broken in a non-fatal way and should - be ignored. This feature is useful when an established project - should be accepted despite early commits containing errors that - can be safely ignored such as invalid committer email addresses. - Note: corrupt objects cannot be skipped with this setting. + Acts like `fsck.skipList`, but is used by + linkgit:git-receive-pack[1] instead of + linkgit:git-fsck[1]. See the `fsck.skipList` documentation for + details. receive.keepAlive:: After receiving the pack from the client, `receive-pack` may -- cgit v1.2.3 From 456bab87b2e5e698f1024df7a79a76627b4bdb32 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:14 +0000 Subject: config doc: elaborate on what transfer.fsckObjects does MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The existing documentation led the user to believe that all we were doing were basic reachability sanity checks, but that hasn't been true for a very long time. Update the description to match reality, and note the caveat that there's a quarantine for accepting pushes, but not for fetching. Also mention that the fsck checks for security issues, which was my initial motivation for writing this fetch.fsck.* series. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 8d08250a5b..291b4f3c57 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3397,8 +3397,17 @@ transfer.fsckObjects:: Defaults to false. + When set, the fetch or receive will abort in the case of a malformed -object or a broken link. The result of an abort are only dangling -objects. +object or a link to a nonexistent object. In addition, various other +issues are checked for, including legacy issues (see `fsck.`), +and potential security issues like the existence of a `.GIT` directory +or a malicious `.gitmodules` file (see the release notes for v2.2.1 +and v2.17.1 for details). Other sanity and security checks may be +added in future releases. ++ +On the receiving side, failing fsckObjects will make those objects +unreachable, see "QUARANTINE ENVIRONMENT" in +linkgit:git-receive-pack[1]. On the fetch side, malformed objects will +instead be left unreferenced in the repository. transfer.hideRefs:: String(s) `receive-pack` and `upload-pack` use to decide which -- cgit v1.2.3 From 720dae5a19074e404224c9051152d7c64ba2acf0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:15 +0000 Subject: config doc: elaborate on fetch.fsckObjects security MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Change the transfer.fsckObjects documentation to explicitly note the unique security and/or corruption issues fetch.fsckObjects suffers from, since it doesn't have a quarantine environment. This was already alluded to in the existing documentation, but let's spell it out so there's no confusion here, and give a concrete example of how to work around this limitation. Let's also prominently note that this is considered to be a limitation of the current implementation, rather than something that's intended and by design, since we might change this in the future. See https://public-inbox.org/git/20180531060259.GE17344@sigill.intra.peff.net/ for further details. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 291b4f3c57..7ff453c53b 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -3408,6 +3408,27 @@ On the receiving side, failing fsckObjects will make those objects unreachable, see "QUARANTINE ENVIRONMENT" in linkgit:git-receive-pack[1]. On the fetch side, malformed objects will instead be left unreferenced in the repository. ++ +Due to the non-quarantine nature of the `fetch.fsckObjects` +implementation it can not be relied upon to leave the object store +clean like `receive.fsckObjects` can. ++ +As objects are unpacked they're written to the object store, so there +can be cases where malicious objects get introduced even though the +"fetch" failed, only to have a subsequent "fetch" succeed because only +new incoming objects are checked, not those that have already been +written to the object store. That difference in behavior should not be +relied upon. In the future, such objects may be quarantined for +"fetch" as well. ++ +For now, the paranoid need to find some way to emulate the quarantine +environment if they'd like the same protection as "push". E.g. in the +case of an internal mirror do the mirroring in two steps, one to fetch +the untrusted objects, and then do a second "push" (which will use the +quarantine) to another internal repo, and have internal clients +consume this pushed-to repository, or embargo internal fetches and +only allow them once a full "fsck" has run (and no new fetches have +happened in the meantime). transfer.hideRefs:: String(s) `receive-pack` and `upload-pack` use to decide which -- cgit v1.2.3 From 1362df0d41311beac8c778030ad0e8dcc2ef9af2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:17 +0000 Subject: fetch: implement fetch.fsck.* MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Implement support for fetch.fsck.* corresponding with the existing receive.fsck.*. This allows for pedantically cloning repositories with specific issues without turning off fetch.fsckObjects. One such repository is https://github.com/robbyrussell/oh-my-zsh.git which before this change will emit this error when cloned with fetch.fsckObjects: error: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes fatal: Error in object fatal: index-pack failed Now with fetch.fsck.zeroPaddedFilemode=warn we'll warn about that issue, but the clone will succeed: warning: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: contains zero-padded file modes warning: object a18c4d13c2a5fa2d4ecd5346c50e119b999b807d: zeroPaddedFilemode: contains zero-padded file modes warning: object 84df066176c8da3fd59b13731a86d90f4f1e5c9d: zeroPaddedFilemode: contains zero-padded file modes The motivation for this is to be able to turn on fetch.fsckObjects globally across a fleet of computers but still be able to manually clone various legacy repositories by either white-listing specific issues, or better yet whitelist specific objects. The use of --git-dir=* instead of -C in the tests could be considered somewhat archaic, but the tests I'm adding here are duplicating the corresponding receive.* tests with as few changes as possible. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 7ff453c53b..8dace49daa 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1467,6 +1467,16 @@ fetch.fsckObjects:: checked. Defaults to false. If not set, the value of `transfer.fsckObjects` is used instead. +fetch.fsck.:: + Acts like `fsck.`, but is used by + linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See + the `fsck.` documentation for details. + +fetch.fsck.skipList:: + Acts like `fsck.skipList`, but is used by + linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See + the `fsck.skipList` documentation for details. + fetch.unpackLimit:: If the number of objects fetched over the Git native transfer is below this @@ -1602,10 +1612,12 @@ fsck.:: repositories containing such data. + Setting `fsck.` will be picked up by linkgit:git-fsck[1], but -to accept pushes of such data set `receive.fsck.` instead. +to accept pushes of such data set `receive.fsck.` instead, or +to clone or fetch it set `fetch.fsck.`. + The rest of the documentation discusses `fsck.*` for brevity, but the -same applies for the corresponding `receive.fsck.*` variables. +same applies for the corresponding `receive.fsck.*` and +`fetch..*`. variables. + When `fsck.` is set, errors can be switched to warnings and vice versa by configuring the `fsck.` setting where the @@ -1628,8 +1640,8 @@ fsck.skipList:: can be safely ignored such as invalid committer email addresses. Note: corrupt objects cannot be skipped with this setting. + -Like `fsck.` this variable has a corresponding -`receive.fsck.skipList` variant. +Like `fsck.` this variable has corresponding +`receive.fsck.skipList` and `fetch.fsck.skipList` variants. gc.aggressiveDepth:: The depth parameter used in the delta compression -- cgit v1.2.3 From d786da1cd9c43b6747c78811a4b6ab2028dfaf50 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:18 +0000 Subject: fsck: test & document {fetch,receive}.fsck.* config fallback MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Test and document that the {fetch,receive}.fsck.* family of variables doesn't fall back on the corresponding .fsck.* variables. This was alluded to in the existing documentation by saying that "receive" looks at receive.fsck.* and "fsck" looks at fsck.* etc., but it wasn't explicitly stated that there was no fallback, and if you'd e.g. like to configure the skipList you need to do that for all three. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 12 ++++++++++++ 1 file changed, 12 insertions(+) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 8dace49daa..57c463c6e2 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1619,6 +1619,12 @@ The rest of the documentation discusses `fsck.*` for brevity, but the same applies for the corresponding `receive.fsck.*` and `fetch..*`. variables. + +Unlike variables like `color.ui` and `core.editor` the +`receive.fsck.` and `fetch.fsck.` variables will not +fall back on the `fsck.` configuration if they aren't set. To +uniformly configure the same fsck settings in different circumstances +all three of them they must all set to the same values. ++ When `fsck.` is set, errors can be switched to warnings and vice versa by configuring the `fsck.` setting where the `` is the fsck message ID and the value is one of `error`, @@ -1642,6 +1648,12 @@ fsck.skipList:: + Like `fsck.` this variable has corresponding `receive.fsck.skipList` and `fetch.fsck.skipList` variants. ++ +Unlike variables like `color.ui` and `core.editor` the +`receive.fsck.skipList` and `fetch.fsck.skipList` variables will not +fall back on the `fsck.skipList` configuration if they aren't set. To +uniformly configure the same fsck settings in different circumstances +all three of them they must all set to the same values. gc.aggressiveDepth:: The depth parameter used in the delta compression -- cgit v1.2.3 From 8a6d0525b74fe07bde0436e4cbf87b23adf7df0a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=C3=86var=20Arnfj=C3=B6r=C3=B0=20Bjarmason?= Date: Fri, 27 Jul 2018 14:37:20 +0000 Subject: fsck: test and document unknown fsck. values MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When fsck. is set to an unknown value it'll cause "fsck" to die, but the same is not true of the "fetch" and "receive" variants. Document this and test for it. Signed-off-by: Ævar Arnfjörð Bjarmason Signed-off-by: Junio C Hamano --- Documentation/config.txt | 4 ++++ 1 file changed, 4 insertions(+) (limited to 'Documentation') diff --git a/Documentation/config.txt b/Documentation/config.txt index 57c463c6e2..4cead6119a 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1637,6 +1637,10 @@ In general, it is better to enumerate existing objects with problems with `fsck.skipList`, instead of listing the kind of breakages these problematic objects share to be ignored, as doing the latter will allow new instances of the same breakages go unnoticed. ++ +Setting an unknown `fsck.` value will cause fsck to die, but +doing the same for `receive.fsck.` and `fetch.fsck.` +will only cause git to warn. fsck.skipList:: The path to a sorted list of object names (i.e. one SHA-1 per -- cgit v1.2.3