summaryrefslogtreecommitdiff
path: root/refs/files-backend.c
AgeCommit message (Collapse)AuthorFilesLines
2021-10-25Merge branch 'jt/no-abuse-alternate-odb-for-submodules'Libravatar Junio C Hamano1-5/+11
Follow through the work to use the repo interface to access submodule objects in-process, instead of abusing the alternate object database interface. * jt/no-abuse-alternate-odb-for-submodules: submodule: trace adding submodule ODB as alternate submodule: pass repo to check_has_commit() object-file: only register submodule ODB if needed merge-{ort,recursive}: remove add_submodule_odb() refs: peeling non-the_repository iterators is BUG refs: teach arbitrary repo support to iterators refs: plumb repo into ref stores
2021-10-11Merge branch 'jk/ref-paranoia'Libravatar Junio C Hamano1-0/+5
The ref iteration code used to optionally allow dangling refs to be shown, which has been tightened up. * jk/ref-paranoia: refs: drop "broken" flag from for_each_fullref_in() ref-filter: drop broken-ref code entirely ref-filter: stop setting FILTER_REFS_INCLUDE_BROKEN repack, prune: drop GIT_REF_PARANOIA settings refs: turn on GIT_REF_PARANOIA by default refs: omit dangling symrefs when using GIT_REF_PARANOIA refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag refs-internal.h: reorganize DO_FOR_EACH_* flag documentation refs-internal.h: move DO_FOR_EACH_* flags next to each other t5312: be more assertive about command failure t5312: test non-destructive repack t5312: create bogus ref as necessary t5312: drop "verbose" helper t5600: provide detached HEAD for corruption failures t5516: don't use HEAD ref for invalid ref-deletion tests t7900: clean up some more broken refs
2021-10-08refs: peeling non-the_repository iterators is BUGLibravatar Jonathan Tan1-2/+3
There is currently no support for peeling the current ref of an iterator iterating over a non-the_repository ref store, and none is needed. Thus, for now, BUG() if that happens. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08refs: teach arbitrary repo support to iteratorsLibravatar Jonathan Tan1-1/+4
Note that should_pack_ref() is called when writing refs, which is only supported for the_repository, hence the_repository is hardcoded there. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-08refs: plumb repo into ref storesLibravatar Jonathan Tan1-2/+4
In preparation for the next 2 patches that adds (partial) support for arbitrary repositories to ref iterators, plumb a repository into all ref stores. There are no changes to program logic. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06Merge branch 'ab/retire-refs-unused-funcs'Libravatar Junio C Hamano1-3/+3
Code cleanup. * ab/retire-refs-unused-funcs: refs/ref-cache.[ch]: remove "incomplete" from create_dir_entry() refs/ref-cache.c: remove "mkdir" parameter from find_containing_dir() refs/ref-cache.[ch]: remove unused add_ref_entry() refs/ref-cache.[ch]: remove unused remove_entry_from_dir() refs.[ch]: remove unused ref_storage_backend_exists()
2021-10-03Merge branch 'hn/refs-errno-cleanup'Libravatar Junio C Hamano1-23/+126
Futz with the way 'errno' is relied on in the refs API to carry the failure modes up the call chain. * hn/refs-errno-cleanup: refs: make errno output explicit for read_raw_ref_fn refs/files-backend: stop setting errno from lock_ref_oid_basic refs: remove EINVAL errno output from specification of read_raw_ref_fn refs file backend: move raceproof_create_file() here
2021-10-03Merge branch 'ab/refs-files-cleanup'Libravatar Junio C Hamano1-97/+35
Continued work on top of the hn/refs-errno-cleanup topic. * ab/refs-files-cleanup: refs/files: remove unused "errno != ENOTDIR" condition refs/files: remove unused "errno == EISDIR" code refs/files: remove unused "oid" in lock_ref_oid_basic() refs API: remove OID argument to reflog_expire() reflog expire: don't lock reflogs using previously seen OID refs/files: add a comment about refs_reflog_exists() call refs: make repo_dwim_log() accept a NULL oid refs/debug: re-indent argument list for "prepare" refs/files: remove unused "skip" in lock_raw_ref() too refs/files: remove unused "extras/skip" in lock_ref_oid_basic() refs: drop unused "flags" parameter to lock_ref_oid_basic() refs/files: remove unused REF_DELETING in lock_ref_oid_basic() refs/packet: add missing BUG() invocations to reflog callbacks
2021-09-28refs/ref-cache.[ch]: remove "incomplete" from create_dir_entry()Libravatar Ævar Arnfjörð Bjarmason1-3/+3
Remove the now-unused "incomplete" parameter from create_dir_entry(), all its callers specify it as "1", so let's drop the "incomplete=0" case. The last caller to use it was search_for_subdir(), but that code was removed in the preceding commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flagLibravatar Jeff King1-0/+5
When the DO_FOR_EACH_INCLUDE_BROKEN flag is used, we include both actual corrupt refs (illegal names, missing objects), but also symrefs that point to nothing. This latter is not really a corruption, but just something that may happen normally. For example, the symref at refs/remotes/origin/HEAD may point to a tracking branch which is later deleted. (The local HEAD may also be unborn, of course, but we do not access it through ref iteration). Most callers of for_each_ref() etc, do not care. They don't pass INCLUDE_BROKEN, so don't see it at all. But for those which do pass it, this somewhat-normal state causes extra warnings (e.g., from for-each-ref) or even aborts operations (destructive repacks with GIT_REF_PARANOIA set). This patch just introduces the flag and the mechanism; there are no callers yet (and hence no tests). Two things to note on the implementation: - we actually skip any symref that does not resolve to a ref. This includes ones which point to an invalidly-named ref. You could argue this is a more serious breakage than simple dangling. But the overall effect is the same (we could not follow the symref), as well as the impact on things like REF_PARANOIA (either way, a symref we can't follow won't impact reachability, because we'll see the ref itself during iteration). The underlying resolution function doesn't distinguish these two cases (they both get REF_ISBROKEN). - we change the iterator in refs/files-backend.c where we check INCLUDE_BROKEN. There's a matching spot in refs/packed-backend.c, but we don't know need to do anything there. The packed backend does not support symrefs at all. The resulting set of flags might be a bit easier to follow if we broke this down into "INCLUDE_CORRUPT_REFS" and "INCLUDE_DANGLING_SYMREFS". But there are a few reasons not do so: - adding a new OMIT_DANGLING_SYMREFS flag lets us leave existing callers intact, without changing their behavior (and some of them really do want to see the dangling symrefs; e.g., t5505 has a test which expects us to report when a symref becomes dangling) - they're not actually independent. You cannot say "include dangling symrefs" without also including refs whose objects are not reachable, because dangling symrefs by definition do not have an object. We could tweak the implementation to distinguish this, but in practice nobody wants to ask for that. Adding the OMIT flag keeps the implementation simple and makes sure we don't regress the current behavior. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09refs/files-backend: remove unused open mode parameterLibravatar René Scharfe1-1/+1
We only need to provide a mode if we are willing to let open(2) create the file, which is not the case here, so drop the unnecessary parameter. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs: make errno output explicit for read_raw_ref_fnLibravatar Han-Wen Nienhuys1-14/+15
This makes it explicit how alternative ref backends should report errors in read_raw_ref_fn. read_raw_ref_fn needs to supply a credible errno for a number of cases. These are primarily: 1) The files backend calls read_raw_ref from lock_raw_ref, and uses the resulting error codes to create/remove directories as needed. 2) ENOENT should be translated in a zero OID, optionally with REF_ISBROKEN set, returning the last successfully resolved symref. This is necessary so read_raw_ref("HEAD") on an empty repo returns refs/heads/main (or the default branch du-jour), and we know on which branch to create the first commit. Make this information flow explicit by adding a failure_errno to the signature of read_raw_ref. All errnos from the files backend are still propagated unchanged, even though inspection suggests only ENOTDIR, EISDIR and ENOENT are relevant. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files-backend: stop setting errno from lock_ref_oid_basicLibravatar Han-Wen Nienhuys1-9/+2
refs/files-backend.c::lock_ref_oid_basic() tries to signal how it failed to its callers using errno. It is safe to stop setting errno here, because the callers of this file-scope static function are * files_copy_or_rename_ref() * files_create_symref() * files_reflog_expire() None of them looks at errno after seeing a negative return from lock_ref_oid_basic() to make any decision, and no caller of these three functions looks at errno after they signal a failure by returning a negative value. In particular, * files_copy_or_rename_ref() - here, calls are followed by error() (which performs I/O) or write_ref_to_lockfile() (which calls parse_object() which may perform I/O) * files_create_symref() - here, calls are followed by error() or create_symref_locked() (which performs I/O and does not inspect errno) * files_reflog_expire() - here, calls are followed by error() or refs_reflog_exists() (which calls a function in a vtable that is not documented to use and/or preserve errno) Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs file backend: move raceproof_create_file() hereLibravatar Ævar Arnfjörð Bjarmason1-0/+109
Move the raceproof_create_file() API added to cache.h and object-file.c in 177978f56ad (raceproof_create_file(): new function, 2017-01-06) to its only user, refs/files-backend.c. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files: remove unused "errno != ENOTDIR" conditionLibravatar Ævar Arnfjörð Bjarmason1-2/+1
As a follow-up to the preceding commit where we removed the adjacent "errno == EISDIR" condition in the same function, remove the "last_errno != ENOTDIR" condition here. It's not possible for us to hit this condition added in 5b2d8d6f218 (lock_ref_sha1_basic(): improve diagnostics for ref D/F conflicts, 2015-05-11). Since a1c1d8170db (refs_resolve_ref_unsafe: handle d/f conflicts for writes, 2017-10-06) we've explicitly caught these in refs_resolve_ref_unsafe() before returning NULL: if (errno != ENOENT && errno != EISDIR && errno != ENOTDIR) return NULL; We'd then always return the refname from refs_resolve_ref_unsafe() even if we were in a broken state as explained in the preceding commit. The elided context here is a call to refs_resolve_ref_unsafe(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files: remove unused "errno == EISDIR" codeLibravatar Ævar Arnfjörð Bjarmason1-25/+3
When we lock a reference like "foo" we need to handle the case where "foo" exists, but is an empty directory. That's what this code added in bc7127ef0f (ref locking: allow 'foo' when 'foo/bar' used to exist but not anymore., 2006-09-30) seems like it should be dealing with. Except it doesn't, and we never take this branch. The reason is that when bc7127ef0f was written this looked like: ref = resolve_ref([...]); if (!ref && errno == EISDIR) { [...] And in resolve_ref() we had this code: fd = open(path, O_RDONLY); if (fd < 0) return NULL; I.e. we would attempt to read "foo" with open(), which would fail with EISDIR and we'd return NULL. We'd then take this branch, call remove_empty_directories() and continue. Since a1c1d8170d (refs_resolve_ref_unsafe: handle d/f conflicts for writes, 2017-10-06) we don't. E.g. in the case of files_copy_or_rename_ref() our callstack will look something like: [...] -> files_copy_or_rename_ref() -> lock_ref_oid_basic() -> refs_resolve_ref_unsafe() At that point the first (now only) refs_resolve_ref_unsafe() call in lock_ref_oid_basic() would do the equivalent of this in the resulting call to refs_read_raw_ref() in refs_resolve_ref_unsafe(): /* Via refs_read_raw_ref() */ fd = open(path, O_RDONLY); if (fd < 0) /* get errno == EISDIR */ /* later, in refs_resolve_ref_unsafe() */ if ([...] && errno != EISDIR) return NULL; [...] /* returns the refs/heads/foo to the caller, even though it's a directory */ return refname; I.e. even though we got an "errno == EISDIR" we won't take this branch, since in cases of EISDIR "resolved" is always non-NULL. I.e. we pretend at this point as though everything's OK and there is no "foo" directory. We then proceed with the entire ref update and don't call remove_empty_directories() until we call commit_ref_update(). See 5387c0d883 (commit_ref(): if there is an empty dir in the way, delete it, 2016-05-05) for the addition of that code, and a1c1d8170db (refs_resolve_ref_unsafe: handle d/f conflicts for writes, 2017-10-06) for the commit that changed the original codepath added in bc7127ef0f to use this "EISDIR" handling. Further historical commentary: Before the two preceding commits the caller in files_reflog_expire() was the only one out of our 4 callers that would pass non-NULL as an oid. We would then set a (now gone) "resolve_flags" to "RESOLVE_REF_READING" and just before that "errno != EISDIR" check do: if (resolve_flags & RESOLVE_REF_READING) return NULL; There may have been some case where this ended up mattering and we couldn't safely make this change before we removed the "oid" parameter, but I don't think there was, see [1] for some discussion on that. In any case, now that we've removed the "oid" parameter in a preceding commit we can be sure that this code is redundant, so let's remove it. 1. http://lore.kernel.org/git/871r801yp6.fsf@evledraar.gmail.com Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files: remove unused "oid" in lock_ref_oid_basic()Libravatar Ævar Arnfjörð Bjarmason1-56/+14
In the preceding commit the last caller that passed a non-NULL OID was changed to pass NULL to lock_ref_oid_basic(). As noted in preceding commits use of this API has been going away (we should use ref transactions, or lock_raw_ref()), so we're unlikely to gain new callers that want to pass the "oid". So let's remove it, doing so means we can remove the "mustexist" condition, and therefore anything except the "flags = RESOLVE_REF_NO_RECURSE" case. Furthermore, since the verify_lock() function we called did most of its work when the "oid" was passed (as "old_oid") we can inline the trivial part of it that remains in its only remaining caller. Without a NULL "oid" passed it was equivalent to calling refs_read_ref_full() followed by oidclr(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs API: remove OID argument to reflog_expire()Libravatar Ævar Arnfjörð Bjarmason1-2/+1
Since the the preceding commit the "oid" parameter to reflog_expire() is always NULL, but it was not cleaned up to reduce the size of the diff. Let's do that subsequent API and documentation cleanup now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25reflog expire: don't lock reflogs using previously seen OIDLibravatar Ævar Arnfjörð Bjarmason1-2/+5
During reflog expiry, the cmd_reflog_expire() function first iterates over all reflogs in logs/*, and then one-by-one acquires the lock for each one and expires it. This behavior has been with us since this command was implemented in 4264dc15e1 ("git reflog expire", 2006-12-19). Change this to stop calling lock_ref_oid_basic() with the OID we saw when we looped over the logs, instead have it pass the OID it managed to lock. This mostly mitigates a race condition where e.g. "git gc" will fail in a concurrently updated repository because the branch moved since "git reflog expire --all" was started. I.e. with: error: cannot lock ref '<refname>': ref '<refname>' is at <OID-A> but expected <OID-B> This behavior of passing in an "oid" was needed for an edge-case that I've untangled in this and preceding commits though, namely that we needed this OID because we'd: 1. Lookup the reflog name/OID via dwim_log() 2. With that OID, lock the reflog 3. Later in builtin/reflog.c we use the OID we looked as input to lookup_commit_reference_gently(), assured that it's equal to the OID we got from dwim_log(). We can be sure that this change is safe to make because between dwim_log (step #1) and lock_ref_oid_basic (step #2) there was no other logic relevant to the OID or expiry run in the cmd_reflog_expire() caller. We can thus treat that code as a black box, before and after this change it would get an OID that's been locked, the only difference is that now we mostly won't be failing to get the lock due to the TOCTOU race[0]. That failure was purely an implementation detail in how the "current OID" was looked up, it was divorced from the locking mechanism. What do we mean with "mostly"? It mostly mitigates it because we'll still run into cases where the ref is locked and being updated as we want to expire it, and other git processes wanting to update the refs will in turn race with us as we expire the reflog. That remaining race can in turn be mitigated with the core.filesRefLockTimeout setting, see 4ff0f01cb7 ("refs: retry acquiring reference locks for 100ms", 2017-08-21). In practice if that value is high enough we'll probably never have ref updates or reflog expiry failing, since the clients involved will retry for far longer than the time any of those operations could take. See [1] for an initial report of how this impacted "git gc" and a large discussion about this change in early 2019. In particular patch looked good to Michael Haggerty, see his[2]. That message seems to not have made it to the ML archive, its content is quoted in full in my [3]. I'm leaving behind now-unused code the refs API etc. that takes the now-NULL "unused_oid" argument, and other code that can be simplified now that we never have on OID in that context, that'll be cleaned up in subsequent commits, but for now let's narrowly focus on fixing the "git gc" issue. As the modified assert() shows we always pass a NULL oid to reflog_expire() now. Unfortunately this sort of probabilistic contention is hard to turn into a test. I've tested this by running the following three subshells in concurrent terminals: ( rm -rf /tmp/git && git init /tmp/git && while true do head -c 10 /dev/urandom | hexdump >/tmp/git/out && git -C /tmp/git add out && git -C /tmp/git commit -m"out" done ) ( rm -rf /tmp/git-clone && git clone file:///tmp/git /tmp/git-clone && while git -C /tmp/git-clone pull do date done ) ( while git -C /tmp/git-clone reflog expire --all do date done ) Before this change the "reflog expire" would fail really quickly with the "but expected" error noted above. After this change both the "pull" and "reflog expire" will run for a while, but eventually fail because I get unlucky with core.filesRefLockTimeout (the "reflog expire" is in a really tight loop). As noted above that can in turn be mitigated with higher values of core.filesRefLockTimeout than the 100ms default. As noted in the commentary added in the preceding commit there's also the case of branches being racily deleted, that can be tested by adding this to the above: ( while git -C /tmp/git-clone branch topic master && git -C /tmp/git-clone branch -D topic do date done ) With core.filesRefLockTimeout set to 10 seconds (it can probably be a lot lower) I managed to run all four of these concurrently for about an hour, and accumulated ~125k commits, auto-gc's and all, and didn't have a single failure. The loops visibly stall while waiting for the lock, but that's expected and desired behavior. 0. https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use 1. https://lore.kernel.org/git/87tvg7brlm.fsf@evledraar.gmail.com/ 2. http://lore.kernel.org/git/b870a17d-2103-41b8-3cbc-7389d5fff33a@alum.mit.edu 3. https://lore.kernel.org/git/87pnqkco8v.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files: add a comment about refs_reflog_exists() callLibravatar Ævar Arnfjörð Bjarmason1-0/+13
Add a comment about why it is that we need to check for the the existence of a reflog we're deleting after we've successfully acquired the lock in files_reflog_expire(). As noted in [1] the lock protocol for reflogs is somewhat intuitive. This early exit code the comment applies to dates all the way back to 4264dc15e19 (git reflog expire, 2006-12-19). 1. https://lore.kernel.org/git/54DCDA42.2060800@alum.mit.edu/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files: remove unused "skip" in lock_raw_ref() tooLibravatar Ævar Arnfjörð Bjarmason1-5/+4
Remove the unused "skip" parameter to lock_raw_ref(), it was never used. We do use it when passing "skip" to the refs_rename_ref_available() function in files_copy_or_rename_ref(), but not here. This is part of a larger series that modifies lock_ref_oid_basic() extensively, there will be no more modifications of this function in this series, but since the preceding commit removed this unused parameter from lock_ref_oid_basic(), let's do it here too for consistency. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs/files: remove unused "extras/skip" in lock_ref_oid_basic()Libravatar Ævar Arnfjörð Bjarmason1-15/+7
The lock_ref_oid_basic() function has gradually been replaced by use of the file transaction API, there are only 4 remaining callers of it. None of those callers pass non-NULL "extras" and "skip" parameters, the last such caller went away in 92b1551b1d4 (refs: resolve symbolic refs first, 2016-04-25), so let's remove the parameters. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-25refs: drop unused "flags" parameter to lock_ref_oid_basic()Libravatar Jeff King1-7/+6
In the last commit we removed the REF_DELETING flag from lock_ref_oid_basic(). Since then all of the remaining callers do pass REF_NO_DEREF, but that has been ignored completely since 7a418f3a17 (lock_ref_sha1_basic(): only handle REF_NODEREF mode, 2016-04-22). So we can simply get rid of the parameter entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-19refs/files: remove unused REF_DELETING in lock_ref_oid_basic()Libravatar Ævar Arnfjörð Bjarmason1-2/+0
The lock_ref_oid_basic() function has gradually been replaced by most callers no longer performing a low-level "acquire lock, update and release", and instead using the ref transaction API. So there are only 4 remaining callers of lock_ref_oid_basic(). None of those callers pass REF_DELETING anymore, the last caller went away in 92b1551b1d (refs: resolve symbolic refs first, 2016-04-25). Before that we'd refactored and moved this code in: - 8df4e511387 (struct ref_update: move "have_old" into "flags", 2015-02-17) - 7bd9bcf372d (refs: split filesystem-based refs code into a new file, 2015-11-09) - 165056b2fc (lock_ref_for_update(): new function, 2016-04-24) We then finally stopped using it in 92b1551b1d (noted above). So let's remove the handling of this parameter. By itself this change doesn't benefit us much, but it's the start of even more removal of unused code in and around this function in subsequent commits. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-16Merge branch 'wc/packed-ref-removal-cleanup'Libravatar Junio C Hamano1-6/+6
When "git update-ref -d" removes a ref that is packed, it left empty directories under $GIT_DIR/refs/ for * wc/packed-ref-removal-cleanup: refs: cleanup directories when deleting packed ref
2021-05-11refs: cleanup directories when deleting packed refLibravatar Will Chandler1-6/+6
When deleting a packed ref via 'update-ref -d', a lockfile is made in the directory that would contain the loose copy of that ref, creating any directories in the ref's path that do not exist. When the transaction completes, the lockfile is deleted, but any empty parent directories made when creating the lockfile are left in place. These empty directories are not removed by 'pack-refs' or other housekeeping tasks and will accumulate over time. When deleting a loose ref, we remove all empty parent directories at the end of the transaction. This commit applies the parent directory cleanup logic used when deleting loose refs to packed refs as well. Signed-off-by: Will Chandler <wfc@wfchandler.org> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-27hash: provide per-algorithm null OIDsLibravatar brian m. carlson1-1/+1
Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13use CALLOC_ARRAYLibravatar René Scharfe1-5/+5
Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-06refs/files-backend: don't peek into `struct lock_file`Libravatar Martin Ågren1-2/+2
Similar to the previous commits, avoid peeking into the `struct lock_file`. Use the lock file API instead. Note how we obtain the path to the lock file if `fdopen_lock_file()` failed and that this is not a problem: as documented in lockfile.h, failure to "fdopen" does not roll back the lock file and we're free to, e.g., query it for its path. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-08refs: move REF_LOG_ONLY to refs-internal.hLibravatar Han-Wen Nienhuys1-7/+0
REF_LOG_ONLY is used in the transaction preparation: if a symref is involved in a transaction, the referent of the symref should be updated, and the symref itself should only be updated in the reflog. Other ref backends will need to duplicate this logic too, so move it to a central place. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-19refs: move gitdir into base ref_storeLibravatar Han-Wen Nienhuys1-9/+6
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-19refs: split off reading loose ref data in separate functionLibravatar Han-Wen Nienhuys1-15/+19
This prepares for handling FETCH_HEAD (which is not a regular ref) separately from the ref backend. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-31refs: move the logic to add \t to reflog to the files backendLibravatar Han-Wen Nienhuys1-1/+3
523fa69c (reflog: cleanse messages in the refs.c layer, 2020-07-10) centralized reflog normalizaton. However, the normalizaton added a leading "\t" to the message. This is an artifact of the reflog storage format in the files backend, so it should be added there. Routines that parse back the reflog (such as grab_nth_branch_switch) expect the "\t" to not be in the message, so without this fix, git with reftable cannot process the "@{-1}" syntax. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-10reflog: cleanse messages in the refs.c layerLibravatar Junio C Hamano1-1/+1
Regarding reflog messages: - We expect that a reflog message consists of a single line. The file format used by the files backend may add a LF after the message as a delimiter, and output by commands like "git log -g" may complete such an incomplete line by adding a LF at the end, but philosophically, the terminating LF is not a part of the message. - We however allow callers of refs API to supply a random sequence of NUL terminated bytes. We cleanse caller-supplied message by squashing a run of whitespaces into a SP, and by trimming trailing whitespace, before storing the message. This is how we tolerate, instead of erring out, a message with LF in it (be it at the end, in the middle, or both). Currently, the cleansing of the reflog message is done by the files backend, before the log is written out. This is sufficient with the current code, as that is the only backend that writes reflogs. But new backends can be added that write reflogs, and we'd want the resulting log message we would read out of "log -g" the same no matter what backend is used, and moving the code to do so to the generic layer is a way to do so. An added benefit is that the "cleansing" function could be updated later, independent from individual backends, to e.g. allow multi-line log messages if we wanted to, and when that happens, it would help a lot to ensure we covered all bases if the cleansing function (which would be updated) is called from the generic layer. Side note: I am not interested in supporting multi-line reflog messages right at the moment (nobody is asking for it), but I envision that instead of the "squash a run of whitespaces into a SP and rtrim" cleansing, we can %urlencode problematic bytes in the message *AND* append a SP at the end, when a new version of Git that supports multi-line and/or verbatim reflog messages writes a reflog record. The reading side can detect the presense of SP at the end (which should have been rtrimmed out if it were written by existing versions of Git) as a signal that decoding %urlencode recovers the original reflog message. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-30refs: fix segfault when aborting empty transactionLibravatar Patrick Steinhardt1-8/+10
When cleaning up a transaction that has no updates queued, then the transaction's backend data will not have been allocated. We correctly handle this for the packed backend, where the cleanup function checks whether the backend data has been allocated at all -- if not, then there is nothing to clean up. For the files backend we do not check this and as a result will hit a segfault due to dereferencing a `NULL` pointer when cleaning up such a transaction. Fix the issue by checking whether `backend_data` is set in the files backend, too. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31C: use skip_prefix() to avoid hardcoded string lengthLibravatar Junio C Hamano1-2/+1
We often skip an optional prefix in a string with a hardcoded constant, e.g. if (starts_with(string, "prefix")) string += 6; which is less error prone when written skip_prefix(string, "prefix", &string); Note that this changes a few error messages from "git reflog expire --expire=nonsense.timestamp", which used to complain by saying '--expire=nonsense.timestamp' is not a valid timestamp but with this change, we say 'nonsense.timestamp' is not a valid timestamp which is more technically correct (the string with --expire= as a prefix obviously cannot be a valid timestamp, but the error is about the part of the input without that prefix). Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-12-11refs: pass NULL to refs_read_ref_full() because object ID is not neededLibravatar René Scharfe1-2/+2
refs_read_ref_full() wraps refs_resolve_ref_unsafe(), which handles a NULL oid pointer of callers not interested in the resolved object ID. Pass NULL from files_copy_or_rename_ref() to clarify that it is one such caller. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-07dir-iterator: release strbuf after useLibravatar René Scharfe1-1/+3
Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11dir-iterator: add flags parameter to dir_iterator_beginLibravatar Matheus Tavares1-1/+1
Add the possibility of giving flags to dir_iterator_begin to initialize a dir-iterator with special options. Currently possible flags are: - DIR_ITERATOR_PEDANTIC, which makes dir_iterator_advance abort immediately in the case of an error, instead of keep looking for the next valid entry; - DIR_ITERATOR_FOLLOW_SYMLINKS, which makes the iterator follow symlinks and include linked directories' contents in the iteration. These new flags will be used in a subsequent patch. Also add tests for the flags' usage and adjust refs/files-backend.c to the new dir_iterator_begin signature. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-11dir-iterator: refactor state machine modelLibravatar Matheus Tavares1-4/+13
dir_iterator_advance() is a large function with two nested loops. Let's improve its readability factoring out three functions and simplifying its mechanics. The refactored model will no longer depend on level.initialized and level.dir_state to keep track of the iteration state and will perform on a single loop. Also, dir_iterator_begin() currently does not check if the given string represents a valid directory path. Since the refactored model will have to stat() the given path at initialization, let's also check for this kind of error and make dir_iterator_begin() return NULL, on failures, with errno appropriately set. And add tests for this new behavior. Improve documentation at dir-iteration.h and code comments at dir-iterator.c to reflect the changes and eliminate possible ambiguities. Finally, adjust refs/files-backend.c to check for now possible dir_iterator_begin() failures. Original-patch-by: Daniel Ferreira <bnmvco@gmail.com> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-16Merge branch 'jk/refs-double-abort'Libravatar Junio C Hamano1-1/+15
A corner case bug in the refs API has been corrected. * jk/refs-double-abort: refs/files-backend: don't look at an aborted transaction refs/files-backend: handle packed transaction prepare failure
2019-04-10Merge branch 'nd/rewritten-ref-is-per-worktree'Libravatar Junio C Hamano1-22/+28
"git rebase" uses the refs/rewritten/ hierarchy to store its intermediate states, which inherently makes the hierarchy per worktree, but it didn't quite work well. * nd/rewritten-ref-is-per-worktree: Make sure refs/rewritten/ is per-worktree files-backend.c: reduce duplication in add_per_worktree_entries_to_dir() files-backend.c: factor out per-worktree code in loose_fill_ref_dir()
2019-03-22refs/files-backend: don't look at an aborted transactionLibravatar Jeff King1-1/+5
When deleting refs, we hold packed-refs.lock and prepare a packed transaction to drop the refs from the packed-refs file. If it turns out that we don't need to rewrite the packed refs (e.g., because none of the deletions were present in the file), then we abort the transaction. If that abort succeeds, then the transaction struct will have been freed, and we set our local pointer to NULL so we don't look at it again. However, if it fails, then the struct will _still_ have been freed (because ref_transaction_abort() always frees). But we don't clean up the pointer, and will jump to our cleanup code, which will try to abort it again, causing a use-after-free. It's actually impossible for this to trigger in practice, since packed_transaction_abort() will never return anything but success. But let's fix it anyway, since that's more than we should assume about the packed-refs code (after all, we are already bothering to check for an error result which cannot be triggered). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-22refs/files-backend: handle packed transaction prepare failureLibravatar Jeff King1-0/+10
In files_transaction_prepare(), if we have to delete some refs, we use a subordinate packed_transaction to do so. It's rare for that sub-transaction's prepare step to fail, since we hold the packed-refs lock. But if it does, we trigger a BUG() due to these steps: - we've attached the packed transaction to the files transaction as backend_data->packed_transaction - when the prepare step fails, the packed transaction cleans itself up, putting itself into the CLOSED state - the error value from preparing the packed transaction lets us know in files_transaction_prepare() that we should also clean up and return an error. We call files_transaction_cleanup(), which tries to abort backend_data->packed_transaction. Since it's already CLOSED, that triggers an assertion in ref_transaction_abort(). We can fix that by disconnecting the packed transaction from the outer files transaction, and then free-ing (not aborting!) it ourselves. A few other options/alternatives I considered: - we could just make it a noop to abort a CLOSED transaction. But that seems less safe, since clearly this code expects (and enforces) a particular set of state transitions. - we could have files_transaction_cleanup() selectively call abort() vs free() based on the state of the on the packed transaction. That's basically a more restricted version of the above, but also potentially unsafe. - instead of disconnecting backend_data->packed_transaction on error, we could wait to install it until we successfully prepare. That might make the flow a little simpler, but it introduces a hassle. Earlier parts of files_transaction_prepare() that encounter an error will jump to the cleanup label, and expect that cleaning up the outer transaction will clean up the packed transaction, too. We'd have to adjust those sites to clean up the packed transaction. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-08Make sure refs/rewritten/ is per-worktreeLibravatar Nguyễn Thái Ngọc Duy1-2/+2
a9be29c981 (sequencer: make refs generated by the `label` command worktree-local, 2018-04-25) adds refs/rewritten/ as per-worktree reference space. Unfortunately (my bad) there are a couple places that need update to make sure it's really per-worktree. - add_per_worktree_entries_to_dir() is updated to make sure ref listing look at per-worktree refs/rewritten/ instead of per-repo one [1] - common_list[] is updated so that git_path() returns the correct location. This includes "rev-parse --git-path". This mess is created by me. I started trying to fix it with the introduction of refs/worktree, where all refs will be per-worktree without special treatments. Unfortunate refs/rewritten came before refs/worktree so this is all we can do. This also fixes logs/refs/worktree not being per-worktree. [1] note that ref listing still works sometimes. For example, if you have .git/worktrees/foo/refs/rewritten/bar AND the directory .git/worktrees/refs/rewritten, refs/rewritten/bar will show up. add_per_worktree_entries_to_dir() is only needed when the directory .git/worktrees/refs/rewritten is missing. Reported-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-08files-backend.c: reduce duplication in add_per_worktree_entries_to_dir()Libravatar Nguyễn Thái Ngọc Duy1-11/+11
This function is duplicated to handle refs/bisect/ and refs/worktree/ and a third prefix is coming. Time to clean up. This also fixes incorrect "refs/worktrees/" length in this code. The correct length is 14 not 11. The test in the next patch will also cover this. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-08files-backend.c: factor out per-worktree code in loose_fill_ref_dir()Libravatar Nguyễn Thái Ngọc Duy1-22/+28
This is the first step for further cleaning up and extending this function. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-14files-backend: drop refs parameter from split_symref_update()Libravatar Jeff King1-3/+2
This parameter was added in fcc42ea0c9 (split_symref_update(): add a files_ref_store argument, 2016-09-04) without comment, but never used. The splitting is purely mechanical, and doesn't depend on the particular ref-store. Let's drop this parameter in the name of simplicity. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-26files-backend.c: fix build error on SolarisLibravatar Nguyễn Thái Ngọc Duy1-1/+2
This function files_reflog_path returns void, which usually means "return;" not returning "void value" from another function. Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-22refs: new ref types to make per-worktree refs visible to all worktreesLibravatar Nguyễn Thái Ngọc Duy1-0/+28
One of the problems with multiple worktree is accessing per-worktree refs of one worktree from another worktree. This was sort of solved by multiple ref store, where the code can open the ref store of another worktree and has access to the ref space of that worktree. The problem with this is reporting. "HEAD" in another ref space is also called "HEAD" like in the current ref space. In order to differentiate them, all the code must somehow carry the ref store around and print something like "HEAD from this ref store". But that is not feasible (or possible with a _lot_ of work). With the current design, we pass a reference around as a string (so called "refname"). Extending this design to pass a string _and_ a ref store is a nightmare, especially when handling extended SHA-1 syntax. So we do it another way. Instead of entering a separate ref space, we make refs from other worktrees available in the current ref space. So "HEAD" is always HEAD of the current worktree, but then we can have "worktrees/blah/HEAD" to denote HEAD from a worktree named "blah". This syntax coincidentally matches the underlying directory structure which makes implementation a bit easier. The main worktree has to be treated specially because well... it's special from the beginning. So HEAD from the main worktree is acccessible via the name "main-worktree/HEAD" instead of "worktrees/main/HEAD" because "main" could be just another secondary worktree. This patch also makes it possible to specify refs from one worktree in another one, e.g. git log worktrees/foo/HEAD Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>