summaryrefslogtreecommitdiff
path: root/fsmonitor.c
AgeCommit message (Collapse)AuthorFilesLines
2021-01-23fsmonitor: de-duplicate BUG()s around dirty bitsLibravatar Derrick Stolee1-14/+13
The index has an fsmonitor_dirty bitmap that records which index entries are "dirty" based on the response from the FSMonitor. If this bitmap ever grows larger than the index, then there was an error in how it was constructed, and it was probably a developer's bug. There are several BUG() statements that are very similar, so replace these uses with a simpler assert_index_minimum(). Since there is one caller that uses a custom 'pos' value instead of the bit_size member, we cannot simplify it too much. However, the error string is identical in each, so this simplifies things. Be sure to add one when checking if a position if valid, since the minimum is a bound on the expected size. The end result is that the code is simpler to read while also preserving these assertions for developers in the FSMonitor space. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-10Merge branch 'jk/strvec'Libravatar Junio C Hamano1-3/+3
The argv_array API is useful for not just managing argv but any "vector" (NULL-terminated array) of strings, and has seen adoption to a certain degree. It has been renamed to "strvec" to reduce the barrier to adoption. * jk/strvec: strvec: rename struct fields strvec: drop argv_array compatibility layer strvec: update documention to avoid argv_array strvec: fix indentation in renamed calls strvec: convert remaining callers away from argv_array name strvec: convert more callers away from argv_array name strvec: convert builtin/ callers away from argv_array name quote: rename sq_dequote_to_argv_array to mention strvec strvec: rename files from argv-array to strvec argv-array: rename to strvec argv-array: use size_t for count and alloc
2020-07-28strvec: convert more callers away from argv_array nameLibravatar Jeff King1-3/+3
We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts remaining files from the first half of the alphabet, to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '*.c' '*.h' | xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add '[abcdefghjkl]*'". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-28Remove doubled words in various commentsLibravatar Elijah Newren1-1/+1
Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-13fsmonitor: handle version 2 of the hooks that will use opaque tokenLibravatar Kevin Willford1-11/+64
Some file monitors like watchman will use something other than a timestamp to keep better track of what changes happen in between calls to query the fsmonitor. The clockid in watchman is a string. Now that the index is storing an opaque token for the last update the code needs to be updated to pass that opaque token to a verion 2 of the fsmonitor hook. Because there are repos that already have version 1 of the hook and we want them to continue to work when git is updated, we need to handle both version 1 and version 2 of the hook. In order to do that a config value is being added core.fsmonitorHookVersion to force what version of the hook should be used. When this is not set it will default to -1 and then the code will attempt to call version 2 of the hook first. If that fails it will fallback to trying version 1. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-13fsmonitor: change last update timestamp on the index_state to opaque tokenLibravatar Kevin Willford1-17/+32
Some file system monitors might not use or take a timestamp for processing and in the case of watchman could have race conditions with using a timestamp. Watchman uses something called a clockid that is used for race free queries to it. The clockid for watchman is simply a string. Change the fsmonitor_last_update from being a uint64_t to a char pointer so that any arbitrary data can be stored in it and passed back to the fsmonitor. Signed-off-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-21unpack-trees: skip stat on fsmonitor-valid filesLibravatar Utsav Shah1-6/+17
The index might be aware that a file hasn't modified via fsmonitor, but unpack-trees did not pay attention to it and checked via ie_match_stat which can be inefficient on certain filesystems. This significantly slows down commands that run oneway_merge, like checkout and reset --hard. This patch makes oneway_merge check whether a file is considered unchanged through fsmonitor and skips ie_match_stat on it. unpack-trees also now correctly copies over fsmonitor validity state from the source index. Finally, for correctness, we force a refresh of fsmonitor state in tweak_fsmonitor. After this change, commands like stash (that use reset --hard internally) go from 8s or more to ~2s on a 250k file repository on a mac. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Kevin Willford <Kevin.Willford@microsoft.com> Signed-off-by: Utsav Shah <utsav@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-13fsmonitor: do not compare bitmap size with size of split indexLibravatar Junio C Hamano1-2/+4
3444ec2e ("fsmonitor: don't fill bitmap with entries to be removed", 2019-10-11) added a handful of sanity checks that make sure that a bit position in fsmonitor bitmap does not go beyond the end of the index. As each bit in the bitmap corresponds to a path in the index, this is the right check most of the time. Except for the case when we are in the split-index mode and looking at a delta index that is to be overlayed on the base index but before the base index has actually been merged in, namely in read_ and write_fsmonitor_extension(). In these codepaths, the entries in the split/delta index is typically a small subset of the entire set of paths (otherwise why would we be using split-index?), so the bitmap used by the fsmonitor is almost always larger than the number of entries in the partial index, and the incorrect comparison would trigger the BUG(). Reported-by: Utsav Shah <ukshah2@illinois.edu> Helped-by: Kevin Willford <Kevin.Willford@microsoft.com> Helped-by: William Baker <William.Baker@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-10-12fsmonitor: don't fill bitmap with entries to be removedLibravatar William Baker1-5/+24
While doing some testing with fsmonitor enabled I found that git commands would segfault after staging and unstaging an untracked file. Looking at the crash it appeared that fsmonitor_ewah_callback was attempting to adjust bits beyond the bounds of the index cache. Digging into how this could happen it became clear that the fsmonitor extension must have been written with more bits than there were entries in the index. The root cause ended up being that fill_fsmonitor_bitmap was populating fsmonitor_dirty with bits for all entries in the index, even those that had been marked for removal. To solve this problem fill_fsmonitor_bitmap has been updated to skip entries with the the CE_REMOVE flag set. With this change the bits written for the fsmonitor extension will be consistent with the index entries written by do_write_index. Additionally, BUG checks have been added to detect if the number of bits in fsmonitor_dirty should ever exceed the number of entries in the index again. Another option that was considered was moving the call to fill_fsmonitor_bitmap closer to where the index is written (and where the fsmonitor extension itself is written). However, that did not work as the fsmonitor_dirty bitmap must be filled before the index is split during writing. Signed-off-by: William Baker <William.Baker@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-07-09Merge branch 'cb/fsmonitor-intfix'Libravatar Junio C Hamano1-4/+4
Variable type fix. * cb/fsmonitor-intfix: fsmonitor: avoid signed integer overflow / infinite loop
2019-06-17fsmonitor: avoid signed integer overflow / infinite loopLibravatar Carlo Marcelo Arenas Belón1-4/+4
883e248b8a ("fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.", 2017-09-22) uses an int in a loop that would wrap if index_state->cache_nr (unsigned) is bigger than INT_MAX Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-08fsmonitor: force a refresh after the index was discardedLibravatar Johannes Schindelin1-3/+2
With this change, the `index_state` struct becomes the new home for the flag that says whether the fsmonitor hook has been run, i.e. it is now per-index. It also gets re-set when the index is discarded, fixing the bug demonstrated by the "test_expect_failure" test added in the preceding commit. In that case fsmonitor-enabled Git would miss updates under certain circumstances, see that preceding commit for details. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'jk/snprintf-truncation'Libravatar Junio C Hamano1-10/+4
Avoid unchecked snprintf() to make future code auditing easier. * jk/snprintf-truncation: fmt_with_err: add a comment that truncation is OK shorten_unambiguous_ref: use xsnprintf fsmonitor: use internal argv_array of struct child_process log_write_email_headers: use strbufs http: use strbufs instead of fixed buffers
2018-05-21fsmonitor: use internal argv_array of struct child_processLibravatar René Scharfe1-10/+4
Avoid magic array sizes and indexes by constructing the fsmonitor command line using the embedded argv_array of the child_process. The resulting code is shorter and easier to extend. Getting rid of the snprintf() calls is a bonus -- even though the buffers were big enough here to avoid truncation -- as it makes auditing the remaining callers easier. Inspired-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-25Merge branch 'bp/fsmonitor-prime-index'Libravatar Junio C Hamano1-0/+3
The index file is updated to record the fsmonitor section after a full scan was made, to avoid wasting the effort that has already spent. * bp/fsmonitor-prime-index: fsmonitor: force index write after full scan
2018-04-25Merge branch 'bp/fsmonitor-bufsize-fix'Libravatar Junio C Hamano1-1/+1
Fix an unexploitable (because the oversized contents are not under attacker's control) buffer overflow. * bp/fsmonitor-bufsize-fix: fsmonitor: fix incorrect buffer size when printing version number
2018-04-11fsmonitor: force index write after full scanLibravatar Ben Peart1-0/+3
fsmonitor currently only flags the index as dirty if the extension is being added or removed. This is a performance optimization that recognizes you can stat() a lot of files in less time than it takes to write out an updated index. This patch makes a small enhancement and flags the index dirty if we end up having to stat() all files and scan the entire working directory. The assumption being that must be expensive or you would not have turned on the feature. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11fsmonitor: fix incorrect buffer size when printing version numberLibravatar Ben Peart1-1/+1
This is a trivial bug fix for passing the incorrect size to snprintf() when outputting the version. It should be passing the size of the destination buffer rather than the size of the value being printed. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-07dir.c: ignore paths containing .git when invalidating untracked cacheLibravatar Nguyễn Thái Ngọc Duy1-1/+1
read_directory() code ignores all paths named ".git" even if it's not a valid git repository. See treat_path() for details. Since ".git" is basically invisible to read_directory(), when we are asked to invalidate a path that contains ".git", we can safely ignore it because the slow path would not consider it anyway. This helps when fsmonitor is used and we have a real ".git" repo at worktree top. Occasionally .git/index will be updated and if the fsmonitor hook does not filter it, untracked cache is asked to invalidate the path ".git/index". Without this patch, we invalidate the root directory unncessarily, which: - makes read_directory() fall back to slow path for root directory (slower) - makes the index dirty (because UNTR extension is updated). Depending on the index size, writing it down could also be slow. A note about the new "safe_path" knob. Since this new check could be relatively expensive, avoid it when we know it's not needed. If the path comes from the index, it can't contain ".git". If it does contain, we may be screwed up at many more levels, not just this one. Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-10fsmonitor: store fsmonitor bitmap before splitting indexLibravatar Alex Vandiver1-8/+12
ba1b9cac ("fsmonitor: delay updating state until after split index is merged", 2017-10-27) resolved the problem of the fsmonitor data being applied to the non-base index when reading; however, a similar problem exists when writing the index. Specifically, writing of the fsmonitor extension happens only after the work to split the index has been applied -- as such, the information in the index is only for the non-"base" index, and thus the extension information contains only partial data. When saving, compute the ewah bitmap before the index is split, and store it in the fsmonitor_dirty field, mirroring the behavior that occurred during reading. fsmonitor_dirty is kept from being leaked by being freed when the extension data is written -- which always happens precisely once, no matter the split index configuration. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-01fsmonitor: delay updating state until after split index is mergedLibravatar Alex Vandiver1-16/+24
If the fsmonitor extension is used in conjunction with the split index extension, the set of entries in the index when it is first loaded is only a subset of the real index. This leads to only the non-"base" index being marked as CE_FSMONITOR_VALID. Delay the expansion of the ewah bitmap until after tweak_split_index has been called to merge in the base index as well. The new fsmonitor_dirty is kept from being leaked by dint of being cleaned up in post_read_index_from, which is guaranteed to be called after do_read_index in read_index_from. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-30fsmonitor: set the PWD to the top of the working treeLibravatar Alex Vandiver1-0/+1
The fsmonitor command inherits the PWD of its caller, which may be anywhere in the working copy. This makes is difficult for the fsmonitor command to operate on the whole repository. Specifically, for the watchman integration, this causes each subdirectory to get its own watch entry. Set the CWD to the top of the working directory, for consistency. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01fsmonitor: teach git to optionally utilize a file system monitor to speed up ↵Libravatar Ben Peart1-0/+253
detecting new or changed files. When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>