summaryrefslogtreecommitdiff
path: root/dir.c
AgeCommit message (Collapse)AuthorFilesLines
2019-06-13cleanup: fix possible overflow errors in binary search, part 2Libravatar René Scharfe1-1/+1
Calculating the sum of two array indexes to find the midpoint between them can overflow, i.e. code like this is unsafe for big arrays: mid = (first + last) >> 1; Make sure the intermediate value stays within the boundaries instead, like this: mid = first + ((last - first) >> 1); The loop condition of the binary search makes sure that 'last' is always greater than 'first', so this is safe as long as 'first' is not negative. And that can be verified easily using the pre-context of each change, except for name-hash.c, so add an assertion to that effect there. The unsafe calculations were found with: git grep '(.*+.*) *>> *1' This is a continuation of 19716b21a4 (cleanup: fix possible overflow errors in binary search, 2017-10-08). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-09Merge branch 'jk/untracked-cache-more-fixes'Libravatar Junio C Hamano1-23/+18
Code clean-up. * jk/untracked-cache-more-fixes: untracked-cache: simplify parsing by dropping "len" untracked-cache: simplify parsing by dropping "next" untracked-cache: be defensive about missing NULs in index
2019-05-09Merge branch 'nd/sha1-name-c-wo-the-repository'Libravatar Junio C Hamano1-0/+8
Further code clean-up to allow the lowest level of name-to-object mapping layer to work with a passed-in repository other than the default one. * nd/sha1-name-c-wo-the-repository: (34 commits) sha1-name.c: remove the_repo from get_oid_mb() sha1-name.c: remove the_repo from other get_oid_* sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name submodule-config.c: use repo_get_oid for reading .gitmodules sha1-name.c: add repo_get_oid() sha1-name.c: remove the_repo from get_oid_with_context_1() sha1-name.c: remove the_repo from resolve_relative_path() sha1-name.c: remove the_repo from diagnose_invalid_index_path() sha1-name.c: remove the_repo from handle_one_ref() sha1-name.c: remove the_repo from get_oid_1() sha1-name.c: remove the_repo from get_oid_basic() sha1-name.c: remove the_repo from get_describe_name() sha1-name.c: remove the_repo from get_oid_oneline() sha1-name.c: add repo_interpret_branch_name() sha1-name.c: remove the_repo from interpret_branch_mark() sha1-name.c: remove the_repo from interpret_nth_prior_checkout() sha1-name.c: remove the_repo from get_short_oid() sha1-name.c: add repo_for_each_abbrev() sha1-name.c: store and use repo in struct disambiguate_state sha1-name.c: add repo_find_unique_abbrev_r() ...
2019-05-09Merge branch 'km/empty-repo-is-still-a-repo'Libravatar Junio C Hamano1-2/+4
Running "git add" on a repository created inside the current repository is an explicit indication that the user wants to add it as a submodule, but when the HEAD of the inner repository is on an unborn branch, it cannot be added as a submodule. Worse, the files in its working tree can be added as if they are a part of the outer repository, which is not what the user wants. These problems are being addressed. * km/empty-repo-is-still-a-repo: add: error appropriately on repository with no commits dir: do not traverse repositories with no commits submodule: refuse to add repository with no commits
2019-04-25Merge branch 'js/untracked-cache-allocfix'Libravatar Junio C Hamano1-1/+1
An underallocation in the code to read the untracked cache extension has been corrected. * js/untracked-cache-allocfix: untracked cache: fix off-by-one
2019-04-25Merge branch 'bc/hash-transition-16'Libravatar Junio C Hamano1-14/+14
Conversion from unsigned char[20] to struct object_id continues. * bc/hash-transition-16: (35 commits) gitweb: make hash size independent Git.pm: make hash size independent read-cache: read data in a hash-independent way dir: make untracked cache extension hash size independent builtin/difftool: use parse_oid_hex refspec: make hash size independent archive: convert struct archiver_args to object_id builtin/get-tar-commit-id: make hash size independent get-tar-commit-id: parse comment record hash: add a function to lookup hash algorithm by length remote-curl: make hash size independent http: replace sha1_to_hex http: compute hash of downloaded objects using the_hash_algo http: replace hard-coded constant with the_hash_algo http-walker: replace sha1_to_hex http-push: remove remaining uses of sha1_to_hex http-backend: allow 64-character hex names http-push: convert to use the_hash_algo builtin/pull: make hash-size independent builtin/am: make hash size independent ...
2019-04-19untracked-cache: simplify parsing by dropping "len"Libravatar Jeff King1-8/+5
The code which parses untracked-cache extensions from disk keeps a "len" variable, which is the size of the string we are parsing. But since we now have an "end of string" variable, we can just use that to get the length when we need it. This eliminates the need to keep "len" up to date (and removes the possibility of any errors where "len" and "eos" get out of sync). As a bonus, it means we are not storing a string length in an "int", which is a potential source of overflows (though in this case it seems fairly unlikely for that to cause any memory problems). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-19untracked-cache: simplify parsing by dropping "next"Libravatar Jeff King1-13/+7
When we parse an on-disk untracked cache, we have two pointers, "data" and "next". As we parse, we point "next" to the end of an element, and then later update "data" to match. But we actually don't need two pointers. Each parsing step can just update "data" directly from other variables we hold (and we don't have to worry about bailing in an intermediate state, since any parsing failure causes us to immediately discard "data" and return). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-19untracked-cache: be defensive about missing NULs in indexLibravatar Jeff King1-7/+11
The on-disk format for the untracked-cache extension contains NUL-terminated filenames. We parse these from the mmap'd file using string functions like strlen(). This works fine in the normal case, but if we see a malformed or corrupted index, we might read off the end of our mmap. Instead, let's use memchr() to find the trailing NUL within the bytes we know are available, and return an error if it's missing. Note that we can further simplify by folding another range check into our conditional. After we find the end of the string, we set "next" to the byte after the string and treat it as an error if there are no such bytes left. That saves us from having to do a range check at the beginning of each subsequent string (and works because there is always data after each string). We can do both range checks together by checking "!eos" (we didn't find a NUL) and "eos == end" (it was on the last available byte, meaning there's nothing after). This replaces the existing "next > end" checks. Note also that the decode_varint() calls have a similar problem (we don't even pass them "end"; they just keep parsing). These are probably OK in practice since varints have a finite length (we stop parsing when we'd overflow a uintmax_t), so the worst case is that we'd overflow into reading the trailing bytes of the index. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-16sha1-name.c: remove the_repo from diagnose_invalid_index_path()Libravatar Nguyễn Thái Ngọc Duy1-0/+8
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-12untracked cache: fix off-by-oneLibravatar Johannes Schindelin1-1/+1
In f9e6c649589e (untracked cache: load from UNTR index extension, 2015-03-08), code was added to read back the untracked cache from an index extension. Probably in the endeavor to avoid the `calloc()` implied by `FLEX_ALLOC_STR()` (it is hard to know why exactly, the commit message of that commit is a bit parsimonious with information), it calls `malloc()` manually and then `memcpy()`s the bits and pieces into place. It allocates the size of `struct untracked_cache_dir` plus the string length of the untracked file name, then copies the information in two steps: first the fixed-size metadata, then the name. And here lies the rub: it includes the trailing NUL byte in the name. If `FLEX_ARRAY` is defined as 0, this results in a buffer overrun. To fix this, let's just add 1, for the trailing NUL byte. Technically, this overallocates on platforms where `FLEX_ARRAY` is 1, but it should not matter much in reality, as `malloc()` usually overallocates anyway, unless the size to allocate aligns exactly with some internal chunk size (see below for more on that). The real strange thing is that neither valgrind nor DrMemory catches this bug. In this developer's tests, a `memcpy()` (but not a `memset()`!) could write up to 4 bytes after the allocated memory range before valgrind would start reporting an issue. However, when running Git built with nedmalloc as allocator, under rare conditions (and inconsistently at that), this bug triggered an `abort()` because nedmalloc rounds up the size to be `malloc()`ed to a multiple of a certain chunk size, then adds a few bytes to be used for storing some internal state. If there is no rounding up to do (because the size is already a multiple of that chunk size), and if the buffer is overrun as in the code patched in this commit, the internal state is corrupted. The scenario that triggered this here bug fix entailed a git.git checkout with an extra copy of the source code in an untracked subdirectory, meaning that there was an untracked subdirectory called "thunderbird-patch-inline" whose name's length is exactly 24 bytes, which, added to the size of above-mentioned `struct untracked_cache_dir` that weighs in with 104 bytes on a 64-bit system, amounts to 128, aligning perfectly with nedmalloc's chunk size. As there is no obvious way to trigger this bug reliably, on all platforms supported by Git, and as the bug is obvious enough, this patch comes without a regression test. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-10dir: do not traverse repositories with no commitsLibravatar Kyle Meyer1-2/+4
When treat_directory() encounters a directory that is not in the index and DIR_NO_GITLINKS is unset, it calls resolve_gitlink_ref() to decide if a directory looks like a repository, in which case the directory won't be traversed. As a result, 'status -uall' and 'ls-files -o' will show only the directory, even when there are untracked files within the directory. For the unusual case where a repository doesn't have a commit checked out, resolve_gitlink_ref() returns -1 because HEAD cannot be resolved, and the directory is treated as a normal directory (i.e. traversal does not stop at the repository boundary). The status and ls-files commands above list untracked files within the repository rather than showing only the top-level directory. And if 'git add' is called on a repository with no commit checked out, any untracked files under the repository are added as blobs in the top-level project, a behavior that is unlikely to be what the caller intended. The above case is a corner case in an already unusual situation of the working tree containing a repository that is not a tracked submodule, but we might as well treat anything that looks like a repository consistently. Loosen the "looks like a repository" criteria in treat_directory() by replacing resolve_gitlink_ref() with is_nonbare_repository_dir(), one of the checks that is performed downstream when resolve_gitlink_ref() is called. As the required update to t3700-add shows, calling 'git add' on a repository with no commit checked out will now raise an error. While this is the desired behavior, note that the output isn't yet appropriate. The next commit will improve this output. Signed-off-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01dir: make untracked cache extension hash size independentLibravatar brian m. carlson1-14/+14
Instead of using a struct with a flex array member to read and write the untracked cache extension, use a shorter, fixed-length struct and add the name and hash data explicitly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-20report_path_error(): drop unused prefix parameterLibravatar Jeff King1-2/+1
This hasn't been used since 17ddc66e70 (convert report_path_error to take struct pathspec, 2013-07-14), as the names in the struct will have already been prefixed when they were parsed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-06Merge branch 'nd/the-index-final'Libravatar Junio C Hamano1-1/+0
The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument
2019-01-24cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switchLibravatar Nguyễn Thái Ngọc Duy1-1/+0
By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-19dir.c: move, rename and export match_attrs()Libravatar Nguyễn Thái Ngọc Duy1-39/+2
The function will be reused for matching attributes in pathspec when walking trees (currently it's used for matching pathspec when walking a list). pathspec.c would be a more neutral place for this. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-02pathspec: handle non-terminated strings with :(attr)Libravatar Jeff King1-0/+7
The pathspec code always takes names to be matched as a name/namelen pair, but match_attrs() never looks at namelen, and just treats "name" like a NUL-terminated string, passing it to git_check_attr(). This usually works anyway. Every caller passes a NUL-terminated string, and in all but one the passed-in length is the same as the length of the string (the exception is dir_path_match(), which may pass a smaller length to drop a trailing slash). So we won't currently ever read random memory, and the one case I found actually happens to work correctly because the attr code can handle the trailing slash itself. But it's still worth addressing, as the function interface implies that the name does not have to be NUL-terminated, making this an accident waiting to happen. Since teaching git_check_attr() to take a ptr/len pair would be a big refactor, we'll just allocate a new string. We can do this only when necessary, which avoids paying the cost for most callers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-17Merge branch 'jk/cocci'Libravatar Junio C Hamano1-3/+3
spatch transformation to replace boolean uses of !hashcmp() to newly introduced oideq() is added, and applied, to regain performance lost due to support of multiple hash algorithms. * jk/cocci: show_dirstat: simplify same-content check read-cache: use oideq() in ce_compare functions convert hashmap comparison functions to oideq() convert "hashcmp() != 0" to "!hasheq()" convert "oidcmp() != 0" to "!oideq()" convert "hashcmp() == 0" to hasheq() convert "oidcmp() == 0" to oideq() introduce hasheq() and oideq() coccinelle: use <...> for function exclusion
2018-09-17Merge branch 'nd/unpack-trees-with-cache-tree'Libravatar Junio C Hamano1-3/+6
The unpack_trees() API used in checking out a branch and merging walks one or more trees along with the index. When the cache-tree in the index tells us that we are walking a tree whose flattened contents is known (i.e. matches a span in the index), as linearly scanning a span in the index is much more efficient than having to open tree objects recursively and listing their entries, the walk can be optimized, which is done in this topic. * nd/unpack-trees-with-cache-tree: Document update for nd/unpack-trees-with-cache-tree cache-tree: verify valid cache-tree in the test suite unpack-trees: add missing cache invalidation unpack-trees: reuse (still valid) cache-tree from src_index unpack-trees: reduce malloc in cache-tree walk unpack-trees: optimize walking same trees with cache-tree unpack-trees: add performance tracing trace.h: support nested performance tracing
2018-08-29convert "oidcmp() != 0" to "!oideq()"Libravatar Jeff King1-3/+3
This is the flip side of the previous two patches: checking for a non-zero oidcmp() can be more strictly expressed as inequality. Like those patches, we write "!= 0" in the coccinelle transformation, which covers by isomorphism the more common: if (oidcmp(E1, E2)) As with the previous two patches, this patch can be achieved almost entirely by running "make coccicheck"; the only differences are manual line-wrap fixes to match the original code. There is one thing to note for anybody replicating this, though: coccinelle 1.0.4 seems to miss the case in builtin/tag.c, even though it's basically the same as all the others. Running with 1.0.7 does catch this, so presumably it's just a coccinelle bug that was fixed in the interim. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'nd/no-the-index'Libravatar Junio C Hamano1-11/+16
The more library-ish parts of the codebase learned to work on the in-core index-state instance that is passed in by their callers, instead of always working on the singleton "the_index" instance. * nd/no-the-index: (24 commits) blame.c: remove implicit dependency on the_index apply.c: remove implicit dependency on the_index apply.c: make init_apply_state() take a struct repository apply.c: pass struct apply_state to more functions resolve-undo.c: use the right index instead of the_index archive-*.c: use the right repository archive.c: avoid access to the_index grep: use the right index instead of the_index attr: remove index from git_attr_set_direction() entry.c: use the right index instead of the_index submodule.c: use the right index instead of the_index pathspec.c: use the right index instead of the_index unpack-trees: avoid the_index in verify_absent() unpack-trees: convert clear_ce_flags* to avoid the_index unpack-trees: don't shadow global var the_index unpack-trees: add a note about path invalidation unpack-trees: remove 'extern' on function declaration ls-files: correct index argument to get_convert_attr_ascii() preload-index.c: use the right index instead of the_index dir.c: remove an implicit dependency on the_index in pathspec code ...
2018-08-18trace.h: support nested performance tracingLibravatar Nguyễn Thái Ngọc Duy1-3/+6
Performance measurements are listed right now as a flat list, which is fine when we measure big blocks. But when we start adding more and more measurements, some of them could be just part of a bigger measurement and a flat list gives a wrong impression that they are executed at the same level instead of nested. Add trace_performance_enter() and trace_performance_leave() to allow indent these nested measurements. For now it does not help much because the only nested thing is (lazy) name hash initialization (e.g. called in diff-index from "git status"). This will help more because I'm going to add some more tracing that's actually nested. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-15Merge branch 'nd/i18n'Libravatar Junio C Hamano1-4/+4
Many more strings are prepared for l10n. * nd/i18n: (23 commits) transport-helper.c: mark more strings for translation transport.c: mark more strings for translation sha1-file.c: mark more strings for translation sequencer.c: mark more strings for translation replace-object.c: mark more strings for translation refspec.c: mark more strings for translation refs.c: mark more strings for translation pkt-line.c: mark more strings for translation object.c: mark more strings for translation exec-cmd.c: mark more strings for translation environment.c: mark more strings for translation dir.c: mark more strings for translation convert.c: mark more strings for translation connect.c: mark more strings for translation config.c: mark more strings for translation commit-graph.c: mark more strings for translation builtin/replace.c: mark more strings for translation builtin/pack-objects.c: mark more strings for translation builtin/grep.c: mark strings for translation builtin/config.c: mark more strings for translation ...
2018-08-13dir.c: remove an implicit dependency on the_index in pathspec codeLibravatar Nguyễn Thái Ngọc Duy1-11/+16
Make the match_patchspec API and friends take an index_state instead of assuming the_index in dir.c. All external call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13attr: remove an implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Make the attr API take an index_state instead of assuming the_index in attr code. All call sites are converted blindly to keep the patch simple and retain current behavior. Individual call sites may receive further updates to use the right index instead of the_index. There is one ugly temporary workaround added in attr.c that needs some more explanation. Commit c24f3abace (apply: file commited with CRLF should roundtrip diff and apply - 2017-08-19) forces one convert_to_git() call to NOT read the index at all. But what do you know, we read it anyway by falling back to the_index. When "istate" from convert_to_git is now propagated down to read_attr_from_array() we will hit segfault somewhere inside read_blob_data_from_index. The right way of dealing with this is to kill "use_index" variable and only follow "istate" but at this stage we are not ready for that: while most git_attr_set_direction() calls just passes the_index to be assigned to use_index, unpack-trees passes a different one which is used by entry.c code, which has no way to know what index to use if we delete use_index. So this has to be done later. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23dir.c: mark more strings for translationLibravatar Nguyễn Thái Ngọc Duy1-3/+3
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23Update messages in preparation for i18nLibravatar Nguyễn Thái Ngọc Duy1-2/+2
Many messages will be marked for translation in the following commits. This commit updates some of them to be more consistent and reduce diff noise in those commits. Changes are - keep the first letter of die(), error() and warning() in lowercase - no full stop in die(), error() or warning() if it's single sentence messages - indentation - some messages are turned to BUG(), or prefixed with "BUG:" and will not be marked for i18n - some messages are improved to give more information - some messages are broken down by sentence to be i18n friendly (on the same token, combine multiple warning() into one big string) - the trailing \n is converted to printf_ln if possible, or deleted if not redundant - errno_errno() is used instead of explicit strerror() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18Merge branch 'tz/exclude-doc-smallfixes'Libravatar Junio C Hamano1-1/+1
Doc updates. * tz/exclude-doc-smallfixes: dir.c: fix typos in core.excludesfile comment gitignore.txt: clarify default core.excludesfile path
2018-07-18Merge branch 'sb/object-store-grafts'Libravatar Junio C Hamano1-0/+1
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-06-27dir.c: fix typos in core.excludesfile commentLibravatar Todd Zullinger1-1/+1
Make it easier to find references to core.excludesfile and the default $XDG_CONFIG_HOME/git/ignore path. Signed-off-by: Todd Zullinger <tmz@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18Merge branch 'jk/ewah-bounds-check'Libravatar Junio C Hamano1-1/+2
The code to read compressed bitmap was not careful to avoid reading past the end of the file, which has been corrected. * jk/ewah-bounds-check: ewah: adjust callers of ewah_read_mmap() ewah_read_mmap: bounds-check mmap reads
2018-06-18ewah: adjust callers of ewah_read_mmap()Libravatar Jeff King1-1/+2
The return value of ewah_read_mmap() is now an ssize_t, since we could (in theory) process up to 32GB of data. This would never happen in practice, but a corrupt or malicious .bitmap or index file could convince us to do so. Let's make sure that we don't stuff the value into an int, which would cause us to incorrectly move our pointer forward. We'd always move too little, since negative values are used for reporting errors. So the worst case is just that we end up reporting a corrupt file, not an out-of-bounds read. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'bc/object-id'Libravatar Junio C Hamano1-12/+13
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (42 commits) merge-one-file: compute empty blob object ID add--interactive: compute the empty tree value Update shell scripts to compute empty tree object ID sha1_file: only expose empty object constants through git_hash_algo dir: use the_hash_algo for empty blob object ID sequencer: use the_hash_algo for empty tree object ID cache-tree: use is_empty_tree_oid sha1_file: convert cached object code to struct object_id builtin/reset: convert use of EMPTY_TREE_SHA1_BIN builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX wt-status: convert two uses of EMPTY_TREE_SHA1_HEX submodule: convert several uses of EMPTY_TREE_SHA1_HEX sequencer: convert one use of EMPTY_TREE_SHA1_HEX merge: convert empty tree constant to the_hash_algo builtin/merge: switch tree functions to use object_id builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo sha1-file: add functions for hex empty tree and blob OIDs builtin/receive-pack: avoid hard-coded constants for push certs diff: specify abbreviation size in terms of the_hash_algo upload-pack: replace use of several hard-coded constants ...
2018-05-29Sync with Git 2.17.1Libravatar Junio C Hamano1-1/+1
* maint: (25 commits) Git 2.17.1 Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant ...
2018-05-22Sync with Git 2.16.4Libravatar Junio C Hamano1-1/+1
* maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-16object-store: move object access functions to object-store.hLibravatar Stefan Beller1-0/+1
This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08Merge branch 'sb/submodule-move-nested'Libravatar Junio C Hamano1-3/+57
Moving a submodule that itself has submodule in it with "git mv" forgot to make necessary adjustment to the nested sub-submodules; now the codepath learned to recurse into the submodules. * sb/submodule-move-nested: submodule: fixup nested submodules after moving the submodule submodule-config: remove submodule_from_cache submodule-config: add repository argument to submodule_from_{name, path} submodule-config: allow submodule_free to handle arbitrary repositories grep: remove "repo" arg from non-supporting funcs submodule.h: drop declaration of connect_work_tree_and_git_dir
2018-05-02dir: use the_hash_algo for empty blob object IDLibravatar brian m. carlson1-1/+1
To ensure that we are hash algorithm agnostic, use the_hash_algo to look up the object ID for the empty blob instead of using the empty_tree_oid variable. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02dir: convert struct untracked_cache_dir to object_idLibravatar brian m. carlson1-11/+12
Convert the exclude_sha1 member of struct untracked_cache_dir and rename it to exclude_oid. Eliminate several hard-coded integral constants, and update a function name that referred to SHA-1. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-29submodule: fixup nested submodules after moving the submoduleLibravatar Stefan Beller1-3/+57
connect_work_tree_and_git_dir is used to connect a submodule worktree with its git directory and vice versa after events that require a reconnection such as moving around the working tree. As submodules can have nested submodules themselves, we'd also want to fix the nested submodules when asked to. Add an option to recurse into the nested submodules and connect them as well. As submodules are identified by their name (which determines their git directory in relation to their superproject's git directory) internally and by their path in the working tree of the superproject, we need to make sure that the mapping of name <-> path is kept intact. We can do that in the git-mv command by writing out the gitmodules file first and then forcing a reload of the submodule config machinery. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14sha1_file: convert read_sha1_file to struct object_idLibravatar brian m. carlson1-1/+1
Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-08Merge branch 'bp/untracked-cache-noflush'Libravatar Junio C Hamano1-2/+13
Writing out the index file when the only thing that changed in it is the untracked cache information is often wasteful, and this has been optimized out. * bp/untracked-cache-noflush: untracked cache: use git_env_bool() not getenv() for customization dir.c: don't flag the index as dirty for changes to the untracked cache
2018-02-28untracked cache: use git_env_bool() not getenv() for customizationLibravatar Junio C Hamano1-2/+12
GIT_DISABLE_UNTRACKED_CACHE and GIT_TEST_UNTRACKED_CACHE are only sensed for their presense by using getenv(); use git_env_bool() instead so that GIT_DISABLE_UNTRACKED_CACHE=false would work as naïvely expected. Also rename GIT_TEST_UNTRACKED_CACHE to GIT_FORCE_UNTRACKED_CACHE to express what it does more honestly. Forcing its use may be one useful thing to do while testing the feature, but testing does not have to be the only use of the knob. While at it, avoid repeated calls to git_env_bool() by capturing the return value from the first call in a static variable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-27Merge branch 'nd/fix-untracked-cache-invalidation'Libravatar Junio C Hamano1-14/+27
Some bugs around "untracked cache" feature have been fixed. * nd/fix-untracked-cache-invalidation: dir.c: ignore paths containing .git when invalidating untracked cache dir.c: stop ignoring opendir() error in open_cached_dir() dir.c: fix missing dir invalidation in untracked code dir.c: avoid stat() in valid_cached_dir() status: add a failing test showing a core.untrackedCache bug
2018-02-15Merge branch 'nd/trace-index-ops'Libravatar Junio C Hamano1-0/+2
* nd/trace-index-ops: trace: measure where the time is spent in the index-heavy operations
2018-02-15Merge branch 'po/object-id'Libravatar Junio C Hamano1-54/+50
Conversion from uchar[20] to struct object_id continues. * po/object-id: sha1_file: rename hash_sha1_file_literally sha1_file: convert write_loose_object to object_id sha1_file: convert force_object_loose to object_id sha1_file: convert write_sha1_file to object_id notes: convert write_notes_tree to object_id notes: convert combine_notes_* to object_id commit: convert commit_tree* to object_id match-trees: convert splice_tree to object_id cache: clear whole hash buffer with oidclr sha1_file: convert hash_sha1_file to object_id dir: convert struct sha1_stat to use object_id sha1_file: convert pretend_sha1_file to object_id
2018-02-07dir.c: ignore paths containing .git when invalidating untracked cacheLibravatar Nguyễn Thái Ngọc Duy1-4/+6
read_directory() code ignores all paths named ".git" even if it's not a valid git repository. See treat_path() for details. Since ".git" is basically invisible to read_directory(), when we are asked to invalidate a path that contains ".git", we can safely ignore it because the slow path would not consider it anyway. This helps when fsmonitor is used and we have a real ".git" repo at worktree top. Occasionally .git/index will be updated and if the fsmonitor hook does not filter it, untracked cache is asked to invalidate the path ".git/index". Without this patch, we invalidate the root directory unncessarily, which: - makes read_directory() fall back to slow path for root directory (slower) - makes the index dirty (because UNTR extension is updated). Depending on the index size, writing it down could also be slow. A note about the new "safe_path" knob. Since this new check could be relatively expensive, avoid it when we know it's not needed. If the path comes from the index, it can't contain ".git". If it does contain, we may be screwed up at many more levels, not just this one. Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-05dir.c: don't flag the index as dirty for changes to the untracked cacheLibravatar Ben Peart1-1/+2
The untracked cache saves its current state in the UNTR index extension. Currently, _any_ change to that state causes the index to be flagged as dirty and written out to disk. Unfortunately, the cost to write out the index can exceed the savings gained by using the untracked cache. Since it is a cache that can be updated from the current state of the working directory, there is no functional requirement that the index be written out for every change to the untracked cache. Update the untracked cache logic so that it no longer forces the index to be written to disk except in the case where the extension is being turned on or off. When some other git command requires the index to be written to disk, the untracked cache will take advantage of that to save it's updated state as well. This results in a performance win when looked at over common sequences of git commands (ie such as a status followed by add, commit, etc). After this patch, all the logic to track statistics for the untracked cache could be removed as it is only used by debug tracing used to debug the untracked cache. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02trace: measure where the time is spent in the index-heavy operationsLibravatar Nguyễn Thái Ngọc Duy1-0/+2
All the known heavy code blocks are measured (except object database access). This should help identify if an optimization is effective or not. An unoptimized git-status would give something like below: 0.001791141 s: read cache ... 0.004011363 s: preload index 0.000516161 s: refresh index 0.003139257 s: git command: ... 'status' '--porcelain=2' 0.006788129 s: diff-files 0.002090267 s: diff-index 0.001885735 s: initialize name hash 0.032013138 s: read directory 0.051781209 s: git command: './git' 'status' Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>