path: root/cache.h
Age | Commit message | Author | Files | Lines
2019-02-06 | Merge branch 'jk/loose-object-cache-oid' | Junio C Hamano | 1 | -3/+3
Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references
2019-02-06 | Merge branch 'lt/date-human' | Junio C Hamano | 1 | -0/+3
A new date format "--date=human" that morphs its output depending on how far the time is from the current time has been introduced. "--date=auto" can be used to use this new format when the output is going to the pager or to the terminal and otherwise the default format. * lt/date-human: Add `human` date format tests. Add `human` format to test-tool Add 'human' date format documentation Replace the proposed 'auto' mode with 'auto:' Add 'human' date format
2019-02-06 | Merge branch 'jk/unused-parameter-cleanup' | Junio C Hamano | 1 | -1/+1
Code cleanup. * jk/unused-parameter-cleanup: convert: drop path parameter from actual conversion functions convert: drop len parameter from conversion checks config: drop unused parameter from maybe_remove_section() show_date_relative(): drop unused "tz" parameter column: drop unused "opts" parameter in item_length() create_bundle(): drop unused "header" parameter apply: drop unused "def" parameter from find_name_gnu() match-trees: drop unused path parameter from score functions
2019-02-06 | Merge branch 'nd/the-index-final' | Junio C Hamano | 1 | -23/+13
The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument
2019-02-06 | Merge branch 'tt/bisect-in-c' | Junio C Hamano | 1 | -0/+3
More code in "git bisect" has been rewritten in C. * tt/bisect-in-c: bisect--helper: `bisect_start` shell function partially in C bisect--helper: `get_terms` & `bisect_terms` shell function in C bisect--helper: `bisect_next_check` shell function in C bisect--helper: `check_and_set_terms` shell function in C wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() bisect--helper: `bisect_write` shell function in C bisect--helper: `bisect_reset` shell function in C
2019-02-06 | Merge branch 'dt/cat-file-batch-ambiguous' | Junio C Hamano | 1 | -1/+19
"git cat-file --batch" reported a dangling symbolic link by mistake, when it wanted to report that a given name is ambiguous. * dt/cat-file-batch-ambiguous: t1512: test ambiguous cat-file --batch and --batch-output Do not print 'dangling' for cat-file in case of ambiguity
2019-02-05 | Merge branch 'jk/add-ignore-errors-bit-assignment-fix' | Junio C Hamano | 1 | -0/+1
"git add --ignore-errors" did not work as advertised and instead worked as an unintended synonym for "git add --renormalize", which has been fixed. * jk/add-ignore-errors-bit-assignment-fix: add: use separate ADD_CACHE_RENORMALIZE flag
2019-01-29 | Merge branch 'bc/tree-walk-oid' | Junio C Hamano | 1 | -1/+1
The code to walk tree objects has been taught that we may be working with object names that are not computed with SHA-1. * bc/tree-walk-oid: cache: make oidcpy always copy GIT_MAX_RAWSZ bytes tree-walk: store object_id in a separate member match-trees: use hashcpy to splice trees match-trees: compute buffer offset correctly when splicing tree-walk: copy object ID before use
2019-01-29 | Merge branch 'bc/sha-256' | Junio C Hamano | 1 | -18/+33
Add sha-256 hash and plug it through the code to allow building Git with the "NewHash". * bc/sha-256: hash: add an SHA-256 implementation using OpenSSL sha256: add an SHA-256 implementation using libgcrypt Add a base implementation of SHA-256 support commit-graph: convert to using the_hash_algo t/helper: add a test helper to compute hash speed sha1-file: add a constant for hash block size t: make the sha1 test-tool helper generic t: add basic tests for our SHA-1 implementation cache: make hashcmp and hasheq work with larger hashes hex: introduce functions to print arbitrary hashes sha1-file: provide functions to look up hash algorithms sha1-file: rename algorithm to "sha1"
2019-01-29 | Add `human` format to test-tool | Stephen P. Smith | 1 | -0/+2
Add the human format support to the test tool so that GIT_TEST_DATE_NOW can be used to specify the current time. The get_time() helper function was created and checks the GIT_TEST_DATE_NOW environment variable. If GIT_TEST_DATE_NOW is set, then that date is used instead of the date returned by gettimeofday(). All calls to gettimeofday() were replaced by calls to get_time(). Renamed occurrences of TEST_DATE_NOW to GIT_TEST_DATE_NOW since the variable is now used in the git binary and not just in the test-tool. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
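The override described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the name get_time_sketch() is made up here, and parsing the variable as a decimal count of seconds since the epoch is an assumption.

```c
#include <stdlib.h>
#include <sys/time.h>

/*
 * Hypothetical sketch of a test-overridable clock.  If the environment
 * variable GIT_TEST_DATE_NOW is set, treat it as seconds since the epoch
 * (an assumption) and report that instead of the real time.
 */
void get_time_sketch(struct timeval *now)
{
    const char *x = getenv("GIT_TEST_DATE_NOW");

    if (x) {
        now->tv_sec = (time_t)strtoul(x, NULL, 10);
        now->tv_usec = 0;
    } else {
        gettimeofday(now, NULL);
    }
}
```

Routing every caller through one helper like this is what lets a test pin "now" without touching the system clock.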
2019-01-24 | show_date_relative(): drop unused "tz" parameter | Jeff King | 1 | -1/+1
The timestamp we receive is in epoch time, so there's no need for a timezone parameter to interpret it. The matching show_date() uses "tz" to show dates in author local time, but relative dates show only the absolute time difference. The author's location is irrelevant, barring relativistic effects from using Git close to the speed of light. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
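To see why no timezone is needed, here is a simplified sketch of relative formatting that uses only the difference between two epoch timestamps. The function name, the cut-off values, and the fact that it stops at hours are all assumptions made for illustration; the real function handles more units.

```c
#include <stdio.h>

/*
 * Hypothetical, simplified relative-date formatter.  Both inputs are
 * seconds since the epoch, so only their difference matters and no
 * timezone is involved.
 */
void relative_date_sketch(unsigned long when, unsigned long now,
                          char *out, size_t len)
{
    unsigned long diff;

    if (now < when) {
        snprintf(out, len, "in the future");
        return;
    }
    diff = now - when;
    if (diff < 90) {
        snprintf(out, len, "%lu seconds ago", diff);
        return;
    }
    diff = (diff + 30) / 60;    /* round to the nearest minute */
    if (diff < 90) {
        snprintf(out, len, "%lu minutes ago", diff);
        return;
    }
    diff = (diff + 30) / 60;    /* round to the nearest hour */
    snprintf(out, len, "%lu hours ago", diff);
}
```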
2019-01-24 | cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch | Nguyễn Thái Ngọc Duy | 1 | -3/+3
By default, index compat macros are off from now on, because they could hide the_index dependency. Only code in builtin/ can use them. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
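The effect of flipping the switch can be sketched like this. The opt-in macro name USE_THE_INDEX_COMPATIBILITY_MACROS is not stated in the message above and is used here as an assumption; the struct, the global, and the helper are cut-down stand-ins, not the real declarations.

```c
/* Cut-down stand-in for the in-core index (illustrative only). */
struct index_state_sketch {
    unsigned int cache_nr;
};

struct index_state_sketch the_index_sketch;

/*
 * Opt-in: the shorthand exists only if the including file asks for it.
 * Library-style code that does not define the macro gets a compile error
 * if it tries to use the shorthand, which exposes the hidden dependency.
 */
#ifdef USE_THE_INDEX_COMPATIBILITY_MACROS
#define active_nr (the_index_sketch.cache_nr)
#endif

/* Library-style helper: the index is an explicit argument. */
unsigned int entry_count_sketch(const struct index_state_sketch *istate)
{
    return istate->cache_nr;
}
```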
2019-01-18 | Do not print 'dangling' for cat-file in case of ambiguity | David Turner | 1 | -1/+19
The return values -1 and -2 from get_oid could mean two different things, depending on whether they were from an enum returned by get_tree_entry_follow_symlinks, or from a different code path. This caused 'dangling' to be printed from a git cat-file in the case of an ambiguous (-2) result. Unify the results of get_oid* and get_tree_entry_follow_symlinks to be one common type, with unambiguous values. Signed-off-by: David Turner <novalis@novalis.org> Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
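A sketch of what "one common type, with unambiguous values" could look like. The enumerator names and numeric values below are illustrative assumptions, not copied from the patch; the point is only that each outcome gets its own value, so "ambiguous" can no longer be mistaken for "dangling".

```c
/* Hypothetical unified result type; names and values are assumptions. */
enum lookup_result_sketch {
    SK_FOUND = 0,
    SK_MISSING_OBJECT = -1,
    SK_SHORT_NAME_AMBIGUOUS = -2,
    SK_DANGLING_SYMLINK = -4,
    SK_SYMLINK_LOOP = -5,
    SK_NOT_DIR = -6
};

/* Map each distinct result to the word a caller might print. */
const char *describe_result_sketch(enum lookup_result_sketch r)
{
    switch (r) {
    case SK_FOUND:
        return "found";
    case SK_MISSING_OBJECT:
        return "missing";
    case SK_SHORT_NAME_AMBIGUOUS:
        return "ambiguous";
    case SK_DANGLING_SYMLINK:
        return "dangling";
    case SK_SYMLINK_LOOP:
        return "loop";
    case SK_NOT_DIR:
        return "notdir";
    }
    return "unknown";
}
```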
2019-01-18 | Add 'human' date format | Linus Torvalds | 1 | -0/+1
This adds --date=human, which skips the timezone if it matches the current time-zone, and doesn't print the whole date if that matches (i.e., skip printing year for dates that are "this year", but also skip the whole date itself if it's in the last few days and we can just say what weekday it was). For really recent dates (same day), use the relative date stamp, while for old dates (year doesn't match), don't bother with time and timezone. Also add 'auto' date mode, which defaults to human if we're using the pager. So you can do

    git config --add log.date auto

and your "git log" commands will show the human-legible format unless you're scripting things. Note that this time format still shows the timezone for recent enough events (but not so recent that they show up as relative dates). You can combine it with the "-local" suffix to never show timezones for an even more simplified view. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
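The field-hiding rules above can be sketched as a small decision function. Everything here is an assumption made for illustration: the struct, the function name, passing "days ago" as an input, and the threshold of 5 for "the last few days". It is not the code from the commit.

```c
/* Which parts of the timestamp to omit (1 = hide).  Hypothetical type. */
struct human_hide_sketch {
    int year, date, time, tz;
    int relative;   /* 1 = print as a relative date instead */
};

struct human_hide_sketch human_fields_sketch(int then_year, int now_year,
                                             int days_ago,
                                             int then_tz, int local_tz)
{
    struct human_hide_sketch h = { 0, 0, 0, 0, 0 };

    /* Skip the timezone if it matches the current one. */
    h.tz = (then_tz == local_tz);

    /* Old dates: the year differs, so drop time and timezone. */
    if (then_year != now_year) {
        h.time = 1;
        h.tz = 1;
        return h;
    }

    h.year = 1;             /* "this year": no need to print it */
    if (days_ago == 0)
        h.relative = 1;     /* same day: relative date stamp */
    else if (days_ago < 5)  /* assumed meaning of "last few days" */
        h.date = 1;         /* the weekday alone is enough */
    return h;
}
```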
2019-01-17 | add: use separate ADD_CACHE_RENORMALIZE flag | Jeff King | 1 | -0/+1
Commit 9472935d81 (add: introduce "--renormalize", 2017-11-16) taught git-add to pass HASH_RENORMALIZE to add_to_index(), which then passes the flag along to index_path(). However, the flags taken by add_to_index() and the ones taken by index_path() are distinct namespaces. We cannot take HASH_* flags in add_to_index(), because they overlap with the ADD_CACHE_* flags we already take (in this case, HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS). We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using it to set HASH_RENORMALIZE within add_to_index(). In order to make it clear that these two flags come from distinct sets, let's also change the name "newflags" in the function to "hash_flags". Reported-by: Dmitriy Smirnov <dmitriy.smirnov@jetbrains.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
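The clash between the two flag namespaces, and the translation that fixes it, can be sketched as below. The SK_ prefixed names and their numeric values are illustrative assumptions chosen so that two flags from different sets share a bit, as the message describes.

```c
/* Flags understood by the "add to index" layer (values assumed). */
#define SK_ADD_CACHE_IGNORE_ERRORS 4
#define SK_ADD_CACHE_RENORMALIZE   64

/* Flags understood by the hashing layer (values assumed). */
#define SK_HASH_WRITE_OBJECT 1
#define SK_HASH_RENORMALIZE  4   /* same bit as SK_ADD_CACHE_IGNORE_ERRORS */

/*
 * Translate from one namespace to the other explicitly instead of
 * passing the caller's flags straight through.
 */
unsigned int to_hash_flags_sketch(unsigned int add_flags)
{
    unsigned int hash_flags = SK_HASH_WRITE_OBJECT;

    if (add_flags & SK_ADD_CACHE_RENORMALIZE)
        hash_flags |= SK_HASH_RENORMALIZE;
    return hash_flags;
}
```

With this translation, asking to ignore errors no longer switches on renormalization by accident.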
2019-01-15 | cache: make oidcpy always copy GIT_MAX_RAWSZ bytes | brian m. carlson | 1 | -1/+1
There are some situations in which we want to store an object ID into struct object_id without the_hash_algo necessarily being set correctly. One such case is when cloning a repository, where we must read refs from the remote side without having a repository from which to read the preferred algorithm. In these cases, we may have the_hash_algo set to SHA-1, which is the default, but read refs into struct object_id that are SHA-256. When copying these values, we will want to copy them completely, not just the first 20 bytes. Consequently, make sure that oidcpy copies the maximum number of bytes at all times, regardless of the setting of the_hash_algo. Since oidcpy and hashcpy are no longer functionally identical, remove the Coccinelle object_id transformations that convert from one into the other. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
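The difference between the two copy helpers after this change can be sketched as follows. The names, the global standing in for the_hash_algo's raw size, and the sizes 20 and 32 are simplified stand-ins labeled as assumptions.

```c
#include <string.h>

#define SK_SHA1_RAWSZ 20
#define SK_MAX_RAWSZ  32   /* large enough for a SHA-256 value */

struct oid_sketch {
    unsigned char hash[SK_MAX_RAWSZ];
};

/* Stand-in for the currently configured algorithm's raw size. */
size_t sk_active_rawsz = SK_SHA1_RAWSZ;

/* Copies only as many bytes as the active algorithm uses. */
void hashcpy_sketch(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, sk_active_rawsz);
}

/* Always copies the full buffer, whatever the active algorithm is. */
void oidcpy_sketch(struct oid_sketch *dst, const struct oid_sketch *src)
{
    memcpy(dst->hash, src->hash, SK_MAX_RAWSZ);
}
```

With the active size left at 20, only oidcpy_sketch() preserves bytes 20 to 31 of a 32-byte value.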
2019-01-14 | read-cache.c: remove the_* from index_has_changes() | Nguyễn Thái Ngọc Duy | 1 | -3/+3
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 | sha1-name.c: remove implicit dependency on the_index | Nguyễn Thái Ngọc Duy | 1 | -1/+3
This kills the_index dependency in get_oid_with_context(), but get_oid() and friends still assume the_repository (which also means the_index). Unfortunately the widespread use of get_oid() will make it hard to make the conversion now. We probably will add repo_get_oid() at some point and limit the use of get_oid() to builtin/ instead of forcing all get_oid() call sites to carry struct repository. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 | read-cache.c: replace update_index_if_able with repo_& | Nguyễn Thái Ngọc Duy | 1 | -6/+0
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 | read-cache.c: kill read_index() | Nguyễn Thái Ngọc Duy | 1 | -8/+3
read_index() shares the same problem as hold_locked_index(): it assumes $GIT_DIR/index. Move all call sites to repo_read_index() instead. read_index_preload() and read_index_unmerged() are also killed as a consequence. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14 | repository.c: replace hold_locked_index() with repo_hold_locked_index() | Nguyễn Thái Ngọc Duy | 1 | -1/+1
hold_locked_index() assumes the index path at $GIT_DIR/index. This is not good for places that take an arbitrary index_state instead of the_index, which is basically everywhere except builtin/. Replace it with repo_hold_locked_index(). hold_locked_index() remains as a wrapper around repo_hold_locked_index() to reduce changes in builtin/. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 | sha1-file: modernize loose header/stream functions | Jeff King | 1 | -2/+2
As with the open/map/close functions for loose objects that were recently converted, the functions for parsing the loose object stream use the name "sha1" and a bare "unsigned char *". Let's fix that so that unpack_sha1_header() becomes unpack_loose_header(), etc. These conversions are less clear-cut than the file access functions. You could argue that they are parsing Git's canonical object format (i.e., "type size\0contents", over which we compute the hash), which is not strictly tied to loose storage. But in practice these functions are used only for loose objects, and using the term "loose_header" (instead of "object_header") distinguishes it from the object header found in packfiles (which contains the same information in a different format). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 | update comment references to sha1_object_info() | Jeff King | 1 | -1/+1
Commit abef9020e3 (sha1_file: convert sha1_object_info* to object_id, 2018-03-12) renamed the function to oid_object_info(), but missed some comments which mention it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-02 | wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() | Pranit Bauva | 1 | -0/+3
is_empty_file() can help to refactor a lot of code. This will be very helpful in porting "git bisect" to C. Suggested-by: Torsten Bögershausen <tboegi@web.de> Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
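What the new name promises can be sketched with a plain stat() call. This is an assumption-laden illustration: the function name is made up, and returning -1 on unexpected errors is a choice made for the sketch, not a description of what the real helper does in that case.

```c
#include <errno.h>
#include <sys/stat.h>

/*
 * Hypothetical sketch: returns 1 if the file is missing or has size 0,
 * 0 if it exists with content, and -1 on any other stat() failure.
 */
int is_empty_or_missing_file_sketch(const char *filename)
{
    struct stat st;

    if (stat(filename, &st) < 0) {
        if (errno == ENOENT)
            return 1;
        return -1;
    }
    return st.st_size == 0;
}
```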
2018-11-14 | Add a base implementation of SHA-256 support | brian m. carlson | 1 | -3/+9
SHA-1 is weak and we need to transition to a new hash function. For some time, we have referred to this new function as NewHash. Recently, we decided to pick SHA-256 as NewHash. The reasons behind the choice of SHA-256 are outlined in the thread starting at [1] and in the commit history for the hash function transition document. Add a basic implementation of SHA-256 based off libtomcrypt, which is in the public domain. Optimize it and restructure it to meet our coding standards. Pull in the update and final functions from the SHA-1 block implementation, as we know these function correctly with all compilers. This implementation is slower than SHA-1, but more performant implementations will be introduced in future commits. Wire up SHA-256 in the list of hash algorithms, and add a test that the algorithm works correctly. Note that with this patch, it is still not possible to switch to using SHA-256 in Git. Additional patches are needed to prepare the code to handle a larger hash algorithm and further test fixes are needed. [1] https://public-inbox.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/ Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-14 | sha1-file: add a constant for hash block size | brian m. carlson | 1 | -0/+4
There is one place we need the hash algorithm block size: the HMAC code for push certs. Expose this constant in struct git_hash_algo and expose values for SHA-1 and for the largest value of any hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
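To illustrate why HMAC needs the block size rather than the digest size, here is a sketch of the first HMAC step: padding the key to one block and XORing it with the inner pad byte 0x36. The struct, its field names, and the function are hypothetical; the sketch assumes the key is no longer than one block (HMAC hashes longer keys first).

```c
#include <stddef.h>

#define SK_MAX_BLKSZ 64   /* block size of SHA-1 and of SHA-256 */

/* Hypothetical algorithm descriptor carrying both sizes. */
struct hash_algo_blk_sketch {
    size_t rawsz;   /* digest size in bytes */
    size_t blksz;   /* internal block size in bytes */
};

/*
 * Build the HMAC inner key block: the key, zero-padded to blksz bytes,
 * with every byte XORed with 0x36.  Assumes keylen <= algo->blksz and
 * that out has room for algo->blksz bytes.
 */
void hmac_inner_key_sketch(const struct hash_algo_blk_sketch *algo,
                           const unsigned char *key, size_t keylen,
                           unsigned char *out)
{
    size_t i;

    for (i = 0; i < algo->blksz; i++)
        out[i] = (unsigned char)((i < keylen ? key[i] : 0) ^ 0x36);
}
```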
2018-11-14 | cache: make hashcmp and hasheq work with larger hashes | brian m. carlson | 1 | -10/+12
In 183a638b7d ("hashcmp: assert constant hash size", 2018-08-23), we modified hashcmp to assert that the hash size was always 20 to help it optimize and inline calls to memcmp. In a later series, we replaced many calls to hashcmp and oidcmp with calls to hasheq and oideq to improve inlining further. However, we want to support hash algorithms other than SHA-1, namely SHA-256. When doing so, we must handle the case where these values are 32 bytes long as well as 20. Adjust hashcmp to handle two cases: 20-byte matches, and maximum-size matches. Therefore, when we include SHA-256, we'll automatically handle it properly, while at the same time teaching the compiler that there are only two possible options to consider. This will allow the compiler to write the most efficient possible code. Copy similar code into hasheq and perform an identical transformation. At least with GCC 8.2.0, making hasheq defer to hashcmp when there are two branches prevents the compiler from inlining the comparison, while the code in this patch is inlined properly. Add a comment to avoid an accidental performance regression from well-intentioned refactoring. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
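The two-branch shape described above can be sketched as follows. The names and the global size variable are stand-ins (assumptions); what matters is that each branch calls memcmp() with a compile-time constant length, and that the equality helper repeats the branches rather than calling the comparison helper.

```c
#include <string.h>

#define SK_CMP_SHA1_RAWSZ 20
#define SK_CMP_MAX_RAWSZ  32

/* Stand-in for the active algorithm's raw hash size. */
size_t sk_cmp_rawsz = SK_CMP_SHA1_RAWSZ;

int hashcmp_sketch(const unsigned char *a, const unsigned char *b)
{
    /* Two constant-length branches the compiler can inline. */
    if (sk_cmp_rawsz == SK_CMP_MAX_RAWSZ)
        return memcmp(a, b, SK_CMP_MAX_RAWSZ);
    return memcmp(a, b, SK_CMP_SHA1_RAWSZ);
}

int hasheq_sketch(const unsigned char *a, const unsigned char *b)
{
    /*
     * Deliberately duplicated instead of "return !hashcmp_sketch(a, b)",
     * mirroring the inlining concern raised in the message.
     */
    if (sk_cmp_rawsz == SK_CMP_MAX_RAWSZ)
        return !memcmp(a, b, SK_CMP_MAX_RAWSZ);
    return !memcmp(a, b, SK_CMP_SHA1_RAWSZ);
}
```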
2018-11-14 | hex: introduce functions to print arbitrary hashes | brian m. carlson | 1 | -6/+9
Currently, we have functions that turn an arbitrary SHA-1 value or an object ID into hex format, either using a static buffer or with a user-provided buffer. Add variants of these functions that can handle an arbitrary hash algorithm, specified by constant. Update the documentation as well. While we're at it, remove the "extern" declaration from this family of functions, since it's not needed and our style now recommends against it. We use the variant taking the algorithm structure pointer as the internal variant, since taking an algorithm pointer is the easiest way to handle all of the variants in use. Note that we maintain these functions because there are hashes which must change based on the hash algorithm in use but are not object IDs (such as pack checksums). Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
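A sketch of the "internal variant" idea: a hex formatter that takes a pointer to an algorithm description and a user-provided buffer, so the same code serves any hash length. The struct and function names are hypothetical and the descriptor is reduced to the two fields the sketch needs.

```c
#include <stddef.h>

/* Hypothetical, cut-down algorithm descriptor. */
struct hash_algo_hex_sketch {
    const char *name;
    size_t rawsz;   /* number of raw hash bytes */
};

/*
 * Write the hex form of `hash` into `buffer`, which must hold at least
 * 2 * algop->rawsz + 1 bytes.  Returns `buffer` for convenience.
 */
char *hash_to_hex_sketch(char *buffer, const unsigned char *hash,
                         const struct hash_algo_hex_sketch *algop)
{
    static const char hex[] = "0123456789abcdef";
    char *buf = buffer;
    size_t i;

    for (i = 0; i < algop->rawsz; i++) {
        unsigned int val = hash[i];
        *buf++ = hex[val >> 4];
        *buf++ = hex[val & 0xf];
    }
    *buf = '\0';
    return buffer;
}
```

Wrappers for "the current algorithm" or for an object ID can then be thin calls into this one function.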
2018-11-14 | checkout: print something when checking out paths | Nguyễn Thái Ngọc Duy | 1 | -2/+2
One of the problems with "git checkout" is that it does so many different things and could confuse people, especially when we fail to handle ambiguity correctly. One way to help with that is to tell the user what sort of operation is actually carried out. When switching branches, we always print something unless --quiet, either

 - "HEAD is now at ..."
 - "Reset branch ..."
 - "Already on ..."
 - "Switched to and reset ..."
 - "Switched to a new branch ..."
 - "Switched to branch ..."

Checking out paths, however, is silent. Print something so that if we got the user intention wrong, they won't waste too much time to find that out. For the remaining cases of checkout we now print either

 - "Checked out ... paths out of the index"
 - "Checked out ... paths out of <abbrev hash>"

Since the purpose of printing this is to help disambiguate, only do it when "--" is missing. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-13 | Merge branch 'bp/refresh-index-using-preload' | Junio C Hamano | 1 | -0/+3
The helper function to refresh the cached stat information in the in-core index has learned to perform the lstat() part of the operation in parallel on multi-core platforms. * bp/refresh-index-using-preload: refresh_index: remove unnecessary calls to preload_index() speed up refresh_index() by utilizing preload_index()
2018-11-13 | Merge branch 'ao/submodule-wo-gitmodules-checked-out' | Junio C Hamano | 1 | -0/+2
The submodule support has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree. * ao/submodule-wo-gitmodules-checked-out: t/helper: add test-submodule-nested-repo-config submodule: support reading .gitmodules when it's not in the working tree submodule: add a helper to check if it is safe to write to .gitmodules t7506: clean up .gitmodules properly before setting up new scenario submodule: use the 'submodule--helper config' command submodule--helper: add a new 'config' subcommand t7411: be nicer to future tests and really clean things up t7411: merge tests 5 and 6 submodule: factor out a config_set_in_gitmodules_file_gently function submodule: add a print_config_from_gitmodules() helper
2018-11-13 | Merge branch 'js/mingw-perl5lib' | Junio C Hamano | 1 | -8/+0
Windows fix. * js/mingw-perl5lib: mingw: unset PERL5LIB by default config: move Windows-specific config settings into compat/mingw.c config: allow for platform-specific core.* config settings config: rename `dummy` parameter to `cb` in git_default_config()
2018-11-13 | Merge branch 'nd/per-worktree-config' | Junio C Hamano | 1 | -0/+2
A fourth class of configuration files (in addition to the traditional "system wide", "per user in the $HOME directory" and "per repository in the $GIT_DIR/config") has been introduced so that different worktrees that share the same repository (hence the same $GIT_DIR/config file) can use different customization. * nd/per-worktree-config: worktree: add per-worktree config files t1300: extract and use test_cmp_config()
2018-11-02 | Merge branch 'ag/rebase-i-in-c' | Junio C Hamano | 1 | -0/+1
Rewrite of the remaining "rebase -i" machinery in C. * ag/rebase-i-in-c: rebase -i: move rebase--helper modes to rebase--interactive rebase -i: remove git-rebase--interactive.sh rebase--interactive2: rewrite the submodes of interactive rebase in C rebase -i: implement the main part of interactive rebase as a builtin rebase -i: rewrite init_basic_state() in C rebase -i: rewrite write_basic_state() in C rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in C rebase -i: implement the logic to initialize $revisions in C rebase -i: remove unused modes and functions rebase -i: rewrite complete_action() in C t3404: todo list with commented-out commands only aborts sequencer: change the way skip_unnecessary_picks() returns its result sequencer: refactor append_todo_help() to write its message to a buffer rebase -i: rewrite checkout_onto() in C rebase -i: rewrite setup_reflog_action() in C sequencer: add a new function to silence a command, except if it fails rebase -i: rewrite the edit-todo functionality in C editor: add a function to launch the sequence editor rebase -i: rewrite append_todo_help() in C sequencer: make three functions and an enum from sequencer.c public
2018-10-31 | config: move Windows-specific config settings into compat/mingw.c | Johannes Schindelin | 1 | -8/+0
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-30 | speed up refresh_index() by utilizing preload_index() | Ben Peart | 1 | -0/+3
Speed up refresh_index() by utilizing preload_index() to do most of the work spread across multiple threads. This works because most cache entries will get marked CE_UPTODATE so that refresh_cache_ent() can bail out early when called from within refresh_index(). On a Windows repo with ~200K files, this drops refresh times from 6.64 seconds to 2.87 seconds for a savings of 57%. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
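The early bail-out this relies on can be sketched with a flag check inside the serial loop. The flag value, the struct, and the counting return value are all assumptions for illustration; in the sketch the "expensive check" is just a counter standing in for a per-entry lstat().

```c
#define SK_CE_UPTODATE 1u   /* assumed flag value */

struct cache_entry_sketch {
    unsigned int flags;
};

/*
 * Serial refresh pass: entries already marked up to date (for example by
 * an earlier parallel preload pass) are skipped.  Returns how many
 * entries still needed the expensive check.
 */
int refresh_index_sketch(struct cache_entry_sketch *entries, int nr)
{
    int checked = 0;
    int i;

    for (i = 0; i < nr; i++) {
        if (entries[i].flags & SK_CE_UPTODATE)
            continue;               /* bail out early */
        checked++;                  /* stands in for lstat() and compare */
        entries[i].flags |= SK_CE_UPTODATE;
    }
    return checked;
}
```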
2018-10-26 | Merge branch 'sg/split-index-racefix' | Junio C Hamano | 1 | -0/+2
The codepath to support the experimental split-index mode had remaining "racily clean" issues fixed. * sg/split-index-racefix: split-index: BUG() when cache entry refers to non-existing shared entry split-index: smudge and add racily clean cache entries to split index split-index: don't compare cached data of entries already marked for split index split-index: count the number of deleted entries t1700-split-index: date back files to avoid racy situations split-index: add tests to demonstrate the racy split index problem t1700-split-index: document why FSMONITOR is disabled in this test script
2018-10-22 | worktree: add per-worktree config files | Nguyễn Thái Ngọc Duy | 1 | -0/+2
A new repo extension is added, worktreeConfig. When it is present:

 - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup.
 - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree.

This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree).

"git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (*).

Since a new repo extension is introduced, existing git binaries should refuse to access the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they are used with "git config --file <path>".

This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be to make default writes go to the per-worktree file, so that accidental changes are isolated.

(*) "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in both single and multiple worktree setups.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19 | Merge branch 'nd/status-refresh-progress' | Junio C Hamano | 1 | -2/+5
"git status" learns to show progress bar when refreshing the index takes a long time. * nd/status-refresh-progress: status: show progress bar if refreshing the index takes too long
2018-10-19 | Merge branch 'nd/the-index' | Junio C Hamano | 1 | -6/+8
Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...
2018-10-12 | split-index: smudge and add racily clean cache entries to split index | SZEDER Gábor | 1 | -0/+2
Ever since the split index feature was introduced [1], refreshing a split index is prone to a variant of the classic racy git problem. Consider the following sequence of commands updating the split index when the shared index contains a racily clean cache entry, i.e. an entry whose cached stat data matches with the corresponding file in the worktree and the cached mtime matches that of the index:

    echo "cached content" >file
    git update-index --split-index --add file
    echo "dirty worktree" >file    # size stays the same!
    # ... wait ...
    git update-index --add other-file

Normally, when a non-split index is updated, then do_write_index() (the function responsible for writing all kinds of indexes, "regular", split, and shared) recognizes racily clean cache entries, and writes them with smudged stat data, i.e. with file size set to 0. When subsequent git commands read the index, they will notice that the smudged stat data doesn't match with the file in the worktree, and then go on to check the file's content and notice its dirtiness.

In the above example, however, in the second 'git update-index' prepare_to_write_split_index() decides which cache entries stored only in the shared index should be replaced in the new split index. Alas, this function never looks out for racily clean cache entries, and since the file's stat data in the worktree hasn't changed since the shared index was written, it won't be replaced in the new split index. Consequently, do_write_index() doesn't even get this racily clean cache entry, and can't smudge its stat data. Subsequent git commands will then see that the index has more recent mtime than the file and that the (not smudged) cached stat data still matches with the file in the worktree, and, ultimately, will erroneously consider the file clean.

Modify prepare_to_write_split_index() to recognize racily clean cache entries, and mark them to be added to the split index. Note that there are two places where it should check raciness: first those cache entries that are only stored in the shared index, and then those that have been copied by unpack_trees() from the shared index while it constructed a new index. This way do_write_index() will get these racily clean cache entries as well, and will then write them with smudged stat data to the new split index.

This change makes all tests in 't1701-racy-split-index.sh' pass, so flip the two 'test_expect_failure' tests to success. Also add the '#' (as in nr. of trial) to those tests' description that were omitted when the tests expected failure.

Note that after this change if the index is split when it contains a racily clean cache entry, then a smudged cache entry will be written both to the new shared and to the new split indexes. This doesn't affect regular git commands: as far as they are concerned this is just an entry in the split index replacing an outdated entry in the shared index. It did affect a few tests in 't1700-split-index.sh', though, because they actually check which entries are stored in the split index; a previous patch in this series has already made the necessary adjustments in 't1700'. And racily clean cache entries and index splitting are rare enough to not worry about the resulting duplicated smudged cache entries, and the additional complexity required to prevent them is not worth it.

Several tests failed occasionally when the test suite was run with 'GIT_TEST_SPLIT_INDEX=yes'. Here are those that I managed to trace back to this racy split index problem, starting with those failing more frequently, with a link to a failing Travis CI build job for each. The highlighted line [2] shows when the racy file was written, which is not always in the failing test but in a preceding setup test.

    t3903-stash.sh: https://travis-ci.org/git/git/jobs/385542084#L5858
    t4024-diff-optimize-common.sh: https://travis-ci.org/git/git/jobs/386531969#L3174
    t4015-diff-whitespace.sh: https://travis-ci.org/git/git/jobs/360797600#L8215
    t2200-add-update.sh: https://travis-ci.org/git/git/jobs/382543426#L3051
    t0090-cache-tree.sh: https://travis-ci.org/git/git/jobs/416583010#L3679

There might be others, e.g. perhaps 't1000-read-tree-m-3way.sh' and others using 'lib-read-tree-m-3way.sh', but I couldn't confirm yet.

[1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge branch 'nd/split-index', 2014-07-16).
[2] Note that those highlighted lines are in the 'after failure' fold, and your browser might unhelpfully fold it up before you could take a good look.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
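The "racily clean" test and the smudging step described in this message can be sketched as below. The struct, the function names, and the use of whole-second timestamps are simplifying assumptions; the sketch treats an entry as racy when its cached mtime is not older than the index's own timestamp, and smudges by zeroing the cached size.

```c
/* Cut-down cached stat data for one entry (hypothetical). */
struct stat_data_sketch {
    long mtime;           /* cached modification time, in seconds */
    unsigned long size;   /* cached file size */
};

/*
 * An entry is racily clean if the file could have been modified in the
 * same timestamp tick in which the index was written, so matching stat
 * data proves nothing.
 */
int is_racy_sketch(const struct stat_data_sketch *sd, long index_mtime)
{
    return index_mtime != 0 && index_mtime <= sd->mtime;
}

/*
 * Smudge: store size 0 so the next reader sees a stat mismatch and falls
 * back to comparing file contents.  Returns 1 if the entry was smudged.
 */
int smudge_if_racy_sketch(struct stat_data_sketch *sd, long index_mtime)
{
    if (!is_racy_sketch(sd, index_mtime))
        return 0;
    sd->size = 0;
    return 1;
}
```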
2018-10-09submodule: add a helper to check if it is safe to write to .gitmodulesLibravatar Antonio Ospite1-0/+2
Introduce a helper function named is_writing_gitmodules_ok() to verify that the .gitmodules file is safe to write. The function name follows the scheme of is_staging_gitmodules_ok(). The two symbolic constants GITMODULES_INDEX and GITMODULES_HEAD are used to get help from the C preprocessor in preventing typos, especially for future users. This is in preparation for a future change which teaches git how to read .gitmodules from the index or from the current branch if the file is not available in the working tree. The rationale behind the check is that writing to .gitmodules requires the file to be present in the working tree, unless a brand new .gitmodules is being created (in which case the .gitmodules file would not exist at all: neither in the working tree nor in the index or in the current branch). Expose the functionality also via a "submodule-helper config --check-writeable" command, as git scripts may want to perform the check before modifying submodules configuration. Signed-off-by: Antonio Ospite <ao2@ao2.it> Signed-off-by: Junio C Hamano <gitster@pobox.com>
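The rationale above boils down to a small boolean rule. The following is a minimal sketch of that rule only, not the helper itself: the `gitmodules_state` struct and the `sketch_` function are hypothetical, and the three flags stand in for whatever lookups the real code performs against the working tree, the index (GITMODULES_INDEX) and the current branch (GITMODULES_HEAD).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: where a .gitmodules file can currently be found. */
struct gitmodules_state {
	bool in_worktree; /* file exists in the working tree */
	bool in_index;    /* file is present in the index */
	bool in_head;     /* file is present in the current branch */
};

/*
 * Writing is fine when the file is in the working tree, or when it does
 * not exist anywhere yet and a brand new one is about to be created.
 */
static bool sketch_is_writing_gitmodules_ok(const struct gitmodules_state *s)
{
	assert(s != NULL);
	return s->in_worktree || (!s->in_index && !s->in_head);
}
```

With this rule, a repository that has .gitmodules only in the index or only in the current branch is rejected, because writing a fresh file to the working tree would silently diverge from the tracked one.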
2018-09-21ws.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21sha1-file.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-2/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21merge.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-2/+4
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21read-cache.c: remove 'const' from index_has_changes()Libravatar Nguyễn Thái Ngọc Duy1-1/+1
This function calls do_diff_cache(), which eventually needs to set this "istate" to unpack_options->src_index [1]. It is an unfortunate fact that unpack_trees() _will_ update [2] src_index, so we can't really pass a const index_state there. Just remove 'const'.

[1] Right now diff_cache() in diff-lib.c assigns the_index to src_index. But the plan is to get rid of the_index, so it should be 'istate' from here that gets assigned to src_index.
[2] Some transient bits in the source index are touched. Optional extensions can also be removed. But other than that the source tree should still be valid.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
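Why the 'const' has to go can be shown with a reduced model. This is a sketch under assumptions, not the real call chain: all `sketch_*` names and fields are hypothetical, and the "transient bits" are reduced to a single flag word.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced versions of the structures involved. */
struct sketch_index_state {
	unsigned transient_flags; /* scratch bits touched while unpacking */
	unsigned nr_changed;      /* pretend count of changed entries */
};

struct sketch_unpack_options {
	struct sketch_index_state *src_index; /* non-const: it gets updated */
};

/* Stand-in for the unpacking step, which touches bits in the source index. */
static void sketch_unpack(struct sketch_unpack_options *o)
{
	assert(o != NULL && o->src_index != NULL);
	o->src_index->transient_flags |= 1u;
}

/*
 * If 'istate' were declared const here, the assignment to src_index
 * would discard the qualifier, and the write in sketch_unpack() would
 * modify an object the caller was promised would stay untouched.
 */
static int sketch_index_has_changes(struct sketch_index_state *istate)
{
	struct sketch_unpack_options opts;

	opts.src_index = istate;
	sketch_unpack(&opts);
	return istate->nr_changed > 0;
}
```

Dropping the qualifier from the parameter keeps the signature honest about the side effect instead of hiding it behind a cast.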
2018-09-17Merge branch 'jk/cocci'Libravatar Junio C Hamano1-6/+16
An spatch transformation that replaces boolean uses of !hashcmp() with the newly introduced oideq() has been added and applied, to regain the performance lost to the support of multiple hash algorithms.

* jk/cocci:
  show_dirstat: simplify same-content check
  read-cache: use oideq() in ce_compare functions
  convert hashmap comparison functions to oideq()
  convert "hashcmp() != 0" to "!hasheq()"
  convert "oidcmp() != 0" to "!oideq()"
  convert "hashcmp() == 0" to hasheq()
  convert "oidcmp() == 0" to oideq()
  introduce hasheq() and oideq()
  coccinelle: use <...> for function exclusion
2018-09-17Merge branch 'nd/clone-case-smashing-warning'Libravatar Junio C Hamano1-0/+1
Running "git clone" against a project that contains two files whose pathnames differ only in case, on a case-insensitive filesystem, would result in one of the files being lost, because the underlying filesystem is incapable of holding both at the same time. An attempt is made to detect such a case and warn.

* nd/clone-case-smashing-warning:
  clone: report duplicate entries on case-insensitive filesystems
2018-09-17status: show progress bar if refreshing the index takes too longLibravatar Nguyễn Thái Ngọc Duy1-2/+5
Refreshing the index is usually very fast, but it can still take a long time sometimes. A cold cache is one such case; copying a repo to a new place is another (*). It's good to show something to let the user know "git status" is not hanging, it's just busy doing something.

(*) In this case, all stat info in the index becomes invalid and git falls back to rehashing all file content to see if there's any difference before updating stat info in the index. This is quite expensive. Even with a repo as small as git.git, it takes 3 seconds.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29convert "hashcmp() == 0" to hasheq()Libravatar Jeff King1-4/+4
This is the partner patch to the previous one, but covering the "hash" variants instead of "oid". Note that our coccinelle rule is slightly more complex to avoid triggering the call in hasheq().

I didn't bother to add a new rule to convert:

  - hasheq(E1->hash, E2->hash)
  + oideq(E1, E2)

Since these are new functions, there won't be any such existing callers. And since most of the code is already using oideq, we're not likely to introduce new ones.

We might still see "!hashcmp(E1->hash, E2->hash)" from topics in flight. But because our new rule comes after the existing ones, that should first get converted to "!oidcmp(E1, E2)" and then to "oideq(E1, E2)".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
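The relationship between the comparison and equality helpers mentioned in this series can be sketched as follows. This is an illustrative model only: the `sketch_` names, the fixed 20-byte size and the struct layout are assumptions made for the example, and git's actual definitions may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_RAWSZ 20 /* assumed raw hash size for this example */

/* Hypothetical stand-in for an object ID wrapping a raw hash. */
struct sketch_object_id {
	unsigned char hash[SKETCH_RAWSZ];
};

/* Three-way comparison: negative, zero or positive, like memcmp(). */
static inline int sketch_hashcmp(const unsigned char *a,
				 const unsigned char *b)
{
	return memcmp(a, b, SKETCH_RAWSZ);
}

/* Boolean equality on raw hashes: callers only ask "same or not?". */
static inline int sketch_hasheq(const unsigned char *a,
				const unsigned char *b)
{
	return !sketch_hashcmp(a, b);
}

/* Boolean equality on object IDs, expressed through the raw variant. */
static inline int sketch_oideq(const struct sketch_object_id *a,
			       const struct sketch_object_id *b)
{
	assert(a != NULL && b != NULL);
	return sketch_hasheq(a->hash, b->hash);
}
```

A call site written as `!sketch_hashcmp(x->hash, y->hash)` means the same thing as `sketch_oideq(x, y)`. The equality form states the intent directly and does not ask for an ordering that the caller never uses.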