Age | Commit message (Collapse) | Author | Files | Lines |
|
By testing 'git -c core.fsmonitor= status -uno', we can check for the
simplest index operations that can be made sparse-aware. The necessary
implementation details are already integrated with sparse-checkout, so
modify command_requires_full_index to be zero for cmd_status().
In refresh_index(), we loop through the index entries to refresh their
stat() information. However, sparse directories have no stat()
information to populate. Ignore these entries.
This allows 'git status' to no longer expand a sparse index to a full
one. This is further tested by dropping the "-uno" option and adding an
untracked file into the worktree.
The performance test p2000-sparse-checkout-operations.sh demonstrates
these improvements:
Test HEAD~1 HEAD
-----------------------------------------------------------------------------
2000.2: git status (full-index-v3) 0.31(0.30+0.05) 0.31(0.29+0.06) +0.0%
2000.3: git status (full-index-v4) 0.31(0.29+0.07) 0.34(0.30+0.08) +9.7%
2000.4: git status (sparse-index-v3) 2.35(2.28+0.10) 0.04(0.04+0.05) -98.3%
2000.5: git status (sparse-index-v4) 2.35(2.24+0.15) 0.05(0.04+0.06) -97.9%
Note that since HEAD~1 was expanding the sparse index by parsing trees,
it was artificially slower than the full index case. Thus, the 98%
improvement is misleading, and instead we should celebrate the 0.34s to
0.05s improvement of 85%. This is more indicative of the peformance
gains we are expecting by using a sparse index.
Note: we are dropping the assignment of core.fsmonitor here. This is not
necessary for the test script as we are not altering the config any
other way. Correct integration with FS Monitor will be validated in
later changes.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'git status' began reporting a percentage of populated paths when
sparse-checkout is enabled in 051df3cf (wt-status: show sparse
checkout status as well, 2020-07-18). This percentage is incorrect when
the index has sparse directories. It would also be expensive to
calculate as we would need to parse trees to count the total number of
possible paths.
Avoid the expensive computation by simplifying the output to only report
that a sparse checkout exists, without the percentage.
This change is the reason we use 'git status --porcelain=v2' in
t1092-sparse-checkout-compatibility.sh. We don't want to ensure that
this message is equal across both modes, but instead just the important
information about staged, modified, and untracked files are compared.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While comparing an index to a tree, we may see a sparse directory entry.
In this case, we should compare that portion of the tree to the tree
represented by that entry. This could include a new tree which needs to
be expanded to a full list of added files. It could also include an
existing tree, in which case all of the changes inside are important to
describe, including the modifications, additions, and deletions. Note
that the case where the tree has a path and the index does not remains
identical to before: the lack of a cache entry is the same with a sparse
index.
Use diff_tree_oid() appropriately to compute the diff.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we have sparse directory entries in the index, we want to compare
that directory against sparse-checkout patterns. Those pattern matching
algorithms are built expecting a file path, not a directory path. This
is especially important in the "cone mode" patterns which will match
files that exist within the "parent directories" as well as the
recursive directory matches.
If path_matches_pattern_list() is given a directory, we can add a fake
filename ("-") to the directory and get the same results as before,
assuming we are in cone mode. Since sparse index requires cone mode
patterns, this is an acceptable assumption.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
During unpack_callback(), index entries are compared against tree
entries. These are matched according to names and types. One goal is to
decide if we should recurse into subtrees or simply operate on one index
entry.
In the case of a sparse-directory entry, we do not want to recurse into
that subtree and instead simply compare the trees. In some cases, we
might want to perform a merge operation on the entry, such as during
'git checkout <commit>' which wants to replace a sparse tree entry with
the tree for that path at the target commit. We extend the logic within
unpack_single_entry() to create a sparse-directory entry in this case,
and then that is sent to call_unpack_fn().
There are some subtleties in this process. For instance, we need to
update find_cache_entry() to allow finding a sparse-directory entry that
exactly matches a given path. Use the new helper method
sparse_dir_matches_path() for this. We also need to ignore conflict
markers in the case that the entries correspond to directories and we
already have a sparse directory entry.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the next change, we will use this method to unpack a sparse directory
entry, so change the name to unpack_single_entry() so these entries
apply. The new name reflects that we will not recurse into trees in
order to resolve the conflicts.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As we further integrate the sparse-index into unpack-trees, we need to
ensure that we compare sparse directory entries correctly with other
entries. This affects searching for an exact path as well as sorting
index entries.
Sparse directory entries contain the trailing directory separator. This
is important for the sorting, in particular. Thus, within
do_compare_entry() we stop using S_IFREG in all cases, since sparse
directories should use S_IFDIR to indicate that the comparison should
treat the entry name as a dirctory.
Within compare_entry(), it first calls do_compare_entry() to check the
leading portion of the name. When the input path is a directory name, we
could match exactly already. Thus, we should return 0 if we have an
exact string match on a sparse directory entry. The final check is a
length comparison between the strings.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The cache_bottom member of 'struct unpack_trees_options' is used to
track the range of index entries corresponding to a node of the cache
tree. While recursing with traverse_by_cache_tree(), this value is
preserved on the call stack using a local and then restored as that
method returns.
The mark_ce_used() method normally modifies the cache_bottom member when
it refers to the marked cache entry. However, sparse directory entries
are stored as nodes in the cache-tree data structure as of 2de37c53
(cache-tree: integrate with sparse directory entries, 2021-03-30). Thus,
the cache_bottom will be modified as the cache-tree walk advances. Do
not update it as well within mark_ce_used().
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Before moving to update 'git status' and 'git add' to work with sparse
indexes, add an explicit test that ensures the sparse-index works the
same as a normal sparse-checkout when the worktree contains directories
and files outside of the sparse cone.
Specifically, 'folder1/a' is a file in our test repo, but 'folder1' is
not in the sparse cone. When 'folder1/a' is modified, the file is not
shown as modified and adding it will fail. This is new behavior as of
a20f704 (add: warn when asked to update SKIP_WORKTREE entries,
2021-04-08). Before that change, these adds would be silently ignored.
Untracked files are fine: adding new files both with 'git add .' and
'git add folder1/' works just as in a full checkout. This may not be
entirely desirable, but we are not intending to change behavior at the
moment, only document it. A future change could alter the behavior to
be more sensible, and this test could be modified to satisfy the new
expected behavior.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
As more features integrate with the sparse-index feature, more and more
special cases arise that require different data shapes within the tree
structure of the repository in order to demonstrate those cases.
Add several interesting special cases all at once instead of sprinkling
them across several commits. The interesting cases being added here are:
* Add sparse-directory entries on both sides of directories within the
sparse-checkout definition.
* Add directories outside the sparse-checkout definition who have only
one entry and are the first entry of a directory with multiple
entries.
* Add filenames adjacent to a sparse directory entry that sort before
and after the trailing slash.
Later tests will take advantage of these shapes, but they also deepen
the tests that already exist.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This fixes the test data shape to be as expected, allowing rename
detection to work properly now that the 'larger-content' file actually
has meaningful lines.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When creating a full index from a sparse one, we create cache entries
for every blob within a given sparse directory entry. These are
correctly marked with the CE_SKIP_WORKTREE flag, but the CE_EXTENDED
flag is not included. The CE_EXTENDED flag would exist if we loaded a
full index from disk with these entries marked with CE_SKIP_WORKTREE, so
we can add the flag here to be consistent. This allows us to directly
compare the flags present in cache entries when testing the sparse-index
feature, but has no significance to its correctness in the user-facing
functionality.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The sparse-index format is designed to be compatible with merge
conflicts, even those outside the sparse-checkout definition. The reason
is that when converting a full index to a sparse one, a cache entry with
nonzero stage will not be collapsed into a sparse directory entry.
However, this behavior was not tested, and a different behavior within
convert_to_sparse() fails in this scenario. Specifically,
cache_tree_update() will fail when unmerged entries exist.
convert_to_sparse_rec() uses the cache-tree data to recursively walk the
tree structure, but also to compute the OIDs used in the
sparse-directory entries.
Add an index scan to convert_to_sparse() that will detect if these merge
conflict entries exist and skip the conversion before trying to update
the cache-tree. This is marked as NEEDSWORK because this can be removed
with a suitable update to cache_tree_update() or a similar method that
can construct a cache-tree with invalid nodes, but still allow creating
the nodes necessary for creating sparse directory entries.
It is possible that in the future we will not need to make such an
update, since if we do not expand a sparse-index into a full one, this
conversion does not need to happen. Thus, this can be deferred until the
merge machinery is made to integrate with the sparse-index.
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Preparatory build procedure clean-up for documentation.
* fc/doc-build-cleanup:
doc: avoid using rm directly
doc: simplify Makefile using .DELETE_ON_ERROR
doc: remove unnecessary rm instances
doc: improve asciidoc dependencies
doc: refactor common asciidoc dependencies
|
|
Test clean-up.
* ab/test-lib-updates:
test-lib: split up and deprecate test_create_repo()
test-lib: do not show advice about init.defaultBranch under --verbose
test-lib: reformat argument list in test_create_repo()
submodule tests: use symbolic-ref --short to discover branch name
test-lib functions: add --printf option to test_commit
describe tests: convert setup to use test_commit
test-lib functions: add an --annotated option to "test_commit"
test-lib-functions: document test_commit --no-tag
test-lib-functions: reword "test_commit --append" docs
test-lib tests: remove dead GIT_TEST_FRAMEWORK_SELFTEST variable
test-lib: bring $remove_trash out of retirement
|
|
Test portability fix.
* dd/honor-users-tar-in-tests:
t: use configured TAR instead of tar
|
|
Message update.
* ps/rev-list-object-type-filter:
help: fix small typo in error message
|
|
Workaround compiler warnings.
* ab/trace2-squelch-gcc-warning:
trace2: refactor to avoid gcc warning under -O3
|
|
The "-m" option in "git log -m" that does not specify which format,
if any, of diff is desired did not have any visible effect; it now
implies some form of diff (by default "--patch") is produced.
* so/log-m-implies-p:
diff-merges: let "-m" imply "-p"
diff-merges: rename "combined_imply_patch" to "merges_imply_patch"
stash list: stop passing "-m" to "git log"
git-svn: stop passing "-m" to "git rev-list"
diff-merges: move specific diff-index "-m" handling to diff-index
t4013: test "git diff-index -m"
t4013: test "git diff-tree -m"
t4013: test "git log -m --stat"
t4013: test "git log -m --raw"
t4013: test that "-m" alone has no effect in "git log"
|
|
Optimize out repeated rename detection in a sequence of mergy
operations.
* en/ort-perf-batch-11:
merge-ort, diffcore-rename: employ cached renames when possible
merge-ort: handle interactions of caching and rename/rename(1to1) cases
merge-ort: add helper functions for using cached renames
merge-ort: preserve cached renames for the appropriate side
merge-ort: avoid accidental API mis-use
merge-ort: add code to check for whether cached renames can be reused
merge-ort: populate caches of rename detection results
merge-ort: add data structures for in-memory caching of rename detection
t6429: testcases for remembering renames
fast-rebase: write conflict state to working tree, index, and HEAD
fast-rebase: change assert() to BUG()
Documentation/technical: describe remembering renames optimization
t6423: rename file within directory that other side renamed
|
|
"git fetch" over protocol v2 left its side of the socket open after
it finished speaking, which unnecessarily wasted the resource on
the other side.
* jk/fetch-pack-v2-half-close-early:
fetch-pack: signal v2 server that we are done making requests
|
|
Use the hashfile API in the codepath that writes the index file to
reduce code duplication.
* ds/write-index-with-hashfile-api:
read-cache: delete unused hashing methods
read-cache: use hashfile instead of git_hash_ctx
csum-file.h: increase hashfile buffer size
hashfile: use write_in_full()
|
|
Recent "git clone" left a temporary directory behind when the
transport layer returned an failure.
* jk/clone-clean-upon-transport-error:
clone: clean up directory after transport_fetch_refs() failure
|
|
"git send-email" learned the "--sendmail-cmd" command line option
and the "sendemail.sendmailCmd" configuration variable, which is a
more sensible approach than the current way of repurposing the
"smtp-server" that is meant to name the server to instead name the
command to talk to the server.
* ga/send-email-sendmail-cmd:
git-send-email: add option to specify sendmail command
|
|
The code to handle the "--format" option in "for-each-ref" and
friends made too many string comparisons on %(atom)s used in the
format string, which has been corrected by converting them into
enum when the format string is parsed.
* zh/ref-filter-atom-type:
ref-filter: introduce enum atom_type
ref-filter: add objectsize to used_atom
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Message update.
* ah/setup-extensions-message-i18n-fix:
setup: split "extensions found" messages into singular and plural
|
|
Message update.
* ah/fetch-reject-warning-grammofix:
fetch: improve grammar of "shallow roots" message
|
|
The documentation for "color.pager" configuration variable has been
updated.
* jk/doc-color-pager:
doc: explain the use of color.pager
|
|
Doc fix.
* tl/fix-packfile-uri-doc:
packfile-uri.txt: fix blobPackfileUri description
|
|
The description of "fast-forward" in the glossary has been updated.
* ry/clarify-fast-forward-in-glossary:
docs: improve fast-forward in glossary content
|
|
Quite a many die() messages in rev-parse haven't been marked for
translation.
* wm/rev-parse-die-i18n:
rev-parse: mark die() messages for translation
|
|
Doc update.
* jc/clarify-revision-range:
revisions(7): clarify that most commands take a single revision range
|
|
Doc update.
* ah/doc-describe:
describe-doc: clarify default length of abbreviation
|
|
Message update.
* ah/submodule-helper-module-summary-parseopt:
submodule: use the imperative mood to describe the --files option
|
|
i18n update.
* ah/stash-usage-i18n-fix:
stash: don't translate literal commands
|
|
i18n update.
* ah/merge-usage-i18n-fix:
merge: don't translate literal commands
|
|
Rewrite code that triggers undefined behaiour warning.
* jn/size-t-casted-to-off-t-fix:
xsize_t: avoid implementation defined behavior when len < 0
|
|
The parallel checkout codepath did not initialize object ID field
used to talk to the worker processes in a futureproof way.
* mt/parallel-checkout-with-padded-oidcpy:
parallel-checkout: send the new object_id algo field to the workers
|
|
We historically rejected a very short string as an author name
while accepting a patch e-mail, which has been loosened.
* ef/mailinfo-short-name:
mailinfo: don't discard names under 3 characters
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
l10n-2.32.0-rnd1.1
* tag 'l10n-2.32.0-rnd1.1' of git://github.com/git-l10n/git-po: (25 commits)
l10n: es: 2.32.0 round 1
l10n: zh_CN: for git v2.32.0 l10n round 1
l10n: Update Catalan translation
l10n: de.po: Update German translation for Git v2.32.0
l10n: README: note on fuzzy translations
l10n: README: document l10n conventions
l10n: README: document "core translation"
l10n: README: document git-po-helper
l10n: README: add file extention ".md"
l10n: pt_PT: add Portuguese translations part 3
l10n: bg.po: Updated Bulgarian translation (5204t)
l10n: id: po-id for 2.32.0 (round 1)
l10n: vi.po(5204t): Updated Vietnamese translation for v2.32.0
l10n: zh_TW.po: localized
l10n: zh_TW.po: v2.32.0 round 1 (11 untranslated)
l10n: sv.po: Update Swedish translation (5204t0f0u)
l10n: fix typos in po/TEAMS
l10n: fr: v2.32.0 round 1
l10n: tr: v2.32.0-r1
l10n: fr: fixed inconsistencies
...
|
|
Test fix.
* rs/parallel-checkout-test-fix:
parallel-checkout: avoid dash local bug in tests
|
|
Last minute portability fix.
* jc/fsync-can-fail-with-eintr:
fsync(): be prepared to see EINTR
|
|
Dash bug https://bugs.launchpad.net/ubuntu/+source/dash/+bug/139097
lets the shell erroneously perform field splitting on the expansion of a
command substitution during declaration of a local variable. It causes
the parallel-checkout tests to fail e.g. when running them with
/bin/dash on MacOS 11.4, where they error out like this:
./t2080-parallel-checkout-basics.sh: 33: local: 0: bad variable name
That's because the output of wc -l contains leading spaces and the
returned number of lines is treated as another variable to declare, i.e.
as in "local workers= 0".
Work around it by enclosing the command substitution in quotes.
Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Christopher Diaz Riveros <christopher.diaz.riv@gmail.com>
|
|
Translate 126 new messages (5204t0f0u) for git 2.32.0.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
|
|
* 'fix_typo' of github.com:e-yes/git:
l10n: ru.po: fix typo in Russian translation
|
|
Some platforms, like NonStop do not automatically restart fsync()
when interrupted by a signal, even when that signal is setup with
SA_RESTART.
This can lead to test breakage, e.g., where "--progress" is used,
thus SIGALRM is sent often, and can interrupt an fsync() syscall.
Make sure we deal with such a case by retrying the syscall
ourselves. Luckily, we call fsync() fron a single wrapper,
fsync_or_die(), so the fix is fairly isolated.
Reported-by: Randall S. Becker <randall.becker@nexbridge.ca>
Helped-by: Jeff King <peff@peff.net>
Helped-by: Taylor Blau <me@ttaylorr.com>
[jc: the above two did most of the work---I just tied the loose end]
Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|