summaryrefslogtreecommitdiff
path: root/t/t0410-partial-clone.sh
AgeCommit message (Collapse)AuthorFilesLines
2019-10-07Merge branch 'cc/multi-promisor'Libravatar Junio C Hamano1-0/+13
Cleanup. * cc/multi-promisor: promisor-remote: skip move_to_tail when no-op promisor-remote.h: drop extern from function declaration
2019-10-07Merge branch 'jt/cache-tree-avoid-lazy-fetch-during-merge'Libravatar Junio C Hamano1-0/+14
The cache-tree code has been taught to be less aggressive in attempting to see if a tree object it computed already exists in the repository. * jt/cache-tree-avoid-lazy-fetch-during-merge: cache-tree: do not lazy-fetch tentative tree
2019-10-02promisor-remote: skip move_to_tail when no-opLibravatar Emily Shaffer1-0/+13
Previously, when promisor_remote_move_to_tail() is called for a promisor_remote which is currently the final element in promisors, a cycle is created in the promisors linked list. This cycle leads to a double free later on in promisor_remote_clear() when the final element of the promisors list is removed: promisors is set to promisors->next (a no-op, as promisors->next == promisors); the previous value of promisors is free()'d; then the new value of promisors (which is equal to the previous value of promisors) is also free()'d. This double-free error was unrecoverable for the user without removing the filter or re-cloning the repo and hoping to miss this edge case. Now, when promisor_remote_move_to_tail() would be a no-op, just do a no-op. In cases of promisor_remote_move_to_tail() where r is not already at the tail of the list, it works as before. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Acked-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-18Merge branch 'cc/multi-promisor'Libravatar Junio C Hamano1-6/+55
Teach the lazy clone machinery that there can be more than one promisor remote and consult them in order when downloading missing objects on demand. * cc/multi-promisor: Move core_partial_clone_filter_default to promisor-remote.c Move repository_format_partial_clone to promisor-remote.c Remove fetch-object.{c,h} in favor of promisor-remote.{c,h} remote: add promisor and partial clone config to the doc partial-clone: add multiple remotes in the doc t0410: test fetching from many promisor remotes builtin/fetch: remove unique promisor remote limitation promisor-remote: parse remote.*.partialclonefilter Use promisor_remote_get_direct() and has_promisor_remote() promisor-remote: use repository_format_partial_clone promisor-remote: add promisor_remote_reinit() promisor-remote: implement promisor_remote_get_direct() Add initial support for many promisor remotes fetch-object: make functions return an error code t0410: remove pipes after git commands
2019-09-09cache-tree: do not lazy-fetch tentative treeLibravatar Jonathan Tan1-0/+14
The cache-tree datastructure is used to speed up the comparison between the HEAD and the index, and when the index is updated by a cherry-pick (for example), a tree object that would represent the paths in the index in a directory is constructed in-core, to see if such a tree object exists already in the object store. When the lazy-fetch mechanism was introduced, we converted this "does the tree exist?" check into an "if it does not, and if we lazily cloned, see if the remote has it" call by mistake. Since the whole point of this check is to repair the cache-tree by recording an already existing tree object opportunistically, we shouldn't even try to fetch one from the remote. Pass the OBJECT_INFO_SKIP_FETCH_OBJECT flag to make sure we only check for existence in the local object store without triggering the lazy fetch mechanism. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> [jc: rewritten the proposed log message] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-09Merge branch 'ds/feature-macros'Libravatar Junio C Hamano1-1/+1
A mechanism to affect the default setting for a (related) group of configuration variables is introduced. * ds/feature-macros: repo-settings: create feature.experimental setting repo-settings: create feature.manyFiles setting repo-settings: parse core.untrackedCache commit-graph: turn on commit-graph by default t6501: use 'git gc' in quiet mode repo-settings: consolidate some config settings
2019-08-13commit-graph: turn on commit-graph by defaultLibravatar Derrick Stolee1-1/+1
The commit-graph feature has seen a lot of activity in the past year or so since it was introduced. The feature is a critical performance enhancement for medium- to large-sized repos, and does not significantly hurt small repos. Change the defaults for core.commitGraph and gc.writeCommitGraph to true so users benefit from this feature by default. There are several places in the test suite where the environment variable GIT_TEST_COMMIT_GRAPH is disabled to avoid reading a commit-graph, if it exists. The config option overrides the environment, so swap these. Some GIT_TEST_COMMIT_GRAPH assignments remain, and those are to avoid writing a commit-graph when a new commit is created. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-02t: warn against adding non-httpd-specific tests after sourcing 'lib-httpd'Libravatar SZEDER Gábor1-0/+3
We have a couple of test scripts that are not completely httpd-specific, but do run a few httpd-specific tests at the end. These test scripts source 'lib-httpd.sh' somewhere mid-script, which then skips all the rest of the test script if the dependencies for running httpd tests are not fulfilled. As the previous two patches in this series show, already on two occasions non-httpd-specific tests were appended at the end of such test scripts, and, consequently, they were skipped as well when httpd tests couldn't be run. Add a comment at the end of these test scripts to warn against adding non-httpd-specific tests at the end, in the hope that they will help prevent similar issues in the future. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25t0410: test fetching from many promisor remotesLibravatar Christian Couder1-1/+48
This shows that it is now possible to fetch objects from more than one promisor remote, and that fetching from a new promisor remote can configure it as one. Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25promisor-remote: parse remote.*.partialclonefilterLibravatar Christian Couder1-1/+1
This makes it possible to specify a different partial clone filter for each promisor remote. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-25t0410: remove pipes after git commandsLibravatar Christian Couder1-4/+6
Let's not run a git command, especially one with "verify" in its name, upstream of a pipe, because the pipe will hide the git command's exit code. While at it, let's also avoid a useless `cat` command piping into `sed`. Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-03-14tests: use 'test_atexit' to stop httpdLibravatar SZEDER Gábor1-2/+0
Use 'test_atexit' to run cleanup commands to stop httpd at the end of the test script or upon interrupt or failure, as it is shorter, simpler, and more robust than registering such cleanup commands in the trap on EXIT in the test scripts. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-18Merge branch 'sg/stress-test'Libravatar Junio C Hamano1-1/+0
Flaky tests can now be repeatedly run under load with the "--stress" option. * sg/stress-test: test-lib: add the '--stress' option to run a test repeatedly under load test-lib-functions: introduce the 'test_set_port' helper function test-lib: set $TRASH_DIRECTORY earlier test-lib: consolidate naming of test-results paths test-lib: parse command line options earlier test-lib: parse options in a for loop to keep $@ intact test-lib: extract Bash version check for '-x' tracing test-lib: translate SIGTERM and SIGHUP to an exit
2019-01-14Merge branch 'md/list-lazy-objects-fix'Libravatar Junio C Hamano1-2/+14
"git rev-list --exclude-promisor-objects" had to take an object that does not exist locally (and is lazily available) from the command line without barfing, but the code dereferenced NULL. * md/list-lazy-objects-fix: list-objects.c: don't segfault for missing cmdline objects
2019-01-07test-lib-functions: introduce the 'test_set_port' helper functionLibravatar SZEDER Gábor1-1/+0
Several test scripts run daemons like 'git-daemon' or Apache, and communicate with them through TCP sockets. To have unique ports where these daemons are accessible, the ports are usually the number of the corresponding test scripts, unless the user overrides them via environment variables, and thus all those tests and test libs contain more or less the same bit of one-liner boilerplate code to find out the port. The last patch in this series will make this a bit more complicated. Factor out finding the port for a daemon into the common helper function 'test_set_port' to avoid repeating ourselves. Take special care of test scripts with "low" numbers: - Test numbers below 1024 would result in a port that's only usable as root, so set their port to '10000 + test-nr' to make sure it doesn't interfere with other tests in the test suite. This makes the hardcoded port number in 't0410-partial-clone.sh' unnecessary, remove it. - The shell's arithmetic evaluation interprets numbers with leading zeros as octal values, which means that test number below 1000 and containing the digits 8 or 9 will trigger an error. Remove all leading zeros from the test numbers to prevent this. Note that the 'git p4' tests are unlike the other tests involving daemons in that: - 'lib-git-p4.sh' doesn't use the test's number for unique port as is, but does a bit of additional arithmetic on top [1]. - The port is not overridable via an environment variable. With this patch even 'git p4' tests will use the test's number as default port, and it will be overridable via the P4DPORT environment variable. [1] Commit fc00233071 (git-p4 tests: refactor and cleanup, 2011-08-22) introduced that "unusual" unique port computation without explaining why it was necessary (as opposed to simply using the test number as is). It seems to be just unnecessary complication, and in any case that commit came way before the "test nr as unique port" got "standardized" for other daemons in commits c44132fcf3 (tests: auto-set git-daemon port, 2014-02-10), 3bb486e439 (tests: auto-set LIB_HTTPD_PORT from test name, 2014-02-10), and bf9d7df950 (t/lib-git-svn.sh: improve svnserve tests with parallel make test, 2017-12-01). Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-06list-objects.c: don't segfault for missing cmdline objectsLibravatar Matthew DeVore1-2/+14
When a command is invoked with both --exclude-promisor-objects, --objects-edge-aggressive, and a missing object on the command line, the rev_info.cmdline array could get a NULL pointer for the value of an 'item' field. Prevent dereferencing of a NULL pointer in that situation. Properly handle --ignore-missing. If it is not passed, die when an object is missing. Otherwise, just silently ignore it. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-30Merge branch 'md/filter-trees'Libravatar Junio C Hamano1-0/+45
The "rev-list --filter" feature learned to exclude all trees via "tree:0" filter. * md/filter-trees: list-objects: support for skipping tree traversal filter-trees: code clean-up of tests list-objects-filter: implement filter tree:0 list-objects-filter-options: do not over-strbuf_init list-objects-filter: use BUG rather than die revision: mark non-user-given objects instead rev-list: handle missing tree objects properly list-objects: always parse trees gently list-objects: refactor to process_tree_contents list-objects: store common func args in struct
2018-10-19Merge branch 'jt/non-blob-lazy-fetch'Libravatar Junio C Hamano1-0/+41
A partial clone that is configured to lazily fetch missing objects will on-demand issue a "git fetch" request to the originating repository to fill not-yet-obtained objects. The request has been optimized for requesting a tree object (and not the leaf blob objects contained in it) by telling the originating repository that no blobs are needed. * jt/non-blob-lazy-fetch: fetch-pack: exclude blobs when lazy-fetching trees fetch-pack: avoid object flags if no_dependents
2018-10-07rev-list: handle missing tree objects properlyLibravatar Matthew DeVore1-0/+45
Previously, we assumed only blob objects could be missing. This patch makes rev-list handle missing trees like missing blobs. The --missing=* and --exclude-promisor-objects flags now work for trees as they already do for blobs. This is demonstrated in t6112. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-04fetch-pack: exclude blobs when lazy-fetching treesLibravatar Jonathan Tan1-0/+41
A partial clone with missing trees can be obtained using "git clone --filter=tree:none <repo>". In such a repository, when a tree needs to be lazily fetched, any tree or blob it directly or indirectly references is fetched as well, regardless of whether the original command required those objects, or if the local repository already had some of them. This is because the fetch protocol, which the lazy fetch uses, does not allow clients to request that only the wanted objects be sent, which would be the ideal solution. This patch implements a partial solution: specify the "blob:none" filter, somewhat reducing the fetch payload. This change has no effect when lazily fetching blobs (due to how filters work). And if lazily fetching a commit (such repositories are difficult to construct and is not a use case we support very well, but it is possible), referenced commits and trees are still fetched - only the blobs are not fetched. The necessary code change is done in fetch_pack() instead of somewhere closer to where the "filter" instruction is written to the wire so that only one part of the code needs to be changed in order for users of all protocol versions to benefit from this optimization. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-24Merge branch 'jt/lazy-object-fetch-fix'Libravatar Junio C Hamano1-0/+12
The code to backfill objects in lazily cloned repository did not work correctly, which has been corrected. * jt/lazy-object-fetch-fix: fetch-object: set exact_oid when fetching fetch-object: unify fetch_object[s] functions
2018-09-13fetch-object: set exact_oid when fetchingLibravatar Jonathan Tan1-0/+12
fetch_objects() currently does not set exact_oid in struct ref when invoking transport_fetch_refs(). If the server supports ref-in-want, fetch_pack() uses this field to determine whether a wanted ref should be requested as a "want-ref" line or a "want" line; without the setting of exact_oid, the wrong line will be sent. Set exact_oid, so that the correct line is sent. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29commit-graph: define GIT_TEST_COMMIT_GRAPHLibravatar Derrick Stolee1-1/+1
The commit-graph feature is tested in isolation by t5318-commit-graph.sh and t6600-test-reach.sh, but there are many more interesting scenarios involving commit walks. Many of these scenarios are covered by the existing test suite, but we need to maintain coverage when the optional commit-graph structure is not present. To allow running the full test suite with the commit-graph present, add a new test environment variable, GIT_TEST_COMMIT_GRAPH. Similar to GIT_TEST_SPLIT_INDEX, this variable makes every Git command try to load the commit-graph when parsing commits, and writes the commit-graph file after every 'git commit' command. There are a few tests that rely on commits not existing in pack-files to trigger important events, so manually set GIT_TEST_COMMIT_GRAPH to false for the necessary commands. There is one test in t6024-recursive-merge.sh that relies on the merge-base algorithm picking one of two ambiguous merge-bases, and the commit-graph feature changes which merge-base is picked. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-09repack: repack promisor objects if -a or -A is setLibravatar Jonathan Tan1-11/+74
Currently, repack does not touch promisor packfiles at all, potentially causing the performance of repositories that have many such packfiles to drop. Therefore, repack all promisor objects if invoked with -a or -A. This is done by an additional invocation of pack-objects on all promisor objects individually given, which takes care of deduplication and allows the resulting packfiles to respect flags such as --max-pack-size. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-12list-objects: check if filter is NULL before usingLibravatar Jonathan Tan1-0/+8
In partial_clone_get_default_filter_spec(), the core_partial_clone_filter_default variable may be NULL; ensure that it is not NULL before using it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08gc: do not repack promisor packfilesLibravatar Jonathan Tan1-2/+52
Teach gc to stop traversal at promisor objects, and to leave promisor packfiles alone. This has the effect of only repacking non-promisor packfiles, and preserves the distinction between promisor packfiles and non-promisor packfiles. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08rev-list: support termination at promisor objectsLibravatar Jonathan Tan1-0/+101
Teach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch. For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references. (In list-objects.c, the case where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08sha1_file: support lazily fetching missing objectsLibravatar Jonathan Tan1-0/+51
Teach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing. The fetching of objects can be suppressed through a global variable. This is used by fsck and index-pack. However, by default, such fetching is not suppressed. This is meant as a temporary measure to ensure that all Git commands work in such a situation. Future patches will update some commands to either tolerate missing objects (without fetching them) or be more efficient in fetching them. In order to determine the code changes in sha1_file.c necessary, I investigated the following: (1) functions in sha1_file.c that take in a hash, without the user regarding how the object is stored (loose or packed) (2) functions in packfile.c (because I need to check callers that know about the loose/packed distinction and operate on both differently, and ensure that they can handle the concept of objects that are neither loose nor packed) (1) is handled by the modification to sha1_object_info_extended(). For (2), I looked at for_each_packed_object and others. For for_each_packed_object, the callers either already work or are fixed in this patch: - reachable - only to find recent objects - builtin/fsck - already knows about missing objects - builtin/cat-file - warning message added in this commit Callers of the other functions do not need to be changed: - parse_pack_index - http - indirectly from http_get_info_packs - find_pack_entry_one - this searches a single pack that is provided as an argument; the caller already knows (through other means) that the sought object is in a specific pack - find_sha1_pack - fast-import - appears to be an optimization to not store a file if it is already in a pack - http-walker - to search through a struct alt_base - http-push - to search through remote packs - has_sha1_pack - builtin/fsck - already knows about promisor objects - builtin/count-objects - informational purposes only (check if loose object is also packed) - builtin/prune-packed - check if object to be pruned is packed (if not, don't prune it) - revision - used to exclude packed objects if requested by user - diff - just for optimization Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05fsck: support promisor objects as CLI argumentLibravatar Jonathan Tan1-0/+13
Teach fsck to not treat missing promisor objects provided on the CLI as an error when extensions.partialclone is set. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05fsck: support referenced promisor objectsLibravatar Jonathan Tan1-0/+23
Teach fsck to not treat missing promisor objects indirectly pointed to by refs as an error when extensions.partialclone is set. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05fsck: support refs pointing to promisor objectsLibravatar Jonathan Tan1-0/+24
Teach fsck to not treat refs referring to missing promisor objects as an error when extensions.partialclone is set. For the purposes of warning about no default refs, such refs are still treated as legitimate refs. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05fsck: introduce partialclone extensionLibravatar Jonathan Tan1-0/+81
Currently, Git does not support repos with very large numbers of objects or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git is designed on the assumption that every referenced object is available somewhere in the repo storage. In such an arrangement, the full set of objects is usually available in remote storage, ready to be lazily downloaded. Teach fsck about the new state of affairs. In this commit, teach fsck that missing promisor objects referenced from the reflog are not an error case; in future commits, fsck will be taught about other cases. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>