summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)AuthorFilesLines
2017-03-28Merge branch 'jk/pack-name-cleanups' into maintLibravatar Junio C Hamano1-0/+21
Code clean-up. * jk/pack-name-cleanups: index-pack: make pointer-alias fallbacks safer replace snprintf with odb_pack_name() odb_pack_keep(): stop generating keepfile name sha1_file.c: make pack-name helper globally accessible move odb_* declarations out of git-compat-util.h
2017-03-28Merge branch 'jk/interpret-branch-name' into maintLibravatar Junio C Hamano1-1/+31
"git branch @" created refs/heads/@ as a branch, and in general the code that handled @{-1} and @{upstream} was a bit too loose in disambiguating. * jk/interpret-branch-name: checkout: restrict @-expansions when finding branch strbuf_check_ref_format(): expand only local branches branch: restrict @-expansions when deleting t3204: test git-branch @-expansion corner cases interpret_branch_name: allow callers to restrict expansions strbuf_branchname: add docstring strbuf_branchname: drop return value interpret_branch_name: move docstring to header file interpret_branch_name(): handle auto-namelen for @{-1}
2017-03-28Merge branch 'jk/parse-config-key-cleanup' into maintLibravatar Junio C Hamano1-1/+4
The "parse_config_key()" API function has been cleaned up. * jk/parse-config-key-cleanup: parse_hide_refs_config: tell parse_config_key we don't want a subsection parse_config_key: allow matching single-level config parse_config_key: use skip_prefix instead of starts_with refs: parse_hide_refs_config to use parse_config_key
2017-03-21Merge branch 'rj/remove-unused-mktemp' into maintLibravatar Junio C Hamano1-3/+0
Code cleanup. * rj/remove-unused-mktemp: wrapper.c: remove unused gitmkstemps() function wrapper.c: remove unused git_mkstemp() function
2017-03-16odb_pack_keep(): stop generating keepfile nameLibravatar Jeff King1-4/+4
The odb_pack_keep() function generates the name of a .keep file and opens it. This has two problems: 1. It requires a fixed-size buffer to create the filename and doesn't notice when the result is truncated. 2. Of the two callers, one sometimes wants to open a filename it already has, which makes things awkward (it has to do so manually, and skips the leading-directory creation). Instead, let's have odb_pack_keep() just open the file. Generating the name isn't hard, and a future patch will switch callers over to odb_pack_name() anyway. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-16sha1_file.c: make pack-name helper globally accessibleLibravatar Jeff King1-0/+9
We provide sha1_pack_name() and sha1_pack_index_name(), but the more generic form (which takes its own strbuf and an arbitrary extension) is only used to implement the other two. Let's make it available, but clean up a few things: 1. Name it odb_pack_name(), as the original sha1_get_pack_name() is long but not all that descriptive. 2. Switch the strbuf argument to the beginning, so that it matches similar path-building functions like git_path_buf(). 3. Clean up the out-dated docstring and move it to the public declaration. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-16move odb_* declarations out of git-compat-util.hLibravatar Jeff King1-0/+12
These functions were originally conceived as wrapper functions similar to xmkstemp(). They were later moved by 463db9b10 (wrapper: move odb_* to environment.c, 2010-11-06). The more appropriate place for a declaration is in cache.h. While we're at it, let's add some basic docstrings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-08real_pathdup(): fix callsites that wanted it to die on errorLibravatar Johannes Schindelin1-1/+1
In 4ac9006f832 (real_path: have callers use real_pathdup and strbuf_realpath, 2016-12-12), we changed the xstrdup(real_path()) pattern to use real_pathdup() directly. The problem with this change is that real_path() calls strbuf_realpath() with die_on_error = 1 while real_pathdup() calls it with die_on_error = 0. Meaning that in cases where real_path() causes Git to die() with an error message, real_pathdup() is silent and returns NULL instead. The callers, however, are ill-prepared for that change, as they expect the return value to be non-NULL (and otherwise the function died with an appropriate error message). Fix this by extending real_pathdup()'s signature to accept the die_on_error flag and simply pass it through to strbuf_realpath(), and then adjust all callers after a careful audit whether they would handle NULLs well. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-02interpret_branch_name: allow callers to restrict expansionsLibravatar Jeff King1-2/+11
The interpret_branch_name() function converts names like @{-1} and @{upstream} into branch names. The expanded ref names are not fully qualified, and may be outside of the refs/heads/ namespace (e.g., "@" expands to "HEAD", and "@{upstream}" is likely to be in "refs/remotes/"). This is OK for callers like dwim_ref() which are primarily interested in resolving the resulting name, no matter where it is. But callers like "git branch" treat the result as a branch name in refs/heads/. When we expand to a ref outside that namespace, the results are very confusing (e.g., "git branch @" tries to create refs/heads/HEAD, which is nonsense). Callers can't know from the returned string how the expansion happened (e.g., did the user really ask for a branch named "HEAD", or did we do a bogus expansion?). One fix would be to return some out-parameters describing the types of expansion that occurred. This has the benefit that the caller can generate precise error messages ("I understood @{upstream} to mean origin/master, but that is a remote tracking branch, so you cannot create it as a local name"). However, out-parameters make the function interface somewhat cumbersome. Instead, let's do the opposite: let the caller tell us which elements to expand. That's easier to pass in, and none of the callers give more precise error messages than "@{upstream} isn't a valid branch name" anyway (which should be sufficient). The strbuf_branchname() function needs a similar parameter, as most of the callers access interpret_branch_name() through it. We can break the callers down into two groups: 1. Callers that are happy with any kind of ref in the result. We pass "0" here, so they continue to work without restrictions. This includes merge_name(), the reflog handling in add_pending_object_with_path(), and substitute_branch_name(). This last is what powers dwim_ref(). 2. Callers that have funny corner cases (mostly in git-branch and git-checkout). These need to make use of the new parameter, but I've left them as "0" in this patch, and will address them individually in follow-on patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-02interpret_branch_name: move docstring to header fileLibravatar Jeff King1-0/+21
We generally put docstrings with function declarations, because it's the callers who need to know how the function works. Let's do so for interpret_branch_name(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-28wrapper.c: remove unused git_mkstemp() functionLibravatar Ramsay Jones1-3/+0
The last caller of git_mkstemp() was removed in commit 6fec0a89 ("verify_signed_buffer: use tempfile object", 16-06-2016). Since the introduction of the 'tempfile' APIs, along with git_mkstemp_mode, it is unlikely that new callers will materialize. Remove the dead code. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-24parse_config_key: allow matching single-level configLibravatar Jeff King1-1/+4
The parse_config_key() function was introduced to make it easier to match "section.subsection.key" variables. It also handles the simpler "section.key", and the caller is responsible for distinguishing the two from its out-parameters. Most callers who _only_ want "section.key" would just use a strcmp(var, "section.key"), since there is no parsing required. However, they may still use parse_config_key() if their "section" variable isn't a constant (an example of this is in parse_hide_refs_config). Using the parse_config_key is a bit clunky, though: const char *subsection; int subsection_len; const char *key; if (!parse_config_key(var, section, &subsection, &subsection_len, &key) && !subsection) { /* matched! */ } Instead, let's treat a NULL subsection as an indication that the caller does not expect one. That lets us write: const char *key; if (!parse_config_key(var, section, NULL, NULL, &key)) { /* matched! */ } Existing callers should be unaffected, as passing a NULL subsection would currently segfault. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03Merge branch 'cw/log-updates-for-all-refs-really'Libravatar Junio C Hamano1-1/+8
The "core.logAllRefUpdates" that used to be boolean has been enhanced to take 'always' as well, to record ref updates to refs other than the ones that are expected to be updated (i.e. branches, remote-tracking branches and notes). * cw/log-updates-for-all-refs-really: doc: add note about ignoring '--no-create-reflog' update-ref: add test cases for bare repository refs: add option core.logAllRefUpdates = always config: add markup to core.logAllRefUpdates doc
2017-02-03Merge branch 'sb/submodule-recursive-absorb'Libravatar Junio C Hamano1-1/+4
When a submodule "A", which has another submodule "B" nested within it, is "absorbed" into the top-level superproject, the inner submodule "B" used to be left in a strange state. The logic to adjust the .git pointers in these submodules has been corrected. * sb/submodule-recursive-absorb: submodule absorbing: fix worktree/gitdir pointers recursively for non-moves cache.h: expose the dying procedure for reading gitlinks setup: add gentle version of resolve_git_dir
2017-02-02Merge branch 'rs/absolute-pathdup'Libravatar Junio C Hamano1-0/+1
Code cleanup. * rs/absolute-pathdup: use absolute_pathdup() abspath: add absolute_pathdup()
2017-01-31Merge branch 'sb/in-core-index-doc' into maintLibravatar Junio C Hamano1-0/+32
Documentation and in-code comments updates. * sb/in-core-index-doc: documentation: retire unfinished documentation cache.h: document add_[file_]to_index cache.h: document remove_index_entry_at cache.h: document index_name_pos
2017-01-31Merge branch 'sb/in-core-index-doc'Libravatar Junio C Hamano1-0/+32
Documentation and in-code comments updates. * sb/in-core-index-doc: documentation: retire unfinished documentation cache.h: document add_[file_]to_index cache.h: document remove_index_entry_at cache.h: document index_name_pos
2017-01-31Merge branch 'jk/loose-object-fsck'Libravatar Junio C Hamano1-0/+13
"git fsck" inspects loose objects more carefully now. * jk/loose-object-fsck: fsck: detect trailing garbage in all object types fsck: parse loose object paths directly sha1_file: add read_loose_object() function t1450: test fsck of packed objects sha1_file: fix error message for alternate objects t1450: refactor loose-object removal
2017-01-31refs: add option core.logAllRefUpdates = alwaysLibravatar Cornelius Weig1-1/+8
When core.logallrefupdates is true, we only create a new reflog for refs that are under certain well-known hierarchies. The reason is that we know that some hierarchies (like refs/tags) are not meant to change, and that unknown hierarchies might not want reflogs at all (e.g., a hypothetical refs/foo might be meant to change often and drop old history immediately). However, sometimes it is useful to override this decision and simply log for all refs, because the safety and audit trail is more important than the performance implications of keeping the log around. This patch introduces a new "always" mode for the core.logallrefupdates option which will log updates to everything under refs/, regardless where in the hierarchy it is (we still will not log things like ORIG_HEAD and FETCH_HEAD, which are known to be transient). Based-on-patch-by: Jeff King <peff@peff.net> Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26abspath: add absolute_pathdup()Libravatar René Scharfe1-0/+1
Add a function that returns a buffer containing the absolute path of its argument and a semantic patch for its intended use. It avoids an extra string copy to a static buffer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26cache.h: expose the dying procedure for reading gitlinksLibravatar Stefan Beller1-0/+1
In a later patch we want to react to only a subset of errors, defaulting the rest to die as usual. Separate the block that takes care of dying into its own function so we have easy access to it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26setup: add gentle version of resolve_git_dirLibravatar Stefan Beller1-1/+3
This follows a93bedada (setup: add gentle version of read_gitfile, 2015-06-09), and assumes the same reasoning. resolve_git_dir is unsuited for speculative calls, so we want to use the gentle version to find out about potential errors. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-23Merge branch 'bw/read-blob-data-does-not-modify-index-state'Libravatar Junio C Hamano1-1/+1
Code clean-up. * bw/read-blob-data-does-not-modify-index-state: index: improve constness for reading blob data
2017-01-19cache.h: document add_[file_]to_indexLibravatar Stefan Beller1-0/+10
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-19cache.h: document remove_index_entry_atLibravatar Stefan Beller1-0/+3
Do this by moving the existing documentation from read-cache.c to cache.h. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-19cache.h: document index_name_posLibravatar Stefan Beller1-0/+19
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-18Merge branch 'bw/pathspec-cleanup'Libravatar Junio C Hamano1-1/+0
Code clean-up in the pathspec API. * bw/pathspec-cleanup: pathspec: rename prefix_pathspec to init_pathspec_item pathspec: small readability changes pathspec: create strip submodule slash helpers pathspec: create parse_element_magic helper pathspec: create parse_long_magic function pathspec: create parse_short_magic function pathspec: factor global magic into its own function pathspec: simpler logic to prefix original pathspec elements pathspec: always show mnemonic and name in unsupported_magic pathspec: remove unused variable from unsupported_magic pathspec: copy and free owned memory pathspec: remove the deprecated get_pathspec function ls-tree: convert show_recursive to use the pathspec struct interface dir: convert fill_directory to use the pathspec struct interface dir: remove struct path_simplify mv: remove use of deprecated 'get_pathspec()'
2017-01-18Merge branch 'bw/grep-recurse-submodules'Libravatar Junio C Hamano1-0/+5
"git grep" has been taught to optionally recurse into submodules. * bw/grep-recurse-submodules: grep: search history of moved submodules grep: enable recurse-submodules to work on <tree> objects grep: optionally recurse into submodules grep: add submodules as a grep source type submodules: load gitmodules file from commit sha1 submodules: add helper to determine if a submodule is initialized submodules: add helper to determine if a submodule is populated real_path: canonicalize directory separators in root parts real_path: have callers use real_pathdup and strbuf_realpath real_path: create real_pathdup real_path: convert real_path_internal to strbuf_realpath real_path: resolve symlinks by hand
2017-01-15sha1_file: add read_loose_object() functionLibravatar Jeff King1-0/+13
It's surprisingly hard to ask the sha1_file code to open a _specific_ incarnation of a loose object. Most of the functions take a sha1, and loop over the various object types (packed versus loose) and locations (local versus alternates) at a low level. However, some tools like fsck need to look at a specific file. This patch gives them a function they can use to open the loose object at a given path. The implementation unfortunately ends up repeating bits of related functions, but there's not a good way around it without some major refactoring of the whole sha1_file stack. We need to mmap the specific file, then partially read the zlib stream to know whether we're streaming or not, and then finally either stream it or copy the data to a buffer. We can do that by assembling some of the more arcane internal sha1_file functions, but we end up having to essentially reimplement unpack_sha1_file(), along with the streaming bits of check_sha1_signature(). Still, most of the ugliness is contained in the new function, and the interface is clean enough that it may be reusable (though it seems unlikely anything but git-fsck would care about opening a specific file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-11index: improve constness for reading blob dataLibravatar Brandon Williams1-1/+1
Improve constness of the index_state parameter to the 'read_blob_data_from_index' function. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10Merge branch 'jc/git-open-cloexec'Libravatar Junio C Hamano1-1/+2
The codeflow of setting NOATIME and CLOEXEC on file descriptors Git opens has been simplified. We may want to drop the tip one, but we'll see. * jc/git-open-cloexec: sha1_file: stop opening files with O_NOATIME git_open_cloexec(): use fcntl(2) w/ FD_CLOEXEC fallback git_open(): untangle possible NOATIME and CLOEXEC interactions
2017-01-08pathspec: remove the deprecated get_pathspec functionLibravatar Brandon Williams1-1/+0
Now that all callers of the old 'get_pathspec' interface have been migrated to use the new pathspec struct interface it can be removed from the codebase. Since there are no more users of the '_raw' field in the pathspec struct it can also be removed. This patch also removes the old functionality of modifying the const char **argv array that was passed into parse_pathspec. Instead the constructed 'match' string (which is a pathspec element with the prefix prepended) is only stored in its corresponding pathspec_item entry. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-22submodules: load gitmodules file from commit sha1Libravatar Brandon Williams1-0/+2
teach submodules to load a '.gitmodules' file from a commit sha1. This enables the population of the submodule_cache to be based on the state of the '.gitmodules' file from a particular commit. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-12real_path: create real_pathdupLibravatar Brandon Williams1-0/+1
Create real_pathdup which returns a caller owned string of the resolved realpath based on the provide path. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-12real_path: convert real_path_internal to strbuf_realpathLibravatar Brandon Williams1-0/+2
Change the name of real_path_internal to strbuf_realpath. In addition push the static strbuf up to its callers and instead take as a parameter a pointer to a strbuf to use for the final result. This change makes strbuf_realpath reentrant. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-15compression: unify pack.compression configuration parsingLibravatar Junio C Hamano1-1/+1
There are three codepaths that use a variable whose name is pack_compression_level to affect how objects and deltas sent to a packfile is compressed. Unlike zlib_compression_level that controls the loose object compression, however, this variable was static to each of these codepaths. Two of them read the pack.compression configuration variable, using core.compression as the default, and one of them also allowed overriding it from the command line. The other codepath in bulk-checkin did not pay any attention to the configuration. Unify the configuration parsing to git_default_config(), where we implement the parsing of core.loosecompression and core.compression and make the former override the latter, by moving code to parse pack.compression and also allow core.compression to give default to this variable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-02sha1_file: stop opening files with O_NOATIMELibravatar Junio C Hamano1-1/+1
When we open object files, we try to do so with O_NOATIME. This dates back to 144bde78e9 (Use O_NOATIME when opening the sha1 files., 2005-04-23), which is an optimization to avoid creating a bunch of dirty inodes when we're accessing many objects. But a few things have changed since then: 1. In June 2005, git learned about packfiles, which means we would do a lot fewer atime updates (rather than one per object access, we'd generally get one per packfile). 2. In late 2006, Linux learned about "relatime", which is generally the default on modern installs. So performance around atimes updates is a non-issue there these days. All the world isn't Linux, but as it turns out, Linux is the only platform to implement O_NOATIME in the first place. So it's very unlikely that this code is helping anybody these days. Helped-by: Jeff King <peff@peff.net> [jc: took idea and log message from peff] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'ls/git-open-cloexec'Libravatar Junio C Hamano1-1/+1
Git generally does not explicitly close file descriptors that were open in the parent process when spawning a child process, but most of the time the child does not want to access them. As Windows does not allow removing or renaming a file that has a file descriptor open, a slow-to-exit child can even break the parent process by holding onto them. Use O_CLOEXEC flag to open files in various codepaths. * ls/git-open-cloexec: read-cache: make sure file handles are not inherited by child processes sha1_file: open window into packfiles with O_CLOEXEC sha1_file: rename git_open_noatime() to git_open()
2016-10-28Merge branch 'jk/fetch-quick-tag-following' into maintLibravatar Junio C Hamano1-0/+1
When fetching from a remote that has many tags that are irrelevant to branches we are following, we used to waste way too many cycles when checking if the object pointed at by a tag (that we are not going to fetch!) exists in our repository too carefully. * jk/fetch-quick-tag-following: fetch: use "quick" has_sha1_file for tag following
2016-10-28git_open(): untangle possible NOATIME and CLOEXEC interactionsLibravatar Junio C Hamano1-0/+1
The way we structured the fallback/retry mechanism for opening with O_NOATIME and O_CLOEXEC meant that if we failed due to lack of support to open the file with O_NOATIME option (i.e. EINVAL), we would still try to drop O_CLOEXEC first and retry, and then drop O_NOATIME. A platform on which O_NOATIME is defined in the header without support from the kernel wouldn't have a chance to open with O_CLOEXEC option due to this code structure. Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter is mostly about performance, while the former can affect correctness. Instead use O_CLOEXEC to open the file, and then use fcntl(2) to set O_NOATIME on the resulting file descriptor. open(2) itself does not cause atime to be updated according to Linus [*1*]. The helper to do the former can be usable in the codepath in ce_compare_data() that was recently added to open a file descriptor with O_CLOEXEC; use it while we are at it. *1* <CA+55aFw83E+zOd+z5h-CA-3NhrLjVr-anL6pubrSWttYx3zu8g@mail.gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-27Merge branch 'jk/no-looking-at-dotgit-outside-repo'Libravatar Junio C Hamano1-2/+2
Update "git diff --no-index" codepath not to try to peek into .git/ directory that happens to be under the current directory, when we know we are operating outside any repository. * jk/no-looking-at-dotgit-outside-repo: diff: handle sha1 abbreviations outside of repository diff_aligned_abbrev: use "struct oid" diff_unique_abbrev: rename to diff_aligned_abbrev find_unique_abbrev: use 4-buffer ring test-*-cache-tree: setup git dir read info/{attributes,exclude} only when in repository
2016-10-27Merge branch 'jk/abbrev-auto'Libravatar Junio C Hamano1-1/+6
Updates the way approximate count of total objects is computed while attempting to come up with a unique abbreviated object name, which in turn needs to estimate how many hexdigits are necessary to ensure uniqueness. * jk/abbrev-auto: find_unique_abbrev: move logic out of get_short_sha1()
2016-10-27Merge branch 'lt/abbrev-auto'Libravatar Junio C Hamano1-0/+4
Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests to use 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing
2016-10-26find_unique_abbrev: use 4-buffer ringLibravatar Jeff King1-2/+2
Some code paths want to format multiple abbreviated sha1s in the same output line. Because we use a single static buffer for our return value, they have to either break their output into several calls or allocate their own arrays and use find_unique_abbrev_r(). Intead, let's mimic sha1_to_hex() and use a ring of several buffers, so that the return value stays valid through multiple calls. This shortens some of the callers, and makes it harder to for them to make a silly mistake. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26Merge branch 'jk/fetch-quick-tag-following'Libravatar Junio C Hamano1-0/+1
When fetching from a remote that has many tags that are irrelevant to branches we are following, we used to waste way too many cycles when checking if the object pointed at by a tag (that we are not going to fetch!) exists in our repository too carefully. * jk/fetch-quick-tag-following: fetch: use "quick" has_sha1_file for tag following
2016-10-26Merge branch 'bw/ls-files-recurse-submodules'Libravatar Junio C Hamano1-0/+2
"git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files). This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject. * bw/ls-files-recurse-submodules: ls-files: add pathspec matching for submodules ls-files: pass through safe options for --recurse-submodules ls-files: optionally recurse into submodules git: make super-prefix option
2016-10-25sha1_file: rename git_open_noatime() to git_open()Libravatar Lars Schneider1-1/+1
This function is meant to be used when reading from files in the object store, and the original objective was to avoid smudging atime of loose object files too often, hence its name. Because we'll be extending its role in the next commit to also arrange the file descriptors they return auto-closed in the child processes, rename it to lose "noatime" part that is too specific. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17Merge branch 'jk/alt-odb-cleanup'Libravatar Junio C Hamano1-3/+33
Codepaths involved in interacting alternate object store have been cleaned up. * jk/alt-odb-cleanup: alternates: use fspathcmp to detect duplicates sha1_file: always allow relative paths to alternates count-objects: report alternates via verbose mode fill_sha1_file: write into a strbuf alternates: store scratch buffer as strbuf fill_sha1_file: write "boring" characters alternates: use a separate scratch space alternates: encapsulate alt->base munging alternates: provide helper for allocating alternate alternates: provide helper for adding to alternates list link_alt_odb_entry: refactor string handling link_alt_odb_entry: handle normalize_path errors t5613: clarify "too deep" recursion tests t5613: do not chdir in main process t5613: whitespace/style cleanups t5613: use test_must_fail t5613: drop test_valid_repo function t5613: drop reachable_via function
2016-10-17Merge branch 'jk/quarantine-received-objects'Libravatar Junio C Hamano1-0/+1
In order for the receiving end of "git push" to inspect the received history and decide to reject the push, the objects sent from the sending end need to be made available to the hook and the mechanism for the connectivity check, and this was done traditionally by storing the objects in the receiving repository and letting "git gc" to expire it. Instead, store the newly received objects in a temporary area, and make them available by reusing the alternate object store mechanism to them only while we decide if we accept the check, and once we decide, either migrate them to the repository or purge them immediately. * jk/quarantine-received-objects: tmp-objdir: do not migrate files starting with '.' tmp-objdir: put quarantine information in the environment receive-pack: quarantine objects until pre-receive accepts tmp-objdir: introduce API for temporary object directories check_connected: accept an env argument
2016-10-14fetch: use "quick" has_sha1_file for tag followingLibravatar Jeff King1-0/+1
When we auto-follow tags in a fetch, we look at all of the tags advertised by the remote and fetch ones where we don't already have the tag, but we do have the object it peels to. This involves a lot of calls to has_sha1_file(), some of which we can reasonably expect to fail. Since 45e8a74 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), this may cause many calls to reprepare_packed_git(), which is potentially expensive. This has gone unnoticed for several years because it requires a fairly unique setup to matter: 1. You need to have a lot of packs on the client side to make reprepare_packed_git() expensive (the most expensive part is finding duplicates in an unsorted list, which is currently quadratic). 2. You need a large number of tag refs on the server side that are candidates for auto-following (i.e., that the client doesn't have). Each one triggers a re-read of the pack directory. 3. Under normal circumstances, the client would auto-follow those tags and after one large fetch, (2) would no longer be true. But if those tags point to history which is disconnected from what the client otherwise fetches, then it will never auto-follow, and those candidates will impact it on every fetch. So when all three are true, each fetch pays an extra O(nr_tags * nr_packs^2) cost, mostly in string comparisons on the pack names. This was exacerbated by 47bf4b0 (prepare_packed_git_one: refactor duplicate-pack check, 2014-06-30) which uses a slightly more expensive string check, under the assumption that the duplicate check doesn't happen very often (and it shouldn't; the real problem here is how often we are calling reprepare_packed_git()). This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice accuracy for speed, in cases where we might be racy with a simultaneous repack. This is similar to the fix in 0eeb077 (index-pack: avoid excessive re-reading of pack directory, 2015-06-09). As with that case, it's OK for has_sha1_file() occasionally say "no I don't have it" when we do, because the worst case is not a corruption, but simply that we may fail to auto-follow a tag that points to it. Here are results from the included perf script, which sets up a situation similar to the one described above: Test HEAD^ HEAD ---------------------------------------------------------- 5550.4: fetch 11.21(10.42+0.78) 0.08(0.04+0.02) -99.3% Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>