summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)AuthorFilesLines
2017-03-13path.c: add xdg_cache_homeLibravatar Devin Lehmacher1-0/+7
We already have xdg_config_home to format paths relative to XDG_CONFIG_HOME. Let's provide a similar function xdg_cache_home to do the same for paths relative to XDG_CACHE_HOME. Signed-off-by: Devin Lehmacher <lehmacdj@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-12Merge branch 'js/realpath-pathdup-fix'Libravatar Junio C Hamano1-1/+1
Git v2.12 was shipped with an embarrassing breakage where various operations that verify paths given from the user stopped dying when seeing an issue, and instead later triggering segfault. * js/realpath-pathdup-fix: real_pathdup(): fix callsites that wanted it to die on error t1501: demonstrate NULL pointer access with invalid GIT_WORK_TREE
2017-03-10Merge branch 'rj/remove-unused-mktemp'Libravatar Junio C Hamano1-3/+0
Code cleanup. * rj/remove-unused-mktemp: wrapper.c: remove unused gitmkstemps() function wrapper.c: remove unused git_mkstemp() function
2017-03-10Merge branch 'jk/parse-config-key-cleanup'Libravatar Junio C Hamano1-1/+4
The "parse_config_key()" API function has been cleaned up. * jk/parse-config-key-cleanup: parse_hide_refs_config: tell parse_config_key we don't want a subsection parse_config_key: allow matching single-level config parse_config_key: use skip_prefix instead of starts_with
2017-03-08real_pathdup(): fix callsites that wanted it to die on errorLibravatar Johannes Schindelin1-1/+1
In 4ac9006f832 (real_path: have callers use real_pathdup and strbuf_realpath, 2016-12-12), we changed the xstrdup(real_path()) pattern to use real_pathdup() directly. The problem with this change is that real_path() calls strbuf_realpath() with die_on_error = 1 while real_pathdup() calls it with die_on_error = 0. Meaning that in cases where real_path() causes Git to die() with an error message, real_pathdup() is silent and returns NULL instead. The callers, however, are ill-prepared for that change, as they expect the return value to be non-NULL (and otherwise the function died with an appropriate error message). Fix this by extending real_pathdup()'s signature to accept the die_on_error flag and simply pass it through to strbuf_realpath(), and then adjust all callers after a careful audit whether they would handle NULLs well. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-28wrapper.c: remove unused git_mkstemp() functionLibravatar Ramsay Jones1-3/+0
The last caller of git_mkstemp() was removed in commit 6fec0a89 ("verify_signed_buffer: use tempfile object", 16-06-2016). Since the introduction of the 'tempfile' APIs, along with git_mkstemp_mode, it is unlikely that new callers will materialize. Remove the dead code. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-27Merge branch 'mh/ref-remove-empty-directory'Libravatar Junio C Hamano1-2/+46
Deletion of a branch "foo/bar" could remove .git/refs/heads/foo once there no longer is any other branch whose name begins with "foo/", but we didn't do so so far. Now we do. * mh/ref-remove-empty-directory: (23 commits) files_transaction_commit(): clean up empty directories try_remove_empty_parents(): teach to remove parents of reflogs, too try_remove_empty_parents(): don't trash argument contents try_remove_empty_parents(): rename parameter "name" -> "refname" delete_ref_loose(): inline function delete_ref_loose(): derive loose reference path from lock log_ref_write_1(): inline function log_ref_setup(): manage the name of the reflog file internally log_ref_write_1(): don't depend on logfile argument log_ref_setup(): pass the open file descriptor back to the caller log_ref_setup(): improve robustness against races log_ref_setup(): separate code for create vs non-create log_ref_write(): inline function rename_tmp_log(): improve error reporting rename_tmp_log(): use raceproof_create_file() lock_ref_sha1_basic(): use raceproof_create_file() lock_ref_sha1_basic(): inline constant raceproof_create_file(): new function safe_create_leading_directories(): set errno on SCLD_EXISTS safe_create_leading_directories_const(): preserve errno ...
2017-02-24parse_config_key: allow matching single-level configLibravatar Jeff King1-1/+4
The parse_config_key() function was introduced to make it easier to match "section.subsection.key" variables. It also handles the simpler "section.key", and the caller is responsible for distinguishing the two from its out-parameters. Most callers who _only_ want "section.key" would just use a strcmp(var, "section.key"), since there is no parsing required. However, they may still use parse_config_key() if their "section" variable isn't a constant (an example of this is in parse_hide_refs_config). Using the parse_config_key is a bit clunky, though: const char *subsection; int subsection_len; const char *key; if (!parse_config_key(var, section, &subsection, &subsection_len, &key) && !subsection) { /* matched! */ } Instead, let's treat a NULL subsection as an indication that the caller does not expect one. That lets us write: const char *key; if (!parse_config_key(var, section, NULL, NULL, &key)) { /* matched! */ } Existing callers should be unaffected, as passing a NULL subsection would currently segfault. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-03Merge branch 'cw/log-updates-for-all-refs-really'Libravatar Junio C Hamano1-1/+8
The "core.logAllRefUpdates" that used to be boolean has been enhanced to take 'always' as well, to record ref updates to refs other than the ones that are expected to be updated (i.e. branches, remote-tracking branches and notes). * cw/log-updates-for-all-refs-really: doc: add note about ignoring '--no-create-reflog' update-ref: add test cases for bare repository refs: add option core.logAllRefUpdates = always config: add markup to core.logAllRefUpdates doc
2017-02-03Merge branch 'sb/submodule-recursive-absorb'Libravatar Junio C Hamano1-1/+4
When a submodule "A", which has another submodule "B" nested within it, is "absorbed" into the top-level superproject, the inner submodule "B" used to be left in a strange state. The logic to adjust the .git pointers in these submodules has been corrected. * sb/submodule-recursive-absorb: submodule absorbing: fix worktree/gitdir pointers recursively for non-moves cache.h: expose the dying procedure for reading gitlinks setup: add gentle version of resolve_git_dir
2017-02-02Merge branch 'rs/absolute-pathdup'Libravatar Junio C Hamano1-0/+1
Code cleanup. * rs/absolute-pathdup: use absolute_pathdup() abspath: add absolute_pathdup()
2017-01-31Merge branch 'sb/in-core-index-doc' into maintLibravatar Junio C Hamano1-0/+32
Documentation and in-code comments updates. * sb/in-core-index-doc: documentation: retire unfinished documentation cache.h: document add_[file_]to_index cache.h: document remove_index_entry_at cache.h: document index_name_pos
2017-01-31Merge branch 'sb/in-core-index-doc'Libravatar Junio C Hamano1-0/+32
Documentation and in-code comments updates. * sb/in-core-index-doc: documentation: retire unfinished documentation cache.h: document add_[file_]to_index cache.h: document remove_index_entry_at cache.h: document index_name_pos
2017-01-31Merge branch 'jk/loose-object-fsck'Libravatar Junio C Hamano1-0/+13
"git fsck" inspects loose objects more carefully now. * jk/loose-object-fsck: fsck: detect trailing garbage in all object types fsck: parse loose object paths directly sha1_file: add read_loose_object() function t1450: test fsck of packed objects sha1_file: fix error message for alternate objects t1450: refactor loose-object removal
2017-01-31refs: add option core.logAllRefUpdates = alwaysLibravatar Cornelius Weig1-1/+8
When core.logallrefupdates is true, we only create a new reflog for refs that are under certain well-known hierarchies. The reason is that we know that some hierarchies (like refs/tags) are not meant to change, and that unknown hierarchies might not want reflogs at all (e.g., a hypothetical refs/foo might be meant to change often and drop old history immediately). However, sometimes it is useful to override this decision and simply log for all refs, because the safety and audit trail is more important than the performance implications of keeping the log around. This patch introduces a new "always" mode for the core.logallrefupdates option which will log updates to everything under refs/, regardless where in the hierarchy it is (we still will not log things like ORIG_HEAD and FETCH_HEAD, which are known to be transient). Based-on-patch-by: Jeff King <peff@peff.net> Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26abspath: add absolute_pathdup()Libravatar René Scharfe1-0/+1
Add a function that returns a buffer containing the absolute path of its argument and a semantic patch for its intended use. It avoids an extra string copy to a static buffer. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26cache.h: expose the dying procedure for reading gitlinksLibravatar Stefan Beller1-0/+1
In a later patch we want to react to only a subset of errors, defaulting the rest to die as usual. Separate the block that takes care of dying into its own function so we have easy access to it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-26setup: add gentle version of resolve_git_dirLibravatar Stefan Beller1-1/+3
This follows a93bedada (setup: add gentle version of read_gitfile, 2015-06-09), and assumes the same reasoning. resolve_git_dir is unsuited for speculative calls, so we want to use the gentle version to find out about potential errors. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-23Merge branch 'bw/read-blob-data-does-not-modify-index-state'Libravatar Junio C Hamano1-1/+1
Code clean-up. * bw/read-blob-data-does-not-modify-index-state: index: improve constness for reading blob data
2017-01-19cache.h: document add_[file_]to_indexLibravatar Stefan Beller1-0/+10
Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-19cache.h: document remove_index_entry_atLibravatar Stefan Beller1-0/+3
Do this by moving the existing documentation from read-cache.c to cache.h. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-19cache.h: document index_name_posLibravatar Stefan Beller1-0/+19
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-18Merge branch 'bw/pathspec-cleanup'Libravatar Junio C Hamano1-1/+0
Code clean-up in the pathspec API. * bw/pathspec-cleanup: pathspec: rename prefix_pathspec to init_pathspec_item pathspec: small readability changes pathspec: create strip submodule slash helpers pathspec: create parse_element_magic helper pathspec: create parse_long_magic function pathspec: create parse_short_magic function pathspec: factor global magic into its own function pathspec: simpler logic to prefix original pathspec elements pathspec: always show mnemonic and name in unsupported_magic pathspec: remove unused variable from unsupported_magic pathspec: copy and free owned memory pathspec: remove the deprecated get_pathspec function ls-tree: convert show_recursive to use the pathspec struct interface dir: convert fill_directory to use the pathspec struct interface dir: remove struct path_simplify mv: remove use of deprecated 'get_pathspec()'
2017-01-18Merge branch 'bw/grep-recurse-submodules'Libravatar Junio C Hamano1-0/+5
"git grep" has been taught to optionally recurse into submodules. * bw/grep-recurse-submodules: grep: search history of moved submodules grep: enable recurse-submodules to work on <tree> objects grep: optionally recurse into submodules grep: add submodules as a grep source type submodules: load gitmodules file from commit sha1 submodules: add helper to determine if a submodule is initialized submodules: add helper to determine if a submodule is populated real_path: canonicalize directory separators in root parts real_path: have callers use real_pathdup and strbuf_realpath real_path: create real_pathdup real_path: convert real_path_internal to strbuf_realpath real_path: resolve symlinks by hand
2017-01-15sha1_file: add read_loose_object() functionLibravatar Jeff King1-0/+13
It's surprisingly hard to ask the sha1_file code to open a _specific_ incarnation of a loose object. Most of the functions take a sha1, and loop over the various object types (packed versus loose) and locations (local versus alternates) at a low level. However, some tools like fsck need to look at a specific file. This patch gives them a function they can use to open the loose object at a given path. The implementation unfortunately ends up repeating bits of related functions, but there's not a good way around it without some major refactoring of the whole sha1_file stack. We need to mmap the specific file, then partially read the zlib stream to know whether we're streaming or not, and then finally either stream it or copy the data to a buffer. We can do that by assembling some of the more arcane internal sha1_file functions, but we end up having to essentially reimplement unpack_sha1_file(), along with the streaming bits of check_sha1_signature(). Still, most of the ugliness is contained in the new function, and the interface is clean enough that it may be reusable (though it seems unlikely anything but git-fsck would care about opening a specific file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-11index: improve constness for reading blob dataLibravatar Brandon Williams1-1/+1
Improve constness of the index_state parameter to the 'read_blob_data_from_index' function. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-10Merge branch 'jc/git-open-cloexec'Libravatar Junio C Hamano1-1/+2
The codeflow of setting NOATIME and CLOEXEC on file descriptors Git opens has been simplified. We may want to drop the tip one, but we'll see. * jc/git-open-cloexec: sha1_file: stop opening files with O_NOATIME git_open_cloexec(): use fcntl(2) w/ FD_CLOEXEC fallback git_open(): untangle possible NOATIME and CLOEXEC interactions
2017-01-08pathspec: remove the deprecated get_pathspec functionLibravatar Brandon Williams1-1/+0
Now that all callers of the old 'get_pathspec' interface have been migrated to use the new pathspec struct interface it can be removed from the codebase. Since there are no more users of the '_raw' field in the pathspec struct it can also be removed. This patch also removes the old functionality of modifying the const char **argv array that was passed into parse_pathspec. Instead the constructed 'match' string (which is a pathspec element with the prefix prepended) is only stored in its corresponding pathspec_item entry. Signed-off-by: Brandon Williams <bmwill@google.com> Reviewed-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-07raceproof_create_file(): new functionLibravatar Michael Haggerty1-0/+43
Add a function that tries to create a file and any containing directories in a way that is robust against races with other processes that might be cleaning up empty directories at the same time. The actual file creation is done by a callback function, which, if it fails, should set errno to EISDIR or ENOENT according to the convention of open(). raceproof_create_file() detects such failures, and respectively either tries to delete empty directories that might be in the way of the file or tries to create the containing directories. Then it retries the callback function. This function is not yet used. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-07safe_create_leading_directories(): set errno on SCLD_EXISTSLibravatar Michael Haggerty1-2/+3
The exit path for SCLD_EXISTS wasn't setting errno, which some callers use to generate error messages for the user. Fix the problem and document that the function sets errno correctly to help avoid similar regressions in the future. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-22submodules: load gitmodules file from commit sha1Libravatar Brandon Williams1-0/+2
teach submodules to load a '.gitmodules' file from a commit sha1. This enables the population of the submodule_cache to be based on the state of the '.gitmodules' file from a particular commit. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-12real_path: create real_pathdupLibravatar Brandon Williams1-0/+1
Create real_pathdup which returns a caller owned string of the resolved realpath based on the provide path. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-12real_path: convert real_path_internal to strbuf_realpathLibravatar Brandon Williams1-0/+2
Change the name of real_path_internal to strbuf_realpath. In addition push the static strbuf up to its callers and instead take as a parameter a pointer to a strbuf to use for the final result. This change makes strbuf_realpath reentrant. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-15compression: unify pack.compression configuration parsingLibravatar Junio C Hamano1-1/+1
There are three codepaths that use a variable whose name is pack_compression_level to affect how objects and deltas sent to a packfile is compressed. Unlike zlib_compression_level that controls the loose object compression, however, this variable was static to each of these codepaths. Two of them read the pack.compression configuration variable, using core.compression as the default, and one of them also allowed overriding it from the command line. The other codepath in bulk-checkin did not pay any attention to the configuration. Unify the configuration parsing to git_default_config(), where we implement the parsing of core.loosecompression and core.compression and make the former override the latter, by moving code to parse pack.compression and also allow core.compression to give default to this variable. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-02sha1_file: stop opening files with O_NOATIMELibravatar Junio C Hamano1-1/+1
When we open object files, we try to do so with O_NOATIME. This dates back to 144bde78e9 (Use O_NOATIME when opening the sha1 files., 2005-04-23), which is an optimization to avoid creating a bunch of dirty inodes when we're accessing many objects. But a few things have changed since then: 1. In June 2005, git learned about packfiles, which means we would do a lot fewer atime updates (rather than one per object access, we'd generally get one per packfile). 2. In late 2006, Linux learned about "relatime", which is generally the default on modern installs. So performance around atimes updates is a non-issue there these days. All the world isn't Linux, but as it turns out, Linux is the only platform to implement O_NOATIME in the first place. So it's very unlikely that this code is helping anybody these days. Helped-by: Jeff King <peff@peff.net> [jc: took idea and log message from peff] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-31Merge branch 'ls/git-open-cloexec'Libravatar Junio C Hamano1-1/+1
Git generally does not explicitly close file descriptors that were open in the parent process when spawning a child process, but most of the time the child does not want to access them. As Windows does not allow removing or renaming a file that has a file descriptor open, a slow-to-exit child can even break the parent process by holding onto them. Use O_CLOEXEC flag to open files in various codepaths. * ls/git-open-cloexec: read-cache: make sure file handles are not inherited by child processes sha1_file: open window into packfiles with O_CLOEXEC sha1_file: rename git_open_noatime() to git_open()
2016-10-28Merge branch 'jk/fetch-quick-tag-following' into maintLibravatar Junio C Hamano1-0/+1
When fetching from a remote that has many tags that are irrelevant to branches we are following, we used to waste way too many cycles when checking if the object pointed at by a tag (that we are not going to fetch!) exists in our repository too carefully. * jk/fetch-quick-tag-following: fetch: use "quick" has_sha1_file for tag following
2016-10-28git_open(): untangle possible NOATIME and CLOEXEC interactionsLibravatar Junio C Hamano1-0/+1
The way we structured the fallback/retry mechanism for opening with O_NOATIME and O_CLOEXEC meant that if we failed due to lack of support to open the file with O_NOATIME option (i.e. EINVAL), we would still try to drop O_CLOEXEC first and retry, and then drop O_NOATIME. A platform on which O_NOATIME is defined in the header without support from the kernel wouldn't have a chance to open with O_CLOEXEC option due to this code structure. Arguably, O_CLOEXEC is more important than O_NOATIME, as the latter is mostly about performance, while the former can affect correctness. Instead use O_CLOEXEC to open the file, and then use fcntl(2) to set O_NOATIME on the resulting file descriptor. open(2) itself does not cause atime to be updated according to Linus [*1*]. The helper to do the former can be usable in the codepath in ce_compare_data() that was recently added to open a file descriptor with O_CLOEXEC; use it while we are at it. *1* <CA+55aFw83E+zOd+z5h-CA-3NhrLjVr-anL6pubrSWttYx3zu8g@mail.gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-27Merge branch 'jk/no-looking-at-dotgit-outside-repo'Libravatar Junio C Hamano1-2/+2
Update "git diff --no-index" codepath not to try to peek into .git/ directory that happens to be under the current directory, when we know we are operating outside any repository. * jk/no-looking-at-dotgit-outside-repo: diff: handle sha1 abbreviations outside of repository diff_aligned_abbrev: use "struct oid" diff_unique_abbrev: rename to diff_aligned_abbrev find_unique_abbrev: use 4-buffer ring test-*-cache-tree: setup git dir read info/{attributes,exclude} only when in repository
2016-10-27Merge branch 'jk/abbrev-auto'Libravatar Junio C Hamano1-1/+6
Updates the way approximate count of total objects is computed while attempting to come up with a unique abbreviated object name, which in turn needs to estimate how many hexdigits are necessary to ensure uniqueness. * jk/abbrev-auto: find_unique_abbrev: move logic out of get_short_sha1()
2016-10-27Merge branch 'lt/abbrev-auto'Libravatar Junio C Hamano1-0/+4
Allow the default abbreviation length, which has historically been 7, to scale as the repository grows. The logic suggests to use 12 hexdigits for the Linux kernel, and 9 to 10 for Git itself. * lt/abbrev-auto: abbrev: auto size the default abbreviation abbrev: prepare for new world order abbrev: add FALLBACK_DEFAULT_ABBREV to prepare for auto sizing
2016-10-26find_unique_abbrev: use 4-buffer ringLibravatar Jeff King1-2/+2
Some code paths want to format multiple abbreviated sha1s in the same output line. Because we use a single static buffer for our return value, they have to either break their output into several calls or allocate their own arrays and use find_unique_abbrev_r(). Intead, let's mimic sha1_to_hex() and use a ring of several buffers, so that the return value stays valid through multiple calls. This shortens some of the callers, and makes it harder to for them to make a silly mistake. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-26Merge branch 'jk/fetch-quick-tag-following'Libravatar Junio C Hamano1-0/+1
When fetching from a remote that has many tags that are irrelevant to branches we are following, we used to waste way too many cycles when checking if the object pointed at by a tag (that we are not going to fetch!) exists in our repository too carefully. * jk/fetch-quick-tag-following: fetch: use "quick" has_sha1_file for tag following
2016-10-26Merge branch 'bw/ls-files-recurse-submodules'Libravatar Junio C Hamano1-0/+2
"git ls-files" learned "--recurse-submodules" option that can be used to get a listing of tracked files across submodules (i.e. this only works with "--cached" option, not for listing untracked or ignored files). This would be a useful tool to sit on the upstream side of a pipe that is read with xargs to work on all working tree files from the top-level superproject. * bw/ls-files-recurse-submodules: ls-files: add pathspec matching for submodules ls-files: pass through safe options for --recurse-submodules ls-files: optionally recurse into submodules git: make super-prefix option
2016-10-25sha1_file: rename git_open_noatime() to git_open()Libravatar Lars Schneider1-1/+1
This function is meant to be used when reading from files in the object store, and the original objective was to avoid smudging atime of loose object files too often, hence its name. Because we'll be extending its role in the next commit to also arrange the file descriptors they return auto-closed in the child processes, rename it to lose "noatime" part that is too specific. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-17Merge branch 'jk/alt-odb-cleanup'Libravatar Junio C Hamano1-3/+33
Codepaths involved in interacting alternate object store have been cleaned up. * jk/alt-odb-cleanup: alternates: use fspathcmp to detect duplicates sha1_file: always allow relative paths to alternates count-objects: report alternates via verbose mode fill_sha1_file: write into a strbuf alternates: store scratch buffer as strbuf fill_sha1_file: write "boring" characters alternates: use a separate scratch space alternates: encapsulate alt->base munging alternates: provide helper for allocating alternate alternates: provide helper for adding to alternates list link_alt_odb_entry: refactor string handling link_alt_odb_entry: handle normalize_path errors t5613: clarify "too deep" recursion tests t5613: do not chdir in main process t5613: whitespace/style cleanups t5613: use test_must_fail t5613: drop test_valid_repo function t5613: drop reachable_via function
2016-10-17Merge branch 'jk/quarantine-received-objects'Libravatar Junio C Hamano1-0/+1
In order for the receiving end of "git push" to inspect the received history and decide to reject the push, the objects sent from the sending end need to be made available to the hook and the mechanism for the connectivity check, and this was done traditionally by storing the objects in the receiving repository and letting "git gc" to expire it. Instead, store the newly received objects in a temporary area, and make them available by reusing the alternate object store mechanism to them only while we decide if we accept the check, and once we decide, either migrate them to the repository or purge them immediately. * jk/quarantine-received-objects: tmp-objdir: do not migrate files starting with '.' tmp-objdir: put quarantine information in the environment receive-pack: quarantine objects until pre-receive accepts tmp-objdir: introduce API for temporary object directories check_connected: accept an env argument
2016-10-14fetch: use "quick" has_sha1_file for tag followingLibravatar Jeff King1-0/+1
When we auto-follow tags in a fetch, we look at all of the tags advertised by the remote and fetch ones where we don't already have the tag, but we do have the object it peels to. This involves a lot of calls to has_sha1_file(), some of which we can reasonably expect to fail. Since 45e8a74 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), this may cause many calls to reprepare_packed_git(), which is potentially expensive. This has gone unnoticed for several years because it requires a fairly unique setup to matter: 1. You need to have a lot of packs on the client side to make reprepare_packed_git() expensive (the most expensive part is finding duplicates in an unsorted list, which is currently quadratic). 2. You need a large number of tag refs on the server side that are candidates for auto-following (i.e., that the client doesn't have). Each one triggers a re-read of the pack directory. 3. Under normal circumstances, the client would auto-follow those tags and after one large fetch, (2) would no longer be true. But if those tags point to history which is disconnected from what the client otherwise fetches, then it will never auto-follow, and those candidates will impact it on every fetch. So when all three are true, each fetch pays an extra O(nr_tags * nr_packs^2) cost, mostly in string comparisons on the pack names. This was exacerbated by 47bf4b0 (prepare_packed_git_one: refactor duplicate-pack check, 2014-06-30) which uses a slightly more expensive string check, under the assumption that the duplicate check doesn't happen very often (and it shouldn't; the real problem here is how often we are calling reprepare_packed_git()). This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice accuracy for speed, in cases where we might be racy with a simultaneous repack. This is similar to the fix in 0eeb077 (index-pack: avoid excessive re-reading of pack directory, 2015-06-09). As with that case, it's OK for has_sha1_file() occasionally say "no I don't have it" when we do, because the worst case is not a corruption, but simply that we may fail to auto-follow a tag that points to it. Here are results from the included perf script, which sets up a situation similar to the one described above: Test HEAD^ HEAD ---------------------------------------------------------- 5550.4: fetch 11.21(10.42+0.78) 0.08(0.04+0.02) -99.3% Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10Merge branch 'jk/pack-objects-optim-mru'Libravatar Junio C Hamano1-0/+8
"git pack-objects" in a repository with many packfiles used to spend a lot of time looking for/at objects in them; the accesses to the packfiles are now optimized by checking the most-recently-used packfile first. * jk/pack-objects-optim-mru: pack-objects: use mru list when iterating over packs pack-objects: break delta cycles before delta-search phase sha1_file: make packed_object_info public provide an initializer for "struct object_info"
2016-10-10tmp-objdir: put quarantine information in the environmentLibravatar Jeff King1-0/+1
The presence of the GIT_QUARANTINE_PATH variable lets any called programs know that they're operating in a temporary object directory (and where that directory is). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>