summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)AuthorFilesLines
2015-07-27Merge branch 'jk/index-pack-reduce-recheck' into maintLibravatar Junio C Hamano1-1/+10
Disable "have we lost a race with competing repack?" check while receiving a huge object transfer that runs index-pack. * jk/index-pack-reduce-recheck: index-pack: avoid excessive re-reading of pack directory
2015-06-09index-pack: avoid excessive re-reading of pack directoryLibravatar Jeff King1-1/+10
Since 45e8a74 (has_sha1_file: re-check pack directory before giving up, 2013-08-30), we spend extra effort for has_sha1_file to give the right answer when somebody else is repacking. Usually this effort does not matter, because after finding that the object does not exist, the next step is usually to die(). However, some code paths make a large number of has_sha1_file checks which are _not_ expected to return 1. The collision test in index-pack.c is such a case. On a local system, this can cause a performance slowdown of around 5%. But on a system with high-latency system calls (like NFS), it can be much worse. This patch introduces a "quick" flag to has_sha1_file which callers can use when they would prefer high performance at the cost of false negatives during repacks. There may be other code paths that can use this, but the index-pack one is the most obviously critical, so we'll start with switching that one. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-05Merge branch 'pt/xdg-config-path' into maintLibravatar Junio C Hamano1-1/+7
Code clean-up for xdg configuration path support. * pt/xdg-config-path: path.c: remove home_config_paths() git-config: replace use of home_config_paths() git-commit: replace use of home_config_paths() credential-store.c: replace home_config_paths() with xdg_config_home() dir.c: replace home_config_paths() with xdg_config_home() attr.c: replace home_config_paths() with xdg_config_home() path.c: implement xdg_config_home() t0302: "unreadable" test needs POSIXPERM t0302: test credential-store support for XDG_CONFIG_HOME git-credential-store: support XDG_CONFIG_HOME git-credential-store: support multiple credential files
2015-05-26Merge branch 'jc/hash-object' into maintLibravatar Junio C Hamano1-0/+1
"hash-object --literally" introduced in v2.2 was not prepared to take a really long object type name. * jc/hash-object: write_sha1_file(): do not use a separate sha1[] array t1007: add hash-object --literally tests hash-object --literally: fix buffer overrun with extra-long object type git-hash-object.txt: document --literally option
2015-05-13Merge branch 'jk/prune-mtime' into maintLibravatar Junio C Hamano1-3/+6
Access to objects in repositories that borrow from another one on a slow NFS server unnecessarily got more expensive due to recent code becoming more cautious in a naive way not to lose objects to pruning. * jk/prune-mtime: sha1_file: only freshen packs once per run sha1_file: freshen pack objects before loose reachable: only mark local objects as recent
2015-05-06path.c: remove home_config_paths()Libravatar Paul Tan1-1/+0
home_config_paths() combines distinct functionality already implemented by expand_user_path() and xdg_config_home(), and it also hard-codes the path ~/.gitconfig, which makes it unsuitable to use for other home config file paths. Since its use will just add unnecessary complexity to the code, remove it. Signed-off-by: Paul Tan <pyokagan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-06path.c: implement xdg_config_home()Libravatar Paul Tan1-0/+7
The XDG base dir spec[1] specifies that configuration files be stored in a subdirectory in $XDG_CONFIG_HOME. To construct such a configuration file path, home_config_paths() can be used. However, home_config_paths() combines distinct functionality: 1. Retrieve the home git config file path ~/.gitconfig 2. Construct the XDG config path of the file specified by `file`. This function was introduced in commit 21cf3227 ("read (but not write) from $XDG_CONFIG_HOME/git/config file"). While the intention of the function was to allow the home directory configuration file path and the xdg directory configuration file path to be retrieved with one function call, the hard-coding of the path ~/.gitconfig prevents it from being used for other configuration files. Furthermore, retrieving a file path relative to the user's home directory can be done with expand_user_path(). Hence, it can be seen that home_config_paths() introduces unnecessary complexity, especially if a user just wants to retrieve the xdg config file path. As such, implement a simpler function xdg_config_home() for constructing the XDG base dir spec configuration file path. This function, together with expand_user_path(), can replace all uses of home_config_paths(). [1] http://standards.freedesktop.org/basedir-spec/basedir-spec-0.7.html Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Paul Tan <pyokagan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05hash-object --literally: fix buffer overrun with extra-long object typeLibravatar Eric Sunshine1-0/+1
"hash-object" learned in 5ba9a93 (hash-object: add --literally option, 2014-09-11) to allow crafting a corrupt/broken object of unknown type. When the user-provided type is particularly long, however, it can overflow the relatively small stack-based character array handed to write_sha1_file_prepare() by hash_sha1_file() and write_sha1_file(), leading to stack corruption (and crash). Introduce a custom helper to allow arbitrarily long typenames just for "hash-object --literally". [jc: Eric's original used a strbuf in the more common codepaths, and I rewrote it to avoid penalizing the non-literally code. Bugs are mine] Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-20sha1_file: only freshen packs once per runLibravatar Jeff King1-0/+1
Since 33d4221 (write_sha1_file: freshen existing objects, 2014-10-15), we update the mtime of existing objects that we would have written out (had they not existed). For the common case in which many objects are packed, we may update the mtime on a single packfile repeatedly. This can result in a noticeable performance problem if calling utime() is expensive (e.g., because your storage is on NFS). We can fix this by keeping a per-pack flag that lets us freshen only once per program invocation. An alternative would be to keep the packed_git.mtime flag up to date as we freshen, and freshen only once every N seconds. In practice, it's not worth the complexity. We are racing against prune expiration times here, which inherently must be set to accomodate reasonable program running times (because they really care about the time between an object being written and it becoming referenced, and the latter is typically the last step a program takes). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-20reachable: only mark local objects as recentLibravatar Jeff King1-3/+5
When pruning and repacking a repository that has an alternate object store configured, we may traverse a large number of objects in the alternate. This serves no purpose, and may be expensive to do. A longer explanation is below. Commits d3038d2 and abcb865 taught prune and pack-objects (respectively) to treat "recent" objects as tips for reachability, so that we keep whole chunks of history. They built on the object traversal in 660c889 (sha1_file: add for_each iterators for loose and packed objects, 2014-10-15), which covers both local and alternate objects. In both cases, covering alternate objects is unnecessary, as both commands can only drop objects from the local repository. In the case of prune, we traverse only the local object directory. And in the case of repacking, while we may or may not include local objects in our pack, we will never reach into the alternate with "repack -d". The "-l" option is only a question of whether we are migrating objects from the alternate into our repository, or leaving them untouched. It is possible that we may drop an object that is depended upon by another object in the alternate. For example, imagine two repositories, A and B, with A pointing to B as an alternate. Now imagine a commit that is in B which references a tree that is only in A. Traversing from recent objects in B might prevent A from dropping that tree. But this case isn't worth covering. Repo B should take responsibility for its own objects. It would never have had the commit in the first place if it did not also have the tree, and assuming it is using the same "keep recent chunks of history" scheme, then it would itself keep the tree, as well. So checking the alternate objects is not worth doing, and come with a significant performance impact. In both cases, we skip any recent objects that have already been marked SEEN (i.e., that we know are already reachable for prune, or included in the pack for a repack). So there is a slight waste of time in opening the alternate packs at all, only to notice that we have already considered each object. But much worse, the alternate repository may have a large number of objects that are not reachable from the local repository at all, and we end up adding them to the traversal. We can fix this by considering only local unseen objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-26Merge branch 'jc/report-path-error-to-dir'Libravatar Junio C Hamano1-1/+0
Code clean-up. * jc/report-path-error-to-dir: report_path_error(): move to dir.c
2015-03-25Merge branch 'jk/prune-with-corrupt-refs'Libravatar Junio C Hamano1-0/+8
"git prune" used to largely ignore broken refs when deciding which objects are still being used, which could spread an existing small damage and make it a larger one. * jk/prune-with-corrupt-refs: refs.c: drop curate_packed_refs repack: turn on "ref paranoia" when doing a destructive repack prune: turn on ref_paranoia flag refs: introduce a "ref paranoia" flag t5312: test object deletion code paths in a corrupted repository
2015-03-24report_path_error(): move to dir.cLibravatar Junio C Hamano1-1/+0
The expected call sequence is for the caller to use match_pathspec() repeatedly on a set of pathspecs, accumulating the "hits" in a separate array, and then call this function to diagnose a pathspec that never matched anything, as that can indicate a typo from the command line, e.g. "git commit Maekfile". Many builtin commands use this function from builtin/ls-files.c, which is not a very healthy arrangement. ls-files might have been the first command to feel the need for such a helper, but the need is shared by everybody who uses the "match and then report" pattern. Move it to dir.c where match_pathspec() is defined. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-20refs: introduce a "ref paranoia" flagLibravatar Jeff King1-0/+8
Most operations that iterate over refs are happy to ignore broken cruft. However, some operations should be performed with knowledge of these broken refs, because it is better for the operation to choke on a missing object than it is to silently pretend that the ref did not exist (e.g., if we are computing the set of reachable tips in order to prune objects). These processes could just call for_each_rawref, except that ref iteration is often hidden behind other interfaces. For instance, for a destructive "repack -ad", we would have to inform "pack-objects" that we are destructive, and then it would in turn have to tell the revision code that our "--all" should include broken refs. It's much simpler to just set a global for "dangerous" operations that includes broken refs in all iterations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05Merge branch 'jk/prune-mtime' into maintLibravatar Junio C Hamano1-0/+9
In v2.2.0, we broke "git prune" that runs in a repository that borrows from an alternate object store. * jk/prune-mtime: sha1_file: fix iterating loose alternate objects for_each_loose_file_in_objdir: take an optional strbuf path
2015-03-05Merge branch 'jk/decimal-width-for-uintmax' into maintLibravatar Junio C Hamano1-1/+1
We didn't format an integer that wouldn't fit in "int" but in "uintmax_t" correctly. * jk/decimal-width-for-uintmax: decimal_width: avoid integer overflow
2015-03-05Merge branch 'mh/refs-have-new'Libravatar Junio C Hamano1-1/+1
Simplify the ref transaction API around how "the ref should be pointing at this object" is specified. * mh/refs-have-new: refs.h: remove duplication in function docstrings update_ref(): improve documentation ref_transaction_verify(): new function to check a reference's value ref_transaction_delete(): check that old_sha1 is not null_sha1 ref_transaction_create(): check that new_sha1 is valid commit: avoid race when creating orphan commits commit: add tests of commit races ref_transaction_delete(): remove "have_old" parameter ref_transaction_update(): remove "have_old" parameter struct ref_update: move "have_old" into "flags" refs.c: change some "flags" to "unsigned int" refs: remove the gap in the REF_* constant values refs: move REF_DELETING to refs.c
2015-02-22Merge branch 'jk/prune-mtime'Libravatar Junio C Hamano1-0/+9
In v2.2.0, we broke "git prune" that runs in a repository that borrows from an alternate object store. * jk/prune-mtime: sha1_file: fix iterating loose alternate objects for_each_loose_file_in_objdir: take an optional strbuf path
2015-02-18Merge branch 'jk/decimal-width-for-uintmax'Libravatar Junio C Hamano1-1/+1
We didn't format an integer that wouldn't fit in "int" but in "uintmax_t" correctly. * jk/decimal-width-for-uintmax: decimal_width: avoid integer overflow
2015-02-17refs.c: change some "flags" to "unsigned int"Libravatar Michael Haggerty1-1/+1
Change the following functions' "flags" arguments from "int" to "unsigned int": * ref_transaction_update() * ref_transaction_create() * ref_transaction_delete() * update_ref() * delete_ref() * lock_ref_sha1_basic() Also change the "flags" member in "struct ref_update" to unsigned. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-09for_each_loose_file_in_objdir: take an optional strbuf pathLibravatar Jeff King1-0/+9
We feed a root "objdir" path to this iterator function, which then copies the result into a strbuf, so that it can repeatedly append the object sub-directories to it. Let's make it easy for callers to just pass us a strbuf in the first place. We leave the original interface as a convenience for callers who want to just pass a const string like the result of get_object_directory(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-05decimal_width: avoid integer overflowLibravatar Jeff King1-1/+1
The decimal_width function originally appeared in blame.c as "lineno_width", and was designed for calculating the print-width of small-ish integer values (line numbers in text files). In ec7ff5b, it was made into a reusable function, and in dc801e7, we started using it to align diffstats. Binary files in a diffstat show byte counts rather than line numbers, meaning they can be quite large (e.g., consider adding or removing a 2GB file). decimal_width is not up to the challenge for two reasons: 1. It takes the value as an "int", whereas large files may easily surpass this. The value may be truncated, in which case we will produce an incorrect value. 2. It counts "up" by repeatedly multiplying another integer by 10 until it surpasses the value. This can cause an infinite loop when the value is close to the largest representable integer. For example, consider using a 32-bit signed integer, and a value of 2,140,000,000 (just shy of 2^31-1). We will count up and eventually see that 1,000,000,000 is smaller than our value. The next step would be to multiply by 10 and see that 10,000,000,000 is too large, ending the loop. But we can't represent that value, and we have signed overflow. This is technically undefined behavior, but a common behavior is to lose the high bits, in which case our iterator will certainly be less than the number. So we'll keep multiplying, overflow again, and so on. This patch changes the argument to a uintmax_t (the same type we use to store the diffstat information for binary filese), and counts "down" by repeatedly dividing our value by 10. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'dm/compat-s-ifmt-for-zos'Libravatar Junio C Hamano1-7/+0
Long overdue departure from the assumption that S_IFMT is shared by everybody made in 2005. * dm/compat-s-ifmt-for-zos: compat: convert modes to use portable file type values
2014-12-17Sync with v2.1.4Libravatar Junio C Hamano1-0/+3
* maint-2.1: Git 2.1.4 Git 2.0.5 Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v2.0.5Libravatar Junio C Hamano1-0/+3
* maint-2.0: Git 2.0.5 Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v1.9.5Libravatar Junio C Hamano1-0/+3
* maint-1.9: Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v1.8.5.6Libravatar Junio C Hamano1-0/+3
* maint-1.8.5: Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17read-cache: optionally disallow NTFS .git variantsLibravatar Johannes Schindelin1-0/+1
The point of disallowing ".git" in the index is that we would never want to accidentally overwrite files in the repository directory. But this means we need to respect the filesystem's idea of when two paths are equal. The prior commit added a helper to make such a comparison for NTFS and FAT32; let's use it in verify_path(). We make this check optional for two reasons: 1. It restricts the set of allowable filenames, which is unnecessary for people who are not on NTFS nor FAT32. In practice this probably doesn't matter, though, as the restricted names are rather obscure and almost certainly would never come up in practice. 2. It has a minor performance penalty for every path we insert into the index. This patch ties the check to the core.protectNTFS config option. Though this is expected to be most useful on Windows, we allow it to be set everywhere, as NTFS may be mounted on other platforms. The variable does default to on for Windows, though. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17path: add is_ntfs_dotgit() helperLibravatar Johannes Schindelin1-0/+1
We do not allow paths with a ".git" component to be added to the index, as that would mean repository contents could overwrite our repository files. However, asking "is this path the same as .git" is not as simple as strcmp() on some filesystems. On NTFS (and FAT32), there exist so-called "short names" for backwards-compatibility: 8.3 compliant names that refer to the same files as their long names. As ".git" is not an 8.3 compliant name, a short name is generated automatically, typically "git~1". Depending on the Windows version, any combination of trailing spaces and periods are ignored, too, so that both "git~1." and ".git." still refer to the Git directory. The reason is that 8.3 stores file names shorter than 8 characters with trailing spaces. So literally, it does not matter for the short name whether it is padded with spaces or whether it is shorter than 8 characters, it is considered to be the exact same. The period is the separator between file name and file extension, and again, an empty extension consists just of spaces in 8.3 format. So technically, we would need only take care of the equivalent of this regex: (\.git {0,4}|git~1 {0,3})\. {0,3} However, there are indications that at least some Windows versions might be more lenient and accept arbitrary combinations of trailing spaces and periods and strip them out. So we're playing it real safe here. Besides, there can be little doubt about the intention behind using file names matching even the more lenient pattern specified above, therefore we should be fine with disallowing such patterns. Extra care is taken to catch names such as '.\\.git\\booh' because the backslash is marked as a directory separator only on Windows, and we want to use this new helper function also in fsck on other platforms. A big thank you goes to Ed Thomson and an unnamed Microsoft engineer for the detailed analysis performed to come up with the corresponding fixes for libgit2. This commit adds a function to detect whether a given file name can refer to the Git directory by mistake. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17read-cache: optionally disallow HFS+ .git variantsLibravatar Jeff King1-0/+1
The point of disallowing ".git" in the index is that we would never want to accidentally overwrite files in the repository directory. But this means we need to respect the filesystem's idea of when two paths are equal. The prior commit added a helper to make such a comparison for HFS+; let's use it in verify_path. We make this check optional for two reasons: 1. It restricts the set of allowable filenames, which is unnecessary for people who are not on HFS+. In practice this probably doesn't matter, though, as the restricted names are rather obscure and almost certainly would never come up in practice. 2. It has a minor performance penalty for every path we insert into the index. This patch ties the check to the core.protectHFS config option. Though this is expected to be most useful on OS X, we allow it to be set everywhere, as HFS+ may be mounted on other platforms. The variable does default to on for OS X, though. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-04compat: convert modes to use portable file type valuesLibravatar David Michael1-7/+0
This adds simple wrapper functions around calls to stat(), fstat(), and lstat() that translate the operating system's native file type bits to those used by most operating systems. It also rewrites the S_IF* macros to the common values, so all file type processing is performed using the translated modes. This makes projects portable across operating systems that use different file type definitions. Only the file type bits may be affected by these compatibility functions; the file permission bits are assumed to be 07777 and are passed through unchanged. Signed-off-by: David Michael <fedora.dm0@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-29Merge branch 'jk/prune-mtime'Libravatar Junio C Hamano1-1/+45
Tighten the logic to decide that an unreachable cruft is sufficiently old by covering corner cases such as an ancient object becoming reachable and then going unreachable again, in which case its retention period should be prolonged. * jk/prune-mtime: (28 commits) drop add_object_array_with_mode revision: remove definition of unused 'add_object' function pack-objects: double-check options before discarding objects repack: pack objects mentioned by the index pack-objects: use argv_array reachable: use revision machinery's --indexed-objects code rev-list: add --indexed-objects option rev-list: document --reflog option t5516: test pushing a tag of an otherwise unreferenced blob traverse_commit_list: support pending blobs/trees with paths make add_object_array_with_context interface more sane write_sha1_file: freshen existing objects pack-objects: match prune logic for discarding objects pack-objects: refactor unpack-unreachable expiration check prune: keep objects reachable from recent objects sha1_file: add for_each iterators for loose and packed objects count-objects: use for_each_loose_file_in_objdir count-objects: do not use xsize_t when counting object size prune-packed: use for_each_loose_file_in_objdir reachable: mark index blobs as SEEN ...
2014-10-16sha1_file: add for_each iterators for loose and packed objectsLibravatar Jeff King1-0/+11
We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16prune: factor out loose-object directory traversalLibravatar Jeff King1-0/+33
Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex digits. It executed a "break" from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16foreach_alt_odb: propagate return value from callbackLibravatar Jeff King1-1/+1
We check the return value of the callback and stop iterating if it is non-zero. However, we do not make the non-zero return value available to the caller, so they have no way of knowing whether the operation succeeded or not (technically they can keep their own error flag in the callback data, but that is unlike our other for_each functions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15refs.c: allow listing and deleting badly named refsLibravatar Ronnie Sahlberg1-2/+15
We currently do not handle badly named refs well: $ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\. $ git branch fatal: Reference has invalid format: 'refs/heads/master.....@*@\.' $ git branch -D master.....@\*@\\. error: branch 'master.....@*@\.' not found. Users cannot recover from a badly named ref without manually finding and deleting the loose ref file or appropriate line in packed-refs. Making that easier will make it easier to tweak the ref naming rules in the future, for example to forbid shell metacharacters like '`' and '"', without putting people in a state that is hard to get out of. So allow "branch --list" to show these refs and allow "branch -d/-D" and "update-ref -d" to delete them. Other commands (for example to rename refs) will continue to not handle these refs but can be changed in later patches. Details: In resolving functions, refuse to resolve refs that don't pass the git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to resolve refs that escape the refs/ directory and do not match the pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD"). In locking functions, refuse to act on badly named refs unless they are being deleted and either are in the refs/ directory or match [A-Z_]*. Just like other invalid refs, flag resolved, badly named refs with the REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them in all iteration functions except for for_each_rawref. Flag badly named refs (but not symrefs pointing to badly named refs) with a REF_BAD_NAME flag to make it easier for future callers to notice and handle them specially. For example, in a later patch for-each-ref will use this flag to detect refs whose names can confuse callers parsing for-each-ref output. In the transaction API, refuse to create or update badly named refs, but allow deleting them (unless they try to escape refs/ and don't match [A-Z_]*). Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15branch -d: avoid repeated symref resolutionLibravatar Jonathan Nieder1-0/+6
If a repository gets in a broken state with too much symref nesting, it cannot be repaired with "git branch -d": $ git symbolic-ref refs/heads/nonsense refs/heads/nonsense $ git branch -d nonsense error: branch 'nonsense' not found. Worse, "git update-ref --no-deref -d" doesn't work for such repairs either: $ git update-ref -d refs/heads/nonsense error: unable to resolve reference refs/heads/nonsense: Too many levels of symbolic links Fix both by teaching resolve_ref_unsafe a new RESOLVE_REF_NO_RECURSE flag and passing it when appropriate. Callers can still read the value of a symref (for example to print a message about it) with that flag set --- resolve_ref_unsafe will resolve one level of symrefs and stop there. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15refs.c: change resolve_ref_unsafe reading argument to be a flags fieldLibravatar Ronnie Sahlberg1-11/+12
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref resolves successfully for writing but not for reading). Change this to be a flags field instead, and pass the new constant RESOLVE_REF_READING when we want this behaviour. While at it, swap two of the arguments in the function to put output arguments at the end. As a nice side effect, this ensures that we can catch callers that were unaware of the new API so they can be audited. Give the wrapper functions resolve_refdup and read_ref_full the same treatment for consistency. Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-14Merge branch 'mh/lockfile'Libravatar Junio C Hamano1-19/+1
The lockfile API and its users have been cleaned up. * mh/lockfile: (38 commits) lockfile.h: extract new header file for the functions in lockfile.c hold_locked_index(): move from lockfile.c to read-cache.c hold_lock_file_for_append(): restore errno before returning get_locked_file_path(): new function lockfile.c: rename static functions lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF commit_lock_file_to(): refactor a helper out of commit_lock_file() trim_last_path_component(): replace last_path_elm() resolve_symlink(): take a strbuf parameter resolve_symlink(): use a strbuf for internal scratch space lockfile: change lock_file::filename into a strbuf commit_lock_file(): use a strbuf to manage temporary space try_merge_strategy(): use a statically-allocated lock_file object try_merge_strategy(): remove redundant lock_file allocation struct lock_file: declare some fields volatile lockfile: avoid transitory invalid states git_config_set_multivar_in_file(): avoid call to rollback_lock_file() dump_marks(): remove a redundant call to rollback_lock_file() api-lockfile: document edge cases commit_lock_file(): rollback lock file on failure to rename ...
2014-10-08Merge branch 'sp/stream-clean-filter'Libravatar Junio C Hamano1-0/+1
When running a required clean filter, we do not have to mmap the original before feeding the filter. Instead, stream the file contents directly to the filter and process its output. * sp/stream-clean-filter: sha1_file: don't convert off_t to size_t too early to avoid potential die() convert: stream from fd to required clean filter to reduce used address space copy_fd(): do not close the input file descriptor mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size memory_limit: use git_env_ulong() to parse GIT_ALLOC_LIMIT config.c: add git_env_ulong() to parse environment variable convert: drop arguments other than 'path' from would_convert_to_git()
2014-10-01lockfile.h: extract new header file for the functions in lockfile.cLibravatar Michael Haggerty1-26/+1
Move the interface declaration for the functions in lockfile.c from cache.h to a new file, lockfile.h. Add #includes where necessary (and remove some redundant includes of cache.h by files that already include builtin.h). Move the documentation of the lock_file state diagram from lockfile.c to the new header file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01get_locked_file_path(): new functionLibravatar Michael Haggerty1-0/+1
Add a function to return the path of the file that is locked by a lock_file object. This reduces the knowledge that callers have to have about the lock_file layout. Suggested-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01lockfile: rename LOCK_NODEREF to LOCK_NO_DEREFLibravatar Michael Haggerty1-1/+1
This makes it harder to misread the name as LOCK_NODE_REF. Suggested-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01commit_lock_file_to(): refactor a helper out of commit_lock_file()Libravatar Michael Haggerty1-0/+1
commit_locked_index(), when writing to an alternate index file, duplicates (poorly) the code in commit_lock_file(). And anyway, it shouldn't have to know so much about the internal workings of lockfile objects. So extract a new function commit_lock_file_to() that does the work common to the two functions, and call it from both commit_lock_file() and commit_locked_index(). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01lockfile: change lock_file::filename into a strbufLibravatar Michael Haggerty1-1/+1
For now, we still make sure to allocate at least PATH_MAX characters for the strbuf because resolve_symlink() doesn't know how to expand the space for its return value. (That will be fixed in a moment.) Another alternative would be to just use a strbuf as scratch space in lock_file() but then store a pointer to the naked string in struct lock_file. But lock_file objects are often reused. By reusing the same strbuf, we can avoid having to reallocate the string most times when a lock_file object is reused. Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01struct lock_file: declare some fields volatileLibravatar Michael Haggerty1-3/+3
The function remove_lock_file_on_signal() is used as a signal handler. It is not realistic to make the signal handler conform strictly to the C standard, which is very restrictive about what a signal handler is allowed to do. But let's increase the likelihood that it will work: The lock_file_list global variable and several fields from struct lock_file are used by the signal handler. Declare those values "volatile" to (1) force the main process to write the values to RAM promptly, and (2) prevent updates to these fields from being reordered in a way that leaves an opportunity for a jump to the signal handler while the object is in an inconsistent state. We don't mark the filename field volatile because that would prevent the use of strcpy(), and it is anyway unlikely that a compiler re-orders a strcpy() call across other expressions. So in practice it should be possible to get away without "volatile" in the "filename" case. Suggested-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01lockfile: avoid transitory invalid statesLibravatar Michael Haggerty1-0/+1
Because remove_lock_file() can be called any time by the signal handler, it is important that any lock_file objects that are in the lock_file_list are always in a valid state. And since lock_file objects are often reused (but are never removed from lock_file_list), that means we have to be careful whenever mutating a lock_file object to always keep it in a well-defined state. This was formerly not the case, because part of the state was encoded by setting lk->filename to the empty string vs. a valid filename. It is wrong to assume that this string can be updated atomically; for example, even strcpy(lk->filename, value) is unsafe. But the old code was even more reckless; for example, strcpy(lk->filename, path); if (!(flags & LOCK_NODEREF)) resolve_symlink(lk->filename, max_path_len); strcat(lk->filename, ".lock"); During the call to resolve_symlink(), lk->filename contained the name of the file that was being locked, not the name of the lockfile. If a signal were raised during that interval, then the signal handler would have deleted the valuable file! We could probably continue to use the filename field to encode the state by being careful to write characters 1..N-1 of the filename first, and then overwrite the NUL at filename[0] with the first character of the filename, but that would be awkward and error-prone. So, instead of using the filename field to determine whether the lock_file object is active, add a new field "lock_file::active" for this purpose. Be careful to set this field only when filename really contains the name of a file that should be deleted on cleanup. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01cache.h: define constants LOCK_SUFFIX and LOCK_SUFFIX_LENLibravatar Michael Haggerty1-0/+4
There are a few places that use these values, so define constants for them. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01unable_to_lock_die(): rename function from unable_to_lock_index_die()Libravatar Michael Haggerty1-1/+1
This function is used for other things besides the index, so rename it accordingly. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Ronnie Sahlberg <sahlberg@google.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-29Merge branch 'ta/config-add-to-empty-or-true-fix' into maintLibravatar Junio C Hamano1-0/+2
"git config --add section.var val" used to lose existing section.var whose value was an empty string. * ta/config-add-to-empty-or-true-fix: config: avoid a funny sentinel value "a^" make config --add behave correctly for empty and NULL values