summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)AuthorFilesLines
2015-05-19Merge branch 'kn/cat-file-literally'Libravatar Junio C Hamano1-0/+2
Add the "--allow-unknown-type" option to "cat-file" to allow inspecting loose objects of an experimental or a broken type. * kn/cat-file-literally: t1006: add tests for git cat-file --allow-unknown-type cat-file: teach cat-file a '--allow-unknown-type' option cat-file: make the options mutually exclusive sha1_file: support reading from a loose object of unknown type
2015-05-11Merge branch 'pt/xdg-config-path'Libravatar Junio C Hamano1-1/+7
Code clean-up for xdg configuration path support. * pt/xdg-config-path: path.c: remove home_config_paths() git-config: replace use of home_config_paths() git-commit: replace use of home_config_paths() credential-store.c: replace home_config_paths() with xdg_config_home() dir.c: replace home_config_paths() with xdg_config_home() attr.c: replace home_config_paths() with xdg_config_home() path.c: implement xdg_config_home()
2015-05-11Merge branch 'jc/hash-object'Libravatar Junio C Hamano1-0/+1
"hash-object --literally" introduced in v2.2 was not prepared to take a really long object type name. * jc/hash-object: write_sha1_file(): do not use a separate sha1[] array t1007: add hash-object --literally tests hash-object --literally: fix buffer overrun with extra-long object type git-hash-object.txt: document --literally option
2015-05-11Merge branch 'nd/multiple-work-trees'Libravatar Junio C Hamano1-5/+12
A replacement for contrib/workdir/git-new-workdir that does not rely on symbolic links and make sharing of objects and refs safer by making the borrowee and borrowers aware of each other. * nd/multiple-work-trees: (41 commits) prune --worktrees: fix expire vs worktree existence condition t1501: fix test with split index t2026: fix broken &&-chain t2026 needs procondition SANITY git-checkout.txt: a note about multiple checkout support for submodules checkout: add --ignore-other-wortrees checkout: pass whole struct to parse_branchname_arg instead of individual flags git-common-dir: make "modules/" per-working-directory directory checkout: do not fail if target is an empty directory t2025: add a test to make sure grafts is working from a linked checkout checkout: don't require a work tree when checking out into a new one git_path(): keep "info/sparse-checkout" per work-tree count-objects: report unused files in $GIT_DIR/worktrees/... gc: support prune --worktrees gc: factor out gc.pruneexpire parsing code gc: style change -- no SP before closing parenthesis checkout: clean up half-prepared directories in --to mode checkout: reject if the branch is already checked out elsewhere prune: strategies for linked checkouts checkout: support checking out into a new working directory ...
2015-05-06sha1_file: support reading from a loose object of unknown typeLibravatar Karthik Nayak1-0/+2
Update sha1_loose_object_info() to optionally allow it to read from a loose object file of unknown/bogus type; as the function usually returns the type of the object it read in the form of enum for known types, add an optional "typename" field to receive the name of the type in textual form and a flag to indicate the reading of a loose object file of unknown/bogus type. Add parse_sha1_header_extended() which acts as a wrapper around parse_sha1_header() allowing more information to be obtained. Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of unknown/corrupt objects which have a unknown sha1 header size to a strbuf structure. This was written by Junio C Hamano but tested by me. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Hepled-by: Jeff King <peff@peff.net> Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-06path.c: remove home_config_paths()Libravatar Paul Tan1-1/+0
home_config_paths() combines distinct functionality already implemented by expand_user_path() and xdg_config_home(), and it also hard-codes the path ~/.gitconfig, which makes it unsuitable to use for other home config file paths. Since its use will just add unnecessary complexity to the code, remove it. Signed-off-by: Paul Tan <pyokagan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-06path.c: implement xdg_config_home()Libravatar Paul Tan1-0/+7
The XDG base dir spec[1] specifies that configuration files be stored in a subdirectory in $XDG_CONFIG_HOME. To construct such a configuration file path, home_config_paths() can be used. However, home_config_paths() combines distinct functionality: 1. Retrieve the home git config file path ~/.gitconfig 2. Construct the XDG config path of the file specified by `file`. This function was introduced in commit 21cf3227 ("read (but not write) from $XDG_CONFIG_HOME/git/config file"). While the intention of the function was to allow the home directory configuration file path and the xdg directory configuration file path to be retrieved with one function call, the hard-coding of the path ~/.gitconfig prevents it from being used for other configuration files. Furthermore, retrieving a file path relative to the user's home directory can be done with expand_user_path(). Hence, it can be seen that home_config_paths() introduces unnecessary complexity, especially if a user just wants to retrieve the xdg config file path. As such, implement a simpler function xdg_config_home() for constructing the XDG base dir spec configuration file path. This function, together with expand_user_path(), can replace all uses of home_config_paths(). [1] http://standards.freedesktop.org/basedir-spec/basedir-spec-0.7.html Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Paul Tan <pyokagan@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05Merge branch 'jk/prune-mtime'Libravatar Junio C Hamano1-3/+6
Access to objects in repositories that borrow from another one on a slow NFS server unnecessarily got more expensive due to recent code becoming more cautious in a naive way not to lose objects to pruning. * jk/prune-mtime: sha1_file: only freshen packs once per run sha1_file: freshen pack objects before loose reachable: only mark local objects as recent
2015-05-05Merge branch 'bc/object-id'Libravatar Junio C Hamano1-4/+36
Identify parts of the code that knows that we use SHA-1 hash to name our objects too much, and use (1) symbolic constants instead of hardcoded 20 as byte count and/or (2) use struct object_id instead of unsigned char [20] for object names. * bc/object-id: apply: convert threeway_stage to object_id patch-id: convert to use struct object_id commit: convert parts to struct object_id diff: convert struct combine_diff_path to object_id bulk-checkin.c: convert to use struct object_id zip: use GIT_SHA1_HEXSZ for trailers archive.c: convert to use struct object_id bisect.c: convert leaf functions to use struct object_id define utility functions for object IDs define a structure for object IDs
2015-05-05hash-object --literally: fix buffer overrun with extra-long object typeLibravatar Eric Sunshine1-0/+1
"hash-object" learned in 5ba9a93 (hash-object: add --literally option, 2014-09-11) to allow crafting a corrupt/broken object of unknown type. When the user-provided type is particularly long, however, it can overflow the relatively small stack-based character array handed to write_sha1_file_prepare() by hash_sha1_file() and write_sha1_file(), leading to stack corruption (and crash). Introduce a custom helper to allow arbitrarily long typenames just for "hash-object --literally". [jc: Eric's original used a strbuf in the more common codepaths, and I rewrote it to avoid penalizing the non-literally code. Bugs are mine] Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-20sha1_file: only freshen packs once per runLibravatar Jeff King1-0/+1
Since 33d4221 (write_sha1_file: freshen existing objects, 2014-10-15), we update the mtime of existing objects that we would have written out (had they not existed). For the common case in which many objects are packed, we may update the mtime on a single packfile repeatedly. This can result in a noticeable performance problem if calling utime() is expensive (e.g., because your storage is on NFS). We can fix this by keeping a per-pack flag that lets us freshen only once per program invocation. An alternative would be to keep the packed_git.mtime flag up to date as we freshen, and freshen only once every N seconds. In practice, it's not worth the complexity. We are racing against prune expiration times here, which inherently must be set to accomodate reasonable program running times (because they really care about the time between an object being written and it becoming referenced, and the latter is typically the last step a program takes). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-20reachable: only mark local objects as recentLibravatar Jeff King1-3/+5
When pruning and repacking a repository that has an alternate object store configured, we may traverse a large number of objects in the alternate. This serves no purpose, and may be expensive to do. A longer explanation is below. Commits d3038d2 and abcb865 taught prune and pack-objects (respectively) to treat "recent" objects as tips for reachability, so that we keep whole chunks of history. They built on the object traversal in 660c889 (sha1_file: add for_each iterators for loose and packed objects, 2014-10-15), which covers both local and alternate objects. In both cases, covering alternate objects is unnecessary, as both commands can only drop objects from the local repository. In the case of prune, we traverse only the local object directory. And in the case of repacking, while we may or may not include local objects in our pack, we will never reach into the alternate with "repack -d". The "-l" option is only a question of whether we are migrating objects from the alternate into our repository, or leaving them untouched. It is possible that we may drop an object that is depended upon by another object in the alternate. For example, imagine two repositories, A and B, with A pointing to B as an alternate. Now imagine a commit that is in B which references a tree that is only in A. Traversing from recent objects in B might prevent A from dropping that tree. But this case isn't worth covering. Repo B should take responsibility for its own objects. It would never have had the commit in the first place if it did not also have the tree, and assuming it is using the same "keep recent chunks of history" scheme, then it would itself keep the tree, as well. So checking the alternate objects is not worth doing, and come with a significant performance impact. In both cases, we skip any recent objects that have already been marked SEEN (i.e., that we know are already reachable for prune, or included in the pack for a repack). So there is a slight waste of time in opening the alternate packs at all, only to notice that we have already considered each object. But much worse, the alternate repository may have a large number of objects that are not reachable from the local repository at all, and we end up adding them to the traversal. We can fix this by considering only local unseen objects. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-26Merge branch 'jc/report-path-error-to-dir'Libravatar Junio C Hamano1-1/+0
Code clean-up. * jc/report-path-error-to-dir: report_path_error(): move to dir.c
2015-03-25Merge branch 'jk/prune-with-corrupt-refs'Libravatar Junio C Hamano1-0/+8
"git prune" used to largely ignore broken refs when deciding which objects are still being used, which could spread an existing small damage and make it a larger one. * jk/prune-with-corrupt-refs: refs.c: drop curate_packed_refs repack: turn on "ref paranoia" when doing a destructive repack prune: turn on ref_paranoia flag refs: introduce a "ref paranoia" flag t5312: test object deletion code paths in a corrupted repository
2015-03-24report_path_error(): move to dir.cLibravatar Junio C Hamano1-1/+0
The expected call sequence is for the caller to use match_pathspec() repeatedly on a set of pathspecs, accumulating the "hits" in a separate array, and then call this function to diagnose a pathspec that never matched anything, as that can indicate a typo from the command line, e.g. "git commit Maekfile". Many builtin commands use this function from builtin/ls-files.c, which is not a very healthy arrangement. ls-files might have been the first command to feel the need for such a helper, but the need is shared by everybody who uses the "match and then report" pattern. Move it to dir.c where match_pathspec() is defined. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-20refs: introduce a "ref paranoia" flagLibravatar Jeff King1-0/+8
Most operations that iterate over refs are happy to ignore broken cruft. However, some operations should be performed with knowledge of these broken refs, because it is better for the operation to choke on a missing object than it is to silently pretend that the ref did not exist (e.g., if we are computing the set of reachable tips in order to prune objects). These processes could just call for_each_rawref, except that ref iteration is often hidden behind other interfaces. For instance, for a destructive "repack -ad", we would have to inform "pack-objects" that we are destructive, and then it would in turn have to tell the revision code that our "--all" should include broken refs. It's much simpler to just set a global for "dangerous" operations that includes broken refs in all iterations. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13define utility functions for object IDsLibravatar brian m. carlson1-4/+28
There are several utility functions (hashcmp and friends) that are used for comparing object IDs (SHA-1 values). Using these functions, which take pointers to unsigned char, with struct object_id requires tiresome access to the sha1 member, which bloats code and violates the desired encapsulation. Provide wrappers around these functions for struct object_id for neater, more maintainable code. Use the new constants to avoid the hard-coded 20s and 40s throughout the original functions. These functions simply call the underlying pointer-to-unsigned-char versions to ensure that any performance improvements will be passed through to the new functions. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13define a structure for object IDsLibravatar brian m. carlson1-0/+8
Many places throughout the code use "unsigned char [20]" to store object IDs (SHA-1 values). This leads to lots of hardcoded numbers throughout the codebase. It also leads to confusion about the purposes of a buffer. Introduce a structure for object IDs. This allows us to obtain the benefits of compile-time checking for misuse. The structure is expected to remain the same size and have the same alignment requirements on all known platforms, compared to the array of unsigned char, although this is not required for correctness. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05Merge branch 'jk/prune-mtime' into maintLibravatar Junio C Hamano1-0/+9
In v2.2.0, we broke "git prune" that runs in a repository that borrows from an alternate object store. * jk/prune-mtime: sha1_file: fix iterating loose alternate objects for_each_loose_file_in_objdir: take an optional strbuf path
2015-03-05Merge branch 'jk/decimal-width-for-uintmax' into maintLibravatar Junio C Hamano1-1/+1
We didn't format an integer that wouldn't fit in "int" but in "uintmax_t" correctly. * jk/decimal-width-for-uintmax: decimal_width: avoid integer overflow
2015-03-05Merge branch 'mh/refs-have-new'Libravatar Junio C Hamano1-1/+1
Simplify the ref transaction API around how "the ref should be pointing at this object" is specified. * mh/refs-have-new: refs.h: remove duplication in function docstrings update_ref(): improve documentation ref_transaction_verify(): new function to check a reference's value ref_transaction_delete(): check that old_sha1 is not null_sha1 ref_transaction_create(): check that new_sha1 is valid commit: avoid race when creating orphan commits commit: add tests of commit races ref_transaction_delete(): remove "have_old" parameter ref_transaction_update(): remove "have_old" parameter struct ref_update: move "have_old" into "flags" refs.c: change some "flags" to "unsigned int" refs: remove the gap in the REF_* constant values refs: move REF_DELETING to refs.c
2015-02-22Merge branch 'jk/prune-mtime'Libravatar Junio C Hamano1-0/+9
In v2.2.0, we broke "git prune" that runs in a repository that borrows from an alternate object store. * jk/prune-mtime: sha1_file: fix iterating loose alternate objects for_each_loose_file_in_objdir: take an optional strbuf path
2015-02-18Merge branch 'jk/decimal-width-for-uintmax'Libravatar Junio C Hamano1-1/+1
We didn't format an integer that wouldn't fit in "int" but in "uintmax_t" correctly. * jk/decimal-width-for-uintmax: decimal_width: avoid integer overflow
2015-02-17refs.c: change some "flags" to "unsigned int"Libravatar Michael Haggerty1-1/+1
Change the following functions' "flags" arguments from "int" to "unsigned int": * ref_transaction_update() * ref_transaction_create() * ref_transaction_delete() * update_ref() * delete_ref() * lock_ref_sha1_basic() Also change the "flags" member in "struct ref_update" to unsigned. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-09for_each_loose_file_in_objdir: take an optional strbuf pathLibravatar Jeff King1-0/+9
We feed a root "objdir" path to this iterator function, which then copies the result into a strbuf, so that it can repeatedly append the object sub-directories to it. Let's make it easy for callers to just pass us a strbuf in the first place. We leave the original interface as a convenience for callers who want to just pass a const string like the result of get_object_directory(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-05decimal_width: avoid integer overflowLibravatar Jeff King1-1/+1
The decimal_width function originally appeared in blame.c as "lineno_width", and was designed for calculating the print-width of small-ish integer values (line numbers in text files). In ec7ff5b, it was made into a reusable function, and in dc801e7, we started using it to align diffstats. Binary files in a diffstat show byte counts rather than line numbers, meaning they can be quite large (e.g., consider adding or removing a 2GB file). decimal_width is not up to the challenge for two reasons: 1. It takes the value as an "int", whereas large files may easily surpass this. The value may be truncated, in which case we will produce an incorrect value. 2. It counts "up" by repeatedly multiplying another integer by 10 until it surpasses the value. This can cause an infinite loop when the value is close to the largest representable integer. For example, consider using a 32-bit signed integer, and a value of 2,140,000,000 (just shy of 2^31-1). We will count up and eventually see that 1,000,000,000 is smaller than our value. The next step would be to multiply by 10 and see that 10,000,000,000 is too large, ending the loop. But we can't represent that value, and we have signed overflow. This is technically undefined behavior, but a common behavior is to lose the high bits, in which case our iterator will certainly be less than the number. So we'll keep multiplying, overflow again, and so on. This patch changes the argument to a uintmax_t (the same type we use to store the diffstat information for binary filese), and counts "down" by repeatedly dividing our value by 10. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'dm/compat-s-ifmt-for-zos'Libravatar Junio C Hamano1-7/+0
Long overdue departure from the assumption that S_IFMT is shared by everybody made in 2005. * dm/compat-s-ifmt-for-zos: compat: convert modes to use portable file type values
2014-12-17Sync with v2.1.4Libravatar Junio C Hamano1-0/+3
* maint-2.1: Git 2.1.4 Git 2.0.5 Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v2.0.5Libravatar Junio C Hamano1-0/+3
* maint-2.0: Git 2.0.5 Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v1.9.5Libravatar Junio C Hamano1-0/+3
* maint-1.9: Git 1.9.5 Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17Sync with v1.8.5.6Libravatar Junio C Hamano1-0/+3
* maint-1.8.5: Git 1.8.5.6 fsck: complain about NTFS ".git" aliases in trees read-cache: optionally disallow NTFS .git variants path: add is_ntfs_dotgit() helper fsck: complain about HFS+ ".git" aliases in trees read-cache: optionally disallow HFS+ .git variants utf8: add is_hfs_dotgit() helper fsck: notice .git case-insensitively t1450: refactor ".", "..", and ".git" fsck tests verify_dotfile(): reject .git case-insensitively read-tree: add tests for confusing paths like ".." and ".git" unpack-trees: propagate errors adding entries to the index
2014-12-17read-cache: optionally disallow NTFS .git variantsLibravatar Johannes Schindelin1-0/+1
The point of disallowing ".git" in the index is that we would never want to accidentally overwrite files in the repository directory. But this means we need to respect the filesystem's idea of when two paths are equal. The prior commit added a helper to make such a comparison for NTFS and FAT32; let's use it in verify_path(). We make this check optional for two reasons: 1. It restricts the set of allowable filenames, which is unnecessary for people who are not on NTFS nor FAT32. In practice this probably doesn't matter, though, as the restricted names are rather obscure and almost certainly would never come up in practice. 2. It has a minor performance penalty for every path we insert into the index. This patch ties the check to the core.protectNTFS config option. Though this is expected to be most useful on Windows, we allow it to be set everywhere, as NTFS may be mounted on other platforms. The variable does default to on for Windows, though. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17path: add is_ntfs_dotgit() helperLibravatar Johannes Schindelin1-0/+1
We do not allow paths with a ".git" component to be added to the index, as that would mean repository contents could overwrite our repository files. However, asking "is this path the same as .git" is not as simple as strcmp() on some filesystems. On NTFS (and FAT32), there exist so-called "short names" for backwards-compatibility: 8.3 compliant names that refer to the same files as their long names. As ".git" is not an 8.3 compliant name, a short name is generated automatically, typically "git~1". Depending on the Windows version, any combination of trailing spaces and periods are ignored, too, so that both "git~1." and ".git." still refer to the Git directory. The reason is that 8.3 stores file names shorter than 8 characters with trailing spaces. So literally, it does not matter for the short name whether it is padded with spaces or whether it is shorter than 8 characters, it is considered to be the exact same. The period is the separator between file name and file extension, and again, an empty extension consists just of spaces in 8.3 format. So technically, we would need only take care of the equivalent of this regex: (\.git {0,4}|git~1 {0,3})\. {0,3} However, there are indications that at least some Windows versions might be more lenient and accept arbitrary combinations of trailing spaces and periods and strip them out. So we're playing it real safe here. Besides, there can be little doubt about the intention behind using file names matching even the more lenient pattern specified above, therefore we should be fine with disallowing such patterns. Extra care is taken to catch names such as '.\\.git\\booh' because the backslash is marked as a directory separator only on Windows, and we want to use this new helper function also in fsck on other platforms. A big thank you goes to Ed Thomson and an unnamed Microsoft engineer for the detailed analysis performed to come up with the corresponding fixes for libgit2. This commit adds a function to detect whether a given file name can refer to the Git directory by mistake. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-17read-cache: optionally disallow HFS+ .git variantsLibravatar Jeff King1-0/+1
The point of disallowing ".git" in the index is that we would never want to accidentally overwrite files in the repository directory. But this means we need to respect the filesystem's idea of when two paths are equal. The prior commit added a helper to make such a comparison for HFS+; let's use it in verify_path. We make this check optional for two reasons: 1. It restricts the set of allowable filenames, which is unnecessary for people who are not on HFS+. In practice this probably doesn't matter, though, as the restricted names are rather obscure and almost certainly would never come up in practice. 2. It has a minor performance penalty for every path we insert into the index. This patch ties the check to the core.protectHFS config option. Though this is expected to be most useful on OS X, we allow it to be set everywhere, as HFS+ may be mounted on other platforms. The variable does default to on for OS X, though. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-04compat: convert modes to use portable file type valuesLibravatar David Michael1-7/+0
This adds simple wrapper functions around calls to stat(), fstat(), and lstat() that translate the operating system's native file type bits to those used by most operating systems. It also rewrites the S_IF* macros to the common values, so all file type processing is performed using the translated modes. This makes projects portable across operating systems that use different file type definitions. Only the file type bits may be affected by these compatibility functions; the file permission bits are assumed to be 07777 and are passed through unchanged. Signed-off-by: David Michael <fedora.dm0@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01count-objects: report unused files in $GIT_DIR/worktrees/...Libravatar Nguyễn Thái Ngọc Duy1-0/+1
In linked checkouts, borrowed parts like config is taken from $GIT_COMMON_DIR. $GIT_DIR/config is never used. Report them as garbage. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01wrapper.c: wrapper to open a file, fprintf then closeLibravatar Nguyễn Thái Ngọc Duy1-0/+2
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01setup.c: support multi-checkout repo setupLibravatar Nguyễn Thái Ngọc Duy1-0/+1
The repo setup procedure is updated to detect $GIT_DIR/commondir and set $GIT_COMMON_DIR properly. The core.worktree is ignored when $GIT_COMMON_DIR is set. This is because the config file is shared in multi-checkout setup, but checkout directories _are_ different. Making core.worktree effective in all checkouts mean it's back to a single checkout. Helped-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01$GIT_COMMON_DIR: a new environment variableLibravatar Nguyễn Thái Ngọc Duy1-1/+3
This variable is intended to support multiple working directories attached to a repository. Such a repository may have a main working directory, created by either "git init" or "git clone" and one or more linked working directories. These working directories and the main repository share the same repository directory. In linked working directories, $GIT_COMMON_DIR must be defined to point to the real repository directory and $GIT_DIR points to an unused subdirectory inside $GIT_COMMON_DIR. File locations inside the repository are reorganized from the linked worktree view point: - worktree-specific such as HEAD, logs/HEAD, index, other top-level refs and unrecognized files are from $GIT_DIR. - the rest like objects, refs, info, hooks, packed-refs, shallow... are from $GIT_COMMON_DIR (except info/sparse-checkout, but that's a separate patch) Scripts are supposed to retrieve paths in $GIT_DIR with "git rev-parse --git-path", which will take care of "$GIT_DIR vs $GIT_COMMON_DIR" business. The redirection is done by git_path(), git_pathdup() and strbuf_git_path(). The selected list of paths goes to $GIT_COMMON_DIR, not the other way around in case a developer adds a new worktree-specific file and it's accidentally promoted to be shared across repositories (this includes unknown files added by third party commands) The list of known files that belong to $GIT_DIR are: ADD_EDIT.patch BISECT_ANCESTORS_OK BISECT_EXPECTED_REV BISECT_LOG BISECT_NAMES CHERRY_PICK_HEAD COMMIT_MSG FETCH_HEAD HEAD MERGE_HEAD MERGE_MODE MERGE_RR NOTES_EDITMSG NOTES_MERGE_WORKTREE ORIG_HEAD REVERT_HEAD SQUASH_MSG TAG_EDITMSG fast_import_crash_* logs/HEAD next-index-* rebase-apply rebase-merge rsync-refs-* sequencer/* shallow_* Path mapping is NOT done for git_path_submodule(). Multi-checkouts are not supported as submodules. Helped-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01git_path(): be aware of file relocation in $GIT_DIRLibravatar Nguyễn Thái Ngọc Duy1-0/+1
We allow the user to relocate certain paths out of $GIT_DIR via environment variables, e.g. GIT_OBJECT_DIRECTORY, GIT_INDEX_FILE and GIT_GRAFT_FILE. Callers are not supposed to use git_path() or git_pathdup() to get those paths. Instead they must use get_object_directory(), get_index_file() and get_graft_file() respectively. This is inconvenient and could be missed in review (for example, there's git_path("objects/info/alternates") somewhere in sha1_file.c). This patch makes git_path() and git_pathdup() understand those environment variables. So if you set GIT_OBJECT_DIRECTORY to /foo/bar, git_path("objects/abc") should return /foo/bar/abc. The same is done for the two remaining env variables. "git rev-parse --git-path" is the wrapper for script use. This patch kinda reverts a0279e1 (setup_git_env: use git_pathdup instead of xmalloc + sprintf - 2014-06-19) because using git_pathdup here would result in infinite recursion: setup_git_env() -> git_pathdup("objects") -> .. -> adjust_git_path() -> get_object_directory() -> oops, git_object_directory is NOT set yet -> setup_git_env() I wanted to make git_pathdup_literal() that skips adjust_git_path(). But that won't work because later on when $GIT_COMMON_DIR is introduced, git_pathdup_literal("objects") needs adjust_git_path() to replace $GIT_DIR with $GIT_COMMON_DIR. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01git_snpath(): retire and replace with strbuf_git_path()Libravatar Nguyễn Thái Ngọc Duy1-2/+2
In the previous patch, git_snpath() is modified to allocate a new strbuf buffer because vsnpath() needs that. But that makes it awkward because git_snpath() receives a pre-allocated buffer from outside and has to copy data back. Rename it to strbuf_git_path() and make it receive strbuf directly. Using git_path() in update_refs_for_switch() which used to call git_snpath() is safe because that function and all of its callers do not keep any pointer to the round-robin buffer pool allocated by get_pathname(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01path.c: make get_pathname() call sites return const char *Libravatar Nguyễn Thái Ngọc Duy1-3/+3
Before the previous commit, get_pathname returns an array of PATH_MAX length. Even if git_path() and similar functions does not use the whole array, git_path() caller can, in theory. After the commit, get_pathname() may return a buffer that has just enough room for the returned string and git_path() caller should never write beyond that. Make git_path(), mkpath() and git_path_submodule() return a const buffer to make sure callers do not write in it at all. This could have been part of the previous commit, but the "const" conversion is too much distraction from the core changes in path.c. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-29Merge branch 'jk/prune-mtime'Libravatar Junio C Hamano1-1/+45
Tighten the logic to decide that an unreachable cruft is sufficiently old by covering corner cases such as an ancient object becoming reachable and then going unreachable again, in which case its retention period should be prolonged. * jk/prune-mtime: (28 commits) drop add_object_array_with_mode revision: remove definition of unused 'add_object' function pack-objects: double-check options before discarding objects repack: pack objects mentioned by the index pack-objects: use argv_array reachable: use revision machinery's --indexed-objects code rev-list: add --indexed-objects option rev-list: document --reflog option t5516: test pushing a tag of an otherwise unreferenced blob traverse_commit_list: support pending blobs/trees with paths make add_object_array_with_context interface more sane write_sha1_file: freshen existing objects pack-objects: match prune logic for discarding objects pack-objects: refactor unpack-unreachable expiration check prune: keep objects reachable from recent objects sha1_file: add for_each iterators for loose and packed objects count-objects: use for_each_loose_file_in_objdir count-objects: do not use xsize_t when counting object size prune-packed: use for_each_loose_file_in_objdir reachable: mark index blobs as SEEN ...
2014-10-16sha1_file: add for_each iterators for loose and packed objectsLibravatar Jeff King1-0/+11
We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16prune: factor out loose-object directory traversalLibravatar Jeff King1-0/+33
Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex digits. It executed a "break" from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16foreach_alt_odb: propagate return value from callbackLibravatar Jeff King1-1/+1
We check the return value of the callback and stop iterating if it is non-zero. However, we do not make the non-zero return value available to the caller, so they have no way of knowing whether the operation succeeded or not (technically they can keep their own error flag in the callback data, but that is unlike our other for_each functions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15refs.c: allow listing and deleting badly named refsLibravatar Ronnie Sahlberg1-2/+15
We currently do not handle badly named refs well: $ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\. $ git branch fatal: Reference has invalid format: 'refs/heads/master.....@*@\.' $ git branch -D master.....@\*@\\. error: branch 'master.....@*@\.' not found. Users cannot recover from a badly named ref without manually finding and deleting the loose ref file or appropriate line in packed-refs. Making that easier will make it easier to tweak the ref naming rules in the future, for example to forbid shell metacharacters like '`' and '"', without putting people in a state that is hard to get out of. So allow "branch --list" to show these refs and allow "branch -d/-D" and "update-ref -d" to delete them. Other commands (for example to rename refs) will continue to not handle these refs but can be changed in later patches. Details: In resolving functions, refuse to resolve refs that don't pass the git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to resolve refs that escape the refs/ directory and do not match the pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD"). In locking functions, refuse to act on badly named refs unless they are being deleted and either are in the refs/ directory or match [A-Z_]*. Just like other invalid refs, flag resolved, badly named refs with the REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them in all iteration functions except for for_each_rawref. Flag badly named refs (but not symrefs pointing to badly named refs) with a REF_BAD_NAME flag to make it easier for future callers to notice and handle them specially. For example, in a later patch for-each-ref will use this flag to detect refs whose names can confuse callers parsing for-each-ref output. In the transaction API, refuse to create or update badly named refs, but allow deleting them (unless they try to escape refs/ and don't match [A-Z_]*). Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15branch -d: avoid repeated symref resolutionLibravatar Jonathan Nieder1-0/+6
If a repository gets in a broken state with too much symref nesting, it cannot be repaired with "git branch -d": $ git symbolic-ref refs/heads/nonsense refs/heads/nonsense $ git branch -d nonsense error: branch 'nonsense' not found. Worse, "git update-ref --no-deref -d" doesn't work for such repairs either: $ git update-ref -d refs/heads/nonsense error: unable to resolve reference refs/heads/nonsense: Too many levels of symbolic links Fix both by teaching resolve_ref_unsafe a new RESOLVE_REF_NO_RECURSE flag and passing it when appropriate. Callers can still read the value of a symref (for example to print a message about it) with that flag set --- resolve_ref_unsafe will resolve one level of symrefs and stop there. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15refs.c: change resolve_ref_unsafe reading argument to be a flags fieldLibravatar Ronnie Sahlberg1-11/+12
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref resolves successfully for writing but not for reading). Change this to be a flags field instead, and pass the new constant RESOLVE_REF_READING when we want this behaviour. While at it, swap two of the arguments in the function to put output arguments at the end. As a nice side effect, this ensures that we can catch callers that were unaware of the new API so they can be audited. Give the wrapper functions resolve_refdup and read_ref_full the same treatment for consistency. Signed-off-by: Ronnie Sahlberg <sahlberg@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-14Merge branch 'mh/lockfile'Libravatar Junio C Hamano1-19/+1
The lockfile API and its users have been cleaned up. * mh/lockfile: (38 commits) lockfile.h: extract new header file for the functions in lockfile.c hold_locked_index(): move from lockfile.c to read-cache.c hold_lock_file_for_append(): restore errno before returning get_locked_file_path(): new function lockfile.c: rename static functions lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF commit_lock_file_to(): refactor a helper out of commit_lock_file() trim_last_path_component(): replace last_path_elm() resolve_symlink(): take a strbuf parameter resolve_symlink(): use a strbuf for internal scratch space lockfile: change lock_file::filename into a strbuf commit_lock_file(): use a strbuf to manage temporary space try_merge_strategy(): use a statically-allocated lock_file object try_merge_strategy(): remove redundant lock_file allocation struct lock_file: declare some fields volatile lockfile: avoid transitory invalid states git_config_set_multivar_in_file(): avoid call to rollback_lock_file() dump_marks(): remove a redundant call to rollback_lock_file() api-lockfile: document edge cases commit_lock_file(): rollback lock file on failure to rename ...