summaryrefslogtreecommitdiff
path: root/path.c
AgeCommit message (Collapse)AuthorFilesLines
2019-12-06Sync with 2.21.1Libravatar Johannes Schindelin1-28/+68
* maint-2.21: (42 commits) Git 2.21.1 mingw: sh arguments need quoting in more circumstances mingw: fix quoting of empty arguments for `sh` mingw: use MSYS2 quoting even when spawning shell scripts mingw: detect when MSYS2's sh is to be spawned more robustly t7415: drop v2.20.x-specific work-around Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh ...
2019-12-06Sync with 2.20.2Libravatar Johannes Schindelin1-28/+68
* maint-2.20: (36 commits) Git 2.20.2 t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories ...
2019-12-06Sync with 2.19.3Libravatar Johannes Schindelin1-28/+68
* maint-2.19: (34 commits) Git 2.19.3 Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams ...
2019-12-06Sync with 2.18.2Libravatar Johannes Schindelin1-28/+68
* maint-2.18: (33 commits) Git 2.18.2 Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up ...
2019-12-06Sync with 2.17.3Libravatar Johannes Schindelin1-28/+68
* maint-2.17: (32 commits) Git 2.17.3 Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names ...
2019-12-06Sync with 2.14.6Libravatar Johannes Schindelin1-28/+68
* maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...
2019-12-05is_ntfs_dotgit(): speed it upLibravatar Johannes Schindelin1-25/+30
Previously, this function was written without focusing on speed, intending to make reviewing the code as easy as possible, to avoid any bugs in this critical code. Turns out: we can do much better on both accounts. With this patch, we make it as fast as this developer can make it go: - We avoid the call to `is_dir_sep()` and make all the character comparisons explicit. - We avoid the cost of calling `strncasecmp()` and unroll the test for `.git` and `git~1`, not even using `tolower()` because it is faster to compare against two constant values. - We look for `.git` and `.git~1` first thing, and return early if not found. - We also avoid calling a separate function for detecting chains of spaces and periods. Each of these improvements has a noticeable impact on the speed of `is_ntfs_dotgit()`. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05path: also guard `.gitmodules` against NTFS Alternate Data StreamsLibravatar Johannes Schindelin1-1/+1
We just safe-guarded `.git` against NTFS Alternate Data Stream-related attack vectors, and now it is time to do the same for `.gitmodules`. Note: In the added regression test, we refrain from verifying all kinds of variations between short names and NTFS Alternate Data Streams: as the new code disallows _all_ Alternate Data Streams of `.gitmodules`, it is enough to test one in order to know that all of them are guarded against. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05path: safeguard `.git` against NTFS Alternate Streams AccessesLibravatar Johannes Schindelin1-1/+11
Probably inspired by HFS' resource streams, NTFS supports "Alternate Data Streams": by appending `:<stream-name>` to the file name, information in addition to the file contents can be written and read, information that is copied together with the file (unless copied to a non-NTFS location). These Alternate Data Streams are typically used for things like marking an executable as having just been downloaded from the internet (and hence not necessarily being trustworthy). In addition to a stream name, a stream type can be appended, like so: `:<stream-name>:<stream-type>`. Unless specified, the default stream type is `$DATA` for files and `$INDEX_ALLOCATION` for directories. In other words, `.git::$INDEX_ALLOCATION` is a valid way to reference the `.git` directory! In our work in Git v2.2.1 to protect Git on NTFS drives under `core.protectNTFS`, we focused exclusively on NTFS short names, unaware of the fact that NTFS Alternate Data Streams offer a similar attack vector. Let's fix this. Seeing as it is better to be safe than sorry, we simply disallow paths referring to *any* NTFS Alternate Data Stream of `.git`, not just `::$INDEX_ALLOCATION`. This also simplifies the implementation. This closes CVE-2019-1352. Further reading about NTFS Alternate Data Streams: https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c54dec26-1551-4d3a-a0ea-4fa40f848eb3 Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-05is_ntfs_dotgit(): only verify the leading segmentLibravatar Johannes Schindelin1-4/+1
The config setting `core.protectNTFS` is specifically designed to work not only on Windows, but anywhere, to allow for repositories hosted on, say, Linux servers to be protected against NTFS-specific attack vectors. As a consequence, `is_ntfs_dotgit()` manually splits backslash-separated paths (but does not do the same for paths separated by forward slashes), under the assumption that the backslash might not be a valid directory separator on the _current_ Operating System. However, the two callers, `verify_path()` and `fsck_tree()`, are supposed to feed only individual path segments to the `is_ntfs_dotgit()` function. This causes a lot of duplicate scanning (and very inefficient scanning, too, as the inner loop of `is_ntfs_dotgit()` was optimized for readability rather than for speed. Let's simplify the design of `is_ntfs_dotgit()` by putting the burden of splitting the paths by backslashes as directory separators on the callers of said function. Consequently, the `verify_path()` function, which already splits the path by directory separators, now treats backslashes as directory separators _explicitly_ when `core.protectNTFS` is turned on, even on platforms where the backslash is _not_ a directory separator. Note that we have to repeat some code in `verify_path()`: if the backslash is not a directory separator on the current Operating System, we want to allow file names like `\`, but we _do_ want to disallow paths that are clearly intended to cause harm when the repository is cloned on Windows. The `fsck_tree()` function (the other caller of `is_ntfs_dotgit()`) now needs to look for backslashes in tree entries' names specifically when `core.protectNTFS` is turned on. While it would be tempting to completely disallow backslashes in that case (much like `fsck` reports names containing forward slashes as "full paths"), this would be overzealous: when `core.protectNTFS` is turned on in a non-Windows setup, backslashes are perfectly valid characters in file names while we _still_ want to disallow tree entries that are clearly designed to exploit NTFS-specific behavior. This simplification will make subsequent changes easier to implement, such as turning `core.protectNTFS` on by default (not only on Windows) or protecting against attack vectors involving NTFS Alternate Data Streams. Incidentally, this change allows for catching malicious repositories that contain tree entries of the form `dir\.gitmodules` already on the server side rather than only on the client side (and previously only on Windows): in contrast to `is_ntfs_dotgit()`, the `is_ntfs_dotgitmodules()` function already expects the caller to split the paths by directory separators. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-12-04path.c: document the purpose of `is_ntfs_dotgit()`Libravatar Johannes Schindelin1-0/+28
Previously, this function was completely undocumented. It is worth, though, to explain what is going on, as it is not really obvious at all. Suggested-by: Garima Singh <garima.singh@microsoft.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2019-04-10Merge branch 'nd/rewritten-ref-is-per-worktree'Libravatar Junio C Hamano1-0/+3
"git rebase" uses the refs/rewritten/ hierarchy to store its intermediate states, which inherently makes the hierarchy per worktree, but it didn't quite work well. * nd/rewritten-ref-is-per-worktree: Make sure refs/rewritten/ is per-worktree files-backend.c: reduce duplication in add_per_worktree_entries_to_dir() files-backend.c: factor out per-worktree code in loose_fill_ref_dir()
2019-03-08Make sure refs/rewritten/ is per-worktreeLibravatar Nguyễn Thái Ngọc Duy1-0/+3
a9be29c981 (sequencer: make refs generated by the `label` command worktree-local, 2018-04-25) adds refs/rewritten/ as per-worktree reference space. Unfortunately (my bad) there are a couple places that need update to make sure it's really per-worktree. - add_per_worktree_entries_to_dir() is updated to make sure ref listing look at per-worktree refs/rewritten/ instead of per-repo one [1] - common_list[] is updated so that git_path() returns the correct location. This includes "rev-parse --git-path". This mess is created by me. I started trying to fix it with the introduction of refs/worktree, where all refs will be per-worktree without special treatments. Unfortunate refs/rewritten came before refs/worktree so this is all we can do. This also fixes logs/refs/worktree not being per-worktree. [1] note that ref listing still works sometimes. For example, if you have .git/worktrees/foo/refs/rewritten/bar AND the directory .git/worktrees/refs/rewritten, refs/rewritten/bar will show up. add_per_worktree_entries_to_dir() is only needed when the directory .git/worktrees/refs/rewritten is missing. Reported-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-04Merge branch 'jk/loose-object-cache'Libravatar Junio C Hamano1-1/+1
Code clean-up with optimization for the codepath that checks (non-)existence of loose objects. * jk/loose-object-cache: odb_load_loose_cache: fix strbuf leak fetch-pack: drop custom loose object cache sha1-file: use loose object cache for quick existence check object-store: provide helpers for loose_objects_cache sha1-file: use an object_directory for the main object dir handle alternates paths the same as the main object dir sha1_file_name(): overwrite buffer instead of appending rename "alternate_object_database" to "object_directory" submodule--helper: prefer strip_suffix() to ends_with() fsck: do not reuse child_process structs
2018-11-21Merge branch 'tb/char-may-be-unsigned' into maintLibravatar Junio C Hamano1-1/+1
Build portability fix. * tb/char-may-be-unsigned: path.c: char is not (always) signed
2018-11-13Merge branch 'nd/per-worktree-ref-iteration'Libravatar Junio C Hamano1-0/+2
The code to traverse objects for reachability, used to decide what objects are unreferenced and expendable, have been taught to also consider per-worktree refs of other worktrees as starting points to prevent data loss. * nd/per-worktree-ref-iteration: git-worktree.txt: correct linkgit command name reflog expire: cover reflog from all worktrees fsck: check HEAD and reflog from other worktrees fsck: move fsck_head_link() to get_default_heads() to avoid some globals revision.c: better error reporting on ref from different worktrees revision.c: correct a parameter name refs: new ref types to make per-worktree refs visible to all worktrees Add a place for (not) sharing stuff between worktrees refs.c: indent with tabs, not spaces
2018-11-13sha1-file: use an object_directory for the main object dirLibravatar Jeff King1-1/+1
Our handling of alternate object directories is needlessly different from the main object directory. As a result, many places in the code basically look like this: do_something(r->objects->objdir); for (odb = r->objects->alt_odb_list; odb; odb = odb->next) do_something(odb->path); That gets annoying when do_something() is non-trivial, and we've resorted to gross hacks like creating fake alternates (see find_short_object_filename()). Instead, let's give each raw_object_store a unified list of object_directory structs. The first will be the main store, and everything after is an alternate. Very few callers even care about the distinction, and can just loop over the whole list (and those who care can just treat the first element differently). A few observations: - we don't need r->objects->objectdir anymore, and can just mechanically convert that to r->objects->odb->path - object_directory's path field needs to become a real pointer rather than a FLEX_ARRAY, in order to fill it with expand_base_dir() - we'll call prepare_alt_odb() earlier in many functions (i.e., outside of the loop). This may result in us calling it even when our function would be satisfied looking only at the main odb. But this doesn't matter in practice. It's not a very expensive operation in the first place, and in the majority of cases it will be a noop. We call it already (and cache its results) in prepare_packed_git(), and we'll generally check packs before loose objects. So essentially every program is going to call it immediately once per program. Arguably we should just prepare_alt_odb() immediately upon setting up the repository's object directory, which would save us sprinkling calls throughout the code base (and forgetting to do so has been a source of subtle bugs in the past). But I've stopped short of that here, since there are already a lot of other moving parts in this patch. - Most call sites just get shorter. The check_and_freshen() functions are an exception, because they have entry points to handle local and nonlocal directories separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-06Merge branch 'tb/char-may-be-unsigned'Libravatar Junio C Hamano1-1/+1
Build portability fix. * tb/char-may-be-unsigned: path.c: char is not (always) signed
2018-10-26path.c: char is not (always) signedLibravatar Torsten Bögershausen1-1/+1
If a "char" in C is signed or unsigned is not specified, because it is out of tradition "implementation dependent". Therefore constructs like "if (name[i] < 0)" are not portable, use "if (name[i] & 0x80)" instead. Detected by "gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516" when setting DEVELOPER = 1 DEVOPTS = extra-all Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-07Add a place for (not) sharing stuff between worktreesLibravatar Nguyễn Thái Ngọc Duy1-0/+2
When multiple worktrees are used, we need rules to determine if something belongs to one worktree or all of them. Instead of keeping adding rules when new stuff comes (*), have a generic rule: - Inside $GIT_DIR, which is per-worktree by default, add $GIT_DIR/common which is always shared. New features that want to share stuff should put stuff under this directory. - Inside refs/, which is shared by default except refs/bisect, add refs/worktree/ which is per-worktree. We may eventually move refs/bisect to this new location and remove the exception in refs code. (*) And it may also include stuff from external commands which will have no way to modify common/per-worktree rules. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18Merge branch 'sb/object-store-grafts'Libravatar Junio C Hamano1-9/+9
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-05-29Sync with Git 2.17.1Libravatar Junio C Hamano1-1/+85
* maint: (25 commits) Git 2.17.1 Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant ...
2018-05-22Sync with Git 2.14.4Libravatar Junio C Hamano1-1/+85
* maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-22Sync with Git 2.13.7Libravatar Junio C Hamano1-1/+85
* maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-21is_ntfs_dotgit: match other .git filesLibravatar Johannes Schindelin1-0/+84
When we started to catch NTFS short names that clash with .git, we only looked for GIT~1. This is sufficient because we only ever clone into an empty directory, so .git is guaranteed to be the first subdirectory or file in that directory. However, even with a fresh clone, .gitmodules is *not* necessarily the first file to be written that would want the NTFS short name GITMOD~1: a malicious repository can add .gitmodul0000 and friends, which sorts before `.gitmodules` and is therefore checked out *first*. For that reason, we have to test not only for ~1 short names, but for others, too. It's hard to just adapt the existing checks in is_ntfs_dotgit(): since Windows 2000 (i.e., in all Windows versions still supported by Git), NTFS short names are only generated in the <prefix>~<number> form up to number 4. After that, a *different* prefix is used, calculated from the long file name using an undocumented, but stable algorithm. For example, the short name of .gitmodules would be GITMOD~1, but if it is taken, and all of ~2, ~3 and ~4 are taken, too, the short name GI7EBA~1 will be used. From there, collisions are handled by incrementing the number, shortening the prefix as needed (until ~9999999 is reached, in which case NTFS will not allow the file to be created). We'd also want to handle .gitignore and .gitattributes, which suffer from a similar problem, using the fall-back short names GI250A~1 and GI7D29~1, respectively. To accommodate for that, we could reimplement the hashing algorithm, but it is just safer and simpler to provide the known prefixes. This algorithm has been reverse-engineered and described at https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but still available via https://web.archive.org/. These can be recomputed by running the following Perl script: -- snip -- use warnings; use strict; sub compute_short_name_hash ($) { my $checksum = 0; foreach (split('', $_[0])) { $checksum = ($checksum * 0x25 + ord($_)) & 0xffff; } $checksum = ($checksum * 314159269) & 0xffffffff; $checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000); $checksum -= (($checksum * 1152921497) >> 60) * 1000000007; return scalar reverse sprintf("%x", $checksum & 0xffff); } print compute_short_name_hash($ARGV[0]); -- snap -- E.g., running that with the argument ".gitignore" will result in "250a" (which then becomes "gi250a" in the code). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net>
2018-05-21is_ntfs_dotgit: use a size_t for traversing stringLibravatar Jeff King1-1/+1
We walk through the "name" string using an int, which can wrap to a negative value and cause us to read random memory before our array (e.g., by creating a tree with a name >2GB, since "int" is still 32 bits even on most 64-bit platforms). Worse, this is easy to trigger during the fsck_tree() check, which is supposed to be protecting us from malicious garbage. Note one bit of trickiness in the existing code: we sometimes assign -1 to "len" at the end of the loop, and then rely on the "len++" in the for-loop's increment to take it back to 0. This is still legal with a size_t, since assigning -1 will turn into SIZE_MAX, which then wraps around to 0 on increment. Signed-off-by: Jeff King <peff@peff.net>
2018-05-18path.c: migrate global git_path_* to take a repository argumentLibravatar Stefan Beller1-9/+9
Migrate all git_path_* functions that are defined in path.c to take a repository argument. Unlike other patches in this series, do not use the #define trick, as we rewrite the whole function, which is rather small. This doesn't migrate all the functions, as other builtins have their own local path functions defined using GIT_PATH_FUNC. So keep that macro around to serve the other locations. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-23repository: introduce raw object store fieldLibravatar Stefan Beller1-1/+2
The raw object store field will contain any objects needed for access to objects in a given repository. This patch introduces the raw object store and populates it with the `objectdir`, which used to be part of the repository struct. As the struct gains members, we'll also populate the function to clear the memory for these members. In a later step, we'll introduce a struct object_parser, that will complement the object parsing in a repository struct: The raw object parser is the layer that will provide access to raw object content, while the higher level object parser code will parse raw objects and keeps track of parenthood and other object relationships using 'struct object'. For now only add the lower level to the repository struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-28Merge branch 'ao/path-use-xmalloc'Libravatar Junio C Hamano1-1/+1
A possible oom error is now caught as a fatal error, instead of continuing and dereferencing NULL. * ao/path-use-xmalloc: path.c: use xmalloc() in add_to_trie()
2017-10-25path.c: use xmalloc() in add_to_trie()Libravatar Andrey Okoshkin1-1/+1
Add usage of xmalloc() instead of malloc() in add_to_trie() as xmalloc wraps and checks memory allocation result. Signed-off-by: Andrey Okoshkin <a.okoshkin@samsung.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-18Merge branch 'jk/validate-headref-fix' into maintLibravatar Junio C Hamano1-11/+12
Code clean-up. * jk/validate-headref-fix: validate_headref: use get_oid_hex for detached HEADs validate_headref: use skip_prefix for symref parsing validate_headref: NUL-terminate HEAD buffer
2017-10-07Merge branch 'tg/memfixes'Libravatar Junio C Hamano1-5/+4
Fixes for a handful memory access issues identified by valgrind. * tg/memfixes: sub-process: use child_process.args instead of child_process.argv http-push: fix construction of hex value from path path.c: fix uninitialized memory access
2017-10-05Merge branch 'rs/cleanup-strbuf-users'Libravatar Junio C Hamano1-1/+1
Code clean-up. * rs/cleanup-strbuf-users: graph: use strbuf_addchars() to add spaces use strbuf_addstr() for adding strings to strbufs path: use strbuf_add_real_path()
2017-10-04path.c: fix uninitialized memory accessLibravatar Jeff King1-5/+4
In cleanup_path we're passing in a char array, run a memcmp on it, and run through it without ever checking if something is in the array in the first place. This can lead us to access uninitialized memory, for example in t5541-http-push-smart.sh test 7, when run under valgrind: ==4423== Conditional jump or move depends on uninitialised value(s) ==4423== at 0x242FA9: cleanup_path (path.c:35) ==4423== by 0x242FA9: mkpath (path.c:456) ==4423== by 0x256CC7: refname_match (refs.c:364) ==4423== by 0x26C181: count_refspec_match (remote.c:1015) ==4423== by 0x26C181: match_explicit_lhs (remote.c:1126) ==4423== by 0x26C181: check_push_refs (remote.c:1409) ==4423== by 0x2ABB4D: transport_push (transport.c:870) ==4423== by 0x186703: push_with_options (push.c:332) ==4423== by 0x18746D: do_push (push.c:409) ==4423== by 0x18746D: cmd_push (push.c:566) ==4423== by 0x1183E0: run_builtin (git.c:352) ==4423== by 0x11973E: handle_builtin (git.c:539) ==4423== by 0x11973E: run_argv (git.c:593) ==4423== by 0x11973E: main (git.c:698) ==4423== Uninitialised value was created by a heap allocation ==4423== at 0x4C2CD8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==4423== by 0x4C2F195: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==4423== by 0x2C196B: xrealloc (wrapper.c:137) ==4423== by 0x29A30B: strbuf_grow (strbuf.c:66) ==4423== by 0x29A30B: strbuf_vaddf (strbuf.c:277) ==4423== by 0x242F9F: mkpath (path.c:454) ==4423== by 0x256CC7: refname_match (refs.c:364) ==4423== by 0x26C181: count_refspec_match (remote.c:1015) ==4423== by 0x26C181: match_explicit_lhs (remote.c:1126) ==4423== by 0x26C181: check_push_refs (remote.c:1409) ==4423== by 0x2ABB4D: transport_push (transport.c:870) ==4423== by 0x186703: push_with_options (push.c:332) ==4423== by 0x18746D: do_push (push.c:409) ==4423== by 0x18746D: cmd_push (push.c:566) ==4423== by 0x1183E0: run_builtin (git.c:352) ==4423== by 0x11973E: handle_builtin (git.c:539) ==4423== by 0x11973E: run_argv (git.c:593) ==4423== by 0x11973E: main (git.c:698) ==4423== Avoid this by using skip_prefix(), which knows not to go beyond the end of the string. Reported-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-03Merge branch 'jk/validate-headref-fix'Libravatar Junio C Hamano1-11/+12
Code clean-up. * jk/validate-headref-fix: validate_headref: use get_oid_hex for detached HEADs validate_headref: use skip_prefix for symref parsing validate_headref: NUL-terminate HEAD buffer
2017-10-02path: use strbuf_add_real_path()Libravatar René Scharfe1-1/+1
Avoid a string copy to a static buffer by using strbuf_add_real_path() instead of combining strbuf_addstr() and real_path(). Patch generated by Coccinelle and contrib/coccinelle/strbuf.cocci. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27validate_headref: use get_oid_hex for detached HEADsLibravatar Jeff King1-2/+2
If a candidate HEAD isn't a symref, we check that it contains a viable sha1. But in a post-sha1 world, we should be checking whether it has any plausible object-id. We can do that by switching to get_oid_hex(). Note that both before and after this patch, we only check for a plausible object id at the start of the file, and then call that good enough. We ignore any content _after_ the hex, so a string like: 0123456789012345678901234567890123456789 foo is accepted. Though we do put extra bytes like this into some pseudorefs (e.g., FETCH_HEAD), we don't typically do so with HEAD. We could tighten this up by using parse_oid_hex(), like: if (!parse_oid_hex(buffer, &oid, &end) && *end++ == '\n' && *end == '\0') return 0; But we're probably better to remain on the loose side. We're just checking here for a plausible-looking repository directory, so heuristics are acceptable (if we really want to be meticulous, we should use the actual ref code to parse HEAD). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27validate_headref: use skip_prefix for symref parsingLibravatar Jeff King1-9/+6
Since the previous commit guarantees that our symref buffer is NUL-terminated, we can just use skip_prefix() and friends to parse it. This is shorter and saves us having to deal with magic numbers and keeping the "len" counter up to date. While we're at it, let's name the rather obscure "buf" to "refname", since that is the thing we are parsing with it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-27validate_headref: NUL-terminate HEAD bufferLibravatar Jeff King1-0/+4
When we are checking to see if we have a git repo, we peek into the HEAD file and see if it's a plausible symlink, symref, or detached HEAD. For the latter two, we read the contents with read_in_full(), which means they aren't NUL-terminated. The symref check is careful to respect the length we got, but the sha1 check will happily parse up to 40 bytes, even if we read fewer. E.g.,: echo 1234 >.git/HEAD git rev-parse will parse 36 uninitialized bytes from our stack buffer. This isn't a big deal in practice. Our buffer is 256 bytes, so we know we'll never read outside of it. The worst case is that the uninitialized bytes look like valid hex, and we claim a bogus HEAD file is valid. The chances of this happening randomly are quite slim, but let's be careful. One option would be to check that "len == 41" before feeding the buffer to get_sha1_hex(). But we'd like to eventually prepare for a world with variable-length hashes. Let's NUL-terminate as soon as we've read the buffer (we already even leave a spare byte to do so!). That fixes this problem without depending on the size of an object id. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move {,re}prepare_packed_git and approximate_object_countLibravatar Jonathan Tan1-0/+1
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-04Merge tag 'v2.13.5' into maintLibravatar Junio C Hamano1-0/+5
2017-08-01Merge tag 'v2.12.4' into maintLibravatar Junio C Hamano1-0/+5
2017-07-30Merge tag 'v2.10.4' into maint-2.11Libravatar Junio C Hamano1-0/+5
Git 2.10.4
2017-07-30Merge tag 'v2.9.5' into maint-2.10Libravatar Junio C Hamano1-0/+5
Git 2.9.5
2017-07-30Merge tag 'v2.8.6' into maint-2.9Libravatar Junio C Hamano1-0/+5
Git 2.8.6
2017-07-30Merge tag 'v2.7.6' into maint-2.8Libravatar Junio C Hamano1-0/+5
Git 2.7.6
2017-07-28connect: factor out "looks like command line option" checkLibravatar Jeff King1-0/+5
We reject hostnames that start with a dash because they may be confused for command-line options. Let's factor out that notion into a helper function, as we'll use it in more places. And while it's simple now, it's not clear if some systems might need more complex logic to handle all cases. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23path: add repo_worktree_path and strbuf_repo_worktree_pathLibravatar Brandon Williams1-0/+41
Introduce 'repo_worktree_path' and 'strbuf_repo_worktree_path' which take a repository struct and constructs a path relative to the repository's worktree. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23path: add repo_git_path and strbuf_repo_git_pathLibravatar Brandon Williams1-0/+21
Introduce 'repo_git_path' and 'strbuf_repo_git_path' which take a repository struct and constructs a path into the repository's git directory. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23path: worktree_git_path() should not use file relocationLibravatar Brandon Williams1-1/+2
git_path is a convenience function that usually produces a string $GIT_DIR/<path>. Since v2.5.0-rc0~143^2~35 (git_path(): be aware of file relocation in $GIT_DIR, 2014-11-30), as a side benefit callers get support for path relocation variables like $GIT_OBJECT_DIRECTORY: - git_path("index") is $GIT_INDEX_FILE when set - git_path("info/grafts") is $GIT_GRAFTS_FILE when set - git_path("objects/<foo>") is $GIT_OBJECT_DIRECTORY/<foo> when set - git_path("hooks/<foo>") is <foo> under core.hookspath when set - git_path("refs/<foo>") etc (see path.c::common_list) is relative to $GIT_COMMON_DIR instead of $GIT_DIR worktree_git_path, by comparison, is designed to resolve files in a specific worktree's git dir. Unfortunately, it shares code with git_path and performs the same relocation. The result is that paths that are meant to be relative to the specified worktree's git dir end up replaced by paths from environment variables within the current git dir. Luckily, no current callers pass such arguments. The relocation was noticed when testing the result of merging two patches under review, one of which introduces a caller: * The first patch made git prune check the index file in each worktree's git dir (using worktree_git_path(wt, "index")) for objects not to prune. This would trigger the unwanted relocation when GIT_INDEX_FILE is set, causing objects reachable from the index to be pruned. * The second patch simplified the relocation logic for index, info/grafts, objects, and hooks to happen unconditionally instead of based on whether environment or configuration variables are set. This caused the relocation to trigger even when GIT_INDEX_FILE is not set. [jn: rewrote commit message; skipping all relocation instead of just GIT_INDEX_FILE] Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>