summaryrefslogtreecommitdiff
path: root/git-compat-util.h
AgeCommit message (Collapse)AuthorFilesLines
2021-02-12checkout: fix bug that makes checkout follow symlinks in leading pathLibravatar Matheus Tavares1-0/+5
Before checking out a file, we have to confirm that all of its leading components are real existing directories. And to reduce the number of lstat() calls in this process, we cache the last leading path known to contain only directories. However, when a path collision occurs (e.g. when checking out case-sensitive files in case-insensitive file systems), a cached path might have its file type changed on disk, leaving the cache on an invalid state. Normally, this doesn't bring any bad consequences as we usually check out files in index order, and therefore, by the time the cached path becomes outdated, we no longer need it anyway (because all files in that directory would have already been written). But, there are some users of the checkout machinery that do not always follow the index order. In particular: checkout-index writes the paths in the same order that they appear on the CLI (or stdin); and the delayed checkout feature -- used when a long-running filter process replies with "status=delayed" -- postpones the checkout of some entries, thus modifying the checkout order. When we have to check out an out-of-order entry and the lstat() cache is invalid (due to a previous path collision), checkout_entry() may end up using the invalid data and thrusting that the leading components are real directories when, in reality, they are not. In the best case scenario, where the directory was replaced by a regular file, the user will get an error: "fatal: unable to create file 'foo/bar': Not a directory". But if the directory was replaced by a symlink, checkout could actually end up following the symlink and writing the file at a wrong place, even outside the repository. Since delayed checkout is affected by this bug, it could be used by an attacker to write arbitrary files during the clone of a maliciously crafted repository. Some candidate solutions considered were to disable the lstat() cache during unordered checkouts or sort the entries before passing them to the checkout machinery. But both ideas include some performance penalty and they don't future-proof the code against new unordered use cases. Instead, we now manually reset the lstat cache whenever we successfully remove a directory. Note: We are not even checking whether the directory was the same as the lstat cache points to because we might face a scenario where the paths refer to the same location but differ due to case folding, precomposed UTF-8 issues, or the presence of `..` components in the path. Two regression tests, with case-collisions and utf8-collisions, are also added for both checkout-index and delayed checkout. Note: to make the previously mentioned clone attack unfeasible, it would be sufficient to reset the lstat cache only after the remove_subtree() call inside checkout_entry(). This is the place where we would remove a directory whose path collides with the path of another entry that we are currently trying to check out (possibly a symlink). However, in the interest of a thorough fix that does not leave Git open to similar-but-not-identical attack vectors, we decided to intercept all `rmdir()` calls in one fell swoop. This addresses CVE-2021-21300. Co-authored-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
2019-12-06Sync with 2.16.6Libravatar Johannes Schindelin1-0/+4
* maint-2.16: (31 commits) Git 2.16.6 test-drop-caches: use `has_dos_drive_prefix()` Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses ...
2019-12-06Sync with 2.15.4Libravatar Johannes Schindelin1-0/+4
* maint-2.15: (29 commits) Git 2.15.4 Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment ...
2019-12-06Sync with 2.14.6Libravatar Johannes Schindelin1-0/+4
* maint-2.14: (28 commits) Git 2.14.6 mingw: handle `subst`-ed "DOS drives" mingw: refuse to access paths with trailing spaces or periods mingw: refuse to access paths with illegal characters unpack-trees: let merged_entry() pass through do_add_entry()'s errors quote-stress-test: offer to test quoting arguments for MSYS2 sh t6130/t9350: prepare for stringent Win32 path validation quote-stress-test: allow skipping some trials quote-stress-test: accept arguments to test via the command-line tests: add a helper to stress test argument quoting mingw: fix quoting of arguments Disallow dubiously-nested submodule git directories protect_ntfs: turn on NTFS protection by default path: also guard `.gitmodules` against NTFS Alternate Data Streams is_ntfs_dotgit(): speed it up mingw: disallow backslash characters in tree objects' file names path: safeguard `.git` against NTFS Alternate Streams Accesses clone --recurse-submodules: prevent name squatting on Windows is_ntfs_dotgit(): only verify the leading segment test-path-utils: offer to run a protectNTFS/protectHFS benchmark ...
2019-12-05mingw: refuse to access paths with trailing spaces or periodsLibravatar Johannes Schindelin1-0/+4
When creating a directory on Windows whose path ends in a space or a period (or chains thereof), the Win32 API "helpfully" trims those. For example, `mkdir("abc ");` will return success, but actually create a directory called `abc` instead. This stems back to the DOS days, when all file names had exactly 8 characters plus exactly 3 characters for the file extension, and the only way to have shorter names was by padding with spaces. Sadly, this "helpful" behavior is a bit inconsistent: after a successful `mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because the directory `abc ` does not actually exist). Even if it would work, we now have a serious problem because a Git repository could contain directories `abc` and `abc `, and on Windows, they would be "merged" unintentionally. As these paths are illegal on Windows, anyway, let's disallow any accesses to such paths on that Operating System. For practical reasons, this behavior is still guarded by the config setting `core.protectNTFS`: it is possible (and at least two regression tests make use of it) to create commits without involving the worktree. In such a scenario, it is of course possible -- even on Windows -- to create such file names. Among other consequences, this patch disallows submodules' paths to end in spaces on Windows (which would formerly have confused Git enough to try to write into incorrect paths, anyway). While this patch does not fix a vulnerability on its own, it prevents an attack vector that was exploited in demonstrations of a number of recently-fixed security bugs. The regression test added to `t/t7417-submodule-path-url.sh` reflects that attack vector. Note that we have to adjust the test case "prevent git~1 squatting on Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue. It tries to clone two submodules whose names differ only in a trailing period character, and as a consequence their git directories differ in the same way. Previously, when Git tried to clone the second submodule, it thought that the git directory already existed (because on Windows, when you create a directory with the name `b.` it actually creates `b`), but with this patch, the first submodule's clone will fail because of the illegal name of the git directory. Therefore, when cloning the second submodule, Git will take a different code path: a fresh clone (without an existing git directory). Both code paths fail to clone the second submodule, both because the the corresponding worktree directory exists and is not empty, but the error messages are worded differently. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2018-05-22Sync with Git 2.16.4Libravatar Junio C Hamano1-0/+17
* maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-22Sync with Git 2.15.2Libravatar Junio C Hamano1-0/+17
* maint-2.15: Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-22Sync with Git 2.14.4Libravatar Junio C Hamano1-0/+17
* maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-22Sync with Git 2.13.7Libravatar Junio C Hamano1-0/+17
* maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths
2018-05-21skip_prefix: add case-insensitive variantLibravatar Jeff King1-0/+17
We have the convenient skip_prefix() helper, but if you want to do case-insensitive matching, you're stuck doing it by hand. We could add an extra parameter to the function to let callers ask for this, but the function is small and somewhat performance-critical. Let's just re-implement it for the case-insensitive version. Signed-off-by: Jeff King <peff@peff.net>
2018-02-22wrapper: rename 'template' variablesLibravatar Brandon Williams1-2/+2
Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-11git-compat-util: introduce skip_to_optional_arg()Libravatar Christian Couder1-0/+23
We often accept both a "--key" option and a "--key=<val>" option. These options currently are parsed using something like: if (!strcmp(arg, "--key")) { /* do something */ } else if (skip_prefix(arg, "--key=", &arg)) { /* do something with arg */ } which is a bit cumbersome compared to just: if (skip_to_optional_arg(arg, "--key", &arg)) { /* do something with arg */ } This also introduces skip_to_optional_arg_default() for the few cases where something different should be done when the first argument is exactly "--key" than when it is exactly "--key=". In general it is better for UI consistency and simplicity if "--key" and "--key=" do the same thing though, so that using skip_to_optional_arg() should be encouraged compared to skip_to_optional_arg_default(). Note that these functions can be used to parse any "key=value" string where "key" is also considered as valid, not just command line options. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-29Merge branch 'rj/no-sign-compare'Libravatar Junio C Hamano1-2/+4
Many codepaths have been updated to squelch -Wsign-compare warnings. * rj/no-sign-compare: ALLOC_GROW: avoid -Wsign-compare warnings cache.h: hex2chr() - avoid -Wsign-compare warnings commit-slab.h: avoid -Wsign-compare warnings git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings
2017-09-22git-compat-util.h: xsize_t() - avoid -Wsign-compare warningsLibravatar Ramsay Jones1-2/+4
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-20git-compat-util: make UNLEAK less error-proneLibravatar Jonathan Tan1-2/+2
Commit 0e5bba5 ("add UNLEAK annotation for reducing leak false positives", 2017-09-08) introduced an UNLEAK macro to be used as "UNLEAK(var);", but its existing definitions leave semicolons that act as empty statements, which will lead to syntax errors, e.g. if (condition) UNLEAK(var); else something_else(var); would be broken with two statements between if (condition) and else. Lose the excess semicolon from the end of the macro replacement text. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-08add UNLEAK annotation for reducing leak false positivesLibravatar Jeff King1-0/+20
It's a common pattern in git commands to allocate some memory that should last for the lifetime of the program and then not bother to free it, relying on the OS to throw it away. This keeps the code simple, and it's fast (we don't waste time traversing structures or calling free at the end of the program). But it also triggers warnings from memory-leak checkers like valgrind or LSAN. They know that the memory was still allocated at program exit, but they don't know _when_ the leaked memory stopped being useful. If it was early in the program, then it's probably a real and important leak. But if it was used right up until program exit, it's not an interesting leak and we'd like to suppress it so that we can see the real leaks. This patch introduces an UNLEAK() macro that lets us do so. To understand its design, let's first look at some of the alternatives. Unfortunately the suppression systems offered by leak-checking tools don't quite do what we want. A leak-checker basically knows two things: 1. Which blocks were allocated via malloc, and the callstack during the allocation. 2. Which blocks were left un-freed at the end of the program (and which are unreachable, but more on that later). Their suppressions work by mentioning the function or callstack of a particular allocation, and marking it as OK to leak. So imagine you have code like this: int cmd_foo(...) { /* this allocates some memory */ char *p = some_function(); printf("%s", p); return 0; } You can say "ignore allocations from some_function(), they're not leaks". But that's not right. That function may be called elsewhere, too, and we would potentially want to know about those leaks. So you can say "ignore the callstack when main calls some_function". That works, but your annotations are brittle. In this case it's only two functions, but you can imagine that the actual allocation is much deeper. If any of the intermediate code changes, you have to update the suppression. What we _really_ want to say is that "the value assigned to p at the end of the function is not a real leak". But leak-checkers can't understand that; they don't know about "p" in the first place. However, we can do something a little bit tricky if we make some assumptions about how leak-checkers work. They generally don't just report all un-freed blocks. That would report even globals which are still accessible when the leak-check is run. Instead they take some set of memory (like BSS) as a root and mark it as "reachable". Then they scan the reachable blocks for anything that looks like a pointer to a malloc'd block, and consider that block reachable. And then they scan those blocks, and so on, transitively marking anything reachable from a global as "not leaked" (or at least leaked in a different category). So we can mark the value of "p" as reachable by putting it into a variable with program lifetime. One way to do that is to just mark "p" as static. But that actually affects the run-time behavior if the function is called twice (you aren't likely to call main() twice, but some of our cmd_*() functions are called from other commands). Instead, we can trick the leak-checker by putting the value into _any_ reachable bytes. This patch keeps a global linked-list of bytes copied from "unleaked" variables. That list is reachable even at program exit, which confers recursive reachability on whatever values we unleak. In other words, you can do: int cmd_foo(...) { char *p = some_function(); printf("%s", p); UNLEAK(p); return 0; } to annotate "p" and suppress the leak report. But wait, couldn't we just say "free(p)"? In this toy example, yes. But UNLEAK()'s byte-copying strategy has several advantages over actually freeing the memory: 1. It's recursive across structures. In many cases our "p" is not just a pointer, but a complex struct whose fields may have been allocated by a sub-function. And in some cases (e.g., dir_struct) we don't even have a function which knows how to free all of the struct members. By marking the struct itself as reachable, that confers reachability on any pointers it contains (including those found in embedded structs, or reachable by walking heap blocks recursively. 2. It works on cases where we're not sure if the value is allocated or not. For example: char *p = argc > 1 ? argv[1] : some_function(); It's safe to use UNLEAK(p) here, because it's not freeing any memory. In the case that we're pointing to argv here, the reachability checker will just ignore our bytes. 3. Likewise, it works even if the variable has _already_ been freed. We're just copying the pointer bytes. If the block has been freed, the leak-checker will skip over those bytes as uninteresting. 4. Because it's not actually freeing memory, you can UNLEAK() before we are finished accessing the variable. This is helpful in cases like this: char *p = some_function(); return another_function(p); Writing this with free() requires: int ret; char *p = some_function(); ret = another_function(p); free(p); return ret; But with unleak we can just write: char *p = some_function(); UNLEAK(p); return another_function(p); This patch adds the UNLEAK() macro and enables it automatically when Git is compiled with SANITIZE=leak. In normal builds it's a noop, so we pay no runtime cost. It also adds some UNLEAK() annotations to show off how the feature works. On top of other recent leak fixes, these are enough to get t0000 and t0001 to pass when compiled with LSAN. Note the case in commit.c which actually converts a strbuf_release() into an UNLEAK. This code was already non-leaky, but the free didn't do anything useful, since we're exiting. Converting it to an annotation means that non-leak-checking builds pay no runtime cost. The cost is minimal enough that it's probably not worth going on a crusade to convert these kinds of frees to UNLEAKS. I did it here for consistency with the "sb" leak (though it would have been equally correct to go the other way, and turn them both into strbuf_release() calls). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move release_pack_memory()Libravatar Jonathan Tan1-2/+0
The function unuse_one_window() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23Merge branch 'rs/move-array' into maintLibravatar Junio C Hamano1-0/+8
Code clean-up. * rs/move-array: ls-files: don't try to prune an empty index apply: use COPY_ARRAY and MOVE_ARRAY in update_image() use MOVE_ARRAY add MOVE_ARRAY
2017-08-11Merge branch 'rs/move-array'Libravatar Junio C Hamano1-0/+8
Code clean-up. * rs/move-array: ls-files: don't try to prune an empty index apply: use COPY_ARRAY and MOVE_ARRAY in update_image() use MOVE_ARRAY add MOVE_ARRAY
2017-07-18Merge branch 'tb/push-to-cygwin-unc-path'Libravatar Junio C Hamano1-0/+3
On Cygwin, similar to Windows, "git push //server/share/repository" ought to mean a repository on a network share that can be accessed locally, but this did not work correctly due to stripping the double slashes at the beginning. This may need to be heavily tested before it gets unleashed to the wild, as the change is at a fairly low-level code and would affect not just the code to decide if the push destination is local. There may be unexpected fallouts in the path normalization. * tb/push-to-cygwin-unc-path: cygwin: allow pushing to UNC paths
2017-07-17add MOVE_ARRAYLibravatar René Scharfe1-0/+8
Similar to COPY_ARRAY (introduced in 60566cbb58), add a safe and convenient helper for moving potentially overlapping ranges of array entries. It infers the element size, multiplies automatically and safely to get the size in bytes, does a basic type safety check by comparing element sizes and unlike memmove(3) it supports NULL pointers iff 0 elements are to be moved. Also add a semantic patch to demonstrate the helper's intended usage. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05cygwin: allow pushing to UNC pathsLibravatar Torsten Bögershausen1-0/+3
cygwin can use an UNC path like //server/share/repo $ cd //server/share/dir $ mkdir test $ cd test $ git init --bare However, when we try to push from a local Git repository to this repo, there is a problem: Git converts the leading "//" into a single "/". As cygwin handles an UNC path so well, Git can support them better: - Introduce cygwin_offset_1st_component() which keeps the leading "//", similar to what Git for Windows does. - Move CYGWIN out of the POSIX in the tests for path normalization in t0060 Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULLLibravatar Ævar Arnfjörð Bjarmason1-0/+6
Add a FREE_AND_NULL() wrapper marco for the common pattern of freeing a pointer and assigning NULL to it right afterwards. The implementation is similar to the (currently unused) XDL_PTRFREE macro in xdiff/xmacros.h added in commit 3443546f6e ("Use a *real* built-in diff generator", 2006-03-24). The only difference is that free() is called unconditionally, see [1]. See [2] for a suggested alternative which does this via a function instead of a macro. As covered in replies to that message, while it's a viable approach, it would introduce caveats which this approach doesn't have, so that potential change is left to a future follow-up change. This merely allows us to translate exactly what we're doing now to a less verbose & idiomatic form using a macro, while guaranteeing that we don't introduce any functional changes. 1. <alpine.DEB.2.20.1608301948310.129229@virtualbox> (http://public-inbox.org/git/alpine.DEB.2.20.1608301948310.129229@virtualbox/) 2. <20170610032143.GA7880@starla> (https://public-inbox.org/git/20170610032143.GA7880@starla/) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13Merge branch 'nd/fopen-errors'Libravatar Junio C Hamano1-6/+9
We often try to open a file for reading whose existence is optional, and silently ignore errors from open/fopen; report such errors if they are not due to missing files. * nd/fopen-errors: mingw_fopen: report ENOENT for invalid file names mingw: verify that paths are not mistaken for remote nicknames log: fix memory leak in open_next_file() rerere.c: move error_errno() closer to the source system call print errno when reporting a system call error wrapper.c: make warn_on_inaccessible() static wrapper.c: add and use fopen_or_warn() wrapper.c: add and use warn_on_fopen_errors() config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD clone: use xfopen() instead of fopen() use xfopen() in more places git_fopen: fix a sparse 'not declared' warning
2017-06-13Merge branch 'jc/noent-notdir'Libravatar Junio C Hamano1-0/+15
Our code often opens a path to an optional file, to work on its contents when we can successfully open it. We can ignore a failure to open if such an optional file does not exist, but we do want to report a failure in opening for other reasons (e.g. we got an I/O error, or the file is there, but we lack the permission to open). The exact errors we need to ignore are ENOENT (obviously) and ENOTDIR (less obvious). Instead of repeating comparison of errno with these two constants, introduce a helper function to do so. * jc/noent-notdir: treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked compat-util: is_missing_file_error()
2017-06-13Merge branch 'bw/forking-and-threading' into maintLibravatar Junio C Hamano1-1/+0
The "run-command" API implementation has been made more robust against dead-locking in a threaded environment. * bw/forking-and-threading: usage.c: drop set_error_handle() run-command: restrict PATH search to executable files run-command: expose is_executable function run-command: block signals between fork and execve run-command: add note about forking and threading run-command: handle dup2 and close errors in child run-command: eliminate calls to error handling functions in child run-command: don't die in child when duping /dev/null run-command: prepare child environment before forking string-list: add string_list_remove function run-command: use the async-signal-safe execv instead of execvp run-command: prepare command before forking t0061: run_command executes scripts without a #! line t5550: use write_script to generate post-update hook
2017-05-30Merge branch 'bw/forking-and-threading'Libravatar Junio C Hamano1-1/+0
The "run-command" API implementation has been made more robust against dead-locking in a threaded environment. * bw/forking-and-threading: usage.c: drop set_error_handle() run-command: restrict PATH search to executable files run-command: expose is_executable function run-command: block signals between fork and execve run-command: add note about forking and threading run-command: handle dup2 and close errors in child run-command: eliminate calls to error handling functions in child run-command: don't die in child when duping /dev/null run-command: prepare child environment before forking string-list: add string_list_remove function run-command: use the async-signal-safe execv instead of execvp run-command: prepare command before forking t0061: run_command executes scripts without a #! line t5550: use write_script to generate post-update hook
2017-05-30compat-util: is_missing_file_error()Libravatar Junio C Hamano1-0/+15
Our code often opens a path to an optional file, to work on its contents when we can successfully open it. We can ignore a failure to open if such an optional file does not exist, but we do want to report a failure in opening for other reasons (e.g. we got an I/O error, or the file is there, but we lack the permission to open). The exact errors we need to ignore are ENOENT (obviously) and ENOTDIR (less obvious). Instead of repeating comparison of errno with these two constants, introduce a helper function to do so. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'jk/bug-to-abort'Libravatar Junio C Hamano1-0/+9
Introduce the BUG() macro to improve die("BUG: ..."). * jk/bug-to-abort: usage: add NORETURN to BUG() function definitions config: complain about --local outside of a git repo setup_git_env: convert die("BUG") to BUG() usage.c: add BUG() function
2017-05-26wrapper.c: make warn_on_inaccessible() staticLibravatar Nguyễn Thái Ngọc Duy1-2/+0
After the last patch, this function is not used outside anymore. Keep it static. Noticed-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26wrapper.c: add and use fopen_or_warn()Libravatar Nguyễn Thái Ngọc Duy1-0/+1
When fopen() returns NULL, it could be because the given path does not exist, but it could also be some other errors and the caller has to check. Add a wrapper so we don't have to repeat the same error check everywhere. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26wrapper.c: add and use warn_on_fopen_errors()Libravatar Nguyễn Thái Ngọc Duy1-0/+2
In many places, Git warns about an inaccessible file after a fopen() failed. To discern these cases from other cases where we want to warn about inaccessible files, introduce a new helper specifically to test whether fopen() failed because the current user lacks the permission to open file in question. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26git_fopen: fix a sparse 'not declared' warningLibravatar Ramsay Jones1-4/+6
If git is built with the FREAD_READS_DIRECTORIES build variable set, this would cause sparse to issue a 'not declared, should it be static?' warning on Linux. This is a result of the method employed by 'compat/fopen.c' to suppress the (possible) redefinition of the (system) fopen macro, which also removes the extern declaration of the git_fopen function. In order to suppress the warning, introduce a new macro to suppress the definition (or possibly the re-definition) of the fopen symbol as a macro override. This new macro (SUPPRESS_FOPEN_REDEFINITION) is only defined in 'compat/fopen.c', just prior to the inclusion of the 'git-compat-util.h' header file. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-16Merge branch 'js/larger-timestamps'Libravatar Junio C Hamano1-0/+5
Some platforms have ulong that is smaller than time_t, and our historical use of ulong for timestamp would mean they cannot represent some timestamp that the platform allows. Invent a separate and dedicated timestamp_t (so that we can distingiuish timestamps and a vanilla ulongs, which along is already a good move), and then declare uintmax_t is the type to be used as the timestamp_t. * js/larger-timestamps: archive-tar: fix a sparse 'constant too large' warning use uintmax_t for timestamps date.c: abort if the system time cannot handle one of our timestamps timestamp_t: a new data type for timestamps PRItime: introduce a new "printf format" for timestamps parse_timestamp(): specify explicitly where we parse timestamps t0006 & t5000: skip "far in the future" test when time_t is too limited t0006 & t5000: prepare for 64-bit timestamps ref-filter: avoid using `unsigned long` for catch-all data type
2017-05-16Merge branch 'dt/raise-core-packed-git-limit'Libravatar Junio C Hamano1-1/+1
The default packed-git limit value has been raised on larger platforms to save "git fetch" from a (recoverable) failure while "gc" is running in parallel. * dt/raise-core-packed-git-limit: Increase core.packedGitLimit
2017-05-15usage.c: drop set_error_handle()Libravatar Jeff King1-1/+0
The set_error_handle() function was introduced by 3b331e926 (vreportf: report to arbitrary filehandles, 2015-08-11) so that run-command could send post-fork, pre-exec errors to the parent's original stderr. That use went away in 79319b194 (run-command: eliminate calls to error handling functions in child, 2017-04-19), which pushes all of the error reporting to the parent. This leaves no callers of set_error_handle(). As we're not likely to add any new ones, let's drop it. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Brandon Williams <bmwill@google.com> Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-15usage.c: add BUG() functionLibravatar Jeff King1-0/+9
There's a convention in Git's code base to write assertions as: if (...some_bad_thing...) die("BUG: the terrible thing happened"); with the idea that users should never see a "BUG:" message (but if they, it at least gives a clue what happened). We use die() here because it's convenient, but there are a few draw-backs: 1. Without parsing the messages, it's hard for callers to distinguish BUG assertions from regular errors. For instance, it would be nice if the test suite could check that we don't hit any assertions, but test_must_fail will pass BUG deaths as OK. 2. It would be useful to add more debugging features to BUG assertions, like file/line numbers or dumping core. 3. The die() handler can be replaced, and might not actually exit the whole program (e.g., it may just pthread_exit()). This is convenient for normal errors, but for an assertion failure (which is supposed to never happen), we're probably better off taking down the whole process as quickly and cleanly as possible. We could address these by checking in die() whether the error message starts with "BUG", and behaving appropriately. But there's little advantage at that point to sharing the die() code, and only downsides (e.g., we can't change the BUG() interface independently). Moreover, converting all of the existing BUG calls reveals that the test suite does indeed trigger a few of them. Instead, this patch introduces a new BUG() function, which prints an error before dying via SIGABRT. This gives us test suite checking and core dumps. The function is actually a macro (when supported) so that we can show the file/line number. We can convert die("BUG") invocations to BUG() in further patches, dealing with any test fallouts individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27use uintmax_t for timestampsLibravatar Johannes Schindelin1-4/+4
Previously, we used `unsigned long` for timestamps. This was only a good choice on Linux, where we know implicitly that `unsigned long` is what is used for `time_t`. However, we want to use a different data type for timestamps for two reasons: - there is nothing that says that `unsigned long` should be the same data type as `time_t`, and indeed, on 64-bit Windows for example, it is not: `unsigned long` is 32-bit but `time_t` is 64-bit. - even on 32-bit Linux, where `unsigned long` (and thereby `time_t`) is 32-bit, we *want* to be able to encode timestamps in Git that are currently absurdly far in the future, *even if* the system library is not able to format those timestamps into date strings. So let's just switch to the maximal integer type available, which should be at least 64-bit for all practical purposes these days. It certainly cannot be worse than `unsigned long`, so... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27timestamp_t: a new data type for timestampsLibravatar Johannes Schindelin1-0/+2
Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23Merge branch 'dt/xgethostname-nul-termination'Libravatar Junio C Hamano1-0/+6
gethostname(2) may not NUL terminate the buffer if hostname does not fit; unfortunately there is no easy way to see if our buffer was too small, but at least this will make sure we will not end up using garbage past the end of the buffer. * dt/xgethostname-nul-termination: xgethostname: handle long hostnames use HOST_NAME_MAX to size buffers for gethostname(2)
2017-04-23PRItime: introduce a new "printf format" for timestampsLibravatar Johannes Schindelin1-0/+1
Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is *larger* than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23parse_timestamp(): specify explicitly where we parse timestampsLibravatar Johannes Schindelin1-0/+2
Currently, Git's source code represents all timestamps as `unsigned long`. In preparation for using a more appropriate data type, let's introduce a symbol `parse_timestamp` (currently being defined to `strtoul`) where appropriate, so that we can later easily switch to, say, use `strtoull()` instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-20Increase core.packedGitLimitLibravatar David Turner1-1/+1
When core.packedGitLimit is exceeded, git will close packs. If there is a repack operation going on in parallel with a fetch, the fetch might open a pack, and then be forced to close it due to packedGitLimit being hit. The repack could then delete the pack out from under the fetch, causing the fetch to fail. Increase core.packedGitLimit's default value to prevent this. On current 64-bit x86_64 machines, 48 bits of address space are available. It appears that 64-bit ARM machines have no standard amount of address space (that is, it varies by manufacturer), and IA64 and POWER machines have the full 64 bits. So 48 bits is the only limit that we can reasonably care about. We reserve a few bits of the 48-bit address space for the kernel's use (this is not strictly necessary, but it's better to be safe), and use up to the remaining 45. No git repository will be anywhere near this large any time soon, so this should prevent the failure. Helped-by: Jeff King <peff@peff.net> Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18xgethostname: handle long hostnamesLibravatar David Turner1-0/+2
If the full hostname doesn't fit in the buffer supplied to gethostname, POSIX does not specify whether the buffer will be null-terminated, so to be safe, we should do it ourselves. Introduce new function, xgethostname, which ensures that there is always a \0 at the end of the buffer. Signed-off-by: David Turner <dturner@twosigma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-18use HOST_NAME_MAX to size buffers for gethostname(2)Libravatar René Scharfe1-0/+4
POSIX limits the length of host names to HOST_NAME_MAX. Export the fallback definition from daemon.c and use this constant to make all buffers used with gethostname(2) big enough for any possible result and a terminating NUL. Inspired-by: David Turner <dturner@twosigma.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-28Merge branch 'jk/pack-name-cleanups' into maintLibravatar Junio C Hamano1-2/+0
Code clean-up. * jk/pack-name-cleanups: index-pack: make pointer-alias fallbacks safer replace snprintf with odb_pack_name() odb_pack_keep(): stop generating keepfile name sha1_file.c: make pack-name helper globally accessible move odb_* declarations out of git-compat-util.h
2017-03-21Merge branch 'jk/pack-name-cleanups'Libravatar Junio C Hamano1-2/+0
Code clean-up. * jk/pack-name-cleanups: index-pack: make pointer-alias fallbacks safer replace snprintf with odb_pack_name() odb_pack_keep(): stop generating keepfile name sha1_file.c: make pack-name helper globally accessible move odb_* declarations out of git-compat-util.h
2017-03-16move odb_* declarations out of git-compat-util.hLibravatar Jeff King1-2/+0
These functions were originally conceived as wrapper functions similar to xmkstemp(). They were later moved by 463db9b10 (wrapper: move odb_* to environment.c, 2010-11-06). The more appropriate place for a declaration is in cache.h. While we're at it, let's add some basic docstrings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-28wrapper.c: remove unused gitmkstemps() functionLibravatar Ramsay Jones1-5/+0
The last call to the mkstemps() function was removed in commit 659488326 ("wrapper.c: delete dead function git_mkstemps()", 22-04-2016). In order to support platforms without mkstemps(), this functionality was provided, along with a Makefile build variable (NO_MKSTEMPS), by the gitmkstemps() function. Remove the dead code, along with the defunct build machinery. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-15Merge branch 'rs/swap'Libravatar Junio C Hamano1-0/+10
Code clean-up. * rs/swap: graph: use SWAP macro diff: use SWAP macro use SWAP macro apply: use SWAP macro add SWAP macro