summaryrefslogtreecommitdiff
path: root/compat
AgeCommit message (Collapse)AuthorFilesLines
2022-03-10core.fsyncmethod: add writeout-only modeLibravatar Neeraj Singh2-0/+31
This commit introduces the `core.fsyncMethod` configuration knob, which can currently be set to `fsync` or `writeout-only`. The new writeout-only mode attempts to tell the operating system to flush its in-memory page cache to the storage hardware without issuing a CACHE_FLUSH command to the storage controller. Writeout-only fsync is significantly faster than a vanilla fsync on common hardware, since data is written to a disk-side cache rather than all the way to a durable medium. Later changes in this patch series will take advantage of this primitive to implement batching of hardware flushes. When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the caller is expected to do an ordinary fsync as needed. On Apple platforms, the fsync system call does not issue a CACHE_FLUSH directive to the storage controller. This change updates fsync to do fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity with existing behavior on Apple platforms by setting the default value of the new core.fsyncMethod option. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-10wrapper: make inclusion of Windows csprng header tightly scopedLibravatar Neeraj Singh1-5/+0
Including NTSecAPI.h in git-compat-util.h causes build errors in any other file that includes winternl.h. NTSecAPI.h was included in order to get access to the RtlGenRandom cryptographically secure PRNG. This change scopes the inclusion of ntsecapi.h to wrapper.c, which is the only place that it's actually needed. The build breakage is due to the definition of UNICODE_STRING in NtSecApi.h: #ifndef _NTDEF_ typedef LSA_UNICODE_STRING UNICODE_STRING, *PUNICODE_STRING; typedef LSA_STRING STRING, *PSTRING ; #endif LsaLookup.h: typedef struct _LSA_UNICODE_STRING { USHORT Length; USHORT MaximumLength; #ifdef MIDL_PASS [size_is(MaximumLength/2), length_is(Length/2)] #endif // MIDL_PASS PWSTR Buffer; } LSA_UNICODE_STRING, *PLSA_UNICODE_STRING; winternl.h also defines UNICODE_STRING: typedef struct _UNICODE_STRING { USHORT Length; USHORT MaximumLength; PWSTR Buffer; } UNICODE_STRING; typedef UNICODE_STRING *PUNICODE_STRING; Both definitions have equivalent layouts. Apparently these internal Windows headers aren't designed to be included together. This is an oversight in the headers and does not represent an incompatibility between the APIs. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16Merge branch 'ab/auto-detect-zlib-compress2'Libravatar Junio C Hamano1-5/+6
The build procedure has been taught to notice older version of zlib and enable our replacement uncompress2() automatically. * ab/auto-detect-zlib-compress2: compat: auto-detect if zlib has uncompress2()
2022-02-11Merge branch 'bc/csprng-mktemps'Libravatar Junio C Hamano1-0/+6
Pick a better random number generator and use it when we prepare temporary filenames. * bc/csprng-mktemps: wrapper: use a CSPRNG to generate random file names wrapper: add a helper to generate numbers from a CSPRNG
2022-02-05Merge branch 'jc/qsort-s-alignment-fix'Libravatar Junio C Hamano1-10/+4
Fix a hand-rolled alloca() imitation that may have violated alignment requirement of data being sorted in compatibility implementation of qsort_s() and stable qsort(). * jc/qsort-s-alignment-fix: stable-qsort: avoid using potentially unaligned access compat/qsort_s.c: avoid using potentially unaligned access
2022-01-26compat: auto-detect if zlib has uncompress2()Libravatar Ævar Arnfjörð Bjarmason1-5/+6
We have a copy of uncompress2() implementation in compat/ so that we can build with an older version of zlib that lack the function, and the build procedure selects if it is used via the NO_UNCOMPRESS2 $(MAKE) variable. This is yet another "annoying" knob the porters need to tweak on platforms that are not common enough to have the default set in the config.mak.uname file. Attempt to instead ask the system header <zlib.h> to decide if we need the compatibility implementation. This is a deviation from the way we have been handling the "compatiblity" features so far, and if it can be done cleanly enough, it could work as a model for features that need compatibility definition we discover in the future. With that goal in mind, avoid expedient but ugly hacks, like shoving the code that is conditionally compiled into an unrelated .c file, which may not work in future cases---instead, take an approach that uses a file that is independently compiled and stands on its own. Compile and link compat/zlib-uncompress2.c file unconditionally, but conditionally hide the implementation behind #if/#endif when zlib version is 1.2.9 or newer, and unconditionally archive the resulting object file in the libgit.a to be picked up by the linker. There are a few things to note in the shape of the code base after this change: - We no longer use NO_UNCOMPRESS2 knob; if the system header <zlib.h> claims a version that is more cent than the library actually is, this would break, but it is easy to add it back when we find such a system. - The object file compat/zlib-uncompress2.o is always compiled and archived in libgit.a, just like a few other compat/ object files already are. - The inclusion of <zlib.h> is done in <git-compat-util.h>; we used to do so from <cache.h> which includes <git-compat-util.h> as the first thing it does, so from the *.c codes, there is no practical change. - Until objects in libgit.a that is already used gains a reference to the function, the reftable code will be the only one that wants it, so libgit.a on the linker command line needs to appear once more at the end to satisify the mutual dependency. - Beat found a trick used by OpenSSL to avoid making the conditionally-compiled object truly empty (apparently because they had to deal with compilers that do not want to see an effectively empty input file). Our compat/zlib-uncompress2.c file borrows the same trick for portabilty. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-19getcwd(mingw): handle the case when there is no cwdLibravatar Johannes Schindelin1-0/+4
A recent upstream topic introduced checks for certain Git commands that prevent them from deleting the current working directory, introducing also a regression test that ensures that commands such as `git version` _can_ run without a current working directory. While technically not possible on Windows via the regular Win32 API, we do run the regression tests in an MSYS2 Bash which uses a POSIX emulation layer (the MSYS2/Cygwin runtime) where a really evil hack _does_ allow to delete a directory even if it is the current working directory. Therefore, Git needs to be prepared for a missing working directory, even on Windows. This issue was not noticed in upstream Git because there was no caller that tried to discover a Git directory with a deleted current working directory in the test suite. But in the microsoft/git fork, we do want to run `pre-command`/`post-command` hooks for every command, even for `git version`, which means that we make precisely such a call. The bug is not in that `pre-command`/`post-command` feature, though, but in `mingw_getcwd()` and needs to be addressed there. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-17wrapper: add a helper to generate numbers from a CSPRNGLibravatar brian m. carlson1-0/+6
There are many situations in which having access to a cryptographically secure pseudorandom number generator (CSPRNG) is helpful. In the future, we'll encounter one of these when dealing with temporary files. To make this possible, let's add a function which reads from a system CSPRNG and returns some bytes. We know that all systems will have such an interface. A CSPRNG is required for a secure TLS or SSH implementation and a Git implementation which provided neither would be of little practical use. In addition, POSIX is set to standardize getentropy(2) in the next version, so in the (potentially distant) future we can rely on that. For systems which lack one of the other interfaces, we provide the ability to use OpenSSL's CSPRNG. OpenSSL is highly portable and functions on practically every known OS, and we know it will have access to some source of cryptographically secure randomness. We also provide support for the arc4random in libbsd for folks who would prefer to use that. Because this is a security sensitive interface, we take some precautions. We either succeed by filling the buffer completely as we requested, or we fail. We don't return partial data because the caller will almost never find that to be a useful behavior. Specify a makefile knob which users can use to specify one or more suitable CSPRNGs, and turn the multiple string options into a set of defines, since we cannot match on strings in the preprocessor. We allow multiple options to make the job of handling this in autoconf easier. The order of options is important here. On systems with arc4random, which is most of the BSDs, we use that, since, except on MirBSD and macOS, it uses ChaCha20, which is extremely fast, and sits entirely in userspace, avoiding a system call. We then prefer getrandom over getentropy, because the former has been available longer on Linux, and then OpenSSL. Finally, if none of those are available, we use /dev/urandom, because most Unix-like operating systems provide that API. We prefer options that don't involve device files when possible because those work in some restricted environments where device files may not be available. Set the configuration variables appropriately for Linux and the BSDs, including macOS, as well as Windows and NonStop. We specifically only consider versions which receive publicly available security support here. For the same reason, we don't specify getrandom(2) on Linux, because CentOS 7 doesn't support it in glibc (although its kernel does) and we don't want to resort to making syscalls. Finally, add a test helper to allow this to be tested by hand and in tests. We don't add any tests, since invoking the CSPRNG is not likely to produce interesting, reproducible results. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-12Merge branch 'ma/windows-dynload-fix'Libravatar Junio C Hamano4-9/+12
Fix calling dynamically loaded functions on Windows. * ma/windows-dynload-fix: lazyload: use correct calling conventions
2022-01-09lazyload: use correct calling conventionsLibravatar Matthias Aßhauer4-9/+12
Christoph Reiter reported on the Git for Windows issue tracker[1], that mingw_strftime() imports strftime() from ucrtbase.dll with the wrong calling convention. It should be __cdecl instead of WINAPI, which we always use in DECLARE_PROC_ADDR(). The MSYS2 project encountered cmake sefaults on x86 Windows caused by the same issue in the cmake source. [2] There are no known git crashes that where caused by this, yet, but we should try to prevent them. We import two other non-WINAPI functions via DECLARE_PROC_ADDR(), too. * NtSetSystemInformation() (NTAPI) * GetUserNameExW() (SEC_ENTRY) NTAPI, SEC_ENTRY and WINAPI are all ususally defined as __stdcall, but there are circumstances where they're defined differently. Teach DECLARE_PROC_ADDR() about calling conventions and be explicit about when we want to use which calling convention. Import winnt.h for the definition of NTAPI and sspi.h for SEC_ENTRY near their respective only users. [1] https://github.com/git-for-windows/git/issues/3560 [2] https://github.com/msys2/MINGW-packages/issues/10152 Reported-By: Christoph Reiter <reiter.christoph@gmail.com> Signed-off-by: Matthias Aßhauer <mha1993@live.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-07compat/qsort_s.c: avoid using potentially unaligned accessLibravatar Junio C Hamano1-10/+4
The compatibility definition for qsort_s() uses "char buffer[1024]" on the stack to avoid making malloc() calls for small temporary space, which essentially hand-rolls alloca(). But the elements of the array being sorted may have alignment needs more strict than what an array of bytes may have. &buf[0] may be word aligned, but using the address as if it stores the first element of an array of a struct, whose first member may need to be aligned on double-word boundary, would be a no-no. We could use xalloca() from git-compat-util.h, or alloca() directly on platforms with HAVE_ALLOCA_H, but let's try using unconditionally xmalloc() before we know the performance characteristics of the callers. It may not make much of an argument to inspect the current callers and say "it shouldn't matter to any of them", but anyway: * The one in object-name.c is used to sort potential matches to a given ambiguous object name prefix in the error path; * The one in pack-write.c is done once per a pack .idx file being written to create the reverse index, so (1) the cost of malloc() overhead is dwarfed by the cost of the packing operation, and (2) the number of entries being sorted is the number of objects in a pack; * The one in ref-filter.c is used by "branch --list", "tag --list", and "for-each-ref", only once per operation. We sort an array of pointers with entries, each corresponding to a ref that is shown. * The one in string-list.c is used by sort_string_list(), which is way too generic to assume any access patterns, so it may or may not matter, but I do not care too much ;-) Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15Merge branch 'hn/reftable'Libravatar Junio C Hamano2-0/+96
The "reftable" backend for the refs API, without integrating into the refs subsystem, has been added. * hn/reftable: Add "test-tool dump-reftable" command. reftable: add dump utility reftable: implement stack, a mutable database of reftable files. reftable: implement refname validation reftable: add merged table view reftable: add a heap-based priority queue for reftable records reftable: reftable file level tests reftable: read reftable files reftable: generic interface to tables reftable: write reftable files reftable: a generic binary tree implementation reftable: reading/writing blocks Provide zlib's uncompress2 from compat/zlib-compat.c reftable: (de)serialization for the polymorphic record type. reftable: add blocksource, an abstraction for random access reads reftable: utility functions reftable: add error related functionality reftable: add LICENSE hash.h: provide constants for the hash IDs
2021-12-10Merge branch 'cb/mingw-gmtime-r'Libravatar Junio C Hamano1-0/+2
Build fix on Windows. * cb/mingw-gmtime-r: mingw: avoid fallback for {local,gm}time_r()
2021-11-29Merge branch 'jc/unsetenv-returns-an-int'Libravatar Junio C Hamano1-1/+3
The compatibility implementation for unsetenv(3) were written to mimic ancient, non-POSIX, variant seen in an old glibc; it has been changed to return an integer to match the more modern era. * jc/unsetenv-returns-an-int: unsetenv(3) returns int, not void
2021-11-27mingw: avoid fallback for {local,gm}time_r()Libravatar Carlo Marcelo Arenas Belón1-0/+2
mingw-w64's pthread_unistd.h had a bug that mistakenly (because there is no support for the *lockfile() functions required[1]) defined _POSIX_THREAD_SAFE_FUNCTIONS and that was being worked around since 3ecd153a3b (compat/mingw: support MSys2-based MinGW build, 2016-01-14). The bug was fixed in winphtreads, but as a side effect, leaves the reentrant functions from time.h no longer visible and therefore breaks the build. Since the intention all along was to avoid using the fallback functions, formalize the use of POSIX by setting the corresponding feature flag and compile out the implementation for the fallback functions. [1] https://unix.org/whitepapers/reentrant.html Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-10simple-ipc: work around issues with Cygwin's Unix socket emulationLibravatar Johannes Schindelin1-0/+22
Cygwin emulates Unix sockets by writing files with custom contents and then marking them as system files. The tricky problem is that while the file is written and its `system` bit is set, it is still identified as a file. This caused test failures when Git is too fast looking for the Unix sockets and then complains that there is a plain file in the way. Let's work around this by adding a delayed retry loop, specifically for Cygwin. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Tested-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-29unsetenv(3) returns int, not voidLibravatar Junio C Hamano1-1/+3
This compatilibity implementation has been returning a wrong type, ever since 731043fd (Add compat/unsetenv.c ., 2006-01-25) added to the system, yet nobody noticed it in the past 16 years, presumably because no code checks failures in their unsetenv() calls. Sigh. For now, make it always succeed. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-18Merge branch 'cm/save-restore-terminal'Libravatar Junio C Hamano2-15/+63
An editor session launched during a Git operation (e.g. during 'git commit') can leave the terminal in a funny state. The code path has updated to save the terminal state before, and restore it after, it spawns an editor. * cm/save-restore-terminal: editor: save and reset terminal after calling EDITOR terminal: teach git how to save/restore its terminal settings
2021-10-13Merge branch 'jh/builtin-fsmonitor-part1'Libravatar Junio C Hamano2-22/+171
Built-in fsmonitor (part 1). * jh/builtin-fsmonitor-part1: t/helper/simple-ipc: convert test-simple-ipc to use start_bg_command run-command: create start_bg_command simple-ipc/ipc-win32: add Windows ACL to named pipe simple-ipc/ipc-win32: add trace2 debugging simple-ipc: move definition of ipc_active_state outside of ifdef simple-ipc: preparations for supporting binary messages. trace2: add trace2_child_ready() to report on background children
2021-10-13Merge branch 'ab/config-based-hooks-1'Libravatar Junio C Hamano1-1/+1
Mostly preliminary clean-up in the hook API. * ab/config-based-hooks-1: hook-list.h: add a generated list of hooks, like config-list.h hook.c users: use "hook_exists()" instead of "find_hook()" hook.c: add a hook_exists() wrapper and use it in bugreport.c hook.[ch]: move find_hook() from run-command.c to hook.c Makefile: remove an out-of-date comment Makefile: don't perform "mv $@+ $@" dance for $(GENERATED_H) Makefile: stop hardcoding {command,config}-list.h Makefile: mark "check" target as .PHONY
2021-10-12Merge branch 'rs/git-mmap-uses-malloc' into maintLibravatar Junio C Hamano1-1/+6
mmap() imitation used to call xmalloc() that dies upon malloc() failure, which has been corrected to just return an error to the caller to be handled. * rs/git-mmap-uses-malloc: compat: let git_mmap use malloc(3) directly
2021-10-08Provide zlib's uncompress2 from compat/zlib-compat.cLibravatar Han-Wen Nienhuys2-0/+96
This will be needed for reading reflog blocks in reftable. Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-06terminal: teach git how to save/restore its terminal settingsLibravatar Carlo Marcelo Arenas Belón2-15/+63
Currently, git will share its console with all its children (unless they create their own), and is therefore possible that any of them that might change the settings for it could affect its operations once completed. Refactor the platform specific functionality to save the terminal settings and expand it to also do so for the output handler. This will allow for the state of the terminal to be saved and restored around a child that might misbehave (ex vi) which will be implemented next. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27lazyload.h: use an even more generic function pointer than FARPROCLibravatar Carlo Marcelo Arenas Belón1-3/+6
gcc will helpfully raise a -Wcast-function-type warning when casting between functions that might have incompatible return types (ex: GetUserNameExW returns bool which is only half the size of the return type from FARPROC which is long long), so create a new type that could be used as a completely generic function pointer and cast through it instead. Additionaly remove the -Wno-incompatible-pointer-types temporary flag added in 27e0c3c (win32: allow building with pedantic mode enabled, 2021-09-03), as it will be no longer needed. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27lazyload.h: fix warnings about mismatching function pointer typesLibravatar Johannes Sixt1-2/+3
Here, GCC warns about every use of the INIT_PROC_ADDR macro, for example: In file included from compat/mingw.c:8: compat/mingw.c: In function 'mingw_strftime': compat/win32/lazyload.h:38:12: warning: assignment to 'size_t (*)(char *, size_t, const char *, const struct tm *)' {aka 'long long unsigned int (*)(char *, long long unsigned int, const char *, const struct tm *)'} from incompatible pointer type 'FARPROC' {aka 'long long int (*)()'} [-Wincompatible-pointer-types] 38 | (function = get_proc_addr(&proc_addr_##function)) | ^ compat/mingw.c:1014:6: note: in expansion of macro 'INIT_PROC_ADDR' 1014 | if (INIT_PROC_ADDR(strftime)) | ^~~~~~~~~~~~~~ (message wrapped for convenience). Insert a cast to keep the compiler happy. A cast is fine in these cases because they are generic function pointer values that have been looked up in a DLL. Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-23Makefile: stop hardcoding {command,config}-list.hLibravatar Ævar Arnfjörð Bjarmason1-1/+1
Change various places that hardcode the names of these two files to refer to either $(GENERATED_H), or to a new generated-hdrs target. That target is consistent with the *-objs targets I recently added in 029bac01a8 (Makefile: add {program,xdiff,test,git,fuzz}-objs & objects targets, 2021-02-23). A subsequent commit will add a new generated hook-list.h. By doing this refactoring we'll only need to add the new file to the GENERATED_H variable, not EXCEPT_HDRS, the vcbuild/README etc. Hardcoding command-list.h there seems to have been a case of copy/paste programming in 976aaedca0 (msvc: add a Makefile target to pre-generate the Visual Studio solution, 2019-07-29). The config-list.h was added later in 709df95b78 (help: move list_config_help to builtin/help, 2020-04-16). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20Merge branch 'cb/pedantic-build-for-developers'Libravatar Junio C Hamano2-2/+2
Update the build procedure to use the "-pedantic" build when DEVELOPER makefile macro is in effect. * cb/pedantic-build-for-developers: developer: enable pedantic by default win32: allow building with pedantic mode enabled gettext: remove optional non-standard parens in N_() definition
2021-09-20Merge branch 'ab/tr2-leaks-and-fixes'Libravatar Junio C Hamano1-24/+145
The tracing of process ancestry information has been enhanced. * ab/tr2-leaks-and-fixes: tr2: log N parent process names on Linux tr2: do compiler enum check in trace2_collect_process_info() tr2: leave the parent list empty upon failure & don't leak memory tr2: stop leaking "thread_name" memory tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux tr2: remove NEEDSWORK comment for "non-procfs" implementations
2021-09-20simple-ipc/ipc-win32: add Windows ACL to named pipeLibravatar Jeff Hostetler1-11/+129
Set an ACL on the named pipe to allow the well-known group EVERYONE to read and write to the IPC server's named pipe. In the event that the daemon was started with elevation, allow non-elevated clients to communicate with the daemon. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20simple-ipc/ipc-win32: add trace2 debuggingLibravatar Jeff Hostetler1-1/+24
Create "ipc-debug" category events to log unexpected errors when creating Simple-IPC connections. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-20simple-ipc: preparations for supporting binary messages.Libravatar Jeff Hostetler2-10/+18
Add `command_len` argument to the Simple IPC API. In my original Simple IPC API, I assumed that the request would always be a null-terminated string of text characters. The `command` argument was just a `const char *`. I found a caller that would like to pass a binary command to the daemon, so I am amending the Simple IPC API to receive `const char *command, size_t command_len` arguments. I considered changing the `command` argument to be a `void *`, but the IPC layer simply passes it to the pkt-line layer which takes a `const char *`, so to avoid confusion I left it as is. Note, the response side has always been a `struct strbuf` which includes the buffer and length, so we already support returning a binary answer. (Yes, it feels a little weird returning a binary buffer in a `strbuf`, but it works.) Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08Merge branch 'rs/git-mmap-uses-malloc'Libravatar Junio C Hamano1-1/+6
mmap() imitation used to call xmalloc() that dies upon malloc() failure, which has been corrected to just return an error to the caller to be handled. * rs/git-mmap-uses-malloc: compat: let git_mmap use malloc(3) directly
2021-09-07tr2: log N parent process names on LinuxLibravatar Ævar Arnfjörð Bjarmason1-17/+132
In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started logging parent process names, but only logged all parents on Windows. on Linux only the name of the immediate parent process was logged. Extend the functionality added there to also log full parent chain on Linux. This requires us to lookup "/proc/<getppid()>/stat" instead of "/proc/<getppid()>/comm". The "comm" file just contains the name of the process, but the "stat" file has both that information, and the parent PID of that process, see procfs(5). We parse out the parent PID of our own parent, and recursively walk the chain of "/proc/*/stat" files all the way up the chain. A parent PID of 0 indicates the end of the chain. It's possible given the semantics of Linux's PID files that we end up getting an entirely nonsensical chain of processes. It could happen if e.g. we have a chain of processes like: 1 (init) => 321 (bash) => 123 (git) Let's assume that "bash" was started a while ago, and that as shown the OS has already cycled back to using a lower PID for us than our parent process. In the time it takes us to start up and get to trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent process might exit, and be replaced by an entirely different process! We'd racily look up our own getppid(), but in the meantime our parent would exit, and Linux would have cycled all the way back to starting an entirely unrelated process as PID 321. If that happens we'll just silently log incorrect data in our ancestry chain. Luckily we don't need to worry about this except in this specific cycling scenario, as Linux does not have PID randomization. It appears it once did through a third-party feature, but that it was removed around 2006[1]. For anyone worried about this edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate it, but not eliminate it. One thing we don't need to worry about is getting into an infinite loop when walking "/proc/*/stat". See 353d3d77f4f (trace2: collect Windows-specific process information, 2019-02-22) for the related Windows code that needs to deal with that, and [2] for an explanation of that edge case. Aside from potential race conditions it's also a bit painful to correctly parse the process name out of "/proc/*/stat". A simpler approach is to use fscanf(), see [3] for an implementation of that, but as noted in the comment being added here it would fail in the face of some weird process names, so we need our own parse_proc_stat() to parse it out. With this patch the "ancestry" chain for a trace2 event might look like this: $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version | grep ancestry | jq -r .ancestry [ "bash", "screen", "systemd" ] And in the case of naughty process names like the following. This uses perl's ability to use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on assignment to $0 on Linux, 2010-04-15)[4]: $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' | grep ancestry | jq -r .ancestry [ "sh", "(naughty\nname)", "bash", "screen", "systemd" ] 1. https://grsecurity.net/news#grsec2110 2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/ 3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ 4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07tr2: do compiler enum check in trace2_collect_process_info()Libravatar Ævar Arnfjörð Bjarmason1-6/+7
Change code added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to use a switch statement without a "default" branch to have the compiler error if this code ever drifts out of sync with the members of the "enum trace2_process_info_reason". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07tr2: leave the parent list empty upon failure & don't leak memoryLibravatar Ævar Arnfjörð Bjarmason1-3/+5
In a subsequent commit I'll be replacing most of this code to log N parents, but let's first fix bugs introduced in the recent 2f732bf15e6 (tr2: log parent process name, 2021-07-21). It was using the strbuf_read_file() in the wrong way, its return value is either a length or a negative value on error. If we didn't have a procfs, or otherwise couldn't access it we'd end up pushing an empty string to the trace2 ancestry array. It was also using the strvec_push() API the wrong way. That API always does an xstrdup(), so by detaching the strbuf here we'd leak memory. Let's instead pass in our pointer for strvec_push() to xstrdup(), and then free our own strbuf. I do have some WIP changes to make strvec_push_nodup() non-static, which makes this and some other callsites nicer, but let's just follow the prevailing pattern of using strvec_push() for now. We'll also need to free that "procfs_path" strbuf whether or not strbuf_read_file() succeeds, which was another source of memory leaks in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a system where we could read the file from procfs. Let's move all the freeing of the memory to the end of the function. If we're still at STRBUF_INIT with "name" due to not having taken the branch where the strbuf_read_file() succeeds freeing it is redundant. So we could move it into the body of the "if", but just handling freeing the same way for all branches of the function makes it more readable. In combination with the preceding commit this makes all of t[0-9]*trace2*.sh pass under SANITIZE=leak on Linux. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under LinuxLibravatar Ævar Arnfjörð Bjarmason1-1/+5
Rewrite a comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to describe what we might do under TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring to "something extra". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07tr2: remove NEEDSWORK comment for "non-procfs" implementationsLibravatar Ævar Arnfjörð Bjarmason1-1/+0
I'm fairly sure that there is no way on Linux to inspect the process tree without using procfs, any tool such as ps(1), top(1) etc. that shows this sort of information ultimately looks the information up in procfs. So let's remove this comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21), it's setting us up for an impossible task. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03win32: allow building with pedantic mode enabledLibravatar Carlo Marcelo Arenas Belón2-2/+2
In preparation to building with pedantic mode enabled, change a couple of places where the current mingw gcc compiler provided with the SDK reports issues. A full fix for the incompatible use of (void *) to store function pointers has been punted, with the minimal change to instead use a generic function pointer (FARPROC), and therefore the (hopefully) temporary need to disable incompatible pointer warnings. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-24Merge branch 'es/trace2-log-parent-process-name'Libravatar Junio C Hamano2-0/+66
trace2 logs learned to show parent process name to see in what context Git was invoked. * es/trace2-log-parent-process-name: tr2: log parent process name tr2: make process info collection platform-generic
2021-08-24compat: let git_mmap use malloc(3) directlyLibravatar René Scharfe1-1/+6
xmalloc() dies on error, allows zero-sized allocations and enforces GIT_ALLOC_LIMIT for testing. Our mmap replacement doesn't need any of that. Let's cut out the wrapper, reject zero-sized requests as required by POSIX and use malloc(3) directly. Allocation errors were needlessly handled by git_mmap() before; this code becomes reachable now. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-08-02mingw: align symlinks-related rmdir() behavior with LinuxLibravatar Thomas Bétous1-0/+21
When performing a rebase, rmdir() is called on the folder .git/logs. On Unix rmdir() exits without deleting anything in case .git/logs is a symbolic link but the equivalent functions on Windows (_rmdir, _wrmdir and RemoveDirectoryW) do not behave the same and remove the folder if it is symlinked even if it is not empty. This creates issues when folders in .git/ are symlinks which is especially the case when git-repo[1] is used: It replaces `.git/logs/` with a symlink. One such issue is that the _target_ of that symlink is removed e.g. during a `git rebase`, where `delete_reflog("REBASE_HEAD")` will not only try to remove `.git/logs/REBASE_HEAD` but then recursively try to remove the parent directories until an error occurs, a technique that obviously relies on `rmdir()` refusing to remove a symlink. This was reported in https://github.com/git-for-windows/git/issues/2967. This commit updates mingw_rmdir() so that its behavior is the same as Linux rmdir() in case of symbolic links. To verify that Git does not regress on the reported issue, this patch adds a regression test for the `git rebase` symptom, even if the same `rmdir()` behavior is quite likely to cause potential problems in other Git commands as well. [1]: git-repo is a python tool built on top of Git which helps manage many Git repositories. It stores all the .git/ folders in a central place by taking advantage of symbolic links. More information: https://gerrit.googlesource.com/git-repo/ Signed-off-by: Thomas Bétous <tomspycell@gmail.com> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22tr2: log parent process nameLibravatar Emily Shaffer1-0/+55
It can be useful to tell who invoked Git - was it invoked manually by a user via CLI or script? By an IDE? In some cases - like 'repo' tool - we can influence the source code and set the GIT_TRACE2_PARENT_SID environment variable from the caller process. In 'repo''s case, that parent SID is manipulated to include the string "repo", which means we can positively identify when Git was invoked by 'repo' tool. However, identifying parents that way requires both that we know which tools invoke Git and that we have the ability to modify the source code of those tools. It cannot scale to keep up with the various IDEs and wrappers which use Git, most of which we don't know about. Learning which tools and wrappers invoke Git, and how, would give us insight to decide where to improve Git's usability and performance. Unfortunately, there's no cross-platform reliable way to gather the name of the parent process. If procfs is present, we can use that; otherwise we will need to discover the name another way. However, the process ID should be sufficient to look up the process name on most platforms, so that code may be shareable. Git for Windows gathers similar information and logs it as a "data_json" event. However, since "data_json" has a variable format, it is difficult to parse effectively in some languages; instead, let's pursue a dedicated "cmd_ancestry" event to record information about the ancestry of the current process and a consistent, parseable way. Git for Windows also gathers information about more than one generation of parent. In Linux further ancestry info can be gathered with procfs, but it's unwieldy to do so. In the interest of later moving Git for Windows ancestry logging to the 'cmd_ancestry' event, and in the interest of later adding more ancestry to the Linux implementation - or of adding this functionality to other platforms which have an easier time walking the process tree - let's make 'cmd_ancestry' accept an array of parentage. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-22tr2: make process info collection platform-genericLibravatar Emily Shaffer1-0/+11
To pave the way for non-Windows platforms to define trace2_collect_process_info(), reorganize the stub-or-definition schema to something which doesn't directly reference Windows. Platforms which want to collect parent process information in the future should: 1. Add an implementation to compat/ (e.g. compat/somearch/procinfo.c) 2. Add that object to COMPAT_OBJS to config.mak.uname (e.g. COMPAT_OBJS += compat/somearch/procinfo.o) 3. Define HAVE_PLATFORM_PROCINFO in config.mak.uname In the Windows case, this definition lives in compat/win32/trace2_win32_process_info.c, which is already conditionally added to COMPAT_OBJS; so let's add HAVE_PLATFORM_PROCINFO to hint to the build that compat/stub/procinfo.c should not be used. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Helped-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-22Merge branch 'jh/simple-ipc-sans-pthread'Libravatar Junio C Hamano3-7/+19
The "simple-ipc" did not compile without pthreads support, but the build procedure was not properly account for it. * jh/simple-ipc-sans-pthread: simple-ipc: correct ifdefs when NO_PTHREADS is defined
2021-05-21simple-ipc: correct ifdefs when NO_PTHREADS is definedLibravatar Jeff Hostetler3-7/+19
Simple IPC always requires threads (in addition to various platform-specific IPC support). Fix the ifdefs in the Makefile to define SUPPORTS_SIMPLE_IPC when appropriate. Previously, the Unix version of the code would only verify that Unix domain sockets were available. This problem was reported here: https://lore.kernel.org/git/YKN5lXs4AoK%2FJFTO@coredump.intra.peff.net/T/#m08be8f1942ea8a2c36cfee0e51cdf06489fdeafc Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-20Merge branch 'js/access-nul-emulation-on-windows'Libravatar Junio C Hamano1-0/+2
Portability fix. * js/access-nul-emulation-on-windows: msvc: avoid calling `access("NUL", flags)`
2021-04-16msvc: avoid calling `access("NUL", flags)`Libravatar Johannes Schindelin1-0/+2
Apparently this is not supported with Microsoft's Universal C Runtime. So let's not actually do that. Instead, just return success because we _know_ that we expect the `NUL` device to be present. Side note: it is possible to turn off the "Null device driver" and thereby disable `NUL`. Too many things are broken if this driver is disabled, therefore it is not worth bothering to try to detect its presence when `access()` is called. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-13Merge branch 'tb/precompose-prefix-simplify'Libravatar Junio C Hamano2-5/+5
Streamline the codepath to fix the UTF-8 encoding issues in the argv[] and the prefix on macOS. * tb/precompose-prefix-simplify: macOS: precompose startup_info->prefix precompose_utf8: make precompose_string_if_needed() public
2021-04-05precompose_utf8: make precompose_string_if_needed() publicLibravatar Torsten Bögershausen2-5/+5
commit 5c327502 (MacOS: precompose_argv_prefix(), 2021-02-03) uses the function precompose_string_if_needed() internally. It is only used from precompose_argv_prefix() and therefore static in compat/precompose_utf8.c Expose this function, it will be used in the next commit. While there, allow passing a NULL pointer, which will return NULL. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-04-02Merge branch 'jh/simple-ipc'Libravatar Junio C Hamano3-0/+1778
A simple IPC interface gets introduced to build services like fsmonitor on top. * jh/simple-ipc: t0052: add simple-ipc tests and t/helper/test-simple-ipc tool simple-ipc: add Unix domain socket implementation unix-stream-server: create unix domain socket under lock unix-socket: disallow chdir() when creating unix domain sockets unix-socket: add backlog size option to unix_stream_listen() unix-socket: eliminate static unix_stream_socket() helper function simple-ipc: add win32 implementation simple-ipc: design documentation for new IPC mechanism pkt-line: add options argument to read_packetized_to_strbuf() pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option pkt-line: do not issue flush packets in write_packetized_*() pkt-line: eliminate the need for static buffer in packet_write_gently()