path: root/Makefile
2017-11-22  list-objects: filter objects in traverse_commit_list  (Jeff Hostetler, 1 file, -0/+2)
Create traverse_commit_list_filtered() and add filtering interface to allow certain objects to be omitted from the traversal. Update traverse_commit_list() to be a wrapper for the above with a null filter to minimize the number of callers that needed to be changed. Object filtering will be used in a future commit by rev-list and pack-objects for partial clone and fetch to omit unwanted objects from the result. traverse_bitmap_commit_list() does not work with filtering. If a packfile bitmap is present, it will not be used. It should be possible to extend such support in the future (at least to simple filters that do not require object pathnames), but that is beyond the scope of this patch series. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
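The wrapper-with-a-null-filter shape described above can be illustrated with a small, self-contained sketch; the types and function names below are hypothetical stand-ins, not Git's actual list-objects API:

    #include <stdio.h>
    #include <stddef.h>

    /* Hypothetical object type standing in for commits/trees/blobs. */
    struct obj { const char *name; };

    /* Filter callback: return 0 to omit the object, non-zero to keep it. */
    typedef int (*filter_fn)(const struct obj *, void *filter_data);
    typedef void (*show_fn)(const struct obj *);

    /* Filtered traversal: consults the filter (when given) for each object. */
    static void traverse_filtered(struct obj *objs, size_t nr,
                                  filter_fn filter, void *filter_data,
                                  show_fn show)
    {
        for (size_t i = 0; i < nr; i++) {
            if (filter && !filter(&objs[i], filter_data))
                continue;       /* omitted from the result */
            show(&objs[i]);
        }
    }

    /* Unfiltered traversal is just a wrapper passing a null filter,
     * so existing callers do not need to change. */
    static void traverse(struct obj *objs, size_t nr, show_fn show)
    {
        traverse_filtered(objs, nr, NULL, NULL, show);
    }

    static int skip_blobs(const struct obj *o, void *data)
    {
        (void)data;
        return o->name[0] != 'b';   /* toy rule: drop names starting with 'b' */
    }

    static void print_obj(const struct obj *o) { puts(o->name); }

    int main(void)
    {
        struct obj objs[] = { { "commit" }, { "blob" }, { "tree" } };
        traverse(objs, 3, print_obj);                            /* all objects */
        traverse_filtered(objs, 3, skip_blobs, NULL, print_obj); /* filtered */
        return 0;
    }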
2017-10-11  Merge branch 'jt/oidmap'  (Junio C Hamano, 1 file, -0/+1)
Introduce a new "oidmap" API and rewrite oidset to use it.

* jt/oidmap:
  oidmap: map with OID as key
2017-10-03  Merge branch 'mh/mmap-packed-refs'  (Junio C Hamano, 1 file, -0/+6)
Operations that do not touch the (majority of) packed refs have been optimized by making accesses to the packed-refs file lazy; we no longer pre-parse everything, and an access to a single ref in the packed-refs file does not touch the majority of irrelevant refs, either.

* mh/mmap-packed-refs: (21 commits)
  packed-backend.c: rename a bunch of things and update comments
  mmapped_ref_iterator: inline into `packed_ref_iterator`
  ref_cache: remove support for storing peeled values
  packed_ref_store: get rid of the `ref_cache` entirely
  ref_store: implement `refs_peel_ref()` generically
  packed_read_raw_ref(): read the reference from the mmapped buffer
  packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
  read_packed_refs(): ensure that references are ordered when read
  packed_ref_cache: keep the `packed-refs` file mmapped if possible
  packed-backend.c: reorder some definitions
  mmapped_ref_iterator_advance(): no peeled value for broken refs
  mmapped_ref_iterator: add iterator over a packed-refs file
  packed_ref_cache: remember the file-wide peeling state
  read_packed_refs(): read references with minimal copying
  read_packed_refs(): make parsing of the header line more robust
  read_packed_refs(): only check for a header at the top of the file
  read_packed_refs(): use mmap to read the `packed-refs` file
  die_unterminated_line(), die_invalid_line(): new functions
  packed_ref_cache: add a backlink to the associated `packed_ref_store`
  prefix_ref_iterator: break when we leave the prefix
  ...
2017-10-01  oidmap: map with OID as key  (Jonathan Tan, 1 file, -0/+1)
This is similar to using the hashmap in hashmap.c, but with an easier-to-use API. In particular, custom entry comparisons no longer need to be written, and lookups can be done without constructing a temporary entry structure. This is implemented as a thin wrapper over the hashmap API. In particular, this means that there is an additional 4-byte overhead due to the fact that the first 4 bytes of the hash is redundantly stored. For now, I'm taking the simpler approach, but if need be, we can reimplement oidmap without affecting the callers significantly. oidset has been updated to use oidmap. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
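As a rough, self-contained illustration of the idea (an OID-keyed map as a thin wrapper over a generic hash table, with the leading four bytes of the OID doubling as the cached hash code, hence the redundancy mentioned above), here is a sketch using hypothetical *_demo names rather than Git's real oidmap/hashmap API:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define OID_RAWSZ 20
    struct object_id { unsigned char hash[OID_RAWSZ]; };

    /* Entry header: a cached 32-bit hash code (as a generic hashmap would
     * keep) plus the full OID used as the key.  The cached code is simply
     * the first four bytes of the OID, which is why it is redundant. */
    struct oidmap_entry_demo {
        uint32_t hash;                  /* redundant copy of hash[0..3] */
        struct object_id oid;
        struct oidmap_entry_demo *next;
    };

    #define NR_BUCKETS 64
    struct oidmap_demo { struct oidmap_entry_demo *buckets[NR_BUCKETS]; };

    static uint32_t oid_hash(const struct object_id *oid)
    {
        uint32_t h;
        memcpy(&h, oid->hash, sizeof(h));   /* OIDs are already uniform */
        return h;
    }

    /* Lookup needs only the key: no temporary entry and no comparison
     * callback supplied by the caller. */
    static void *oidmap_get_demo(struct oidmap_demo *map,
                                 const struct object_id *key)
    {
        struct oidmap_entry_demo *e = map->buckets[oid_hash(key) % NR_BUCKETS];
        for (; e; e = e->next)
            if (!memcmp(e->oid.hash, key->hash, OID_RAWSZ))
                return e;
        return NULL;
    }

    static void oidmap_put_demo(struct oidmap_demo *map,
                                struct oidmap_entry_demo *e)
    {
        uint32_t b = (e->hash = oid_hash(&e->oid)) % NR_BUCKETS;
        e->next = map->buckets[b];
        map->buckets[b] = e;
    }

    int main(void)
    {
        struct oidmap_demo map = { { NULL } };
        struct oidmap_entry_demo e = { 0 };
        memset(e.oid.hash, 0xab, OID_RAWSZ);
        oidmap_put_demo(&map, &e);
        printf("found: %s\n", oidmap_get_demo(&map, &e.oid) ? "yes" : "no");
        return 0;
    }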
2017-09-25  packed_ref_cache: keep the `packed-refs` file mmapped if possible  (Michael Haggerty, 1 file, -0/+6)
Keep a copy of the `packed-refs` file contents in memory for as long as a `packed_ref_cache` object is in use:

* If the system allows it, keep the `packed-refs` file mmapped.

* If not (either because the system doesn't support `mmap()` at all, or because a file that is currently mmapped cannot be replaced via `rename()`), then make a copy of the file's contents in heap-allocated space, and keep that around instead.

We base the choice of behavior on a new build-time switch, `MMAP_PREVENTS_DELETE`. By default, this switch is set for Windows variants.

After this commit, `MMAP_NONE` and `MMAP_TEMPORARY` are still handled identically. But the next commit will introduce a difference.

This whole change is still pointless, because we only read the `packed-refs` file contents immediately after instantiating the `packed_ref_cache`. But that will soon change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
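The mmap-with-heap-fallback behaviour described above can be sketched in portable POSIX C roughly as follows; this is illustrative only, and names such as snapshot_file() are not Git's:

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Load a file either by mmap()ing it or, where a live mapping would
     * block replacing the file via rename() (the MMAP_PREVENTS_DELETE
     * case), by copying its contents into heap memory. */
    struct file_snapshot {
        char *buf;
        size_t len;
        int mmapped;     /* 1 if buf came from mmap(), 0 if from malloc() */
    };

    static int snapshot_file(const char *path, int allow_mmap,
                             struct file_snapshot *snap)
    {
        struct stat st;
        int fd = open(path, O_RDONLY);

        if (fd < 0 || fstat(fd, &st) < 0)
            goto fail;
        snap->len = (size_t)st.st_size;

        if (allow_mmap) {
            void *p = mmap(NULL, snap->len, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p != MAP_FAILED) {
                snap->buf = p;
                snap->mmapped = 1;
                close(fd);
                return 0;
            }
        }

        /* Fallback: keep a heap-allocated copy instead.
         * (Single read() for brevity; real code would loop.) */
        snap->buf = malloc(snap->len ? snap->len : 1);
        if (!snap->buf)
            goto fail;
        if (read(fd, snap->buf, snap->len) != (ssize_t)snap->len) {
            free(snap->buf);
            goto fail;
        }
        snap->mmapped = 0;
        close(fd);
        return 0;
    fail:
        if (fd >= 0)
            close(fd);
        return -1;
    }

    static void release_snapshot(struct file_snapshot *snap)
    {
        if (snap->mmapped)
            munmap(snap->buf, snap->len);
        else
            free(snap->buf);
    }

    int main(void)
    {
        struct file_snapshot snap;
        if (!snapshot_file("/etc/hosts", 1, &snap)) {
            printf("read %zu bytes (%s)\n", snap.len,
                   snap.mmapped ? "mmapped" : "heap copy");
            release_snapshot(&snap);
        }
        return 0;
    }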
2017-09-25  Merge branch 'bw/git-clang-format'  (Junio C Hamano, 1 file, -0/+4)
"make style" runs git-clang-format to help developers by pointing out coding style issues.

* bw/git-clang-format:
  Makefile: add style build rule
  clang-format: outline the git project's coding style
2017-09-19  Merge branch 'jk/leak-checkers'  (Junio C Hamano, 1 file, -0/+3)
Many of our programs consider that it is OK to release dynamic storage that is used throughout the life of the program by simply exiting, but this makes it harder for leak-detection tools to avoid reporting false positives. Plug many existing leaks and introduce a mechanism for developers to mark that the region of memory pointed to by a pointer is not lost/leaking, to help these tools.

* jk/leak-checkers:
  add UNLEAK annotation for reducing leak false positives
  set_git_dir: handle feeding gitdir to itself
  repository: free fields before overwriting them
  reset: free allocated tree buffers
  reset: make tree counting less confusing
  config: plug user_config leak
  update-index: fix cache entry leak in add_one_file()
  add: free leaked pathspec after add_files_to_cache()
  test-lib: set LSAN_OPTIONS to abort by default
  test-lib: --valgrind should not override --verbose-log
2017-09-19  Merge branch 'ti/external-sha1dc'  (Junio C Hamano, 1 file, -3/+15)
Platforms that ship a separate SHA-1 collision-detection library can link to it instead of using the copy we ship as part of our source tree.

* ti/external-sha1dc:
  sha1dc: allow building with the external sha1dc library
  sha1dc: build git plumbing code more explicitly
2017-09-08  add UNLEAK annotation for reducing leak false positives  (Jeff King, 1 file, -0/+3)
It's a common pattern in git commands to allocate some memory that should last for the lifetime of the program and then not bother to free it, relying on the OS to throw it away.

This keeps the code simple, and it's fast (we don't waste time traversing structures or calling free at the end of the program). But it also triggers warnings from memory-leak checkers like valgrind or LSAN. They know that the memory was still allocated at program exit, but they don't know _when_ the leaked memory stopped being useful. If it was early in the program, then it's probably a real and important leak. But if it was used right up until program exit, it's not an interesting leak and we'd like to suppress it so that we can see the real leaks.

This patch introduces an UNLEAK() macro that lets us do so. To understand its design, let's first look at some of the alternatives.

Unfortunately the suppression systems offered by leak-checking tools don't quite do what we want. A leak-checker basically knows two things:

  1. Which blocks were allocated via malloc, and the callstack during the allocation.

  2. Which blocks were left un-freed at the end of the program (and which are unreachable, but more on that later).

Their suppressions work by mentioning the function or callstack of a particular allocation, and marking it as OK to leak. So imagine you have code like this:

    int cmd_foo(...)
    {
        /* this allocates some memory */
        char *p = some_function();
        printf("%s", p);
        return 0;
    }

You can say "ignore allocations from some_function(), they're not leaks". But that's not right. That function may be called elsewhere, too, and we would potentially want to know about those leaks.

So you can say "ignore the callstack when main calls some_function". That works, but your annotations are brittle. In this case it's only two functions, but you can imagine that the actual allocation is much deeper. If any of the intermediate code changes, you have to update the suppression.

What we _really_ want to say is that "the value assigned to p at the end of the function is not a real leak". But leak-checkers can't understand that; they don't know about "p" in the first place.

However, we can do something a little bit tricky if we make some assumptions about how leak-checkers work. They generally don't just report all un-freed blocks. That would report even globals which are still accessible when the leak-check is run. Instead they take some set of memory (like BSS) as a root and mark it as "reachable". Then they scan the reachable blocks for anything that looks like a pointer to a malloc'd block, and consider that block reachable. And then they scan those blocks, and so on, transitively marking anything reachable from a global as "not leaked" (or at least leaked in a different category).

So we can mark the value of "p" as reachable by putting it into a variable with program lifetime. One way to do that is to just mark "p" as static. But that actually affects the run-time behavior if the function is called twice (you aren't likely to call main() twice, but some of our cmd_*() functions are called from other commands).

Instead, we can trick the leak-checker by putting the value into _any_ reachable bytes. This patch keeps a global linked-list of bytes copied from "unleaked" variables. That list is reachable even at program exit, which confers recursive reachability on whatever values we unleak. In other words, you can do:

    int cmd_foo(...)
    {
        char *p = some_function();
        printf("%s", p);
        UNLEAK(p);
        return 0;
    }

to annotate "p" and suppress the leak report.

But wait, couldn't we just say "free(p)"? In this toy example, yes. But UNLEAK()'s byte-copying strategy has several advantages over actually freeing the memory:

  1. It's recursive across structures. In many cases our "p" is not just a pointer, but a complex struct whose fields may have been allocated by a sub-function. And in some cases (e.g., dir_struct) we don't even have a function which knows how to free all of the struct members. By marking the struct itself as reachable, that confers reachability on any pointers it contains (including those found in embedded structs, or reachable by walking heap blocks recursively).

  2. It works on cases where we're not sure if the value is allocated or not. For example:

         char *p = argc > 1 ? argv[1] : some_function();

     It's safe to use UNLEAK(p) here, because it's not freeing any memory. In the case that we're pointing to argv here, the reachability checker will just ignore our bytes.

  3. Likewise, it works even if the variable has _already_ been freed. We're just copying the pointer bytes. If the block has been freed, the leak-checker will skip over those bytes as uninteresting.

  4. Because it's not actually freeing memory, you can UNLEAK() before we are finished accessing the variable. This is helpful in cases like this:

         char *p = some_function();
         return another_function(p);

     Writing this with free() requires:

         int ret;
         char *p = some_function();
         ret = another_function(p);
         free(p);
         return ret;

     But with unleak we can just write:

         char *p = some_function();
         UNLEAK(p);
         return another_function(p);

This patch adds the UNLEAK() macro and enables it automatically when Git is compiled with SANITIZE=leak. In normal builds it's a noop, so we pay no runtime cost.

It also adds some UNLEAK() annotations to show off how the feature works. On top of other recent leak fixes, these are enough to get t0000 and t0001 to pass when compiled with LSAN.

Note the case in commit.c which actually converts a strbuf_release() into an UNLEAK. This code was already non-leaky, but the free didn't do anything useful, since we're exiting. Converting it to an annotation means that non-leak-checking builds pay no runtime cost. The cost is minimal enough that it's probably not worth going on a crusade to convert these kinds of frees to UNLEAKS. I did it here for consistency with the "sb" leak (though it would have been equally correct to go the other way, and turn them both into strbuf_release() calls).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
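For readers curious how such an annotation can work, here is a minimal, self-contained sketch of the byte-copying technique the message describes; the guard macro SUPPRESS_LEAK_FALSE_POSITIVES and the helper names are hypothetical, and this is not necessarily Git's exact implementation:

    #include <stdlib.h>
    #include <string.h>

    /* Global list of byte copies of "unleaked" values.  Because the list
     * head is a global, everything on it stays reachable at program exit,
     * so a leak checker's reachability scan treats any heap blocks those
     * copied bytes point at as "still reachable" rather than leaked. */
    struct suppressed_leak {
        struct suppressed_leak *next;
        unsigned char data[];             /* copy of the annotated variable */
    };
    static struct suppressed_leak *suppressed_leaks;

    static void unleak_memory(const void *ptr, size_t len)
    {
        struct suppressed_leak *entry = malloc(sizeof(*entry) + len);
        if (!entry)
            return;                       /* best effort only */
        memcpy(entry->data, ptr, len);    /* copy the bytes, do not free */
        entry->next = suppressed_leaks;
        suppressed_leaks = entry;
    }

    /* In a leak-checking build, copy the variable's bytes somewhere
     * reachable; in a normal build, compile to nothing. */
    #ifdef SUPPRESS_LEAK_FALSE_POSITIVES
    #define UNLEAK(var) unleak_memory(&(var), sizeof(var))
    #else
    #define UNLEAK(var) do { } while (0)
    #endif

    int main(void)
    {
        char *p = malloc(32);
        if (p)
            strcpy(p, "lives until exit");
        UNLEAK(p);                        /* p is now considered reachable */
        return 0;
    }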
2017-08-26  Merge branch 'jt/packmigrate'  (Junio C Hamano, 1 file, -0/+1)
Code movement to make it easier to hack later.

* jt/packmigrate: (23 commits)
  pack: move for_each_packed_object()
  pack: move has_pack_index()
  pack: move has_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move find_sha1_pack()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move nth_packed_object_{sha1,oid}
  pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
  pack: move unpack_object_header()
  pack: move get_size_from_delta()
  pack: move unpack_object_header_buffer()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move install_packed_git()
  pack: move add_packed_git()
  pack: move unuse_pack()
  pack: move use_pack()
  pack: move pack-closing functions
  pack: move release_pack_memory()
  pack: move open_pack_index(), parse_pack_index()
  ...
2017-08-26  Merge branch 'kw/write-index-reduce-alloc'  (Junio C Hamano, 1 file, -0/+1)
We used to spend more cycles than necessary allocating and freeing pieces of memory while writing each index entry out. This has been optimized.

* kw/write-index-reduce-alloc:
  read-cache: avoid allocating every ondisk entry when writing
  read-cache: fix memory leak in do_write_index
  perf: add test for writing the index
2017-08-26  Merge branch 'jn/vcs-svn-cleanup'  (Junio C Hamano, 1 file, -1/+0)
Code clean-up.

* jn/vcs-svn-cleanup:
  vcs-svn: move remaining repo_tree functions to fast_export.h
  vcs-svn: remove repo_delete wrapper function
  vcs-svn: remove custom mode constants
  vcs-svn: remove more unused prototypes and declarations
2017-08-23  pack: move pack name-related functions  (Jonathan Tan, 1 file, -0/+1)
Currently, sha1_file.c and cache.h contain many functions, both related to and unrelated to packfiles. This makes both files very large and causes an unclear separation of concerns. Create a new file, packfile.c, to hold all packfile-related functions currently in sha1_file.c. It has a corresponding header packfile.h. In this commit, the pack name-related functions are moved. Subsequent commits will move the other functions. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23  vcs-svn: move remaining repo_tree functions to fast_export.h  (Jonathan Nieder, 1 file, -1/+0)
These used to be for manipulating the in-memory repo_tree structure, but nowadays they are convenience wrappers to handle a few git-vs-svn mismatches:

  1. Git does not track empty directories but Subversion does. When looking up a path in git that Subversion thinks exists and finding nothing, we can safely assume that the path represents a directory. This is needed when a later Subversion revision modifies that directory.

  2. Subversion allows deleting a file by copying. In Git fast-import we have to handle that more explicitly as a deletion.

These are details of the tool's interaction with git fast-import. Move them to fast_export.c, where other such details are handled. This way the function names do not start with a repo_ prefix that would clash with the repository object introduced in v2.14.0-rc0~38^2~16 (repository: introduce the repository object, 2017-06-22) or an svn_ prefix that would clash with libsvn (in case someone wants to link this code with libsvn some day).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-22  Merge branch 'mh/packed-ref-store'  (Junio C Hamano, 1 file, -0/+1)
The "ref-store" code reorganization continues.

* mh/packed-ref-store: (32 commits)
  files-backend: cheapen refname_available check when locking refs
  packed_ref_store: handle a packed-refs file that is a symlink
  read_packed_refs(): die if `packed-refs` contains bogus data
  t3210: add some tests of bogus packed-refs file contents
  repack_without_refs(): don't lock or unlock the packed refs
  commit_packed_refs(): remove call to `packed_refs_unlock()`
  clear_packed_ref_cache(): don't protest if the lock is held
  packed_refs_unlock(), packed_refs_is_locked(): new functions
  packed_refs_lock(): report errors via a `struct strbuf *err`
  packed_refs_lock(): function renamed from lock_packed_refs()
  commit_packed_refs(): use a staging file separate from the lockfile
  commit_packed_refs(): report errors rather than dying
  packed_ref_store: make class into a subclass of `ref_store`
  packed-backend: new module for handling packed references
  packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
  packed_ref_store: support iteration
  packed_peel_ref(): new function, extracted from `files_peel_ref()`
  repack_without_refs(): take a `packed_ref_store *` parameter
  get_packed_ref(): take a `packed_ref_store *` parameter
  rollback_packed_refs(): take a `packed_ref_store *` parameter
  ...
2017-08-21  perf: add test for writing the index  (Kevin Willford, 1 file, -0/+1)
Add a performance test for writing the index, so that we can determine whether changes to how the ondisk structures are allocated actually help.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-16  sha1dc: allow building with the external sha1dc library  (Takashi Iwai, 1 file, -0/+13)
Some distros provide the SHA-1 collision-detection code as a shared library. It's the same code as we have in the git tree (possibly with a different default hash initialization), and git can link with it as well; at least, it may make maintenance easier, according to our security guys.

This patch allows the user to build git linked against the external sha1dc library instead of the built-in code. The user needs to define DC_SHA1_EXTERNAL explicitly; by default, without it, the built-in sha1dc code is used as before.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-16  sha1dc: build git plumbing code more explicitly  (Takashi Iwai, 1 file, -3/+2)
The plumbing code between sha1dc and git is defined in sha1dc_git.[ch], but these aren't compiled / included directly; they are pulled in only via the indirect inclusion from the sha1dc code. This is slightly confusing when you try to trace the build flow.

This patch brings the following changes for simplification:

- Make sha1dc_git.c stand-alone and build from Makefile

- sha1dc_git.h is the common header to include further sha1.h depending on the build condition

- Move comments for plumbing codes from the header to definitions

This is also meant as a preliminary work for further plumbing with an external sha1dc shlib.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
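A hypothetical sketch of the build-time dispatch this and the previous commit describe: one common include site chooses the external or the bundled collision-detecting SHA-1 depending on whether the Makefile defined DC_SHA1_EXTERNAL. The include paths are assumptions, not Git's actual sha1dc_git.[ch]; the SHA1DC* calls follow the sha1collisiondetection API as commonly shipped:

    /* Pick the collision-detecting SHA-1 implementation at build time. */
    #ifdef DC_SHA1_EXTERNAL
    #include <sha1dc/sha1.h>      /* assumed install path of the shared library */
    #else
    #include "sha1dc/sha1.h"      /* bundled copy in the source tree */
    #endif

    #include <stdio.h>

    int main(void)
    {
        SHA1_CTX ctx;
        unsigned char hash[20];

        SHA1DCInit(&ctx);
        SHA1DCUpdate(&ctx, "hello", 5);
        if (SHA1DCFinal(hash, &ctx))        /* non-zero: collision attack detected */
            puts("collision attack detected");
        else
            printf("first byte of digest: %02x\n", hash[0]);
        return 0;
    }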
2017-08-14  Makefile: add style build rule  (Brandon Williams, 1 file, -0/+4)
Add the 'style' build rule which will run git-clang-format on the diff between HEAD and the current worktree. The result is a diff of suggested changes. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-21  Merge branch 'jc/po-pritime-fix'  (Junio C Hamano, 1 file, -0/+21)
We started using "%" PRItime, imitating "%" PRIuMAX and friends, as a way to format the internal timestamp value, but this does not play well with the gettext(1) i18n framework, and causes "make pot", which is run by the l10n coordinator, to create a broken po/git.pot file. This is a possible workaround for that problem.

* jc/po-pritime-fix:
  Makefile: help gettext tools to cope with our custom PRItime format
2017-07-20  Merge branch 'jk/build-with-asan'  (Junio C Hamano, 1 file, -1/+6)
A recent update made it easier to use the "-fsanitize=" option while compiling, but supported only one sanitizer at a time. Allow more than one to be combined, joined with a comma, like "make SANITIZE=foo,bar".

* jk/build-with-asan:
  Makefile: allow combining UBSan with other sanitizers
2017-07-20  Makefile: help gettext tools to cope with our custom PRItime format  (Junio C Hamano, 1 file, -0/+21)
We started using our own timestamp_t type and PRItime format specifier to go along with it, so that we can later change the underlying type and output format more easily, but this does not play well with gettext tools. Because gettext tools need to keep the *.po file portable across platforms, they have to special-case the format specifiers like PRIuMAX that are known types in inttypes.h, instead of letting CPP handle strings like "%" PRIuMAX " seconds ago" as an ordinary string concatenation. They fundamentally cannot do the same for our own custom type/format. Given that po/git.pot needs to be generated only once every release and by only one person, i.e. the l10n coordinator, let's update the Makefile rule to generate po/git.pot so that gettext tools are run on a munged set of sources in which all mentions of PRItime are replaced with PRIuMAX, which is what we happen to use right now. This way, developers do not have to care that PRItime does not play well with gettext, and translators do not have to care that we use our own PRItime. The credit for the idea to munge the source files goes to Dscho. Possible bugs are mine. Helped-by: Jiang Xin <worldhello.net@gmail.com> Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
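A small illustration of why the custom format macro trips up xgettext; the typedef and macro below mirror the names in the message, while the rest is illustrative:

    #include <inttypes.h>
    #include <stdio.h>

    /* Simplified stand-ins for the timestamp type and its format macro
     * (the real definitions live in the Git tree). */
    typedef uintmax_t timestamp_t;
    #define PRItime PRIuMAX

    int main(void)
    {
        timestamp_t t = 1500000000;

        /* xgettext special-cases "%" PRIuMAX because PRIuMAX is a known
         * <inttypes.h> macro, but it cannot know what "%" PRItime means.
         * Munging the sources so the extracted string literally reads
         * "%" PRIuMAX (as the Makefile rule now does) sidesteps that. */
        printf("%" PRItime " seconds since the epoch\n", t);
        return 0;
    }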
2017-07-17  Makefile: allow combining UBSan with other sanitizers  (René Scharfe, 1 file, -1/+6)
Multiple sanitizers can be specified as a comma-separated list. Set the flag NO_UNALIGNED_LOADS even if UndefinedBehaviorSanitizer is not the only sanitizer to build with. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-13  Merge branch 'jk/build-with-asan'  (Junio C Hamano, 1 file, -0/+8)
The build procedure has been improved to allow building and testing Git with address sanitizer more easily.

* jk/build-with-asan:
  Makefile: disable unaligned loads with UBSan
  Makefile: turn off -fomit-frame-pointer with sanitizers
  Makefile: add helper for compiling with -fsanitize
  test-lib: turn on ASan abort_on_error by default
  test-lib: set ASAN_OPTIONS variable before we run git
2017-07-10  Merge branch 'ab/sha1dc'  (Junio C Hamano, 1 file, -0/+16)
The "collision-detecting" implementation of the SHA-1 hash that we borrowed is replaced by directly binding the upstream project as our submodule. Glitches on minority platforms are still being worked out.

* ab/sha1dc:
  sha1collisiondetection: automatically enable when submodule is populated
  sha1dc: optionally use sha1collisiondetection as a submodule
2017-07-10  Makefile: disable unaligned loads with UBSan  (Jeff King, 1 file, -0/+3)
The undefined behavior sanitizer complains about unaligned loads, even if they're OK for a particular platform in practice. It's possible that they _are_ a problem, of course, but since it's a known tradeoff the UBSan errors are just noise. Let's quiet it automatically by building with NO_UNALIGNED_LOADS when SANITIZE=undefined is in use. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
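The trade-off can be illustrated with a small read-helper; this is a generic sketch of the NO_UNALIGNED_LOADS technique, not Git's compat/bswap.h code:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Read a 32-bit value from an arbitrary byte position. */
    static uint32_t read_u32(const unsigned char *p)
    {
    #ifdef NO_UNALIGNED_LOADS
        /* Byte-wise copy: well-defined for any alignment, keeps UBSan quiet. */
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    #else
        /* Direct load: fine on many platforms in practice, but technically
         * an unaligned access, which is exactly what UBSan reports. */
        return *(const uint32_t *)p;
    #endif
    }

    int main(void)
    {
        unsigned char buf[8] = { 0, 0x12, 0x34, 0x56, 0x78 };
        printf("0x%08x\n", read_u32(buf + 1));   /* deliberately misaligned */
        return 0;
    }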
2017-07-10  Makefile: turn off -fomit-frame-pointer with sanitizers  (Jeff King, 1 file, -0/+1)
The ASan manual recommends disabling this optimization, as it can make the backtraces produced by the tool harder to follow (and since this is a test-debug build, we don't care about squeezing out every last drop of performance). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-10  Makefile: add helper for compiling with -fsanitize  (Jeff King, 1 file, -0/+4)
You can already build and test with ASan by doing:

    make CFLAGS=-fsanitize=address test

but there are a few slight annoyances:

  1. It's a little long to type.

  2. It overrides your CFLAGS completely. You'd probably still want -O2, for instance.

  3. It's a good idea to also turn off "recovery", which lets the program keep running after a problem is detected (with the intention of finding as many bugs as possible in a given run). Since Git's test suite should generally run without triggering any problems, it's better to abort immediately and fail the test when we do find an issue.

With this patch, all of that happens automatically when you run:

    make SANITIZE=address test

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-05  Merge branch 'bw/repo-object'  (Junio C Hamano, 1 file, -0/+1)
Introduce a "repository" object to eventually make it easier to work in multiple repositories (the primary focus is to work with the superproject and its submodules) in a single process.

* bw/repo-object:
  ls-files: use repository object
  repository: enable initialization of submodules
  submodule: convert is_submodule_initialized to work on a repository
  submodule: add repo_read_gitmodules
  submodule-config: store the_submodule_cache in the_repository
  repository: add index_state to struct repo
  config: read config from a repository object
  path: add repo_worktree_path and strbuf_repo_worktree_path
  path: add repo_git_path and strbuf_repo_git_path
  path: worktree_git_path() should not use file relocation
  path: convert do_git_path to take a 'struct repository'
  path: convert strbuf_git_common_path to take a 'struct repository'
  path: always pass in commondir to update_common_dir
  path: create path.h
  environment: store worktree in the_repository
  environment: place key repository state in the_repository
  repository: introduce the repository object
  environment: remove namespace_len variable
  setup: add comment indicating a hack
  setup: don't perform lazy initialization of repository state
2017-07-03  sha1collisiondetection: automatically enable when submodule is populated  (Junio C Hamano, 1 file, -0/+4)
If a user wants to experiment with the version of the collision-detecting sha1 from the submodule, the user not only needed to populate the submodule but also needed to turn the knob. A Makefile trick to do that automatically is easy enough, so let's do it. When somebody who has the submodule populated wants not to use it, that can still be done by overriding it in config.mak or from the command line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03  sha1dc: optionally use sha1collisiondetection as a submodule  (Ævar Arnfjörð Bjarmason, 1 file, -0/+12)
Add an option to use the sha1collisiondetection library from the submodule in sha1collisiondetection/ instead of the copy in the sha1dc/ directory.

This allows us to try out the submodule in sha1collisiondetection without breaking the build for anyone who's not expecting it, as we work out any kinks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23  repository: introduce the repository object  (Brandon Williams, 1 file, -0/+1)
Introduce the repository object 'struct repository' which can be used to hold all state pertaining to a git repository.

Some of the benefits of object-ifying a repository are:

  1. Make the code base more readable and easier to reason about.

  2. Allow for working on multiple repositories, specifically submodules, within the same process. Currently the process for working on a submodule involves setting up an argv_array of options for a particular command and then launching a child process to execute the command in the context of the submodule. This is clunky and can require lots of little hacks in order to ensure correctness. Ideally it would be nice to simply pass a repository and an options struct to a command.

  3. Eliminating reliance on global state will make it easier to enable the use of threading to improve performance.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
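A deliberately simplified sketch of the idea, using hypothetical *_demo names rather than Git's actual struct repository, might look like this:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Gather per-repository state into one object instead of globals, so
     * several repositories (e.g. a superproject and its submodules) can
     * coexist in one process.  Field names are illustrative. */
    struct repository_demo {
        char *gitdir;      /* path to the .git directory */
        char *worktree;    /* path to the working tree, or NULL if bare */
        int index_loaded;  /* stand-in for an attached index state */
    };

    static int repo_init_demo(struct repository_demo *repo,
                              const char *gitdir, const char *worktree)
    {
        memset(repo, 0, sizeof(*repo));
        repo->gitdir = strdup(gitdir);
        repo->worktree = worktree ? strdup(worktree) : NULL;
        return repo->gitdir ? 0 : -1;
    }

    static void repo_clear_demo(struct repository_demo *repo)
    {
        free(repo->gitdir);
        free(repo->worktree);
        memset(repo, 0, sizeof(*repo));
    }

    int main(void)
    {
        struct repository_demo super, sub;
        repo_init_demo(&super, "/src/project/.git", "/src/project");
        repo_init_demo(&sub, "/src/project/.git/modules/lib", "/src/project/lib");
        printf("super: %s\nsub:   %s\n", super.gitdir, sub.gitdir);
        repo_clear_demo(&sub);
        repo_clear_demo(&super);
        return 0;
    }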
2017-06-23  packed-backend: new module for handling packed references  (Michael Haggerty, 1 file, -0/+1)
Now that the interface between `files_ref_store` and `packed_ref_store` is relatively narrow, move the latter into a new module, "refs/packed-backend.h" and "refs/packed-backend.c". It still doesn't quite implement the `ref_store` interface, but it will soon. This commit moves code around and adjusts its visibility, but doesn't change anything. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-22  Merge branch 'nd/fopen-errors'  (Junio C Hamano, 1 file, -1/+2)
Hotfix for a topic that is already in 'master'.

* nd/fopen-errors:
  configure.ac: loosen FREAD_READS_DIRECTORIES test program
2017-06-19  Merge branch 'ab/pcre-v2'  (Junio C Hamano, 1 file, -8/+41)
Update "perl-compatible regular expression" support to enable JIT and also allow linking with the newer PCRE v2 library.

* ab/pcre-v2:
  grep: add support for PCRE v2
  grep: un-break building with PCRE >= 8.32 without --enable-jit
  grep: un-break building with PCRE < 8.20
  grep: un-break building with PCRE < 8.32
  grep: add support for the PCRE v1 JIT API
  log: add -P as a synonym for --perl-regexp
  grep: skip pthreads overhead when using one thread
  grep: don't redundantly compile throwaway patterns under threading
2017-06-15  configure.ac: loosen FREAD_READS_DIRECTORIES test program  (Jeff King, 1 file, -1/+2)
We added an FREAD_READS_DIRECTORIES Makefile knob long ago in cba22528f (Add compat/fopen.c which returns NULL on attempt to open directory, 2008-02-08) to handle systems where reading from a directory returned garbage. This works by catching the problem at the fopen() stage and returning NULL. More recently, we found that there is a class of systems (including Linux) where fopen() succeeds but fread() fails. Since the solution is the same (having fopen return NULL), they use the same Makefile knob as of e2d90fd1c (config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD, 2017-05-03). This works fine except for one thing: the autoconf test in configure.ac to set FREAD_READS_DIRECTORIES actually checks whether fread succeeds. Which means that on Linux systems, the knob isn't set (and we even override the config.mak.uname default). t1308 catches the failure. We can fix this by tweaking the autoconf test to cover both cases. In theory we might care about the distinction between the traditional "fread reads directories" case and the new "fopen opens directories". But since our solution catches the problem at the fopen stage either way, we don't actually need to know the difference. The "fopen" case is a superset. This does mean the FREAD_READS_DIRECTORIES name is slightly misleading. Probably FOPEN_OPENS_DIRECTORIES would be more accurate. But it would be disruptive to simply change the name (people's existing build configs would fail), and it's not worth the complexity of handling both. Let's just add a comment in the knob description. Reported-by: Øyvind A. Holm <sunny@sunbase.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
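A probe program in the spirit of the loosened configure check might look like this (illustrative; not the literal configure.ac test body):

    #include <stdio.h>

    /* The check only needs to ask "does fopen() on a directory succeed?",
     * because the fix is applied at the fopen() stage either way. */
    int main(void)
    {
        char buf[1];
        FILE *f = fopen(".", "r");

        if (!f)
            return 1;   /* fopen already refuses directories: knob not needed */

        if (fread(buf, 1, 1, f) == 0) {
            /* fopen() succeeded but fread() failed -- the Linux case;
             * we still want FREAD_READS_DIRECTORIES set. */
        }
        fclose(f);
        return 0;       /* directory could be fopen()ed: set the knob */
    }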
2017-06-05  Merge branch 'js/blame-lib'  (Junio C Hamano, 1 file, -0/+1)
The internal logic used in "git blame" has been libified to make it easier for cgit to use.

* js/blame-lib: (29 commits)
  blame: move entry prepend to libgit
  blame: move scoreboard setup to libgit
  blame: move scoreboard-related methods to libgit
  blame: move fake-commit-related methods to libgit
  blame: move origin-related methods to libgit
  blame: move core structures to header
  blame: create entry prepend function
  blame: create scoreboard setup function
  blame: create scoreboard init function
  blame: rework methods that determine 'final' commit
  blame: wrap blame_sort and compare_blame_final
  blame: move progress updates to a scoreboard callback
  blame: make sanity_check use a callback in scoreboard
  blame: move no_whole_file_rename flag to scoreboard
  blame: move xdl_opts flags to scoreboard
  blame: move show_root flag to scoreboard
  blame: move reverse flag to scoreboard
  blame: move contents_from to scoreboard
  blame: move copy/move thresholds to scoreboard
  blame: move stat counters to scoreboard
  ...
2017-06-04  Merge branch 'ab/sha1dc-maint'  (Junio C Hamano, 1 file, -1/+8)
The "collision detecting" SHA-1 implementation shipped with 2.13 was quite broken on some big-endian platforms and/or platforms that do not like unaligned fetches. Update to the upstream code which has already fixed these issues.

* ab/sha1dc-maint:
  sha1dc: update from upstream
2017-06-02  Merge branch 'ab/grep-preparatory-cleanup'  (Junio C Hamano, 1 file, -4/+10)
The internal implementation of "git grep" has seen some clean-up.

* ab/grep-preparatory-cleanup: (31 commits)
  grep: assert that threading is enabled when calling grep_{lock,unlock}
  grep: given --threads with NO_PTHREADS=YesPlease, warn
  pack-objects: fix buggy warning about threads
  pack-objects & index-pack: add test for --threads warning
  test-lib: add a PTHREADS prerequisite
  grep: move is_fixed() earlier to avoid forward declaration
  grep: change internal *pcre* variable & function names to be *pcre1*
  grep: change the internal PCRE macro names to be PCRE1
  grep: factor test for \0 in grep patterns into a function
  grep: remove redundant regflags assignments
  grep: catch a missing enum in switch statement
  perf: add a comparison test of log --grep regex engines with -F
  perf: add a comparison test of log --grep regex engines
  perf: add a comparison test of grep regex engines with -F
  perf: add a comparison test of grep regex engines
  perf: emit progress output when unpacking & building
  perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
  grep: add tests to fix blind spots with \0 patterns
  grep: prepare for testing binary regexes containing rx metacharacters
  grep: add a test helper function for less verbose -f \0 tests
  ...
2017-06-02  grep: add support for PCRE v2  (Ævar Arnfjörð Bjarmason, 1 file, -8/+28)
Add support for v2 of the PCRE API. This is a new major version of PCRE that came out in early 2015[1].

The regular expression syntax is the same, but while the API is similar, pretty much every function is either renamed or takes different arguments. Thus using it via entirely new functions makes sense, as opposed to trying to e.g. have one compile_pcre_pattern() that would call either PCRE v1 or v2 functions.

Git can now be compiled with either USE_LIBPCRE1=YesPlease or USE_LIBPCRE2=YesPlease, with USE_LIBPCRE=YesPlease currently being a synonym for the former. Providing both is a compile-time error.

With earlier patches to enable JIT for PCRE v1 the performance of the release versions of both libraries is almost exactly the same, with PCRE v2 being around 1% slower. However, after I reported this to the pcre-dev mailing list[2] I got a lot of help with the API use from Zoltán Herczeg; he subsequently optimized some of the JIT functionality in v2 of the library.

Running the p7820-grep-engines.sh performance test against the latest Subversion trunk of both, with both them and git compiled as -O3, and the test run against linux.git, gives the following results. Just the /perl/ tests shown:

    $ GIT_PERF_REPEAT_COUNT=30 GIT_PERF_LARGE_REPO=~/g/linux
      GIT_PERF_MAKE_COMMAND='grep -q LIBPCRE2 Makefile && make -j8 USE_LIBPCRE2=YesPlease
      CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease CFLAGS=-O3
      LIBPCREDIR=/home/avar/g/pcre2/inst LDFLAGS=-Wl,-rpath,/home/avar/g/pcre2/inst/lib ||
      make -j8 USE_LIBPCRE=YesPlease CC=~/perl5/installed/bin/gcc NO_R_TO_GCC_LINKER=YesPlease
      CFLAGS=-O3 LIBPCREDIR=/home/avar/g/pcre/inst
      LDFLAGS=-Wl,-rpath,/home/avar/g/pcre/inst/lib' ./run HEAD~5 HEAD~ HEAD p7820-grep-engines.sh
    [...]
    Test                                      HEAD~5            HEAD~                    HEAD
    ------------------------------------------------------------------------------------------------------------
    7820.3: perl grep 'how.to'                0.31(1.10+0.48)   0.21(0.35+0.56) -32.3%   0.21(0.34+0.55) -32.3%
    7820.7: perl grep '^how to'               0.56(2.70+0.40)   0.24(0.64+0.52) -57.1%   0.20(0.28+0.60) -64.3%
    7820.11: perl grep '[how] to'             0.56(2.66+0.38)   0.29(0.95+0.45) -48.2%   0.23(0.45+0.54) -58.9%
    7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 1.02(5.77+0.42)   0.31(1.02+0.54) -69.6%   0.23(0.50+0.54) -77.5%
    7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'    0.38(1.57+0.42)   0.27(0.85+0.46) -28.9%   0.21(0.33+0.57) -44.7%

See commit ("perf: add a comparison test of grep regex engines", 2017-04-19) for details on the machine the above test run was executed on. Here HEAD~5 is git with PCRE v1 without JIT, HEAD~ is PCRE v1 with JIT, and HEAD is PCRE v2 (also with JIT). See previous commits of mine mentioning p7820-grep-engines.sh for more details on the test setup.

For ease of readability, a different run just of HEAD~ (PCRE v1 with JIT vs. PCRE v2), again with just the /perl/ tests shown:

    [...]
    Test                                      HEAD~             HEAD
    --------------------------------------------------------------------------------
    7820.3: perl grep 'how.to'                0.21(0.42+0.52)   0.21(0.31+0.58) +0.0%
    7820.7: perl grep '^how to'               0.25(0.65+0.50)   0.20(0.31+0.57) -20.0%
    7820.11: perl grep '[how] to'             0.30(0.90+0.50)   0.23(0.46+0.53) -23.3%
    7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 0.30(1.19+0.38)   0.23(0.51+0.51) -23.3%
    7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'    0.27(0.84+0.48)   0.21(0.34+0.57) -22.2%

I.e. the two are either neck and neck, or PCRE v2 pulls ahead; when it does, it's around 20% faster.

A brief note on thread safety: as noted in pcre2api(3) & pcre2jit(3) the compiled pattern can be shared between threads, but not some of the JIT context; however, the grep threading support does all pattern & JIT compilation in separate threads, so this code doesn't need to concern itself with thread safety.

See commit 63e7e9d8b6 ("git-grep: Learn PCRE", 2011-05-09) for the initial addition of PCRE v1. This change follows some of the same patterns it did (and which were discussed on list at the time), e.g. mocking up types with typedef instead of ifdef-ing them out when USE_LIBPCRE2 isn't defined. This adds some trivial memory use to the program, but makes the code look nicer.

1. https://lists.exim.org/lurker/message/20150105.162835.0666407a.en.html
2. https://lists.exim.org/lurker/thread/20170419.172322.833ee099.en.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
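For reference, a minimal compile/JIT/match cycle against the PCRE v2 API looks roughly like the sketch below (illustrative, not Git's grep.c wrapper code); it can typically be built with something like cc demo.c $(pcre2-config --cflags --libs8):

    #define PCRE2_CODE_UNIT_WIDTH 8
    #include <pcre2.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *pattern = "(e.t[^ ]*|v.ry) rare";
        const char *subject = "an extremely rare pattern";
        int errcode;
        PCRE2_SIZE erroffset;
        pcre2_code *re;
        pcre2_match_data *md;
        int rc;

        re = pcre2_compile((PCRE2_SPTR)pattern, PCRE2_ZERO_TERMINATED, 0,
                           &errcode, &erroffset, NULL);
        if (!re) {
            PCRE2_UCHAR msg[256];
            pcre2_get_error_message(errcode, msg, sizeof(msg));
            fprintf(stderr, "compile failed at %zu: %s\n",
                    (size_t)erroffset, (const char *)msg);
            return 1;
        }

        /* JIT compilation is optional; matching falls back to the
         * interpreter if JIT support is unavailable. */
        pcre2_jit_compile(re, PCRE2_JIT_COMPLETE);

        md = pcre2_match_data_create_from_pattern(re, NULL);
        rc = pcre2_match(re, (PCRE2_SPTR)subject, strlen(subject), 0, 0,
                         md, NULL);
        printf("%s\n", rc >= 0 ? "match" : "no match");

        pcre2_match_data_free(md);
        pcre2_code_free(re);
        return 0;
    }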
2017-06-02  grep: un-break building with PCRE >= 8.32 without --enable-jit  (Ævar Arnfjörð Bjarmason, 1 file, -0/+13)
Amend my change earlier in this series ("grep: add support for the PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1 versions later than 8.31 compiled without --enable-jit.

As explained in that change and a later compatibility change in this series ("grep: un-break building with PCRE < 8.32", 2017-05-10), the pcre_jit_exec() function is a faster path to execute the JIT. Unfortunately there's no compatibility stub for that function compiled into the library if pcre_config(PCRE_CONFIG_JIT, &ret) would return 0, and no macro that can be used to check for it, so the only portable option to support builds without --enable-jit is via a new NO_LIBPCRE1_JIT=UnfortunatelyYes Makefile option[1].

Another option would be to make the JIT opt-in via USE_LIBPCRE1_JIT=YesPlease; after all, it's not a default option of PCRE v1. I think it makes more sense to make it opt-out, since even though it's not a default option, most packagers of PCRE seem to turn it on by default, with the notable exception of the MinGW package.

Make the MinGW platform work by default by changing the build defaults to turn on NO_LIBPCRE1_JIT=UnfortunatelyYes. It is the only platform that turns on USE_LIBPCRE=YesPlease by default; see commit df5218b4c3 ("config.mak.uname: support MSys2", 2016-01-13) for that change.

1. "How do I support pcre1 JIT on all versions?" (https://lists.exim.org/lurker/thread/20170601.103148.10253788.en.html)
2. https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-pcre/PKGBUILD (referenced from "Re: PCRE v2 compile error, was Re: What's cooking in git.git (May 2017, #01; Mon, 1)"; <alpine.DEB.2.20.1705021756530.3480@virtualbox>)

Sign
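The call-site pattern being guarded can be sketched as follows (illustrative, not Git's grep code; the #else path assumes a JIT-enabled libpcre1, i.e. 8.32 or newer built with --enable-jit):

    #include <pcre.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        const char *err;
        int erroffset, ovector[30], rc;
        const char *subject = "how to match";
        pcre *re = pcre_compile("how.to", 0, &err, &erroffset, NULL);
        pcre_extra *extra;

        if (!re)
            return 1;
        extra = pcre_study(re, PCRE_STUDY_JIT_COMPILE, &err);

    #ifdef NO_LIBPCRE1_JIT
        /* Library built without --enable-jit: no pcre_jit_exec() stub
         * exists, so use the ordinary entry point. */
        rc = pcre_exec(re, extra, subject, (int)strlen(subject), 0, 0,
                       ovector, 30);
    #else
        /* JIT fast path: skips some per-call sanity checks. */
        pcre_jit_stack *stack = pcre_jit_stack_alloc(32 * 1024, 512 * 1024);
        rc = pcre_jit_exec(re, extra, subject, (int)strlen(subject), 0, 0,
                           ovector, 30, stack);
        pcre_jit_stack_free(stack);
    #endif

        printf("%s\n", rc >= 0 ? "match" : "no match");
        if (extra)
            pcre_free_study(extra);
        pcre_free(re);
        return 0;
    }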