summaryrefslogtreecommitdiff
path: root/sha1_file.c
AgeCommit message (Collapse)AuthorFilesLines
2017-08-23pack: move for_each_packed_object()Libravatar Jonathan Tan1-40/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move has_pack_index()Libravatar Jonathan Tan1-8/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move has_sha1_pack()Libravatar Jonathan Tan1-6/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_pack_entry() and make it globalLibravatar Jonathan Tan1-53/+0
This function needs to be global as it is used by sha1_file.c and will be used by packfile.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_sha1_pack()Libravatar Jonathan Tan1-13/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move find_pack_entry_one(), is_pack_valid()Libravatar Jonathan Tan1-73/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move check_pack_index_ptr(), nth_packed_object_offset()Libravatar Jonathan Tan1-33/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move nth_packed_object_{sha1,oid}Libravatar Jonathan Tan1-31/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()Libravatar Jonathan Tan1-663/+14
Both sha1_file.c and packfile.c now need read_object(), so a copy of read_object() was created in packfile.c. This patch makes both mark_bad_packed_object() and has_packed_and_bad() global. Unlike most of the other patches in this series, these 2 functions need to remain global. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move unpack_object_header()Libravatar Jonathan Tan1-26/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move get_size_from_delta()Libravatar Jonathan Tan1-39/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move unpack_object_header_buffer()Libravatar Jonathan Tan1-25/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move {,re}prepare_packed_git and approximate_object_countLibravatar Jonathan Tan1-214/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move install_packed_git()Libravatar Jonathan Tan1-9/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move add_packed_git()Libravatar Jonathan Tan1-61/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move unuse_pack()Libravatar Jonathan Tan1-9/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move use_pack()Libravatar Jonathan Tan1-285/+0
The function open_packed_git() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move pack-closing functionsLibravatar Jonathan Tan1-55/+0
The function close_pack_fd() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move release_pack_memory()Libravatar Jonathan Tan1-49/+0
The function unuse_one_window() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move open_pack_index(), parse_pack_index()Libravatar Jonathan Tan1-140/+0
alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a subsequent commit, alloc_packed_git() will be removed from sha1_file.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move pack_report()Libravatar Jonathan Tan1-24/+0
Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move static state variablesLibravatar Jonathan Tan1-13/+0
sha1_file.c declares some static variables that store packfile-related state. Move them to packfile.c. They are temporarily made global, but subsequent commits will restore their scope back to static. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move pack name-related functionsLibravatar Jonathan Tan1-22/+1
Currently, sha1_file.c and cache.h contain many functions, both related to and unrelated to packfiles. This makes both files very large and causes an unclear separation of concerns. Create a new file, packfile.c, to hold all packfile-related functions currently in sha1_file.c. It has a corresponding header packfile.h. In this commit, the pack name-related functions are moved. Subsequent commits will move the other functions. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23Merge branch 'sb/sha1-file-cleanup'Libravatar Junio C Hamano1-1/+2
Code clean-up. * sb/sha1-file-cleanup: sha1_file: make read_info_alternates static
2017-08-23Merge branch 'jt/sha1-file-cleanup'Libravatar Junio C Hamano1-32/+7
Preparatory code clean-up. * jt/sha1-file-cleanup: sha1_file: remove read_packed_sha1() sha1_file: set whence in storage-specific info fn
2017-08-22Merge branch 'rs/unpack-entry-leakfix'Libravatar Junio C Hamano1-2/+3
Memory leak in an error codepath has been plugged. * rs/unpack-entry-leakfix: sha1_file: release delta_stack on error in unpack_entry()
2017-08-22Merge branch 'rs/find-pack-entry-bisection'Libravatar Junio C Hamano1-2/+2
Code clean-up. * rs/find-pack-entry-bisection: sha1_file: avoid comparison if no packed hash matches the first byte
2017-08-22Merge branch 'jk/drop-sha1-entry-pos'Libravatar Junio C Hamano1-11/+0
Code clean-up. * jk/drop-sha1-entry-pos: sha1_file: drop experimental GIT_USE_LOOKUP search
2017-08-15sha1_file: make read_info_alternates staticLibravatar Stefan Beller1-1/+2
read_info_alternates is not used from outside, so let's make it static. We have to declare the function before link_alt_odb_entry instead of moving the code around, link_alt_odb_entry calls read_info_alternates, which in turn calls link_alt_odb_entry. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11sha1_file: remove read_packed_sha1()Libravatar Jonathan Tan1-25/+1
Use read_object() in its place instead. This avoids duplication of code. This makes force_object_loose() slightly slower (because of a redundant check of loose object storage), but only in the error case. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11sha1_file: set whence in storage-specific info fnLibravatar Jonathan Tan1-7/+6
Move the setting of oi->whence to sha1_loose_object_info() and packed_object_info(). This allows sha1_object_info_extended() to not need to know about the delta base cache. This will be useful during a future refactoring in which packfile-related functions, including the handling of the delta base cache, will be moved to a separate file. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-10sha1_file: release delta_stack on error in unpack_entry()Libravatar René Scharfe1-2/+3
When unpack_entry() encounters a broken packed object, it returns early. It adjusts the reference count of the pack window, but leaks the buffer for a big delta stack in case the small automatic one was not enough. Jump to the cleanup code at end instead, which takes care of that. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-09sha1_file: drop experimental GIT_USE_LOOKUP searchLibravatar Jeff King1-11/+0
Long ago in 628522ec14 (sha1-lookup: more memory efficient search in sorted list of SHA-1, 2007-12-29) we added sha1_entry_pos(), a binary search that uses the uniform distribution of sha1s to scale the selection of mid-points. As this was a performance experiment, we tied it to the GIT_USE_LOOKUP environment variable and never enabled it by default. This code was successful in reducing the number of steps in each search. But the overhead of the scaling ends up making it slower when the cache is warm. Here are best-of-five timings for running rev-list on linux.git, which will have to look up every object: $ time git rev-list --objects --all >/dev/null real 0m35.357s user 0m35.016s sys 0m0.340s $ time GIT_USE_LOOKUP=1 git rev-list --objects --all >/dev/null real 0m37.364s user 0m37.045s sys 0m0.316s The USE_LOOKUP version might have more benefit on a cold cache, as the time to fault in each page would dominate. But that would be for a single lookup. In practice, most operations tend to look up many objects, and the whole pack .idx will end up warm. It's possible that the code could be better optimized to compete with a naive binary search for the warm-cache case, and we could have the best of both worlds. But over the years nobody has done so, and this is largely dead code that is rarely run outside of the test suite. Let's drop it in the name of simplicity. This lets us remove sha1_entry_pos() entirely, as the .idx lookup code was the only caller. Note that sha1-lookup.c still contains sha1_pos(), which differs from sha1_entry_pos() in two ways: - it has a different interface; it uses a function pointer to access sha1 entries rather than a size/offset pair describing the table's memory layout - it only scales the initial selection of "mi", rather than each iteration of the search We can't get rid of this function, as it's called from several places. It may be that we could replace it with a simple binary search, but that's out of scope for this patch (and would need benchmarking). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-09sha1_file: avoid comparison if no packed hash matches the first byteLibravatar René Scharfe1-2/+2
find_pack_entry_one() uses the fan-out table of pack indexes to find out which entries match the first byte of the searched hash and does a binary search on this subset of the main index table. If there are no matching entries then lo and hi will have the same value. The binary search still starts and compares the hash of the following entry (which has a non-matching first byte, so won't cause any trouble), or whatever comes after the sorted list of entries. The probability of that stray comparison matching by mistake is low, but let's not take any chances and check when entering the binary search loop if we're actually done already. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-20Merge branch 'ew/fd-cloexec-fix'Libravatar Junio C Hamano1-3/+3
Portability/fallback fix. * ew/fd-cloexec-fix: set FD_CLOEXEC properly when O_CLOEXEC is not supported
2017-07-17set FD_CLOEXEC properly when O_CLOEXEC is not supportedLibravatar Eric Wong1-3/+3
FD_CLOEXEC only applies to the file descriptor, so it needs to be manipuluated via F_GETFD/F_SETFD. F_GETFL/F_SETFL are for file description flags. Verified via strace with o_cloexec set to zero. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-13Merge branch 'sb/hashmap-customize-comparison'Libravatar Junio C Hamano1-2/+3
Update the hashmap API so that data to customize the behaviour of the comparison function can be specified at the time a hashmap is initialized. * sb/hashmap-customize-comparison: hashmap: migrate documentation from Documentation/technical into header patch-ids.c: use hashmap correctly hashmap.h: compare function has access to a data field
2017-07-05Merge branch 'jt/unify-object-info'Libravatar Junio C Hamano1-189/+196
Code clean-ups. * jt/unify-object-info: sha1_file: refactor has_sha1_file_with_flags sha1_file: do not access pack if unneeded sha1_file: teach sha1_object_info_extended more flags sha1_file: refactor read_object sha1_file: move delta base cache code up sha1_file: rename LOOKUP_REPLACE_OBJECT sha1_file: rename LOOKUP_UNKNOWN_OBJECT sha1_file: teach packed_object_info about typename
2017-07-05Merge branch 'rs/sha1-name-readdir-optim'Libravatar Junio C Hamano1-15/+24
Optimize "what are the object names already taken in an alternate object database?" query that is used to derive the length of prefix an object name is uniquely abbreviated to. * rs/sha1-name-readdir-optim: sha1_file: guard against invalid loose subdirectory numbers sha1_file: let for_each_file_in_obj_subdir() handle subdir names p4205: add perf test script for pretty log formats sha1_name: cache readdir(3) results in find_short_object_filename()
2017-06-30hashmap.h: compare function has access to a data fieldLibravatar Stefan Beller1-2/+3
When using the hashmap a common need is to have access to caller provided data in the compare function. A couple of times we abuse the keydata field to pass in the data needed. This happens for example in patch-ids.c. This patch changes the function signature of the compare function to have one more void pointer available. The pointer given for each invocation of the compare function must be defined in the init function of the hashmap and is just passed through. Documentation of this new feature is deferred to a later patch. This is a rather mechanical conversion, just adding the new pass-through parameter. However while at it improve the naming of the fields of all compare functions used by hashmaps by ensuring unused parameters are prefixed with 'unused_' and naming the parameters what they are (instead of 'unused' make it 'unused_keydata'). Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26sha1_file: refactor has_sha1_file_with_flagsLibravatar Jonathan Tan1-10/+2
has_sha1_file_with_flags() implements many mechanisms in common with sha1_object_info_extended(). Make has_sha1_file_with_flags() a convenience function for sha1_object_info_extended() instead. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26sha1_file: do not access pack if unneededLibravatar Jonathan Tan1-0/+11
Currently, regardless of the contents of the "struct object_info" passed to sha1_object_info_extended(), that function always accesses the packfile whenever it returns information about a packed object, since it needs to populate "u.packed". Add the ability to pass NULL, and use NULL-ness of the argument to activate an optimization in which sha1_object_info_extended() does not needlessly access the packfile. A subsequent patch will make use of this optimization. A similar optimization is not made for the cached and loose cases as it would not cause a significant performance improvement. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26sha1_file: teach sha1_object_info_extended more flagsLibravatar Jonathan Tan1-19/+24
Improve sha1_object_info_extended() by supporting additional flags. This allows has_sha1_file_with_flags() to be modified to use sha1_object_info_extended() in a subsequent patch. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'ab/free-and-null'Libravatar Junio C Hamano1-2/+1
A common pattern to free a piece of memory and assign NULL to the pointer that used to point at it has been replaced with a new FREE_AND_NULL() macro. * ab/free-and-null: *.[ch] refactoring: make use of the FREE_AND_NULL() macro coccinelle: make use of the "expression" FREE_AND_NULL() rule coccinelle: add a rule to make "expression" code use FREE_AND_NULL() coccinelle: make use of the "type" FREE_AND_NULL() rule coccinelle: add a rule to make "type" code use FREE_AND_NULL() git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24Merge branch 'bw/config-h'Libravatar Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-24sha1_file: guard against invalid loose subdirectory numbersLibravatar René Scharfe1-1/+4
Loose object subdirectories have hexadecimal names based on the first byte of the hash of contained objects, thus their numerical representation can range from 0 (0x00) to 255 (0xff). Change the type of the corresponding variable in for_each_file_in_obj_subdir() and associated callback functions to unsigned int and add a range check. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24sha1_file: let for_each_file_in_obj_subdir() handle subdir namesLibravatar René Scharfe1-8/+14
The function for_each_file_in_obj_subdir() takes a object subdirectory number and expects the name of the same subdirectory to be included in the path strbuf. Avoid this redundancy by letting the function append the hexadecimal subdirectory name itself. This makes it a bit easier and safer to use the function -- it becomes impossible to specify different subdirectories in subdir_nr and path. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-22sha1_name: cache readdir(3) results in find_short_object_filename()Libravatar René Scharfe1-6/+6
Read each loose object subdirectory at most once when looking for unique abbreviated hashes. This speeds up commands like "git log --pretty=%h" considerably, which previously caused one readdir(3) call for each candidate, even for subdirectories that were visited before. The new cache is kept until the program ends and never invalidated. The same is already true for pack indexes. The inherent racy nature of finding unique short hashes makes it still fit for this purpose -- a conflicting new object may be added at any time. Tasks with higher consistency requirements should not use it, though. The cached object names are stored in an oid_array, which is quite compact. The bitmap for remembering which subdir was already read is stored as a char array, with one char per directory -- that's not quite as compact, but really simple and incurs only an overhead equivalent to 11 hashes after all. Suggested-by: Jeff King <peff@peff.net> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21sha1_file: refactor read_objectLibravatar Jonathan Tan1-42/+42
read_object() and sha1_object_info_extended() both implement mechanisms such as object replacement, retrying the packed store after failing to find the object in the packed store then the loose store, and being able to mark a packed object as bad and then retrying the whole process. Consolidating these mechanisms would be a great help to maintainability. Therefore, consolidate them by extending sha1_object_info_extended() to support the functionality needed, and then modifying read_object() to use sha1_object_info_extended(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21sha1_file: move delta base cache code upLibravatar Jonathan Tan1-110/+110
In a subsequent patch, packed_object_info() will be modified to use the delta base cache, so move the relevant code to before packed_object_info(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>