summaryrefslogtreecommitdiff
path: root/pack-bitmap.c
AgeCommit message (Collapse)AuthorFilesLines
2018-09-17Merge branch 'jk/pack-objects-with-bitmap-fix'Libravatar Junio C Hamano1-11/+3
Hotfix of the base topic. * jk/pack-objects-with-bitmap-fix: pack-bitmap: drop "loaded" flag traverse_bitmap_commit_list(): don't free result t5310: test delta reuse with bitmaps bitmap_has_sha1_in_uninteresting(): drop BUG check
2018-09-17Merge branch 'jk/pack-delta-reuse-with-bitmap'Libravatar Junio C Hamano1-1/+24
When creating a thin pack, which allows objects to be made into a delta against another object that is not in the resulting pack but is known to be present on the receiving end, the code learned to take advantage of the reachability bitmap; this allows the server to send a delta against a base beyond the "boundary" commit. * jk/pack-delta-reuse-with-bitmap: pack-objects: reuse on-disk deltas for thin "have" objects pack-bitmap: save "have" bitmap from walk t/perf: add perf tests for fetches from a bitmapped server t/perf: add infrastructure for measuring sizes t/perf: factor out percent calculations t/perf: factor boilerplate out of test_perf
2018-09-04pack-bitmap: drop "loaded" flagLibravatar Jeff King1-6/+3
In the early days of the bitmap code, there was a single static bitmap_index struct that was used behind the scenes, and any bitmap-related functions could lazily check bitmap_git.loaded to see if they needed to read the on-disk data. But since 3ae5fa0768 (pack-bitmap: remove bitmap_git global variable, 2018-06-07), the caller is responsible for the lifetime of the bitmap_index struct, and we return it from prepare_bitmap_git() and prepare_bitmap_walk(), both of which load the on-disk data (or return NULL). So outside of these functions, it's not possible to have a bitmap_index for which the loaded flag is not true. Nor is it possible to accidentally pass an already-loaded bitmap_index to the loading function (which is static-local to the file). We can drop this unnecessary and confusing flag. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-04traverse_bitmap_commit_list(): don't free resultLibravatar Jeff King1-3/+0
Since it was introduced in fff42755ef (pack-bitmap: add support for bitmap indexes, 2013-12-21), this function has freed the result after traversing it. That is an artifact of the early days of the bitmap code, when we had a single static "struct bitmap_index". Back then, it was intended that you would do: prepare_bitmap_walk(&revs); traverse_bitmap_commit_list(&revs); Since the actual bitmap_index struct was totally behind the scenes, it was convenient for traverse_bitmap_commit_list() to clean it up, clearing the way for another traversal. But since 3ae5fa0768 (pack-bitmap: remove bitmap_git global variable, 2018-06-07), the caller explicitly manages the bitmap_index struct itself, like this: b = prepare_bitmap_walk(&revs); traverse_bitmap_commit_list(b, &revs); free_bitmap_index(b); It no longer makes sense to auto-free the result after the traversal. If you want to do another traversal, you'd just create a new bitmap_index. And while nobody tries to call traverse_bitmap_commit_list() twice, the fact that it throws away the result might be surprising, and is better avoided. Note that in the "old" way it was possible for two walks to amortize the cost of opening the on-disk .bitmap file (since it was stored in the global bitmap_index), but we lost that in 3ae5fa0768. However, no code actually does this, so it's not worth addressing now. The solution might involve a new: reset_bitmap_walk(b, &revs); call. Or we might even attach the bitmap data to its matching packed_git struct, so that multiple prepare_bitmap_walk() calls could use it. That can wait until somebody actually has need of the optimization (and until then, we'll do the correct, unsurprising thing). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-04bitmap_has_sha1_in_uninteresting(): drop BUG checkLibravatar Jeff King1-2/+0
Commit 30cdc33fba (pack-bitmap: save "have" bitmap from walk, 2018-08-21) introduced a new function for looking at the "have" side of a bitmap walk. Because it only makes sense to do so after we've finished the walk, we added an extra safety assertion, making sure that bitmap_git->result is non-NULL. However, this safety is misguided. It was trying to catch the case where we had called prepare_bitmap_walk() to give us a "struct bitmap_index", but had not yet called traverse_bitmap_commit_list() to walk it. But all of the interesting computation (including setting up the result and "have" bitmaps) happens in the first function! The latter function only delivers the result to a callback function. So the case we were worried about is impossible; if you get a non-NULL result from prepare_bitmap_walk(), then its "have" field will be fully formed. But much worse, traverse_bitmap_commit_list() actually frees the result field as it finishes. Which means that this assertion is worse than useless: it's almost guaranteed to trigger! Our test suite didn't catch this because the function isn't actually exercised at all. The only caller comes from 6a1e32d532 (pack-objects: reuse on-disk deltas for thin "have" objects, 2018-08-21), and that's triggered only when you fetch or push history that contains an object with a base that is found deep in history. Our test suite fetches and pushes either don't use bitmaps, or use too-small example repositories. But any reasonably-sized real-world push or fetch (with bitmaps) would trigger this. This patch drops the harmful assertion and tweaks the docstring for the function to make the precondition clear. The tests need to be improved to exercise this new pack-objects feature, but we'll do that in a separate commit. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-21pack-bitmap: save "have" bitmap from walkLibravatar Jeff King1-1/+24
When we do a bitmap walk, we save the result, which represents (WANTs & ~HAVEs); i.e., every object we care about visiting in our walk. However, we throw away the haves bitmap, which can sometimes be useful, too. Save it and provide an access function so code which has performed a walk can query it. A few notes on the accessor interface: - the bitmap code calls these "haves" because it grew out of the want/have negotiation for fetches. But really, these are simply the objects that would be flagged UNINTERESTING in a regular traversal. Let's use that more universal nomenclature for the external module interface. We may want to change the internal naming inside the bitmap code, but that's outside the scope of this patch. - it still uses a bare "sha1" rather than "oid". That's true of all of the bitmap code. And in this particular instance, our caller in pack-objects is dealing with the bare sha1 that comes from a packed REF_DELTA (we're pointing directly to the mmap'd pack on disk). That's something we'll have to deal with as we transition to a new hash, but we can wait and see how the caller ends up being fixed and adjust this interface accordingly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20treewide: use get_all_packsLibravatar Derrick Stolee1-1/+1
There are many places in the codebase that want to iterate over all packfiles known to Git. The purposes are wide-ranging, and those that can take advantage of the multi-pack-index already do. So, use get_all_packs() instead of get_packed_git() to be sure we are iterating over all packfiles. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-21pack-bitmap: add free functionLibravatar Jonathan Tan1-6/+29
Add a function to free struct bitmap_index instances, and use it where needed (except when rebuild_existing_bitmaps() is used, since it creates references to the bitmaps within the struct bitmap_index passed to it). Note that the hashes field in struct bitmap_index is not freed because it points to another field within the same struct. The documentation for that field has been updated to clarify that. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-21pack-bitmap: remove bitmap_git global variableLibravatar Jonathan Tan1-144/+173
Remove the bitmap_git global variable. Instead, generate on demand an instance of struct bitmap_index for code that needs to access it. This allows us significant control over the lifetime of instances of struct bitmap_index. In particular, packs can now be closed without worrying if an unnecessarily long-lived "pack" field in struct bitmap_index still points to it. The bitmap API is also clearer in that we need to first obtain a struct bitmap_index, then we use it. This patch raises two potential issues: (1) memory for the struct bitmap_index is allocated without being freed, and (2) prepare_bitmap_git() and prepare_bitmap_walk() can reuse a previously loaded bitmap. For (1), this will be dealt with in a subsequent patch in this patch set that also deals with freeing the contents of the struct bitmap_index (which were not freed previously, because they have global scope). For (2), current bitmap users only load the bitmap once at most (note that pack-objects can use bitmaps or write bitmaps, but not both at the same time), so support for reuse has no effect - and future users can pass around the struct bitmap_index * obtained if they need to do 2 or more things with the same bitmap. Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-18Merge branch 'jk/ewah-bounds-check'Libravatar Junio C Hamano1-1/+1
The code to read compressed bitmap was not careful to avoid reading past the end of the file, which has been corrected. * jk/ewah-bounds-check: ewah: adjust callers of ewah_read_mmap() ewah_read_mmap: bounds-check mmap reads
2018-06-18ewah: adjust callers of ewah_read_mmap()Libravatar Jeff King1-1/+1
The return value of ewah_read_mmap() is now an ssize_t, since we could (in theory) process up to 32GB of data. This would never happen in practice, but a corrupt or malicious .bitmap or index file could convince us to do so. Let's make sure that we don't stuff the value into an int, which would cause us to incorrectly move our pointer forward. We'd always move too little, since negative values are used for reporting errors. So the worst case is just that we end up reporting a corrupt file, not an out-of-bounds read. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-30Merge branch 'js/use-bug-macro'Libravatar Junio C Hamano1-3/+3
Developer support update, by using BUG() macro instead of die() to mark codepaths that should not happen more clearly. * js/use-bug-macro: BUG_exit_code: fix sparse "symbol not declared" warning Convert remaining die*(BUG) messages Replace all die("BUG: ...") calls by BUG() ones run-command: use BUG() to report bugs, not die() test-tool: help verifying BUG() code paths
2018-05-06Replace all die("BUG: ...") calls by BUG() onesLibravatar Johannes Schindelin1-3/+3
In d8193743e08 (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in 588a538ae55 (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16pack-objects: move in_pack_pos out of struct object_entryLibravatar Nguyễn Thái Ngọc Duy1-1/+1
This field is only need for pack-bitmap, which is an optional feature. Move it to a separate array that is only allocated when pack-bitmap is used (like objects[], it is not freed, since we need it until the end of the process) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26packfile: keep prepare_packed_git() privateLibravatar Nguyễn Thái Ngọc Duy1-1/+0
The reason callers have to call this is to make sure either packed_git or packed_git_mru pointers are initialized since we don't do that by default. Sometimes it's hard to see this connection between where the function is called and where packed_git pointer is used (sometimes in separate functions). Keep this dependency internal because now all access to packed_git and packed_git_mru must go through get_xxx() wrappers. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26packfile: add repository argument to prepare_packed_gitLibravatar Stefan Beller1-1/+1
Add a repository argument to allow prepare_packed_git callers to be more specific about which repository to handle. See commit "sha1_file: add repository argument to link_alt_odb_entry" for an explanation of the #define trick. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26object-store: move packed_git and packed_git_mru to object storeLibravatar Stefan Beller1-1/+3
In a process with multiple repositories open, packfile accessors should be associated to a single repository and not shared globally. Move packed_git and packed_git_mru into the_repository and adjust callers to reflect this. [nd: while at there, wrap access to these two fields in get_packed_git() and get_packed_git_mru(). This allows us to lazily initialize these fields without caller doing that explicitly] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16pack-bitmap: convert traverse_bitmap_commit_list to object_idLibravatar brian m. carlson1-4/+4
Convert traverse_bitmap_commit_list and the callbacks it takes to use a pointer to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-29Merge branch 'ma/leakplugs'Libravatar Junio C Hamano1-7/+3
Memory leaks in various codepaths have been plugged. * ma/leakplugs: pack-bitmap[-write]: use `object_array_clear()`, don't leak object_array: add and use `object_array_pop()` object_array: use `object_array_clear()`, not `free()` leak_pending: use `object_array_clear()`, not `free()` commit: fix memory leak in `reduce_heads()` builtin/commit: fix memory leak in `prepare_index()`
2017-09-24pack-bitmap[-write]: use `object_array_clear()`, don't leakLibravatar Martin Ågren1-7/+3
Instead of setting the fields of rev->pending to 0/NULL, thereby leaking memory, call `object_array_clear(&rev->pending)`. In pack-bitmap.c, we make copies of those fields as `pending_nr` and `pending_e`. We never update the aliases and the original fields never change, so the aliases are not really needed and just make it harder than necessary to understand the code. While we're here, remove the aliases to make the code easier to follow. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move open_pack_index(), parse_pack_index()Libravatar Jonathan Tan1-0/+1
alloc_packed_git() in packfile.c is duplicated from sha1_file.c. In a subsequent commit, alloc_packed_git() will be removed from sha1_file.c. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30Merge branch 'jc/pack-bitmap-unaligned'Libravatar Junio C Hamano1-1/+1
An unaligned 32-bit access in pack-bitmap code ahs been corrected. * jc/pack-bitmap-unaligned: pack-bitmap: don't perform unaligned memory access
2017-06-26pack-bitmap: don't perform unaligned memory accessLibravatar James Clarke1-1/+1
The preceding bitmap entries have a 1-byte XOR-offset and 1-byte flags, so their size is not a multiple of 4. Thus the name-hash cache is only guaranteed to be 2-byte aligned and so we must use get_be32 rather than indexing the array directly. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08object: convert parse_object* to take struct object_idLibravatar brian m. carlson1-2/+2
Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-25sha1_file: rename git_open_noatime() to git_open()Libravatar Lars Schneider1-1/+1
This function is meant to be used when reading from files in the object store, and the original objective was to avoid smudging atime of loose object files too often, hence its name. Because we'll be extending its role in the next commit to also arrange the file descriptors they return auto-closed in the child processes, rename it to lose "noatime" part that is too specific. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-12list-objects: pass full pathname to callbacksLibravatar Jeff King1-9/+4
When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-12list-objects: drop name_path entirelyLibravatar Jeff King1-2/+2
In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-20Merge branch 'jk/pack-revindex'Libravatar Junio C Hamano1-8/+5
In-core storage of the reverse index for .pack files (which lets you go from a pack offset to an object name) has been streamlined. * jk/pack-revindex: pack-revindex: store entries directly in packed_git pack-revindex: drop hash table
2015-12-21pack-revindex: store entries directly in packed_gitLibravatar Jeff King1-8/+5
A pack_revindex struct has two elements: the revindex entries themselves, and a pointer to the packed_git. We need both to do lookups, because only the latter knows things like the number of objects in the pack. Now that packed_git contains the pack_revindex struct it's just as easy to pass around the packed_git itself, and we do not need the extra back-pointer. We can instead just store the entries directly in the pack. All functions which took a pack_revindex now just take a packed_git. We still lazy-load in find_pack_revindex, so most callers are unaffected. The exception is the bitmap code, which computes the revindex and caches the pointer when we load the bitmaps. We can continue to load, drop the extra cache pointer, and just access bitmap_git.pack.revindex directly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20Remove get_object_hash.Libravatar brian m. carlson1-13/+13
Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Convert struct object to object_idLibravatar brian m. carlson1-4/+4
struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Add several uses of get_object_hash.Libravatar brian m. carlson1-13/+13
Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-09-25use strip_suffix and xstrfmt to replace suffixLibravatar Jeff King1-9/+4
When we want to convert "foo.pack" to "foo.idx", we do it by duplicating the original string and then munging the bytes in place. Let's use strip_suffix and xstrfmt instead, which has several advantages: 1. It's more clear what the intent is. 2. It does not implicitly rely on the fact that strlen(".idx") <= strlen(".pack") to avoid an overflow. 3. We communicate the assumption that the input file ends with ".pack" (and get a run-time check that this is so). 4. We drop calls to strcpy, which makes auditing the code base easier. Likewise, we can do this to convert ".pack" to ".bitmap", avoiding some manual memory computation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-24Merge branch 'es/osx-header-pollutes-mask-macro'Libravatar Junio C Hamano1-5/+5
* es/osx-header-pollutes-mask-macro: ewah: use less generic macro name ewah/bitmap: silence warning about MASK macro redefinition
2015-06-03ewah: use less generic macro nameLibravatar Jeff King1-5/+5
The ewah/ewok.h header pollutes the global namespace with "BITS_IN_WORD", without any specific notion that we are talking about the bits in an eword_t. We can give this the more specific name "BITS_IN_EWORD". Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-01Merge branch 'sb/test-bitmap-free-at-end'Libravatar Junio C Hamano1-1/+1
An earlier leakfix to bitmap testing code was incomplete. * sb/test-bitmap-free-at-end: test_bitmap_walk: free bitmap with bitmap_free
2015-05-26Merge branch 'rs/plug-leak-in-pack-bitmaps'Libravatar Junio C Hamano1-5/+3
The code to read pack-bitmap wanted to allocate a few hundred pointers to a structure, but by mistake allocated and leaked memory enough to hold that many actual structures. Correct the allocation size and also have it on stack, as it is small enough. * rs/plug-leak-in-pack-bitmaps: pack-bitmaps: plug memory leak, fix allocation size for recent_bitmaps
2015-05-22test_bitmap_walk: free bitmap with bitmap_freeLibravatar Jeff King1-1/+1
Commit f86a374 (pack-bitmap.c: fix a memleak, 2015-03-30) noticed that we leak the "result" bitmap. But we should use "bitmap_free" rather than straight "free", as the former remembers to free the bitmap array pointed to by the struct. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-19pack-bitmaps: plug memory leak, fix allocation size for recent_bitmapsLibravatar René Scharfe1-5/+3
Use an automatic variable for recent_bitmaps, an array of pointers. This way we don't allocate too much and don't have to free the memory at the end. The old code over-allocated because it reserved enough memory to store all of the structs it is only pointing to and never freed it. 160 64-bit pointers take up 1280 bytes, which is not too much to be placed on the stack. MAX_XOR_OFFSET is turned into a preprocessor constant to make it constant enough for use in an non-variable array declaration. Noticed-by: Stefan Beller <stefanbeller@gmail.com> Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-05Merge branch 'sb/test-bitmap-free-at-end'Libravatar Junio C Hamano1-0/+2
* sb/test-bitmap-free-at-end: pack-bitmap.c: fix a memleak
2015-04-12pack-bitmap.c: fix a memleakLibravatar Stefan Beller1-0/+2
Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-11Merge branch 'jc/unused-symbols'Libravatar Junio C Hamano1-14/+14
Mark file-local symbols as "static", and drop functions that nobody uses. * jc/unused-symbols: shallow.c: make check_shallow_file_for_update() static remote.c: make clear_cas_option() static urlmatch.c: make match_urls() static revision.c: make save_parents() and free_saved_parents() static line-log.c: make line_log_data_init() static pack-bitmap.c: make pack_bitmap_filename() static prompt.c: remove git_getpass() nobody uses http.c: make finish_active_slot() and handle_curl_result() static
2015-02-11Merge branch 'ak/typofixes'Libravatar Junio C Hamano1-1/+1
* ak/typofixes: t/lib-terminal.sh: fix typo pack-bitmap: fix typo
2015-01-21pack-bitmap: fix typoLibravatar Alexander Kuleshov1-1/+1
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-15pack-bitmap.c: make pack_bitmap_filename() staticLibravatar Junio C Hamano1-14/+14
Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-12Merge branch 'jk/pack-bitmap'Libravatar Junio C Hamano1-7/+15
* jk/pack-bitmap: pack-bitmap: do not use gcc packed attribute
2014-11-30pack-bitmap: do not use gcc packed attributeLibravatar Karsten Blees1-7/+15
The "__attribute__" flag may be a noop on some compilers. That's OK as long as the code is correct without the attribute, but in this case it is not. We would typically end up with a struct that is 2 bytes too long due to struct padding, breaking both reading and writing of bitmaps. Instead of marshalling the data in a struct, let's just provide helpers for reading and writing the appropriate types. Besides being correct on all platforms, the result is more efficient and simpler to read. Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-18use REALLOC_ARRAY for changing the allocation size of arraysLibravatar René Scharfe1-4/+2
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-04add `ignore_missing_links` mode to revwalkLibravatar Vicent Marti1-0/+2
When pack-objects is computing the reachability bitmap to serve a fetch request, it can erroneously die() if some of the UNINTERESTING objects are not present. Upload-pack throws away HAVE lines from the client for objects we do not have, but we may have a tip object without all of its ancestors (e.g., if the tip is no longer reachable and was new enough to survive a `git prune`, but some of its reachable objects did get pruned). In the non-bitmap case, we do a revision walk with the HAVE objects marked as UNINTERESTING. The revision walker explicitly ignores errors in accessing UNINTERESTING commits to handle this case (and we do not bother looking at UNINTERESTING trees or blobs at all). When we have bitmaps, however, the process is quite different. The bitmap index for a pack-objects run is calculated in two separate steps: First, we perform an extensive walk from all the HAVEs to find the full set of objects reachable from them. This walk is usually optimized away because we are expected to hit an object with a bitmap during the traversal, which allows us to terminate early. Secondly, we perform an extensive walk from all the WANTs, which usually also terminates early because we hit a commit with an existing bitmap. Once we have the resulting bitmaps from the two walks, we AND-NOT them together to obtain the resulting set of objects we need to pack. When we are walking the HAVE objects, the revision walker does not know that we are walking it only to mark the results as uninteresting. We strip out the UNINTERESTING flag, because those objects _are_ interesting to us during the first walk. We want to keep going to get a complete set of reachable objects if we can. We need some way to tell the revision walker that it's OK to silently truncate the HAVE walk, just like it does for the UNINTERESTING case. This patch introduces a new `ignore_missing_links` flag to the `rev_info` struct, which we set only for the HAVE walk. It also adds tests to cover UNINTERESTING objects missing from several positions: a missing blob, a missing tree, and a missing parent commit. The missing blob already worked (as we do not care about its contents at all), but the other two cases caused us to die(). Note that there are a few cases we do not need to test: 1. We do not need to test a missing tree, with the blob still present. Without the tree that refers to it, we would not know that the blob is relevant to our walk. 2. We do not need to test a tip commit that is missing. Upload-pack omits these for us (and in fact, we complain even in the non-bitmap case if it fails to do so). Reported-by: Siddharth Agarwal <sid0@fb.com> Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-30pack-bitmap: implement optional name_hash cacheLibravatar Vicent Marti1-0/+11
When we use pack bitmaps rather than walking the object graph, we end up with the list of objects to include in the packfile, but we do not know the path at which any tree or blob objects would be found. In a recently packed repository, this is fine. A fetch would use the paths only as a heuristic in the delta compression phase, and a fully packed repository should not need to do much delta compression. As time passes, though, we may acquire more objects on top of our large bitmapped pack. If clients fetch frequently, then they never even look at the bitmapped history, and all works as usual. However, a client who has not fetched since the last bitmap repack will have "have" tips in the bitmapped history, but "want" newer objects. The bitmaps themselves degrade gracefully in this circumstance. We manually walk the more recent bits of history, and then use bitmaps when we hit them. But we would also like to perform delta compression between the newer objects and the bitmapped objects (both to delta against what we know the user already has, but also between "new" and "old" objects that the user is fetching). The lack of pathnames makes our delta heuristics much less effective. This patch adds an optional cache of the 32-bit name_hash values to the end of the bitmap file. If present, a reader can use it to match bitmapped and non-bitmapped names during delta compression. Here are perf results for p5310: Test origin/master HEAD^ HEAD ------------------------------------------------------------------------------------------------- 5310.2: repack to disk 36.81(37.82+1.43) 47.70(48.74+1.41) +29.6% 47.75(48.70+1.51) +29.7% 5310.3: simulated clone 30.78(29.70+2.14) 1.08(0.97+0.10) -96.5% 1.07(0.94+0.12) -96.5% 5310.4: simulated fetch 3.16(6.10+0.08) 3.54(10.65+0.06) +12.0% 1.70(3.07+0.06) -46.2% 5310.6: partial bitmap 36.76(43.19+1.81) 6.71(11.25+0.76) -81.7% 4.08(6.26+0.46) -88.9% You can see that the time spent on an incremental fetch goes down, as our delta heuristics are able to do their work. And we save time on the partial bitmap clone for the same reason. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>