summaryrefslogtreecommitdiff
path: root/streaming.c
AgeCommit message (Collapse)AuthorFilesLines
2021-05-06streaming.c: move {open,close,read} from vtable to "struct git_istream"Libravatar Ævar Arnfjörð Bjarmason1-43/+29
Move the definition of the structure around the open/close/read functions introduced in 46bf043807c (streaming: a new API to read from the object store, 2011-05-11) to instead populate "close" and "read" members in the "struct git_istream". This gets us rid of an extra pointer deference, and I think makes more sense. The "close" and "read" functions are the primary interface to the stream itself. Let's also populate a "open" callback in the same struct. That's now used by open_istream() after istream_source() decides what "open" function should be used. This isn't needed to get rid of the "stream_vtbl" variables, but makes sense for consistency. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06streaming.c: stop passing around "object_info *" to open()Libravatar Ævar Arnfjörð Bjarmason1-22/+20
Change the streaming interface to stop passing around the "struct object_info" the open() functions. As seen in 7ef2d9a2604 (streaming: read non-delta incrementally from a pack, 2011-05-13) which introduced the "st->u.in_pack" assignments being changed here only the open_istream_pack_non_delta() path need these. So let's instead do this when preparing the selected callback in the istream_source() function. This might also allow the compiler to reduce the lifetime of the "oi" variable, as we've moved it from "git_istream()" to "istream_source()". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06streaming.c: remove {open,close,read}_method_decl() macrosLibravatar Ævar Arnfjörð Bjarmason1-25/+22
Remove the {open,close,read}_method_decl() macros added in 46bf043807c (streaming: a new API to read from the object store, 2011-05-11) in favor of inlining the definition of the arguments of these functions. Since we'll end up using them via the "{open,close,read}_istream_fn" types we don't gain anything in the way of compiler checking by using these macros, and as of preceding commits we no longer need to declare these argument lists twice. So declaring them at a distance just serves to make the code less readable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06streaming.c: remove enum/function/vtbl indirectionLibravatar Ævar Arnfjörð Bjarmason1-25/+11
Remove the indirection of discovering a function pointer to use via an enum and virtual table. This refactors code added in 46bf043807c (streaming: a new API to read from the object store, 2011-05-11). We can instead simply return an "open_istream_fn" for use from the "istream_source()" selector function directly. This allows us to get rid of the "incore", "loose" and "pack_non_delta" enum variables. We'll return the functions instead. The "stream_error" variable in that enum can likewise go in favor of returning NULL, which is what the open_istream() was doing when it got that value anyway. We can thus remove the entire enum, and the "open_istream_tbl" virtual table that (indirectly) referenced it. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-06streaming.c: avoid forward declarationsLibravatar Ævar Arnfjörð Bjarmason1-88/+83
Change code added in 46bf043807c (streaming: a new API to read from the object store, 2011-05-11) to avoid forward declarations of the functions it uses. We can instead move this code to the bottom of the file, and thus avoid the open_method_decl() calls. Aside from the addition of the "static helpers[...]" comment being added here, and the removal of the forward declarations this is a move-only change. The style of the added "static helpers[...]" comment isn't in line with our usual coding style, but is consistent with several other comments used in this file, so let's use that style consistently here. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-01-31streaming: allow open_istream() to handle any repoLibravatar Matheus Tavares1-13/+15
Some callers of open_istream() at archive-tar.c and archive-zip.c are capable of working on arbitrary repositories but the repo struct is not passed down to open_istream(), which uses the_repository internally. For now, that's not a problem since the said callers are only being called with the_repository. But to be consistent and avoid future problems, let's allow open_istream() to receive a struct repository and use that instead of the_repository. This parameter addition will also be used in a future patch to make sha1-file.c:check_object_signature() be able to work on arbitrary repos. Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-06Merge branch 'jk/loose-object-cache-oid'Libravatar Junio C Hamano1-8/+8
Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references
2019-02-05Merge branch 'sb/more-repo-in-api'Libravatar Junio C Hamano1-1/+1
The in-core repository instances are passed through more codepaths. * sb/more-repo-in-api: (23 commits) t/helper/test-repository: celebrate independence from the_repository path.h: make REPO_GIT_PATH_FUNC repository agnostic commit: prepare free_commit_buffer and release_commit_memory for any repo commit-graph: convert remaining functions to handle any repo submodule: don't add submodule as odb for push submodule: use submodule repos for object lookup pretty: prepare format_commit_message to handle arbitrary repositories commit: prepare logmsg_reencode to handle arbitrary repositories commit: prepare repo_unuse_commit_buffer to handle any repo commit: prepare get_commit_buffer to handle any repo commit-reach: prepare in_merge_bases[_many] to handle any repo commit-reach: prepare get_merge_bases to handle any repo commit-reach.c: allow get_merge_bases_many_0 to handle any repo commit-reach.c: allow remove_redundant to handle any repo commit-reach.c: allow merge_bases_many to handle any repo commit-reach.c: allow paint_down_to_common to handle any repo commit: allow parse_commit* to handle any repo object: parse_object to honor its repository argument object-store: prepare has_{sha1, object}_file to handle any repo object-store: prepare read_object_file to deal with any repo ...
2019-01-08sha1-file: modernize loose header/stream functionsLibravatar Jeff King1-6/+6
As with the open/map/close functions for loose objects that were recently converted, the functions for parsing the loose object stream use the name "sha1" and a bare "unsigned char *". Let's fix that so that unpack_sha1_header() becomes unpack_loose_header(), etc. These conversions are less clear-cut than the file access functions. You could argue that the they are parsing Git's canonical object format (i.e., "type size\0contents", over which we compute the hash), which is not strictly tied to loose storage. But in practice these functions are used only for loose objects, and using the term "loose_header" (instead of "object_header") distinguishes it from the object header found in packfiles (which contains the same information in a different format). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08sha1-file: modernize loose object file functionsLibravatar Jeff King1-2/+2
The loose object access code in sha1-file.c is some of the oldest in Git, and could use some modernizing. It mostly uses "unsigned char *" for object ids, which these days should be "struct object_id". It also uses the term "sha1_file" in many functions, which is confusing. The term "loose_objects" is much better. It clearly distinguishes them from packed objects (which didn't even exist back when the name "sha1_file" came into being). And it also distinguishes it from the checksummed-file concept in csum-file.c (which until recently was actually called "struct sha1file"!). This patch converts the functions {open,close,map,stat}_sha1_file() into open_loose_object(), etc, and switches their sha1 arguments for object_id structs. Similarly, path functions like fill_sha1_path() become fill_loose_path() and use object_ids. The function sha1_loose_object_info() already says "loose", so we can just drop the "sha1" (and teach it to use object_id). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-14object-store: allow read_object_file_extended to read from any repoLibravatar Stefan Beller1-1/+1
read_object_file_extended is not widely used, so migrate it all at once. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-31read_istream_pack_non_delta(): document input handlingLibravatar Jeff King1-0/+9
Twice now we have scratched our heads about why the loose streaming code needs the protection added by 692f0bc7ae (avoid infinite loop in read_istream_loose, 2013-03-25), but the similar code in its pack counterpart does not. The short answer is that use_pack() will die before it lets us run out of bytes. Note that this could mean reading garbage (including the trailing hash) from the packfile in some cases of corruption, but that's OK. zlib will notice and complain (and if not, certainly the end result will not match the object hash we expect). Let's leave a comment this time to document our findings. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-26cache.h: add repository argument to oid_object_info_extendedLibravatar Stefan Beller1-1/+1
Add a repository argument to allow oid_object_info_extended callers to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-12replace-object: add repository argument to lookup_replace_objectLibravatar Stefan Beller1-1/+1
Add a repository argument to allow callers of lookup_replace_object to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-12object-store: move lookup_replace_object to replace-object.hLibravatar Stefan Beller1-0/+1
lookup_replace_object is a low-level function that most users of the object store do not need to use directly. Move it to replace-object.h to avoid a dependency loop in an upcoming change to its inline definition that will make use of repository.h. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11Merge branch 'sb/object-store'Libravatar Junio C Hamano1-1/+4
Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...
2018-03-26sha1_file: add repository argument to map_sha1_fileLibravatar Stefan Beller1-1/+4
Add a repository argument to allow map_sha1_file callers to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. While at it, move the declaration to object-store.h, where it should be easier to find. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14Convert lookup_replace_object to struct object_idLibravatar brian m. carlson1-11/+5
Convert both the argument and the return value to be pointers to struct object_id. Update the callers and their internals to deal with the new type. Remove several temporaries which are no longer needed. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14sha1_file: convert read_sha1_file to struct object_idLibravatar brian m. carlson1-1/+1
Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14streaming: convert istream internals to struct object_idLibravatar brian m. carlson1-6/+9
Convert the various open_istream variants to take a pointer to struct object_id. Introduce a temporary, which will be removed later, to work around the fact that lookup_replace_object still returns a pointer to unsigned char. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14sha1_file: convert sha1_object_info* to object_idLibravatar brian m. carlson1-1/+1
Convert sha1_object_info and sha1_object_info_extended to take pointers to struct object_id and rename them to use "oid" instead of "sha1" in their names. Update the declaration and definition and apply the following semantic patch, plus the standard object_id transforms: @@ expression E1, E2; @@ - sha1_object_info(E1.hash, E2) + oid_object_info(&E1, E2) @@ expression E1, E2; @@ - sha1_object_info(E1->hash, E2) + oid_object_info(E1, E2) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1.hash, E2, E3) + oid_object_info_extended(&E1, E2, E3) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1->hash, E2, E3) + oid_object_info_extended(E1, E2, E3) Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14Convert remaining callers of sha1_object_info_extended to object_idLibravatar brian m. carlson1-1/+4
Convert the remaining caller of sha1_object_info_extended to use struct object_id. Introduce temporaries, which will be removed later, since there is a dependency loop between sha1_object_info_extended and lookup_replace_object_extended. This allows us to convert the code in a piecemeal fashion instead of all at once. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14streaming: convert open_istream to use struct object_idLibravatar brian m. carlson1-3/+3
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25Merge branch 'jk/write-in-full-fix'Libravatar Junio C Hamano1-1/+1
Many codepaths did not diagnose write failures correctly when disks go full, due to their misuse of write_in_full() helper function, which have been corrected. * jk/write-in-full-fix: read_pack_header: handle signed/unsigned comparison in read result config: flip return value of store_write_*() notes-merge: use ssize_t for write_in_full() return value pkt-line: check write_in_full() errors against "< 0" convert less-trivial versions of "write_in_full() != len" avoid "write_in_full(fd, buf, len) != len" pattern get-tar-commit-id: check write_in_full() return against 0 config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-14convert less-trivial versions of "write_in_full() != len"Libravatar Jeff King1-1/+1
The prior commit converted many sites to check the return value of write_in_full() for negativity, rather than a mismatch with the input length. This patch covers similar cases, but where the return value is stored in an intermediate variable. These should get the same treatment, but they need to be reviewed more carefully since it would be a bug if the return value is stored in an unsigned type (which indeed, it is in one of the cases). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23pack: move use_pack()Libravatar Jonathan Tan1-0/+1
The function open_packed_git() needs to be temporarily made global. Its scope will be restored to static in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-10Merge branch 'jk/pack-objects-optim-mru'Libravatar Junio C Hamano1-1/+1
"git pack-objects" in a repository with many packfiles used to spend a lot of time looking for/at objects in them; the accesses to the packfiles are now optimized by checking the most-recently-used packfile first. * jk/pack-objects-optim-mru: pack-objects: use mru list when iterating over packs pack-objects: break delta cycles before delta-search phase sha1_file: make packed_object_info public provide an initializer for "struct object_info"
2016-10-03Merge branch 'jc/verify-loose-object-header'Libravatar Junio C Hamano1-6/+6
Codepaths that read from an on-disk loose object were too loose in validating what they are reading is a proper object file and sometimes read past the data they read from the disk, which has been corrected. H/t to Gustavo Grieco for reporting. * jc/verify-loose-object-header: unpack_sha1_header(): detect malformed object header streaming: make sure to notice corrupt object
2016-09-26streaming: make sure to notice corrupt objectLibravatar Junio C Hamano1-6/+6
The streaming read interface from a loose object called parse_sha1_header() but discarded its return value, without noticing a potential error. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-07streaming: make stream_blob_to_fd take struct object_idLibravatar brian m. carlson1-2/+2
Since all of its callers have been updated, modify stream_blob_to_fd to take a struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11provide an initializer for "struct object_info"Libravatar Jeff King1-1/+1
An all-zero initializer is fine for this struct, but because the first element is a pointer, call sites need to know to use "NULL" instead of "0". Otherwise some static checkers like "sparse" will complain; see d099b71 (Fix some sparse warnings, 2013-07-18) for example. So let's provide an initializer to make this easier to get right. But let's also comment that memset() to zero is explicitly OK[1]. One of the callers embeds object_info in another struct which is initialized via memset (expand_data in builtin/cat-file.c). Since our subset of C doesn't allow assignment from a compound literal, handling this in any other way is awkward, so we'd like to keep the ability to initialize by memset(). By documenting this property, it should make anybody who wants to change the initializer think twice before doing so. There's one other caller of interest. In parse_sha1_header(), we did not initialize the struct fully in the first place. This turned out not to be a bug because the sub-function it calls does not look at any other fields except the ones we did initialize. But that assumption might not hold in the future, so it's a dangerous construct. This patch switches it to initializing the whole struct, which protects us against unexpected reads of the other fields. [1] Obviously using memset() to initialize a pointer violates the C standard, but we long ago decided that it was an acceptable tradeoff in the real world. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-14Merge branch 'sb/plug-streaming-leak'Libravatar Junio C Hamano1-1/+4
* sb/plug-streaming-leak: streaming.c: fix a memleak
2015-03-31streaming.c: fix a memleakLibravatar John Keeping1-1/+4
When stream_blob_to_fd() opens an input stream with a filter, the filter gets discarded upon calling close_istream() before the function returns in the normal case. However, when we fail to open the stream, we failed to discard the filter. By discarding the filter in the failure case, give a consistent life-time rule of the filter to the callers; otherwise the callers need to conditionally discard the filter themselves, and this function does not give enough hint for the caller to do so correctly. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18open_istream(): do not dereference NULL in the error caseLibravatar Junio C Hamano1-1/+3
When stream-filter cannot be attached, it is expected to return NULL, and we should close the stream we opened and signal an error by returning NULL ourselves from this function. However, we attempted to dereference that NULL pointer between the point we detected the error and returned from the function. Brought-to-attention-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-27Merge branch 'ef/mingw-write'Libravatar Junio C Hamano1-1/+1
* ef/mingw-write: mingw: remove mingw_write prefer xwrite instead of write
2014-01-17prefer xwrite instead of writeLibravatar Erik Faye-Lund1-1/+1
Our xwrite wrapper already deals with a few potential hazards, and are as such more robust. Prefer it instead of write to get the robustness benefits everywhere. Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-and-improved-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-12sha1_object_info_extended(): add an "unsigned flags" parameterLibravatar Christian Couder1-1/+1
This parameter is not used yet, but it will be used to tell sha1_object_info_extended() if it should perform object replacement or not. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-24Merge branch 'jk/cat-file-batch-optim'Libravatar Junio C Hamano1-2/+2
If somebody wants to only know on-disk footprint of an object without having to know its type or payload size, we can bypass a lot of code to cheaply learn it. * jk/cat-file-batch-optim: Fix some sparse warnings sha1_object_info_extended: pass object_info to helpers sha1_object_info_extended: make type calculation optional packed_object_info: make type lookup optional packed_object_info: hoist delta type resolution to helper sha1_loose_object_info: make type lookup optional sha1_object_info_extended: rename "status" to "type" cat-file: disable object/refname ambiguity check for batch mode
2013-07-23open_istream: remove unneeded check for null pointerLibravatar Stefan Beller1-1/+1
'st' is allocated via xmalloc a few lines before and passed to the stream opening functions. The xmalloc function is written in a way that either 'st' is allocated valid memory or xmalloc already dies. The function calls to open_istream_* do not change 'st', as the pointer is passed by reference and not a pointer of a pointer. Hence 'st' cannot be null at that part of the code. Signed-off-by: Stefan Beller <stefanbeller@googlemail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-18Fix some sparse warningsLibravatar Ramsay Jones1-1/+1
Sparse issues some "Using plain integer as NULL pointer" warnings. Each warning relates to the use of an '{0}' initialiser expression in the declaration of an 'struct object_info'. The first field of this structure has pointer type. Thus, in order to suppress these warnings, we replace the initialiser expression with '{NULL}'. Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-12sha1_object_info_extended: make type calculation optionalLibravatar Jeff King1-1/+1
Each caller of sha1_object_info_extended sets up an object_info struct to tell the function which elements of the object it wants to get. Until now, getting the type of the object has always been required (and it is returned via the return type rather than a pointer in object_info). This can involve actually opening a loose object file to determine its type, or following delta chains to determine a packed file's base type. These effects produce a measurable slow-down when doing a "cat-file --batch-check" that does not include %(objecttype). This patch adds a "typep" query to struct object_info, so that it can be optionally queried just like size and disk_size. As a result, the return type of the function is no longer the object type, but rather 0/-1 for success/error. As there are only three callers total, we just fix up each caller rather than keep a compatibility wrapper: 1. The simpler sha1_object_info wrapper continues to always ask for and return the type field. 2. The istream_source function wants to know the type, and so always asks for it. 3. The cat-file batch code asks for the type only when %(objecttype) is part of the format string. On linux.git, the best-of-five for running: $ git rev-list --objects --all >objects $ time git cat-file --batch-check='%(objectsize:disk)' on a fully packed repository goes from: real 0m8.680s user 0m8.160s sys 0m0.512s to: real 0m7.205s user 0m6.580s sys 0m0.608s Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-07zero-initialize object_info structsLibravatar Jeff King1-1/+1
The sha1_object_info_extended function expects the caller to provide a "struct object_info" which contains pointers to "query" items that will be filled in. The purpose of providing pointers rather than storing the response directly in the struct is so that callers can choose not to incur the expense in finding particular fields that they do not care about. Right now the only query item is "sizep", and all callers set it explicitly to choose whether or not to query it; they can then leave the rest of the struct uninitialized. However, as we add new query items, each caller will have to be updated to explicitly turn off the new ones (by setting them to NULL). Instead, let's teach each caller to zero-initialize the struct, so that they do not have to learn about each new query item added. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-27avoid infinite loop in read_istream_looseLibravatar Jeff King1-1/+1
The read_istream_loose function loops on inflating a chunk of data from an mmap'd loose object. We end the loop when we run out of space in our output buffer, or if we see a zlib error. We need to treat Z_BUF_ERROR specially, though, as it is not fatal; it is just zlib's way of telling us that we need to either feed it more input or give it more output space. It is perfectly normal for us to hit this when we are at the end of our buffer. However, we may also get Z_BUF_ERROR because we have run out of input. In a well-formed object, this should not happen, because we have fed the whole mmap'd contents to zlib. But if the object is truncated or corrupt, we will loop forever, never giving zlib any more data, but continuing to ask it to inflate. We can fix this by considering it an error when zlib returns Z_BUF_ERROR but we still have output space left (which means it must want more input, which we know is a truncation error). It would not be sufficient to just check whether zlib had consumed all the input at the start of the loop, as it might still want to generate output from what is in its internal state. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-27read_istream_filtered: propagate read error from upstreamLibravatar Jeff King1-1/+1
The filter istream pulls data from an "upstream" stream, running it through a filter function. However, we did not properly notice when the upstream filter yielded an error, and just returned what we had read. Instead, we should propagate the error. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-27stream_blob_to_fd: detect errors reading from streamLibravatar Jeff King1-0/+2
We call read_istream, but never check its return value for errors. This can lead to us looping infinitely, as we just keep trying to write "-1" bytes (and we do not notice the error, as we simply check that write_in_full reports the same number of bytes we fed it, which of course is also -1). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-18pack-objects, streaming: turn "xx >= big_file_threshold" to ".. > .."Libravatar Nguyễn Thái Ngọc Duy1-1/+1
This is because all other places do "xx > big_file_threshold" Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03streaming: void pointer instead of char pointerLibravatar René Scharfe1-1/+1
Allow any kind of buffer to be fed to read_istream() without an explicit cast by making it's buf argument a void pointer. It's about arbitrary data, not only characters. Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-07streaming: make streaming-write-entry to be more reusableLibravatar Junio C Hamano1-0/+55
The static function in entry.c takes a cache entry and streams its blob contents to a file in the working tree. Refactor the logic to a new API function stream_blob_to_fd() that takes an object name and an open file descriptor, so that it can be reused by other callers. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-01Merge branch 'jc/streaming-filter'Libravatar Junio C Hamano1-1/+3
* jc/streaming-filter: streaming: free git_istream upon closing
2011-07-22streaming: free git_istream upon closingLibravatar Jeff King1-1/+3
Kirill Smelkov noticed that post-1.7.6 "git checkout" started leaking tons of memory. The streaming_write_entry function properly calls close_istream(), but that function did not actually free() the allocated git_istream struct. The git_istream struct is totally opaque to calling code, and must be heap-allocated by open_istream. Therefore it's not appropriate for callers to have to free it. This patch makes close_istream() into "close and de-allocate all associated resources". We could add a new "free_istream" call, but there's not much point in letting callers inspect the istream after close. And this patch's semantics make us match fopen/fclose, which is well-known and understood. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>