summaryrefslogtreecommitdiff
path: root/builtin/rev-list.c
AgeCommit message (Collapse)AuthorFilesLines
2017-08-11Merge branch 'jk/rev-list-empty-input'Libravatar Junio C Hamano1-1/+2
"git log --tag=no-such-tag" showed log starting from HEAD, which has been fixed---it now shows nothing. * jk/rev-list-empty-input: revision: do not fallback to default when rev_input_given is set rev-list: don't show usage when we see empty ref patterns revision: add rev_input_given flag t6018: flesh out empty input/output rev-list tests
2017-08-11Merge branch 'jk/reflog-walk'Libravatar Junio C Hamano1-1/+2
Numerous bugs in walking of reflogs via "log -g" and friends have been fixed. * jk/reflog-walk: reflog-walk: apply --since/--until to reflog dates reflog-walk: stop using fake parents rev-list: check reflog_info before showing usage get_revision_1(): replace do-while with an early return log: do not free parents when walking reflog log: clarify comment about reflog cycles revision: disallow reflog walking with revs->limited t1414: document some reflog-walk oddities
2017-08-02rev-list: don't show usage when we see empty ref patternsLibravatar Jeff King1-1/+2
If the user gives us no starting point for a traversal, we want to complain with our normal usage message. But if they tried to do so with "--all" or "--glob", but that happened not to match any refs, the usage message isn't helpful. We should just give them the empty output they asked for instead. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-13rev-list: pass diffopt->use_colors through to pretty-printLibravatar Jeff King1-0/+1
When rev-list pretty-prints a commit, it creates a new pretty_print_context and copies items from the rev_info struct. We don't currently copy the "use_color" field, though. Nobody seems to have noticed because the only part of pretty.c that cares is the %C(auto,...) placeholder, and presumably not many people use that with the rev-list plumbing (as opposed to with git-log). It will become more noticeable in a future patch, though, when we start treating all user-format colors as auto-colors (in which case it would become impossible to format colors with rev-list, even with --color=always). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-09rev-list: check reflog_info before showing usageLibravatar Jeff King1-1/+2
When git-rev-list sees no pending commits, it shows a usage message. This works even when reflog-walking is requested, because the reflog-walk code currently puts the reflog tips into the pending queue. In preparation for refactoring the reflog-walk code, let's explicitly check whether we have any reflogs to walk. For now this is a noop, but the existing reflog tests will make sure that it kicks in after the refactoring. Likewise, we'll add a test that "rev-list -g" without specifying any reflogs continues to fail (so that we know our check does not kick in too aggressively). Note that the implementation needs to go into its own sub-function, as the walk code does not expose its innards outside of reflog-walk.c. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24Merge branch 'bw/config-h'Libravatar Junio C Hamano1-0/+1
Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h
2017-06-19Merge branch 'jk/consistent-h'Libravatar Junio C Hamano1-0/+3
"git $cmd -h" for builtin commands calls the implementation of the command (i.e. cmd_$cmd() function) without doing any repository set-up, and the commands that expect RUN_SETUP is done by the Git potty needs to be prepared to show the help text without barfing. * jk/consistent-h: t0012: test "-h" with builtins git: add hidden --list-builtins option version: convert to parse-options diff- and log- family: handle "git cmd -h" early submodule--helper: show usage for "-h" remote-{ext,fd}: print usage message on invalid arguments upload-archive: handle "-h" option early credential: handle invalid arguments earlier
2017-06-15config: don't include config.h by defaultLibravatar Brandon Williams1-0/+1
Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-05diff- and log- family: handle "git cmd -h" earlyLibravatar Junio C Hamano1-0/+3
"git $builtin -h" bypasses the usual repository setup and calls the cmd_$builtin() function, expecting it to show the help text. Unfortunately the commands in the log- and the diff- family want to call into the revisions machinery, which by definition needs to have a repository already discovered. Strictly speaking, they may not need a repository only for parsing "-h", but it is a good discipline to future-proof codepath to ensure that setup_revisions() is called after we know that a repository is there. Handle the "git $builtin -h" special case very early in these commands to work around potential issues. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'bc/object-id'Libravatar Junio C Hamano1-1/+1
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...
2017-05-08object: convert parse_object* to take struct object_idLibravatar brian m. carlson1-1/+1
Make parse_object, parse_object_or_die, and parse_object_buffer take a pointer to struct object_id. Remove the temporary variables inserted earlier, since they are no longer necessary. Transform all of the callers using the following semantic patch: @@ expression E1; @@ - parse_object(E1.hash) + parse_object(&E1) @@ expression E1; @@ - parse_object(E1->hash) + parse_object(E1) @@ expression E1, E2; @@ - parse_object_or_die(E1.hash, E2) + parse_object_or_die(&E1, E2) @@ expression E1, E2; @@ - parse_object_or_die(E1->hash, E2) + parse_object_or_die(E1, E2) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1.hash, E2, E3, E4, E5) + parse_object_buffer(&E1, E2, E3, E4, E5) @@ expression E1, E2, E3, E4, E5; @@ - parse_object_buffer(E1->hash, E2, E3, E4, E5) + parse_object_buffer(E1, E2, E3, E4, E5) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23PRItime: introduce a new "printf format" for timestampsLibravatar Johannes Schindelin1-1/+1
Currently, Git's source code treats all timestamps as if they were unsigned longs. Therefore, it is okay to write "%lu" when printing them. There is a substantial problem with that, though: at least on Windows, time_t is *larger* than unsigned long, and hence we will want to switch away from the ill-specified `unsigned long` data type. So let's introduce the pseudo format "PRItime" (currently simply being defined to "lu") to make it easier to change the data type used for timestamps. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-26Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZLibravatar brian m. carlson1-1/+1
Since we will likely be introducing a new hash function at some point, and that hash function might be longer than 40 hex characters, use the constant GIT_MAX_HEXSZ, which is designed to be suitable for allocations, instead of GIT_SHA1_HEXSZ. This will ease the transition down the line by distinguishing between places where we need to allocate memory suitable for the largest hash from those where we need to handle the current hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-30use oid_to_hex_r() for converting struct object_id hashes to hex stringsLibravatar René Scharfe1-1/+1
Patch generated by Coccinelle and contrib/coccinelle/object_id.cocci. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-20rev-list: use hdr_termination instead of a always using a newlineLibravatar Jacob Keller1-1/+1
When adding support for prefixing output of log and other commands using --line-prefix, commit 660e113ce118 ("graph: add support for --line-prefix on all graph-aware output", 2016-08-31) accidentally broke rev-list --header output. In order to make the output appear with a line-prefix, the flow was changed to always use the graph subsystem for display. Unfortunately the graph flow in rev-list did not use info->hdr_termination as it was assumed that graph output would never need to putput NULs. Since we now always use the graph code in order to handle the case of line-prefix, simply replace putchar('\n') with putchar(info->hdr_termination) which will correct this issue. Add a test for the --header case to make sure we don't break it in the future. Reported-by: Dennis Kaarsemaker <dennis@kaarsemaker.net> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-31graph: add support for --line-prefix on all graph-aware outputLibravatar Jacob Keller1-41/+33
Add an extension to git-diff and git-log (and any other graph-aware displayable output) such that "--line-prefix=<string>" will print the additional line-prefix on every line of output. To make this work, we have to fix a few bugs in the graph API that force graph_show_commit_msg to be used only when you have a valid graph. Additionally, we extend the default_diff_output_prefix handler to work even when no graph is enabled. This is somewhat of a hack on top of the graph API, but I think it should be acceptable here. This will be used by a future extension of submodule display which displays the submodule diff as the actual diff between the pre and post commit in the submodule project. Add some tests for both git-log and git-diff to ensure that the prefix is honored correctly. Signed-off-by: Jacob Keller <jacob.keller@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-20rev-list: add optional progress reportingLibravatar Jeff King1-0/+17
It's easy to ask rev-list to do a traversal that may takes many seconds (e.g., by calling "--objects --all"). In theory you can monitor its progress by the output you get to stdout, but this isn't always easy. Some operations, like "--count", don't make any output until the end. And some callers, like check_everything_connected(), are using it just for the error-checking of the traversal, and throw away stdout entirely. This patch adds a "--progress" option which can be used to give some eye-candy for a user waiting for a long traversal. This is just a rev-list option and not a regular traversal option, because it needs cooperation from the callbacks in builtin/rev-list.c to do the actual count. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-20Merge branch 'jk/rev-list-count-with-bitmap'Libravatar Junio C Hamano1-1/+5
"git rev-list --count" whose walk-length is limited with "-n" option did not work well with the counting optimized to look at the bitmap index. * jk/rev-list-count-with-bitmap: rev-list: disable bitmaps when "-n" is used with listing objects rev-list: "adjust" results of "--count --use-bitmap-index -n"
2016-06-03rev-list: disable bitmaps when "-n" is used with listing objectsLibravatar Jeff King1-1/+2
You can ask rev-list to use bitmaps to speed up an --objects traversal, which should generally give you your answers much faster. Likewise, you can ask rev-list to limit such a traversal with `-n`, in which case we'll show only a limited set of commits (and only the tree and commit objects directly reachable from those commits). But if you do both together, the results are nonsensical. We end up limiting any fallback traversal we do to _find_ the bitmaps, but the actual set of objects we output will be picked arbitrarily from the union of any bitmaps we do find, and will involve the objects of many more commits. It's possible that somebody might want this as a "show me what you can, but limit the amount of work you do" flag. But as with the prior commit clamping "--count", the results are basically non-deterministic; you'll get the values from some commits between `n` and the total number, and you can't tell which. And unlike the `--count` case, we can't easily generate the "real" value from the bitmap values (you can't just walk back `-n` commits and subtract out the reachable objects from the boundary commits; the bitmaps for `X` record its total reachability, so you don't know which objects are directly from `X` itself, which from `X^`, and so on). So let's just fallback to the non-bitmap code path in this case, so we always give a sane answer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-03rev-list: "adjust" results of "--count --use-bitmap-index -n"Libravatar Jeff King1-0/+3
If you ask rev-list for: git rev-list --count --use-bitmap-index HEAD we optimize out the actual traversal and just give you the number of bits set in the commit bitmap. This is faster, which is good. But if you ask to limit the size of the traversal, like: git rev-list --count --use-bitmap-index -n 100 HEAD we'll still output the full bitmapped number we found. On the surface, that might even seem OK. You explicitly asked to use the bitmap index, and it was cheap to compute the real answer, so we gave it to you. But there's something much more complicated going on under the hood. If we don't have a bitmap directly for HEAD, then we have to actually traverse backwards, looking for a bitmapped commit. And _that_ traversal is bounded by our `-n` count. This is a good thing, because it bounds the work we have to do, which is probably what the user wanted by asking for `-n`. But now it makes the output quite confusing. You might get many values: - your `-n` value, if we walked back and never found a bitmap (or fewer if there weren't that many commits) - the actual full count, if we found a bitmap root for every path of our traversal with in the `-n` limit - any number in between! We might have walked back and found _some_ bitmaps, but then cut off the traversal early with some commits not accounted for in the result. So you cannot even see a value higher than your `-n` and say "OK, bitmaps kicked in, this must be the real full count". The only sane thing is for git to just clamp the value to a maximum of the `-n` value, which means we should output the exact same results whether bitmaps are in use or not. The test in t5310 demonstrates this by using `-n 1`. Without this patch we fail in the full-bitmap case (where we do not have to traverse at all) but _not_ in the partial-bitmap case (where we have to walk down to find an actual bitmap). With this patch, both cases just work. I didn't implement the crazy in-between case, just because it's complicated to set up, and is really a subset of the full-count case, which we do cover. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-16list-objects: pass full pathname to callbacksLibravatar Jeff King1-8/+4
When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-16list-objects: drop name_path entirelyLibravatar Jeff King1-2/+2
In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-12list-objects: pass full pathname to callbacksLibravatar Jeff King1-8/+4
When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-12list-objects: drop name_path entirelyLibravatar Jeff King1-2/+2
In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20Remove get_object_hash.Libravatar brian m. carlson1-2/+2
Convert all instances of get_object_hash to use an appropriate reference to the hash member of the oid member of struct object. This provides no functional change, as it is essentially a macro substitution. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Convert struct object to object_idLibravatar brian m. carlson1-7/+7
struct object is one of the major data structures dealing with object IDs. Convert it to use struct object_id instead of an unsigned char array. Convert get_object_hash to refer to the new member as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-11-20Add several uses of get_object_hash.Libravatar brian m. carlson1-2/+2
Convert most instances where the sha1 member of struct object is dereferenced to use get_object_hash. Most instances that are passed to functions that have versions taking struct object_id, such as get_sha1_hex/get_oid_hex, or instances that can be trivially converted to use struct object_id instead, are not converted. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Jeff King <peff@peff.net>
2015-10-05use sha1_to_hex_r() instead of strcpyLibravatar Jeff King1-2/+2
Before sha1_to_hex_r() existed, a simple way to get hex sha1 into a buffer was with: strcpy(buf, sha1_to_hex(sha1)); This isn't wrong (assuming the buf is 41 characters), but it makes auditing the code base for bad strcpy() calls harder, as these become false positives. Let's convert them to sha1_to_hex_r(), and likewise for some calls to find_unique_abbrev(). While we're here, we'll double-check that all of the buffers are correctly sized, and use the more obvious GIT_SHA1_HEXSZ constant. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-24rev-list: make it obvious that we do not support notesLibravatar Jeff King1-0/+3
The rev-list command does not have the internal infrastructure to display notes. Running: git rev-list --notes HEAD will silently ignore the "--notes" option. Running: git rev-list --notes --grep=. HEAD will crash on an assert. Running: git rev-list --format=%N HEAD will place a literal "%N" in the output (it does not even expand to an empty string). Let's have rev-list tell the user that it cannot fill the user's request, rather than silently producing wrong data. Likewise, let's remove mention of the notes options from the rev-list documentation. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-27Merge branch 'ls/hint-rev-list-count' into maintLibravatar Junio C Hamano1-0/+1
* ls/hint-rev-list-count: rev-list: add --count to usage guide
2015-07-10Merge branch 'ls/hint-rev-list-count'Libravatar Junio C Hamano1-0/+1
* ls/hint-rev-list-count: rev-list: add --count to usage guide
2015-07-01rev-list: disable --use-bitmap-index when pruning commitsLibravatar Jeff King1-1/+1
The reachability bitmaps do not have enough information to tell us which commits might have changed path "foo", so the current code produces wrong answers for: git rev-list --use-bitmap-index --count HEAD -- foo (it silently ignores the "foo" limiter). Instead, we should fall back to doing a normal traversal (it is OK to fall back rather than complain, because --use-bitmap-index is a pure optimization, and might not kick in for other reasons, such as there being no bitmaps in the repository). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-01rev-list: add --count to usage guideLibravatar Lawrence Siebert1-0/+1
--count should be mentioned in the usage guide, this updates code and documentation. Signed-off-by: Lawrence Siebert <lawrencesiebert@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13commit: record buffer length in cacheLibravatar Jeff King1-1/+1
Most callsites which use the commit buffer try to use the cached version attached to the commit, rather than re-reading from disk. Unfortunately, that interface provides only a pointer to the NUL-terminated buffer, with no indication of the original length. For the most part, this doesn't matter. People do not put NULs in their commit messages, and the log code is happy to treat it all as a NUL-terminated string. However, some code paths do care. For example, when checking signatures, we want to be very careful that we verify all the bytes to avoid malicious trickery. This patch just adds an optional "size" out-pointer to get_commit_buffer and friends. The existing callers all pass NULL (there did not seem to be any obvious sites where we could avoid an immediate strlen() call, though perhaps with some further refactoring we could). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13use get_cached_commit_buffer where appropriateLibravatar Jeff King1-1/+1
Some call sites check commit->buffer to see whether we have a cached buffer, and if so, do some work with it. In the long run we may want to switch these code paths to make their decision on a different boolean flag (because checking the cache may get a little more expensive in the future). But for now, we can easily support them by converting the calls to use get_cached_commit_buffer. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13provide a helper to free commit bufferLibravatar Jeff King1-2/+1
This converts two lines into one at each caller. But more importantly, it abstracts the concept of freeing the buffer, which will make it easier to change later. Note that we also need to provide a "detach" mechanism for a tricky case in index-pack. We are passed a buffer for the object generated by processing the incoming pack. If we are not using --strict, we just calculate the sha1 on that buffer and return, leaving the caller to free it. But if we are using --strict, we actually attach that buffer to an object, pass the object to the fsck functions, and then detach the buffer from the object again (so that the caller can free it as usual). In this case, we don't want to free the buffer ourselves, but just make sure it is no longer associated with the commit. Note that we are making the assumption here that the attach/detach process does not impact the buffer at all (e.g., it is never reallocated or modified). That holds true now, and we have no plans to change that. However, as we abstract the commit_buffer code, this dependency becomes less obvious. So when we detach, let's also make sure that we get back the same buffer that we gave to the commit_buffer code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27Merge branch 'jk/pack-bitmap'Libravatar Junio C Hamano1-0/+39
Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...
2013-12-30rev-list: add bitmap mode to speed up object listsLibravatar Vicent Marti1-0/+39
The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only thing required are the SHA1s of the objects during the list (i.e., not the path names at which trees and blobs were found). Calling `git rev-list --objects --use-bitmap-index [committish]` will perform an object iteration based on a bitmap result instead of actually walking the object graph. These are some example timings for `torvalds/linux` (warm cache, best-of-five): $ time git rev-list --objects master > /dev/null real 0m34.191s user 0m33.904s sys 0m0.268s $ time git rev-list --objects --use-bitmap-index master > /dev/null real 0m1.041s user 0m0.976s sys 0m0.064s Likewise, using `git rev-list --count --use-bitmap-index` will speed up the counting operation by building the resulting bitmap and performing a fast popcount (number of bits set on the bitmap) on the result. Here are some sample timings of different ways to count commits in `torvalds/linux`: $ time git rev-list master | wc -l 399882 real 0m6.524s user 0m6.060s sys 0m3.284s $ time git rev-list --count master 399882 real 0m4.318s user 0m4.236s sys 0m0.076s $ time git rev-list --use-bitmap-index --count master 399882 real 0m0.217s user 0m0.176s sys 0m0.040s This also respects negative refs, so you can use it to count a slice of history: $ time git rev-list --count v3.0..master 144843 real 0m1.971s user 0m1.932s sys 0m0.036s $ time git rev-list --use-bitmap-index --count v3.0..master real 0m0.280s user 0m0.220s sys 0m0.056s Though note that the closer the endpoints, the less it helps. In the traversal case, we have fewer commits to cross, so we take less time. But the bitmap time is dominated by generating the pack revindex, which is constant with respect to the refs given. Note that you cannot yet get a fast --left-right count of a symmetric difference (e.g., "--count --left-right master...topic"). The slow part of that walk actually happens during the merge-base determination when we parse "master...topic". Even though a count does not actually need to know the real merge base (it only needs to take the symmetric difference of the bitmaps), the revision code would require some refactoring to handle this case. Additionally, a `--test-bitmap` flag has been added that will perform the same rev-list manually (i.e. using a normal revwalk) and using bitmaps, and verify that the results are the same. This can be used to exercise the bitmap code, and also to verify that the contents of the .bitmap file are sane. Signed-off-by: Vicent Marti <tanoku@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-16C: have space around && and || operatorsLibravatar Junio C Hamano1-1/+1
Correct all hits from git grep -e '\(&&\|||\)[^ ]' -e '[^ ]\(&&\|||\)' -- '*.c' i.e. && or || operators that are followed by anything but a SP, or that follow something other than a SP or a HT, so that these operators have a SP around it when necessary. We usually refrain from making this kind of a tree-wide change in order to avoid unnecessary conflicts with other "real work" patches, but in this case, the end result does not have a potentially cumbersome tree-wide impact, while this is a tree-wide cleanup. Fixes to compat/regex/regcomp.c and xdiff/xemit.c are to replace a HT immediately after && with a SP. This is based on Felipe's patch to bultin/symbolic-ref.c; I did all the finding out what other files in the whole tree need to be fixed and did the fix and also the log message while reviewing that single liner, so any screw-ups in this version are mine. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-28list-objects: reduce one argument in mark_edges_uninterestingLibravatar Nguyễn Thái Ngọc Duy1-1/+1
mark_edges_uninteresting() is always called with this form mark_edges_uninteresting(revs->commits, revs, ...); Remove the first argument and let mark_edges_uninteresting figure that out by itself. It helps answer the question "are this commit list and revs related in any way?" when looking at mark_edges_uninteresting implementation. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-26pretty: --format output should honor logOutputEncodingLibravatar Alexey Shumkin1-0/+1
One can set an alias $ git config [--global] alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)<%an>%Creset' --abbrev-commit --date=local" to see the log as a pretty tree (like *gitk* but in a terminal). However, log messages written in an encoding i18n.commitEncoding which differs from terminal encoding are shown corrupted even when i18n.logOutputEncoding and terminal encoding are the same (e.g. log messages committed on a Cygwin box with Windows-1251 encoding seen on a Linux box with a UTF-8 encoding and vice versa). To simplify an example we can say the following two commands are expected to give the same output to a terminal: $ git log --oneline --no-color $ git log --pretty=format:'%h %s' However, the former pays attention to i18n.logOutputEncoding configuration, while the latter does not when it formats "%s". The same corruption is true for $ git diff --submodule=log and $ git rev-list --pretty=format:%s HEAD and $ git reset --hard This patch makes pretty --format honor logOutputEncoding when it formats log message. Signed-off-by: Alexey Shumkin <Alex.Crezoff@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-29Move print_commit_list to libgit.aLibravatar Nguyễn Thái Ngọc Duy1-10/+0
This is used by bisect.c, part of libgit.a while it stays in builtin/rev-list.c. Move it to commit.c so that we won't get undefined reference if a program that uses libgit.a happens to pull it in. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jeff King <peff@peff.net>
2012-10-29Move estimate_bisect_steps to libgit.aLibravatar Nguyễn Thái Ngọc Duy1-39/+0
This function is used by bisect.c, part of libgit.a while estimate_bisect_steps stays in builtin/rev-list.c. Move it to bisect.a so we won't have undefine reference if a standalone program that uses libgit.a happens to pull it in. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Jeff King <peff@peff.net>
2012-05-11Merge branch 'jk/maint-reflog-walk-count-vs-time'Libravatar Junio C Hamano1-0/+1
Gives a better DWIM behaviour for --pretty=format:%gd, "stash list", and "log -g", depending on how the starting point ("master" vs "master@{0}" vs "master@{now}") and date formatting options (e.g. "--date=iso") are given on the command line. By Jeff King (4) and Junio C Hamano (1) * jk/maint-reflog-walk-count-vs-time: reflog-walk: tell explicit --date=default from not having --date at all reflog-walk: always make HEAD@{0} show indexed selectors reflog-walk: clean up "flag" field of commit_reflog struct log: respect date_mode_explicit with --format:%gd t1411: add more selector index/date tests
2012-05-04log: respect date_mode_explicit with --format:%gdLibravatar Jeff King1-0/+1
When we show a reflog selector (e.g., via "git log -g"), we perform some DWIM magic: while we normally show the entry's index (e.g., HEAD@{1}), if the user has given us a date with "--date", then we show a date-based select (e.g., HEAD@{yesterday}). However, we don't want to trigger this magic if the alternate date format we got was from the "log.date" configuration; that is not sufficiently strong context for us to invoke this particular magic. To fix this, commit f4ea32f (improve reflog date/number heuristic, 2009-09-24) introduced a "date_mode_explicit" flag in rev_info. This flag is set only when we see a "--date" option on the command line, and we a vanilla date to the reflog code if the date was not explicit. Later, commit 8f8f547 (Introduce new pretty formats %g[sdD] for reflog information, 2009-10-19) added another way to show selectors, and it did not respect the date_mode_explicit flag from f4ea32f. This patch propagates the date_mode_explicit flag to the pretty-print code, which can then use it to pass the appropriate date field to the reflog code. This brings the behavior of "%gd" in line with the other formats, and means that its output is independent of any user configuration. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-28rev-list: fix --verify-objects --quiet becoming --objectsLibravatar Nguyễn Thái Ngọc Duy1-11/+15
When --quiet is specified, finish_object() is called instead of show_object(). The latter is in charge of --verify-objects and will be skipped if --quiet is specified. Move the code up to finish_object(). Also pass the quiet flag along and make it always call show_* functions to avoid similar problems in future. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-28rev-list: remove BISECT_SHOW_TRIED flagLibravatar Nguyễn Thái Ngọc Duy1-11/+1
Since c99f069 (bisect--helper: remove "--next-vars" option as it is now useless - 2009-04-21), this flag has always been off. Remove the flag and all related code. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-13git rev-list: fix invalid typecastLibravatar Clemens Buchacher1-2/+2
git rev-list passes rev_list_info, not rev_list objects. Without this fix, rev-list enables or disables the --verify-objects option depending on a read from an undefined memory location. Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-01rev-list --verify-objectLibravatar Junio C Hamano1-0/+4
Often we want to verify everything reachable from a given set of commits are present in our repository and connected without a gap to the tips of our refs. We used to do this for this purpose: $ rev-list --objects $commits_to_be_tested --not --all Even though this is good enough for catching missing commits and trees, we show the object name but do not verify their existence, let alone their well-formedness, for the blob objects at the leaf level. Add a new "--verify-object" option so that we can catch missing and broken blobs as well. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-01list-objects: pass callback data to show_objects()Libravatar Junio C Hamano1-3/+7
The traverse_commit_list() API takes two callback functions, one to show commit objects, and the other to show other kinds of objects. Even though the former has a callback data parameter, so that the callback does not have to rely on global state, the latter does not. Give the show_objects() callback the same callback data parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>