summaryrefslogtreecommitdiff
path: root/revision.h
AgeCommit message (Collapse)AuthorFilesLines
2018-11-18Merge branch 'ds/reachable-topo-order'Libravatar Junio C Hamano1-0/+7
The revision walker machinery learned to take advantage of the commit generation numbers stored in the commit-graph file. * ds/reachable-topo-order: t6012: make rev-list tests more interesting revision.c: generation-based topo-order algorithm commit/revisions: bookkeeping before refactoring revision.c: begin refactoring --topo-order logic test-reach: add rev-list tests test-reach: add run_three_modes method prio-queue: add 'peek' operation
2018-11-06Merge branch 'md/exclude-promisor-objects-fix'Libravatar Junio C Hamano1-0/+1
Operations on promisor objects make sense in the context of only a small subset of the commands that internally use the revisions machinery, but the "--exclude-promisor-objects" option were taken and led to nonsense results by commands like "log", to which it didn't make much sense. This has been corrected. * md/exclude-promisor-objects-fix: exclude-promisor-objects: declare when option is allowed Documentation/git-log.txt: do not show --exclude-promisor-objects
2018-11-02revision.c: generation-based topo-order algorithmLibravatar Derrick Stolee1-0/+2
The current --topo-order algorithm requires walking all reachable commits up front, topo-sorting them, all before outputting the first value. This patch introduces a new algorithm which uses stored generation numbers to incrementally walk in topo-order, outputting commits as we go. This can dramatically reduce the computation time to write a fixed number of commits, such as when limiting with "-n <N>" or filling the first page of a pager. When running a command like 'git rev-list --topo-order HEAD', Git performed the following steps: 1. Run limit_list(), which parses all reachable commits, adds them to a linked list, and distributes UNINTERESTING flags. If all unprocessed commits are UNINTERESTING, then it may terminate without walking all reachable commits. This does not occur if we do not specify UNINTERESTING commits. 2. Run sort_in_topological_order(), which is an implementation of Kahn's algorithm. It first iterates through the entire set of important commits and computes the in-degree of each (plus one, as we use 'zero' as a special value here). Then, we walk the commits in priority order, adding them to the priority queue if and only if their in-degree is one. As we remove commits from this priority queue, we decrement the in-degree of their parents. 3. While we are peeling commits for output, get_revision_1() uses pop_commit on the full list of commits computed by sort_in_topological_order(). In the new algorithm, these three steps correspond to three different commit walks. We run these walks simultaneously, and advance each only as far as necessary to satisfy the requirements of the 'higher order' walk. We know when we can pause each walk by using generation numbers from the commit- graph feature. Recall that the generation number of a commit satisfies: * If the commit has at least one parent, then the generation number is one more than the maximum generation number among its parents. * If the commit has no parent, then the generation number is one. There are two special generation numbers: * GENERATION_NUMBER_INFINITY: this value is 0xffffffff and indicates that the commit is not stored in the commit-graph and the generation number was not previously calculated. * GENERATION_NUMBER_ZERO: this value (0) is a special indicator to say that the commit-graph was generated by a version of Git that does not compute generation numbers (such as v2.18.0). Since we use generation_numbers_enabled() before using the new algorithm, we do not need to worry about GENERATION_NUMBER_ZERO. However, the existence of GENERATION_NUMBER_INFINITY implies the following weaker statement than the usual we expect from generation numbers: If A and B are commits with generation numbers gen(A) and gen(B) and gen(A) < gen(B), then A cannot reach B. Thus, we will walk in each of our stages until the "maximum unexpanded generation number" is strictly lower than the generation number of a commit we are about to use. The walks are as follows: 1. EXPLORE: using the explore_queue priority queue (ordered by maximizing the generation number), parse each reachable commit until all commits in the queue have generation number strictly lower than needed. During this walk, update the UNINTERESTING flags as necessary. 2. INDEGREE: using the indegree_queue priority queue (ordered by maximizing the generation number), add one to the in- degree of each parent for each commit that is walked. Since we walk in order of decreasing generation number, we know that discovering an in-degree value of 0 means the value for that commit was not initialized, so should be initialized to two. (Recall that in-degree value "1" is what we use to say a commit is ready for output.) As we iterate the parents of a commit during this walk, ensure the EXPLORE walk has walked beyond their generation numbers. 3. TOPO: using the topo_queue priority queue (ordered based on the sort_order given, which could be commit-date, author- date, or typical topo-order which treats the queue as a LIFO stack), remove a commit from the queue and decrement the in-degree of each parent. If a parent has an in-degree of one, then we add it to the topo_queue. Before we decrement the in-degree, however, ensure the INDEGREE walk has walked beyond that generation number. The implementations of these walks are in the following methods: * explore_walk_step and explore_to_depth * indegree_walk_step and compute_indegrees_to_depth * next_topo_commit and expand_topo_walk These methods have some patterns that may seem strange at first, but they are probably carry-overs from their equivalents in limit_list and sort_in_topological_order. One thing that is missing from this implementation is a proper way to stop walking when the entire queue is UNINTERESTING, so this implementation is not enabled by comparisions, such as in 'git rev-list --topo-order A..B'. This can be updated in the future. In my local testing, I used the following Git commands on the Linux repository in three modes: HEAD~1 with no commit-graph, HEAD~1 with a commit-graph, and HEAD with a commit-graph. This allows comparing the benefits we get from parsing commits from the commit-graph and then again the benefits we get by restricting the set of commits we walk. Test: git rev-list --topo-order -100 HEAD HEAD~1, no commit-graph: 6.80 s HEAD~1, w/ commit-graph: 0.77 s HEAD, w/ commit-graph: 0.02 s Test: git rev-list --topo-order -100 HEAD -- tools HEAD~1, no commit-graph: 9.63 s HEAD~1, w/ commit-graph: 6.06 s HEAD, w/ commit-graph: 0.06 s This speedup is due to a few things. First, the new generation- number-enabled algorithm walks commits on order of the number of results output (subject to some branching structure expectations). Since we limit to 100 results, we are running a query similar to filling a single page of results. Second, when specifying a path, we must parse the root tree object for each commit we walk. The previous benefits from the commit-graph are entirely from reading the commit-graph instead of parsing commits. Since we need to parse trees for the same number of commits as before, we slow down significantly from the non-path-based query. For the test above, I specifically selected a path that is changed frequently, including by merge commits. A less-frequently-changed path (such as 'README') has similar end-to-end time since we need to walk the same number of commits (before determining we do not have 100 hits). However, get the benefit that the output is presented to the user as it is discovered, much the same as a normal 'git log' command (no '--topo-order'). This is an improved user experience, even if the command has the same runtime. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-02revision.c: begin refactoring --topo-order logicLibravatar Derrick Stolee1-0/+4
When running 'git rev-list --topo-order' and its kin, the topo_order setting in struct rev_info implies the limited setting. This means that the following things happen during prepare_revision_walk(): * revs->limited implies we run limit_list() to walk the entire reachable set. There are some short-cuts here, such as if we perform a range query like 'git rev-list COMPARE..HEAD' and we can stop limit_list() when all queued commits are uninteresting. * revs->topo_order implies we run sort_in_topological_order(). See the implementation of that method in commit.c. It implies that the full set of commits to order is in the given commit_list. These two methods imply that a 'git rev-list --topo-order HEAD' command must walk the entire reachable set of commits _twice_ before returning a single result. If we have a commit-graph file with generation numbers computed, then there is a better way. This patch introduces some necessary logic redirection when we are in this situation. In v2.18.0, the commit-graph file contains zero-valued bytes in the positions where the generation number is stored in v2.19.0 and later. Thus, we use generation_numbers_enabled() to check if the commit-graph is available and has non-zero generation numbers. When setting revs->limited only because revs->topo_order is true, only do so if generation numbers are not available. There is no reason to use the new logic as it will behave similarly when all generation numbers are INFINITY or ZERO. In prepare_revision_walk(), if we have revs->topo_order but not revs->limited, then we trigger the new logic. It breaks the logic into three pieces, to fit with the existing framework: 1. init_topo_walk() fills a new struct topo_walk_info in the rev_info struct. We use the presence of this struct as a signal to use the new methods during our walk. In this patch, this method simply calls limit_list() and sort_in_topological_order(). In the future, this method will set up a new data structure to perform that logic in-line. 2. next_topo_commit() provides get_revision_1() with the next topo- ordered commit in the list. Currently, this simply pops the commit from revs->commits. 3. expand_topo_walk() provides get_revision_1() with a way to signal walking beyond the latest commit. Currently, this calls add_parents_to_list() exactly like the old logic. While this commit presents method redirection for performing the exact same logic as before, it allows the next commit to focus only on the new logic. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-30Merge branch 'md/filter-trees'Libravatar Junio C Hamano1-2/+24
The "rev-list --filter" feature learned to exclude all trees via "tree:0" filter. * md/filter-trees: list-objects: support for skipping tree traversal filter-trees: code clean-up of tests list-objects-filter: implement filter tree:0 list-objects-filter-options: do not over-strbuf_init list-objects-filter: use BUG rather than die revision: mark non-user-given objects instead rev-list: handle missing tree objects properly list-objects: always parse trees gently list-objects: refactor to process_tree_contents list-objects: store common func args in struct
2018-10-23exclude-promisor-objects: declare when option is allowedLibravatar Matthew DeVore1-0/+1
The --exclude-promisor-objects option causes some funny behavior in at least two commands: log and blame. It causes a BUG crash: $ git log --exclude-promisor-objects BUG: revision.c:2143: exclude_promisor_objects can only be used when fetch_if_missing is 0 Aborted [134] Fix this such that the option is treated like any other unknown option. The commands that must support it are limited, so declare in those commands that the flag is supported. In particular: pack-objects prune rev-list The commands were found by searching for logic which parses --exclude-promisor-objects outside of revision.c. Extra logic outside of revision.c is needed because fetch_if_missing must be turned on before revision.c sees the option or it will BUG-crash. The above list is supported by the fact that no other command is introspectively invoked by another command passing --exclude-promisor-object. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-19Merge branch 'nd/the-index'Libravatar Junio C Hamano1-4/+11
Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...
2018-10-07revision: mark non-user-given objects insteadLibravatar Matthew DeVore1-2/+9
Currently, list-objects.c incorrectly treats all root trees of commits as USER_GIVEN. Also, it would be easier to mark objects that are non-user-given instead of user-given, since the places in the code where we access an object through a reference are more obvious than the places where we access an object that was given by the user. Resolve these two problems by introducing a flag NOT_USER_GIVEN that marks blobs and trees that are non-user-given, replacing USER_GIVEN. (Only blobs and trees are marked because this mark is only used when filtering objects, and filtering of other types of objects is not supported yet.) This fixes a bug in that git rev-list behaved differently from git pack-objects. pack-objects would *not* filter objects given explicitly on the command line and rev-list would filter. This was because the two commands used a different function to add objects to the rev_info struct. This seems to have been an oversight, and pack-objects has the correct behavior, so I added a test to make sure that rev-list now behaves properly. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-07rev-list: handle missing tree objects properlyLibravatar Matthew DeVore1-0/+15
Previously, we assumed only blob objects could be missing. This patch makes rev-list handle missing trees like missing blobs. The --missing=* and --exclude-promisor-objects flags now work for trees as they already do for blobs. This is demonstrated in t6112. Signed-off-by: Matthew DeVore <matvore@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21revision.c: reduce implicit dependency the_repositoryLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-21revision.c: remove implicit dependency on the_indexLibravatar Nguyễn Thái Ngọc Duy1-3/+10
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-17Merge branch 'es/format-patch-rangediff'Libravatar Junio C Hamano1-0/+6
"git format-patch" learned a new "--range-diff" option to explain the difference between this version and the previous attempt in the cover letter (or after the tree-dashes as a comment). * es/format-patch-rangediff: format-patch: allow --range-diff to apply to a lone-patch format-patch: add --creation-factor tweak for --range-diff format-patch: teach --range-diff to respect -v/--reroll-count format-patch: extend --range-diff to accept revision range format-patch: add --range-diff option to embed diff in cover letter range-diff: relieve callers of low-level configuration burden range-diff: publish default creation factor range-diff: respect diff_option.file rather than assuming 'stdout'
2018-09-17Merge branch 'es/format-patch-interdiff'Libravatar Junio C Hamano1-0/+5
"git format-patch" learned a new "--interdiff" option to explain the difference between this version and the previous atttempt in the cover letter (or after the tree-dashes as a comment). * es/format-patch-interdiff: format-patch: allow --interdiff to apply to a lone-patch log-tree: show_log: make commentary block delimiting reusable interdiff: teach show_interdiff() to indent interdiff format-patch: teach --interdiff to respect -v/--reroll-count format-patch: add --interdiff option to embed diff in cover letter format-patch: allow additional generated content in make_cover_letter()
2018-09-17Merge branch 'jk/rev-list-stdin-noop-is-ok'Libravatar Junio C Hamano1-0/+5
"git rev-list --stdin </dev/null" used to be an error; it now shows no output without an error. "git rev-list --stdin --default HEAD" still falls back to the given default when nothing is given on the standard input. * jk/rev-list-stdin-noop-is-ok: rev-list: make empty --stdin not an error
2018-08-22rev-list: make empty --stdin not an errorLibravatar Jeff King1-0/+5
When we originally did the series that contains 7ba826290a (revision: add rev_input_given flag, 2017-08-02) the intent was that "git rev-list --stdin </dev/null" would similarly become a successful noop. However, an attempt at the time to do that did not work[1]. The problem is that rev_input_given serves two roles: - it tells rev-list.c that it should not error out - it tells revision.c that it should not have the "default" ref kick (e.g., "HEAD" in "git log") We want to trigger the former, but not the latter. This is technically possible with a single flag, if we set the flag only after revision.c's revs->def check. But this introduces a rather subtle ordering dependency. Instead, let's keep two flags: one to denote when we got actual input (which triggers both roles) and one for when we read stdin (which triggers only the first). This does mean a caller interested in the first role has to check both flags, but there's only one such caller. And any future callers might want to make the distinction anyway (e.g., if they care less about erroring out, and more about whether revision.c soaked up our stdin). In fact, we already keep such a flag internally in revision.c for this purpose, so this is really just exposing that to the caller (and the old function-local flag can go away in favor of our new one). [1] https://public-inbox.org/git/20170802223416.gwiezhbuxbdmbjzx@sigill.intra.peff.net/ Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'en/incl-forward-decl'Libravatar Junio C Hamano1-0/+1
Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations
2018-08-15Add missing includes and forward declarationsLibravatar Elijah Newren1-0/+1
I looped over the toplevel header files, creating a temporary two-line C program for each consisting of #include "git-compat-util.h" #include $HEADER This patch is the result of manually fixing errors in compiling those tiny programs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14format-patch: teach --range-diff to respect -v/--reroll-countLibravatar Eric Sunshine1-0/+1
The --range-diff option announces the embedded range-diff generically as "Range-diff:", however, we can do better when --reroll-count is specified by emitting "Range-diff against v{n}:" instead. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14format-patch: add --range-diff option to embed diff in cover letterLibravatar Eric Sunshine1-0/+5
When submitting a revised version of a patch series, it can be helpful (to reviewers) to include a summary of changes since the previous attempt in the form of a range-diff, however, doing so involves manually copy/pasting the diff into the cover letter. Add a --range-diff option to automate this process. The argument to --range-diff specifies the tip of the previous attempt against which to generate the range-diff. For example: git format-patch --cover-letter --range-diff=v1 -3 v2 (At this stage, the previous attempt and the patch series being formatted must share a common base, however, a subsequent enhancement will make it possible to specify an explicit revision range for the previous attempt.) Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-14Merge branch 'es/format-patch-interdiff' into es/format-patch-rangediffLibravatar Junio C Hamano1-0/+5
* es/format-patch-interdiff: format-patch: allow --interdiff to apply to a lone-patch log-tree: show_log: make commentary block delimiting reusable interdiff: teach show_interdiff() to indent interdiff format-patch: teach --interdiff to respect -v/--reroll-count format-patch: add --interdiff option to embed diff in cover letter format-patch: allow additional generated content in make_cover_letter()
2018-08-03revision.h: drop extern from function declarationLibravatar Nguyễn Thái Ngọc Duy1-34/+35
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-24Merge branch 'jt/partial-clone-fsck-connectivity'Libravatar Junio C Hamano1-1/+2
Partial clone support of "git clone" has been updated to correctly validate the objects it receives from the other side. The server side has been corrected to send objects that are directly requested, even if they may match the filtering criteria (e.g. when doing a "lazy blob" partial clone). * jt/partial-clone-fsck-connectivity: clone: check connectivity even if clone is partial upload-pack: send refs' objects despite "filter"
2018-07-23format-patch: teach --interdiff to respect -v/--reroll-countLibravatar Eric Sunshine1-0/+1
The --interdiff option introduces the embedded interdiff generically as "Interdiff:", however, we can do better when --reroll-count is specified by emitting "Interdiff against v{n}:" instead. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-23format-patch: add --interdiff option to embed diff in cover letterLibravatar Eric Sunshine1-0/+4
When submitting a revised version of a patch series, it can be helpful (to reviewers) to include a summary of changes since the previous attempt in the form of an interdiff, however, doing so involves manually copy/pasting the diff into the cover letter. Add an --interdiff option to automate this process. The argument to --interdiff specifies the tip of the previous attempt against which to generate the interdiff. For example: git format-patch --cover-letter --interdiff=v1 -3 v2 The previous attempt and the patch series being formatted must share a common base. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-09upload-pack: send refs' objects despite "filter"Libravatar Jonathan Tan1-1/+2
A filter line in a request to upload-pack filters out objects regardless of whether they are directly referenced by a "want" line or not. This means that cloning with "--filter=blob:none" (or another filter that excludes blobs) from a repository with at least one ref pointing to a blob (for example, the Git repository itself) results in output like the following: error: missing object referenced by 'refs/tags/junio-gpg-pub' and if that particular blob is not referenced by a fetched tree, the resulting clone fails fsck because there is no object from the remote to vouch that the missing object is a promisor object. Update both the protocol and the upload-pack implementation to include all explicitly specified "want" objects in the packfile regardless of the filter specification. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-21revision.c: use commit-slab for show_sourceLibravatar Nguyễn Thái Ngọc Duy1-1/+4
Instead of relying on commit->util to store the source string, let the user provide a commit-slab to store the source strings in. It's done so that commit->util can be removed. See more explanation in the commit that removes commit->util. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-06Merge branch 'jk/cached-commit-buffer'Libravatar Junio C Hamano1-1/+0
Code clean-up. * jk/cached-commit-buffer: revision: drop --show-all option commit: drop uses of get_cached_commit_buffer()
2018-02-22revision: drop --show-all optionLibravatar Jeff King1-1/+0
This was an undocumented debugging aid that does not seem to have come in handy in the past decade, judging from its lack of mentions on the mailing list. Let's drop it in the name of simplicity. This is morally a revert of 3131b71301 (Add "--show-all" revision walker flag for debugging, 2008-02-09), but note that I did leave in the mapping of UNINTERESTING to "^" in get_revision_mark(). I don't think this would be possible to trigger with the current code, but it's the only sensible marker. We'll skip the usual deprecation period because this was explicitly a debugging aid that was never documented. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-15Merge branch 'rs/lose-leak-pending' into maintLibravatar Junio C Hamano1-12/+0
API clean-up around revision traversal. * rs/lose-leak-pending: commit: remove unused function clear_commit_marks_for_object_array() revision: remove the unused flag leak_pending checkout: avoid using the rev_info flag leak_pending bundle: avoid using the rev_info flag leak_pending bisect: avoid using the rev_info flag leak_pending object: add clear_commit_marks_all() ref-filter: use clear_commit_marks_many() in do_merge_filter() commit: use clear_commit_marks_many() in remove_redundant() commit: avoid allocation in clear_commit_marks_many()
2018-02-13Merge branch 'jh/fsck-promisors'Libravatar Junio C Hamano1-1/+4
In preparation for implementing narrow/partial clone, the machinery for checking object connectivity used by gc and fsck has been taught that a missing object is OK when it is referenced by a packfile specially marked as coming from trusted repository that promises to make them available on-demand and lazily. * jh/fsck-promisors: gc: do not repack promisor packfiles rev-list: support termination at promisor objects sha1_file: support lazily fetching missing objects introduce fetch-object: fetch one promisor object index-pack: refactor writing of .keep files fsck: support promisor objects as CLI argument fsck: support referenced promisor objects fsck: support refs pointing to promisor objects fsck: introduce partialclone extension extension.partialclone: introduce partial clone extension
2018-01-23Merge branch 'rs/lose-leak-pending'Libravatar Junio C Hamano1-12/+0
API clean-up around revision traversal. * rs/lose-leak-pending: commit: remove unused function clear_commit_marks_for_object_array() revision: remove the unused flag leak_pending checkout: avoid using the rev_info flag leak_pending bundle: avoid using the rev_info flag leak_pending bisect: avoid using the rev_info flag leak_pending object: add clear_commit_marks_all() ref-filter: use clear_commit_marks_many() in do_merge_filter() commit: use clear_commit_marks_many() in remove_redundant() commit: avoid allocation in clear_commit_marks_many()
2017-12-28Merge branch 'sb/describe-blob'Libravatar Junio C Hamano1-1/+2
"git describe" was taught to dig trees deeper to find a <commit-ish>:<path> that refers to a given blob object. * sb/describe-blob: builtin/describe.c: describe a blob builtin/describe.c: factor out describe_commit builtin/describe.c: print debug statements earlier builtin/describe.c: rename `oid` to avoid variable shadowing revision.h: introduce blob/tree walking in order of the commits list-objects.c: factor out traverse_trees_and_blobs t6120: fix typo in test name
2017-12-28revision: remove the unused flag leak_pendingLibravatar René Scharfe1-12/+0
Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-12format: create pretty.h fileLibravatar Olga Telezhnaya1-1/+1
Create header for pretty.c to make formatting interface more structured. This is a middle point, this file would be merged further with other files which contain formatting stuff. Signed-off-by: Olga Telezhnaia <olyatelezhnaya@gmail.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08rev-list: support termination at promisor objectsLibravatar Jonathan Tan1-1/+4
Teach rev-list to support termination of an object traversal at any object from a promisor remote (whether one that the local repo also has, or one that the local repo knows about because it has another promisor object that references it). This will be used subsequently in gc and in the connectivity check used by fetch. For efficiency, if an object is referenced by a promisor object, and is in the local repo only as a non-promisor object, object traversal will not stop there. This is to avoid building the list of promisor object references. (In list-objects.c, the case where obj is NULL in process_blob() and process_tree() do not need to be changed because those happen only when there is a conflict between the expected type and the existing object. If the object doesn't exist, an object will be synthesized, which is fine.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-16revision.h: introduce blob/tree walking in order of the commitsLibravatar Stefan Beller1-1/+2
The functionality to list tree objects in the order they were seen while traversing the commits will be used in one of the next commits, where we teach `git describe` to describe not only commits, but blobs, too. The change in list-objects.c is rather minimal as we'll be re-using the infrastructure put in place of the revision walking machinery. For example one could expect that add_pending_tree is not called, but rather commit->tree is directly passed to the tree traversal function. This however requires a lot more code than just emptying the queue containing trees after each commit. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-29Merge branch 'ma/leakplugs'Libravatar Junio C Hamano1-0/+11
Memory leaks in various codepaths have been plugged. * ma/leakplugs: pack-bitmap[-write]: use `object_array_clear()`, don't leak object_array: add and use `object_array_pop()` object_array: use `object_array_clear()`, not `free()` leak_pending: use `object_array_clear()`, not `free()` commit: fix memory leak in `reduce_heads()` builtin/commit: fix memory leak in `prepare_index()`
2017-09-24leak_pending: use `object_array_clear()`, not `free()`Libravatar Martin Ågren1-0/+11
Setting `leak_pending = 1` tells `prepare_revision_walk()` not to release the `pending` array, and makes that the caller's responsibility. See 4a43d374f (revision: add leak_pending flag, 2011-10-01) and 353f5657a (bisect: use leak_pending flag, 2011-10-01). Commit 1da1e07c8 (clean up name allocation in prepare_revision_walk, 2014-10-15) fixed a memory leak in `prepare_revision_walk()` by switching from `free()` to `object_array_clear()`. However, where we use the `leak_pending`-mechanism, we're still only calling `free()`. Use `object_array_clear()` instead. Copy some helpful comments from 353f5657a to the other callers that we update to clarify the memory responsibilities, and to highlight that the commits are not affected when we clear the array -- it is indeed correct to both tidy up the commit flags and clear the object array. Document `leak_pending` in revision.h to help future users get this right. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-24revision.h: new flag in struct rev_info wrt. worktree-related refsLibravatar Nguyễn Thái Ngọc Duy1-0/+1
The revision walker can walk through per-worktree refs like HEAD or SHA-1 references in the index. These currently are from the current worktree only. This new flag is added to change rev-list behavior in this regard: When single_worktree is set, only current worktree is considered. When it is not set (which is the default), all worktrees are considered. The default is chosen so because the two big components that rev-list works with are object database (entirely shared between worktrees) and refs (mostly shared). It makes sense that default behavior goes per-repo too instead of per-worktree. The flag will eventually be exposed as a rev-list argument with documents. For now it stays internal until the new behavior is fully implemented. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-02revision: add rev_input_given flagLibravatar Jeff King1-0/+7
Normally a caller that invokes setup_revisions() has to check rev.pending to see if anything was actually queued for the traversal. But they can't tell the difference between two cases: 1. The user gave us no tip from which to start a traversal. 2. The user tried to give us tips via --glob, --all, etc, but their patterns ended up being empty. Let's set a flag in the rev_info struct that callers can use to tell the difference. We can set this from the init_all_refs_cb() function. That's a little funny because it's not exactly about initializing the "cb" struct itself. But that function is the common setup place for doing pattern traversals that is used by --glob, --all, etc. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-22Merge branch 'sg/revision-parser-skip-prefix'Libravatar Junio C Hamano1-2/+3
Code clean-up. * sg/revision-parser-skip-prefix: revision.c: use skip_prefix() in handle_revision_pseudo_opt() revision.c: use skip_prefix() in handle_revision_opt() revision.c: stricter parsing of '--early-output' revision.c: stricter parsing of '--no-{min,max}-parents' revision.h: turn rev_info.early_output back into an unsigned int
2017-06-12revision.h: turn rev_info.early_output back into an unsigned intLibravatar SZEDER Gábor1-2/+3
rev_info.early_output started out as an unsigned int in cdcefbc97 (Add "--early-output" log flag for interactive GUI use, 2007-11-03), but later it was turned into a single bit in a bit field in cc243c3ce (show: --ignore-missing, 2011-05-18) without explanation, though the code using it still expects it to be a regular integer type and uses it as a counter. Consequently, any even number given via '--early-output=<N>', or indeed a plain '--early-output' defaulting to 100 effectively disabled the feature. Turn rev_info.early_output back into its origin unsigned int data type, making '--early-output' work again. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-29Merge branch 'bc/object-id'Libravatar Junio C Hamano1-3/+3
Conversion from uchar[20] to struct object_id continues. * bc/object-id: (53 commits) object: convert parse_object* to take struct object_id tree: convert parse_tree_indirect to struct object_id sequencer: convert do_recursive_merge to struct object_id diff-lib: convert do_diff_cache to struct object_id builtin/ls-tree: convert to struct object_id merge: convert checkout_fast_forward to struct object_id sequencer: convert fast_forward_to to struct object_id builtin/ls-files: convert overlay_tree_on_cache to object_id builtin/read-tree: convert to struct object_id sha1_name: convert internals of peel_onion to object_id upload-pack: convert remaining parse_object callers to object_id revision: convert remaining parse_object callers to object_id revision: rename add_pending_sha1 to add_pending_oid http-push: convert process_ls_object and descendants to object_id refs/files-backend: convert many internals to struct object_id refs: convert struct ref_update to use struct object_id ref-filter: convert some static functions to struct object_id Convert struct ref_array_item to struct object_id Convert the verify_pack callback to struct object_id Convert lookup_tag to struct object_id ...
2017-05-08revision: rename add_pending_sha1 to add_pending_oidLibravatar brian m. carlson1-3/+3
Rename this function and convert it to take a pointer to struct object_id. This is a prerequisite for converting get_reference, which is needed to convert parse_object. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27timestamp_t: a new data type for timestampsLibravatar Johannes Schindelin1-2/+2
Git's source code assumes that unsigned long is at least as precise as time_t. Which is incorrect, and causes a lot of problems, in particular where unsigned long is only 32-bit (notably on Windows, even in 64-bit versions). So let's just use a more appropriate data type instead. In preparation for this, we introduce the new `timestamp_t` data type. By necessity, this is a very, very large patch, as it has to replace all timestamps' data type in one go. As we will use a data type that is not necessarily identical to `time_t`, we need to be very careful to use `time_t` whenever we interact with the system functions, and `timestamp_t` everywhere else. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-24Merge branch 'rs/path-name-safety-cleanup'Libravatar Junio C Hamano1-2/+0
Code clean-up. * rs/path-name-safety-cleanup: revision: remove declaration of path_name()
2017-03-18revision: remove declaration of path_name()Libravatar René Scharfe1-2/+0
The definition of path_name() was removed by 2824e1841 (list-objects: pass full pathname to callbacks); remove its declaration as well. Signed-off-by: Rene Scharfe <l.s.r@web.de> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-13Merge branch 'lt/pretty-expand-tabs'Libravatar Junio C Hamano1-0/+2
When "git log" shows the log message indented by 4-spaces, the remainder of a line after a HT does not align in the way the author originally intended. The command now expands tabs by default in such a case, and allows the users to override it with a new option, '--no-expand-tabs'. * lt/pretty-expand-tabs: pretty: test --expand-tabs pretty: allow tweaking tabwidth in --expand-tabs pretty: enable --expand-tabs by default for selected pretty formats pretty: expand tabs in indented logs to make things line up properly
2016-03-30pretty: enable --expand-tabs by default for selected pretty formatsLibravatar Junio C Hamano1-1/+2
"git log --pretty={medium,full,fuller}" and "git log" by default prepend 4 spaces to the log message, so it makes sense to enable the new "expand-tabs" facility by default for these formats. Add --no-expand-tabs option to override the new default. The change alone breaks a test in t4201 that runs "git shortlog" on the output from "git log", and expects that the output from "git log" does not do such a tab expansion. Adjust the test to explicitly disable expand-tabs with --no-expand-tabs. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-30pretty: expand tabs in indented logs to make things line up properlyLibravatar Linus Torvalds1-0/+1
A commit log message sometimes tries to line things up using tabs, assuming fixed-width font with the standard 8-place tab settings. Viewing such a commit however does not work well in "git log", as we indent the lines by prefixing 4 spaces in front of them. This should all line up: Column 1 Column 2 -------- -------- A B ABCD EFGH SPACES Instead of Tabs Even with multi-byte UTF8 characters: Column 1 Column 2 -------- -------- Ä B åäö 100 A Møøse once bit my sister.. Tab-expand the lines in "git log --expand-tabs" output before prefixing 4 spaces. This is based on the patch by Linus Torvalds, but at this step, we require an explicit command line option to enable the behaviour. Signed-off-by: Junio C Hamano <gitster@pobox.com>