diff options
author | Junio C Hamano <gitster@pobox.com> | 2013-06-14 08:45:59 -0700 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2013-06-14 08:45:59 -0700 |
commit | b27a79d16b21be064c0ac3634928a91e3eee5c01 (patch) | |
tree | e9f75de0da78552765dc5fae1451402e78c36797 | |
parent | Merge branch 'rr/remove-contrib-some' (diff) | |
parent | revision.c: make default history consider bottom commits (diff) | |
download | tgif-b27a79d16b21be064c0ac3634928a91e3eee5c01.tar.xz |
Merge branch 'kb/full-history-compute-treesame-carefully-2'
Major update to the revision traversal logic to improve culling of
irrelevant parents while traversing a mergy history.
* kb/full-history-compute-treesame-carefully-2:
revision.c: make default history consider bottom commits
revision.c: don't show all merges for --parents
revision.c: discount side branches when computing TREESAME
revision.c: add BOTTOM flag for commits
simplify-merges: drop merge from irrelevant side branch
simplify-merges: never remove all TREESAME parents
t6012: update test for tweaked full-history traversal
revision.c: Make --full-history consider more merges
Documentation: avoid "uninteresting"
rev-list-options.txt: correct TREESAME for P
t6111: add parents to tests
t6111: allow checking the parents as well
t6111: new TREESAME test set
t6019: test file dropped in -s ours merge
decorate.c: compact table when growing
-rw-r--r-- | Documentation/rev-list-options.txt | 42 | ||||
-rw-r--r-- | decorate.c | 2 | ||||
-rw-r--r-- | revision.c | 539 | ||||
-rw-r--r-- | revision.h | 4 | ||||
-rwxr-xr-x | t/t6012-rev-list-simplify.sh | 31 | ||||
-rwxr-xr-x | t/t6019-rev-list-ancestry-path.sh | 27 | ||||
-rwxr-xr-x | t/t6111-rev-list-treesame.sh | 196 |
7 files changed, 750 insertions, 91 deletions
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt index 3bdbf5e856..b462f17f62 100644 --- a/Documentation/rev-list-options.txt +++ b/Documentation/rev-list-options.txt @@ -271,8 +271,8 @@ See also linkgit:git-reflog[1]. --boundary:: - Output uninteresting commits at the boundary, which are usually - not shown. + Output excluded boundary commits. Boundary commits are + prefixed with `-`. -- @@ -342,13 +342,13 @@ In the following, we will always refer to the same example history to illustrate the differences between simplification settings. We assume that you are filtering for a file `foo` in this commit graph: ----------------------------------------------------------------------- - .-A---M---N---O---P - / / / / / - I B C D E - \ / / / / - `-------------' + .-A---M---N---O---P---Q + / / / / / / + I B C D E Y + \ / / / / / + `-------------' X ----------------------------------------------------------------------- -The horizontal line of history A---P is taken to be the first parent of +The horizontal line of history A---Q is taken to be the first parent of each merge. The commits are: * `I` is the initial commit, in which `foo` exists with contents @@ -367,8 +367,11 @@ each merge. The commits are: `N` and `D` to "foobarbaz"; i.e., it is not TREESAME to any parent. * `E` changes `quux` to "xyzzy", and its merge `P` combines the - strings to "quux xyzzy". Despite appearing interesting, `P` is - TREESAME to all parents. + strings to "quux xyzzy". `P` is TREESAME to `O`, but not to `E`. + +* `X` is an indpendent root commit that added a new file `side`, and `Y` + modified it. `Y` is TREESAME to `X`. Its merge `Q` added `side` to `P`, and + `Q` is TREESAME to `P`, but not to `Y`. 'rev-list' walks backwards through history, including or excluding commits based on whether '\--full-history' and/or parent rewriting @@ -410,10 +413,10 @@ parent lines. the example, we get + ----------------------------------------------------------------------- - I A B N D O + I A B N D O P Q ----------------------------------------------------------------------- + -`P` and `M` were excluded because they are TREESAME to a parent. `E`, +`M` was excluded because it is TREESAME to both parents. `E`, `C` and `B` were all walked, but only `B` was !TREESAME, so the others do not appear. + @@ -431,7 +434,7 @@ Along each parent, prune away commits that are not included themselves. This results in + ----------------------------------------------------------------------- - .-A---M---N---O---P + .-A---M---N---O---P---Q / / / / / I B / D / \ / / / / @@ -441,7 +444,7 @@ themselves. This results in Compare to '\--full-history' without rewriting above. Note that `E` was pruned away because it is TREESAME, but the parent list of P was rewritten to contain `E`'s parent `I`. The same happened for `C` and -`N`. Note also that `P` was included despite being TREESAME. +`N`, and `X`, `Y` and `Q`. In addition to the above settings, you can change whether TREESAME affects inclusion: @@ -471,8 +474,9 @@ history according to the following rules: * Set `C'` to `C`. + * Replace each parent `P` of `C'` with its simplification `P'`. In - the process, drop parents that are ancestors of other parents, and - remove duplicates. + the process, drop parents that are ancestors of other parents or that are + root commits TREESAME to an empty tree, and remove duplicates, but take care + to never drop all parents that we are TREESAME to. + * If after this parent rewriting, `C'` is a root or merge commit (has zero or >1 parents), a boundary commit, or !TREESAME, it remains. @@ -490,7 +494,7 @@ The effect of this is best shown by way of comparing to `---------' ----------------------------------------------------------------------- + -Note the major differences in `N` and `P` over '--full-history': +Note the major differences in `N`, `P` and `Q` over '--full-history': + -- * `N`'s parent list had `I` removed, because it is an ancestor of the @@ -498,6 +502,10 @@ Note the major differences in `N` and `P` over '--full-history': + * `P`'s parent list similarly had `I` removed. `P` was then removed completely, because it had one parent and is TREESAME. ++ +* `Q`'s parent list had `Y` simplified to `X`. `X` was then removed, because it + was a TREESAME root. `Q` was then removed completely, because it had one + parent and is TREESAME. -- Finally, there is a fifth simplification mode available: diff --git a/decorate.c b/decorate.c index 2f8a63e388..7cb5d29a89 100644 --- a/decorate.c +++ b/decorate.c @@ -49,7 +49,7 @@ static void grow_decoration(struct decoration *n) const struct object *base = old_hash[i].base; void *decoration = old_hash[i].decoration; - if (!base) + if (!decoration) continue; insert_decoration(n, base, decoration); } diff --git a/revision.c b/revision.c index 518cd08693..9318b7d763 100644 --- a/revision.c +++ b/revision.c @@ -334,6 +334,80 @@ static int everybody_uninteresting(struct commit_list *orig) } /* + * A definition of "relevant" commit that we can use to simplify limited graphs + * by eliminating side branches. + * + * A "relevant" commit is one that is !UNINTERESTING (ie we are including it + * in our list), or that is a specified BOTTOM commit. Then after computing + * a limited list, during processing we can generally ignore boundary merges + * coming from outside the graph, (ie from irrelevant parents), and treat + * those merges as if they were single-parent. TREESAME is defined to consider + * only relevant parents, if any. If we are TREESAME to our on-graph parents, + * we don't care if we were !TREESAME to non-graph parents. + * + * Treating bottom commits as relevant ensures that a limited graph's + * connection to the actual bottom commit is not viewed as a side branch, but + * treated as part of the graph. For example: + * + * ....Z...A---X---o---o---B + * . / + * W---Y + * + * When computing "A..B", the A-X connection is at least as important as + * Y-X, despite A being flagged UNINTERESTING. + * + * And when computing --ancestry-path "A..B", the A-X connection is more + * important than Y-X, despite both A and Y being flagged UNINTERESTING. + */ +static inline int relevant_commit(struct commit *commit) +{ + return (commit->object.flags & (UNINTERESTING | BOTTOM)) != UNINTERESTING; +} + +/* + * Return a single relevant commit from a parent list. If we are a TREESAME + * commit, and this selects one of our parents, then we can safely simplify to + * that parent. + */ +static struct commit *one_relevant_parent(const struct rev_info *revs, + struct commit_list *orig) +{ + struct commit_list *list = orig; + struct commit *relevant = NULL; + + if (!orig) + return NULL; + + /* + * For 1-parent commits, or if first-parent-only, then return that + * first parent (even if not "relevant" by the above definition). + * TREESAME will have been set purely on that parent. + */ + if (revs->first_parent_only || !orig->next) + return orig->item; + + /* + * For multi-parent commits, identify a sole relevant parent, if any. + * If we have only one relevant parent, then TREESAME will be set purely + * with regard to that parent, and we can simplify accordingly. + * + * If we have more than one relevant parent, or no relevant parents + * (and multiple irrelevant ones), then we can't select a parent here + * and return NULL. + */ + while (list) { + struct commit *commit = list->item; + list = list->next; + if (relevant_commit(commit)) { + if (relevant) + return NULL; + relevant = commit; + } + } + return relevant; +} + +/* * The goal is to get REV_TREE_NEW as the result only if the * diff consists of all '+' (and no other changes), REV_TREE_OLD * if the whole diff is removal of old data, and otherwise @@ -430,10 +504,125 @@ static int rev_same_tree_as_empty(struct rev_info *revs, struct commit *commit) return retval >= 0 && (tree_difference == REV_TREE_SAME); } +struct treesame_state { + unsigned int nparents; + unsigned char treesame[FLEX_ARRAY]; +}; + +static struct treesame_state *initialise_treesame(struct rev_info *revs, struct commit *commit) +{ + unsigned n = commit_list_count(commit->parents); + struct treesame_state *st = xcalloc(1, sizeof(*st) + n); + st->nparents = n; + add_decoration(&revs->treesame, &commit->object, st); + return st; +} + +/* + * Must be called immediately after removing the nth_parent from a commit's + * parent list, if we are maintaining the per-parent treesame[] decoration. + * This does not recalculate the master TREESAME flag - update_treesame() + * should be called to update it after a sequence of treesame[] modifications + * that may have affected it. + */ +static int compact_treesame(struct rev_info *revs, struct commit *commit, unsigned nth_parent) +{ + struct treesame_state *st; + int old_same; + + if (!commit->parents) { + /* + * Have just removed the only parent from a non-merge. + * Different handling, as we lack decoration. + */ + if (nth_parent != 0) + die("compact_treesame %u", nth_parent); + old_same = !!(commit->object.flags & TREESAME); + if (rev_same_tree_as_empty(revs, commit)) + commit->object.flags |= TREESAME; + else + commit->object.flags &= ~TREESAME; + return old_same; + } + + st = lookup_decoration(&revs->treesame, &commit->object); + if (!st || nth_parent >= st->nparents) + die("compact_treesame %u", nth_parent); + + old_same = st->treesame[nth_parent]; + memmove(st->treesame + nth_parent, + st->treesame + nth_parent + 1, + st->nparents - nth_parent - 1); + + /* + * If we've just become a non-merge commit, update TREESAME + * immediately, and remove the no-longer-needed decoration. + * If still a merge, defer update until update_treesame(). + */ + if (--st->nparents == 1) { + if (commit->parents->next) + die("compact_treesame parents mismatch"); + if (st->treesame[0] && revs->dense) + commit->object.flags |= TREESAME; + else + commit->object.flags &= ~TREESAME; + free(add_decoration(&revs->treesame, &commit->object, NULL)); + } + + return old_same; +} + +static unsigned update_treesame(struct rev_info *revs, struct commit *commit) +{ + if (commit->parents && commit->parents->next) { + unsigned n; + struct treesame_state *st; + struct commit_list *p; + unsigned relevant_parents; + unsigned relevant_change, irrelevant_change; + + st = lookup_decoration(&revs->treesame, &commit->object); + if (!st) + die("update_treesame %s", sha1_to_hex(commit->object.sha1)); + relevant_parents = 0; + relevant_change = irrelevant_change = 0; + for (p = commit->parents, n = 0; p; n++, p = p->next) { + if (relevant_commit(p->item)) { + relevant_change |= !st->treesame[n]; + relevant_parents++; + } else + irrelevant_change |= !st->treesame[n]; + } + if (relevant_parents ? relevant_change : irrelevant_change) + commit->object.flags &= ~TREESAME; + else + commit->object.flags |= TREESAME; + } + + return commit->object.flags & TREESAME; +} + +static inline int limiting_can_increase_treesame(const struct rev_info *revs) +{ + /* + * TREESAME is irrelevant unless prune && dense; + * if simplify_history is set, we can't have a mixture of TREESAME and + * !TREESAME INTERESTING parents (and we don't have treesame[] + * decoration anyway); + * if first_parent_only is set, then the TREESAME flag is locked + * against the first parent (and again we lack treesame[] decoration). + */ + return revs->prune && revs->dense && + !revs->simplify_history && + !revs->first_parent_only; +} + static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) { struct commit_list **pp, *parent; - int tree_changed = 0, tree_same = 0, nth_parent = 0; + struct treesame_state *ts = NULL; + int relevant_change = 0, irrelevant_change = 0; + int relevant_parents, nth_parent; /* * If we don't do pruning, everything is interesting @@ -457,33 +646,54 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) if (!revs->dense && !commit->parents->next) return; - pp = &commit->parents; - while ((parent = *pp) != NULL) { + for (pp = &commit->parents, nth_parent = 0, relevant_parents = 0; + (parent = *pp) != NULL; + pp = &parent->next, nth_parent++) { struct commit *p = parent->item; + if (relevant_commit(p)) + relevant_parents++; - /* - * Do not compare with later parents when we care only about - * the first parent chain, in order to avoid derailing the - * traversal to follow a side branch that brought everything - * in the path we are limited to by the pathspec. - */ - if (revs->first_parent_only && nth_parent++) - break; + if (nth_parent == 1) { + /* + * This our second loop iteration - so we now know + * we're dealing with a merge. + * + * Do not compare with later parents when we care only about + * the first parent chain, in order to avoid derailing the + * traversal to follow a side branch that brought everything + * in the path we are limited to by the pathspec. + */ + if (revs->first_parent_only) + break; + /* + * If this will remain a potentially-simplifiable + * merge, remember per-parent treesame if needed. + * Initialise the array with the comparison from our + * first iteration. + */ + if (revs->treesame.name && + !revs->simplify_history && + !(commit->object.flags & UNINTERESTING)) { + ts = initialise_treesame(revs, commit); + if (!(irrelevant_change || relevant_change)) + ts->treesame[0] = 1; + } + } if (parse_commit(p) < 0) die("cannot simplify commit %s (because of %s)", sha1_to_hex(commit->object.sha1), sha1_to_hex(p->object.sha1)); switch (rev_compare_tree(revs, p, commit)) { case REV_TREE_SAME: - tree_same = 1; - if (!revs->simplify_history || (p->object.flags & UNINTERESTING)) { + if (!revs->simplify_history || !relevant_commit(p)) { /* Even if a merge with an uninteresting * side branch brought the entire change * we are interested in, we do not want * to lose the other branches of this * merge, so we just keep going. */ - pp = &parent->next; + if (ts) + ts->treesame[nth_parent] = 1; continue; } parent->next = NULL; @@ -511,15 +721,27 @@ static void try_to_simplify_commit(struct rev_info *revs, struct commit *commit) /* fallthrough */ case REV_TREE_OLD: case REV_TREE_DIFFERENT: - tree_changed = 1; - pp = &parent->next; + if (relevant_commit(p)) + relevant_change = 1; + else + irrelevant_change = 1; continue; } die("bad tree compare for commit %s", sha1_to_hex(commit->object.sha1)); } - if (tree_changed && !tree_same) - return; - commit->object.flags |= TREESAME; + + /* + * TREESAME is straightforward for single-parent commits. For merge + * commits, it is most useful to define it so that "irrelevant" + * parents cannot make us !TREESAME - if we have any relevant + * parents, then we only consider TREESAMEness with respect to them, + * allowing irrelevant merges from uninteresting branches to be + * simplified away. Only if we have only irrelevant parents do we + * base TREESAME on them. Note that this logic is replicated in + * update_treesame, which should be kept in sync. + */ + if (relevant_parents ? !relevant_change : !irrelevant_change) + commit->object.flags |= TREESAME; } static void commit_list_insert_by_date_cached(struct commit *p, struct commit_list **head, @@ -802,16 +1024,12 @@ static void limit_to_ancestry(struct commit_list *bottom, struct commit_list *li * to filter the result of "A..B" further to the ones that can actually * reach A. */ -static struct commit_list *collect_bottom_commits(struct rev_info *revs) +static struct commit_list *collect_bottom_commits(struct commit_list *list) { - struct commit_list *bottom = NULL; - int i; - for (i = 0; i < revs->cmdline.nr; i++) { - struct rev_cmdline_entry *elem = &revs->cmdline.rev[i]; - if ((elem->flags & UNINTERESTING) && - elem->item->type == OBJ_COMMIT) - commit_list_insert((struct commit *)elem->item, &bottom); - } + struct commit_list *elem, *bottom = NULL; + for (elem = list; elem; elem = elem->next) + if (elem->item->object.flags & BOTTOM) + commit_list_insert(elem->item, &bottom); return bottom; } @@ -842,7 +1060,7 @@ static int limit_list(struct rev_info *revs) struct commit_list *bottom = NULL; if (revs->ancestry_path) { - bottom = collect_bottom_commits(revs); + bottom = collect_bottom_commits(list); if (!bottom) die("--ancestry-path given but there are no bottom commits"); } @@ -895,6 +1113,18 @@ static int limit_list(struct rev_info *revs) free_commit_list(bottom); } + /* + * Check if any commits have become TREESAME by some of their parents + * becoming UNINTERESTING. + */ + if (limiting_can_increase_treesame(revs)) + for (list = newlist; list; list = list->next) { + struct commit *c = list->item; + if (c->object.flags & (UNINTERESTING | TREESAME)) + continue; + update_treesame(revs, c); + } + revs->commits = newlist; return 0; } @@ -1014,7 +1244,7 @@ static int add_parents_only(struct rev_info *revs, const char *arg_, int flags) const char *arg = arg_; if (*arg == '^') { - flags ^= UNINTERESTING; + flags ^= UNINTERESTING | BOTTOM; arg++; } if (get_sha1_committish(arg, sha1)) @@ -1106,8 +1336,8 @@ static void prepare_show_merge(struct rev_info *revs) add_pending_object(revs, &head->object, "HEAD"); add_pending_object(revs, &other->object, "MERGE_HEAD"); bases = get_merge_bases(head, other, 1); - add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING); - add_pending_commit_list(revs, bases, UNINTERESTING); + add_rev_cmdline_list(revs, bases, REV_CMD_MERGE_BASE, UNINTERESTING | BOTTOM); + add_pending_commit_list(revs, bases, UNINTERESTING | BOTTOM); free_commit_list(bases); head->object.flags |= SYMMETRIC_LEFT; @@ -1143,13 +1373,15 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi int cant_be_filename = revarg_opt & REVARG_CANNOT_BE_FILENAME; unsigned get_sha1_flags = 0; + flags = flags & UNINTERESTING ? flags | BOTTOM : flags & ~BOTTOM; + dotdot = strstr(arg, ".."); if (dotdot) { unsigned char from_sha1[20]; const char *next = dotdot + 2; const char *this = arg; int symmetric = *next == '.'; - unsigned int flags_exclude = flags ^ UNINTERESTING; + unsigned int flags_exclude = flags ^ (UNINTERESTING | BOTTOM); static const char head_by_default[] = "HEAD"; unsigned int a_flags; @@ -1225,13 +1457,13 @@ int handle_revision_arg(const char *arg_, struct rev_info *revs, int flags, unsi dotdot = strstr(arg, "^!"); if (dotdot && !dotdot[2]) { *dotdot = 0; - if (!add_parents_only(revs, arg, flags ^ UNINTERESTING)) + if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | BOTTOM))) *dotdot = '^'; } local_flags = 0; if (*arg == '^') { - local_flags = UNINTERESTING; + local_flags = UNINTERESTING | BOTTOM; arg++; } @@ -1708,7 +1940,7 @@ static int handle_revision_pseudo_opt(const char *submodule, handle_refs(submodule, revs, *flags, for_each_branch_ref_submodule); } else if (!strcmp(arg, "--bisect")) { handle_refs(submodule, revs, *flags, for_each_bad_bisect_ref); - handle_refs(submodule, revs, *flags ^ UNINTERESTING, for_each_good_bisect_ref); + handle_refs(submodule, revs, *flags ^ (UNINTERESTING | BOTTOM), for_each_good_bisect_ref); revs->bisect = 1; } else if (!strcmp(arg, "--tags")) { handle_refs(submodule, revs, *flags, for_each_tag_ref_submodule); @@ -1734,7 +1966,7 @@ static int handle_revision_pseudo_opt(const char *submodule, } else if (!strcmp(arg, "--reflog")) { handle_reflog(revs, *flags); } else if (!strcmp(arg, "--not")) { - *flags ^= UNINTERESTING; + *flags ^= UNINTERESTING | BOTTOM; } else if (!strcmp(arg, "--no-walk")) { revs->no_walk = REVISION_WALK_NO_WALK_SORTED; } else if (!prefixcmp(arg, "--no-walk=")) { @@ -1954,28 +2186,32 @@ static void add_child(struct rev_info *revs, struct commit *parent, struct commi l->next = add_decoration(&revs->children, &parent->object, l); } -static int remove_duplicate_parents(struct commit *commit) +static int remove_duplicate_parents(struct rev_info *revs, struct commit *commit) { + struct treesame_state *ts = lookup_decoration(&revs->treesame, &commit->object); struct commit_list **pp, *p; int surviving_parents; /* Examine existing parents while marking ones we have seen... */ pp = &commit->parents; + surviving_parents = 0; while ((p = *pp) != NULL) { struct commit *parent = p->item; if (parent->object.flags & TMP_MARK) { *pp = p->next; + if (ts) + compact_treesame(revs, commit, surviving_parents); continue; } parent->object.flags |= TMP_MARK; + surviving_parents++; pp = &p->next; } - /* count them while clearing the temporary mark */ - surviving_parents = 0; + /* clear the temporary mark */ for (p = commit->parents; p; p = p->next) { p->item->object.flags &= ~TMP_MARK; - surviving_parents++; } + /* no update_treesame() - removing duplicates can't affect TREESAME */ return surviving_parents; } @@ -1995,9 +2231,157 @@ static struct merge_simplify_state *locate_simplify_state(struct rev_info *revs, return st; } +static int mark_redundant_parents(struct rev_info *revs, struct commit *commit) +{ + struct commit_list *h = reduce_heads(commit->parents); + int i = 0, marked = 0; + struct commit_list *po, *pn; + + /* Want these for sanity-checking only */ + int orig_cnt = commit_list_count(commit->parents); + int cnt = commit_list_count(h); + + /* + * Not ready to remove items yet, just mark them for now, based + * on the output of reduce_heads(). reduce_heads outputs the reduced + * set in its original order, so this isn't too hard. + */ + po = commit->parents; + pn = h; + while (po) { + if (pn && po->item == pn->item) { + pn = pn->next; + i++; + } else { + po->item->object.flags |= TMP_MARK; + marked++; + } + po=po->next; + } + + if (i != cnt || cnt+marked != orig_cnt) + die("mark_redundant_parents %d %d %d %d", orig_cnt, cnt, i, marked); + + free_commit_list(h); + + return marked; +} + +static int mark_treesame_root_parents(struct rev_info *revs, struct commit *commit) +{ + struct commit_list *p; + int marked = 0; + + for (p = commit->parents; p; p = p->next) { + struct commit *parent = p->item; + if (!parent->parents && (parent->object.flags & TREESAME)) { + parent->object.flags |= TMP_MARK; + marked++; + } + } + + return marked; +} + +/* + * Awkward naming - this means one parent we are TREESAME to. + * cf mark_treesame_root_parents: root parents that are TREESAME (to an + * empty tree). Better name suggestions? + */ +static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *commit) +{ + struct treesame_state *ts = lookup_decoration(&revs->treesame, &commit->object); + struct commit *unmarked = NULL, *marked = NULL; + struct commit_list *p; + unsigned n; + + for (p = commit->parents, n = 0; p; p = p->next, n++) { + if (ts->treesame[n]) { + if (p->item->object.flags & TMP_MARK) { + if (!marked) + marked = p->item; + } else { + if (!unmarked) { + unmarked = p->item; + break; + } + } + } + } + + /* + * If we are TREESAME to a marked-for-deletion parent, but not to any + * unmarked parents, unmark the first TREESAME parent. This is the + * parent that the default simplify_history==1 scan would have followed, + * and it doesn't make sense to omit that path when asking for a + * simplified full history. Retaining it improves the chances of + * understanding odd missed merges that took an old version of a file. + * + * Example: + * + * I--------*X A modified the file, but mainline merge X used + * \ / "-s ours", so took the version from I. X is + * `-*A--' TREESAME to I and !TREESAME to A. + * + * Default log from X would produce "I". Without this check, + * --full-history --simplify-merges would produce "I-A-X", showing + * the merge commit X and that it changed A, but not making clear that + * it had just taken the I version. With this check, the topology above + * is retained. + * + * Note that it is possible that the simplification chooses a different + * TREESAME parent from the default, in which case this test doesn't + * activate, and we _do_ drop the default parent. Example: + * + * I------X A modified the file, but it was reverted in B, + * \ / meaning mainline merge X is TREESAME to both + * *A-*B parents. + * + * Default log would produce "I" by following the first parent; + * --full-history --simplify-merges will produce "I-A-B". But this is a + * reasonable result - it presents a logical full history leading from + * I to X, and X is not an important merge. + */ + if (!unmarked && marked) { + marked->object.flags &= ~TMP_MARK; + return 1; + } + + return 0; +} + +static int remove_marked_parents(struct rev_info *revs, struct commit *commit) +{ + struct commit_list **pp, *p; + int nth_parent, removed = 0; + + pp = &commit->parents; + nth_parent = 0; + while ((p = *pp) != NULL) { + struct commit *parent = p->item; + if (parent->object.flags & TMP_MARK) { + parent->object.flags &= ~TMP_MARK; + *pp = p->next; + free(p); + removed++; + compact_treesame(revs, commit, nth_parent); + continue; + } + pp = &p->next; + nth_parent++; + } + + /* Removing parents can only increase TREESAMEness */ + if (removed && !(commit->object.flags & TREESAME)) + update_treesame(revs, commit); + + return nth_parent; +} + static struct commit_list **simplify_one(struct rev_info *revs, struct commit *commit, struct commit_list **tail) { struct commit_list *p; + struct commit *parent; struct merge_simplify_state *st, *pst; int cnt; @@ -2039,7 +2423,9 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c } /* - * Rewrite our list of parents. + * Rewrite our list of parents. Note that this cannot + * affect our TREESAME flags in any way - a commit is + * always TREESAME to its simplification. */ for (p = commit->parents; p; p = p->next) { pst = locate_simplify_state(revs, p->item); @@ -2051,43 +2437,53 @@ static struct commit_list **simplify_one(struct rev_info *revs, struct commit *c if (revs->first_parent_only) cnt = 1; else - cnt = remove_duplicate_parents(commit); + cnt = remove_duplicate_parents(revs, commit); /* * It is possible that we are a merge and one side branch * does not have any commit that touches the given paths; - * in such a case, the immediate parents will be rewritten - * to different commits. + * in such a case, the immediate parent from that branch + * will be rewritten to be the merge base. * * o----X X: the commit we are looking at; * / / o: a commit that touches the paths; * ---o----' * - * Further reduce the parents by removing redundant parents. + * Further, a merge of an independent branch that doesn't + * touch the path will reduce to a treesame root parent: + * + * ----o----X X: the commit we are looking at; + * / o: a commit that touches the paths; + * r r: a root commit not touching the paths + * + * Detect and simplify both cases. */ if (1 < cnt) { - struct commit_list *h = reduce_heads(commit->parents); - cnt = commit_list_count(h); - free_commit_list(commit->parents); - commit->parents = h; + int marked = mark_redundant_parents(revs, commit); + marked += mark_treesame_root_parents(revs, commit); + if (marked) + marked -= leave_one_treesame_to_parent(revs, commit); + if (marked) + cnt = remove_marked_parents(revs, commit); } /* * A commit simplifies to itself if it is a root, if it is * UNINTERESTING, if it touches the given paths, or if it is a - * merge and its parents simplifies to more than one commits + * merge and its parents don't simplify to one relevant commit * (the first two cases are already handled at the beginning of * this function). * - * Otherwise, it simplifies to what its sole parent simplifies to. + * Otherwise, it simplifies to what its sole relevant parent + * simplifies to. */ if (!cnt || (commit->object.flags & UNINTERESTING) || !(commit->object.flags & TREESAME) || - (1 < cnt)) + (parent = one_relevant_parent(revs, commit->parents)) == NULL) st->simplified = commit; else { - pst = locate_simplify_state(revs, commit->parents->item); + pst = locate_simplify_state(revs, parent); st->simplified = pst->simplified; } return tail; @@ -2183,6 +2579,11 @@ int prepare_revision_walk(struct rev_info *revs) if (!revs->leak_pending) free(list); + /* Signal whether we need per-parent treesame decoration */ + if (revs->simplify_merges || + (revs->limited && limiting_can_increase_treesame(revs))) + revs->treesame.name = "treesame"; + if (revs->no_walk != REVISION_WALK_NO_WALK_UNSORTED) commit_list_sort_by_date(&revs->commits); if (revs->no_walk) @@ -2210,15 +2611,15 @@ static enum rewrite_result rewrite_one(struct rev_info *revs, struct commit **pp if (!revs->limited) if (add_parents_to_list(revs, p, &revs->commits, &cache) < 0) return rewrite_one_error; - if (p->parents && p->parents->next) - return rewrite_one_ok; if (p->object.flags & UNINTERESTING) return rewrite_one_ok; if (!(p->object.flags & TREESAME)) return rewrite_one_ok; if (!p->parents) return rewrite_one_noparents; - *pp = p->parents->item; + if ((p = one_relevant_parent(revs, p->parents)) == NULL) + return rewrite_one_ok; + *pp = p; } } @@ -2239,7 +2640,7 @@ int rewrite_parents(struct rev_info *revs, struct commit *commit, } pp = &parent->next; } - remove_duplicate_parents(commit); + remove_duplicate_parents(revs, commit); return 0; } @@ -2363,10 +2764,7 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi if (revs->min_age != -1 && (commit->date > revs->min_age)) return commit_ignore; if (revs->min_parents || (revs->max_parents >= 0)) { - int n = 0; - struct commit_list *p; - for (p = commit->parents; p; p = p->next) - n++; + int n = commit_list_count(commit->parents); if ((n < revs->min_parents) || ((revs->max_parents >= 0) && (n > revs->max_parents))) return commit_ignore; @@ -2376,12 +2774,23 @@ enum commit_action get_commit_action(struct rev_info *revs, struct commit *commi if (revs->prune && revs->dense) { /* Commit without changes? */ if (commit->object.flags & TREESAME) { + int n; + struct commit_list *p; /* drop merges unless we want parenthood */ if (!want_ancestry(revs)) return commit_ignore; - /* non-merge - always ignore it */ - if (!commit->parents || !commit->parents->next) - return commit_ignore; + /* + * If we want ancestry, then need to keep any merges + * between relevant commits to tie together topology. + * For consistency with TREESAME and simplification + * use "relevant" here rather than just INTERESTING, + * to treat bottom commit(s) as part of the topology. + */ + for (n = 0, p = commit->parents; p; p = p->next) + if (relevant_commit(p->item)) + if (++n >= 2) + return commit_show; + return commit_ignore; } } return commit_show; diff --git a/revision.h b/revision.h index a313a13405..24547b59f1 100644 --- a/revision.h +++ b/revision.h @@ -15,7 +15,8 @@ #define ADDED (1u<<7) /* Parents already parsed and added? */ #define SYMMETRIC_LEFT (1u<<8) #define PATCHSAME (1u<<9) -#define ALL_REV_FLAGS ((1u<<10)-1) +#define BOTTOM (1u<<10) +#define ALL_REV_FLAGS ((1u<<11)-1) #define DECORATE_SHORT_REFS 1 #define DECORATE_FULL_REFS 2 @@ -169,6 +170,7 @@ struct rev_info { struct reflog_walk_info *reflog_info; struct decoration children; struct decoration merge_simplification; + struct decoration treesame; /* notes-specific options: which refs to show */ struct display_notes_opt notes_opt; diff --git a/t/t6012-rev-list-simplify.sh b/t/t6012-rev-list-simplify.sh index dd6dc844e7..57ce2395d6 100755 --- a/t/t6012-rev-list-simplify.sh +++ b/t/t6012-rev-list-simplify.sh @@ -14,21 +14,24 @@ unnote () { test_expect_success setup ' echo "Hi there" >file && - git add file && - test_tick && git commit -m "Initial file" && + echo "initial" >lost && + git add file lost && + test_tick && git commit -m "Initial file and lost" && note A && git branch other-branch && echo "Hello" >file && - git add file && - test_tick && git commit -m "Modified file" && + echo "second" >lost && + git add file lost && + test_tick && git commit -m "Modified file and lost" && note B && git checkout other-branch && echo "Hello" >file && - git add file && + >lost && + git add file lost && test_tick && git commit -m "Modified the file identically" && note C && @@ -37,7 +40,9 @@ test_expect_success setup ' test_tick && git commit -m "Add another file" && note D && - test_tick && git merge -m "merge" master && + test_tick && + test_must_fail git merge -m "merge" master && + >lost && git commit -a -m "merge" && note E && echo "Yet another" >elif && @@ -105,9 +110,21 @@ check_result 'L K J I H G F E D C B A' --full-history check_result 'K I H E C B A' --full-history -- file check_result 'K I H E C B A' --full-history --topo-order -- file check_result 'K I H E C B A' --full-history --date-order -- file -check_outcome failure 'I E C B A' --simplify-merges -- file +check_result 'I E C B A' --simplify-merges -- file check_result 'I B A' -- file check_result 'I B A' --topo-order -- file check_result 'H' --first-parent -- another-file +check_result 'E C B A' --full-history E -- lost +test_expect_success 'full history simplification without parent' ' + printf "%s\n" E C B A >expect && + git log --pretty="$FMT" --full-history E -- lost | + unnote >actual && + sed -e "s/^.* \([^ ]*\) .*/\1/" >check <actual && + test_cmp expect check || { + cat actual + false + } +' + test_done diff --git a/t/t6019-rev-list-ancestry-path.sh b/t/t6019-rev-list-ancestry-path.sh index dd5b0e55d2..dabebaee0b 100755 --- a/t/t6019-rev-list-ancestry-path.sh +++ b/t/t6019-rev-list-ancestry-path.sh @@ -16,6 +16,10 @@ test_description='--ancestry-path' # # F...I == F G H I # --ancestry-path F...I == F H I +# +# G..M -- G.t == [nothing - was dropped in "-s ours" merge L] +# --ancestry-path G..M -- G.t == L +# --ancestry-path --simplify-merges G^..M -- G.t == G L . ./test-lib.sh @@ -89,6 +93,29 @@ test_expect_success 'rev-list --ancestry-path F...I' ' test_cmp expect actual ' +# G.t is dropped in an "-s ours" merge +test_expect_success 'rev-list G..M -- G.t' ' + >expect && + git rev-list --format=%s G..M -- G.t | + sed -e "/^commit /d" >actual && + test_cmp expect actual +' + +test_expect_success 'rev-list --ancestry-path G..M -- G.t' ' + echo L >expect && + git rev-list --ancestry-path --format=%s G..M -- G.t | + sed -e "/^commit /d" >actual && + test_cmp expect actual +' + +test_expect_success 'rev-list --ancestry-path --simplify-merges G^..M -- G.t' ' + for c in G L; do echo $c; done >expect && + git rev-list --ancestry-path --simplify-merges --format=%s G^..M -- G.t | + sed -e "/^commit /d" | + sort >actual && + test_cmp expect actual +' + # b---bc # / \ / # a X diff --git a/t/t6111-rev-list-treesame.sh b/t/t6111-rev-list-treesame.sh new file mode 100755 index 0000000000..88b84dfa73 --- /dev/null +++ b/t/t6111-rev-list-treesame.sh @@ -0,0 +1,196 @@ +#!/bin/sh +# +# ,---E--. *H----------. * marks !TREESAME parent paths +# / \ / \* +# *A--*B---D--*F-*G---------K-*L-*M +# \ /* \ / +# `-C-' `-*I-*J +# +# A creates "file", B and F change it. +# Odd merge G takes the old version from B. +# I changes it, but J reverts it, so K is TREESAME to both parents. +# H and L both change "file", and M merges those changes. + +test_description='TREESAME and limiting' + +. ./test-lib.sh + +note () { + git tag "$1" +} + +unnote () { + git name-rev --tags --stdin | sed -e "s|$_x40 (tags/\([^)]*\))\([ ]\)|\1\2|g" +} + +test_expect_success setup ' + test_commit "Initial file" file "Hi there" A && + git branch other-branch && + + test_commit "file=Hello" file "Hello" B && + git branch third-branch && + + git checkout other-branch && + test_commit "Added other" other "Hello" C && + + git checkout master && + test_merge D other-branch && + + git checkout third-branch && + test_commit "Third file" third "Nothing" E && + + git checkout master && + test_commit "file=Blah" file "Blah" F && + + test_tick && git merge --no-commit third-branch && + git checkout third-branch file && + git commit && + note G && + git branch fiddler-branch && + + git checkout -b part2-branch && + test_commit "file=Part 2" file "Part 2" H && + + git checkout fiddler-branch && + test_commit "Bad commit" file "Silly" I && + + test_tick && git revert I && note J && + + git checkout master && + test_tick && git merge --no-ff fiddler-branch && + note K + + test_commit "file=Part 1" file "Part 1" L && + + test_tick && test_must_fail git merge part2-branch && + test_commit M file "Parts 1+2" +' + +check_outcome () { + outcome=$1 + shift + + case "$1" in + *"("*) + FMT="%P %H | %s" + munge_actual=" + s/^\([^ ]*\) \([^ ]*\) .*/(\1)\2/ + s/ //g + s/()// + " + ;; + *) + FMT="%H | %s" + munge_actual="s/^\([^ ]*\) .*/\1/" + ;; + esac && + printf "%s\n" $1 >expect && + shift + + param="$*" && + test_expect_$outcome "log $param" ' + git log --format="$FMT" $param | + unnote >actual && + sed -e "$munge_actual" <actual >check && + test_cmp expect check || { + cat actual + false + } + ' +} + +check_result () { + check_outcome success "$@" +} + +# Odd merge G drops a change in F. Important that G is listed in all +# except the most basic list. Achieving this means normal merge D will also be +# shown in normal full-history, as we can't distinguish unless we do a +# simplification pass. After simplification, D is dropped but G remains. +# Also, merge simplification of G should not drop the parent B that the default +# simple history follows. +check_result 'M L K J I H G F E D C B A' +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G (D)F (B)E (BC)D (A)C (A)B A' +check_result 'M H L K J I G E F D C B A' --topo-order +check_result 'M L H B A' -- file +check_result '(LH)M (B)L (B)H (A)B A' --parents -- file +check_result 'M L J I H G F D B A' --full-history -- file +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FB)G (D)F (BA)D (A)B A' --full-history --parents -- file +check_result '(LH)M (G)H (J)L (I)J (G)I (FB)G (B)F (A)B A' --simplify-merges -- file +check_result 'M L K G F D B A' --first-parent +check_result 'M L G F B A' --first-parent -- file + +# Check that odd merge G remains shown when F is the bottom. +check_result 'M L K J I H G E' F..M +check_result 'M H L K J I G E' F..M --topo-order +check_result 'M L H' F..M -- file +check_result '(LH)M (B)L (B)H' --parents F..M -- file +check_result 'M L J I H G' F..M --full-history -- file +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FB)G' F..M --full-history --parents -- file +check_result '(LH)M (G)H (J)L (I)J (G)I (FB)G' F..M --simplify-merges -- file +check_result 'M L K J I H G' F..M --ancestry-path +check_result 'M L J I H G' F..M --ancestry-path -- file +check_result '(LH)M (K)L (GJ)K (I)J (G)I (G)H (FE)G' F..M --ancestry-path --parents -- file +check_result '(LH)M (G)H (J)L (I)J (G)I (FE)G' F..M --ancestry-path --simplify-merges -- file +check_result 'M L K G' F..M --first-parent +check_result 'M L G' F..M --first-parent -- file + +# Note that G is pruned when E is the bottom, even if it's the same commit list +# If we want history since E, then we're quite happy to ignore G that took E. +check_result 'M L K J I H G' E..M --ancestry-path +check_result 'M L J I H' E..M --ancestry-path -- file +check_result '(LH)M (K)L (EJ)K (I)J (E)I (E)H' E..M --ancestry-path --parents -- file +check_result '(LH)M (E)H (J)L (I)J (E)I' E..M --ancestry-path --simplify-merges -- file + +# Should still be able to ignore I-J branch in simple log, despite limiting +# to G. +check_result 'M L K J I H' G..M +check_result 'M H L K J I' G..M --topo-order +check_result 'M L H' G..M -- file +check_result '(LH)M (G)L (G)H' G..M --parents -- file +check_result 'M L J I H' G..M --full-history -- file +check_result 'M L K J I H' G..M --full-history --parents -- file +check_result 'M H L J I' G..M --simplify-merges -- file +check_result 'M L K J I H' G..M --ancestry-path +check_result 'M L J I H' G..M --ancestry-path -- file +check_result 'M L K J I H' G..M --ancestry-path --parents -- file +check_result 'M H L J I' G..M --ancestry-path --simplify-merges -- file + +# B..F should be able to simplify the merge D from irrelevant side branch C. +# Default log should also be free to follow B-D, and ignore C. +# But --full-history shouldn't drop D on its own - without simplification, +# we can't decide if the merge from INTERESTING commit C was sensible. +check_result 'F D C' B..F +check_result 'F' B..F -- file +check_result '(B)F' B..F --parents -- file +check_result 'F D' B..F --full-history -- file +check_result '(D)F (BA)D' B..F --full-history --parents -- file +check_result '(B)F' B..F --simplify-merges -- file +check_result 'F D' B..F --ancestry-path +check_result 'F' B..F --ancestry-path -- file +check_result 'F' B..F --ancestry-path --parents -- file +check_result 'F' B..F --ancestry-path --simplify-merges -- file +check_result 'F D' B..F --first-parent +check_result 'F' B..F --first-parent -- file + +# E...F should be equivalent to E F ^B, and be able to drop D as above. +check_result 'F' E F ^B -- file # includes D +check_result 'F' E...F -- file # includes D + +# Any sort of full history of C..F should show D, as it's the connection to C, +# and it differs from it. +check_result 'F D B' C..F +check_result 'F B' C..F -- file +check_result '(B)F (A)B' C..F --parents -- file +check_result 'F D B' C..F --full-history -- file +check_result '(D)F (BC)D (A)B' C..F --full-history --parents -- file +check_result '(D)F (BC)D (A)B' C..F --simplify-merges -- file +check_result 'F D' C..F --ancestry-path +check_result 'F D' C..F --ancestry-path -- file +check_result 'F D' C..F --ancestry-path --parents -- file +check_result 'F D' C..F --ancestry-path --simplify-merges -- file +check_result 'F D B' C..F --first-parent +check_result 'F B' C..F --first-parent -- file + + +test_done |