From 88473c8baeefafe6b95ab0f62eb2f12e2b098ac7 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 22 Jun 2021 13:06:48 -0400 Subject: load_ref_decorations(): avoid parsing non-tag objects When we load the ref decorations, we parse the object pointed to by each ref in order to get a "struct object". This is unnecessarily expensive; we really only need the object struct, and don't even look at the parsed contents. The exception is tags, which we do need to peel. We can improve this by looking up the object type first (which is much cheaper), and skipping the parse entirely for non-tags. This increases the work slightly for annotated tags (which now do a type lookup _and_ a parse), but decreases it a lot for other types. On balance, this seems to be a good tradeoff. In my git.git clone, with ~2k refs, most of which are branches, the time to run "git log -1 --decorate" drops from 34ms to 11ms. Even on my linux.git clone, which contains mostly tags and only a handful of branches, the time drops from 30ms to 19ms. And on a more extreme real-world case with ~220k refs, mostly non-tags, the time drops from 2.6s to 650ms. That command is a lop-sided example, of course, because it does as little non-loading work as possible. But it does show the absolute time improvement. Even in something like a full "git log --decorate" on that extreme repo, we'd still be saving 2s of CPU time. Ideally we could push this even further, and avoid parsing even tags, by relying on the packed-refs "peel" optimization (which we could do by calling peel_iterated_oid() instead of peeling manually). But we can't do that here. The packed-refs file only stores the bottom-layer of the peel (so in a "tag->tag->commit" chain, it stores only the commit as the peel result). But the decoration code wants to peel the layers individually, annotating the middle layers of the chain. If the packed-refs file ever learns to store all of the peeled layers, then we could switch to it. Or even if it stored a flag to indicate the peel was not multi-layer (because most of them aren't), then we could use it most of the time and fall back to a manual peel for the rare cases. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- log-tree.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'log-tree.c') diff --git a/log-tree.c b/log-tree.c index 7b823786c2..c3c8e7e1df 100644 --- a/log-tree.c +++ b/log-tree.c @@ -134,6 +134,7 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid, int flags, void *cb_data) { struct object *obj; + enum object_type objtype; enum decoration_type type = DECORATION_NONE; struct decoration_filter *filter = (struct decoration_filter *)cb_data; @@ -155,9 +156,10 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid, return 0; } - obj = parse_object(the_repository, oid); - if (!obj) + objtype = oid_object_info(the_repository, oid, NULL); + if (objtype < 0) return 0; + obj = lookup_object_by_type(the_repository, oid, objtype); if (starts_with(refname, "refs/heads/")) type = DECORATION_REF_LOCAL; -- cgit v1.2.3 From 6afb265b9675b1e6061f53a323825db239f6e0fa Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 22 Jun 2021 13:09:26 -0400 Subject: add_ref_decoration(): rename s/type/deco_type/ Now that we have two types (a decoration type and an object type) in the function, let's give them both unique names to avoid accidentally using one instead of the other. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- log-tree.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) (limited to 'log-tree.c') diff --git a/log-tree.c b/log-tree.c index c3c8e7e1df..4f69ed176d 100644 --- a/log-tree.c +++ b/log-tree.c @@ -135,7 +135,7 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid, { struct object *obj; enum object_type objtype; - enum decoration_type type = DECORATION_NONE; + enum decoration_type deco_type = DECORATION_NONE; struct decoration_filter *filter = (struct decoration_filter *)cb_data; if (filter && !ref_filter_match(refname, filter)) @@ -162,17 +162,17 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid, obj = lookup_object_by_type(the_repository, oid, objtype); if (starts_with(refname, "refs/heads/")) - type = DECORATION_REF_LOCAL; + deco_type = DECORATION_REF_LOCAL; else if (starts_with(refname, "refs/remotes/")) - type = DECORATION_REF_REMOTE; + deco_type = DECORATION_REF_REMOTE; else if (starts_with(refname, "refs/tags/")) - type = DECORATION_REF_TAG; + deco_type = DECORATION_REF_TAG; else if (!strcmp(refname, "refs/stash")) - type = DECORATION_REF_STASH; + deco_type = DECORATION_REF_STASH; else if (!strcmp(refname, "HEAD")) - type = DECORATION_REF_HEAD; + deco_type = DECORATION_REF_HEAD; - add_name_decoration(type, refname, obj); + add_name_decoration(deco_type, refname, obj); while (obj->type == OBJ_TAG) { obj = ((struct tag *)obj)->tagged; if (!obj) -- cgit v1.2.3 From d1ed8d6cee57c91ec770a8a183ed40c3ec867ac1 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Wed, 14 Jul 2021 12:31:36 -0400 Subject: load_ref_decorations(): fix decoration with tags MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Commit 88473c8bae ("load_ref_decorations(): avoid parsing non-tag objects", 2021-06-22) introduced a shortcut to `add_ref_decoration()`: Rather than calling `parse_object()`, we go for `oid_object_info()` and then `lookup_object_by_type()` using the type just discovered. As detailed in the commit message, this provides a significant time saving. Unfortunately, it also changes the behavior: We lose all annotated tags from the decoration. The reason this happens is in the loop where we try to peel the tags, we won't necessarily have parsed that first object. If we haven't, its `tagged` field will be NULL, so we won't actually add a decoration for the pointed-to object. Make sure to parse the tag object at the top of the peeling loop. This effectively restores the pre-88473c8bae parsing -- but only of tags, allowing us to keep most of the possible speedup from 88473c8bae. On my big ~220k ref test case (where it's mostly non-tags), the timings [using "git log -1 --decorate"] are: - before either patch: 2.945s - with my broken patch: 0.707s - with [this patch]: 0.788s The simplest way to do this is to just conditionally parse before the loop: if (obj->type == OBJ_TAG) parse_object(&obj->oid); But we can observe that our tag-peeling loop needs to peel already, to examine recursive tags-of-tags. So instead of introducing a new call to parse_object(), we can simply move the parsing higher in the loop: instead of parsing the new object before we loop, parse each tag object before we look at its "tagged" field. This has another beneficial side effect: if a tag points at a commit (or other non-tag type), we do not bother to parse the commit at all now. And we know it is a commit without calling oid_object_info(), because parsing the surrounding tag object will have created the correct in-core object based on the "type" field of the tag. Our test coverage for --decorate was obviously not good, since we missed this quite-basic regression. The new tests covers an annotated tag (showing the fix), but also that we correctly show annotations for lightweight tags and double-annotated tag-of-tags. Reported-by: Martin Ågren Helped-by: Martin Ågren Signed-off-by: Martin Ågren Signed-off-by: Jeff King Reviewed-by: Martin Ågren Signed-off-by: Junio C Hamano --- log-tree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'log-tree.c') diff --git a/log-tree.c b/log-tree.c index 4f69ed176d..6dc4412268 100644 --- a/log-tree.c +++ b/log-tree.c @@ -174,11 +174,11 @@ static int add_ref_decoration(const char *refname, const struct object_id *oid, add_name_decoration(deco_type, refname, obj); while (obj->type == OBJ_TAG) { + if (!obj->parsed) + parse_object(the_repository, &obj->oid); obj = ((struct tag *)obj)->tagged; if (!obj) break; - if (!obj->parsed) - parse_object(the_repository, &obj->oid); add_name_decoration(DECORATION_REF_TAG, refname, obj); } return 0; -- cgit v1.2.3