author | Patrick Steinhardt <ps@pks.im> | 2021-08-09 10:12:03 +0200 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2021-08-09 09:51:12 -0700 |
commit | f559d6d45e7e58ae1f922213948723de77ea77bd (patch) | |
tree | 5db9c2b4540075ffb8f27aaab1f5d4ba489ca9e8 /revision.c | |
parent | commit-graph: split out function to search commit position (diff) | |
download | tgif-f559d6d45e7e58ae1f922213948723de77ea77bd.tar.xz |
revision: avoid hitting packfiles when commits are in commit-graph

When queueing references in git-rev-list(1), we try to optimize parsing
of commits via the commit-graph. To do so, we first look up the object's
type, and if it is a commit we call `repo_parse_commit()` instead of
`parse_object()`. This is quite inefficient though given that we're
always uncompressing the object header in order to determine the type.

Instead, we can opportunistically search the commit-graph for the object
ID: in case it's found, we know it's a commit and can directly fill in
the commit object without having to uncompress the object header.

Expose a new function `lookup_commit_in_graph()`, which tries to find a
commit in the commit-graph by ID, and convert `get_reference()` to use
this function. This provides a big performance win in cases where we
load references in a repository with lots of references pointing to
commits. The following has been executed in a real-world repository with
about 2.2 million refs:

    Benchmark #1: HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      4.458 s ±  0.044 s    [User: 4.115 s, System: 0.342 s]
      Range (min … max):    4.409 s …  4.534 s    10 runs

    Benchmark #2: HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev
      Time (mean ± σ):      3.089 s ±  0.015 s    [User: 2.768 s, System: 0.321 s]
      Range (min … max):    3.061 s …  3.105 s    10 runs

    Summary
      'HEAD: rev-list --unsorted-input --objects --quiet --not --all --not $newrev' ran
        1.44 ± 0.02 times faster than 'HEAD~: rev-list --unsorted-input --objects --quiet --not --all --not $newrev'

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'revision.c')
-rw-r--r-- | revision.c | 18 |
1 files changed, 8 insertions, 10 deletions
diff --git a/revision.c b/revision.c
index 80a59896b9..0dabb5a0bc 100644
--- a/revision.c
+++ b/revision.c
@@ -360,20 +360,18 @@ static struct object *get_reference(struct rev_info *revs, const char *name,
 				    unsigned int flags)
 {
 	struct object *object;
+	struct commit *commit;
 
 	/*
-	 * If the repository has commit graphs, repo_parse_commit() avoids
-	 * reading the object buffer, so use it whenever possible.
+	 * If the repository has commit graphs, we try to opportunistically
+	 * look up the object ID in those graphs. Like this, we can avoid
+	 * parsing commit data from disk.
 	 */
-	if (oid_object_info(revs->repo, oid, NULL) == OBJ_COMMIT) {
-		struct commit *c = lookup_commit(revs->repo, oid);
-		if (!repo_parse_commit(revs->repo, c))
-			object = (struct object *) c;
-		else
-			object = NULL;
-	} else {
+	commit = lookup_commit_in_graph(revs->repo, oid);
+	if (commit)
+		object = &commit->object;
+	else
 		object = parse_object(revs->repo, oid);
-	}
 
 	if (!object) {
 		if (revs->ignore_missing)