From d4f4d60f6da8b93f768b2b4958c04cc1b3cea443 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 18 Jun 2019 11:14:24 -0700 Subject: commit-graph: prepare for commit-graph chains To prepare for a chain of commit-graph files, augment the commit_graph struct to point to a base commit_graph. As we load commits from the graph, we may actually want to read from a base file according to the graph position. The "graph position" of a commit is given by concatenating the lexicographic commit orders from each of the commit-graph files in the chain. This means that we must distinguish two values: * lexicographic index : the position within the lexicographic order in a single commit-graph file. * graph position: the position within the concatenated order of multiple commit-graph files Given the lexicographic index of a commit in a graph, we can compute the graph position by adding the number of commits in the lower-level graphs. To find the lexicographic index of a commit, we subtract the number of commits in lower-level graphs. While here, change insert_parent_or_die() to take a uint32_t position, as that is the type used by its only caller and that makes more sense with the limits in the commit-graph format. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- commit-graph.h | 3 +++ 1 file changed, 3 insertions(+) (limited to 'commit-graph.h') diff --git a/commit-graph.h b/commit-graph.h index 390c7f6961..5819910a5b 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -48,6 +48,9 @@ struct commit_graph { uint32_t num_commits; struct object_id oid; + uint32_t num_commits_in_base; + struct commit_graph *base_graph; + const uint32_t *chunk_oid_fanout; const unsigned char *chunk_oid_lookup; const unsigned char *chunk_commit_data; -- cgit v1.2.3 From 118bd570029f5610abdbf1a220e87d3a5c241f5f Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 18 Jun 2019 11:14:26 -0700 Subject: commit-graph: add base graphs chunk To quickly verify a commit-graph chain is valid on load, we will read from the new "Base Graphs Chunk" of each file in the chain. This will prevent accidentally loading incorrect data from manually editing the commit-graph-chain file or renaming graph-{hash}.graph files. The commit_graph struct already had an object_id struct "oid", but it was never initialized or used. Add a line to read the hash from the end of the commit-graph file and into the oid member. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- commit-graph.h | 1 + 1 file changed, 1 insertion(+) (limited to 'commit-graph.h') diff --git a/commit-graph.h b/commit-graph.h index 5819910a5b..6e7d42cf32 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -55,6 +55,7 @@ struct commit_graph { const unsigned char *chunk_oid_lookup; const unsigned char *chunk_commit_data; const unsigned char *chunk_extra_edges; + const unsigned char *chunk_base_graphs; }; struct commit_graph *load_commit_graph_one_fd_st(int fd, struct stat *st); -- cgit v1.2.3 From 6c622f9f0bbb38a23341dc4294f56d0d909b3d50 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 18 Jun 2019 11:14:27 -0700 Subject: commit-graph: write commit-graph chains Extend write_commit_graph() to write a commit-graph chain when given the COMMIT_GRAPH_SPLIT flag. This implementation is purposefully simplistic in how it creates a new chain. The commits not already in the chain are added to a new tip commit-graph file. Much of the logic around writing a graph-{hash}.graph file and updating the commit-graph-chain file is the same as the commit-graph file case. However, there are several places where we need to do some extra logic in the split case. Track the list of graph filenames before and after the planned write. This will be more important when we start merging graph files, but it also allows us to upgrade our commit-graph file to the appropriate graph-{hash}.graph file when we upgrade to a chain of commit-graphs. Note that we use the eighth byte of the commit-graph header to store the number of base graph files. This determines the length of the base graphs chunk. A subtle change of behavior with the new logic is that we do not write a commit-graph if we our commit list is empty. This extends to the typical case, which is reflected in t5318-commit-graph.sh. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- commit-graph.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'commit-graph.h') diff --git a/commit-graph.h b/commit-graph.h index 6e7d42cf32..c321834533 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -47,6 +47,7 @@ struct commit_graph { unsigned char num_chunks; uint32_t num_commits; struct object_id oid; + char *filename; uint32_t num_commits_in_base; struct commit_graph *base_graph; @@ -71,6 +72,7 @@ int generation_numbers_enabled(struct repository *r); #define COMMIT_GRAPH_APPEND (1 << 0) #define COMMIT_GRAPH_PROGRESS (1 << 1) +#define COMMIT_GRAPH_SPLIT (1 << 2) /* * The write_commit_graph* methods return zero on success -- cgit v1.2.3 From c523035cbd8515d095dc15e9661fd896733bedbc Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 18 Jun 2019 11:14:30 -0700 Subject: commit-graph: allow cross-alternate chains In an environment like a fork network, it is helpful to have a commit-graph chain that spans both the base repo and the fork repo. The fork is usually a small set of data on top of the large repo, but sometimes the fork is much larger. For example, git-for-windows/git has almost double the number of commits as git/git because it rebases its commits on every major version update. To allow cross-alternate commit-graph chains, we need a few pieces: 1. When looking for a graph-{hash}.graph file, check all alternates. 2. When merging commit-graph chains, do not merge across alternates. 3. When writing a new commit-graph chain based on a commit-graph file in another object directory, do not allow success if the base file has of the name "commit-graph" instead of "commit-graphs/graph-{hash}.graph". Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- commit-graph.h | 1 + 1 file changed, 1 insertion(+) (limited to 'commit-graph.h') diff --git a/commit-graph.h b/commit-graph.h index c321834533..802d35254f 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -48,6 +48,7 @@ struct commit_graph { uint32_t num_commits; struct object_id oid; char *filename; + const char *obj_dir; uint32_t num_commits_in_base; struct commit_graph *base_graph; -- cgit v1.2.3 From c2bc6e6ab0ade70c475a73d06326e677e70840d2 Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 18 Jun 2019 11:14:32 -0700 Subject: commit-graph: create options for split files The split commit-graph feature is now fully implemented, but needs some more run-time configurability. Allow direct callers to 'git commit-graph write --split' to specify the values used in the merge strategy and the expire time. Update the documentation to specify these values. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- commit-graph.h | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) (limited to 'commit-graph.h') diff --git a/commit-graph.h b/commit-graph.h index 802d35254f..a84c22a560 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -75,17 +75,25 @@ int generation_numbers_enabled(struct repository *r); #define COMMIT_GRAPH_PROGRESS (1 << 1) #define COMMIT_GRAPH_SPLIT (1 << 2) +struct split_commit_graph_opts { + int size_multiple; + int max_commits; + timestamp_t expire_time; +}; + /* * The write_commit_graph* methods return zero on success * and a negative value on failure. Note that if the repository * is not compatible with the commit-graph feature, then the * methods will return 0 without writing a commit-graph. */ -int write_commit_graph_reachable(const char *obj_dir, unsigned int flags); +int write_commit_graph_reachable(const char *obj_dir, unsigned int flags, + const struct split_commit_graph_opts *split_opts); int write_commit_graph(const char *obj_dir, struct string_list *pack_indexes, struct string_list *commit_hex, - unsigned int flags); + unsigned int flags, + const struct split_commit_graph_opts *split_opts); int verify_commit_graph(struct repository *r, struct commit_graph *g); -- cgit v1.2.3 From 3da4b609bb14b13672f64af908706462617f53cb Mon Sep 17 00:00:00 2001 From: Derrick Stolee Date: Tue, 18 Jun 2019 11:14:32 -0700 Subject: commit-graph: verify chains with --shallow mode If we wrote a commit-graph chain, we only modified the tip file in the chain. It is valuable to verify what we wrote, but not waste time checking files we did not write. Add a '--shallow' option to the 'git commit-graph verify' subcommand and check that it does not read the base graph in a two-file chain. Making the verify subcommand read from a chain of commit-graphs takes some rearranging of the builtin code. Signed-off-by: Derrick Stolee Signed-off-by: Junio C Hamano --- commit-graph.h | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) (limited to 'commit-graph.h') diff --git a/commit-graph.h b/commit-graph.h index a84c22a560..df9a3b20e4 100644 --- a/commit-graph.h +++ b/commit-graph.h @@ -61,7 +61,7 @@ struct commit_graph { }; struct commit_graph *load_commit_graph_one_fd_st(int fd, struct stat *st); - +struct commit_graph *read_commit_graph_one(struct repository *r, const char *obj_dir); struct commit_graph *parse_commit_graph(void *graph_map, int fd, size_t graph_size); @@ -95,7 +95,9 @@ int write_commit_graph(const char *obj_dir, unsigned int flags, const struct split_commit_graph_opts *split_opts); -int verify_commit_graph(struct repository *r, struct commit_graph *g); +#define COMMIT_GRAPH_VERIFY_SHALLOW (1 << 0) + +int verify_commit_graph(struct repository *r, struct commit_graph *g, int flags); void close_commit_graph(struct raw_object_store *); void free_commit_graph(struct commit_graph *); -- cgit v1.2.3