summaryrefslogtreecommitdiff
path: root/Documentation/technical
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/technical')
-rw-r--r--Documentation/technical/api-argv-array.txt2
-rw-r--r--Documentation/technical/api-config.txt20
-rw-r--r--Documentation/technical/api-decorate.txt6
-rw-r--r--Documentation/technical/api-directory-listing.txt27
-rw-r--r--Documentation/technical/api-gitattributes.txt2
-rw-r--r--Documentation/technical/api-object-access.txt4
-rw-r--r--Documentation/technical/api-oid-array.txt17
-rw-r--r--Documentation/technical/api-ref-iteration.txt7
-rw-r--r--Documentation/technical/api-string-list.txt209
-rw-r--r--Documentation/technical/api-submodule-config.txt4
-rw-r--r--Documentation/technical/api-tree-walking.txt6
-rw-r--r--Documentation/technical/commit-graph-format.txt97
-rw-r--r--Documentation/technical/commit-graph.txt160
-rw-r--r--Documentation/technical/directory-rename-detection.txt115
-rw-r--r--Documentation/technical/hash-function-transition.txt827
-rw-r--r--Documentation/technical/http-protocol.txt15
-rw-r--r--Documentation/technical/index-format.txt19
-rw-r--r--Documentation/technical/long-running-process-protocol.txt50
-rw-r--r--Documentation/technical/pack-format.txt92
-rw-r--r--Documentation/technical/pack-protocol.txt56
-rw-r--r--Documentation/technical/partial-clone.txt324
-rw-r--r--Documentation/technical/protocol-capabilities.txt8
-rw-r--r--Documentation/technical/protocol-v2.txt439
-rw-r--r--Documentation/technical/repository-version.txt12
-rw-r--r--Documentation/technical/shallow.txt20
-rw-r--r--Documentation/technical/trivial-merge.txt4
26 files changed, 2273 insertions, 269 deletions
diff --git a/Documentation/technical/api-argv-array.txt b/Documentation/technical/api-argv-array.txt
index cfc063018c..870c8edbfb 100644
--- a/Documentation/technical/api-argv-array.txt
+++ b/Documentation/technical/api-argv-array.txt
@@ -8,7 +8,7 @@ always NULL-terminated at the element pointed to by `argv[argc]`. This
makes the result suitable for passing to functions expecting to receive
argv from main(), or the link:api-run-command.html[run-command API].
-The link:api-string-list.html[string-list API] is similar, but cannot be
+The string-list API (documented in string-list.h) is similar, but cannot be
used for these purposes; instead of storing a straight string pointer,
it contains an item structure with a `util` field that is not compatible
with the traditional argv interface.
diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt
index 20741f345e..fa39ac9d71 100644
--- a/Documentation/technical/api-config.txt
+++ b/Documentation/technical/api-config.txt
@@ -47,21 +47,23 @@ will first feed the user-wide one to the callback, and then the
repo-specific one; by overwriting, the higher-priority repo-specific
value is left at the end).
-The `git_config_with_options` function lets the caller examine config
+The `config_with_options` function lets the caller examine config
while adjusting some of the default behavior of `git_config`. It should
almost never be used by "regular" Git code that is looking up
configuration variables. It is intended for advanced callers like
`git-config`, which are intentionally tweaking the normal config-lookup
process. It takes two extra parameters:
-`filename`::
-If this parameter is non-NULL, it specifies the name of a file to
-parse for configuration, rather than looking in the usual files. Regular
-`git_config` defaults to `NULL`.
+`config_source`::
+If this parameter is non-NULL, it specifies the source to parse for
+configuration, rather than looking in the usual files. See `struct
+git_config_source` in `config.h` for details. Regular `git_config` defaults
+to `NULL`.
-`respect_includes`::
-Specify whether include directives should be followed in parsed files.
-Regular `git_config` defaults to `1`.
+`opts`::
+Specify options to adjust the behavior of parsing config files. See `struct
+config_options` in `config.h` for details. As an example: regular `git_config`
+sets `opts.respect_includes` to `1` by default.
Reading Specific Files
----------------------
@@ -186,7 +188,7 @@ parsing is successful, the return value is the result.
Same as `git_config_bool`, except that integers are returned as-is, and
an `is_bool` flag is unset.
-`git_config_maybe_bool`::
+`git_parse_maybe_bool`::
Same as `git_config_bool`, except that it returns -1 on error rather
than dying.
diff --git a/Documentation/technical/api-decorate.txt b/Documentation/technical/api-decorate.txt
deleted file mode 100644
index 1d52a6ce14..0000000000
--- a/Documentation/technical/api-decorate.txt
+++ /dev/null
@@ -1,6 +0,0 @@
-decorate API
-============
-
-Talk about <decorate.h>
-
-(Linus)
diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 6c77b4920c..5abb8e8b1f 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -22,16 +22,20 @@ The notable options are:
`flags`::
- A bit-field of options (the `*IGNORED*` flags are mutually exclusive):
+ A bit-field of options:
`DIR_SHOW_IGNORED`:::
- Return just ignored files in `entries[]`, not untracked files.
+ Return just ignored files in `entries[]`, not untracked
+ files. This flag is mutually exclusive with
+ `DIR_SHOW_IGNORED_TOO`.
`DIR_SHOW_IGNORED_TOO`:::
- Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
- in addition to untracked files in `entries[]`.
+ Similar to `DIR_SHOW_IGNORED`, but return ignored files in
+ `ignored[]` in addition to untracked files in
+ `entries[]`. This flag is mutually exclusive with
+ `DIR_SHOW_IGNORED`.
`DIR_KEEP_UNTRACKED_CONTENTS`:::
@@ -39,6 +43,21 @@ The notable options are:
untracked contents of untracked directories are also returned in
`entries[]`.
+`DIR_SHOW_IGNORED_TOO_MODE_MATCHING`:::
+
+ Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if
+ this is set, returns ignored files and directories that match
+ an exclude pattern. If a directory matches an exclude pattern,
+ then the directory is returned and the contained paths are
+ not. A directory that does not match an exclude pattern will
+ not be returned even if all of its contents are ignored. In
+ this case, the contents are returned as individual entries.
++
+If this is set, files and directories that explicitly match an ignore
+pattern are reported. Implicitly ignored directories (directories that
+do not match an ignore pattern, but whose contents are all ignored)
+are not reported, instead all of the contents are reported.
+
`DIR_COLLECT_IGNORED`:::
Special mode for git-add. Return ignored files in `ignored[]` and
diff --git a/Documentation/technical/api-gitattributes.txt b/Documentation/technical/api-gitattributes.txt
index e7cbb7c13a..45f0df600f 100644
--- a/Documentation/technical/api-gitattributes.txt
+++ b/Documentation/technical/api-gitattributes.txt
@@ -146,7 +146,7 @@ To get the values of all attributes associated with a file:
* Iterate over the `attr_check.items[]` array to examine
the attribute names and values. The name of the attribute
- described by a `attr_check.items[]` object can be retrieved via
+ described by an `attr_check.items[]` object can be retrieved via
`git_attr_name(check->items[i].attr)`. (Please note that no items
will be returned for unset attributes, so `ATTR_UNSET()` will return
false for all returned `attr_check.items[]` objects.)
diff --git a/Documentation/technical/api-object-access.txt b/Documentation/technical/api-object-access.txt
index 03bb0e950d..5b29622d00 100644
--- a/Documentation/technical/api-object-access.txt
+++ b/Documentation/technical/api-object-access.txt
@@ -1,13 +1,13 @@
object access API
=================
-Talk about <sha1_file.c> and <object.h> family, things like
+Talk about <sha1-file.c> and <object.h> family, things like
* read_sha1_file()
* read_object_with_reference()
* has_sha1_file()
* write_sha1_file()
-* pretend_sha1_file()
+* pretend_object_file()
* lookup_{object,commit,tag,blob,tree}
* parse_{object,commit,tag,blob,tree}
* Use of object flags
diff --git a/Documentation/technical/api-oid-array.txt b/Documentation/technical/api-oid-array.txt
index b0c11f868d..9febfb1d52 100644
--- a/Documentation/technical/api-oid-array.txt
+++ b/Documentation/technical/api-oid-array.txt
@@ -35,13 +35,18 @@ Functions
Free all memory associated with the array and return it to the
initial, empty state.
+`oid_array_for_each`::
+ Iterate over each element of the list, executing the callback
+ function for each one. Does not sort the list, so any custom
+ hash order is retained. If the callback returns a non-zero
+ value, the iteration ends immediately and the callback's
+ return is propagated; otherwise, 0 is returned.
+
`oid_array_for_each_unique`::
- Efficiently iterate over each unique element of the list,
- executing the callback function for each one. If the array is
- not sorted, this function has the side effect of sorting it. If
- the callback returns a non-zero value, the iteration ends
- immediately and the callback's return is propagated; otherwise,
- 0 is returned.
+ Iterate over each unique element of the list in sorted order,
+ but otherwise behave like `oid_array_for_each`. If the array
+ is not sorted, this function has the side effect of sorting
+ it.
Examples
--------
diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt
index 37379d8337..46c3d5c355 100644
--- a/Documentation/technical/api-ref-iteration.txt
+++ b/Documentation/technical/api-ref-iteration.txt
@@ -32,11 +32,8 @@ Iteration functions
* `for_each_glob_ref_in()` the previous and `for_each_ref_in()` combined.
-* `head_ref_submodule()`, `for_each_ref_submodule()`,
- `for_each_ref_in_submodule()`, `for_each_tag_ref_submodule()`,
- `for_each_branch_ref_submodule()`, `for_each_remote_ref_submodule()`
- do the same as the functions described above but for a specified
- submodule.
+* Use `refs_` API for accessing submodules. The submodule ref store could
+ be obtained with `get_submodule_ref_store()`.
* `for_each_rawref()` can be used to learn about broken ref and symref.
diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt
deleted file mode 100644
index c08402b12e..0000000000
--- a/Documentation/technical/api-string-list.txt
+++ /dev/null
@@ -1,209 +0,0 @@
-string-list API
-===============
-
-The string_list API offers a data structure and functions to handle
-sorted and unsorted string lists. A "sorted" list is one whose
-entries are sorted by string value in `strcmp()` order.
-
-The 'string_list' struct used to be called 'path_list', but was renamed
-because it is not specific to paths.
-
-The caller:
-
-. Allocates and clears a `struct string_list` variable.
-
-. Initializes the members. You might want to set the flag `strdup_strings`
- if the strings should be strdup()ed. For example, this is necessary
- when you add something like git_path("..."), since that function returns
- a static buffer that will change with the next call to git_path().
-+
-If you need something advanced, you can manually malloc() the `items`
-member (you need this if you add things later) and you should set the
-`nr` and `alloc` members in that case, too.
-
-. Adds new items to the list, using `string_list_append`,
- `string_list_append_nodup`, `string_list_insert`,
- `string_list_split`, and/or `string_list_split_in_place`.
-
-. Can check if a string is in the list using `string_list_has_string` or
- `unsorted_string_list_has_string` and get it from the list using
- `string_list_lookup` for sorted lists.
-
-. Can sort an unsorted list using `string_list_sort`.
-
-. Can remove duplicate items from a sorted list using
- `string_list_remove_duplicates`.
-
-. Can remove individual items of an unsorted list using
- `unsorted_string_list_delete_item`.
-
-. Can remove items not matching a criterion from a sorted or unsorted
- list using `filter_string_list`, or remove empty strings using
- `string_list_remove_empty_items`.
-
-. Finally it should free the list using `string_list_clear`.
-
-Example:
-
-----
-struct string_list list = STRING_LIST_INIT_NODUP;
-int i;
-
-string_list_append(&list, "foo");
-string_list_append(&list, "bar");
-for (i = 0; i < list.nr; i++)
- printf("%s\n", list.items[i].string)
-----
-
-NOTE: It is more efficient to build an unsorted list and sort it
-afterwards, instead of building a sorted list (`O(n log n)` instead of
-`O(n^2)`).
-+
-However, if you use the list to check if a certain string was added
-already, you should not do that (using unsorted_string_list_has_string()),
-because the complexity would be quadratic again (but with a worse factor).
-
-Functions
----------
-
-* General ones (works with sorted and unsorted lists as well)
-
-`string_list_init`::
-
- Initialize the members of the string_list, set `strdup_strings`
- member according to the value of the second parameter.
-
-`filter_string_list`::
-
- Apply a function to each item in a list, retaining only the
- items for which the function returns true. If free_util is
- true, call free() on the util members of any items that have
- to be deleted. Preserve the order of the items that are
- retained.
-
-`string_list_remove_empty_items`::
-
- Remove any empty strings from the list. If free_util is true,
- call free() on the util members of any items that have to be
- deleted. Preserve the order of the items that are retained.
-
-`print_string_list`::
-
- Dump a string_list to stdout, useful mainly for debugging purposes. It
- can take an optional header argument and it writes out the
- string-pointer pairs of the string_list, each one in its own line.
-
-`string_list_clear`::
-
- Free a string_list. The `string` pointer of the items will be freed in
- case the `strdup_strings` member of the string_list is set. The second
- parameter controls if the `util` pointer of the items should be freed
- or not.
-
-* Functions for sorted lists only
-
-`string_list_has_string`::
-
- Determine if the string_list has a given string or not.
-
-`string_list_insert`::
-
- Insert a new element to the string_list. The returned pointer can be
- handy if you want to write something to the `util` pointer of the
- string_list_item containing the just added string. If the given
- string already exists the insertion will be skipped and the
- pointer to the existing item returned.
-+
-Since this function uses xrealloc() (which die()s if it fails) if the
-list needs to grow, it is safe not to check the pointer. I.e. you may
-write `string_list_insert(...)->util = ...;`.
-
-`string_list_lookup`::
-
- Look up a given string in the string_list, returning the containing
- string_list_item. If the string is not found, NULL is returned.
-
-`string_list_remove_duplicates`::
-
- Remove all but the first of consecutive entries that have the
- same string value. If free_util is true, call free() on the
- util members of any items that have to be deleted.
-
-* Functions for unsorted lists only
-
-`string_list_append`::
-
- Append a new string to the end of the string_list. If
- `strdup_string` is set, then the string argument is copied;
- otherwise the new `string_list_entry` refers to the input
- string.
-
-`string_list_append_nodup`::
-
- Append a new string to the end of the string_list. The new
- `string_list_entry` always refers to the input string, even if
- `strdup_string` is set. This function can be used to hand
- ownership of a malloc()ed string to a `string_list` that has
- `strdup_string` set.
-
-`string_list_sort`::
-
- Sort the list's entries by string value in `strcmp()` order.
-
-`unsorted_string_list_has_string`::
-
- It's like `string_list_has_string()` but for unsorted lists.
-
-`unsorted_string_list_lookup`::
-
- It's like `string_list_lookup()` but for unsorted lists.
-+
-The above two functions need to look through all items, as opposed to their
-counterpart for sorted lists, which performs a binary search.
-
-`unsorted_string_list_delete_item`::
-
- Remove an item from a string_list. The `string` pointer of the items
- will be freed in case the `strdup_strings` member of the string_list
- is set. The third parameter controls if the `util` pointer of the
- items should be freed or not.
-
-`string_list_split`::
-`string_list_split_in_place`::
-
- Split a string into substrings on a delimiter character and
- append the substrings to a `string_list`. If `maxsplit` is
- non-negative, then split at most `maxsplit` times. Return the
- number of substrings appended to the list.
-+
-`string_list_split` requires a `string_list` that has `strdup_strings`
-set to true; it leaves the input string untouched and makes copies of
-the substrings in newly-allocated memory.
-`string_list_split_in_place` requires a `string_list` that has
-`strdup_strings` set to false; it splits the input string in place,
-overwriting the delimiter characters with NULs and creating new
-string_list_items that point into the original string (the original
-string must therefore not be modified or freed while the `string_list`
-is in use).
-
-
-Data structures
----------------
-
-* `struct string_list_item`
-
-Represents an item of the list. The `string` member is a pointer to the
-string, and you may use the `util` member for any purpose, if you want.
-
-* `struct string_list`
-
-Represents the list itself.
-
-. The array of items are available via the `items` member.
-. The `nr` member contains the number of items stored in the list.
-. The `alloc` member is used to avoid reallocating at every insertion.
- You should not tamper with it.
-. Setting the `strdup_strings` member to 1 will strdup() the strings
- before adding them, see above.
-. The `compare_strings_fn` member is used to specify a custom compare
- function, otherwise `strcmp()` is used as the default function.
diff --git a/Documentation/technical/api-submodule-config.txt b/Documentation/technical/api-submodule-config.txt
index 3dce003fda..fb06089393 100644
--- a/Documentation/technical/api-submodule-config.txt
+++ b/Documentation/technical/api-submodule-config.txt
@@ -4,7 +4,7 @@ submodule config cache API
The submodule config cache API allows to read submodule
configurations/information from specified revisions. Internally
information is lazily read into a cache that is used to avoid
-unnecessary parsing of the same .gitmodule files. Lookups can be done by
+unnecessary parsing of the same .gitmodules files. Lookups can be done by
submodule path or name.
Usage
@@ -38,7 +38,7 @@ Data Structures
Functions
---------
-`void submodule_free()`::
+`void submodule_free(struct repository *r)`::
Use these to free the internally cached values.
diff --git a/Documentation/technical/api-tree-walking.txt b/Documentation/technical/api-tree-walking.txt
index 14af37c3f1..bde18622a8 100644
--- a/Documentation/technical/api-tree-walking.txt
+++ b/Documentation/technical/api-tree-walking.txt
@@ -55,9 +55,9 @@ Initializing
`fill_tree_descriptor`::
- Initialize a `tree_desc` and decode its first entry given the sha1 of
- a tree. Returns the `buffer` member if the sha1 is a valid tree
- identifier and NULL otherwise.
+ Initialize a `tree_desc` and decode its first entry given the
+ object ID of a tree. Returns the `buffer` member if the latter
+ is a valid tree identifier and NULL otherwise.
`setup_traverse_info`::
diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
new file mode 100644
index 0000000000..cc0474ba3e
--- /dev/null
+++ b/Documentation/technical/commit-graph-format.txt
@@ -0,0 +1,97 @@
+Git commit graph format
+=======================
+
+The Git commit graph stores a list of commit OIDs and some associated
+metadata, including:
+
+- The generation number of the commit. Commits with no parents have
+ generation number 1; commits with parents have generation number
+ one more than the maximum generation number of its parents. We
+ reserve zero as special, and can be used to mark a generation
+ number invalid or as "not computed".
+
+- The root tree OID.
+
+- The commit date.
+
+- The parents of the commit, stored using positional references within
+ the graph file.
+
+These positional references are stored as unsigned 32-bit integers
+corresponding to the array position within the list of commit OIDs. Due
+to some special constants we use to track parents, we can store at most
+(1 << 30) + (1 << 29) + (1 << 28) - 1 (around 1.8 billion) commits.
+
+== Commit graph files have the following format:
+
+In order to allow extensions that add extra data to the graph, we organize
+the body into "chunks" and provide a binary lookup table at the beginning
+of the body. The header includes certain values, such as number of chunks
+and hash type.
+
+All 4-byte numbers are in network order.
+
+HEADER:
+
+ 4-byte signature:
+ The signature is: {'C', 'G', 'P', 'H'}
+
+ 1-byte version number:
+ Currently, the only valid version is 1.
+
+ 1-byte Hash Version (1 = SHA-1)
+ We infer the hash length (H) from this value.
+
+ 1-byte number (C) of "chunks"
+
+ 1-byte (reserved for later use)
+ Current clients should ignore this value.
+
+CHUNK LOOKUP:
+
+ (C + 1) * 12 bytes listing the table of contents for the chunks:
+ First 4 bytes describe the chunk id. Value 0 is a terminating label.
+ Other 8 bytes provide the byte-offset in current file for chunk to
+ start. (Chunks are ordered contiguously in the file, so you can infer
+ the length using the next chunk position if necessary.) Each chunk
+ ID appears at most once.
+
+ The remaining data in the body is described one chunk at a time, and
+ these chunks may be given in any order. Chunks are required unless
+ otherwise specified.
+
+CHUNK DATA:
+
+ OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+ The ith entry, F[i], stores the number of OIDs with first
+ byte at most i. Thus F[255] stores the total
+ number of commits (N).
+
+ OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+ The OIDs for all commits in the graph, sorted in ascending order.
+
+ Commit Data (ID: {'C', 'D', 'A', 'T' }) (N * (H + 16) bytes)
+ * The first H bytes are for the OID of the root tree.
+ * The next 8 bytes are for the positions of the first two parents
+ of the ith commit. Stores value 0x7000000 if no parent in that
+ position. If there are more than two parents, the second value
+ has its most-significant bit on and the other bits store an array
+ position into the Large Edge List chunk.
+ * The next 8 bytes store the generation number of the commit and
+ the commit time in seconds since EPOCH. The generation number
+ uses the higher 30 bits of the first 4 bytes, while the commit
+ time uses the 32 bits of the second 4 bytes, along with the lowest
+ 2 bits of the lowest byte, storing the 33rd and 34th bit of the
+ commit time.
+
+ Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
+ This list of 4-byte values store the second through nth parents for
+ all octopus merges. The second parent value in the commit data stores
+ an array position within this list along with the most-significant bit
+ on. Starting at that array position, iterate through this list of commit
+ positions for the parents until reaching a value with the most-significant
+ bit on. The other bits correspond to the position of the last parent.
+
+TRAILER:
+
+ H-byte HASH-checksum of all of the above.
diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt
new file mode 100644
index 0000000000..c664acbd76
--- /dev/null
+++ b/Documentation/technical/commit-graph.txt
@@ -0,0 +1,160 @@
+Git Commit Graph Design Notes
+=============================
+
+Git walks the commit graph for many reasons, including:
+
+1. Listing and filtering commit history.
+2. Computing merge bases.
+
+These operations can become slow as the commit count grows. The merge
+base calculation shows up in many user-facing commands, such as 'merge-base'
+or 'status' and can take minutes to compute depending on history shape.
+
+There are two main costs here:
+
+1. Decompressing and parsing commits.
+2. Walking the entire graph to satisfy topological order constraints.
+
+The commit graph file is a supplemental data structure that accelerates
+commit graph walks. If a user downgrades or disables the 'core.commitGraph'
+config setting, then the existing ODB is sufficient. The file is stored
+as "commit-graph" either in the .git/objects/info directory or in the info
+directory of an alternate.
+
+The commit graph file stores the commit graph structure along with some
+extra metadata to speed up graph walks. By listing commit OIDs in lexi-
+cographic order, we can identify an integer position for each commit and
+refer to the parents of a commit using those integer positions. We use
+binary search to find initial commits and then use the integer positions
+for fast lookups during the walk.
+
+A consumer may load the following info for a commit from the graph:
+
+1. The commit OID.
+2. The list of parents, along with their integer position.
+3. The commit date.
+4. The root tree OID.
+5. The generation number (see definition below).
+
+Values 1-4 satisfy the requirements of parse_commit_gently().
+
+Define the "generation number" of a commit recursively as follows:
+
+ * A commit with no parents (a root commit) has generation number one.
+
+ * A commit with at least one parent has generation number one more than
+ the largest generation number among its parents.
+
+Equivalently, the generation number of a commit A is one more than the
+length of a longest path from A to a root commit. The recursive definition
+is easier to use for computation and observing the following property:
+
+ If A and B are commits with generation numbers N and M, respectively,
+ and N <= M, then A cannot reach B. That is, we know without searching
+ that B is not an ancestor of A because it is further from a root commit
+ than A.
+
+ Conversely, when checking if A is an ancestor of B, then we only need
+ to walk commits until all commits on the walk boundary have generation
+ number at most N. If we walk commits using a priority queue seeded by
+ generation numbers, then we always expand the boundary commit with highest
+ generation number and can easily detect the stopping condition.
+
+This property can be used to significantly reduce the time it takes to
+walk commits and determine topological relationships. Without generation
+numbers, the general heuristic is the following:
+
+ If A and B are commits with commit time X and Y, respectively, and
+ X < Y, then A _probably_ cannot reach B.
+
+This heuristic is currently used whenever the computation is allowed to
+violate topological relationships due to clock skew (such as "git log"
+with default order), but is not used when the topological order is
+required (such as merge base calculations, "git log --graph").
+
+In practice, we expect some commits to be created recently and not stored
+in the commit graph. We can treat these commits as having "infinite"
+generation number and walk until reaching commits with known generation
+number.
+
+We use the macro GENERATION_NUMBER_INFINITY = 0xFFFFFFFF to mark commits not
+in the commit-graph file. If a commit-graph file was written by a version
+of Git that did not compute generation numbers, then those commits will
+have generation number represented by the macro GENERATION_NUMBER_ZERO = 0.
+
+Since the commit-graph file is closed under reachability, we can guarantee
+the following weaker condition on all commits:
+
+ If A and B are commits with generation numbers N amd M, respectively,
+ and N < M, then A cannot reach B.
+
+Note how the strict inequality differs from the inequality when we have
+fully-computed generation numbers. Using strict inequality may result in
+walking a few extra commits, but the simplicity in dealing with commits
+with generation number *_INFINITY or *_ZERO is valuable.
+
+We use the macro GENERATION_NUMBER_MAX = 0x3FFFFFFF to for commits whose
+generation numbers are computed to be at least this value. We limit at
+this value since it is the largest value that can be stored in the
+commit-graph file using the 30 bits available to generation numbers. This
+presents another case where a commit can have generation number equal to
+that of a parent.
+
+Design Details
+--------------
+
+- The commit graph file is stored in a file named 'commit-graph' in the
+ .git/objects/info directory. This could be stored in the info directory
+ of an alternate.
+
+- The core.commitGraph config setting must be on to consume graph files.
+
+- The file format includes parameters for the object ID hash function,
+ so a future change of hash algorithm does not require a change in format.
+
+Future Work
+-----------
+
+- The commit graph feature currently does not honor commit grafts. This can
+ be remedied by duplicating or refactoring the current graft logic.
+
+- After computing and storing generation numbers, we must make graph
+ walks aware of generation numbers to gain the performance benefits they
+ enable. This will mostly be accomplished by swapping a commit-date-ordered
+ priority queue with one ordered by generation number. The following
+ operations are important candidates:
+
+ - 'log --topo-order'
+ - 'tag --merged'
+
+- A server could provide a commit graph file as part of the network protocol
+ to avoid extra calculations by clients. This feature is only of benefit if
+ the user is willing to trust the file, because verifying the file is correct
+ is as hard as computing it from scratch.
+
+Related Links
+-------------
+[0] https://bugs.chromium.org/p/git/issues/detail?id=8
+ Chromium work item for: Serialized Commit Graph
+
+[1] https://public-inbox.org/git/20110713070517.GC18566@sigill.intra.peff.net/
+ An abandoned patch that introduced generation numbers.
+
+[2] https://public-inbox.org/git/20170908033403.q7e6dj7benasrjes@sigill.intra.peff.net/
+ Discussion about generation numbers on commits and how they interact
+ with fsck.
+
+[3] https://public-inbox.org/git/20170908034739.4op3w4f2ma5s65ku@sigill.intra.peff.net/
+ More discussion about generation numbers and not storing them inside
+ commit objects. A valuable quote:
+
+ "I think we should be moving more in the direction of keeping
+