summaryrefslogtreecommitdiff
path: root/Documentation/technical/index-format.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/technical/index-format.txt')
-rw-r--r--Documentation/technical/index-format.txt46
1 files changed, 34 insertions, 12 deletions
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index 69edf46c03..d363a71c37 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -26,7 +26,7 @@ Git index format
Extensions are identified by signature. Optional extensions can
be ignored if Git does not understand them.
- Git currently supports cached tree and resolve undo extensions.
+ Git currently supports cache tree and resolve undo extensions.
4-byte extension signature. If the first byte is 'A'..'Z' the
extension is optional and can be ignored.
@@ -136,14 +136,35 @@ Git index format
== Extensions
-=== Cached tree
-
- Cached tree extension contains pre-computed hashes for trees that can
- be derived from the index. It helps speed up tree object generation
- from index for a new commit.
-
- When a path is updated in index, the path must be invalidated and
- removed from tree cache.
+=== Cache tree
+
+ Since the index does not record entries for directories, the cache
+ entries cannot describe tree objects that already exist in the object
+ database for regions of the index that are unchanged from an existing
+ commit. The cache tree extension stores a recursive tree structure that
+ describes the trees that already exist and completely match sections of
+ the cache entries. This speeds up tree object generation from the index
+ for a new commit by only computing the trees that are "new" to that
+ commit. It also assists when comparing the index to another tree, such
+ as `HEAD^{tree}`, since sections of the index can be skipped when a tree
+ comparison demonstrates equality.
+
+ The recursive tree structure uses nodes that store a number of cache
+ entries, a list of subnodes, and an object ID (OID). The OID references
+ the existing tree for that node, if it is known to exist. The subnodes
+ correspond to subdirectories that themselves have cache tree nodes. The
+ number of cache entries corresponds to the number of cache entries in
+ the index that describe paths within that tree's directory.
+
+ The extension tracks the full directory structure in the cache tree
+ extension, but this is generally smaller than the full cache entry list.
+
+ When a path is updated in index, Git invalidates all nodes of the
+ recursive cache tree corresponding to the parent directories of that
+ path. We store these tree nodes as being "invalid" by using "-1" as the
+ number of cache entries. Invalid nodes still store a span of index
+ entries, allowing Git to focus its efforts when reconstructing a full
+ cache tree.
The signature for this extension is { 'T', 'R', 'E', 'E' }.
@@ -174,7 +195,8 @@ Git index format
first entry represents the root level of the repository, followed by the
first subtree--let's call this A--of the root level (with its name
relative to the root level), followed by the first subtree of A (with
- its name relative to A), ...
+ its name relative to A), and so on. The specified number of subtrees
+ indicates when the current level of the recursive stack is complete.
=== Resolve undo
@@ -251,14 +273,14 @@ Git index format
- Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
ctime field until "file size".
- - Stat data of core.excludesfile
+ - Stat data of core.excludesFile
- 32-bit dir_flags (see struct dir_struct)
- Hash of $GIT_DIR/info/exclude. A null hash means the file
does not exist.
- - Hash of core.excludesfile. A null hash means the file does
+ - Hash of core.excludesFile. A null hash means the file does
not exist.
- NUL-terminated string of per-dir exclude file name. This usually