summaryrefslogtreecommitdiff
path: root/Documentation/technical/index-format.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/technical/index-format.txt')
-rw-r--r--Documentation/technical/index-format.txt75
1 files changed, 61 insertions, 14 deletions
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index f9a3644711..65da0daaa5 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -26,7 +26,7 @@ Git index format
Extensions are identified by signature. Optional extensions can
be ignored if Git does not understand them.
- Git currently supports cached tree and resolve undo extensions.
+ Git currently supports cache tree and resolve undo extensions.
4-byte extension signature. If the first byte is 'A'..'Z' the
extension is optional and can be ignored.
@@ -44,6 +44,13 @@ Git index format
localization, no special casing of directory separator '/'). Entries
with the same name are sorted by their stage field.
+ An index entry typically represents a file. However, if sparse-checkout
+ is enabled in cone mode (`core.sparseCheckoutCone` is enabled) and the
+ `extensions.sparseIndex` extension is enabled, then the index may
+ contain entries for directories outside of the sparse-checkout definition.
+ These entries have mode `040000`, include the `SKIP_WORKTREE` bit, and
+ the path ends in a directory separator.
+
32-bit ctime seconds, the last time a file's metadata changed
this is stat(2) data
@@ -136,14 +143,35 @@ Git index format
== Extensions
-=== Cached tree
-
- Cached tree extension contains pre-computed hashes for trees that can
- be derived from the index. It helps speed up tree object generation
- from index for a new commit.
-
- When a path is updated in index, the path must be invalidated and
- removed from tree cache.
+=== Cache tree
+
+ Since the index does not record entries for directories, the cache
+ entries cannot describe tree objects that already exist in the object
+ database for regions of the index that are unchanged from an existing
+ commit. The cache tree extension stores a recursive tree structure that
+ describes the trees that already exist and completely match sections of
+ the cache entries. This speeds up tree object generation from the index
+ for a new commit by only computing the trees that are "new" to that
+ commit. It also assists when comparing the index to another tree, such
+ as `HEAD^{tree}`, since sections of the index can be skipped when a tree
+ comparison demonstrates equality.
+
+ The recursive tree structure uses nodes that store a number of cache
+ entries, a list of subnodes, and an object ID (OID). The OID references
+ the existing tree for that node, if it is known to exist. The subnodes
+ correspond to subdirectories that themselves have cache tree nodes. The
+ number of cache entries corresponds to the number of cache entries in
+ the index that describe paths within that tree's directory.
+
+ The extension tracks the full directory structure in the cache tree
+ extension, but this is generally smaller than the full cache entry list.
+
+ When a path is updated in index, Git invalidates all nodes of the
+ recursive cache tree corresponding to the parent directories of that
+ path. We store these tree nodes as being "invalid" by using "-1" as the
+ number of cache entries. Invalid nodes still store a span of index
+ entries, allowing Git to focus its efforts when reconstructing a full
+ cache tree.
The signature for this extension is { 'T', 'R', 'E', 'E' }.
@@ -174,7 +202,8 @@ Git index format
first entry represents the root level of the repository, followed by the
first subtree--let's call this A--of the root level (with its name
relative to the root level), followed by the first subtree of A (with
- its name relative to A), ...
+ its name relative to A), and so on. The specified number of subtrees
+ indicates when the current level of the recursive stack is complete.
=== Resolve undo
@@ -251,14 +280,14 @@ Git index format
- Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
ctime field until "file size".
- - Stat data of core.excludesfile
+ - Stat data of core.excludesFile
- 32-bit dir_flags (see struct dir_struct)
- Hash of $GIT_DIR/info/exclude. A null hash means the file
does not exist.
- - Hash of core.excludesfile. A null hash means the file does
+ - Hash of core.excludesFile. A null hash means the file does
not exist.
- NUL-terminated string of per-dir exclude file name. This usually
@@ -306,12 +335,18 @@ The remaining data of each directory block is grouped by type:
The extension starts with
- - 32-bit version number: the current supported version is 1.
+ - 32-bit version number: the current supported versions are 1 and 2.
- - 64-bit time: the extension data reflects all changes through the given
+ - (Version 1)
+ 64-bit time: the extension data reflects all changes through the given
time which is stored as the nanoseconds elapsed since midnight,
January 1, 1970.
+ - (Version 2)
+ A null terminated string: an opaque token defined by the file system
+ monitor application. The extension data reflects all changes relative
+ to that token.
+
- 32-bit bitmap size: the size of the CE_FSMONITOR_VALID bitmap.
- An ewah bitmap, the n-th bit indicates whether the n-th index entry
@@ -357,3 +392,15 @@ The remaining data of each directory block is grouped by type:
in this block of entries.
- 32-bit count of cache entries in this block
+
+== Sparse Directory Entries
+
+ When using sparse-checkout in cone mode, some entire directories within
+ the index can be summarized by pointing to a tree object instead of the
+ entire expanded list of paths within that tree. An index containing such
+ entries is a "sparse index". Index format versions 4 and less were not
+ implemented with such entries in mind. Thus, for these versions, an
+ index containing sparse directory entries will include this extension
+ with signature { 's', 'd', 'i', 'r' }. Like the split-index extension,
+ tools should avoid interacting with a sparse index unless they understand
+ this extension.