diff options
Diffstat (limited to 'Documentation/technical/index-format.txt')
-rw-r--r-- | Documentation/technical/index-format.txt | 69 |
1 files changed, 59 insertions, 10 deletions
diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index 8930b3fabc..35112e4966 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -1,7 +1,7 @@ -GIT index format +Git index format ================ -= The git index file has the following format +== The Git index file has the following format All binary numbers are in network byte order. Version 2 is described here unless stated otherwise. @@ -12,7 +12,7 @@ GIT index format The signature is { 'D', 'I', 'R', 'C' } (stands for "dircache") 4-byte version number: - The current supported versions are 2 and 3. + The current supported versions are 2, 3 and 4. 32-bit number of index entries. @@ -21,9 +21,9 @@ GIT index format - Extensions Extensions are identified by signature. Optional extensions can - be ignored if GIT does not understand them. + be ignored if Git does not understand them. - GIT currently supports cached tree and resolve undo extensions. + Git currently supports cached tree and resolve undo extensions. 4-byte extension signature. If the first byte is 'A'..'Z' the extension is optional and can be ignored. @@ -93,8 +93,8 @@ GIT index format 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF is stored in this field. - (Version 3) A 16-bit field, only applicable if the "extended flag" - above is 1, split into (high to low bits). + (Version 3 or later) A 16-bit field, only applicable if the + "extended flag" above is 1, split into (high to low bits). 1-bit reserved for future @@ -113,9 +113,25 @@ GIT index format are encoded in 7-bit ASCII and the encoding cannot contain a NUL byte (iow, this is a UNIX pathname). + (Version 4) In version 4, the entry path name is prefix-compressed + relative to the path name for the previous entry (the very first + entry is encoded as if the path name for the previous entry is an + empty string). At the beginning of an entry, an integer N in the + variable width encoding (the same encoding as the offset is encoded + for OFS_DELTA pack entries; see pack-format.txt) is stored, followed + by a NUL-terminated string S. Removing N bytes from the end of the + path name for the previous entry, and replacing it with the string S + yields the path name for this entry. + 1-8 nul bytes as necessary to pad the entry to a multiple of eight bytes while keeping the name NUL-terminated. + (Version 4) In version 4, the padding after the pathname does not + exist. + + Interpretation of index entries in split index mode is completely + different. See below for details. + == Extensions === Cached tree @@ -148,8 +164,9 @@ GIT index format this span of index as a tree. An entry can be in an invalidated state and is represented by having - -1 in the entry_count field. In this case, there is no object name - and the next entry starts immediately after the newline. + a negative number in the entry_count field. In this case, there is no + object name and the next entry starts immediately after the newline. + When writing an invalid entry, -1 should always be used as entry_count. The entries are written out in the top-down, depth-first order. The first entry represents the root level of the repository, followed by the @@ -161,7 +178,7 @@ GIT index format A conflict is represented in the index as a set of higher stage entries. When a conflict is resolved (e.g. with "git add path"), these higher - stage entries will be removed and a stage-0 entry with proper resoluton + stage entries will be removed and a stage-0 entry with proper resolution is added. When these higher stage entries are removed, they are saved in the @@ -184,3 +201,35 @@ GIT index format - At most three 160-bit object names of the entry in stages from 1 to 3 (nothing is written for a missing stage). +=== Split index + + In split index mode, the majority of index entries could be stored + in a separate file. This extension records the changes to be made on + top of that to produce the final index. + + The signature for this extension is { 'l', 'i', 'n', 'k' }. + + The extension consists of: + + - 160-bit SHA-1 of the shared index file. The shared index file path + is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the + index does not require a shared index file. + + - An ewah-encoded delete bitmap, each bit represents an entry in the + shared index. If a bit is set, its corresponding entry in the + shared index will be removed from the final index. Note, because + a delete operation changes index entry positions, but we do need + original positions in replace phase, it's best to just mark + entries for removal, then do a mass deletion after replacement. + + - An ewah-encoded replace bitmap, each bit represents an entry in + the shared index. If a bit is set, its corresponding entry in the + shared index will be replaced with an entry in this index + file. All replaced entries are stored in sorted order in this + index. The first "1" bit in the replace bitmap corresponds to the + first index entry, the second "1" bit to the second entry and so + on. Replaced entries may have empty path names to save space. + + The remaining index entries after replaced ones will be added to the + final index. These added entries are also sorted by entry name then + stage. |