diff options
Diffstat (limited to 'Documentation/technical/bitmap-format.txt')
-rw-r--r-- | Documentation/technical/bitmap-format.txt | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt index 7a86bd77d5..f8c18a0f7a 100644 --- a/Documentation/technical/bitmap-format.txt +++ b/Documentation/technical/bitmap-format.txt @@ -21,6 +21,12 @@ GIT bitmap v1 format requirement for the bitmap index format, also present in JGit, that greatly reduces the complexity of the implementation. + - BITMAP_OPT_HASH_CACHE (0x4) + If present, the end of the bitmap file contains + `N` 32-bit name-hash values, one per object in the + pack. The format and meaning of the name-hash is + described below. + 4-byte entry count (network byte order) The total count of entries (bitmapped commits) in this bitmap index. @@ -129,3 +135,30 @@ The bitstream represented by the above chunk is then: The next word after `L_M` (if any) must again be a RLW, for the next chunk. For efficient appending to the bitstream, the EWAH stores a pointer to the last RLW in the stream. + + +== Appendix B: Optional Bitmap Sections + +These sections may or may not be present in the `.bitmap` file; their +presence is indicated by the header flags section described above. + +Name-hash cache +--------------- + +If the BITMAP_OPT_HASH_CACHE flag is set, the end of the bitmap contains +a cache of 32-bit values, one per object in the pack. The value at +position `i` is the hash of the pathname at which the `i`th object +(counting in index order) in the pack can be found. This can be fed +into the delta heuristics to compare objects with similar pathnames. + +The hash algorithm used is: + + hash = 0; + while ((c = *name++)) + if (!isspace(c)) + hash = (hash >> 2) + (c << 24); + +Note that this hashing scheme is tied to the BITMAP_OPT_HASH_CACHE flag. +If implementations want to choose a different hashing scheme, they are +free to do so, but MUST allocate a new header flag (because comparing +hashes made under two different schemes would be pointless). |