summaryrefslogtreecommitdiff
path: root/Documentation/technical/index-format.txt
AgeCommit message (Collapse)AuthorFilesLines
2021-03-30sparse-index: add 'sdir' index extensionLibravatar Derrick Stolee1-0/+12
The index format does not currently allow for sparse directory entries. This violates some expectations that older versions of Git or third-party tools might not understand. We need an indicator inside the index file to warn these tools to not interact with a sparse index unless they are aware of sparse directory entries. Add a new _required_ index extension, 'sdir', that indicates that the index may contain sparse directory entries. This allows us to continue to use the differences in index formats 2, 3, and 4 before we create a new index version 5 in a later change. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-30sparse-index: design doc and format updateLibravatar Derrick Stolee1-0/+7
This begins a long effort to update the index format to allow sparse directory entries. This should result in a significant improvement to Git commands when HEAD contains millions of files, but the user has selected many fewer files to keep in their sparse-checkout definition. Currently, the index format is only updated in the presence of extensions.sparseIndex instead of increasing a file format version number. This is temporary, and index v5 is part of the plan for future work in this area. The design document details many of the reasons for embarking on this work, and also the plan for completing it safely. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-25Merge branch 'dl/doc-config-camelcase'Libravatar Junio C Hamano1-2/+2
A handful of multi-word configuration variable names in documentation that are spelled in all lowercase have been corrected to use the more canonical camelCase. * dl/doc-config-camelcase: index-format doc: camelCase core.excludesFile blame-options.txt: camelcase blame.blankBoundary i18n.txt: camel case and monospace "i18n.commitEncoding"
2021-02-24index-format doc: camelCase core.excludesFileLibravatar Junio C Hamano1-2/+2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15index-format: discuss recursion of cache-tree betterLibravatar Derrick Stolee1-1/+2
The end of the cache tree index extension format trails off with ellipses ever since 23fcc98 (doc: technical details about the index file format, 2011-03-01). While an intuitive reader could gather what this means, it could be better to use "and so on" instead. Really, this is only justified because I also wanted to point out that the number of subtrees in the index format is used to determine when the recursive depth-first-search stack should be "popped." This should help to add clarity to the format. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15index-format: update preamble to cache tree extensionLibravatar Derrick Stolee1-6/+27
I had difficulty in my efforts to learn about the cache tree extension based on the documentation and code because I had an incorrect assumption about how it behaved. This might be due to some ambiguity in the documentation, so this change modifies the beginning of the cache tree format by expanding the description of the feature. My hope is that this documentation clarifies a few things: 1. There is an in-memory recursive tree structure that is constructed from the extension data. This structure has a few differences, such as where the name is stored. 2. What does it mean for an entry to be invalid? 3. When exactly are "new" trees created? Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-15index-format: use 'cache tree' over 'cached tree'Libravatar Derrick Stolee1-3/+3
The index has a "cache tree" extension. This corresponds to a significant API implemented in cache-tree.[ch]. However, there are a few places that refer to this erroneously as "cached tree". These are rare, but notably the index-format.txt file itself makes this error. The only other reference is in t7104-reset-hard.sh. Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-12-17Merge branch 'jh/index-v2-doc-on-fsmn'Libravatar Junio C Hamano1-2/+8
Doc update. * jh/index-v2-doc-on-fsmn: index-format.txt: document v2 format of file system monitor extension
2020-12-14index-format.txt: document v2 format of file system monitor extensionLibravatar Jeff Hostetler1-2/+8
Update the documentation of the file system monitor extension to describe version 2. The format was extended to support opaque tokens in: 56c6910028 fsmonitor: change last update timestamp on the index_state to opaque token Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-17index-format.txt: document SHA-256 index formatLibravatar Martin Ågren1-16/+18
Document that in SHA-1 repositories, we use SHA-1 and in SHA-256 repositories, we use SHA-256, then replace all other uses of "SHA-1" with something more neutral. Avoid referring to "160-bit" hash values. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-07Documentation: fix a bunch of typos, both old and newLibravatar Elijah Newren1-2/+2
Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-11ieot: add Index Entry Offset Table (IEOT) extensionLibravatar Ben Peart1-0/+18
This patch enables addressing the CPU cost of loading the index by adding additional data to the index that will allow us to efficiently multi- thread the loading and conversion of cache entries. It accomplishes this by adding an (optional) index extension that is a table of offsets to blocks of cache entries in the index file. To make this work for V4 indexes, when writing the cache entries, it periodically "resets" the prefix-compression by encoding the current entry as if the path name for the previous entry is completely different and saves the offset of that entry in the IEOT. Basically, with V4 indexes, it generates offsets into blocks of prefix-compressed entries. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-11eoie: add End of Index Entry (EOIE) extensionLibravatar Ben Peart1-0/+23
The End of Index Entry (EOIE) is used to locate the end of the variable length index entries and the beginning of the extensions. Code can take advantage of this to quickly locate the index extensions without having to parse through all of the index entries. The EOIE extension is always written out to the index file including to the shared index when using the split index feature. Because it is always written out, the SHA checksums in t/t1700-split-index.sh were updated to reflect its inclusion. It is written as an optional extension to ensure compatibility with other git implementations that do not yet support it. It is always written out to ensure it is available as often as possible to speed up index operations. Because it must be able to be loaded before the variable length cache entries and other index extensions, this extension must be written last. The signature for this extension is { 'E', 'O', 'I', 'E' }. The extension consists of: - 32-bit offset to the end of the index entries - 160-bit SHA-1 over the extension types and their sizes (but not their contents). E.g. if we have "TREE" extension that is N-bytes long, "REUC" extension that is M-bytes long, followed by "EOIE", then the hash would be: SHA-1("TREE" + <binary representation of N> + "REUC" + <binary representation of M>) Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01fsmonitor: add documentation for the fsmonitor extension.Libravatar Ben Peart1-0/+19
This includes the core.fsmonitor setting, the fsmonitor integration hook, and the fsmonitor index extension. Also add documentation for the new fsmonitor options to ls-files and update-index. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-29Merge branch 'jc/em-dash-in-doc'Libravatar Junio C Hamano1-1/+1
AsciiDoc markup fixes. * jc/em-dash-in-doc: Documentation: AsciiDoc spells em-dash as double-dashes, not triple
2015-10-22Documentation: AsciiDoc spells em-dash as double-dashes, not tripleLibravatar Junio C Hamano1-1/+1
Again, we do not usually process release notes with AsciiDoc, but it is better to be consistent. This incidentally reveals breakages left by an ancient 5e00439f (Documentation: build html for all files in technical and howto, 2012-10-23). The index-format documentation was originally written to be read as straight text without formatting and when the commit forced everything in Documentation/ to go through AsciiDoc, it did not do any adjustment--hence the double-dashes will be seen in the resulting text that is rendered as preformatted fixed-width without converted into em-dashes. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-17Merge branch 'ta/docfix-index-format-tech'Libravatar Junio C Hamano1-1/+1
* ta/docfix-index-format-tech: typofix for index-format.txt
2015-07-28typofix for index-format.txtLibravatar Thomas Ackermann1-1/+1
Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: guard and disable on system changesLibravatar Nguyễn Thái Ngọc Duy1-0/+4
If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. This adds a new requirement that setup_git_directory* must be called before read_cache() because we need worktree location by then, or the cache is dropped. This change does not cover all bases, you can fool it if you try hard. The point is to stop accidents. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: save to an index extensionLibravatar Nguyễn Thái Ngọc Duy1-0/+58
Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22Merge branch 'nd/split-index'Libravatar Junio C Hamano1-1/+1
A typofix to the documentation of a feature already in the release. * nd/split-index: index-format.txt: add a missing closing quote
2014-12-11index-format.txt: add a missing closing quoteLibravatar Nguyễn Thái Ngọc Duy1-1/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-04Documentation: typofixesLibravatar Thomas Ackermann1-1/+1
In addition to fixing trivial and obvious typos, be careful about the following points: - Spell ASCII, URL and CRC in ALL CAPS; - Spell Linux as Capitalized; - Do not omit periods in "i.e." and "e.g.". Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-13read-cache: split-index modeLibravatar Nguyễn Thái Ngọc Duy1-0/+35
This split-index mode is designed to keep write cost proportional to the number of changes the user has made, not the size of the work tree. (Read cost is another matter, to be dealt separately.) This mode stores index info in a pair of $GIT_DIR/index and $GIT_DIR/sharedindex.<SHA-1>. sharedindex is large and unchanged over time while "index" is smaller and updated often. Format details are in index-format.txt, although not everything is implemented in this patch. Shared indexes are not automatically removed, because it's unclear if the shared index is needed by any (even temporary) indexes by just looking at it. After a while you'll collect stale shared indexes. The good news is one shared index is useable for long, until $GIT_DIR/index becomes too big and sluggish that the new shared index must be created. The safest way to clean shared indexes is to turn off split index mode, so shared files are all garbage, delete them all, then turn on split index mode again. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22typofix: documentationLibravatar Ondřej Bílka1-1/+1
Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-19Merge branch 'nd/doc-index-format'Libravatar Junio C Hamano1-3/+3
Update the index format documentation to mention the v4 format. * nd/doc-index-format: update-index: list supported idx versions and their features read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr() index-format.txt: mention of v4 is missing in some places
2013-02-22index-format.txt: mention of v4 is missing in some placesLibravatar Nguyễn Thái Ngọc Duy1-3/+3
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01Documentation: the name of the system is 'Git', not 'git'Libravatar Thomas Ackermann1-1/+1
Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01Documentation: avoid poor-man's small caps GITLibravatar Thomas Ackermann1-3/+3
In the earlier days, we used to spell the name of the system as GIT, to simulate as if it were typeset with capital G and IT in small caps. Later we stopped doing so at around 1.6.5 days. Let's stop doing so throughout the documentation. The name to refer to the whole system (and the concept it embodies) is "Git"; the command end-users type is "git". And document this in the coding guideline. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-21Merge branch 'nd/index-format-doc'Libravatar Junio C Hamano1-2/+3
* nd/index-format-doc: index-format.txt: clarify what is "invalid"
2012-12-13index-format.txt: clarify what is "invalid"Libravatar Nguyễn Thái Ngọc Duy1-2/+3
A cache-tree entry with a negative entry count is considered invalid by the current Git; it records that we do not know the object name of a tree that would result by writing the directory covered by the cache-tree as a tree object. Clarify that any entry with a negative entry count is invalid, but the implementations must write -1 there. This way, we can later decide to allow writers to use negative values other than -1 to encode optional information on such invalidated entries without harming interoperability; we do not know what will be encoded and how, so we keep these other negative values as reserved for now. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-16Documentation/technical: convert plain text files to asciidocLibravatar Thomas Ackermann1-1/+1
These were not originally meant for asciidoc, but they are already so close. Mark them up in asciidoc. Signed-off-by: Thomas Ackermann <th.acker@arcor.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-04-27index-v4: document the entry formatLibravatar Junio C Hamano1-0/+13
Document the format so that others can learn from and build on top of the series. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-31Documentation: clarify the invalidated tree entry formatLibravatar Carlos Martín Nieto1-2/+3
When the entry_count is -1, the tree is invalidated and therefore has not associated hash (or object name). Explicitly state that the next entry starts after the newline. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-23doc: technical details about the index file formatLibravatar Junio C Hamano1-37/+57
* Clarify "string of unsigned bytes"; * Blob has two variants (regular file vs symlink), not (blob vs symlink); * Clarify permission mode bits; * Clarify ce_namelen() "too long to fit in the length field" case; * Clarify "." etc are forbidden as path components; * Match the description with the internal wording "cache-tree"; * All types of extension begin with signature and length as explained in the first part. Don't repeat the "length" part in the description of each extension (can be mistaken as if there is a separate 32-bit size field inside the extension), but state what the signature for each extension is. * Don't say "Extension tag", as we have said "Extension signature" in the first part---be consistent; * Clarify the invalidation of cache-tree entries; * Correct description on subtree_nr field in the cache-tree; * Clarify the order of entries in cache-tree; Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-27doc: technical details about the index file formatLibravatar Nguyễn Thái Ngọc Duy1-0/+165
This bases on the original work by Robin Rosenberg. Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>