diff options
Diffstat (limited to 'Documentation/git-pack-objects.txt')
-rw-r--r-- | Documentation/git-pack-objects.txt | 123 |
1 files changed, 117 insertions, 6 deletions
diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt index d95b472d16..54d715ead1 100644 --- a/Documentation/git-pack-objects.txt +++ b/Documentation/git-pack-objects.txt @@ -14,7 +14,7 @@ SYNOPSIS [--local] [--incremental] [--window=<n>] [--depth=<n>] [--revs [--unpacked | --all]] [--keep-pack=<pack-name>] [--stdout [--filter=<filter-spec>] | base-name] - [--shallow] [--keep-true-parents] < object-list + [--shallow] [--keep-true-parents] [--[no-]sparse] < object-list DESCRIPTION @@ -131,7 +131,7 @@ depth is 4095. --keep-pack=<pack-name>:: This flag causes an object already in the given pack to be ignored, even if it would have otherwise been - packed. `<pack-name>` is the the pack file name without + packed. `<pack-name>` is the pack file name without leading directory (e.g. `pack-123.pack`). The option could be specified multiple times to keep multiple packs. @@ -196,6 +196,17 @@ depth is 4095. Add --no-reuse-object if you want to force a uniform compression level on all data no matter the source. +--[no-]sparse:: + Toggle the "sparse" algorithm to determine which objects to include in + the pack, when combined with the "--revs" option. This algorithm + only walks trees that appear in paths that introduce new objects. + This can have significant performance benefits when computing + a pack to send a small change. However, it is possible that extra + objects are added to the pack-file if the included commits contain + certain types of direct renames. If this option is not included, + it defaults to the value of `pack.useSparse`, which is true unless + otherwise specified. + --thin:: Create a "thin" pack by omitting the common objects between a sender and a receiver in order to reduce network transfer. This @@ -259,15 +270,18 @@ So does `git bundle` (see linkgit:git-bundle[1]) when it creates a bundle. This option specifies how missing objects are handled. + The form '--missing=error' requests that pack-objects stop with an error if -a missing object is encountered. This is the default action. +a missing object is encountered. If the repository is a partial clone, an +attempt to fetch missing objects will be made before declaring them missing. +This is the default action. + The form '--missing=allow-any' will allow object traversal to continue -if a missing object is encountered. Missing objects will silently be -omitted from the results. +if a missing object is encountered. No fetch of a missing object will occur. +Missing objects will silently be omitted from the results. + The form '--missing=allow-promisor' is like 'allow-any', but will only allow object traversal to continue for EXPECTED promisor missing objects. -Unexpected missing object will raise an error. +No fetch of a missing object will occur. An unexpected missing object will +raise an error. --exclude-promisor-objects:: Omit objects that are known to be in the promisor remote. (This @@ -289,6 +303,103 @@ Unexpected missing object will raise an error. --unpack-unreachable:: Keep unreachable objects in loose form. This implies `--revs`. +--delta-islands:: + Restrict delta matches based on "islands". See DELTA ISLANDS + below. + + +DELTA ISLANDS +------------- + +When possible, `pack-objects` tries to reuse existing on-disk deltas to +avoid having to search for new ones on the fly. This is an important +optimization for serving fetches, because it means the server can avoid +inflating most objects at all and just send the bytes directly from +disk. This optimization can't work when an object is stored as a delta +against a base which the receiver does not have (and which we are not +already sending). In that case the server "breaks" the delta and has to +find a new one, which has a high CPU cost. Therefore it's important for +performance that the set of objects in on-disk delta relationships match +what a client would fetch. + +In a normal repository, this tends to work automatically. The objects +are mostly reachable from the branches and tags, and that's what clients +fetch. Any deltas we find on the server are likely to be between objects +the client has or will have. + +But in some repository setups, you may have several related but separate +groups of ref tips, with clients tending to fetch those groups +independently. For example, imagine that you are hosting several "forks" +of a repository in a single shared object store, and letting clients +view them as separate repositories through `GIT_NAMESPACE` or separate +repos using the alternates mechanism. A naive repack may find that the +optimal delta for an object is against a base that is only found in +another fork. But when a client fetches, they will not have the base +object, and we'll have to find a new delta on the fly. + +A similar situation may exist if you have many refs outside of +`refs/heads/` and `refs/tags/` that point to related objects (e.g., +`refs/pull` or `refs/changes` used by some hosting providers). By +default, clients fetch only heads and tags, and deltas against objects +found only in those other groups cannot be sent as-is. + +Delta islands solve this problem by allowing you to group your refs into +distinct "islands". Pack-objects computes which objects are reachable +from which islands, and refuses to make a delta from an object `A` +against a base which is not present in all of `A`'s islands. This +results in slightly larger packs (because we miss some delta +opportunities), but guarantees that a fetch of one island will not have +to recompute deltas on the fly due to crossing island boundaries. + +When repacking with delta islands the delta window tends to get +clogged with candidates that are forbidden by the config. Repacking +with a big --window helps (and doesn't take as long as it otherwise +might because we can reject some object pairs based on islands before +doing any computation on the content). + +Islands are configured via the `pack.island` option, which can be +specified multiple times. Each value is a left-anchored regular +expressions matching refnames. For example: + +------------------------------------------- +[pack] +island = refs/heads/ +island = refs/tags/ +------------------------------------------- + +puts heads and tags into an island (whose name is the empty string; see +below for more on naming). Any refs which do not match those regular +expressions (e.g., `refs/pull/123`) is not in any island. Any object +which is reachable only from `refs/pull/` (but not heads or tags) is +therefore not a candidate to be used as a base for `refs/heads/`. + +Refs are grouped into islands based on their "names", and two regexes +that produce the same name are considered to be in the same +island. The names are computed from the regexes by concatenating any +capture groups from the regex, with a '-' dash in between. (And if +there are no capture groups, then the name is the empty string, as in +the above example.) This allows you to create arbitrary numbers of +islands. Only up to 14 such capture groups are supported though. + +For example, imagine you store the refs for each fork in +`refs/virtual/ID`, where `ID` is a numeric identifier. You might then +configure: + +------------------------------------------- +[pack] +island = refs/virtual/([0-9]+)/heads/ +island = refs/virtual/([0-9]+)/tags/ +island = refs/virtual/([0-9]+)/(pull)/ +------------------------------------------- + +That puts the heads and tags for each fork in their own island (named +"1234" or similar), and the pull refs for each go into their own +"1234-pull". + +Note that we pick a single island for each regex to go into, using "last +one wins" ordering (which allows repo-specific config to take precedence +over user-wide config, and so forth). + SEE ALSO -------- linkgit:git-rev-list[1] |