diff options
Diffstat (limited to 'Documentation/technical')
-rw-r--r-- | Documentation/technical/api-directory-listing.txt | 6 | ||||
-rw-r--r-- | Documentation/technical/api-ref-iteration.txt | 2 | ||||
-rw-r--r-- | Documentation/technical/api-trace2.txt | 31 | ||||
-rw-r--r-- | Documentation/technical/api-tree-walking.txt | 8 | ||||
-rw-r--r-- | Documentation/technical/hash-function-transition.txt | 2 | ||||
-rw-r--r-- | Documentation/technical/partial-clone.txt | 117 | ||||
-rw-r--r-- | Documentation/technical/protocol-v2.txt | 2 |
7 files changed, 119 insertions, 49 deletions
diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt index 5abb8e8b1f..76b6e4f71b 100644 --- a/Documentation/technical/api-directory-listing.txt +++ b/Documentation/technical/api-directory-listing.txt @@ -111,11 +111,11 @@ marked. If you to exclude files, make sure you have loaded index first. * Prepare `struct dir_struct dir` and clear it with `memset(&dir, 0, sizeof(dir))`. -* To add single exclude pattern, call `add_exclude_list()` and then - `add_exclude()`. +* To add single exclude pattern, call `add_pattern_list()` and then + `add_pattern()`. * To add patterns from a file (e.g. `.git/info/exclude`), call - `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`. A + `add_patterns_from_file()` , and/or set `dir.exclude_per_dir`. A short-hand function `setup_standard_excludes()` can be used to set up the standard set of exclude settings. diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt index 46c3d5c355..ad9d019ff9 100644 --- a/Documentation/technical/api-ref-iteration.txt +++ b/Documentation/technical/api-ref-iteration.txt @@ -54,7 +54,7 @@ this: do not do this you will get an error for each ref that it does not point to a valid object. -Note: As a side-effect of this you can not safely assume that all +Note: As a side-effect of this you cannot safely assume that all objects you lookup are available in superproject. All submodule objects will be available the same way as the superprojects objects. diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt index 71eb081fed..a045dbe422 100644 --- a/Documentation/technical/api-trace2.txt +++ b/Documentation/technical/api-trace2.txt @@ -128,7 +128,7 @@ yields ------------ $ cat ~/log.event -{"event":"version","sid":"sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"1","exe":"2.20.1.155.g426c96fcdb"} +{"event":"version","sid":"sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"2","exe":"2.20.1.155.g426c96fcdb"} {"event":"start","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621027Z","file":"common-main.c","line":39,"t_abs":0.001173,"argv":["git","version"]} {"event":"cmd_name","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621122Z","file":"git.c","line":432,"name":"version","hierarchy":"version"} {"event":"exit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621236Z","file":"git.c","line":662,"t_abs":0.001227,"code":0} @@ -142,10 +142,9 @@ system or global config value to one of the following: include::../trace2-target-values.txt[] -If the target already exists and is a directory, the traces will be -written to files (one per process) underneath the given directory. They -will be named according to the last component of the SID (optionally -followed by a counter to avoid filename collisions). +When trace files are written to a target directory, they will be named according +to the last component of the SID (optionally followed by a counter to avoid +filename collisions). == Trace2 API @@ -605,17 +604,35 @@ only present on the "start" and "atexit" events. ==== Event-Specific Key/Value Pairs `"version"`:: - This event gives the version of the executable and the EVENT format. + This event gives the version of the executable and the EVENT format. It + should always be the first event in a trace session. The EVENT format + version will be incremented if new event types are added, if existing + fields are removed, or if there are significant changes in + interpretation of existing events or fields. Smaller changes, such as + adding a new field to an existing event, will not require an increment + to the EVENT format version. + ------------ { "event":"version", ... - "evt":"1", # EVENT format version + "evt":"2", # EVENT format version "exe":"2.20.1.155.g426c96fcdb" # git version } ------------ +`"discard"`:: + This event is written to the git-trace2-discard sentinel file if there + are too many files in the target trace directory (see the + trace2.maxFiles config option). ++ +------------ +{ + "event":"discard", + ... +} +------------ + `"start"`:: This event contains the complete argv received by main(). + diff --git a/Documentation/technical/api-tree-walking.txt b/Documentation/technical/api-tree-walking.txt index bde18622a8..7962e32854 100644 --- a/Documentation/technical/api-tree-walking.txt +++ b/Documentation/technical/api-tree-walking.txt @@ -62,9 +62,7 @@ Initializing `setup_traverse_info`:: Initialize a `traverse_info` given the pathname of the tree to start - traversing from. The `base` argument is assumed to be the `path` - member of the `name_entry` being recursed into unless the tree is a - top-level tree in which case the empty string ("") is used. + traversing from. Walking ------- @@ -140,6 +138,10 @@ same in the next callback invocation. This utilizes the memory structure of a tree entry to avoid the overhead of using a generic strlen(). +`strbuf_make_traverse_path`:: + + Convenience wrapper to `make_traverse_path` into a strbuf. + Authors ------- diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt index bc2ace2a6e..2ae8fa470a 100644 --- a/Documentation/technical/hash-function-transition.txt +++ b/Documentation/technical/hash-function-transition.txt @@ -456,7 +456,7 @@ packfile marked as UNREACHABLE_GARBAGE (using the PSRC field; see below). To avoid the race when writing new objects referring to an about-to-be-deleted object, code paths that write new objects will need to copy any objects from UNREACHABLE_GARBAGE packs that they -refer to to new, non-UNREACHABLE_GARBAGE packs (or loose objects). +refer to new, non-UNREACHABLE_GARBAGE packs (or loose objects). UNREACHABLE_GARBAGE are then safe to delete if their creation time (as indicated by the file's mtime) is long enough ago. diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt index 896c7b3878..210373e258 100644 --- a/Documentation/technical/partial-clone.txt +++ b/Documentation/technical/partial-clone.txt @@ -30,12 +30,20 @@ advance* during clone and fetch operations and thereby reduce download times and disk usage. Missing objects can later be "demand fetched" if/when needed. +A remote that can later provide the missing objects is called a +promisor remote, as it promises to send the objects when +requested. Initialy Git supported only one promisor remote, the origin +remote from which the user cloned and that was configured in the +"extensions.partialClone" config option. Later support for more than +one promisor remote has been implemented. + Use of partial clone requires that the user be online and the origin -remote be available for on-demand fetching of missing objects. This may -or may not be problematic for the user. For example, if the user can -stay within the pre-selected subset of the source tree, they may not -encounter any missing objects. Alternatively, the user could try to -pre-fetch various objects if they know that they are going offline. +remote or other promisor remotes be available for on-demand fetching +of missing objects. This may or may not be problematic for the user. +For example, if the user can stay within the pre-selected subset of +the source tree, they may not encounter any missing objects. +Alternatively, the user could try to pre-fetch various objects if they +know that they are going offline. Non-Goals @@ -100,18 +108,18 @@ or commits that reference missing trees. Handling Missing Objects ------------------------ -- An object may be missing due to a partial clone or fetch, or missing due - to repository corruption. To differentiate these cases, the local - repository specially indicates such filtered packfiles obtained from the - promisor remote as "promisor packfiles". +- An object may be missing due to a partial clone or fetch, or missing + due to repository corruption. To differentiate these cases, the + local repository specially indicates such filtered packfiles + obtained from promisor remotes as "promisor packfiles". + These promisor packfiles consist of a "<name>.promisor" file with arbitrary contents (like the "<name>.keep" files), in addition to their "<name>.pack" and "<name>.idx" files. - The local repository considers a "promisor object" to be an object that - it knows (to the best of its ability) that the promisor remote has promised - that it has, either because the local repository has that object in one of + it knows (to the best of its ability) that promisor remotes have promised + that they have, either because the local repository has that object in one of its promisor packfiles, or because another promisor object refers to it. + When Git encounters a missing object, Git can see if it is a promisor object @@ -123,12 +131,12 @@ expensive-to-modify list of missing objects.[a] - Since almost all Git code currently expects any referenced object to be present locally and because we do not want to force every command to do a dry-run first, a fallback mechanism is added to allow Git to attempt - to dynamically fetch missing objects from the promisor remote. + to dynamically fetch missing objects from promisor remotes. + When the normal object lookup fails to find an object, Git invokes -fetch-object to try to get the object from the server and then retry -the object lookup. This allows objects to be "faulted in" without -complicated prediction algorithms. +promisor_remote_get_direct() to try to get the object from a promisor +remote and then retry the object lookup. This allows objects to be +"faulted in" without complicated prediction algorithms. + For efficiency reasons, no check as to whether the missing object is actually a promisor object is performed. @@ -157,8 +165,7 @@ and prefetch those objects in bulk. + We are not happy with this global variable and would like to remove it, but that requires significant refactoring of the object code to pass an -additional flag. We hope that concurrent efforts to add an ODB API can -encompass this. +additional flag. Fetching Missing Objects @@ -182,21 +189,63 @@ has been updated to not use any object flags when the corresponding argument though they are not necessary. +Using many promisor remotes +--------------------------- + +Many promisor remotes can be configured and used. + +This allows for example a user to have multiple geographically-close +cache servers for fetching missing blobs while continuing to do +filtered `git-fetch` commands from the central server. + +When fetching objects, promisor remotes are tried one after the other +until all the objects have been fetched. + +Remotes that are considered "promisor" remotes are those specified by +the following configuration variables: + +- `extensions.partialClone = <name>` + +- `remote.<name>.promisor = true` + +- `remote.<name>.partialCloneFilter = ...` + +Only one promisor remote can be configured using the +`extensions.partialClone` config variable. This promisor remote will +be the last one tried when fetching objects. + +We decided to make it the last one we try, because it is likely that +someone using many promisor remotes is doing so because the other +promisor remotes are better for some reason (maybe they are closer or +faster for some kind of objects) than the origin, and the origin is +likely to be the remote specified by extensions.partialClone. + +This justification is not very strong, but one choice had to be made, +and anyway the long term plan should be to make the order somehow +fully configurable. + +For now though the other promisor remotes will be tried in the order +they appear in the config file. + Current Limitations ------------------- -- The remote used for a partial clone (or the first partial fetch - following a regular clone) is marked as the "promisor remote". +- It is not possible to specify the order in which the promisor + remotes are tried in other ways than the order in which they appear + in the config file. + -We are currently limited to a single promisor remote and only that -remote may be used for subsequent partial fetches. +It is also not possible to specify an order to be used when fetching +from one remote and a different order when fetching from another +remote. + +- It is not possible to push only specific objects to a promisor + remote. + -We accept this limitation because we believe initial users of this -feature will be using it on repositories with a strong single central -server. +It is not possible to push at the same time to multiple promisor +remote in a specific order. -- Dynamic object fetching will only ask the promisor remote for missing - objects. We assume that the promisor remote has a complete view of the +- Dynamic object fetching will only ask promisor remotes for missing + objects. We assume that promisor remotes have a complete view of the repository and can satisfy all such requests. - Repack essentially treats promisor and non-promisor packfiles as 2 @@ -218,15 +267,17 @@ server. Future Work ----------- -- Allow more than one promisor remote and define a strategy for fetching - missing objects from specific promisor remotes or of iterating over the - set of promisor remotes until a missing object is found. +- Improve the way to specify the order in which promisor remotes are + tried. + -A user might want to have multiple geographically-close cache servers -for fetching missing blobs while continuing to do filtered `git-fetch` -commands from the central server, for example. +For example this could allow to specify explicitly something like: +"When fetching from this remote, I want to use these promisor remotes +in this order, though, when pushing or fetching to that remote, I want +to use those promisor remotes in that order." + +- Allow pushing to promisor remotes. + -Or the user might want to work in a triangular work flow with multiple +The user might want to work in a triangular work flow with multiple promisor remotes that each have an incomplete view of the repository. - Allow repack to work on promisor packfiles (while keeping them distinct diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index 03264c7d9a..40f91f6b1e 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -141,7 +141,7 @@ Capabilities ------------ There are two different types of capabilities: normal capabilities, -which can be used to to convey information or alter the behavior of a +which can be used to convey information or alter the behavior of a request, and commands, which are the core actions that a client wants to perform (fetch, push, etc). |