Diffstat (limited to 'Documentation/technical')
-rw-r--r-- | Documentation/technical/api-config.txt | 319
-rw-r--r-- | Documentation/technical/api-directory-listing.txt | 6
-rw-r--r-- | Documentation/technical/api-grep.txt | 8
-rw-r--r-- | Documentation/technical/api-object-access.txt | 15
-rw-r--r-- | Documentation/technical/api-quote.txt | 10
-rw-r--r-- | Documentation/technical/api-submodule-config.txt | 2
-rw-r--r-- | Documentation/technical/api-trace2.txt | 45
-rw-r--r-- | Documentation/technical/api-tree-walking.txt | 8
-rw-r--r-- | Documentation/technical/api-xdiff-interface.txt | 7
-rw-r--r-- | Documentation/technical/commit-graph.txt | 22
-rw-r--r-- | Documentation/technical/hash-function-transition.txt | 18
-rw-r--r-- | Documentation/technical/index-format.txt | 4
-rw-r--r-- | Documentation/technical/multi-pack-index.txt | 4
-rw-r--r-- | Documentation/technical/pack-protocol.txt | 2
-rw-r--r-- | Documentation/technical/partial-clone.txt | 129
-rw-r--r-- | Documentation/technical/protocol-v2.txt | 2
-rw-r--r-- | Documentation/technical/racy-git.txt | 2
-rw-r--r-- | Documentation/technical/rerere.txt | 2
18 files changed, 158 insertions, 447 deletions
diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt deleted file mode 100644 index 7d20716c32..0000000000 --- a/Documentation/technical/api-config.txt +++ /dev/null @@ -1,319 +0,0 @@ -config API -========== - -The config API gives callers a way to access Git configuration files -(and files which have the same syntax). See linkgit:git-config[1] for a -discussion of the config file syntax. - -General Usage -------------- - -Config files are parsed linearly, and each variable found is passed to a -caller-provided callback function. The callback function is responsible -for any actions to be taken on the config option, and is free to ignore -some options. It is not uncommon for the configuration to be parsed -several times during the run of a Git program, with different callbacks -picking out different variables useful to themselves. - -A config callback function takes three parameters: - -- the name of the parsed variable. This is in canonical "flat" form: the - section, subsection, and variable segments will be separated by dots, - and the section and variable segments will be all lowercase. E.g., - `core.ignorecase`, `diff.SomeType.textconv`. - -- the value of the found variable, as a string. If the variable had no - value specified, the value will be NULL (typically this means it - should be interpreted as boolean true). - -- a void pointer passed in by the caller of the config API; this can - contain callback-specific data - -A config callback should return 0 for success, or -1 if the variable -could not be parsed properly. - -Basic Config Querying ---------------------- - -Most programs will simply want to look up variables in all config files -that Git knows about, using the normal precedence rules. To do this, -call `git_config` with a callback function and void data pointer. - -`git_config` will read all config sources in order of increasing -priority. 
Thus a callback should typically overwrite previously-seen -entries with new ones (e.g., if both the user-wide `~/.gitconfig` and -repo-specific `.git/config` contain `color.ui`, the config machinery -will first feed the user-wide one to the callback, and then the -repo-specific one; by overwriting, the higher-priority repo-specific -value is left at the end). - -The `config_with_options` function lets the caller examine config -while adjusting some of the default behavior of `git_config`. It should -almost never be used by "regular" Git code that is looking up -configuration variables. It is intended for advanced callers like -`git-config`, which are intentionally tweaking the normal config-lookup -process. It takes two extra parameters: - -`config_source`:: -If this parameter is non-NULL, it specifies the source to parse for -configuration, rather than looking in the usual files. See `struct -git_config_source` in `config.h` for details. Regular `git_config` defaults -to `NULL`. - -`opts`:: -Specify options to adjust the behavior of parsing config files. See `struct -config_options` in `config.h` for details. As an example: regular `git_config` -sets `opts.respect_includes` to `1` by default. - -Reading Specific Files ----------------------- - -To read a specific file in git-config format, use -`git_config_from_file`. This takes the same callback and data parameters -as `git_config`. - -Querying For Specific Variables -------------------------------- - -For programs wanting to query for specific variables in a non-callback -manner, the config API provides two functions `git_config_get_value` -and `git_config_get_value_multi`. They both read values from an internal -cache generated previously from reading the config files. - -`int git_config_get_value(const char *key, const char **value)`:: - - Finds the highest-priority value for the configuration variable `key`, - stores the pointer to it in `value` and returns 0. 
When the - configuration variable `key` is not found, returns 1 without touching - `value`. The caller should not free or modify `value`, as it is owned - by the cache. - -`const struct string_list *git_config_get_value_multi(const char *key)`:: - - Finds and returns the value list, sorted in order of increasing priority - for the configuration variable `key`. When the configuration variable - `key` is not found, returns NULL. The caller should not free or modify - the returned pointer, as it is owned by the cache. - -`void git_config_clear(void)`:: - - Resets and invalidates the config cache. - -The config API also provides type specific API functions which do conversion -as well as retrieval for the queried variable, including: - -`int git_config_get_int(const char *key, int *dest)`:: - - Finds and parses the value to an integer for the configuration variable - `key`. Dies on error; otherwise, stores the value of the parsed integer in - `dest` and returns 0. When the configuration variable `key` is not found, - returns 1 without touching `dest`. - -`int git_config_get_ulong(const char *key, unsigned long *dest)`:: - - Similar to `git_config_get_int` but for unsigned longs. - -`int git_config_get_bool(const char *key, int *dest)`:: - - Finds and parses the value into a boolean value, for the configuration - variable `key` respecting keywords like "true" and "false". Integer - values are converted into true/false values (when they are non-zero or - zero, respectively). Other values cause a die(). If parsing is successful, - stores the value of the parsed result in `dest` and returns 0. When the - configuration variable `key` is not found, returns 1 without touching - `dest`. - -`int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest)`:: - - Similar to `git_config_get_bool`, except that integers are copied as-is, - and `is_bool` flag is unset. 
- -`int git_config_get_maybe_bool(const char *key, int *dest)`:: - - Similar to `git_config_get_bool`, except that it returns -1 on error - rather than dying. - -`int git_config_get_string_const(const char *key, const char **dest)`:: - - Allocates and copies the retrieved string into the `dest` parameter for - the configuration variable `key`; if NULL string is given, prints an - error message and returns -1. When the configuration variable `key` is - not found, returns 1 without touching `dest`. - -`int git_config_get_string(const char *key, char **dest)`:: - - Similar to `git_config_get_string_const`, except that retrieved value - copied into the `dest` parameter is a mutable string. - -`int git_config_get_pathname(const char *key, const char **dest)`:: - - Similar to `git_config_get_string`, but expands `~` or `~user` into - the user's home directory when found at the beginning of the path. - -`git_die_config(const char *key, const char *err, ...)`:: - - First prints the error message specified by the caller in `err` and then - dies printing the line number and the file name of the highest priority - value for the configuration variable `key`. - -`void git_die_config_linenr(const char *key, const char *filename, int linenr)`:: - - Helper function which formats the die error message according to the - parameters entered. Used by `git_die_config()`. It can be used by callers - handling `git_config_get_value_multi()` to print the correct error message - for the desired value. - -See test-config.c for usage examples. - -Value Parsing Helpers ---------------------- - -To aid in parsing string values, the config API provides callbacks with -a number of helper functions, including: - -`git_config_int`:: -Parse the string to an integer, including unit factors. Dies on error; -otherwise, returns the parsed result. - -`git_config_ulong`:: -Identical to `git_config_int`, but for unsigned longs. 
- -`git_config_bool`:: -Parse a string into a boolean value, respecting keywords like "true" and -"false". Integer values are converted into true/false values (when they -are non-zero or zero, respectively). Other values cause a die(). If -parsing is successful, the return value is the result. - -`git_config_bool_or_int`:: -Same as `git_config_bool`, except that integers are returned as-is, and -an `is_bool` flag is unset. - -`git_parse_maybe_bool`:: -Same as `git_config_bool`, except that it returns -1 on error rather -than dying. - -`git_config_string`:: -Allocates and copies the value string into the `dest` parameter; if no -string is given, prints an error message and returns -1. - -`git_config_pathname`:: -Similar to `git_config_string`, but expands `~` or `~user` into the -user's home directory when found at the beginning of the path. - -Include Directives ------------------- - -By default, the config parser does not respect include directives. -However, a caller can use the special `git_config_include` wrapper -callback to support them. To do so, you simply wrap your "real" callback -function and data pointer in a `struct config_include_data`, and pass -the wrapper to the regular config-reading functions. For example: - -------------------------------------------- -int read_file_with_include(const char *file, config_fn_t fn, void *data) -{ - struct config_include_data inc = CONFIG_INCLUDE_INIT; - inc.fn = fn; - inc.data = data; - return git_config_from_file(git_config_include, file, &inc); -} -------------------------------------------- - -`git_config` respects includes automatically. The lower-level -`git_config_from_file` does not. - -Custom Configsets ------------------ - -A `config_set` can be used to construct an in-memory cache for -config-like files that the caller specifies (i.e., files like `.gitmodules`, -`~/.gitconfig` etc.). 
For example, - ----------------------------------------- -struct config_set gm_config; -git_configset_init(&gm_config); -int b; -/* we add config files to the config_set */ -git_configset_add_file(&gm_config, ".gitmodules"); -git_configset_add_file(&gm_config, ".gitmodules_alt"); - -if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) { - /* hack hack hack */ -} - -/* when we are done with the configset */ -git_configset_clear(&gm_config); ----------------------------------------- - -Configset API provides functions for the above mentioned work flow, including: - -`void git_configset_init(struct config_set *cs)`:: - - Initializes the config_set `cs`. - -`int git_configset_add_file(struct config_set *cs, const char *filename)`:: - - Parses the file and adds the variable-value pairs to the `config_set`, - dies if there is an error in parsing the file. Returns 0 on success, or - -1 if the file does not exist or is inaccessible. The user has to decide - if he wants to free the incomplete configset or continue using it when - the function returns -1. - -`int git_configset_get_value(struct config_set *cs, const char *key, const char **value)`:: - - Finds the highest-priority value for the configuration variable `key` - and config set `cs`, stores the pointer to it in `value` and returns 0. - When the configuration variable `key` is not found, returns 1 without - touching `value`. The caller should not free or modify `value`, as it - is owned by the cache. - -`const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)`:: - - Finds and returns the value list, sorted in order of increasing priority - for the configuration variable `key` and config set `cs`. When the - configuration variable `key` is not found, returns NULL. The caller - should not free or modify the returned pointer, as it is owned by the cache. 
- -`void git_configset_clear(struct config_set *cs)`:: - - Clears `config_set` structure, removes all saved variable-value pairs. - -In addition to above functions, the `config_set` API provides type specific -functions in the vein of `git_config_get_int` and family but with an extra -parameter, pointer to struct `config_set`. -They all behave similarly to the `git_config_get*()` family described in -"Querying For Specific Variables" above. - -Writing Config Files --------------------- - -Git gives multiple entry points in the Config API to write config values to -files namely `git_config_set_in_file` and `git_config_set`, which write to -a specific config file or to `.git/config` respectively. They both take a -key/value pair as parameter. -In the end they both call `git_config_set_multivar_in_file` which takes four -parameters: - -- the name of the file, as a string, to which key/value pairs will be written. - -- the name of key, as a string. This is in canonical "flat" form: the section, - subsection, and variable segments will be separated by dots, and the section - and variable segments will be all lowercase. - E.g., `core.ignorecase`, `diff.SomeType.textconv`. - -- the value of the variable, as a string. If value is equal to NULL, it will - remove the matching key from the config file. - -- the value regex, as a string. It will disregard key/value pairs where value - does not match. - -- a multi_replace value, as an int. If value is equal to zero, nothing or only - one matching key/value is replaced, else all matching key/values (regardless - how many) are removed, before the new pair is written. - -It returns 0 on success. - -Also, there are functions `git_config_rename_section` and -`git_config_rename_section_in_file` with parameters `old_name` and `new_name` -for renaming or removing sections in the config files. If NULL is passed -through `new_name` parameter, the section will be removed from the config file. 
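Reviewer note on the removed api-config.txt text: its "Basic Config Querying" section says that sources are fed to the callback in order of increasing priority, so a callback should overwrite earlier values with later ones. The following is a minimal, self-contained sketch of that pattern only. It does not use Git's headers: the `config_fn_t` typedef mirrors the three-parameter callback shape described in the removed text, while `struct color_settings`, `struct entry`, and `feed_entries` are hypothetical stand-ins invented for illustration (`feed_entries` plays the role that `git_config` would play in real code).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback shape as described in the removed text: name, value, caller data. */
typedef int (*config_fn_t)(const char *var, const char *value, void *data);

/* Hypothetical caller-side state; not a Git structure. */
struct color_settings {
	char ui[16];
};

/* Hypothetical stand-in for one parsed variable from some config source. */
struct entry {
	const char *var;
	const char *value;
};

/*
 * Overwrite on every match: because lower-priority sources are fed first,
 * the value left at the end is the highest-priority one.
 */
static int color_cb(const char *var, const char *value, void *data)
{
	struct color_settings *s = data;

	if (strcmp(var, "color.ui"))
		return 0;		/* not our variable; ignore it */
	if (!value)
		value = "true";		/* no value given: treat as boolean true */
	if (strlen(value) >= sizeof(s->ui))
		return -1;		/* could not be parsed/stored properly */
	strcpy(s->ui, value);
	return 0;
}

/* Hypothetical driver: feeds entries in order, stopping on the first error. */
static int feed_entries(const struct entry *e, size_t n,
			config_fn_t fn, void *data)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < n; i++)
		if (fn(e[i].var, e[i].value, data) < 0)
			return -1;
	return 0;
}
```

Feeding `color.ui=auto` (as if from the user-wide file) followed by `color.ui=always` (as if from the repository file) through `feed_entries` with `color_cb` leaves `always` in the settings struct, which matches the precedence behaviour the removed paragraph describes.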
diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt index 5abb8e8b1f..76b6e4f71b 100644 --- a/Documentation/technical/api-directory-listing.txt +++ b/Documentation/technical/api-directory-listing.txt @@ -111,11 +111,11 @@ marked. If you to exclude files, make sure you have loaded index first. * Prepare `struct dir_struct dir` and clear it with `memset(&dir, 0, sizeof(dir))`. -* To add single exclude pattern, call `add_exclude_list()` and then - `add_exclude()`. +* To add single exclude pattern, call `add_pattern_list()` and then + `add_pattern()`. * To add patterns from a file (e.g. `.git/info/exclude`), call - `add_excludes_from_file()` , and/or set `dir.exclude_per_dir`. A + `add_patterns_from_file()` , and/or set `dir.exclude_per_dir`. A short-hand function `setup_standard_excludes()` can be used to set up the standard set of exclude settings. diff --git a/Documentation/technical/api-grep.txt b/Documentation/technical/api-grep.txt deleted file mode 100644 index a69cc8964d..0000000000 --- a/Documentation/technical/api-grep.txt +++ /dev/null @@ -1,8 +0,0 @@ -grep API -======== - -Talk about <grep.h>, things like: - -* grep_buffer() - -(JC) diff --git a/Documentation/technical/api-object-access.txt b/Documentation/technical/api-object-access.txt deleted file mode 100644 index 5b29622d00..0000000000 --- a/Documentation/technical/api-object-access.txt +++ /dev/null @@ -1,15 +0,0 @@ -object access API -================= - -Talk about <sha1-file.c> and <object.h> family, things like - -* read_sha1_file() -* read_object_with_reference() -* has_sha1_file() -* write_sha1_file() -* pretend_object_file() -* lookup_{object,commit,tag,blob,tree} -* parse_{object,commit,tag,blob,tree} -* Use of object flags - -(JC, Shawn, Daniel, Dscho, Linus) diff --git a/Documentation/technical/api-quote.txt b/Documentation/technical/api-quote.txt deleted file mode 100644 index e8a1bce94e..0000000000 --- 
a/Documentation/technical/api-quote.txt +++ /dev/null @@ -1,10 +0,0 @@ -quote API -========= - -Talk about <quote.h>, things like - -* sq_quote and unquote -* c_style quote and unquote -* quoting for foreign languages - -(JC) diff --git a/Documentation/technical/api-submodule-config.txt b/Documentation/technical/api-submodule-config.txt index fb06089393..c409559b86 100644 --- a/Documentation/technical/api-submodule-config.txt +++ b/Documentation/technical/api-submodule-config.txt @@ -58,7 +58,7 @@ Functions Whenever a submodule configuration is parsed in `parse_submodule_config_option` via e.g. `gitmodules_config()`, it will overwrite the null_sha1 entry. -So in the normal case, when HEAD:.gitmodules is parsed first and then overlayed +So in the normal case, when HEAD:.gitmodules is parsed first and then overlaid with the repository configuration, the null_sha1 entry contains the local configuration of a submodule (e.g. consolidated values from local git configuration and the .gitmodules file in the worktree). 
diff --git a/Documentation/technical/api-trace2.txt b/Documentation/technical/api-trace2.txt index 71eb081fed..17490b528c 100644 --- a/Documentation/technical/api-trace2.txt +++ b/Documentation/technical/api-trace2.txt @@ -128,7 +128,7 @@ yields ------------ $ cat ~/log.event -{"event":"version","sid":"sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"1","exe":"2.20.1.155.g426c96fcdb"} +{"event":"version","sid":"sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.620713Z","file":"common-main.c","line":38,"evt":"2","exe":"2.20.1.155.g426c96fcdb"} {"event":"start","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621027Z","file":"common-main.c","line":39,"t_abs":0.001173,"argv":["git","version"]} {"event":"cmd_name","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621122Z","file":"git.c","line":432,"name":"version","hierarchy":"version"} {"event":"exit","sid":"20190408T191610.507018Z-H9b68c35f-P000059a8","thread":"main","time":"2019-01-16T17:28:42.621236Z","file":"git.c","line":662,"t_abs":0.001227,"code":0} @@ -142,10 +142,9 @@ system or global config value to one of the following: include::../trace2-target-values.txt[] -If the target already exists and is a directory, the traces will be -written to files (one per process) underneath the given directory. They -will be named according to the last component of the SID (optionally -followed by a counter to avoid filename collisions). +When trace files are written to a target directory, they will be named according +to the last component of the SID (optionally followed by a counter to avoid +filename collisions). == Trace2 API @@ -179,7 +178,7 @@ describe the simplified forms. 
== Public API -All Trace2 API functions send a messsage to all of the active +All Trace2 API functions send a message to all of the active Trace2 Targets. This section describes the set of available messages. @@ -378,7 +377,7 @@ of `pthread_create()`. and the thread elapsed time. + This function must be called by the thread-proc before it returns -(so that the coorect TLS data is used and cleaned up. It should +(so that the correct TLS data is used and cleaned up). It should not be called by the caller of `pthread_join()`. === Region and Data Messages @@ -407,7 +406,7 @@ The `label` field is an arbitrary label used to describe the activity being started, such as "read_recursive" or "do_read_index". + The `repo` field, if set, will be used to get the "repo-id", so that -recursive oerations can be attributed to the correct repository. +recursive operations can be attributed to the correct repository. `void trace2_region_leave(const char *category, const char *label, const struct repository *repo)`:: @@ -422,7 +421,7 @@ This function pops the region nesting stack on the current thread and reports the elapsed time of the stack frame. + The `category`, `label`, and `repo` fields are the same as above. -The `category` and `label` do not need to match the correpsonding +The `category` and `label` do not need to match the corresponding "region_enter" message, but it makes the data stream easier to understand. @@ -605,17 +604,35 @@ only present on the "start" and "atexit" events. ==== Event-Specific Key/Value Pairs `"version"`:: - This event gives the version of the executable and the EVENT format. + This event gives the version of the executable and the EVENT format. It + should always be the first event in a trace session. The EVENT format + version will be incremented if new event types are added, if existing + fields are removed, or if there are significant changes in + interpretation of existing events or fields. 
Smaller changes, such as + adding a new field to an existing event, will not require an increment + to the EVENT format version. + ------------ { "event":"version", ... - "evt":"1", # EVENT format version + "evt":"2", # EVENT format version "exe":"2.20.1.155.g426c96fcdb" # git version } ------------ +`"discard"`:: + This event is written to the git-trace2-discard sentinel file if there + are too many files in the target trace directory (see the + trace2.maxFiles config option). ++ +------------ +{ + "event":"discard", + ... +} +------------ + `"start"`:: This event contains the complete argv received by main(). + @@ -799,7 +816,7 @@ with "?". Note that the session-id of the child process is not available to the current/spawning process, so the child's PID is reported here as a hint for post-processing. (But it is only a hint because the child -proces may be a shell script which doesn't have a session-id.) +process may be a shell script which doesn't have a session-id.) + Note that the `t_rel` field contains the observed run time in seconds for the child process (starting before the fork/exec/spawn and @@ -1159,7 +1176,7 @@ d0 | main | atexit | | 0.028809 | | + Regions may be nested. This causes messages to be indented in the PERF target, for example. -Elapsed times are relative to the start of the correpsonding nesting +Elapsed times are relative to the start of the corresponding nesting level as expected. For example, if we add region message to: + ---------------- @@ -1354,7 +1371,7 @@ d0 | main | atexit | | 0.030027 | | In this example, the preload region took 0.009122 seconds. The 7 threads took between 0.006069 and 0.008947 seconds to work on their portion of the index. Thread "th01" worked on 508 items at offset 0. Thread "th02" -worked on 508 items at offset 2032. Thread "th04" worked on 508 itemts +worked on 508 items at offset 2032. Thread "th04" worked on 508 items at offset 508. 
+ This example also shows that thread names are assigned in a racy manner diff --git a/Documentation/technical/api-tree-walking.txt b/Documentation/technical/api-tree-walking.txt index bde18622a8..7962e32854 100644 --- a/Documentation/technical/api-tree-walking.txt +++ b/Documentation/technical/api-tree-walking.txt @@ -62,9 +62,7 @@ Initializing `setup_traverse_info`:: Initialize a `traverse_info` given the pathname of the tree to start - traversing from. The `base` argument is assumed to be the `path` - member of the `name_entry` being recursed into unless the tree is a - top-level tree in which case the empty string ("") is used. + traversing from. Walking ------- @@ -140,6 +138,10 @@ same in the next callback invocation. This utilizes the memory structure of a tree entry to avoid the overhead of using a generic strlen(). +`strbuf_make_traverse_path`:: + + Convenience wrapper to `make_traverse_path` into a strbuf. + Authors ------- diff --git a/Documentation/technical/api-xdiff-interface.txt b/Documentation/technical/api-xdiff-interface.txt deleted file mode 100644 index 6296ecad1d..0000000000 --- a/Documentation/technical/api-xdiff-interface.txt +++ /dev/null @@ -1,7 +0,0 @@ -xdiff interface API -=================== - -Talk about our calling convention to xdiff library, including -xdiff_emit_consume_fn. - -(Dscho, JC) diff --git a/Documentation/technical/commit-graph.txt b/Documentation/technical/commit-graph.txt index 729fbcb32f..808fa30b99 100644 --- a/Documentation/technical/commit-graph.txt +++ b/Documentation/technical/commit-graph.txt @@ -22,11 +22,11 @@ as "commit-graph" either in the .git/objects/info directory or in the info directory of an alternate. The commit-graph file stores the commit graph structure along with some -extra metadata to speed up graph walks. By listing commit OIDs in lexi- -cographic order, we can identify an integer position for each commit and -refer to the parents of a commit using those integer positions. 
We use -binary search to find initial commits and then use the integer positions -for fast lookups during the walk. +extra metadata to speed up graph walks. By listing commit OIDs in +lexicographic order, we can identify an integer position for each commit +and refer to the parents of a commit using those integer positions. We +use binary search to find initial commits and then use the integer +positions for fast lookups during the walk. A consumer may load the following info for a commit from the graph: @@ -85,7 +85,7 @@ have generation number represented by the macro GENERATION_NUMBER_ZERO = 0. Since the commit-graph file is closed under reachability, we can guarantee the following weaker condition on all commits: - If A and B are commits with generation numbers N amd M, respectively, + If A and B are commits with generation numbers N and M, respectively, and N < M, then A cannot reach B. Note how the strict inequality differs from the inequality when we have @@ -323,14 +323,14 @@ Related Links [0] https://bugs.chromium.org/p/git/issues/detail?id=8 Chromium work item for: Serialized Commit Graph -[1] https://public-inbox.org/git/20110713070517.GC18566@sigill.intra.peff.net/ +[1] https://lore.kernel.org/git/20110713070517.GC18566@sigill.intra.peff.net/ An abandoned patch that introduced generation numbers. -[2] https://public-inbox.org/git/20170908033403.q7e6dj7benasrjes@sigill.intra.peff.net/ +[2] https://lore.kernel.org/git/20170908033403.q7e6dj7benasrjes@sigill.intra.peff.net/ Discussion about generation numbers on commits and how they interact with fsck. -[3] https://public-inbox.org/git/20170908034739.4op3w4f2ma5s65ku@sigill.intra.peff.net/ +[3] https://lore.kernel.org/git/20170908034739.4op3w4f2ma5s65ku@sigill.intra.peff.net/ More discussion about generation numbers and not storing them inside commit objects. A valuable quote: @@ -342,9 +342,9 @@ Related Links commit objects (i.e., packv4 or something like the "metapacks" I proposed a few years ago)." 
-[4] https://public-inbox.org/git/20180108154822.54829-1-git@jeffhostetler.com/T/#u +[4] https://lore.kernel.org/git/20180108154822.54829-1-git@jeffhostetler.com/T/#u A patch to remove the ahead-behind calculation from 'status'. -[5] https://public-inbox.org/git/f27db281-abad-5043-6d71-cbb083b1c877@gmail.com/ +[5] https://lore.kernel.org/git/f27db281-abad-5043-6d71-cbb083b1c877@gmail.com/ A discussion of a "two-dimensional graph position" that can allow reading multiple commit-graph chains at the same time. diff --git a/Documentation/technical/hash-function-transition.txt b/Documentation/technical/hash-function-transition.txt index 2ae8fa470a..5b2db3be1e 100644 --- a/Documentation/technical/hash-function-transition.txt +++ b/Documentation/technical/hash-function-transition.txt @@ -531,7 +531,7 @@ Until Git protocol gains SHA-256 support, using SHA-256 based storage on public-facing Git servers is strongly discouraged. Once Git protocol gains SHA-256 support, SHA-256 based servers are likely not to support SHA-1 compatibility, to avoid what may be a very expensive -hash reencode during clone and to encourage peers to modernize. +hash re-encode during clone and to encourage peers to modernize. The design described here allows fetches by SHA-1 clients of a personal SHA-256 repository because it's not much more difficult than @@ -602,7 +602,7 @@ git --output-format=sha1 log abac87a^{sha1}..f787cac^{sha256} Choice of Hash -------------- -In early 2005, around the time that Git was written, Xiaoyun Wang, +In early 2005, around the time that Git was written, Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu announced an attack finding SHA-1 collisions in 2^69 operations. In August they published details. Luckily, no practical demonstrations of a collision in full SHA-1 were @@ -730,7 +730,7 @@ adoption. Using hash functions in parallel ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -(e.g. https://public-inbox.org/git/22708.8913.864049.452252@chiark.greenend.org.uk/ ) +(e.g. 
https://lore.kernel.org/git/22708.8913.864049.452252@chiark.greenend.org.uk/ ) Objects newly created would be addressed by the new hash, but inside such an object (e.g. commit) it is still possible to address objects using the old hash function. @@ -783,7 +783,7 @@ bmwill@google.com, jonathantanmy@google.com, jrnieder@gmail.com, sbeller@google.com Initial version sent to -http://public-inbox.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com +http://lore.kernel.org/git/20170304011251.GA26789@aiede.mtv.corp.google.com 2017-03-03 jrnieder@gmail.com Incorporated suggestions from jonathantanmy and sbeller: @@ -820,8 +820,8 @@ Later history: edits. This document history is no longer being maintained as it would now be superfluous to the commit log -[1] http://public-inbox.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/ -[2] http://public-inbox.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/ -[3] http://public-inbox.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/ -[4] http://public-inbox.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net -[5] https://public-inbox.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/ +[1] http://lore.kernel.org/git/CA+55aFzJtejiCjV0e43+9oR3QuJK2PiFiLQemytoLpyJWe6P9w@mail.gmail.com/ +[2] http://lore.kernel.org/git/CA+55aFz+gkAsDZ24zmePQuEs1XPS9BP_s8O7Q4wQ7LV7X5-oDA@mail.gmail.com/ +[3] http://lore.kernel.org/git/20170306084353.nrns455dvkdsfgo5@sigill.intra.peff.net/ +[4] http://lore.kernel.org/git/20170304224936.rqqtkdvfjgyezsht@genre.crustytoothpaste.net +[5] https://lore.kernel.org/git/CAJo=hJtoX9=AyLHHpUJS7fueV9ciZ_MNpnEPHUz8Whui6g9F0A@mail.gmail.com/ diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index 7c4d67aa6a..faa25c5c52 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -318,7 +318,7 @@ The remaining data of each 
directory block is grouped by type: == End of Index Entry The End of Index Entry (EOIE) is used to locate the end of the variable - length index entries and the begining of the extensions. Code can take + length index entries and the beginning of the extensions. Code can take advantage of this to quickly locate the index extensions without having to parse through all of the index entries. @@ -351,7 +351,7 @@ The remaining data of each directory block is grouped by type: - A number of index offset entries each consisting of: - - 32-bit offset from the begining of the file to the first cache entry + - 32-bit offset from the beginning of the file to the first cache entry in this block of entries. - 32-bit count of cache entries in this block diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt index d7e57639f7..1e31239696 100644 --- a/Documentation/technical/multi-pack-index.txt +++ b/Documentation/technical/multi-pack-index.txt @@ -102,8 +102,8 @@ Related Links [0] https://bugs.chromium.org/p/git/issues/detail?id=6 Chromium work item for: Multi-Pack Index (MIDX) -[1] https://public-inbox.org/git/20180107181459.222909-1-dstolee@microsoft.com/ +[1] https://lore.kernel.org/git/20180107181459.222909-1-dstolee@microsoft.com/ An earlier RFC for the multi-pack-index feature -[2] https://public-inbox.org/git/alpine.DEB.2.20.1803091557510.23109@alexmv-linux/ +[2] https://lore.kernel.org/git/alpine.DEB.2.20.1803091557510.23109@alexmv-linux/ Git Merge 2018 Contributor's summit notes (includes discussion of MIDX) diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index c73e72de0e..d5ce4eea8a 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -644,7 +644,7 @@ update was successful, or 'ng [refname] [error]' if the update was not. 
command-ok = PKT-LINE("ok" SP refname) command-fail = PKT-LINE("ng" SP refname SP error-msg) - error-msg = 1*(OCTECT) ; where not "ok" + error-msg = 1*(OCTET) ; where not "ok" ---- Updates can be unsuccessful for a number of reasons. The reference can have diff --git a/Documentation/technical/partial-clone.txt b/Documentation/technical/partial-clone.txt index 896c7b3878..b9e17e7a28 100644 --- a/Documentation/technical/partial-clone.txt +++ b/Documentation/technical/partial-clone.txt @@ -30,12 +30,20 @@ advance* during clone and fetch operations and thereby reduce download times and disk usage. Missing objects can later be "demand fetched" if/when needed. +A remote that can later provide the missing objects is called a +promisor remote, as it promises to send the objects when +requested. Initially Git supported only one promisor remote, the origin +remote from which the user cloned and that was configured in the +"extensions.partialClone" config option. Later support for more than +one promisor remote has been implemented. + Use of partial clone requires that the user be online and the origin -remote be available for on-demand fetching of missing objects. This may -or may not be problematic for the user. For example, if the user can -stay within the pre-selected subset of the source tree, they may not -encounter any missing objects. Alternatively, the user could try to -pre-fetch various objects if they know that they are going offline. +remote or other promisor remotes be available for on-demand fetching +of missing objects. This may or may not be problematic for the user. +For example, if the user can stay within the pre-selected subset of +the source tree, they may not encounter any missing objects. +Alternatively, the user could try to pre-fetch various objects if they +know that they are going offline. Non-Goals @@ -100,18 +108,18 @@ or commits that reference missing trees. 
Handling Missing Objects ------------------------ -- An object may be missing due to a partial clone or fetch, or missing due - to repository corruption. To differentiate these cases, the local - repository specially indicates such filtered packfiles obtained from the - promisor remote as "promisor packfiles". +- An object may be missing due to a partial clone or fetch, or missing + due to repository corruption. To differentiate these cases, the + local repository specially indicates such filtered packfiles + obtained from promisor remotes as "promisor packfiles". + These promisor packfiles consist of a "<name>.promisor" file with arbitrary contents (like the "<name>.keep" files), in addition to their "<name>.pack" and "<name>.idx" files. - The local repository considers a "promisor object" to be an object that - it knows (to the best of its ability) that the promisor remote has promised - that it has, either because the local repository has that object in one of + it knows (to the best of its ability) that promisor remotes have promised + that they have, either because the local repository has that object in one of its promisor packfiles, or because another promisor object refers to it. + When Git encounters a missing object, Git can see if it is a promisor object @@ -123,12 +131,12 @@ expensive-to-modify list of missing objects.[a] - Since almost all Git code currently expects any referenced object to be present locally and because we do not want to force every command to do a dry-run first, a fallback mechanism is added to allow Git to attempt - to dynamically fetch missing objects from the promisor remote. + to dynamically fetch missing objects from promisor remotes. + When the normal object lookup fails to find an object, Git invokes -fetch-object to try to get the object from the server and then retry -the object lookup. This allows objects to be "faulted in" without -complicated prediction algorithms. 
+promisor_remote_get_direct() to try to get the object from a promisor +remote and then retry the object lookup. This allows objects to be +"faulted in" without complicated prediction algorithms. + For efficiency reasons, no check as to whether the missing object is actually a promisor object is performed. @@ -157,8 +165,7 @@ and prefetch those objects in bulk. + We are not happy with this global variable and would like to remove it, but that requires significant refactoring of the object code to pass an -additional flag. We hope that concurrent efforts to add an ODB API can -encompass this. +additional flag. Fetching Missing Objects @@ -182,21 +189,63 @@ has been updated to not use any object flags when the corresponding argument though they are not necessary. +Using many promisor remotes +--------------------------- + +Many promisor remotes can be configured and used. + +This allows for example a user to have multiple geographically-close +cache servers for fetching missing blobs while continuing to do +filtered `git-fetch` commands from the central server. + +When fetching objects, promisor remotes are tried one after the other +until all the objects have been fetched. + +Remotes that are considered "promisor" remotes are those specified by +the following configuration variables: + +- `extensions.partialClone = <name>` + +- `remote.<name>.promisor = true` + +- `remote.<name>.partialCloneFilter = ...` + +Only one promisor remote can be configured using the +`extensions.partialClone` config variable. This promisor remote will +be the last one tried when fetching objects. + +We decided to make it the last one we try, because it is likely that +someone using many promisor remotes is doing so because the other +promisor remotes are better for some reason (maybe they are closer or +faster for some kind of objects) than the origin, and the origin is +likely to be the remote specified by extensions.partialClone. 
+ +This justification is not very strong, but one choice had to be made, +and anyway the long term plan should be to make the order somehow +fully configurable. + +For now though the other promisor remotes will be tried in the order +they appear in the config file. + Current Limitations ------------------- -- The remote used for a partial clone (or the first partial fetch - following a regular clone) is marked as the "promisor remote". +- It is not possible to specify the order in which the promisor + remotes are tried in other ways than the order in which they appear + in the config file. + -We are currently limited to a single promisor remote and only that -remote may be used for subsequent partial fetches. +It is also not possible to specify an order to be used when fetching +from one remote and a different order when fetching from another +remote. + +- It is not possible to push only specific objects to a promisor + remote. + -We accept this limitation because we believe initial users of this -feature will be using it on repositories with a strong single central -server. +It is not possible to push at the same time to multiple promisor +remote in a specific order. -- Dynamic object fetching will only ask the promisor remote for missing - objects. We assume that the promisor remote has a complete view of the +- Dynamic object fetching will only ask promisor remotes for missing + objects. We assume that promisor remotes have a complete view of the repository and can satisfy all such requests. - Repack essentially treats promisor and non-promisor packfiles as 2 @@ -218,15 +267,17 @@ server. Future Work ----------- -- Allow more than one promisor remote and define a strategy for fetching - missing objects from specific promisor remotes or of iterating over the - set of promisor remotes until a missing object is found. +- Improve the way to specify the order in which promisor remotes are + tried. 
+ -A user might want to have multiple geographically-close cache servers -for fetching missing blobs while continuing to do filtered `git-fetch` -commands from the central server, for example. +For example this could allow to specify explicitly something like: +"When fetching from this remote, I want to use these promisor remotes +in this order, though, when pushing or fetching to that remote, I want +to use those promisor remotes in that order." + +- Allow pushing to promisor remotes. + -Or the user might want to work in a triangular work flow with multiple +The user might want to work in a triangular work flow with multiple promisor remotes that each have an incomplete view of the repository. - Allow repack to work on promisor packfiles (while keeping them distinct @@ -299,26 +350,26 @@ Related Links [0] https://crbug.com/git/2 Bug#2: Partial Clone -[1] https://public-inbox.org/git/20170113155253.1644-1-benpeart@microsoft.com/ + +[1] https://lore.kernel.org/git/20170113155253.1644-1-benpeart@microsoft.com/ + Subject: [RFC] Add support for downloading blobs on demand + Date: Fri, 13 Jan 2017 10:52:53 -0500 -[2] https://public-inbox.org/git/cover.1506714999.git.jonathantanmy@google.com/ + +[2] https://lore.kernel.org/git/cover.1506714999.git.jonathantanmy@google.com/ + Subject: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches) + Date: Fri, 29 Sep 2017 13:11:36 -0700 -[3] https://public-inbox.org/git/20170426221346.25337-1-jonathantanmy@google.com/ + +[3] https://lore.kernel.org/git/20170426221346.25337-1-jonathantanmy@google.com/ + Subject: Proposal for missing blob support in Git repos + Date: Wed, 26 Apr 2017 15:13:46 -0700 -[4] https://public-inbox.org/git/1488999039-37631-1-git-send-email-git@jeffhostetler.com/ + +[4] https://lore.kernel.org/git/1488999039-37631-1-git-send-email-git@jeffhostetler.com/ + Subject: [PATCH 00/10] RFC Partial Clone and Fetch + Date: Wed, 8 Mar 2017 18:50:29 +0000 -[5] 
https://public-inbox.org/git/20170505152802.6724-1-benpeart@microsoft.com/ + +[5] https://lore.kernel.org/git/20170505152802.6724-1-benpeart@microsoft.com/ + Subject: [PATCH v7 00/10] refactor the filter process code into a reusable module + Date: Fri, 5 May 2017 11:27:52 -0400 -[6] https://public-inbox.org/git/20170714132651.170708-1-benpeart@microsoft.com/ + +[6] https://lore.kernel.org/git/20170714132651.170708-1-benpeart@microsoft.com/ + Subject: [RFC/PATCH v2 0/1] Add support for downloading blobs on demand + Date: Fri, 14 Jul 2017 09:26:50 -0400 diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt index 40f91f6b1e..7e3766cafb 100644 --- a/Documentation/technical/protocol-v2.txt +++ b/Documentation/technical/protocol-v2.txt @@ -252,7 +252,7 @@ A `fetch` request can take the following arguments: ofs-delta Indicate that the client understands PACKv2 with delta referring to its base by position in pack rather than by an oid. That is, - they can read OBJ_OFS_DELTA (ake type 6) in a packfile. + they can read OBJ_OFS_DELTA (aka type 6) in a packfile. If the 'shallow' feature is advertised the following arguments can be included in the clients request as well as the potential addition of the diff --git a/Documentation/technical/racy-git.txt b/Documentation/technical/racy-git.txt index 4a8be4d144..ceda4bbfda 100644 --- a/Documentation/technical/racy-git.txt +++ b/Documentation/technical/racy-git.txt @@ -51,7 +51,7 @@ of git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git only fixes the issue for file systems with exactly 1 ns or 1 s resolution. Other file systems are still broken in current Linux kernels (e.g. 
CEPH, CIFS, NTFS, UDF), see
-https://lkml.org/lkml/2015/6/9/714
+https://lore.kernel.org/lkml/5577240D.7020309@gmail.com/

 Racy Git
 --------
diff --git a/Documentation/technical/rerere.txt b/Documentation/technical/rerere.txt
index aa22d7ace8..af5f9fc24f 100644
--- a/Documentation/technical/rerere.txt
+++ b/Documentation/technical/rerere.txt
@@ -117,7 +117,7 @@ early A became C or B, a late X became Y or Z". We can see there are
 4 combinations of ("B or C", "C or B") x ("X or Y", "Y or X").

 By sorting, the conflict is given its canonical name, namely, "an
-early part became B or C, a late part becames X or Y", and whenever
+early part became B or C, a late part became X or Y", and whenever
 any of these four patterns appear, and we can get to the same conflict
 and resolution that we saw earlier.
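
Note on the partial-clone.txt hunk in this diff: the "Using many promisor remotes" text says that promisor remotes are tried in the order they appear in the config file, except that the remote named by `extensions.partialClone` is tried last. The ordering rule can be sketched roughly as follows. This is an illustration of the documented rule only, not Git's implementation; the function name `promisor_order`, the `(key, value)` input shape, and the remote names in the usage note are assumptions made for the example.

```python
def promisor_order(config_entries):
    """Return promisor remote names in the order they would be tried.

    config_entries is a hypothetical list of (key, value) pairs in
    config-file order, e.g. ("remote.cache-a.promisor", "true").
    """
    truthy = {"true", "yes", "on", "1"}
    main = None      # remote named by extensions.partialClone, if any
    ordered = []     # other promisor remotes, in config-file order

    for key, value in config_entries:
        lower = key.lower()
        if lower == "extensions.partialclone":
            main = value
        elif lower.startswith("remote."):
            # Split "remote.<name>.<var>"; <name> may itself contain dots.
            name, _, var = key[len("remote."):].rpartition(".")
            var = var.lower()
            is_promisor = (
                (var == "promisor" and value.lower() in truthy)
                or var == "partialclonefilter"
            )
            if name and is_promisor and name not in ordered:
                ordered.append(name)

    # The extensions.partialClone remote is always the last one tried.
    if main is not None:
        if main in ordered:
            ordered.remove(main)
        ordered.append(main)
    return ordered
```

With `extensions.partialClone = origin` and two further remotes marked as promisors (here the made-up names `cache-a` and `cache-b`), the sketch tries `cache-a`, then `cache-b`, then `origin`, regardless of where `origin` appears in the config file.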