diff options
Diffstat (limited to 'Documentation/technical')
20 files changed, 613 insertions, 460 deletions
diff --git a/Documentation/technical/api-allocation-growing.txt b/Documentation/technical/api-allocation-growing.txt index 542946b1ba..5a59b54844 100644 --- a/Documentation/technical/api-allocation-growing.txt +++ b/Documentation/technical/api-allocation-growing.txt @@ -34,3 +34,6 @@ item[nr++] = value you like; ------------ You are responsible for updating the `nr` variable. + +If you need to specify the number of elements to allocate explicitly +then use the macro `REALLOC_ARRAY(item, alloc)` instead of `ALLOC_GROW`. diff --git a/Documentation/technical/api-argv-array.txt b/Documentation/technical/api-argv-array.txt index 1a797812fb..8076172a08 100644 --- a/Documentation/technical/api-argv-array.txt +++ b/Documentation/technical/api-argv-array.txt @@ -46,6 +46,9 @@ Functions Format a string and push it onto the end of the array. This is a convenience wrapper combining `strbuf_addf` and `argv_array_push`. +`argv_array_pushv`:: + Push a null-terminated array of strings onto the end of the array. + `argv_array_pop`:: Remove the final element from the array. If there are no elements in the array, do nothing. diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt index edd5018e15..0d8b99b368 100644 --- a/Documentation/technical/api-config.txt +++ b/Documentation/technical/api-config.txt @@ -77,6 +77,99 @@ To read a specific file in git-config format, use `git_config_from_file`. This takes the same callback and data parameters as `git_config`. +Querying For Specific Variables +------------------------------- + +For programs wanting to query for specific variables in a non-callback +manner, the config API provides two functions `git_config_get_value` +and `git_config_get_value_multi`. They both read values from an internal +cache generated previously from reading the config files. + +`int git_config_get_value(const char *key, const char **value)`:: + + Finds the highest-priority value for the configuration variable `key`, + stores the pointer to it in `value` and returns 0. When the + configuration variable `key` is not found, returns 1 without touching + `value`. The caller should not free or modify `value`, as it is owned + by the cache. + +`const struct string_list *git_config_get_value_multi(const char *key)`:: + + Finds and returns the value list, sorted in order of increasing priority + for the configuration variable `key`. When the configuration variable + `key` is not found, returns NULL. The caller should not free or modify + the returned pointer, as it is owned by the cache. + +`void git_config_clear(void)`:: + + Resets and invalidates the config cache. + +The config API also provides type specific API functions which do conversion +as well as retrieval for the queried variable, including: + +`int git_config_get_int(const char *key, int *dest)`:: + + Finds and parses the value to an integer for the configuration variable + `key`. Dies on error; otherwise, stores the value of the parsed integer in + `dest` and returns 0. When the configuration variable `key` is not found, + returns 1 without touching `dest`. + +`int git_config_get_ulong(const char *key, unsigned long *dest)`:: + + Similar to `git_config_get_int` but for unsigned longs. + +`int git_config_get_bool(const char *key, int *dest)`:: + + Finds and parses the value into a boolean value, for the configuration + variable `key` respecting keywords like "true" and "false". Integer + values are converted into true/false values (when they are non-zero or + zero, respectively). Other values cause a die(). If parsing is successful, + stores the value of the parsed result in `dest` and returns 0. When the + configuration variable `key` is not found, returns 1 without touching + `dest`. + +`int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest)`:: + + Similar to `git_config_get_bool`, except that integers are copied as-is, + and `is_bool` flag is unset. + +`int git_config_get_maybe_bool(const char *key, int *dest)`:: + + Similar to `git_config_get_bool`, except that it returns -1 on error + rather than dying. + +`int git_config_get_string_const(const char *key, const char **dest)`:: + + Allocates and copies the retrieved string into the `dest` parameter for + the configuration variable `key`; if NULL string is given, prints an + error message and returns -1. When the configuration variable `key` is + not found, returns 1 without touching `dest`. + +`int git_config_get_string(const char *key, char **dest)`:: + + Similar to `git_config_get_string_const`, except that retrieved value + copied into the `dest` parameter is a mutable string. + +`int git_config_get_pathname(const char *key, const char **dest)`:: + + Similar to `git_config_get_string`, but expands `~` or `~user` into + the user's home directory when found at the beginning of the path. + +`git_die_config(const char *key, const char *err, ...)`:: + + First prints the error message specified by the caller in `err` and then + dies printing the line number and the file name of the highest priority + value for the configuration variable `key`. + +`void git_die_config_linenr(const char *key, const char *filename, int linenr)`:: + + Helper function which formats the die error message according to the + parameters entered. Used by `git_die_config()`. It can be used by callers + handling `git_config_get_value_multi()` to print the correct error message + for the desired value. + +See test-config.c for usage examples. + Value Parsing Helpers --------------------- @@ -134,6 +227,68 @@ int read_file_with_include(const char *file, config_fn_t fn, void *data) `git_config` respects includes automatically. The lower-level `git_config_from_file` does not. +Custom Configsets +----------------- + +A `config_set` can be used to construct an in-memory cache for +config-like files that the caller specifies (i.e., files like `.gitmodules`, +`~/.gitconfig` etc.). For example, + +--------------------------------------- +struct config_set gm_config; +git_configset_init(&gm_config); +int b; +/* we add config files to the config_set */ +git_configset_add_file(&gm_config, ".gitmodules"); +git_configset_add_file(&gm_config, ".gitmodules_alt"); + +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) { + /* hack hack hack */ +} + +/* when we are done with the configset */ +git_configset_clear(&gm_config); +---------------------------------------- + +Configset API provides functions for the above mentioned work flow, including: + +`void git_configset_init(struct config_set *cs)`:: + + Initializes the config_set `cs`. + +`int git_configset_add_file(struct config_set *cs, const char *filename)`:: + + Parses the file and adds the variable-value pairs to the `config_set`, + dies if there is an error in parsing the file. Returns 0 on success, or + -1 if the file does not exist or is inaccessible. The user has to decide + if he wants to free the incomplete configset or continue using it when + the function returns -1. + +`int git_configset_get_value(struct config_set *cs, const char *key, const char **value)`:: + + Finds the highest-priority value for the configuration variable `key` + and config set `cs`, stores the pointer to it in `value` and returns 0. + When the configuration variable `key` is not found, returns 1 without + touching `value`. The caller should not free or modify `value`, as it + is owned by the cache. + +`const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)`:: + + Finds and returns the value list, sorted in order of increasing priority + for the configuration variable `key` and config set `cs`. When the + configuration variable `key` is not found, returns NULL. The caller + should not free or modify the returned pointer, as it is owned by the cache. + +`void git_configset_clear(struct config_set *cs)`:: + + Clears `config_set` structure, removes all saved variable-value pairs. + +In addition to above functions, the `config_set` API provides type specific +functions in the vein of `git_config_get_int` and family but with an extra +parameter, pointer to struct `config_set`. +They all behave similarly to the `git_config_get*()` family described in +"Querying For Specific Variables" above. + Writing Config Files -------------------- diff --git a/Documentation/technical/api-credentials.txt b/Documentation/technical/api-credentials.txt index c1b42a40d3..e44426dd04 100644 --- a/Documentation/technical/api-credentials.txt +++ b/Documentation/technical/api-credentials.txt @@ -248,7 +248,10 @@ FORMAT` in linkgit:git-credential[7] for a detailed specification). For a `get` operation, the helper should produce a list of attributes on stdout in the same format. A helper is free to produce a subset, or even no values at all if it has nothing useful to provide. Any provided -attributes will overwrite those already known about by Git. +attributes will overwrite those already known about by Git. If a helper +outputs a `quit` attribute with a value of `true` or `1`, no further +helpers will be consulted, nor will the user be prompted (if no +credential has been provided, the operation will then fail). For a `store` or `erase` operation, the helper's output is ignored. If it fails to perform the requested operation, it may complain to diff --git a/Documentation/technical/api-error-handling.txt b/Documentation/technical/api-error-handling.txt new file mode 100644 index 0000000000..ceeedd485c --- /dev/null +++ b/Documentation/technical/api-error-handling.txt @@ -0,0 +1,75 @@ +Error reporting in git +====================== + +`die`, `usage`, `error`, and `warning` report errors of various +kinds. + +- `die` is for fatal application errors. It prints a message to + the user and exits with status 128. + +- `usage` is for errors in command line usage. After printing its + message, it exits with status 129. (See also `usage_with_options` + in the link:api-parse-options.html[parse-options API].) + +- `error` is for non-fatal library errors. It prints a message + to the user and returns -1 for convenience in signaling the error + to the caller. + +- `warning` is for reporting situations that probably should not + occur but which the user (and Git) can continue to work around + without running into too many problems. Like `error`, it + returns -1 after reporting the situation to the caller. + +Customizable error handlers +--------------------------- + +The default behavior of `die` and `error` is to write a message to +stderr and then exit or return as appropriate. This behavior can be +overridden using `set_die_routine` and `set_error_routine`. For +example, "git daemon" uses set_die_routine to write the reason `die` +was called to syslog before exiting. + +Library errors +-------------- + +Functions return a negative integer on error. Details beyond that +vary from function to function: + +- Some functions return -1 for all errors. Others return a more + specific value depending on how the caller might want to react + to the error. + +- Some functions report the error to stderr with `error`, + while others leave that for the caller to do. + +- errno is not meaningful on return from most functions (except + for thin wrappers for system calls). + +Check the function's API documentation to be sure. + +Caller-handled errors +--------------------- + +An increasing number of functions take a parameter 'struct strbuf *err'. +On error, such functions append a message about what went wrong to the +'err' strbuf. The message is meant to be complete enough to be passed +to `die` or `error` as-is. For example: + + if (ref_transaction_commit(transaction, &err)) + die("%s", err.buf); + +The 'err' parameter will be untouched if no error occurred, so multiple +function calls can be chained: + + t = ref_transaction_begin(&err); + if (!t || + ref_transaction_update(t, "HEAD", ..., &err) || + ret_transaction_commit(t, &err)) + die("%s", err.buf); + +The 'err' parameter must be a pointer to a valid strbuf. To silence +a message, pass a strbuf that is explicitly ignored: + + if (thing_that_can_fail_in_an_ignorable_way(..., &err)) + /* This failure is okay. */ + strbuf_reset(&err); diff --git a/Documentation/technical/api-lockfile.txt b/Documentation/technical/api-lockfile.txt deleted file mode 100644 index dd894043ae..0000000000 --- a/Documentation/technical/api-lockfile.txt +++ /dev/null @@ -1,74 +0,0 @@ -lockfile API -============ - -The lockfile API serves two purposes: - -* Mutual exclusion. When we write out a new index file, first - we create a new file `$GIT_DIR/index.lock`, write the new - contents into it, and rename it to the final destination - `$GIT_DIR/index`. We try to create the `$GIT_DIR/index.lock` - file with O_EXCL so that we can notice and fail when somebody - else is already trying to update the index file. - -* Automatic cruft removal. After we create the "lock" file, we - may decide to `die()`, and we would want to make sure that we - remove the file that has not been committed to its final - destination. This is done by remembering the lockfiles we - created in a linked list and cleaning them up from an - `atexit(3)` handler. Outstanding lockfiles are also removed - when the program dies on a signal. - - -The functions -------------- - -hold_lock_file_for_update:: - - Take a pointer to `struct lock_file`, the filename of - the final destination (e.g. `$GIT_DIR/index`) and a flag - `die_on_error`. Attempt to create a lockfile for the - destination and return the file descriptor for writing - to the file. If `die_on_error` flag is true, it dies if - a lock is already taken for the file; otherwise it - returns a negative integer to the caller on failure. - -commit_lock_file:: - - Take a pointer to the `struct lock_file` initialized - with an earlier call to `hold_lock_file_for_update()`, - close the file descriptor and rename the lockfile to its - final destination. Returns 0 upon success, a negative - value on failure to close(2) or rename(2). - -rollback_lock_file:: - - Take a pointer to the `struct lock_file` initialized - with an earlier call to `hold_lock_file_for_update()`, - close the file descriptor and remove the lockfile. - -close_lock_file:: - Take a pointer to the `struct lock_file` initialized - with an earlier call to `hold_lock_file_for_update()`, - and close the file descriptor. Returns 0 upon success, - a negative value on failure to close(2). - -Because the structure is used in an `atexit(3)` handler, its -storage has to stay throughout the life of the program. It -cannot be an auto variable allocated on the stack. - -Call `commit_lock_file()` or `rollback_lock_file()` when you are -done writing to the file descriptor. If you do not call either -and simply `exit(3)` from the program, an `atexit(3)` handler -will close and remove the lockfile. - -If you need to close the file descriptor you obtained from -`hold_lock_file_for_update` function yourself, do so by calling -`close_lock_file()`. You should never call `close(2)` yourself! -Otherwise the `struct -lock_file` structure still remembers that the file descriptor -needs to be closed, and a later call to `commit_lock_file()` or -`rollback_lock_file()` will result in duplicate calls to -`close(2)`. Worse yet, if you `close(2)`, open another file -descriptor for completely different purpose, and then call -`commit_lock_file()` or `rollback_lock_file()`, they may close -that unrelated file descriptor. diff --git a/Documentation/technical/api-parse-options.txt b/Documentation/technical/api-parse-options.txt index 1f2db31312..5f0757dcc9 100644 --- a/Documentation/technical/api-parse-options.txt +++ b/Documentation/technical/api-parse-options.txt @@ -168,6 +168,12 @@ There are some macros to easily define options: Introduce an option with integer argument. The integer is put into `int_var`. +`OPT_MAGNITUDE(short, long, &unsigned_long_var, description)`:: + Introduce an option with a size argument. The argument must be a + non-negative integer and may include a suffix of 'k', 'm' or 'g' to + scale the provided value by 1024, 1024^2 or 1024^3 respectively. + The scaled value is put into `unsigned_long_var`. + `OPT_DATE(short, long, &int_var, description)`:: Introduce an option with date argument, see `approxidate()`. The timestamp is put into `int_var`. @@ -212,6 +218,19 @@ There are some macros to easily define options: Use it to hide deprecated options that are still to be recognized and ignored silently. +`OPT_PASSTHRU(short, long, &char_var, arg_str, description, flags)`:: + Introduce an option that will be reconstructed into a char* string, + which must be initialized to NULL. This is useful when you need to + pass the command-line option to another command. Any previous value + will be overwritten, so this should only be used for options where + the last one specified on the command line wins. + +`OPT_PASSTHRU_ARGV(short, long, &argv_array_var, arg_str, description, flags)`:: + Introduce an option where all instances of it on the command-line will + be reconstructed into an argv_array. This is useful when you need to + pass the command-line option, which can be specified multiple times, + to another command. + The last element of the array must be `OPT_END()`. diff --git a/Documentation/technical/api-ref-iteration.txt b/Documentation/technical/api-ref-iteration.txt index 02adfd45d3..37379d8337 100644 --- a/Documentation/technical/api-ref-iteration.txt +++ b/Documentation/technical/api-ref-iteration.txt @@ -6,7 +6,7 @@ Iteration of refs is done by using an iterate function which will call a callback function for every ref. The callback function has this signature: - int handle_one_ref(const char *refname, const unsigned char *sha1, + int handle_one_ref(const char *refname, const struct object_id *oid, int flags, void *cb_data); There are different kinds of iterate functions which all take a diff --git a/Documentation/technical/api-remote.txt b/Documentation/technical/api-remote.txt index 5d245aa9d1..2cfdd224a8 100644 --- a/Documentation/technical/api-remote.txt +++ b/Documentation/technical/api-remote.txt @@ -97,10 +97,6 @@ It contains: The name of the remote listed in the configuration. -`remote`:: - - The struct remote for that remote. - `merge_name`:: An array of the "merge" lines in the configuration. diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt index 69510ae57a..8bf3e37f53 100644 --- a/Documentation/technical/api-run-command.txt +++ b/Documentation/technical/api-run-command.txt @@ -13,6 +13,10 @@ produces in the caller in order to process it. Functions --------- +`child_process_init`:: + + Initialize a struct child_process variable. + `start_command`:: Start a sub-process. Takes a pointer to a `struct child_process` @@ -42,6 +46,13 @@ Functions The argument dir corresponds the member .dir. The argument env corresponds to the member .env. +`child_process_clear`:: + + Release the memory associated with the struct child_process. + Most users of the run-command API don't need to call this + function explicitly because `start_command` invokes it on + failure and `finish_command` calls it automatically already. + The functions above do the following: . If a system call failed, errno is set and -1 is returned. A diagnostic @@ -96,8 +107,8 @@ command to run in a sub-process. The caller: -1. allocates and clears (memset(&chld, 0, sizeof(chld));) a - struct child_process variable; +1. allocates and clears (using child_process_init() or + CHILD_PROCESS_INIT) a struct child_process variable; 2. initializes the members; 3. calls start_command(); 4. processes the data; @@ -165,6 +176,11 @@ string pointers (NULL terminated) in .env: . If the string does not contain '=', it names an environment variable that will be removed from the child process's environment. +If the .env member is NULL, `start_command` will point it at the +.env_array `argv_array` (so you may use one or the other, but not both). +The memory in .env_array will be cleaned up automatically during +`finish_command` (or during `start_command` when it is unsuccessful). + To specify a new initial working directory for the sub-process, specify it in the .dir member. diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt deleted file mode 100644 index f9c06a7573..0000000000 --- a/Documentation/technical/api-strbuf.txt +++ /dev/null @@ -1,337 +0,0 @@ -strbuf API -========== - -strbuf's are meant to be used with all the usual C string and memory -APIs. Given that the length of the buffer is known, it's often better to -use the mem* functions than a str* one (memchr vs. strchr e.g.). -Though, one has to be careful about the fact that str* functions often -stop on NULs and that strbufs may have embedded NULs. - -A strbuf is NUL terminated for convenience, but no function in the -strbuf API actually relies on the string being free of NULs. - -strbufs have some invariants that are very important to keep in mind: - -. The `buf` member is never NULL, so it can be used in any usual C -string operations safely. strbuf's _have_ to be initialized either by -`strbuf_init()` or by `= STRBUF_INIT` before the invariants, though. -+ -Do *not* assume anything on what `buf` really is (e.g. if it is -allocated memory or not), use `strbuf_detach()` to unwrap a memory -buffer from its strbuf shell in a safe way. That is the sole supported -way. This will give you a malloced buffer that you can later `free()`. -+ -However, it is totally safe to modify anything in the string pointed by -the `buf` member, between the indices `0` and `len-1` (inclusive). - -. The `buf` member is a byte array that has at least `len + 1` bytes - allocated. The extra byte is used to store a `'\0'`, allowing the - `buf` member to be a valid C-string. Every strbuf function ensure this - invariant is preserved. -+ -NOTE: It is OK to "play" with the buffer directly if you work it this - way: -+ ----- -strbuf_grow(sb, SOME_SIZE); <1> -strbuf_setlen(sb, sb->len + SOME_OTHER_SIZE); ----- -<1> Here, the memory array starting at `sb->buf`, and of length -`strbuf_avail(sb)` is all yours, and you can be sure that -`strbuf_avail(sb)` is at least `SOME_SIZE`. -+ -NOTE: `SOME_OTHER_SIZE` must be smaller or equal to `strbuf_avail(sb)`. -+ -Doing so is safe, though if it has to be done in many places, adding the -missing API to the strbuf module is the way to go. -+ -WARNING: Do _not_ assume that the area that is yours is of size `alloc -- 1` even if it's true in the current implementation. Alloc is somehow a -"private" member that should not be messed with. Use `strbuf_avail()` -instead. - -Data structures ---------------- - -* `struct strbuf` - -This is the string buffer structure. The `len` member can be used to -determine the current length of the string, and `buf` member provides -access to the string itself. - -Functions ---------- - -* Life cycle - -`strbuf_init`:: - - Initialize the structure. The second parameter can be zero or a bigger - number to allocate memory, in case you want to prevent further reallocs. - -`strbuf_release`:: - - Release a string buffer and the memory it used. You should not use the - string buffer after using this function, unless you initialize it again. - -`strbuf_detach`:: - - Detach the string from the strbuf and returns it; you now own the - storage the string occupies and it is your responsibility from then on - to release it with `free(3)` when you are done with it. - -`strbuf_attach`:: - - Attach a string to a buffer. You should specify the string to attach, - the current length of the string and the amount of allocated memory. - The amount must be larger than the string length, because the string you - pass is supposed to be a NUL-terminated string. This string _must_ be - malloc()ed, and after attaching, the pointer cannot be relied upon - anymore, and neither be free()d directly. - -`strbuf_swap`:: - - Swap the contents of two string buffers. - -* Related to the size of the buffer - -`strbuf_avail`:: - - Determine the amount of allocated but unused memory. - -`strbuf_grow`:: - - Ensure that at least this amount of unused memory is available after - `len`. This is used when you know a typical size for what you will add - and want to avoid repetitive automatic resizing of the underlying buffer. - This is never a needed operation, but can be critical for performance in - some cases. - -`strbuf_setlen`:: - - Set the length of the buffer to a given value. This function does *not* - allocate new memory, so you should not perform a `strbuf_setlen()` to a - length that is larger than `len + strbuf_avail()`. `strbuf_setlen()` is - just meant as a 'please fix invariants from this strbuf I just messed - with'. - -`strbuf_reset`:: - - Empty the buffer by setting the size of it to zero. - -* Related to the contents of the buffer - -`strbuf_trim`:: - - Strip whitespace from the beginning and end of a string. - Equivalent to performing `strbuf_rtrim()` followed by `strbuf_ltrim()`. - -`strbuf_rtrim`:: - - Strip whitespace from the end of a string. - -`strbuf_ltrim`:: - - Strip whitespace from the beginning of a string. - -`strbuf_reencode`:: - - Replace the contents of the strbuf with a reencoded form. Returns -1 - on error, 0 on success. - -`strbuf_tolower`:: - - Lowercase each character in the buffer using `tolower`. - -`strbuf_cmp`:: - - Compare two buffers. Returns an integer less than, equal to, or greater - than zero if the first buffer is found, respectively, to be less than, - to match, or be greater than the second buffer. - -* Adding data to the buffer - -NOTE: All of the functions in this section will grow the buffer as necessary. -If they fail for some reason other than memory shortage and the buffer hadn't -been allocated before (i.e. the `struct strbuf` was set to `STRBUF_INIT`), -then they will free() it. - -`strbuf_addch`:: - - Add a single character to the buffer. - -`strbuf_insert`:: - - Insert data to the given position of the buffer. The remaining contents - will be shifted, not overwritten. - -`strbuf_remove`:: - - Remove given amount of data from a given position of the buffer. - -`strbuf_splice`:: - - Remove the bytes between `pos..pos+len` and replace it with the given - data. - -`strbuf_add_commented_lines`:: - - Add a NUL-terminated string to the buffer. Each line will be prepended - by a comment character and a blank. - -`strbuf_add`:: - - Add data of given length to the buffer. - -`strbuf_addstr`:: - -Add a NUL-terminated string to the buffer. -+ -NOTE: This function will *always* be implemented as an inline or a macro -that expands to: -+ ----- -strbuf_add(..., s, strlen(s)); ----- -+ -Meaning that this is efficient to write things like: -+ ----- -strbuf_addstr(sb, "immediate string"); ----- - -`strbuf_addbuf`:: - - Copy the contents of another buffer at the end of the current one. - -`strbuf_adddup`:: - - Copy part of the buffer from a given position till a given length to the - end of the buffer. - -`strbuf_expand`:: - - This function can be used to expand a format string containing - placeholders. To that end, it parses the string and calls the specified - function for every percent sign found. -+ -The callback function is given a pointer to the character after the `%` -and a pointer to the struct strbuf. It is expected to add the expanded -version of the placeholder to the strbuf, e.g. to add a newline -character if the letter `n` appears after a `%`. The function returns -the length of the placeholder recognized and `strbuf_expand()` skips -over it. -+ -The format `%%` is automatically expanded to a single `%` as a quoting -mechanism; callers do not need to handle the `%` placeholder themselves, -and the callback function will not be invoked for this placeholder. -+ -All other characters (non-percent and not skipped ones) are copied -verbatim to the strbuf. If the callback returned zero, meaning that the -placeholder is unknown, then the percent sign is copied, too. -+ -In order to facilitate caching and to make it possible to give -parameters to the callback, `strbuf_expand()` passes a context pointer, -which can be used by the programmer of the callback as she sees fit. - -`strbuf_expand_dict_cb`:: - - Used as callback for `strbuf_expand()`, expects an array of - struct strbuf_expand_dict_entry as context, i.e. pairs of - placeholder and replacement string. The array needs to be - terminated by an entry with placeholder set to NULL. - -`strbuf_addbuf_percentquote`:: - - Append the contents of one strbuf to another, quoting any - percent signs ("%") into double-percents ("%%") in the - destination. This is useful for literal data to be fed to either - strbuf_expand or to the *printf family of functions. - -`strbuf_humanise_bytes`:: - - Append the given byte size as a human-readable string (i.e. 12.23 KiB, - 3.50 MiB). - -`strbuf_addf`:: - - Add a formatted string to the buffer. - -`strbuf_commented_addf`:: - - Add a formatted string prepended by a comment character and a - blank to the buffer. - -`strbuf_fread`:: - - Read a given size of data from a FILE* pointer to the buffer. -+ -NOTE: The buffer is rewound if the read fails. If -1 is returned, -`errno` must be consulted, like you would do for `read(3)`. -`strbuf_read()`, `strbuf_read_file()` and `strbuf_getline()` has the -same behaviour as well. - -`strbuf_read`:: - - Read the contents of a given file descriptor. The third argument can be - used to give a hint about the file size, to avoid reallocs. - -`strbuf_read_file`:: - - Read the contents of a file, specified by its path. The third argument - can be used to give a hint about the file size, to avoid reallocs. - -`strbuf_readlink`:: - - Read the target of a symbolic link, specified by its path. The third - argument can be used to give a hint about the size, to avoid reallocs. - -`strbuf_getline`:: - - Read a line from a FILE *, overwriting the existing contents - of the strbuf. The second argument specifies the line - terminator character, typically `'\n'`. - Reading stops after the terminator or at EOF. The terminator - is removed from the buffer before returning. Returns 0 unless - there was nothing left before EOF, in which case it returns `EOF`. - -`strbuf_getwholeline`:: - - Like `strbuf_getline`, but keeps the trailing terminator (if - any) in the buffer. - -`strbuf_getwholeline_fd`:: - - Like `strbuf_getwholeline`, but operates on a file descriptor. - It reads one character at a time, so it is very slow. Do not - use it unless you need the correct position in the file - descriptor. - -`stripspace`:: - - Strip whitespace from a buffer. The second parameter controls if - comments are considered contents to be removed or not. - -`strbuf_split_buf`:: -`strbuf_split_str`:: -`strbuf_split_max`:: -`strbuf_split`:: - - Split a string or strbuf into a list of strbufs at a specified - terminator character. The returned substrings include the - terminator characters. Some of these functions take a `max` - parameter, which, if positive, limits the output to that - number of substrings. - -`strbuf_list_free`:: - - Free a list of strbufs (for example, the return values of the - `strbuf_split()` functions). - -`launch_editor`:: - - Launch the user preferred editor to edit a file and fill the buffer - with the file's contents upon the user completing their editing. The - third argument can be used to set the environment which the editor is - run in. If the buffer is NULL the editor is launched as usual but the - file's contents are not read into the buffer upon completion. diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt index d51a6579c8..c08402b12e 100644 --- a/Documentation/technical/api-string-list.txt +++ b/Documentation/technical/api-string-list.txt @@ -29,7 +29,7 @@ member (you need this if you add things later) and you should set the `unsorted_string_list_has_string` and get it from the list using `string_list_lookup` for sorted lists. -. Can sort an unsorted list using `sort_string_list`. +. Can sort an unsorted list using `string_list_sort`. . Can remove duplicate items from a sorted list using `string_list_remove_duplicates`. @@ -146,7 +146,7 @@ write `string_list_insert(...)->util = ...;`. ownership of a malloc()ed string to a `string_list` that has `strdup_string` set. -`sort_string_list`:: +`string_list_sort`:: Sort the list's entries by string value in `strcmp()` order. diff --git a/Documentation/technical/api-submodule-config.txt b/Documentation/technical/api-submodule-config.txt new file mode 100644 index 0000000000..941fa178dd --- /dev/null +++ b/Documentation/technical/api-submodule-config.txt @@ -0,0 +1,62 @@ +submodule config cache API +========================== + +The submodule config cache API allows to read submodule +configurations/information from specified revisions. Internally +information is lazily read into a cache that is used to avoid +unnecessary parsing of the same .gitmodule files. Lookups can be done by +submodule path or name. + +Usage +----- + +To initialize the cache with configurations from the worktree the caller +typically first calls `gitmodules_config()` to read values from the +worktree .gitmodules and then to overlay the local git config values +`parse_submodule_config_option()` from the config parsing +infrastructure. + +The caller can look up information about submodules by using the +`submodule_from_path()` or `submodule_from_name()` functions. They return +a `struct submodule` which contains the values. The API automatically +initializes and allocates the needed infrastructure on-demand. If the +caller does only want to lookup values from revisions the initialization +can be skipped. + +If the internal cache might grow too big or when the caller is done with +the API, all internally cached values can be freed with submodule_free(). + +Data Structures +--------------- + +`struct submodule`:: + + This structure is used to return the information about one + submodule for a certain revision. It is returned by the lookup + functions. + +Functions +--------- + +`void submodule_free()`:: + + Use these to free the internally cached values. + +`int parse_submodule_config_option(const char *var, const char *value)`:: + + Can be passed to the config parsing infrastructure to parse + local (worktree) submodule configurations. + +`const struct submodule *submodule_from_path(const unsigned char *commit_sha1, const char *path)`:: + + Lookup values for one submodule by its commit_sha1 and path. + +`const struct submodule *submodule_from_name(const unsigned char *commit_sha1, const char *name)`:: + + The same as above but lookup by name. + +If given the null_sha1 as commit_sha1 the local configuration of a +submodule will be returned (e.g. consolidated values from local git +configuration and the .gitmodules file in the worktree). + +For an example usage see test-submodule-config.c. diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt index 229f845dfa..1c561bdd92 100644 --- a/Documentation/technical/http-protocol.txt +++ b/Documentation/technical/http-protocol.txt @@ -319,7 +319,8 @@ Servers SHOULD support all capabilities defined here. Clients MUST send at least one "want" command in the request body. Clients MUST NOT reference an id in a "want" command which did not appear in the response obtained through ref discovery unless the -server advertises capability `allow-tip-sha1-in-want`. +server advertises capability `allow-tip-sha1-in-want` or +`allow-reachable-sha1-in-want`. compute_request = want_list have_list diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index fe6f31667d..ade0b0c445 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -170,7 +170,7 @@ Git index format The entries are written out in the top-down, depth-first order. The first entry represents the root level of the repository, followed by the - first subtree---let's call this A---of the root level (with its name + first subtree--let's call this A--of the root level (with its name relative to the root level), followed by the first subtree of A (with its name relative to A), ... @@ -207,7 +207,7 @@ Git index format in a separate file. This extension records the changes to be made on top of that to produce the final index. - The signature for this extension is { 'l', 'i, 'n', 'k' }. + The signature for this extension is { 'l', 'i', 'n', 'k' }. The extension consists of: @@ -231,5 +231,67 @@ Git index format on. Replaced entries may have empty path names to save space. The remaining index entries after replaced ones will be added to the - final index. These added entries are also sorted by entry namme then + final index. These added entries are also sorted by entry name then stage. + +== Untracked cache + + Untracked cache saves the untracked file list and necessary data to + verify the cache. The signature for this extension is { 'U', 'N', + 'T', 'R' }. + + The extension starts with + + - A sequence of NUL-terminated strings, preceded by the size of the + sequence in variable width encoding. Each string describes the + environment where the cache can be used. + + - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from + ctime field until "file size". + + - Stat data of core.excludesfile + + - 32-bit dir_flags (see struct dir_struct) + + - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file + does not exist. + + - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does + not exist. + + - NUL-terminated string of per-dir exclude file name. This usually + is ".gitignore". + + - The number of following directory blocks, variable width + encoding. If this number is zero, the extension ends here with a + following NUL. + + - A number of directory blocks in depth-first-search order, each + consists of + + - The number of untracked entries, variable width encoding. + + - The number of sub-directory blocks, variable width encoding. + + - The directory name terminated by NUL. + + - A number of untracked file/dir names terminated by NUL. + +The remaining data of each directory block is grouped by type: + + - An ewah bitmap, the n-th bit marks whether the n-th directory has + valid untracked cache entries. + + - An ewah bitmap, the n-th bit records "check-only" bit of + read_directory_recursive() for the n-th directory. + + - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data + is valid for the n-th directory and exists in the next data. + + - An array of stat data. The n-th data corresponds with the n-th + "one" bit in the previous ewah bitmap. + + - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit + in the previous ewah bitmap. + + - One NUL. diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt index 569c48a352..c6977bbc5a 100644 --- a/Documentation/technical/pack-protocol.txt +++ b/Documentation/technical/pack-protocol.txt @@ -1,11 +1,11 @@ Packfile transfer protocols =========================== -Git supports transferring data in packfiles over the ssh://, git:// and +Git supports transferring data in packfiles over the ssh://, git://, http:// and file:// transports. There exist two sets of protocols, one for pushing data from a client to a server and another for fetching data from a -server to a client. All three transports (ssh, git, file) use the same -protocol to transfer data. +server to a client. The three transports (ssh, git, file) use the same +protocol to transfer data. http is documented in http-protocol.txt. The processes invoked in the canonical Git implementation are 'upload-pack' on the server side and 'fetch-pack' on the client side for fetching data; @@ -14,6 +14,14 @@ data. The protocol functions to have a server tell a client what is currently on the server, then for the two to negotiate the smallest amount of data to send in order to fully update one or the other. +pkt-line Format +--------------- + +The descriptions below build on the pkt-line format described in +protocol-common.txt. When the grammar indicate `PKT-LINE(...)`, unless +otherwise noted the usual pkt-line LF rules apply: the sender SHOULD +include a LF, but the receiver MUST NOT complain if it is not present. + Transports ---------- There are three transports over which the packfile protocol is @@ -143,9 +151,6 @@ with the object name that each reference currently points to. 003fe92df48743b7bc7d26bcaabfddde0a1e20cae47c refs/tags/v1.0^{} 0000 -Server SHOULD terminate each non-flush line using LF ("\n") terminator; -client MUST NOT complain if there is no terminator. - The returned response is a pkt-line stream describing each ref and its current value. The stream MUST be sorted by name according to the C locale ordering. @@ -165,15 +170,15 @@ MUST peel the ref if it's an annotated tag. flush-pkt no-refs = PKT-LINE(zero-id SP "capabilities^{}" - NUL capability-list LF) + NUL capability-list) list-of-refs = first-ref *other-ref first-ref = PKT-LINE(obj-id SP refname - NUL capability-list LF) + NUL capability-list) other-ref = PKT-LINE(other-tip / other-peeled) - other-tip = obj-id SP refname LF - other-peeled = obj-id SP refname "^{}" LF + other-tip = obj-id SP refname + other-peeled = obj-id SP refname "^{}" shallow = PKT-LINE("shallow" SP obj-id) @@ -212,12 +217,12 @@ out of what the server said it could do with the first 'want' line. want-list = first-want *additional-want - shallow-line = PKT_LINE("shallow" SP obj-id) + shallow-line = PKT-LINE("shallow" SP obj-id) - depth-request = PKT_LINE("deepen" SP depth) + depth-request = PKT-LINE("deepen" SP depth) - first-want = PKT-LINE("want" SP obj-id SP capability-list LF) - additional-want = PKT-LINE("want" SP obj-id LF) + first-want = PKT-LINE("want" SP obj-id SP capability-list) + additional-want = PKT-LINE("want" SP obj-id) depth = 1*DIGIT ---- @@ -284,7 +289,7 @@ so that there is always a block of 32 "in-flight on the wire" at a time. compute-end have-list = *have-line - have-line = PKT-LINE("have" SP obj-id LF) + have-line = PKT-LINE("have" SP obj-id) compute-end = flush-pkt / PKT-LINE("done") ---- @@ -348,10 +353,10 @@ Then the server will start sending its packfile data. ---- server-response = *ack_multi ack / nak - ack_multi = PKT-LINE("ACK" SP obj-id ack_status LF) + ack_multi = PKT-LINE("ACK" SP obj-id ack_status) ack_status = "continue" / "common" / "ready" - ack = PKT-LINE("ACK SP obj-id LF) - nak = PKT-LINE("NAK" LF) + ack = PKT-LINE("ACK" SP obj-id) + nak = PKT-LINE("NAK") ---- A simple clone may look like this (with no 'have' lines): @@ -465,12 +470,12 @@ contain all the objects that the server will need to complete the new references. ---- - update-request = *shallow command-list [pack-file] + update-request = *shallow ( command-list | push-cert ) [packfile] - shallow = PKT-LINE("shallow" SP obj-id LF) + shallow = PKT-LINE("shallow" SP obj-id) - command-list = PKT-LINE(command NUL capability-list LF) - *PKT-LINE(command LF) + command-list = PKT-LINE(command NUL capability-list) + *PKT-LINE(command) flush-pkt command = create / delete / update @@ -481,17 +486,32 @@ references. old-id = obj-id new-id = obj-id - pack-file = "PACK" 28*(OCTET) + push-cert = PKT-LINE("push-cert" NUL capability-list LF) + PKT-LINE("certificate version 0.1" LF) + PKT-LINE("pusher" SP ident LF) + PKT-LINE("pushee" SP url LF) + PKT-LINE("nonce" SP nonce LF) + PKT-LINE(LF) + *PKT-LINE(command LF) + *PKT-LINE(gpg-signature-lines LF) + PKT-LINE("push-cert-end" LF) + + packfile = "PACK" 28*(OCTET) ---- If the receiving end does not support delete-refs, the sending end MUST NOT ask for delete command. -The pack-file MUST NOT be sent if the only command used is 'delete'. +If the receiving end does not support push-cert, the sending end +MUST NOT send a push-cert command. When a push-cert command is +sent, command-list MUST NOT be sent; the commands recorded in the +push certificate is used instead. -A pack-file MUST be sent if either create or update command is used, +The packfile MUST NOT be sent if the only command used is 'delete'. + +A packfile MUST be sent if either create or update command is used, even if the server already has all the necessary objects. In this -case the client MUST send an empty pack-file. The only time this +case the client MUST send an empty packfile. The only time this is likely to happen is if the client is creating a new branch or a tag that points to an existing obj-id. @@ -501,6 +521,35 @@ was being processed (the obj-id is still the same as the old-id), and it will run any update hooks to make sure that the update is acceptable. If all of that is fine, the server will then update the references. +Push Certificate +---------------- + +A push certificate begins with a set of header lines. After the +header and an empty line, the protocol commands follow, one per +line. Note that the the trailing LF in push-cert PKT-LINEs is _not_ +optional; it must be present. + +Currently, the following header fields are defined: + +`pusher` ident:: + Identify the GPG key in "Human Readable Name <email@address>" + format. + +`pushee` url:: + The repository URL (anonymized, if the URL contains + authentication material) the user who ran `git push` + intended to push into. + +`nonce` nonce:: + The 'nonce' string the receiving repository asked the + pushing user to include in the certificate, to prevent + replay attacks. + +The GPG signature lines are a detached signature for the contents +recorded in the push certificate before the signature block begins. +The detached signature is used to certify that the commands were +given by the pusher, who must be the signer. + Report Status ------------- @@ -517,12 +566,12 @@ update was successful, or 'ng [refname] [error]' if the update was not. 1*(command-status) flush-pkt - unpack-status = PKT-LINE("unpack" SP unpack-result LF) + unpack-status = PKT-LINE("unpack" SP unpack-result) unpack-result = "ok" / error-msg command-status = command-ok / command-fail - command-ok = PKT-LINE("ok" SP refname LF) - command-fail = PKT-LINE("ng" SP refname SP error-msg LF) + command-ok = PKT-LINE("ok" SP refname) + command-fail = PKT-LINE("ng" SP refname SP error-msg) error-msg = 1*(OCTECT) ; where not "ok" ---- diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt index e174343847..eaab6b4ac7 100644 --- a/Documentation/technical/protocol-capabilities.txt +++ b/Documentation/technical/protocol-capabilities.txt @@ -18,8 +18,9 @@ was sent. Server MUST NOT ignore capabilities that client requested and server advertised. As a consequence of these rules, server MUST NOT advertise capabilities it does not understand. -The 'report-status', 'delete-refs', and 'quiet' capabilities are sent and -recognized by the receive-pack (push to server) process. +The 'atomic', 'report-status', 'delete-refs', 'quiet', and 'push-cert' +capabilities are sent and recognized by the receive-pack (push to server) +process. The 'ofs-delta' and 'side-band-64k' capabilities are sent and recognized by both upload-pack and receive-pack protocols. The 'agent' capability @@ -168,7 +169,7 @@ agent capability). The `X` and `Y` strings may contain any printable ASCII characters except space (i.e., the byte range 32 < x < 127), and are typically of the form "package/version" (e.g., "git/1.8.3.1"). The agent strings are purely informative for statistics and debugging -purposes, and MUST NOT be used to programatically assume the presence +purposes, and MUST NOT be used to programmatically assume the presence or absence of particular features. shallow @@ -244,9 +245,33 @@ respond with the 'quiet' capability to suppress server-side progress reporting if the local progress reporting is also being suppressed (e.g., via `push -q`, or if stderr does not go to a tty). +atomic +------ + +If the server sends the 'atomic' capability it is capable of accepting +atomic pushes. If the pushing client requests this capability, the server +will update the refs in one atomic transaction. Either all refs are +updated or none. + allow-tip-sha1-in-want ---------------------- If the upload-pack server advertises this capability, fetch-pack may send "want" lines with SHA-1s that exist at the server but are not advertised by upload-pack. + +allow-reachable-sha1-in-want +---------------------------- + +If the upload-pack server advertises this capability, fetch-pack may +send "want" lines with SHA-1s that exist at the server but are not +advertised by upload-pack. + +push-cert=<nonce> +----------------- + +The receive-pack server that advertises this capability is willing +to accept a signed push certificate, and asks the <nonce> to be +included in the push certificate. A send-pack client MUST NOT +send a push-cert packet unless the receive-pack server advertises +this capability. diff --git a/Documentation/technical/protocol-common.txt b/Documentation/technical/protocol-common.txt index 889985f707..bf30167ae3 100644 --- a/Documentation/technical/protocol-common.txt +++ b/Documentation/technical/protocol-common.txt @@ -62,7 +62,10 @@ A pkt-line MAY contain binary data, so implementors MUST ensure pkt-line parsing/formatting routines are 8-bit clean. A non-binary line SHOULD BE terminated by an LF, which if present -MUST be included in the total length. +MUST be included in the total length. Receivers MUST treat pkt-lines +with non-binary data the same whether or not they contain the trailing +LF (stripping the LF if present, and not complaining when it is +missing). The maximum length of a pkt-line's data component is 65520 bytes. Implementations MUST NOT send pkt-line whose length exceeds 65524 diff --git a/Documentation/technical/racy-git.txt b/Documentation/technical/racy-git.txt index 242a044db9..4a8be4d144 100644 --- a/Documentation/technical/racy-git.txt +++ b/Documentation/technical/racy-git.txt @@ -41,13 +41,17 @@ With a `USE_STDEV` compile-time option, `st_dev` is also compared, but this is not enabled by default because this member is not stable on network filesystems. With `USE_NSEC` compile-time option, `st_mtim.tv_nsec` and `st_ctim.tv_nsec` -members are also compared, but this is not enabled by default +members are also compared. On Linux, this is not enabled by default because in-core timestamps can have finer granularity than on-disk timestamps, resulting in meaningless changes when an inode is evicted from the inode cache. See commit 8ce13b0 of git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git ([PATCH] Sync in core time granularity with filesystems, -2005-01-04). +2005-01-04). This patch is included in kernel 2.6.11 and newer, but +only fixes the issue for file systems with exactly 1 ns or 1 s +resolution. Other file systems are still broken in current Linux +kernels (e.g. CEPH, CIFS, NTFS, UDF), see +https://lkml.org/lkml/2015/6/9/714 Racy Git -------- diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt new file mode 100644 index 0000000000..00ad37986e --- /dev/null +++ b/Documentation/technical/repository-version.txt @@ -0,0 +1,88 @@ +Git Repository Format Versions +============================== + +Every git repository is marked with a numeric version in the +`core.repositoryformatversion` key of its `config` file. This version +specifies the rules for operating on the on-disk repository data. An +implementation of git which does not understand a particular version +advertised by an on-disk repository MUST NOT operate on that repository; +doing so risks not only producing wrong results, but actually losing +data. + +Because of this rule, version bumps should be kept to an absolute +minimum. Instead, we generally prefer these strategies: + + - bumping format version numbers of individual data files (e.g., + index, packfiles, etc). This restricts the incompatibilities only to + those files. + + - introducing new data that gracefully degrades when used by older + clients (e.g., pack bitmap files are ignored by older clients, which + simply do not take advantage of the optimization they provide). + +A whole-repository format version bump should only be part of a change +that cannot be independently versioned. For instance, if one were to +change the reachability rules for objects, or the rules for locking +refs, that would require a bump of the repository format version. + +Note that this applies only to accessing the repository's disk contents +directly. An older client which understands only format `0` may still +connect via `git://` to a repository using format `1`, as long as the +server process understands format `1`. + +The preferred strategy for rolling out a version bump (whether whole +repository or for a single file) is to teach git to read the new format, +and allow writing the new format with a config switch or command line +option (for experimentation or for those who do not care about backwards +compatibility with older gits). Then after a long period to allow the +reading capability to become common, we may switch to writing the new +format by default. + +The currently defined format versions are: + +Version `0` +----------- + +This is the format defined by the initial version of git, including but +not limited to the format of the repository directory, the repository +configuration file, and the object and ref storage. Specifying the +complete behavior of git is beyond the scope of this document. + +Version `1` +----------- + +This format is identical to version `0`, with the following exceptions: + + 1. When reading the `core.repositoryformatversion` variable, a git + implementation which supports version 1 MUST also read any + configuration keys found in the `extensions` section of the + configuration file. + + 2. If a version-1 repository specifies any `extensions.*` keys that + the running git has not implemented, the operation MUST NOT + proceed. Similarly, if the value of any known key is not understood + by the implementation, the operation MUST NOT proceed. + +Note that if no extensions are specified in the config file, then +`core.repositoryformatversion` SHOULD be set to `0` (setting it to `1` +provides no benefit, and makes the repository incompatible with older +implementations of git). + +This document will serve as the master list for extensions. Any +implementation wishing to define a new extension should make a note of +it here, in order to claim the name. + +The defined extensions are: + +`noop` +~~~~~~ + +This extension does not change git's behavior at all. It is useful only +for testing format-1 compatibility. + +`preciousObjects` +~~~~~~~~~~~~~~~~~ + +When the config key `extensions.preciousObjects` is set to `true`, +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or +`git repack -d`). |