diff options
Diffstat (limited to 'Documentation/technical')
-rw-r--r-- | Documentation/technical/api-argv-array.txt | 8 | ||||
-rw-r--r-- | Documentation/technical/api-builtin.txt | 13 | ||||
-rw-r--r-- | Documentation/technical/api-config.txt | 173 | ||||
-rw-r--r-- | Documentation/technical/api-hashmap.txt | 54 | ||||
-rw-r--r-- | Documentation/technical/api-run-command.txt | 7 | ||||
-rw-r--r-- | Documentation/technical/api-strbuf.txt | 28 | ||||
-rw-r--r-- | Documentation/technical/api-string-list.txt | 7 | ||||
-rw-r--r-- | Documentation/technical/api-trace.txt | 97 | ||||
-rw-r--r-- | Documentation/technical/http-protocol.txt | 2 | ||||
-rw-r--r-- | Documentation/technical/index-format.txt | 35 |
10 files changed, 405 insertions, 19 deletions
diff --git a/Documentation/technical/api-argv-array.txt b/Documentation/technical/api-argv-array.txt index a6b7d83a8e..1a797812fb 100644 --- a/Documentation/technical/api-argv-array.txt +++ b/Documentation/technical/api-argv-array.txt @@ -53,11 +53,3 @@ Functions `argv_array_clear`:: Free all memory associated with the array and return it to the initial, empty state. - -`argv_array_detach`:: - Detach the argv array from the `struct argv_array`, transferring - ownership of the allocated array and strings. - -`argv_array_free_detached`:: - Free the memory allocated by a `struct argv_array` that was later - detached and is now no longer needed. diff --git a/Documentation/technical/api-builtin.txt b/Documentation/technical/api-builtin.txt index e3d6e7a79a..22a39b9299 100644 --- a/Documentation/technical/api-builtin.txt +++ b/Documentation/technical/api-builtin.txt @@ -22,11 +22,14 @@ Git: where options is the bitwise-or of: `RUN_SETUP`:: - - Make sure there is a Git directory to work on, and if there is a - work tree, chdir to the top of it if the command was invoked - in a subdirectory. If there is no work tree, no chdir() is - done. + If there is not a Git directory to work on, abort. If there + is a work tree, chdir to the top of it if the command was + invoked in a subdirectory. If there is no work tree, no + chdir() is done. + +`RUN_SETUP_GENTLY`:: + If there is a Git directory, chdir as per RUN_SETUP, otherwise, + don't chdir anywhere. `USE_PAGER`:: diff --git a/Documentation/technical/api-config.txt b/Documentation/technical/api-config.txt index 230b3a0f60..21f280ca6d 100644 --- a/Documentation/technical/api-config.txt +++ b/Documentation/technical/api-config.txt @@ -77,6 +77,86 @@ To read a specific file in git-config format, use `git_config_from_file`. This takes the same callback and data parameters as `git_config`. +Querying For Specific Variables +------------------------------- + +For programs wanting to query for specific variables in a non-callback +manner, the config API provides two functions `git_config_get_value` +and `git_config_get_value_multi`. They both read values from an internal +cache generated previously from reading the config files. + +`int git_config_get_value(const char *key, const char **value)`:: + + Finds the highest-priority value for the configuration variable `key`, + stores the pointer to it in `value` and returns 0. When the + configuration variable `key` is not found, returns 1 without touching + `value`. The caller should not free or modify `value`, as it is owned + by the cache. + +`const struct string_list *git_config_get_value_multi(const char *key)`:: + + Finds and returns the value list, sorted in order of increasing priority + for the configuration variable `key`. When the configuration variable + `key` is not found, returns NULL. The caller should not free or modify + the returned pointer, as it is owned by the cache. + +`void git_config_clear(void)`:: + + Resets and invalidates the config cache. + +The config API also provides type specific API functions which do conversion +as well as retrieval for the queried variable, including: + +`int git_config_get_int(const char *key, int *dest)`:: + + Finds and parses the value to an integer for the configuration variable + `key`. Dies on error; otherwise, stores the value of the parsed integer in + `dest` and returns 0. When the configuration variable `key` is not found, + returns 1 without touching `dest`. + +`int git_config_get_ulong(const char *key, unsigned long *dest)`:: + + Similar to `git_config_get_int` but for unsigned longs. + +`int git_config_get_bool(const char *key, int *dest)`:: + + Finds and parses the value into a boolean value, for the configuration + variable `key` respecting keywords like "true" and "false". Integer + values are converted into true/false values (when they are non-zero or + zero, respectively). Other values cause a die(). If parsing is successful, + stores the value of the parsed result in `dest` and returns 0. When the + configuration variable `key` is not found, returns 1 without touching + `dest`. + +`int git_config_get_bool_or_int(const char *key, int *is_bool, int *dest)`:: + + Similar to `git_config_get_bool`, except that integers are copied as-is, + and `is_bool` flag is unset. + +`int git_config_get_maybe_bool(const char *key, int *dest)`:: + + Similar to `git_config_get_bool`, except that it returns -1 on error + rather than dying. + +`int git_config_get_string_const(const char *key, const char **dest)`:: + + Allocates and copies the retrieved string into the `dest` parameter for + the configuration variable `key`; if NULL string is given, prints an + error message and returns -1. When the configuration variable `key` is + not found, returns 1 without touching `dest`. + +`int git_config_get_string(const char *key, char **dest)`:: + + Similar to `git_config_get_string_const`, except that retrieved value + copied into the `dest` parameter is a mutable string. + +`int git_config_get_pathname(const char *key, const char **dest)`:: + + Similar to `git_config_get_string`, but expands `~` or `~user` into + the user's home directory when found at the beginning of the path. + +See test-config.c for usage examples. + Value Parsing Helpers --------------------- @@ -134,7 +214,98 @@ int read_file_with_include(const char *file, config_fn_t fn, void *data) `git_config` respects includes automatically. The lower-level `git_config_from_file` does not. +Custom Configsets +----------------- + +A `config_set` can be used to construct an in-memory cache for +config-like files that the caller specifies (i.e., files like `.gitmodules`, +`~/.gitconfig` etc.). For example, + +--------------------------------------- +struct config_set gm_config; +git_configset_init(&gm_config); +int b; +/* we add config files to the config_set */ +git_configset_add_file(&gm_config, ".gitmodules"); +git_configset_add_file(&gm_config, ".gitmodules_alt"); + +if (!git_configset_get_bool(gm_config, "submodule.frotz.ignore", &b)) { + /* hack hack hack */ +} + +/* when we are done with the configset */ +git_configset_clear(&gm_config); +---------------------------------------- + +Configset API provides functions for the above mentioned work flow, including: + +`void git_configset_init(struct config_set *cs)`:: + + Initializes the config_set `cs`. + +`int git_configset_add_file(struct config_set *cs, const char *filename)`:: + + Parses the file and adds the variable-value pairs to the `config_set`, + dies if there is an error in parsing the file. Returns 0 on success, or + -1 if the file does not exist or is inaccessible. The user has to decide + if he wants to free the incomplete configset or continue using it when + the function returns -1. + +`int git_configset_get_value(struct config_set *cs, const char *key, const char **value)`:: + + Finds the highest-priority value for the configuration variable `key` + and config set `cs`, stores the pointer to it in `value` and returns 0. + When the configuration variable `key` is not found, returns 1 without + touching `value`. The caller should not free or modify `value`, as it + is owned by the cache. + +`const struct string_list *git_configset_get_value_multi(struct config_set *cs, const char *key)`:: + + Finds and returns the value list, sorted in order of increasing priority + for the configuration variable `key` and config set `cs`. When the + configuration variable `key` is not found, returns NULL. The caller + should not free or modify the returned pointer, as it is owned by the cache. + +`void git_configset_clear(struct config_set *cs)`:: + + Clears `config_set` structure, removes all saved variable-value pairs. + +In addition to above functions, the `config_set` API provides type specific +functions in the vein of `git_config_get_int` and family but with an extra +parameter, pointer to struct `config_set`. +They all behave similarly to the `git_config_get*()` family described in +"Querying For Specific Variables" above. + Writing Config Files -------------------- -TODO +Git gives multiple entry points in the Config API to write config values to +files namely `git_config_set_in_file` and `git_config_set`, which write to +a specific config file or to `.git/config` respectively. They both take a +key/value pair as parameter. +In the end they both call `git_config_set_multivar_in_file` which takes four +parameters: + +- the name of the file, as a string, to which key/value pairs will be written. + +- the name of key, as a string. This is in canonical "flat" form: the section, + subsection, and variable segments will be separated by dots, and the section + and variable segments will be all lowercase. + E.g., `core.ignorecase`, `diff.SomeType.textconv`. + +- the value of the variable, as a string. If value is equal to NULL, it will + remove the matching key from the config file. + +- the value regex, as a string. It will disregard key/value pairs where value + does not match. + +- a multi_replace value, as an int. If value is equal to zero, nothing or only + one matching key/value is replaced, else all matching key/values (regardless + how many) are removed, before the new pair is written. + +It returns 0 on success. + +Also, there are functions `git_config_rename_section` and +`git_config_rename_section_in_file` with parameters `old_name` and `new_name` +for renaming or removing sections in the config files. If NULL is passed +through `new_name` parameter, the section will be removed from the config file. diff --git a/Documentation/technical/api-hashmap.txt b/Documentation/technical/api-hashmap.txt index b977ae8bbb..ad7a5bddd2 100644 --- a/Documentation/technical/api-hashmap.txt +++ b/Documentation/technical/api-hashmap.txt @@ -8,11 +8,19 @@ Data Structures `struct hashmap`:: - The hash table structure. + The hash table structure. Members can be used as follows, but should + not be modified directly: + -The `size` member keeps track of the total number of entries. The `cmpfn` -member is a function used to compare two entries for equality. The `table` and -`tablesize` members store the hash table and its size, respectively. +The `size` member keeps track of the total number of entries (0 means the +hashmap is empty). ++ +`tablesize` is the allocated size of the hash table. A non-0 value indicates +that the hashmap is initialized. It may also be useful for statistical purposes +(i.e. `size / tablesize` is the current load factor). ++ +`cmpfn` stores the comparison function specified in `hashmap_init()`. In +advanced scenarios, it may be useful to change this, e.g. to switch between +case-sensitive and case-insensitive lookup. `struct hashmap_entry`:: @@ -58,6 +66,15 @@ Functions + `strihash` and `memihash` are case insensitive versions. +`unsigned int sha1hash(const unsigned char *sha1)`:: + + Converts a cryptographic hash (e.g. SHA-1) into an int-sized hash code + for use in hash tables. Cryptographic hashes are supposed to have + uniform distribution, so in contrast to `memhash()`, this just copies + the first `sizeof(int)` bytes without shuffling any bits. Note that + the results will be different on big-endian and little-endian + platforms, so they should not be stored or transferred over the net. + `void hashmap_init(struct hashmap *map, hashmap_cmp_fn equals_function, size_t initial_size)`:: Initializes a hashmap structure. @@ -101,6 +118,20 @@ hashmap_entry) that has at least been initialized with the proper hash code If an entry with matching hash code is found, `key` and `keydata` are passed to `hashmap_cmp_fn` to decide whether the entry matches the key. +`void *hashmap_get_from_hash(const struct hashmap *map, unsigned int hash, const void *keydata)`:: + + Returns the hashmap entry for the specified hash code and key data, + or NULL if not found. ++ +`map` is the hashmap structure. ++ +`hash` is the hash code of the entry to look up. ++ +If an entry with matching hash code is found, `keydata` is passed to +`hashmap_cmp_fn` to decide whether the entry matches the key. The +`entry_or_key` parameter points to a bogus hashmap_entry structure that +should not be used in the comparison. + `void *hashmap_get_next(const struct hashmap *map, const void *entry)`:: Returns the next equal hashmap entry, or NULL if not found. This can be @@ -162,6 +193,21 @@ more entries. `hashmap_iter_first` is a combination of both (i.e. initializes the iterator and returns the first entry, if any). +`const char *strintern(const char *string)`:: +`const void *memintern(const void *data, size_t len)`:: + + Returns the unique, interned version of the specified string or data, + similar to the `String.intern` API in Java and .NET, respectively. + Interned strings remain valid for the entire lifetime of the process. ++ +Can be used as `[x]strdup()` or `xmemdupz` replacement, except that interned +strings / data must not be modified or freed. ++ +Interned strings are best used for short strings with high probability of +duplicates. ++ +Uses a hashmap to store the pool of interned strings. + Usage example ------------- diff --git a/Documentation/technical/api-run-command.txt b/Documentation/technical/api-run-command.txt index 5d7d7f2d32..69510ae57a 100644 --- a/Documentation/technical/api-run-command.txt +++ b/Documentation/technical/api-run-command.txt @@ -109,6 +109,13 @@ terminated), of which .argv[0] is the program name to run (usually without a path). If the command to run is a git command, set argv[0] to the command name without the 'git-' prefix and set .git_cmd = 1. +Note that the ownership of the memory pointed to by .argv stays with the +caller, but it should survive until `finish_command` completes. If the +.argv member is NULL, `start_command` will point it at the .args +`argv_array` (so you may use one or the other, but you must use exactly +one). The memory in .args will be cleaned up automatically during +`finish_command` (or during `start_command` when it is unsuccessful). + The members .in, .out, .err are used to redirect stdin, stdout, stderr as follows: diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt index 1d00e4d596..430302c2f4 100644 --- a/Documentation/technical/api-strbuf.txt +++ b/Documentation/technical/api-strbuf.txt @@ -121,10 +121,28 @@ Functions * Related to the contents of the buffer +`strbuf_trim`:: + + Strip whitespace from the beginning and end of a string. + Equivalent to performing `strbuf_rtrim()` followed by `strbuf_ltrim()`. + `strbuf_rtrim`:: Strip whitespace from the end of a string. +`strbuf_ltrim`:: + + Strip whitespace from the beginning of a string. + +`strbuf_reencode`:: + + Replace the contents of the strbuf with a reencoded form. Returns -1 + on error, 0 on success. + +`strbuf_tolower`:: + + Lowercase each character in the buffer using `tolower`. + `strbuf_cmp`:: Compare two buffers. Returns an integer less than, equal to, or greater @@ -289,6 +307,16 @@ same behaviour as well. use it unless you need the correct position in the file descriptor. +`strbuf_getcwd`:: + + Set the buffer to the path of the current working directory. + +`strbuf_add_absolute_path` + + Add a path to a buffer, converting a relative path to an + absolute one in the process. Symbolic links are not + resolved. + `stripspace`:: Strip whitespace from a buffer. The second parameter controls if diff --git a/Documentation/technical/api-string-list.txt b/Documentation/technical/api-string-list.txt index 20be348834..d51a6579c8 100644 --- a/Documentation/technical/api-string-list.txt +++ b/Documentation/technical/api-string-list.txt @@ -68,6 +68,11 @@ Functions * General ones (works with sorted and unsorted lists as well) +`string_list_init`:: + + Initialize the members of the string_list, set `strdup_strings` + member according to the value of the second parameter. + `filter_string_list`:: Apply a function to each item in a list, retaining only the @@ -200,3 +205,5 @@ Represents the list itself. You should not tamper with it. . Setting the `strdup_strings` member to 1 will strdup() the strings before adding them, see above. +. The `compare_strings_fn` member is used to specify a custom compare + function, otherwise `strcmp()` is used as the default function. diff --git a/Documentation/technical/api-trace.txt b/Documentation/technical/api-trace.txt new file mode 100644 index 0000000000..097a651d96 --- /dev/null +++ b/Documentation/technical/api-trace.txt @@ -0,0 +1,97 @@ +trace API +========= + +The trace API can be used to print debug messages to stderr or a file. Trace +code is inactive unless explicitly enabled by setting `GIT_TRACE*` environment +variables. + +The trace implementation automatically adds `timestamp file:line ... \n` to +all trace messages. E.g.: + +------------ +23:59:59.123456 git.c:312 trace: built-in: git 'foo' +00:00:00.000001 builtin/foo.c:99 foo: some message +------------ + +Data Structures +--------------- + +`struct trace_key`:: + + Defines a trace key (or category). The default (for API functions that + don't take a key) is `GIT_TRACE`. ++ +E.g. to define a trace key controlled by environment variable `GIT_TRACE_FOO`: ++ +------------ +static struct trace_key trace_foo = TRACE_KEY_INIT(FOO); + +static void trace_print_foo(const char *message) +{ + trace_print_key(&trace_foo, message); +} +------------ ++ +Note: don't use `const` as the trace implementation stores internal state in +the `trace_key` structure. + +Functions +--------- + +`int trace_want(struct trace_key *key)`:: + + Checks whether the trace key is enabled. Used to prevent expensive + string formatting before calling one of the printing APIs. + +`void trace_disable(struct trace_key *key)`:: + + Disables tracing for the specified key, even if the environment + variable was set. + +`void trace_printf(const char *format, ...)`:: +`void trace_printf_key(struct trace_key *key, const char *format, ...)`:: + + Prints a formatted message, similar to printf. + +`void trace_argv_printf(const char **argv, const char *format, ...)``:: + + Prints a formatted message, followed by a quoted list of arguments. + +`void trace_strbuf(struct trace_key *key, const struct strbuf *data)`:: + + Prints the strbuf, without additional formatting (i.e. doesn't + choke on `%` or even `\0`). + +`uint64_t getnanotime(void)`:: + + Returns nanoseconds since the epoch (01/01/1970), typically used + for performance measurements. ++ +Currently there are high precision timer implementations for Linux (using +`clock_gettime(CLOCK_MONOTONIC)`) and Windows (`QueryPerformanceCounter`). +Other platforms use `gettimeofday` as time source. + +`void trace_performance(uint64_t nanos, const char *format, ...)`:: +`void trace_performance_since(uint64_t start, const char *format, ...)`:: + + Prints the elapsed time (in nanoseconds), or elapsed time since + `start`, followed by a formatted message. Enabled via environment + variable `GIT_TRACE_PERFORMANCE`. Used for manual profiling, e.g.: ++ +------------ +uint64_t start = getnanotime(); +/* code section to measure */ +trace_performance_since(start, "foobar"); +------------ ++ +------------ +uint64_t t = 0; +for (;;) { + /* ignore */ + t -= getnanotime(); + /* code section to measure */ + t += getnanotime(); + /* ignore */ +} +trace_performance(t, "frotz"); +------------ diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt index 58c7e872d7..229f845dfa 100644 --- a/Documentation/technical/http-protocol.txt +++ b/Documentation/technical/http-protocol.txt @@ -374,7 +374,7 @@ C: Send one `$GIT_URL/git-upload-pack` request: C: 0000 The stream is organized into "commands", with each command -appearing by itself in a pkt-line. Within a command line +appearing by itself in a pkt-line. Within a command line, the text leading up to the first space is the command name, and the remainder of the line to the first LF is the value. Command lines are terminated with an LF as the last byte of diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt index f352a9b22e..fe6f31667d 100644 --- a/Documentation/technical/index-format.txt +++ b/Documentation/technical/index-format.txt @@ -129,6 +129,9 @@ Git index format (Version 4) In version 4, the padding after the pathname does not exist. + Interpretation of index entries in split index mode is completely + different. See below for details. + == Extensions === Cached tree @@ -198,3 +201,35 @@ Git index format - At most three 160-bit object names of the entry in stages from 1 to 3 (nothing is written for a missing stage). +=== Split index + + In split index mode, the majority of index entries could be stored + in a separate file. This extension records the changes to be made on + top of that to produce the final index. + + The signature for this extension is { 'l', 'i, 'n', 'k' }. + + The extension consists of: + + - 160-bit SHA-1 of the shared index file. The shared index file path + is $GIT_DIR/sharedindex.<SHA-1>. If all 160 bits are zero, the + index does not require a shared index file. + + - An ewah-encoded delete bitmap, each bit represents an entry in the + shared index. If a bit is set, its corresponding entry in the + shared index will be removed from the final index. Note, because + a delete operation changes index entry positions, but we do need + original positions in replace phase, it's best to just mark + entries for removal, then do a mass deletion after replacement. + + - An ewah-encoded replace bitmap, each bit represents an entry in + the shared index. If a bit is set, its corresponding entry in the + shared index will be replaced with an entry in this index + file. All replaced entries are stored in sorted order in this + index. The first "1" bit in the replace bitmap corresponds to the + first index entry, the second "1" bit to the second entry and so + on. Replaced entries may have empty path names to save space. + + The remaining index entries after replaced ones will be added to the + final index. These added entries are also sorted by entry namme then + stage. |