summaryrefslogtreecommitdiff
path: root/repository.h
AgeCommit message (Collapse)AuthorFilesLines
2019-10-07Merge branch 'jk/disable-commit-graph-during-upload-pack'Libravatar Junio C Hamano1-0/+3
The "upload-pack" (the counterpart of "git fetch") needs to disable commit-graph when responding to a shallow clone/fetch request, but the way this was done made Git panic, which has been corrected. * jk/disable-commit-graph-during-upload-pack: upload-pack: disable commit graph more gently for shallow traversal commit-graph: bump DIE_ON_LOAD check to actual load-time
2019-09-12upload-pack: disable commit graph more gently for shallow traversalLibravatar Jeff King1-0/+3
When the client has asked for certain shallow options like "deepen-since", we do a custom rev-list walk that pretends to be shallow. Before doing so, we have to disable the commit-graph, since it is not compatible with the shallow view of the repository. That's handled by 829a321569 (commit-graph: close_commit_graph before shallow walk, 2018-08-20). That commit literally closes and frees our repo->objects->commit_graph struct. That creates an interesting problem for commits that have _already_ been parsed using the commit graph. Their commit->object.parsed flag is set, their commit->graph_pos is set, but their commit->maybe_tree may still be NULL. When somebody later calls repo_get_commit_tree(), we see that we haven't loaded the tree oid yet and try to get it from the commit graph. But since it has been freed, we segfault! So the root of the issue is a data dependency between the commit's lazy-load of the tree oid and the fact that the commit graph can go away mid-process. How can we resolve it? There are a couple of general approaches: 1. The obvious answer is to avoid loading the tree from the graph when we see that it's NULL. But then what do we return for the tree oid? If we return NULL, our caller in do_traverse() will rightly complain that we have no tree. We'd have to fallback to loading the actual commit object and re-parsing it. That requires teaching parse_commit_buffer() to understand re-parsing (i.e., not starting from a clean slate and not leaking any allocated bits like parent list pointers). 2. When we close the commit graph, walk through the set of in-memory objects and clear any graph_pos pointers. But this means we also have to "unparse" any such commits so that we know they still need to open the commit object to fill in their trees. So it's no less complicated than (1), and is more expensive (since we clear objects we might not later need). 3. Stop freeing the commit-graph struct. Continue to let it be used for lazy-loads of tree oids, but let upload-pack specify that it shouldn't be used for further commit parsing. 4. Push the whole shallow rev-list out to its own sub-process, with the commit-graph disabled from the start, giving it a clean memory space to work from. I've chosen (3) here. Options (1) and (2) would work, but are non-trivial to implement. Option (4) is more expensive, and I'm not sure how complicated it is (shelling out for the actual rev-list part is easy, but we do then parse the resulting commits internally, and I'm not clear which parts need to be handling shallow-ness). The new test in t5500 triggers this segfault, but see the comments there for how horribly intimate it has to be with how both upload-pack and commit graphs work. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-09-03fetch: add fetch.writeCommitGraph config settingLibravatar Derrick Stolee1-0/+1
The commit-graph feature is now on by default, and is being written during 'git gc' by default. Typically, Git only writes a commit-graph when a 'git gc --auto' command passes the gc.auto setting to actualy do work. This means that a commit-graph will typically fall behind the commits that are being used every day. To stay updated with the latest commits, add a step to 'git fetch' to write a commit-graph after fetching new objects. The fetch.writeCommitGraph config setting enables writing a split commit-graph, so on average the cost of writing this file is very small. Occasionally, the commit-graph chain will collapse to a single level, and this could be slow for very large repos. For additional use, adjust the default to be true when feature.experimental is enabled. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-13repo-settings: create feature.experimental settingLibravatar Derrick Stolee1-0/+8
The 'feature.experimental' setting includes config options that are not committed to become defaults, but could use additional testing. Update the following config settings to take new defaults, and to use the repo_settings struct if not already using them: * 'pack.useSparse=true' * 'fetch.negotiationAlgorithm=skipping' In the case of fetch.negotiationAlgorithm, the existing logic would load the config option only when about to use the setting, so had a die() statement on an unknown string value. This is removed as now the config is parsed under prepare_repo_settings(). In general, this die() is probably misplaced and not valuable. A test was removed that checked this die() statement executed. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-13repo-settings: parse core.untrackedCacheLibravatar Derrick Stolee1-0/+8
The core.untrackedCache config setting is slightly complicated, so clarify its use and centralize its parsing into the repo settings. The default value is "keep" (returned as -1), which persists the untracked cache if it exists. If the value is set as "false" (returned as 0), then remove the untracked cache if it exists. If the value is set as "true" (returned as 1), then write the untracked cache and persist it. Instead of relying on magic values of -1, 0, and 1, split these options into an enum. This allows the use of "-1" as a default value. After parsing the config options, if the value is unset we can initialize it to UNTRACKED_CACHE_KEEP. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-13repo-settings: consolidate some config settingsLibravatar Derrick Stolee1-0/+14
There are a few important config settings that are not loaded during git_default_config. These are instead loaded on-demand. Centralize these config options to a single scan, and store all of the values in a repo_settings struct. The values for each setting are initialized as negative to indicate "unset". This centralization will be particularly important in a later change to introduce "meta" config settings that change the defaults for these config settings. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-22trace2: create new combined trace facilityLibravatar Jeff Hostetler1-0/+3
Create a new unified tracing facility for git. The eventual intent is to replace the current trace_printf* and trace_performance* routines with a unified set of git_trace2* routines. In addition to the usual printf-style API, trace2 provides higer-level event verbs with fixed-fields allowing structured data to be written. This makes post-processing and analysis easier for external tools. Trace2 defines 3 output targets. These are set using the environment variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be set to "1" or to an absolute pathname (just like the current GIT_TRACE). * GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command summary data. * GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE. It extends the output with columns for the command process, thread, repo, absolute and relative elapsed times. It reports events for child process start/stop, thread start/stop, and per-thread function nesting. * GIT_TR2_EVENT is a new structured format. It writes event data as a series of JSON records. Calls to trace2 functions log to any of the 3 output targets enabled without the need to call different trace_printf* or trace_performance* routines. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-02-06Merge branch 'nd/the-index-final'Libravatar Junio C Hamano1-0/+16
The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument
2019-01-14read-cache.c: replace update_index_if_able with repo_&Libravatar Nguyễn Thái Ngọc Duy1-0/+6
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14read-cache.c: kill read_index()Libravatar Nguyễn Thái Ngọc Duy1-0/+6
read_index() shares the same problem as hold_locked_index(): it assumes $GIT_DIR/index. Move all call sites to repo_read_index() instead. read_index_preload() and read_index_unmerged() are also killed as a consequence. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-14repository.c: replace hold_locked_index() with repo_hold_locked_index()Libravatar Nguyễn Thái Ngọc Duy1-0/+4
hold_locked_index() assumes the index path at $GIT_DIR/index. This is not good for places that take an arbitrary index_state instead of the_index, which is basically everywhere except builtin/. Replace it with repo_hold_locked_index(). hold_locked_index() remains as a wrapper around repo_hold_locked_index() to reduce changes in builtin/ Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-05repository: repo_submodule_init to take a submodule structLibravatar Stefan Beller1-2/+10
When constructing a struct repository for a submodule for some revision of the superproject where the submodule is not contained in the index, it may not be present in the working tree currently either. In that situation giving a 'path' argument is not useful. Upgrade the repo_submodule_init function to take a struct submodule instead. The submodule struct can be obtained via submodule_from_{path, name} or an artificial submodule struct can be passed in. While we are at it, rename the repository struct in the repo_submodule_init function, which is to be initialized, to a name that is not confused with the struct submodule as easily. Perform such renames in similar functions as well. Also move its documentation into the header file. Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-20Merge branch 'en/incl-forward-decl'Libravatar Junio C Hamano1-0/+2
Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations
2018-08-15Add missing includes and forward declarationsLibravatar Elijah Newren1-0/+2
I looped over the toplevel header files, creating a temporary two-line C program for each consisting of #include "git-compat-util.h" #include $HEADER This patch is the result of manually fixing errors in compiling those tiny programs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-03repository.h: drop extern from function declarationLibravatar Nguyễn Thái Ngọc Duy1-14/+11
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-07-18Merge branch 'sb/object-store-grafts'Libravatar Junio C Hamano1-0/+5
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h
2018-06-25Merge branch 'sb/object-store-alloc'Libravatar Junio C Hamano1-0/+9
The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-alloc: alloc: allow arbitrary repositories for alloc functions object: allow create_object to handle arbitrary repositories object: allow grow_object_hash to handle arbitrary repositories alloc: add repository argument to alloc_commit_index alloc: add repository argument to alloc_report alloc: add repository argument to alloc_object_node alloc: add repository argument to alloc_tag_node alloc: add repository argument to alloc_commit_node alloc: add repository argument to alloc_tree_node alloc: add repository argument to alloc_blob_node object: add repository argument to grow_object_hash object: add repository argument to create_object repository: introduce parsed objects field
2018-05-18path.c: migrate global git_path_* to take a repository argumentLibravatar Stefan Beller1-0/+5
Migrate all git_path_* functions that are defined in path.c to take a repository argument. Unlike other patches in this series, do not use the #define trick, as we rewrite the whole function, which is rather small. This doesn't migrate all the functions, as other builtins have their own local path functions defined using GIT_PATH_FUNC. So keep that macro around to serve the other locations. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-09repository: introduce parsed objects fieldLibravatar Stefan Beller1-0/+9
Convert the existing global cache for parsed objects (obj_hash) into repository-specific parsed object caches. Existing code that uses obj_hash are modified to use the parsed object cache of the_repository; future patches will use the parsed object caches of other repositories. Another future use case for a pool of objects is ease of memory management in revision walking: If we can free the rev-list related memory early in pack-objects (e.g. part of repack operation) then it could lower memory pressure significantly when running on large repos. While this has been discussed on the mailing list lately, this series doesn't implement this. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-08Merge branch 'sb/object-store-replace'Libravatar Junio C Hamano1-0/+3
The effort to pass the repository in-core structure throughout the API continues. This round deals with the code that implements the refs/replace/ mechanism. * sb/object-store-replace: replace-object: allow lookup_replace_object to handle arbitrary repositories replace-object: allow do_lookup_replace_object to handle arbitrary repositories replace-object: allow prepare_replace_object to handle arbitrary repositories refs: allow for_each_replace_ref to handle arbitrary repositories refs: store the main ref store inside the repository struct replace-object: add repository argument to lookup_replace_object replace-object: add repository argument to do_lookup_replace_object replace-object: add repository argument to prepare_replace_object refs: add repository argument to for_each_replace_ref refs: add repository argument to get_main_ref_store replace-object: check_replace_refs is safe in multi repo environment replace-object: eliminate replace objects prepared flag object-store: move lookup_replace_object to replace-object.h replace-object: move replace_map to object store replace_object: use oidmap
2018-04-12refs: store the main ref store inside the repository structLibravatar Stefan Beller1-0/+3
This moves the 'main_ref_store', which was a global variable in refs.c into the repository struct. This patch does not deal with the parts in the refs subsystem which deal with the submodules there. A later patch needs to get rid of the submodule exposure in the refs API, such as 'get_submodule_ref_store(path)'. Acked-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-29submodule: fixup nested submodules after moving the submoduleLibravatar Stefan Beller1-0/+3
connect_work_tree_and_git_dir is used to connect a submodule worktree with its git directory and vice versa after events that require a reconnection such as moving around the working tree. As submodules can have nested submodules themselves, we'd also want to fix the nested submodules when asked to. Add an option to recurse into the nested submodules and connect them as well. As submodules are identified by their name (which determines their git directory in relation to their superproject's git directory) internally and by their path in the working tree of the superproject, we need to make sure that the mapping of name <-> path is kept intact. We can do that in the git-mv command by writing out the gitmodules file first and then forcing a reload of the submodule config machinery. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-23repository: introduce raw object store fieldLibravatar Stefan Beller1-7/+4
The raw object store field will contain any objects needed for access to objects in a given repository. This patch introduces the raw object store and populates it with the `objectdir`, which used to be part of the repository struct. As the struct gains members, we'll also populate the function to clear the memory for these members. In a later step, we'll introduce a struct object_parser, that will complement the object parsing in a repository struct: The raw object parser is the layer that will provide access to raw object content, while the higher level object parser code will parse raw objects and keeps track of parenthood and other object relationships using 'struct object'. For now only add the lower level to the repository struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-23repository.h: add comment and clarify repo_set_gitdirLibravatar Nguyễn Thái Ngọc Duy1-1/+5
The argument name "optional" may mislead the reader to think this option could be NULL. But it can't be. While at there, document a bit more about struct set_gitdir_args. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05repository: delete ignore_env memberLibravatar Nguyễn Thái Ngọc Duy1-9/+0
This variable was added because the repo_set_gitdir() was created to cover both submodule and main repos, but these two are initialized a bit differently so ignore_env == 0 means main repo, while ignore_env != 0 is submodules. Since the difference part (env variables) has been moved out of repo_set_gitdir(), this function works the same way for both repo types and ignore_env is not needed anymore. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05sha1_file.c: move delayed getenv(altdb) back to setup_git_env()Libravatar Nguyễn Thái Ngọc Duy1-0/+4
getenv() is supposed to work on the main repository only. This delayed getenv() code in sha1_file.c makes it more difficult to convert sha1_file.c to a generic object store that could be used by both submodule and main repositories. Move the getenv() back in setup_git_env() where other env vars are also fetched. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05repository.c: move env-related setup code back to environment.cLibravatar Nguyễn Thái Ngọc Duy1-1/+10
It does not make sense that generic repository code contains handling of environment variables, which are specific for the main repository only. Refactor repo_set_gitdir() function to take $GIT_DIR and optionally _all_ other customizable paths. These optional paths can be NULL and will be calculated according to the default directory layout. Note that some dead functions are left behind to reduce diff noise. They will be deleted in the next patch. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-05repository: initialize the_repository in main()Libravatar Nguyễn Thái Ngọc Duy1-1/+1
This simplifies initialization of struct repository and anything inside. Easier to read. Easier to add/remove fields. Everything will go through main() common-main.c so this should cover all programs, including t/helper. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-13Integrate hash algorithm support with repo setupLibravatar brian m. carlson1-0/+5
In future versions of Git, we plan to support an additional hash algorithm. Integrate the enumeration of hash algorithms with repository setup, and store a pointer to the enumerated data in struct repository. Of course, we currently only support SHA-1, so hard-code this value in read_repository_format. In the future, we'll enumerate this value from the configuration. Add a constant, the_hash_algo, which points to the hash_algo structure pointer in the repository global. Note that this is the hash which is used to serialize data to disk, not the hash which is used to display items to the user. The transition plan anticipates that these may be different. We can add an additional element in the future (say, ui_hash_algo) to provide for this case. Include repository.h in cache.h since we now need to have access to these struct and variable definitions. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-18repo_read_index: don't discard the indexLibravatar Brandon Williams1-0/+8
Have 'repo_read_index()' behave more like the other read_index family of functions and don't discard the index if it has already been populated and instead rely on the quick return of read_index_from which has: /* istate->initialized covers both .git/index and .git/sharedindex.xxx */ if (istate->initialized) return istate->cache_nr; Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23repository: enable initialization of submodulesLibravatar Brandon Williams1-0/+10
Introduce 'repo_submodule_init()' which performs initialization of a 'struct repository' as a submodule of another 'struct repository'. The resulting submodule 'struct repository' can be in one of three states: 1. The submodule is initialized and has a worktree. 2. The submodule is initialized but does not have a worktree. This would occur when the submodule's gitdir is present in the superproject's 'gitdir/modules/' directory yet the submodule has not been checked out in superproject's worktree. 3. The submodule remains uninitialized due to an error in the initialization process or there is no matching submodule at the provided path in the superproject. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23submodule-config: store the_submodule_cache in the_repositoryLibravatar Brandon Williams1-0/+4
Refactor how 'the_submodule_cache' is handled so that it can be stored inside of a repository object. Also migrate 'the_submodule_cache' to be stored in 'the_repository'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23repository: add index_state to struct repoLibravatar Brandon Williams1-0/+9
Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23config: read config from a repository objectLibravatar Brandon Williams1-0/+10
Teach the config machinery to read config information from a repository object. This involves storing a 'struct config_set' inside the repository object and adding a number of functions (repo_config*) to be able to query a repository's config. The current config API enables lazy-loading of the config. This means that when 'git_config_get_int()' is called, if the_config_set hasn't been populated yet, then it will be populated and properly initialized by reading the necessary config files (system wide .gitconfig, user's home .gitconfig, and the repository's config). To maintain this paradigm, the new API to read from a repository object's config will also perform this lazy-initialization. Since both APIs (git_config_get* and repo_config_get*) have the same semantics we can migrate the default config to be stored within 'the_repository' and just have the 'git_config_get*' family of functions redirect to the 'repo_config_get*' functions. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23repository: introduce the repository objectLibravatar Brandon Williams1-0/+64
Introduce the repository object 'struct repository' which can be used to hold all state pertaining to a git repository. Some of the benefits of object-ifying a repository are: 1. Make the code base more readable and easier to reason about. 2. Allow for working on multiple repositories, specifically submodules, within the same process. Currently the process for working on a submodule involves setting up an argv_array of options for a particular command and then launching a child process to execute the command in the context of the submodule. This is clunky and can require lots of little hacks in order to ensure correctness. Ideally it would be nice to simply pass a repository and an options struct to a command. 3. Eliminating reliance on global state will make it easier to enable the use of threading to improve performance. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>