summaryrefslogtreecommitdiff
path: root/sha1_file.c
AgeCommit message (Collapse)AuthorFilesLines
2014-10-29Merge branch 'jk/prune-mtime'Libravatar Junio C Hamano1-11/+198
Tighten the logic to decide that an unreachable cruft is sufficiently old by covering corner cases such as an ancient object becoming reachable and then going unreachable again, in which case its retention period should be prolonged. * jk/prune-mtime: (28 commits) drop add_object_array_with_mode revision: remove definition of unused 'add_object' function pack-objects: double-check options before discarding objects repack: pack objects mentioned by the index pack-objects: use argv_array reachable: use revision machinery's --indexed-objects code rev-list: add --indexed-objects option rev-list: document --reflog option t5516: test pushing a tag of an otherwise unreferenced blob traverse_commit_list: support pending blobs/trees with paths make add_object_array_with_context interface more sane write_sha1_file: freshen existing objects pack-objects: match prune logic for discarding objects pack-objects: refactor unpack-unreachable expiration check prune: keep objects reachable from recent objects sha1_file: add for_each iterators for loose and packed objects count-objects: use for_each_loose_file_in_objdir count-objects: do not use xsize_t when counting object size prune-packed: use for_each_loose_file_in_objdir reachable: mark index blobs as SEEN ...
2014-10-16write_sha1_file: freshen existing objectsLibravatar Jeff King1-7/+44
When we try to write a loose object file, we first check whether that object already exists. If so, we skip the write as an optimization. However, this can interfere with prune's strategy of using mtimes to mark files in progress. For example, if a branch contains a particular tree object and is deleted, that tree object may become unreachable, and have an old mtime. If a new operation then tries to write the same tree, this ends up as a noop; we notice we already have the object and do nothing. A prune running simultaneously with this operation will see the object as old, and may delete it. We can solve this by "freshening" objects that we avoid writing by updating their mtime. The algorithm for doing so is essentially the same as that of has_sha1_file. Therefore we provide a new (static) interface "check_and_freshen", which finds and optionally freshens the object. It's trivial to implement freshening and simple checking by tweaking a single parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16sha1_file: add for_each iterators for loose and packed objectsLibravatar Jeff King1-0/+62
We typically iterate over the reachable objects in a repository by starting at the tips and walking the graph. There's no easy way to iterate over all of the objects, including unreachable ones. Let's provide a way of doing so. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16prune: factor out loose-object directory traversalLibravatar Jeff King1-0/+84
Prune has to walk $GIT_DIR/objects/?? in order to find the set of loose objects to prune. Other parts of the code (e.g., count-objects) want to do the same. Let's factor it out into a reusable for_each-style function. Note that this is not quite a straight code movement. The original code had strange behavior when it found a file of the form "[0-9a-f]{2}/.{38}" that did _not_ contain all hex digits. It executed a "break" from the loop, meaning that we stopped pruning in that directory (but still pruned other directories!). This was probably a bug; we do not want to process the file as an object, but we should keep going otherwise (and that is how the new code handles it). We are also a little more careful with loose object directories which fail to open. The original code silently ignored any failures, but the new code will complain about any problems besides ENOENT. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-16foreach_alt_odb: propagate return value from callbackLibravatar Jeff King1-4/+8
We check the return value of the callback and stop iterating if it is non-zero. However, we do not make the non-zero return value available to the caller, so they have no way of knowing whether the operation succeeded or not (technically they can keep their own error flag in the callback data, but that is unlike our other for_each functions). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-14Merge branch 'mh/lockfile'Libravatar Junio C Hamano1-0/+1
The lockfile API and its users have been cleaned up. * mh/lockfile: (38 commits) lockfile.h: extract new header file for the functions in lockfile.c hold_locked_index(): move from lockfile.c to read-cache.c hold_lock_file_for_append(): restore errno before returning get_locked_file_path(): new function lockfile.c: rename static functions lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF commit_lock_file_to(): refactor a helper out of commit_lock_file() trim_last_path_component(): replace last_path_elm() resolve_symlink(): take a strbuf parameter resolve_symlink(): use a strbuf for internal scratch space lockfile: change lock_file::filename into a strbuf commit_lock_file(): use a strbuf to manage temporary space try_merge_strategy(): use a statically-allocated lock_file object try_merge_strategy(): remove redundant lock_file allocation struct lock_file: declare some fields volatile lockfile: avoid transitory invalid states git_config_set_multivar_in_file(): avoid call to rollback_lock_file() dump_marks(): remove a redundant call to rollback_lock_file() api-lockfile: document edge cases commit_lock_file(): rollback lock file on failure to rename ...
2014-10-08Merge branch 'sp/stream-clean-filter'Libravatar Junio C Hamano1-7/+53
When running a required clean filter, we do not have to mmap the original before feeding the filter. Instead, stream the file contents directly to the filter and process its output. * sp/stream-clean-filter: sha1_file: don't convert off_t to size_t too early to avoid potential die() convert: stream from fd to required clean filter to reduce used address space copy_fd(): do not close the input file descriptor mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap size memory_limit: use git_env_ulong() to parse GIT_ALLOC_LIMIT config.c: add git_env_ulong() to parse environment variable convert: drop arguments other than 'path' from would_convert_to_git()
2014-10-01lockfile.h: extract new header file for the functions in lockfile.cLibravatar Michael Haggerty1-0/+1
Move the interface declaration for the functions in lockfile.c from cache.h to a new file, lockfile.h. Add #includes where necessary (and remove some redundant includes of cache.h by files that already include builtin.h). Move the documentation of the lock_file state diagram from lockfile.c to the new header file. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-22sha1_file: don't convert off_t to size_t too early to avoid potential die()Libravatar Steffen Prohaska1-4/+9
xsize_t() checks if an off_t argument can be safely converted to a size_t return value. If the check is executed too early, it could fail for large files on 32-bit architectures even if the size_t code path is not taken. Other paths might be able to handle the large file. Specifically, index_stream_convert_blob() is able to handle a large file if a filter is configured that returns a small result. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-11Merge branch 'nd/large-blobs'Libravatar Junio C Hamano1-1/+3
Teach a few codepaths to punt (instead of dying) when large blobs that would not fit in core are involved in the operation. * nd/large-blobs: diff: shortcut for diff'ing two binary SHA-1 objects diff --stat: mark any file larger than core.bigfilethreshold binary diff.c: allow to pass more flags to diff_populate_filespec sha1_file.c: do not die failing to malloc in unpack_compressed_entry wrapper.c: introduce gentle xmallocz that does not die()
2014-09-02Merge branch 'rs/strbuf-getcwd'Libravatar Junio C Hamano1-1/+1
Reduce the use of fixed sized buffer passed to getcwd() calls by introducing xgetcwd() helper. * rs/strbuf-getcwd: use strbuf_add_absolute_path() to add absolute paths abspath: convert absolute_path() to strbuf use xgetcwd() to set $GIT_DIR use xgetcwd() to get the current directory or die wrapper: add xgetcwd() abspath: convert real_path_internal() to strbuf abspath: use strbuf_getcwd() to remember original working directory setup: convert setup_git_directory_gently_1 et al. to strbuf unix-sockets: use strbuf_getcwd() strbuf: add strbuf_getcwd()
2014-08-28convert: stream from fd to required clean filter to reduce used address spaceLibravatar Steffen Prohaska1-1/+26
The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-28mmap_limit: introduce GIT_MMAP_LIMIT to allow testing expected mmap sizeLibravatar Steffen Prohaska1-1/+17
In order to test expectations about mmap in a way similar to testing expectations about malloc with GIT_ALLOC_LIMIT introduced by d41489a6 (Add more large blob test cases, 2012-03-07), introduce a new environment variable GIT_MMAP_LIMIT to limit the largest allowed mmap length. xmmap() is modified to check the size of the requested region and fail it if it is beyond the limit. Together with GIT_ALLOC_LIMIT tests can now confirm expectations about memory consumption. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-21convert: drop arguments other than 'path' from would_convert_to_git()Libravatar Steffen Prohaska1-1/+1
It is only the path that matters in the decision whether to filter or not. Clarify this by making path the only argument of would_convert_to_git(). Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18sha1_file.c: do not die failing to malloc in unpack_compressed_entryLibravatar Nguyễn Thái Ngọc Duy1-1/+3
Fewer die() gives better control to the caller, provided that the caller _can_ handle it. And in unpack_compressed_entry() case, it can, because unpack_compressed_entry() already returns NULL if it fails to inflate data. A side effect from this is fsck continues to run when very large blobs are present (and do not fit in memory). Noticed-by: Dale R. Worley <worley@alum.mit.edu> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-22Merge branch 'kb/perf-trace'Libravatar Junio C Hamano1-26/+4
* kb/perf-trace: api-trace.txt: add trace API documentation progress: simplify performance measurement by using getnanotime() wt-status: simplify performance measurement by using getnanotime() git: add performance tracing for git's main() function to debug scripts trace: add trace_performance facility to debug performance issues trace: add high resolution timer function to debug performance issues trace: add 'file:line' to all trace output trace: move code around, in preparation to file:line output trace: add current timestamp to all trace output trace: disable additional trace output for unit tests trace: add infrastructure to augment trace output with additional info sha1_file: change GIT_TRACE_PACK_ACCESS logging to use trace API Documentation/git.txt: improve documentation of 'GIT_TRACE*' variables trace: improve trace performance trace: remove redundant printf format attribute trace: consistently name the format parameter trace: move trace declarations from cache.h to new trace.h
2014-07-21Merge branch 'ek/alt-odb-entry-fix'Libravatar Junio C Hamano1-4/+9
* ek/alt-odb-entry-fix: sha1_file: do not add own object directory as alternate
2014-07-16Merge branch 'jk/strip-suffix'Libravatar Junio C Hamano1-31/+26
* jk/strip-suffix: prepare_packed_git_one: refactor duplicate-pack check verify-pack: use strbuf_strip_suffix strbuf: implement strbuf_strip_suffix index-pack: use strip_suffix to avoid magic numbers use strip_suffix instead of ends_with in simple cases replace has_extension with ends_with implement ends_with via strip_suffix add strip_suffix function sha1_file: replace PATH_MAX buffer with strbuf in prepare_packed_git_one()
2014-07-16Merge branch 'rs/fix-alt-odb-path-comparison' into maintLibravatar Junio C Hamano1-1/+2
Code to avoid adding the same alternate object store twice was subtly broken for a long time, but nobody seems to have noticed. * rs/fix-alt-odb-path-comparison: sha1_file: avoid overrunning alternate object base string
2014-07-15sha1_file: do not add own object directory as alternateLibravatar Ephrim Khong1-4/+9
When adding alternate object directories, we try not to add the directory of the current repository to avoid cycles. Unfortunately, that test was broken, since it compared an absolute with a relative path. Signed-off-by: Ephrim Khong <dr.khong@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-13sha1_file: change GIT_TRACE_PACK_ACCESS logging to use trace APILibravatar Karsten Blees1-26/+4
This changes GIT_TRACE_PACK_ACCESS functionality as follows: * supports the same options as GIT_TRACE (e.g. printing to stderr) * no longer supports relative paths * appends to the trace file rather than overwriting Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-10Merge branch 'rs/fix-alt-odb-path-comparison'Libravatar Junio C Hamano1-1/+2
* rs/fix-alt-odb-path-comparison: sha1_file: avoid overrunning alternate object base string
2014-07-01sha1_file: avoid overrunning alternate object base stringLibravatar René Scharfe1-1/+2
While checking if a new alternate object database is a duplicate make sure that old and new base paths have the same length before comparing them with memcmp. This avoids overrunning the buffer of the existing entry if the new one is longer and it stops rejecting foobar/ after foo/ was already added. Signed-off-by: Rene Scharfe <ls.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-30prepare_packed_git_one: refactor duplicate-pack checkLibravatar Jeff King1-2/+7
When we are reloading the list of packs, we check whether a particular pack has been loaded. This is slightly tricky, because we load packs based on the presence of their ".idx" files, but record the name of the matching ".pack" file. Therefore we want to compare their bases. The existing code stripped off ".idx" from a file we found, then compared that whole base length to strings containing the ".pack" version. This meant we could end up comparing bytes past what the ".pack" string contained, if the ".idx" file name was much longer. In practice, it worked OK because memcmp would end up seeing a difference in the two strings and would return early before hitting the full length. However, memcmp may sometimes read extra bytes past a difference (e.g., because it is comparing 64-bit words), or is even free to compare in reverse order. Furthermore, our memcmp made no guarantees that we matched the whole pack name, up to ".pack". So "foo.idx" would match "foo-bar.pack", which is wrong (but does not typically happen, because our pack names have a fixed size). We can fix both issues, avoid magic numbers, and document that we expect to compare against a string with ".pack" by using strip_suffix. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-30replace has_extension with ends_withLibravatar Jeff King1-5/+5
These two are almost the same function, with the exception that has_extension only matches if there is content before the suffix. So ends_with(".exe", ".exe") is true, but has_extension would not be. This distinction does not matter to any of the callers, though, and we can just replace uses of has_extension with ends_with. We prefer the "ends_with" name because it is more generic, and there is nothing about the function that requires it to be used for file extensions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-30sha1_file: replace PATH_MAX buffer with strbuf in prepare_packed_git_one()Libravatar René Scharfe1-26/+16
Instead of using strbuf to create a message string in case a path is too long for our fixed-size buffer, replace that buffer with a strbuf and thus get rid of the limitation. Helped-by: Duy Nguyen <pclouds@gmail.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-25Merge branch 'jk/report-fail-to-read-objects-better' into maintLibravatar Junio C Hamano1-1/+5
Reworded the error message given upon a failure to open an existing loose object file due to e.g. permission issues; it was reported as the object being corrupt, but that is not quite true. * jk/report-fail-to-read-objects-better: open_sha1_file: report "most interesting" errno
2014-06-16Merge branch 'jk/report-fail-to-read-objects-better'Libravatar Junio C Hamano1-1/+5
* jk/report-fail-to-read-objects-better: open_sha1_file: report "most interesting" errno
2014-05-15open_sha1_file: report "most interesting" errnoLibravatar Jeff King1-1/+5
When we try to open a loose object file, we first attempt to open in the local object database, and then try any alternates. This means that the errno value when we return will be from the last place we looked (and due to the way the code is structured, simply ENOENT if we do not have have any alternates). This can cause confusing error messages, as read_sha1_file checks for ENOENT when reporting a missing object. If errno is something else, we report that. If it is ENOENT, but has_loose_object reports that we have it, then we claim the object is corrupted. For example: $ chmod 0 .git/objects/??/* $ git rev-list --all fatal: loose object b2d6fab18b92d49eac46dc3c5a0bcafabda20131 (stored in .git/objects/b2/d6fab18b92d49eac46dc3c5a0bcafabda20131) is corrupt This patch instead keeps track of the "most interesting" errno we receive during our search. We consider ENOENT to be the least interesting of all, and otherwise report the first error found (so problems in the object database take precedence over ones in alternates). Here it is with this patch: $ git rev-list --all fatal: failed to read object b2d6fab18b92d49eac46dc3c5a0bcafabda20131: Permission denied Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-08Merge branch 'jl/nor-or-nand-and'Libravatar Junio C Hamano1-2/+2
Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"
2014-03-31code and test: fix misuses of "nor"Libravatar Justin Lebar1-1/+1
Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31comments: fix misuses of "nor"Libravatar Justin Lebar1-1/+1
Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18Merge branch 'dd/use-alloc-grow'Libravatar Junio C Hamano1-6/+1
Replace open-coded reallocation with ALLOC_GROW() macro. * dd/use-alloc-grow: sha1_file.c: use ALLOC_GROW() in pretend_sha1_file() read-cache.c: use ALLOC_GROW() in add_index_entry() builtin/mktree.c: use ALLOC_GROW() in append_to_tree() attr.c: use ALLOC_GROW() in handle_attr_line() dir.c: use ALLOC_GROW() in create_simplify() reflog-walk.c: use ALLOC_GROW() replace_object.c: use ALLOC_GROW() in register_replace_object() patch-ids.c: use ALLOC_GROW() in add_commit() diffcore-rename.c: use ALLOC_GROW() diff.c: use ALLOC_GROW() commit.c: use ALLOC_GROW() in register_commit_graft() cache-tree.c: use ALLOC_GROW() in find_subtree() bundle.c: use ALLOC_GROW() in add_to_ref_list() builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
2014-03-18Merge branch 'nd/sha1-file-delta-stack-leakage-fix'Libravatar Junio C Hamano1-0/+4
Fix a small leak in the delta stack used when resolving a long delta chain at runtime. * nd/sha1-file-delta-stack-leakage-fix: sha1_file: fix delta_stack memory leak in unpack_entry
2014-03-14Merge branch 'mh/object-code-cleanup'Libravatar Junio C Hamano1-30/+36
* mh/object-code-cleanup: sha1_file.c: document a bunch of functions defined in the file sha1_file_name(): declare to return a const string find_pack_entry(): document last_found_pack replace_object: use struct members instead of an array
2014-03-03sha1_file.c: use ALLOC_GROW() in pretend_sha1_file()Libravatar Dmitry S. Dolzhenko1-6/+1
Helped-by: He Sun <sunheehnus@gmail.com> Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-27Merge branch 'jk/pack-bitmap'Libravatar Junio C Hamano1-4/+2
Borrow the bitmap index into packfiles from JGit to speed up enumeration of objects involved in a commit range without having to fully traverse the history. * jk/pack-bitmap: (26 commits) ewah: unconditionally ntohll ewah data ewah: support platforms that require aligned reads read-cache: use get_be32 instead of hand-rolled ntoh_l block-sha1: factor out get_be and put_be wrappers do not discard revindex when re-preparing packfiles pack-bitmap: implement optional name_hash cache t/perf: add tests for pack bitmaps t: add basic bitmap functionality tests count-objects: recognize .bitmap in garbage-checking repack: consider bitmaps when performing repacks repack: handle optional files created by pack-objects repack: turn exts array into array-of-struct repack: stop using magic number for ARRAY_SIZE(exts) pack-objects: implement bitmap writing rev-list: add bitmap mode to speed up object lists pack-objects: use bitmaps when packing objects pack-objects: split add_object_entry pack-bitmap: add support for bitmap indexes documentation: add documentation for the bitmap format ewah: compressed bitmap implementation ...
2014-02-24sha1_file.c: document a bunch of functions defined in the fileLibravatar Michael Haggerty1-11/+15
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24sha1_file_name(): declare to return a const stringLibravatar Michael Haggerty1-15/+9
Change the return value of sha1_file_name() to (const char *). (Callers have no business mucking about here.) Change callers accordingly, deleting a few superfluous temporary variables along the way. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24find_pack_entry(): document last_found_packLibravatar Michael Haggerty1-4/+12
Add a comment at the declaration of last_found_pack and where it is used in find_pack_entry(). In the latter, separate the cases (1) to make a place for the new comment and (2) to turn the success case into affirmative logic. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Reviewed-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24sha1_file: fix delta_stack memory leak in unpack_entryLibravatar Nguyễn Thái Ngọc Duy1-0/+4
This delta_stack array can grow to any length depending on the actual delta chain, but we forget to free it. Normally it does not matter because we use small_delta_stack[] from stack and small_delta_stack can hold 64-delta chains, more than standard --depth=50 in pack-objects. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-27Merge branch 'ss/safe-create-leading-dir-with-slash'Libravatar Junio C Hamano1-4/+8
"git clone $origin foo\bar\baz" on Windows failed to create the leading directories (i.e. a moral-equivalent of "mkdir -p"). * ss/safe-create-leading-dir-with-slash: safe_create_leading_directories(): on Windows, \ can separate path components
2014-01-27Merge branch 'mh/safe-create-leading-directories'Libravatar Junio C Hamano1-29/+38
Code clean-up and protection against concurrent write access to the ref namespace. * mh/safe-create-leading-directories: rename_tmp_log(): on SCLD_VANISHED, retry rename_tmp_log(): limit the number of remote_empty_directories() attempts rename_tmp_log(): handle a possible mkdir/rmdir race rename_ref(): extract function rename_tmp_log() remove_dir_recurse(): handle disappearing files and directories remove_dir_recurse(): tighten condition for removing unreadable dir lock_ref_sha1_basic(): if locking fails with ENOENT, retry lock_ref_sha1_basic(): on SCLD_VANISHED, retry safe_create_leading_directories(): add new error value SCLD_VANISHED cmd_init_db(): when creating directories, handle errors conservatively safe_create_leading_directories(): introduce enum for return values safe_create_leading_directories(): always restore slash at end of loop safe_create_leading_directories(): split on first of multiple slashes safe_create_leading_directories(): rename local variable safe_create_leading_directories(): add explicit "slash" pointer safe_create_leading_directories(): reduce scope of local variable safe_create_leading_directories(): fix format of "if" chaining
2014-01-22safe_create_leading_directories(): on Windows, \ can separate path componentsLibravatar Michael Haggerty1-4/+8
When cloning to a directory "C:\foo\bar" from Windows' cmd.exe where "foo" does not exist yet, Git would throw an error like fatal: could not create work tree dir 'c:\foo\bar'.: No such file or directory Fix this by not hard-coding a platform specific directory separator into safe_create_leading_directories(). This patch, including its entire commit message, is derived from a patch by Sebastian Schuberth. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-16do not discard revindex when re-preparing packfilesLibravatar Jeff King1-1/+0
When an object lookup fails, we re-read the objects/pack directory to pick up any new packfiles that may have been created since our last read. We also discard any pack revindex structs we've allocated. The discarding is a problem for the pack-bitmap code, which keeps a pointer to the revindex for the bitmapped pack. After the discard, the pointer is invalid, and we may read free()d memory. Other revindex users do not keep a bare pointer to the revindex; instead, they always access it through revindex_for_pack(), which lazily builds the revindex. So one solution is to teach the pack-bitmap code a similar trick. It would be slightly less efficient, but probably not all that noticeable. However, it turns out this discarding is not actually necessary. When we call reprepare_packed_git, we do not throw away our old pack list. We keep the existing entries, and only add in new ones. So there is no safety problem; we will still have the pack struct that matches each revindex. The packfile itself may go away, of course, but we are already prepared to handle that, and it may happen outside of reprepare_packed_git anyway. Throwing away the revindex may save some RAM if the pack never gets reused (about 12 bytes per object). But it also wastes some CPU time (to regenerate the index) if the pack does get reused. It's hard to say which is more valuable, but in either case, it happens very rarely (only when we race with a simultaneous repack). Just leaving the revindex in place is simple and safe both for current and future code. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-10Merge branch 'jk/oi-delta-base'Libravatar Junio C Hamano1-0/+53
Teach "cat-file --batch" to show delta-base object name for a packed object that is represented as a delta. * jk/oi-delta-base: cat-file: provide %(deltabase) batch format sha1_object_info_extended: provide delta base sha1s
2014-01-10Merge branch 'jh/rlimit-nofile-fallback'Libravatar Junio C Hamano1-7/+30
When we figure out how many file descriptors to allocate for keeping packfiles open, a system with non-working getrlimit() could cause us to die(), but because we make this call only to get a rough estimate of how many is available and we do not even attempt to use up all file descriptors available ourselves, it is nicer to fall back to a reasonable low value rather than dying. * jh/rlimit-nofile-fallback: get_max_fd_limit(): fall back to OPEN_MAX upon getrlimit/sysconf failure
2014-01-10Merge branch 'cc/replace-object-info'Libravatar Junio C Hamano1-10/+10
read_sha1_file() that is the workhorse to read the contents given an object name honoured object replacements, but there is no corresponding mechanism to sha1_object_info() that is used to obtain the metainfo (e.g. type & size) about the object, leading callers to weird inconsistencies. * cc/replace-object-info: replace info: rename 'full' to 'long' and clarify in-code symbols Documentation/git-replace: describe --format option builtin/replace: unset read_replace_refs t6050: add tests for listing with --format builtin/replace: teach listing using short, medium or full formats sha1_file: perform object replacement in sha1_object_info_extended() t6050: show that git cat-file --batch fails with replace objects sha1_object_info_extended(): add an "unsigned flags" parameter sha1_file.c: add lookup_replace_object_extended() to pass flags replace_object: don't check read_replace_refs twice rename READ_SHA1_FILE_REPLACE flag to LOOKUP_REPLACE_OBJECT
2014-01-06safe_create_leading_directories(): add new error value SCLD_VANISHEDLibravatar Michael Haggerty1-0/+11
Add a new possible error result that can be returned by safe_create_leading_directories() and safe_create_leading_directories_const(): SCLD_VANISHED. This value indicates that a file or directory on the path existed at one point (either it already existed or the function created it), but then it disappeared. This probably indicates that another process deleted the directory while we were working. If SCLD_VANISHED is returned, the caller might want to retry the function call, as there is a chance that a new attempt will succeed. Why doesn't safe_create_leading_directories() do the retrying internally? Because an empty directory isn't really ever safe until it holds a file. So even if safe_create_leading_directories() were absolutely sure that the directory existed before it returned, there would be no guarantee that the directory still existed when the caller tried to write something in it. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-06safe_create_leading_directories(): introduce enum for return valuesLibravatar Michael Haggerty1-8/+8
Instead of returning magic integer values (which a couple of callers go to the trouble of distinguishing), return values from an enum. Add a docstring. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>