summaryrefslogtreecommitdiff
path: root/dir.c
AgeCommit message (Collapse)AuthorFilesLines
2015-03-12untracked cache: guard and disable on system changesLibravatar Nguyễn Thái Ngọc Duy1-1/+54
If the user enables untracked cache, then - move worktree to an unsupported filesystem - or simply upgrade OS - or move the whole (portable) disk from one machine to another - or access a shared fs from another machine there's no guarantee that untracked cache can still function properly. Record the worktree location and OS footprint in the cache. If it changes, err on the safe side and disable the cache. The user can 'update-index --untracked-cache' again to make sure all conditions are met. This adds a new requirement that setup_git_directory* must be called before read_cache() because we need worktree location by then, or the cache is dropped. This change does not cover all bases, you can fool it if you try hard. The point is to stop accidents. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: brian m. carlson <sandals@crustytoothpaste.net> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHELibravatar Nguyễn Thái Ngọc Duy1-1/+1
This can be used to double check if results with untracked cache are correctly, compared to vanilla version. Untracked cache remains in index, but not used. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: mark index dirty if untracked cache is updatedLibravatar Nguyễn Thái Ngọc Duy1-0/+9
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATSLibravatar Nguyễn Thái Ngọc Duy1-0/+12
This could be used to verify correct behavior in tests Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: avoid racy timestampsLibravatar Nguyễn Thái Ngọc Duy1-2/+2
When a directory is updated within the same second that its timestamp is last saved, we cannot realize the directory has been updated by checking timestamps. Assume the worst (something is update). See 29e4d36 (Racy GIT - 2005-12-20) for more information. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: invalidate at index addition or removalLibravatar Nguyễn Thái Ngọc Duy1-0/+31
Ideally we should implement untracked_cache_remove_from_index() and untracked_cache_add_to_index() so that they update untracked cache right away instead of invalidating it and wait for read_directory() next time to deal with it. But that may need some more work in unpack-trees.c. So stay simple as the first step. The new call in add_index_entry_with_check() may look strange because new calls usually stay close to cache_tree_invalidate_path(). We do it a bit later than c_t_i_p() in this function because if it's about replacing the entry with the same name, we don't care (but cache-tree does). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: load from UNTR index extensionLibravatar Nguyễn Thái Ngọc Duy1-0/+219
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: save to an index extensionLibravatar Nguyễn Thái Ngọc Duy1-0/+139
Helped-by: Stefan Beller <sbeller@google.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: don't open non-existent .gitignoreLibravatar Nguyễn Thái Ngọc Duy1-1/+25
This cuts down a signficant number of open(.gitignore) because most directories usually don't have .gitignore files. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: mark what dirs should be recursed/savedLibravatar Nguyễn Thái Ngọc Duy1-1/+13
If we redo this thing in a functional style, we would have one struct untracked_dir as input tree and another as output. The input is used for verification. The output is a brand new tree, reflecting current worktree. But that means recreate a lot of dir nodes even if a lot could be shared between input and output trees in good cases. So we go with the messy but efficient way, combining both input and output trees into one. We need a way to know which node in this combined tree belongs to the output. This is the purpose of this "recurse" flag. "valid" bit can't be used for this because it's about data of the node except the subdirs. When we invalidate a directory, we want to keep cached data of the subdirs intact even though we don't really know what subdir still exists (yet). Then we check worktree to see what actual subdir remains on disk. Those will have 'recurse' bit set again. If cached data for those are still valid, we may be able to avoid computing exclude files for them. Those subdirs that are deleted will have 'recurse' remained clear and their 'valid' bits do not matter. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: record/validate dir mtime and reuse cached outputLibravatar Nguyễn Thái Ngọc Duy1-2/+119
The main readdir loop in read_directory_recursive() is replaced with a new one that checks if cached results of a directory is still valid. If a file is added or removed from the index, the containing directory is invalidated (but not its subdirs). If directory's mtime is changed, the same happens. If a .gitignore is updated, the containing directory and all subdirs are invalidated recursively. If dir_struct#flags or other conditions change, the cache is ignored. If a directory is invalidated, we opendir/readdir/closedir and run the exclude machinery on that directory listing as usual. If untracked cache is also enabled, we'll update the cache along the way. If a directory is validated, we simply pull the untracked listing out from the cache. The cache also records the list of direct subdirs that we have to recurse in. Fully excluded directories are seen as "untracked files". In the best case when no dirs are invalidated, read_directory() becomes a series of stat(dir), open(.gitignore), fstat(), read(), close() and optionally hash_sha1_file() For comparison, standard read_directory() is a sequence of opendir(), readdir(), open(.gitignore), fstat(), read(), close(), the expensive last_exclude_matching() and closedir(). We already try not to open(.gitignore) if we know it does not exist, so open/fstat/read/close sequence does not apply to every directory. The sequence could be reduced further, as noted in prep_exclude() in another patch. So in theory, the entire best-case read_directory sequence could be reduced to a series of stat() and nothing else. This is not a silver bullet approach. When you compile a C file, for example, the old .o file is removed and a new one with the same name created, effectively invalidating the containing directory's cache (but not its subdirectories). If your build process touches every directory, this cache adds extra overhead for nothing, so it's a good idea to separate generated files from tracked files.. Editors may use the same strategy for saving files. And of course you're out of luck running your repo on an unsupported filesystem and/or operating system. Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: make a wrapper around {open,read,close}dir()Libravatar Nguyễn Thái Ngọc Duy1-8/+47
This allows us to feed different info to read_directory_recursive() based on untracked cache in the next patch. Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: invalidate dirs recursively if .gitignore changesLibravatar Nguyễn Thái Ngọc Duy1-1/+17
It's easy to see that if an existing .gitignore changes, its SHA-1 would be different and invalidate_gitignore() is called. If .gitignore is removed, add_excludes() will treat it like an empty .gitignore, which again should invalidate the cached directory data. if .gitignore is added, lookup_untracked() already fills initial .gitignore SHA-1 as "empty file", so again invalidate_gitignore() is called. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: initial untracked cache validationLibravatar Nguyễn Thái Ngọc Duy1-3/+110
Make sure the starting conditions and all global exclude files are good to go. If not, either disable untracked cache completely, or wipe out the cache and start fresh. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12untracked cache: record .gitignore information and dir hierarchyLibravatar Nguyễn Thái Ngọc Duy1-19/+123
The idea is if we can capture all input and (non-rescursive) output of read_directory_recursive(), and can verify later that all the input is the same, then the second r_d_r() should produce the same output as in the first run. The requirement for this to work is stat info of a directory MUST change if an entry is added to or removed from that directory (and should not change often otherwise). If your OS and filesystem do not meet this requirement, untracked cache is not for you. Most file systems on *nix should be fine. On Windows, NTFS is fine while FAT may not be [1] even though FAT on Linux seems to be fine. The list of input of r_d_r() is in the big comment block in dir.h. In short, the output of a directory (not counting subdirs) mainly depends on stat info of the directory in question, all .gitignore leading to it and the check_only flag when r_d_r() is called recursively. This patch records all this info (and the output) as r_d_r() runs. Two hash_sha1_file() are required for $GIT_DIR/info/exclude and core.excludesfile unless their stat data matches. hash_sha1_file() is only needed when .gitignore files in the worktree are modified, otherwise their SHA-1 in index is used (see the previous patch). We could store stat data for .gitignore files so we don't have to rehash them if their content is different from index, but I think .gitignore files are rarely modified, so not worth extra cache data (and hashing penalty read-cache.c:verify_hdr(), as we will be storing this as an index extension). The implication is, if you change .gitignore, you better add it to the index soon or you lose all the benefit of untracked cache because a modified .gitignore invalidates all subdirs recursively. This is especially bad for .gitignore at root. This cached output is about untracked files only, not ignored files because the number of tracked files is usually small, so small cache overhead, while the number of ignored files could go really high (e.g. *.o files mixing with source code). [1] "Description of NTFS date and time stamps for files and folders" http://support.microsoft.com/kb/299648 Helped-by: Torsten Bögershausen <tboegi@web.de> Helped-by: David Turner <dturner@twopensource.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-12dir.c: optionally compute sha-1 of a .gitignore fileLibravatar Nguyễn Thái Ngọc Duy1-7/+47
This is not used anywhere yet. But the goal is to compare quickly if a .gitignore file has changed when we have the SHA-1 of both old (cached somewhere) and new (from index or a tree) versions. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-24Merge branch 'nd/dir-prep-exclude-cleanup'Libravatar Junio C Hamano1-1/+1
Code clean-up. * nd/dir-prep-exclude-cleanup: dir.c: remove the second declaration of "stk" in prep_exclude()
2014-10-21dir.c: remove the second declaration of "stk" in prep_exclude()Libravatar Nguyễn Thái Ngọc Duy1-1/+1
This "stk" shadows the first declaration at the top. There's currently no bad effect. But let's avoid it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-02Merge branch 'rs/strbuf-getcwd'Libravatar Junio C Hamano1-4/+8
Reduce the use of fixed sized buffer passed to getcwd() calls by introducing xgetcwd() helper. * rs/strbuf-getcwd: use strbuf_add_absolute_path() to add absolute paths abspath: convert absolute_path() to strbuf use xgetcwd() to set $GIT_DIR use xgetcwd() to get the current directory or die wrapper: add xgetcwd() abspath: convert real_path_internal() to strbuf abspath: use strbuf_getcwd() to remember original working directory setup: convert setup_git_directory_gently_1 et al. to strbuf unix-sockets: use strbuf_getcwd() strbuf: add strbuf_getcwd()
2014-08-26use xgetcwd() to get the current directory or dieLibravatar René Scharfe1-4/+8
Convert several calls of getcwd() and die() to use xgetcwd() instead. This way we get rid of fixed-size buffers (which can be too small depending on the used file system) and gain consistent error messages. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-14prep_exclude: remove the artificial PATH_MAX limitLibravatar Nguyễn Thái Ngọc Duy1-19/+28
This fixes a segfault in git-status with long paths on Windows, where PATH_MAX is only 260. This also fixes the problem of silently ignoring .gitignore if the full path exceeds PATH_MAX. Now add_excludes_from_file() will report if it gets ENAMETOOLONG. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-14dir.c: coding style fixLibravatar Nguyễn Thái Ngọc Duy1-6/+6
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Karsten Blees <blees@dcon.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20cleanup duplicate name_compare() functionsLibravatar Jeremiah Mahler1-2/+1
We often represent our strings as a counted string, i.e. a pair of the pointer to the beginning of the string and its length, and the string may not be NUL terminated to that length. To compare a pair of such counted strings, unpack-trees.c and read-cache.c implement their own name_compare() functions identically. In addition, the cache_name_compare() function in read-cache.c is nearly identical. The only difference is when one string is the prefix of the other string, in which case name_compare() returns -1/+1 to show which one is longer, and cache_name_compare() returns the difference of the lengths to show the same information. Unify these three functions by using the implementation from cache_name_compare(). This does not make any difference to the existing and future callers, as they must be paying attention only to the sign of the returned value (and not the magnitude) because the original implementations of these two functions return values returned by memcmp(3) when the one string is not a prefix of the other string, and the only thing memcmp(3) guarantees its callers is the sign of the returned value, not the magnitude. Signed-off-by: Jeremiah Mahler <jmmahler@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-02dir.c:trim_trailing_spaces(): fix for " \ " sequenceLibravatar Pasha Bolokhov1-15/+19
Discard the unnecessary 'nr_spaces' variable, remove 'strlen()' and improve the 'if' structure. Switch to pointers instead of integers to control the loop. Slightly more rare occurrences of 'text \ ' with a backslash in between spaces are handled correctly. Namely, the code in 7e2e4b37 (dir: ignore trailing spaces in exclude patterns, 2014-02-09) does not reset 'last_space' when a backslash is encountered and the above line stays intact as a result. Add a test at the end of t/t0008-ignores.sh to exhibit this behavior. Signed-off-by: Pasha Bolokhov <pasha.bolokhov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-03Merge branch 'cb/aix'Libravatar Junio C Hamano1-3/+3
* cb/aix: tests: don't rely on strerror text when testing rmdir failure dir.c: make git_fnmatch() not inline
2014-03-31dir.c: make git_fnmatch() not inlineLibravatar Charles Bailey1-3/+3
Now that it calls a static inline function, it cannot be an inline definition with external linkage. Remove inline and make it an external definition. Signed-off-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18Merge branch 'dd/use-alloc-grow'Libravatar Junio C Hamano1-4/+1
Replace open-coded reallocation with ALLOC_GROW() macro. * dd/use-alloc-grow: sha1_file.c: use ALLOC_GROW() in pretend_sha1_file() read-cache.c: use ALLOC_GROW() in add_index_entry() builtin/mktree.c: use ALLOC_GROW() in append_to_tree() attr.c: use ALLOC_GROW() in handle_attr_line() dir.c: use ALLOC_GROW() in create_simplify() reflog-walk.c: use ALLOC_GROW() replace_object.c: use ALLOC_GROW() in register_replace_object() patch-ids.c: use ALLOC_GROW() in add_commit() diffcore-rename.c: use ALLOC_GROW() diff.c: use ALLOC_GROW() commit.c: use ALLOC_GROW() in register_commit_graft() cache-tree.c: use ALLOC_GROW() in find_subtree() bundle.c: use ALLOC_GROW() in add_to_ref_list() builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
2014-03-14Merge branch 'nd/no-more-fnmatch'Libravatar Junio C Hamano1-4/+7
We started using wildmatch() in place of fnmatch(3); complete the process and stop using fnmatch(3). * nd/no-more-fnmatch: actually remove compat fnmatch source code stop using fnmatch (either native or compat) Revert "test-wildmatch: add "perf" command to compare wildmatch and fnmatch" use wildmatch() directly without fnmatch() wrapper
2014-03-14Merge branch 'nd/gitignore-trailing-whitespace'Libravatar Junio C Hamano1-0/+20
Trailing whitespaces in .gitignore files, unless they are quoted for fnmatch(3), e.g. "path\ ", are warned and ignored. Strictly speaking, this is a backward incompatible change, but very unlikely to bite any sane user and adjusting should be obvious and easy. * nd/gitignore-trailing-whitespace: t0008: skip trailing space test on Windows dir: ignore trailing spaces in exclude patterns dir: warn about trailing spaces in exclude patterns
2014-03-03dir.c: use ALLOC_GROW() in create_simplify()Libravatar Dmitry S. Dolzhenko1-4/+1
Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24pathspec: pass directory indicator to match_pathspec_item()Libravatar Nguyễn Thái Ngọc Duy1-2/+2
This patch activates the DO_MATCH_DIRECTORY code in m_p_i(), which makes "git diff HEAD submodule/" and "git diff HEAD submodule" produce the same output. Previously only the version without trailing slash returns the difference (if any). That's the effect of new ce_path_match(). dir_path_match() is not executed by the new tests. And it should not introduce regressions. Previously if path "dir/" is passed in with pathspec "dir/", they obviously match. With new dir_path_match(), the path becomes _directory_ "dir" vs pathspec "dir/", which is not executed by the old code path in m_p_i(). The new code path is executed and produces the same result. The other case is pathspec "dir" and path "dir/" is now turned to "dir" (with DO_MATCH_DIRECTORY). Still the same result before or after the patch. So why change? Because of the next patch about clean.c. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24match_pathspec: match pathspec "foo/" against directory "foo"Libravatar Nguyễn Thái Ngọc Duy1-1/+6
Currently we do support matching pathspec "foo/" against directory "foo". That is because match_pathspec() has no way to tell "foo" is a directory and matching "foo/" against _file_ "foo" is wrong. The callers can now tell match_pathspec if "foo" is a directory, we could make an exception for this case. Code is not executed though because no callers pass the flag yet. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24dir.c: prepare match_pathspec_item for taking more flagsLibravatar Nguyễn Thái Ngọc Duy1-6/+13
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24pathspec: rename match_pathspec_depth() to match_pathspec()Libravatar Nguyễn Thái Ngọc Duy1-10/+10
A long time ago, for some reason I was not happy with match_pathspec(). I created a better version, match_pathspec_depth() that was suppose to replace match_pathspec() eventually. match_pathspec() has finally been gone since 6 months ago. Use the shorter name for match_pathspec_depth(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-20use wildmatch() directly without fnmatch() wrapperLibravatar Nguyễn Thái Ngọc Duy1-4/+7
Make it clear that we don't use fnmatch() anymore. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10dir: ignore trailing spaces in exclude patternsLibravatar Nguyễn Thái Ngọc Duy1-9/+12
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-10dir: warn about trailing spaces in exclude patternsLibravatar Nguyễn Thái Ngọc Duy1-0/+17
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-27Merge branch 'mh/safe-create-leading-directories'Libravatar Junio C Hamano1-7/+20
Code clean-up and protection against concurrent write access to the ref namespace. * mh/safe-create-leading-directories: rename_tmp_log(): on SCLD_VANISHED, retry rename_tmp_log(): limit the number of remote_empty_directories() attempts rename_tmp_log(): handle a possible mkdir/rmdir race rename_ref(): extract function rename_tmp_log() remove_dir_recurse(): handle disappearing files and directories remove_dir_recurse(): tighten condition for removing unreadable dir lock_ref_sha1_basic(): if locking fails with ENOENT, retry lock_ref_sha1_basic(): on SCLD_VANISHED, retry safe_create_leading_directories(): add new error value SCLD_VANISHED cmd_init_db(): when creating directories, handle errors conservatively safe_create_leading_directories(): introduce enum for return values safe_create_leading_directories(): always restore slash at end of loop safe_create_leading_directories(): split on first of multiple slashes safe_create_leading_directories(): rename local variable safe_create_leading_directories(): add explicit "slash" pointer safe_create_leading_directories(): reduce scope of local variable safe_create_leading_directories(): fix format of "if" chaining
2014-01-21remove_dir_recurse(): handle disappearing files and directoriesLibravatar Michael Haggerty1-6/+16
If a file or directory that we are trying to remove disappears (e.g., because another process has pruned it), do not consider it an error. However, if REMOVE_DIR_KEEP_TOPLEVEL is set, and the toplevel directory is missing, then consider it an error (like before). Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-21remove_dir_recurse(): tighten condition for removing unreadable dirLibravatar Michael Haggerty1-2/+5
If opendir() fails on the top-level directory, it makes sense to try to delete it anyway--but only if the failure was due to EACCES. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06Support pathspec magic :(exclude) and its short form :!Libravatar Nguyễn Thái Ngọc Duy1-6/+41
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17dir: revert work-around for retired dangerous behaviorLibravatar Eric Sunshine1-15/+3
directory_exists_in_index_icase() dangerously assumed that it could access one character beyond the end of its directory argument, and that that character would unconditionally be '/'. 2eac2a4c (ls-files -k: a directory only can be killed if the index has a non-directory, 2013-08-15) added a caller which did not respect this undocumented assumption, and 680be044 (dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage, 2013-08-23) added a work-around which temporarily appends a '/' before invoking directory_exists_in_index_icase(). Since the dangerous behavior of directory_exists_in_index_icase() has been eliminated, the work-around is now redundant, so retire it (but not the tests added by the same commit). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17name-hash: stop storing trailing '/' on paths in index_state.dir_hashLibravatar Eric Sunshine1-1/+1
When 5102c617 (Add case insensitivity support for directories when using git status, 2010-10-03) added directories to the name-hash there was only a single hash table in which both real cache entries and leading directory prefixes were registered. To distinguish between the two types of entries, directories were stored with a trailing '/'. 2092678c (name-hash.c: fix endless loop with core.ignorecase=true, 2013-02-28), however, moved directories to a separate hash table (index_state.dir_hash) but retained the (now) redundant trailing '/', thus callers continue to bear the burden of ensuring the slash's presence before searching the index for a directory. Eliminate this redundancy by storing paths in the dir-hash without the trailing '/'. An important benefit of this change is that it eliminates undocumented and dangerous behavior of dir.c:directory_exists_in_index_icase() in which it assumes not only that it can validly access one character beyond the end of its incoming directory argument, but also that that character will unconditionally be a '/'. This perilous behavior was "tolerated" because the string passed in by its lone caller always had a '/' in that position, however, things broke [1] when 2eac2a4c (ls-files -k: a directory only can be killed if the index has a non-directory, 2013-08-15) added a new caller which failed to respect the undocumented assumption. [1]: http://thread.gmane.org/gmane.comp.version-control.git/232727 Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-17employ new explicit "exists in index?" APILibravatar Eric Sunshine1-5/+5
Each caller of index_name_exists() knows whether it is looking for a directory or a file, and can avoid the unnecessary indirection of index_name_exists() by instead calling index_dir_exists() or index_file_exists() directly. Invoking the appropriate search function explicitly will allow a subsequent patch to relieve callers of the artificial burden of having to add a trailing '/' to the pathname given to index_dir_exists(). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-11Merge branch 'jc/ls-files-killed-optim'Libravatar Junio C Hamano1-8/+44
"git ls-files -k" needs to crawl only the part of the working tree that may overlap the paths in the index to find killed files, but shared code with the logic to find all the untracked files, which made it unnecessarily inefficient. * jc/ls-files-killed-optim: dir.c::test_one_path(): work around directory_exists_in_index_icase() breakage t3010: update to demonstrate "ls-files -k" optimization pitfalls ls-files -k: a directory only can be killed if the index has a non-directory dir.c: use the cache_* macro to access the current index
2013-09-09Merge branch 'jl/submodule-mv'Libravatar Junio C Hamano1-208/+111
"git mv A B" when moving a submodule A does "the right thing", inclusing relocating its working tree and adjusting the paths in the .gitmodules file. * jl/submodule-mv: (53 commits) rm: delete .gitmodules entry of submodules removed from the work tree mv: update the path entry in .gitmodules for moved submodules submodule.c: add .gitmodules staging helper functions mv: move submodules using a gitfile mv: move submodules together with their work trees rm: do not set a variable twice without intermediate reading. t6131 - skip tests if on case-insensitive file system parse_pathspec: accept :(icase)path syntax pathspec: support :(glob) syntax pathspec: make --literal-pathspecs disable pathspec magic pathspec: support :(literal) syntax for noglob pathspec kill limit_pathspec_to_literal() as it's only used by parse_pathspec() parse_pathspec: preserve prefix length via PATHSPEC_PREFIX_ORIGIN parse_pathspec: make sure the prefix part is wildcard-free rename field "raw" to "_raw" in struct pathspec tree-diff: remove the use of pathspec's raw[] in follow-rename codepath remove match_pathspec() in favor of match_pathspec_depth() remove init_pathspec() in favor of parse_pathspec() remove diff_tree_{setup,release}_paths convert common_prefix() to use struct pathspec ...
2013-08-23dir.c::test_one_path(): work around directory_exists_in_index_icase() breakageLibravatar Eric Sunshine1-3/+15
directory_exists_in_index() takes pathname and its length, but its helper function directory_exists_in_index_icase() reads one byte beyond the end of the pathname and expects there to be a '/'. This needs to be fixed, as that one-byte-beyond-the-end location may not even be readable, possibly by not registering directories to name hashes with trailing slashes. In the meantime, update the new caller added recently to treat_one_path() to make sure that the path buffer it gives the function is one byte longer than the path it is asking the function about by appending a slash to it. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-15ls-files -k: a directory only can be killed if the index has a non-directoryLibravatar Junio C Hamano1-2/+27
"ls-files -o" and "ls-files -k" both traverse the working tree down to find either all untracked paths or those that will be "killed" (removed from the working tree to make room) when the paths recorded in the index are checked out. It is necessary to traverse the working tree fully when enumerating all the "other" paths, but when we are only interested in "killed" paths, we can take advantage of the fact that paths that do not overlap with entries in the index can never be killed. The treat_one_path() helper function, which is called during the recursive traversal, is the ideal place to implement an optimization. When we are looking at a directory P in the working tree, there are three cases: (1) P exists in the index. Everything inside the directory P in the working tree needs to go when P is checked out from the index. (2) P does not exist in the index, but there is P/Q in the index. We know P will stay a directory when we check out the contents of the index, but we do not know yet if there is a directory P/Q in the working tree to be killed, so we need to recurse. (3) P does not exist in the index, and there is no P/Q in the index to require P to be a directory, either. Only in this case, we know that everything inside P will not be killed without recursing. Note that this helper is called by treat_leading_path() that decides if we need to traverse only subdirectories of a single common leading directory, which is essential for this optimization to be correct. This caller checks each level of the leading path component from shallower directory to deeper ones, and that is what allows us to only check if the path appears in the index. If the call to treat_one_path() weren't there, given a path P/Q/R, the real traversal may start from directory P/Q/R, even when the index records P as a regular file, and we would end up having to check if any leading subpath in P/Q/R, e.g. P, appears in the index. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-15dir.c: use the cache_* macro to access the current indexLibravatar Junio C Hamano1-6/+5
These codepaths always start from the_index and use index_* functions, but there is no reason to do so. Use the compatibility cache_* macro to access the current in-core index like everybody else. While at it, fix typo in the comment for a function to check if a path within a directory appears in the index. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22Merge branch 'nd/const-struct-cache-entry'Libravatar Junio C Hamano1-3/+3
* nd/const-struct-cache-entry: Convert "struct cache_entry *" to "const ..." wherever possible