path: root/read-cache.c
Age | Commit message | Author | Files | Lines
2011-12-09 | Merge branch 'rs/allocate-cache-entry-individually' | Junio C Hamano | 1 | -51/+32
* rs/allocate-cache-entry-individually:
  cache.h: put single NUL at end of struct cache_entry
  read-cache.c: allocate index entries individually

Conflicts:
  read-cache.c
2011-11-18 | refresh_index: make porcelain output more specific | Jeff King | 1 | -2/+21
If you have a deleted file and a porcelain refreshes the cache, we print:

  Unstaged changes after reset:
  M file

This is technically correct, in that the file is modified, but it's friendlier to the user if we further differentiate the case of a deleted file (especially because this output looks a lot like "diff --name-status", which would also make the distinction).

Similarly, we can distinguish typechanges ("T") and intent-to-add files ("A"), both of which appear as just "M" in the current output.

The plumbing output for all cases remains "needs update" for historical compatibility.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
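The mapping described above can be sketched as a small lookup. This is a minimal illustration, not the code from the patch: the `SKETCH_*` flag names and the function name are hypothetical, and the letter "D" for a deleted file is assumed from the "diff --name-status" convention the message refers to.

```c
/* Hypothetical change flags; the real names in read-cache.c may differ. */
enum sketch_change {
	SKETCH_MODIFIED   = 1 << 0,
	SKETCH_DELETED    = 1 << 1,
	SKETCH_TYPECHANGE = 1 << 2,
	SKETCH_INTENT_ADD = 1 << 3
};

/*
 * Pick the porcelain status letter for a stale entry.
 * Plumbing output would stay "needs update" in every one of these cases.
 */
static char sketch_porcelain_letter(unsigned changed)
{
	if (changed & SKETCH_DELETED)
		return 'D';	/* assumed, by analogy with diff --name-status */
	if (changed & SKETCH_TYPECHANGE)
		return 'T';
	if (changed & SKETCH_INTENT_ADD)
		return 'A';
	return 'M';		/* plain modification, the old catch-all */
}
```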
2011-11-18 | refresh_index: rename format variables | Jeff King | 1 | -6/+6
When refreshing the index, for modified (or unmerged) files we will print "needs update" (or "needs merge") for plumbing, or a line similar to the output from "diff --name-status" for porcelain.

The variables holding which type of message to show are named after the plumbing messages. However, as we begin to differentiate more cases at the porcelain level (with the plumbing message staying the same), that naming scheme will become awkward.

Instead, name the variables after which case we found (modified or unmerged), not what we will output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-11-18 | read-cache: let refresh_cache_ent pass up changed flags | Jeff King | 1 | -3/+6
This will enable refresh_cache to differentiate more cases of modification (such as typechange) when telling the user what isn't fresh. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-26 | read-cache.c: allocate index entries individually | René Scharfe | 1 | -50/+31
The code to estimate the in-memory size of the index based on its on-disk representation is subtly wrong for certain architecture-dependent struct layouts. Instead of fixing it, replace the code to keep the index entries in a single large block of memory and allocate each entry separately instead. This is both simpler and more flexible, as individual entries can now be freed. Actually using that added flexibility is left for a later patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>
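A minimal sketch of what "allocate each entry separately" can look like, assuming a struct with a trailing flexible array for the name. The struct and function names here are hypothetical and much simpler than the real struct cache_entry.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified entry; not the real struct cache_entry. */
struct sketch_entry {
	unsigned int mode;
	size_t namelen;
	char name[];	/* variable-length name, NUL-terminated */
};

/*
 * Allocate one entry in its own block, sized for its name.
 * The caller can free() it individually, which a single large
 * block holding all entries would not allow.
 */
static struct sketch_entry *sketch_alloc_entry(const char *name)
{
	size_t len = strlen(name);
	struct sketch_entry *e = calloc(1, sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->namelen = len;
	memcpy(e->name, name, len);	/* trailing NUL comes from calloc */
	return e;
}
```

No size estimate for the whole index is needed with this scheme, which is what lets the error-prone estimation code go away.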
2011-10-26 | read-cache.c: fix index memory allocation | René Scharfe | 1 | -3/+3
estimate_cache_size() tries to guess how much memory is needed for the in-memory representation of an index file. It does that by using the file size, the number of entries and the difference of the sizes of the on-disk and in-memory structs -- without having to check the length of the name of each entry, which varies for each entry, but their sums are the same no matter the representation.

Except there can be a difference. First of all, the size is really calculated by ce_size and ondisk_ce_size based on offsetof(..., name), not sizeof, which can be different. And entries are padded with 1 to 8 NULs at the end (after the variable name) to make their total length a multiple of eight.

So in order to allocate enough memory to hold the index, change the delta calculation to be based on offsetof(..., name) and round up to the next multiple of eight.

On a 32-bit Linux, this delta was used before:

  sizeof(struct cache_entry)        == 72
  sizeof(struct ondisk_cache_entry) == 64
                                       ---
                                         8

The actual difference for an entry with a filename length of one was, however (the definitions are in cache.h):

  offsetof(struct cache_entry, name)        == 72
  offsetof(struct ondisk_cache_entry, name) == 62

  ce_size        == (72 + 1 + 8) & ~7 == 80
  ondisk_ce_size == (62 + 1 + 8) & ~7 == 64
                                         ---
                                          16

So eight bytes less had been allocated for such entries. The new formula yields the correct delta:

  (72 - 62 + 7) & ~7 == 16

Reported-by: John Hsing <tsyj2007@gmail.com>
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
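The arithmetic in this message can be checked with a short sketch. The two helpers below are hypothetical stand-ins for the ce_size/ondisk_ce_size macros and the new delta formula; they take the name offsets as plain numbers (72 and 62, as quoted for 32-bit Linux) rather than using the real structs.

```c
#include <stddef.h>

/*
 * Size of one entry: bytes before the name, plus the name, plus
 * 1 to 8 NULs so that the total is a multiple of eight.
 */
static size_t sketch_entry_size(size_t name_offset, size_t name_len)
{
	return (name_offset + name_len + 8) & ~(size_t)7;
}

/*
 * Per-entry delta between in-memory and on-disk size: the difference
 * of the name offsets, rounded up to the next multiple of eight.
 */
static size_t sketch_entry_delta(size_t mem_name_offset, size_t disk_name_offset)
{
	return (mem_name_offset - disk_name_offset + 7) & ~(size_t)7;
}
```

With offsets 72 and 62 and a one-byte name, the two sizes come out as 80 and 64, and the delta helper returns 16, which matches their difference.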
2011-08-25 | Merge branch 'maint' | Junio C Hamano | 1 | -1/+1
* maint:
  whitespace: have SP on both sides of an assignment "="
  update-ref: whitespace fix
2011-08-25 | whitespace: have SP on both sides of an assignment "=" | Junio C Hamano | 1 | -1/+1
I've deliberately excluded the borrowed code in the compat/nedmalloc directory.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-29 | Merge branch 'ef/maint-win-verify-path' | Junio C Hamano | 1 | -14/+11
* ef/maint-win-verify-path:
  verify_dotfile(): do not assume '/' is the path seperator
  verify_path(): simplify check at the directory boundary
  verify_path: consider dos drive prefix
  real_path: do not assume '/' is the path seperator
  A Windows path starting with a backslash is absolute
2011-06-08 | verify_dotfile(): do not assume '/' is the path seperator | Theo Niessink | 1 | -3/+4
verify_dotfile() currently assumes that the path separator is '/', but on Windows it can also be '\\', so use is_dir_sep() instead.

Signed-off-by: Theo Niessink <theo@taletn.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
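A rough sketch of the idea, not the actual patch. It assumes a Windows-style separator test that accepts both '/' and '\\', and it assumes the dotfile check rejects the components ".", ".." and ".git"; both function names are hypothetical.

```c
/* Sketch of a separator test as a Windows build might define it. */
static int sketch_is_dir_sep(int c)
{
	return c == '/' || c == '\\';
}

/*
 * `rest` points just past a leading '.' of a path component.
 * Returns 0 if the component is ".", ".." or ".git", else 1.
 * A component ends at NUL or at any directory separator.
 */
static int sketch_dotfile_ok(const char *rest)
{
	if (*rest == '\0' || sketch_is_dir_sep(*rest))
		return 0;	/* "." */
	if (rest[0] == '.' && (rest[1] == '\0' || sketch_is_dir_sep(rest[1])))
		return 0;	/* ".." */
	if (rest[0] == 'g' && rest[1] == 'i' && rest[2] == 't' &&
	    (rest[3] == '\0' || sketch_is_dir_sep(rest[3])))
		return 0;	/* ".git" */
	return 1;
}
```

With a hard-coded '/' test, a component such as ".git\\config" would slip through on Windows; routing every boundary check through one separator predicate closes that gap.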
2011-06-07 | verify_path(): simplify check at the directory boundary | Junio C Hamano | 1 | -10/+3
We simply want to say "At a directory boundary, be careful with a name that begins with a dot, forbid a name that ends with the boundary character or has duplicated boundary characters".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 | verify_path: consider dos drive prefix | Erik Faye-Lund | 1 | -1/+4
If someone manages to create a repo with a 'C:' entry in the root-tree, files can be written outside of the working-dir. This opens up a can-of-worms of exploits.

Fix it by explicitly checking for a dos drive prefix when verifying a path. While we're at it, make sure that paths beginning with '\' are considered absolute as well.

Noticed-by: Theo Niessink <theo@taletn.com>
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
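The two checks mentioned here might look roughly like the following. This is an illustrative sketch with hypothetical function names, assuming a drive prefix means "one ASCII letter followed by a colon".

```c
#include <ctype.h>

/* True for paths such as "C:" or "c:\\temp". */
static int sketch_has_dos_drive_prefix(const char *path)
{
	return isalpha((unsigned char)path[0]) && path[1] == ':';
}

/*
 * A path that must not be accepted as an index entry on Windows:
 * it starts with a drive prefix, a slash or a backslash.
 */
static int sketch_escapes_worktree(const char *path)
{
	return sketch_has_dos_drive_prefix(path) ||
	       path[0] == '/' || path[0] == '\\';
}
```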
2011-05-09 | index_fd(): turn write_object and format_check arguments into one flag | Junio C Hamano | 1 | -2/+2
The "format_check" parameter tucked after the existing parameters is too ugly an afterthought to live in any reasonable API. Combine it with the other boolean parameter "write_object" into a single "flags" parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>
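Folding two boolean parameters into one "flags" word usually looks something like this. The bit names and the helper struct below are hypothetical, chosen only to illustrate the pattern; they are not taken from the patch.

```c
/* Hypothetical bit names for the combined parameter. */
#define SKETCH_WRITE_OBJECT 1u
#define SKETCH_FORMAT_CHECK 2u

struct sketch_request {
	int write_object;
	int format_check;
};

/* Decode a combined flags word back into the two former booleans. */
static struct sketch_request sketch_decode_flags(unsigned flags)
{
	struct sketch_request r;

	r.write_object = !!(flags & SKETCH_WRITE_OBJECT);
	r.format_check = !!(flags & SKETCH_FORMAT_CHECK);
	return r;
}
```

A caller then writes `SKETCH_WRITE_OBJECT | SKETCH_FORMAT_CHECK` instead of passing `1, 1`, which reads better at the call site and leaves room for further bits without another signature change.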
2011-04-03 | Merge branch 'jc/index-update-if-able' into maint | Junio C Hamano | 1 | -0/+25
* jc/index-update-if-able:
  update $GIT_INDEX_FILE when there are racily clean entries
  diff/status: refactor opportunistic index update
2011-03-26 | Merge branch 'jc/index-update-if-able' | Junio C Hamano | 1 | -0/+25
* jc/index-update-if-able:
  update $GIT_INDEX_FILE when there are racily clean entries
  diff/status: refactor opportunistic index update
2011-03-21 | update $GIT_INDEX_FILE when there are racily clean entries | Junio C Hamano | 1 | -1/+14
Traditional "opportunistic index update" done by read-only "diff" and "status" was about updating cached lstat(2) information in the index for the next round. We missed another obvious optimization opportunity: when there are racily clean entries that will cease to be racily clean by updating $GIT_INDEX_FILE. Detect that case and write $GIT_INDEX_FILE out to give it a newer timestamp. Noticed by Lasse Makholm by stracing "git status" in a fresh checkout and counting the number of open(2) calls. Signed-off-by: Junio C Hamano <gitster@pobox.com>
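As a sketch of the detection step, assume the usual definition: an entry is racily clean when its recorded mtime is not older than the index file's own timestamp. The helper names and the use of whole seconds are simplifications for illustration.

```c
/* An entry is racily clean if it was modified at or after the index was written. */
static int sketch_is_racily_clean(long index_sec, long entry_mtime_sec)
{
	return index_sec != 0 && index_sec <= entry_mtime_sec;
}

/* Is there any entry that a rewrite of the index file would help? */
static int sketch_has_racy_entries(long index_sec, const long *entry_mtimes, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (sketch_is_racily_clean(index_sec, entry_mtimes[i]))
			return 1;
	return 0;
}
```

Writing the index out again gives it a newer timestamp, so the same entries compare as strictly older on the next run and no longer need their contents re-read.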
2011-03-21 | diff/status: refactor opportunistic index update | Junio C Hamano | 1 | -0/+12
When we had to refresh the index internally before running diff or status, we opportunistically updated the $GIT_INDEX_FILE so that a later invocation of git can use the lstat(2) we already did in this invocation.

Make them share a helper function to do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-27 | Merge branch 'nd/hash-object-sanity' | Junio C Hamano | 1 | -1/+1
* nd/hash-object-sanity:
  Make hash-object more robust against malformed objects

Conflicts:
  cache.h
2011-02-27 | Merge branch 'nd/struct-pathspec' | Junio C Hamano | 1 | -23/+2
* nd/struct-pathspec: (22 commits)
  t6004: add pathspec globbing test for log family
  t7810: overlapping pathspecs and depth limit
  grep: drop pathspec_matches() in favor of tree_entry_interesting()
  grep: use writable strbuf from caller for grep_tree()
  grep: use match_pathspec_depth() for cache/worktree grepping
  grep: convert to use struct pathspec
  Convert ce_path_match() to use match_pathspec_depth()
  Convert ce_path_match() to use struct pathspec
  struct rev_info: convert prune_data to struct pathspec
  pathspec: add match_pathspec_depth()
  tree_entry_interesting(): optimize wildcard matching when base is matched
  tree_entry_interesting(): support wildcard matching
  tree_entry_interesting(): fix depth limit with overlapping pathspecs
  tree_entry_interesting(): support depth limit
  tree_entry_interesting(): refactor into separate smaller functions
  diff-tree: convert base+baselen to writable strbuf
  glossary: define pathspec
  Move tree_entry_interesting() to tree-walk.c and export it
  tree_entry_interesting(): remove dependency on struct diff_options
  Convert struct diff_options to use struct pathspec
  ...
2011-02-22 | update-index --refresh --porcelain: add missing const | Jonathan Nieder | 1 | -2/+2
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-07 | Make hash-object more robust against malformed objects | Nguyễn Thái Ngọc Duy | 1 | -1/+1
Commits, trees and tags have structure. Don't let users feed git with malformed ones. Sooner or later git will die() when encountering them. Note that this patch does not check semantics. A tree that points to non-existent objects is perfectly OK (and should be so, users may choose to add commit first, then its associated tree for example). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 | Convert ce_path_match() to use match_pathspec_depth() | Nguyễn Thái Ngọc Duy | 1 | -23/+1
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-03 | Convert ce_path_match() to use struct pathspec | Nguyễn Thái Ngọc Duy | 1 | -3/+4
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-03 | Merge branch 'jj/icase-directory' | Junio C Hamano | 1 | -0/+23
* jj/icase-directory:
  Support case folding in git fast-import when core.ignorecase=true
  Support case folding for git add when core.ignorecase=true
  Add case insensitivity support when using git ls-files
  Add case insensitivity support for directories when using git status
  Case insensitivity support for .gitignore via core.ignorecase
  Add string comparison functions that respect the ignore_case variable.
  Makefile & configure: add a NO_FNMATCH_CASEFOLD flag
  Makefile & configure: add a NO_FNMATCH flag

Conflicts:
  Makefile
  config.mak.in
  configure.ac
  fast-import.c
2010-10-06 | Support case folding for git add when core.ignorecase=true | Joshua Jensen | 1 | -0/+23
When MyDir/ABC/filea.txt is added to Git, the disk directory MyDir/ABC/ is renamed to mydir/aBc/, and then mydir/aBc/fileb.txt is added, the index will contain MyDir/ABC/filea.txt and mydir/aBc/fileb.txt. Although the earlier portions of this patch series account for those differences in case, this patch makes the pathing consistent by folding the case of newly added files against the first file added with that path. In read-cache.c's add_to_index(), the index_name_exists() support used for git status's case insensitive directory lookups is used to find the proper directory case according to what the user already checked in. That is, MyDir/ABC/'s case is used to alter the stored path for fileb.txt to MyDir/ABC/fileb.txt (instead of mydir/aBc/fileb.txt). This is especially important when cloning a repository to a case sensitive file system. MyDir/ABC/ and mydir/aBc/ exist in the same directory on a Windows machine, but on Linux, the files exist in two separate directories. The update to add_to_index(), in effect, treats a Windows file system as case sensitive by making path case consistent. Signed-off-by: Joshua Jensen <jjensen@workspacewhiz.com> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 | core: Stop leaking ondisk_cache_entrys | Jonathan Nieder | 1 | -1/+4
Noticed with valgrind. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-02 | Correct spelling of 'REUC' extension | Shawn O. Pearce | 1 | -1/+1
The new dircache extension CACHE_EXT_RESOLVE_UNDO, whose value is 0x52455543, is actually the ASCII sequence 'REUC', not the ASCII sequence 'REUN'. Signed-off-by: Shawn O. Pearce <spearce@spearce.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
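The claim is easy to verify by splitting the 32-bit value into its four bytes, most significant first. The helper name below is hypothetical.

```c
#include <stdint.h>

/* Write the four bytes of a 32-bit signature, most significant first, as a C string. */
static void sketch_signature_to_ascii(uint32_t sig, char out[5])
{
	out[0] = (char)((sig >> 24) & 0xff);
	out[1] = (char)((sig >> 16) & 0xff);
	out[2] = (char)((sig >> 8) & 0xff);
	out[3] = (char)(sig & 0xff);
	out[4] = '\0';
}
```

For 0x52455543 the bytes are 0x52 'R', 0x45 'E', 0x55 'U' and 0x43 'C'; an 'N' would have been 0x4E.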
2010-01-24 | Make ce_uptodate() trustworthy again | Junio C Hamano | 1 | -2/+4
The rule has always been that a cache entry that is ce_uptodate(ce) means that we already have checked the work tree entity and we know there is no change in the work tree compared to the index, and nobody should have to double check. Note that false ce_uptodate(ce) does not mean it is known to be dirty---it only means we don't know if it is clean.

There are a few codepaths (refresh-index and preload-index are among them) that mark a cache entry as up-to-date based solely on the return value from ie_match_stat(); this function uses lstat() to see if the work tree entity has been touched, and for a submodule entry, if its HEAD points at the same commit as the commit recorded in the index of the superproject (a submodule that is not even cloned is considered clean).

A submodule is no longer considered unmodified merely because its HEAD matches the index of the superproject these days, in order to prevent people from forgetting to commit in the submodule and updating the superproject index with the new submodule commit, before committing the state in the superproject. However, the patch to do so didn't update the codepath that marks cache entries up-to-date based on the updated definition and instead worked it around by saying "we don't trust the return value of ce_uptodate() for submodules."

This makes ce_uptodate() trustworthy again by not marking submodule entries up-to-date.

The next step _could_ be to introduce a few "in-core" flag bits to cache_entry structure to record "this entry is _known_ to be dirty", call is_submodule_modified() from ie_match_stat(), and use these new bits to avoid running this rather expensive check more than once, but that can be a separate patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-21 | Remove diff machinery dependency from read-cache | Linus Torvalds | 1 | -78/+0
Exal Sibeaz pointed out that some git files are way too big, and that add_files_to_cache() brings in all the diff machinery to any git binary that needs the basic git SHA1 object operations from read-cache.c. Which is pretty much all of them.

It's doubly silly, since add_files_to_cache() is only used by builtin programs (add, checkout and commit), so it's fairly easily fixed by just moving the thing to builtin-add.c, and avoiding the dependency entirely.

I initially argued to Exal that it would probably be best to try to depend on smart compilers and linkers, but after spending some time trying to make -ffunction-sections work and giving up, I think Exal was right, and the fix is to just do some trivial cleanups like this.

This trivial cleanup results in pretty stunning file size differences. The diff machinery really is mostly used by just the builtin programs, and you have things like these trivial before-and-after numbers:

  -rwxr-xr-x 1 torvalds torvalds 1727420 2010-01-21 10:53 git-hash-object
  -rwxrwxr-x 1 torvalds torvalds  940265 2010-01-21 11:16 git-hash-object

Now, I'm not saying that 940kB is good either, but that's mostly all the debug information - you can see the real code with 'size':

     text    data     bss     dec     hex filename
   418675    3920  127408  550003   86473 git-hash-object (before)
   230650    2288  111728  344666   5425a git-hash-object (after)

ie we have a nice 24% size reduction from this trivial cleanup.

It's not just that one file either. I get:

  [torvalds@nehalem git]$ du -s /home/torvalds/libexec/git-core
  45640 /home/torvalds/libexec/git-core (before)
  33508 /home/torvalds/libexec/git-core (after)

so we're talking 12MB of diskspace here.

(Of course, stripping all the binaries brings the 33MB down to 9MB, so the whole debug information thing is still the bulk of it all, but that's a separate issue entirely)

Now, I'm sure there are other things we should do, and changing our compiler flags from -O2 to -Os would bring the text size down by an additional almost 20%, but this thing Exal pointed out seems to be some good low-hanging fruit.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-20 | Merge branch 'jc/cache-unmerge' | Junio C Hamano | 1 | -0/+18
* jc/cache-unmerge:
  rerere forget path: forget recorded resolution
  rerere: refactor rerere logic to make it independent from I/O
  rerere: remove silly 1024-byte line limit
  resolve-undo: teach "update-index --unresolve" to use resolve-undo info
  resolve-undo: "checkout -m path" uses resolve-undo information
  resolve-undo: allow plumbing to clear the information
  resolve-undo: basic tests
  resolve-undo: record resolved conflicts in a new index extension section
  builtin-merge.c: use standard active_cache macros

Conflicts:
  builtin-ls-files.c
  builtin-merge.c
  builtin-rerere.c
2010-01-20 | Merge branch 'jc/symbol-static' | Junio C Hamano | 1 | -2/+4
* jc/symbol-static:
  date.c: mark file-local function static
  Replace parse_blob() with an explanatory comment
  symlinks.c: remove unused functions
  object.c: remove unused functions
  strbuf.c: remove unused function
  sha1_file.c: remove unused function
  mailmap.c: remove unused function
  utf8.c: mark file-local function static
  submodule.c: mark file-local function static
  quote.c: mark file-local function static
  remote-curl.c: mark file-local function static
  read-cache.c: mark file-local functions static
  parse-options.c: mark file-local function static
  entry.c: mark file-local function static
  http.c: mark file-local functions static
  pretty.c: mark file-local function static
  builtin-rev-list.c: mark file-local function static
  bisect.c: mark file-local function static
2010-01-13 | Merge branch 'cc/reset-more' | Junio C Hamano | 1 | -2/+1
* cc/reset-more:
  t7111: check that reset options work as described in the tables
  Documentation: reset: add some missing tables
  Fix bit assignment for CE_CONFLICTED
  "reset --merge": fix unmerged case
  reset: use "unpack_trees()" directly instead of "git read-tree"
  reset: add a few tests for "git reset --merge"
  Documentation: reset: add some tables to describe the different options
  reset: improve mixed reset error message when in a bare repo
2010-01-13 | Merge branch 'nd/sparse' | Junio C Hamano | 1 | -3/+14
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
  .gitignore
  Documentation/config.txt
  Documentation/git-update-index.txt
  Makefile
  entry.c
  t/t7002-grep.sh
2010-01-12 | read-cache.c: mark file-local functions static | Junio C Hamano | 1 | -2/+4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-03 | "reset --merge": fix unmerged case | Junio C Hamano | 1 | -2/+1
Commit 9e8ecea (Add 'merge' mode to 'git reset', 2008-12-01) disallowed "git reset --merge" when there were unmerged entries. But it wished that unmerged entries were reset as if --hard (instead of --merge) had been used. This makes sense because all "mergy" operations make sure that any path involved in the merge does not have local modifications before starting, so resetting such a path away won't lose any information.

The previous commit changed the behavior of --merge to accept resetting unmerged entries if they are reset to a different state than HEAD, but it did not reset the changes in the work tree, leaving the conflict markers in the resulting file in the work tree.

Fix it by doing three things:

 - Update the documentation to match the wish of original "reset --merge" better, namely, "An unmerged entry is a sign that the path didn't have any local modification and can be safely reset to whatever the new HEAD records";

 - Update read_index_unmerged(), which reads the index file into the cache while dropping any higher-stage entries down to stage #0, not to copy the object name from the higher stage entry. The code used to take the object name from a stage entry ("base" if you happened to have stage #1, or "ours" if both sides added, etc.), which essentially meant that you were getting random results depending on what the merge did.

   The _only_ reason we want to keep a previously unmerged entry in the index at stage #0 is so that we don't forget the fact that we have a corresponding file in the work tree in order to be able to remove it when the tree we are resetting to does not have the path. In order to differentiate such an entry from an ordinary cache entry, the cache entry added by read_index_unmerged() is marked as CE_CONFLICTED.

 - Update merged_entry() and deleted_entry() so that they pay attention to cache entries marked as CE_CONFLICTED. They are previously unmerged entries, and the files in the work tree that correspond to them are reset away by oneway_merge() to the version from the tree we are resetting to.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-27 | Merge branch 'nf/maint-fix-index-ext-len-on-be64' into maint | Junio C Hamano | 1 | -1/+1
* nf/maint-fix-index-ext-len-on-be64:
  read_index(): fix reading extension size on BE 64-bit archs
2009-12-27 | read_index(): fix reading extension size on BE 64-bit archs | Nathaniel W Filardo | 1 | -1/+1
On big endian platforms with 8-byte unsigned long, the code reads the size of the index extension section (which is a 4-byte network byte order integer) incorrectly. Signed-off-by: Junio C Hamano <gitster@pobox.com>
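One portable way to read such a field, shown here only as a sketch with a hypothetical helper name, is to assemble the value byte by byte into a 32-bit type. Copying the four bytes straight into an 8-byte unsigned long would land them in the high half of the value on a big-endian machine.

```c
#include <stdint.h>

/* Decode a 4-byte network byte order (big-endian) integer from a buffer. */
static uint32_t sketch_read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) |
	        (uint32_t)p[3];
}
```

The result can then be widened to unsigned long safely, independent of the host's word size or byte order.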
2009-12-25 | resolve-undo: record resolved conflicts in a new index extension section | Junio C Hamano | 1 | -0/+18
When resolving a conflict using "git add" to create a stage #0 entry, or "git rm" to remove entries at higher stages, remove_index_entry_at() function is eventually called to remove unmerged (i.e. higher stage) entries from the index. Introduce a "resolve_undo_info" structure and keep track of the removed cache entries, and save it in a new index extension section in the index_state. Operations like "read-tree -m", "merge", "checkout [-m] <branch>" and "reset" are signs that recorded information in the index is no longer necessary. The data is removed from the index extension when operations start; they may leave conflicted entries in the index, and later user actions like "git add" will record their conflicted states afresh. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-14 | ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID | Nguyễn Thái Ngọc Duy | 1 | -3/+18
Previously the CE_MATCH_IGNORE_VALID flag was used by both the valid and skip-worktree bits. While the two bits have similar behaviour, sharing this flag means "git update-index --really-refresh" will ignore skip-worktree while it should not.

Instead, another flag is introduced to ignore the skip-worktree bit; CE_MATCH_IGNORE_VALID now only applies to the valid bit.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
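The split can be pictured as two independent option bits, one per entry bit. The sketch below uses hypothetical `SKETCH_*` names and a hypothetical helper; it only illustrates the decision "should the work tree be examined for this entry", not the real ie_match_stat().

```c
/* Hypothetical option bits, one for each entry bit that can be overridden. */
#define SKETCH_IGNORE_VALID         1u
#define SKETCH_IGNORE_SKIP_WORKTREE 2u

/*
 * Returns 1 if the work tree should be checked for this entry.
 * An entry's skip-worktree or valid bit suppresses the check unless
 * the caller explicitly asks to ignore that particular bit.
 */
static int sketch_must_check_worktree(int skip_worktree, int valid, unsigned options)
{
	if (skip_worktree && !(options & SKETCH_IGNORE_SKIP_WORKTREE))
		return 0;
	if (valid && !(options & SKETCH_IGNORE_VALID))
		return 0;
	return 1;
}
```

With separate bits, a caller that passes only the "ignore valid" option still leaves skip-worktree entries alone, which is the behaviour the message asks for.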
2009-08-23 | Teach Git to respect skip-worktree bit (reading part) | Nguyễn Thái Ngọc Duy | 1 | -6/+2
grep: turn on --cached for files that are marked skip-worktree
ls-files: do not check for deleted files that are marked skip-worktree
update-index: ignore update requests for skip-worktree entries, while still allowing removal
diff*: skip worktree version

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-21 | reset: make the reminder output consistent with "checkout" | Matthieu Moy | 1 | -6/+20
git reset without argument displays a summary of the local modifications, like this:

  $ git reset
  Makefile: locally modified

Some people have problems with this; it looks like an error message. This patch makes its output mimic how "git checkout $another_branch" reports the paths with local modifications. "git add --refresh --verbose" is changed in the same way.

It also adds a header to make it clear that the output is informative, and not an error.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
2009-08-21 | Rename REFRESH_SAY_CHANGED to REFRESH_IN_PORCELAIN. | Matthieu Moy | 1 | -1/+1
The change in the output is going to become more general than just saying "changed", so let's make the variable name more general too. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 | Use die_errno() instead of die() when checking syscalls | Thomas Rast | 1 | -1/+1
Lots of die() calls did not actually report the kind of error, which can leave the user confused as to the real problem. Use die_errno() where we check a system/library call that sets errno on failure, or one of the following that wrap such calls:

  Function           Passes on error from
  --------           --------------------
  odb_pack_keep      open
  read_ancestry      fopen
  read_in_full       xread
  strbuf_read        xread
  strbuf_read_file   open or strbuf_read_file
  strbuf_readlink    readlink
  write_in_full      xwrite

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 | Convert existing die(..., strerror(errno)) to die_errno() | Thomas Rast | 1 | -3/+3
Change calls to die(..., strerror(errno)) to use the new die_errno(). In the process, also make slight style adjustments: at least state _something_ about the function that failed (instead of just printing the pathname), and put paths in single quotes. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-15 | checkout bugfix: use stat.mtime instead of stat.ctime in two places | Kjetil Barvik | 1 | -2/+2
Commit e1afca4fd "write_index(): update index_state->timestamp after flushing to disk" on 2009-02-23 used stat.ctime to record the timestamp of the index-file. This is wrong, so fix this and use the correct stat.mtime timestamp instead. Commit 110c46a909 "Not all systems use st_[cm]tim field for ns resolution file timestamp" on 2009-03-08, has a similar bug for the builtin-fetch-pack.c file. Signed-off-by: Kjetil Barvik <barvik@broadpark.no> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-08 | Not all systems use st_[cm]tim field for ns resolution file timestamp | Junio C Hamano | 1 | -2/+2
Some codepaths still do not use the ST_[CM]TIME_NSEC() pair of macros introduced by the previous commit but assume all systems use st_mtim and st_ctim fields in "struct stat" to record the nanosecond resolution part of the file timestamps.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 | Record ns-timestamps if possible, but do not use it without USE_NSEC | Kjetil Barvik | 1 | -25/+4
Traditionally, the lack of USE_NSEC meant "do not record nor use the nanosecond resolution part of the file timestamps". To avoid problems on filesystems that lose the ns part when the metadata is flushed to the disk and then later read back in, disabling USE_NSEC has been a good idea in general.

If you are on a filesystem without such an issue, it does not hurt to read and store them in the cached stat data in the index entries even if your git is compiled without USE_NSEC. The index left by such a version of git can be read by git compiled with USE_NSEC, and it can make use of the nanosecond part to optimize the check to see if the path on the filesystem has been modified since we last looked at it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
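The "record always, compare only when enabled" policy might be sketched as follows. The struct and function are hypothetical, and a runtime `use_nsec` parameter stands in for the compile-time USE_NSEC switch.

```c
/* Cached timestamp: both parts are always recorded. */
struct sketch_time {
	unsigned long sec;
	unsigned long nsec;
};

/*
 * Returns 1 if the file's timestamp differs from the cached one.
 * The nanosecond part is consulted only when use_nsec is set, which
 * mimics building with or without USE_NSEC.
 */
static int sketch_time_changed(struct sketch_time cached,
			       struct sketch_time current, int use_nsec)
{
	if (cached.sec != current.sec)
		return 1;
	if (use_nsec && cached.nsec != current.nsec)
		return 1;
	return 0;
}
```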
2009-02-23 | write_index(): update index_state->timestamp after flushing to disk | Kjetil Barvik | 1 | -2/+10
Since this timestamp is used to check for racy-clean files, it is important to keep it up to date.

For the 'git checkout' command without the '-q' option, this makes a huge difference. Before, each and every file which was updated was racy-clean after the call to unpack_trees() and write_index() but before the GIT process ended.

And because of the call to show_local_changes() in builtin-checkout.c, we ended up reading those files back into memory, doing a SHA1 to check if the files were really different from the index. And, of course, no file was different.

With this fix, 'git checkout' without the '-q' option should now be almost as fast as with the '-q' option, but not quite, as we still do a few more lstat(2) calls without the '-q' option.

Below are some average numbers for 10 checkouts to v2.6.27 and 10 to v2.6.25 of the Linux kernel, to show the difference:

  before (git version 1.6.2.rc1.256.g58a87):
    7.860 user 2.427 sys 19.465 real 52.8% CPU faults: 0 major 95331 minor
  after:
    6.184 user 2.160 sys 17.619 real 47.4% CPU faults: 0 major 38994 minor

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19 | make USE_NSEC work as expected | Kjetil Barvik | 1 | -14/+56
Since the filesystem ext4 is now defined as stable in Linux v2.6.28, and ext4 supports nanosecond resolution timestamps natively, it is time to make USE_NSEC work as expected.

This will make racy git situations less likely to happen. For 'git checkout' this means it will be less likely that we have to open the file, read its contents into RAM, and check whether it is really modified or not. The result should be a little less CPU time used, fewer page faults and a slightly faster program, at least for 'git checkout'.

Since the number of possible racy git situations would increase as disks get faster, this patch will become more and more helpful as time goes by. For a fast Solid State Disk, this patch should be helpful.

Note that, when file operations start to take less than 1 nanosecond, one would again start to get more racy git situations.

For more info on racy git, see Documentation/technical/racy-git.txt
For more info on ext4, see http://kernelnewbies.org/Ext4

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18check_updates(): effective removal of cache entries marked CE_REMOVELibravatar Kjetil Barvik1-0/+20
Below is oprofile output from GIT command 'git checkout -q my-v2.6.25'
(move from tag v2.6.27 to tag v2.6.25 of the Linux kernel):

CPU: Core 2, speed 1999.95 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 20000
Counted INST_RETIRED_ANY_P events (number of instructions retired) with a unit mask of 0x00 (No unit mask) count 20000
CPU_CLK_UNHALT...|INST_RETIRED:2...|
  samples|      %|  samples|      %|
------------------------------------
   409247 100.000    342878 100.000 git
        CPU_CLK_UNHALT...|INST_RETIRED:2...|
          samples|      %|  samples|      %|
        ------------------------------------
           260476 63.6476    257843 75.1996 libz.so.1.2.3
           100876 24.6492     64378 18.7758 kernel-2.6.28.4_2.vmlinux
            30850  7.5382      7874  2.2964 libc-2.9.so
            14775  3.6103      8390  2.4469 git
             2020  0.4936      4325  1.2614 libcrypto.so.0.9.8
              191  0.0467        32  0.0093 libpthread-2.9.so
               58  0.0142        36  0.0105 ld-2.9.so
                1 2.4e-04         0       0 libldap-2.3.so.0.2.31

Detail list of the top 20 function entries (libz counted in one blob):

CPU_CLK_UNHALTED  INST_RETIRED_ANY_P
samples  %        samples  %        image name                 symbol name
260476   63.6862  257843   75.2725  libz.so.1.2.3              /lib/libz.so.1.2.3
16587     4.0555  3636      1.0615  libc-2.9.so                memcpy
7710      1.8851  277       0.0809  libc-2.9.so                memmove
3679      0.8995  1108      0.3235  kernel-2.6.28.4_2.vmlinux  d_validate
3546      0.8670  2607      0.7611  kernel-2.6.28.4_2.vmlinux  __getblk
3174      0.7760  1813      0.5293  libc-2.9.so                _int_malloc
2396      0.5858  3681      1.0746  kernel-2.6.28.4_2.vmlinux  copy_to_user
2270      0.5550  2528      0.7380  kernel-2.6.28.4_2.vmlinux  __link_path_walk
2205      0.5391  1797      0.5246  kernel-2.6.28.4_2.vmlinux  ext4_mark_iloc_dirty
2103      0.5142  1203      0.3512  kernel-2.6.28.4_2.vmlinux  find_first_zero_bit
2077      0.5078  997       0.2911  kernel-2.6.28.4_2.vmlinux  do_get_write_access
2070      0.5061  514       0.1501  git                        cache_name_compare
2043      0.4995  1501      0.4382  kernel-2.6.28.4_2.vmlinux  rcu_irq_exit
2022      0.4944  1732      0.5056  kernel-2.6.28.4_2.vmlinux  __ext4_get_inode_loc
2020      0.4939  4325      1.2626  libcrypto.so.0.9.8         /usr/lib/libcrypto.so.0.9.8
1965      0.4804  1384      0.4040  git                        patch_delta
1708      0.4176  984       0.2873  kernel-2.6.28.4_2.vmlinux  rcu_sched_grace_period
1682      0.4112  727       0.2122  kernel-2.6.28.4_2.vmlinux  sysfs_slab_alias
1659      0.4056  290       0.0847  git                        find_pack_entry_one
1480      0.3619  1307      0.3816  kernel-2.6.28.4_2.vmlinux  ext4_writepage_trans_blocks

Notice the memmove line, where the CPU did 7710 / 277 = 27.8 cycles per instruction. Compared to the total cycles spent inside the source code of GIT for this command, all the memmove() calls translate to (7710 * 100) / 14775 = 52.2% of this.

Retesting with a GIT program compiled for gcov usage, I found that the memmove() calls came from remove_index_entry_at() in read-cache.c, where we have:

    memmove(istate->cache + pos,
            istate->cache + pos + 1,
            (istate->cache_nr - pos) * sizeof(struct cache_entry *));

remove_index_entry_at() is called 4902 times from check_updates() in unpack-trees.c, and on each call we move every cache_entry pointer after the removed one one step to the left. Since we have 28828 entries in the cache this time, and if we on average move half of them each time, we move in total approximately 4902 * 0.5 * 28828 * 4 = 282 629 712 bytes, or twice this amount if each pointer is 8 bytes (64 bit).

OK, it seems that the function check_updates() is called 28 times, so the estimate above would have been more accurate if check_updates() had been called only once, but the point is: we get lots of bytes moved.

To fix this, and use an O(N) algorithm instead, where N is the number of cache_entries, we remove all marked entries in one loop through all entries.
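The difference between the two strategies can be illustrated with a small standalone sketch. This is not the patch itself: struct entry, SKETCH_REMOVE, and both function names are hypothetical stand-ins for struct cache_entry, CE_REMOVE, and the real read-cache.c code. Each function returns the number of pointers it had to move, so the two costs can be compared directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_REMOVE 1u        /* hypothetical stand-in for CE_REMOVE */

struct entry {                  /* hypothetical stand-in for struct cache_entry */
	unsigned int flags;
	const char *name;
};

/*
 * One memmove() per removed entry, in the style of calling
 * remove_index_entry_at() repeatedly. Every removal shifts the whole
 * tail of the array, so the total work can grow quadratically.
 * Returns the number of pointers moved.
 */
static size_t remove_one_at_a_time(struct entry **cache, size_t *nr)
{
	size_t pos = 0, moved = 0;

	assert(cache && nr);
	while (pos < *nr) {
		if (cache[pos]->flags & SKETCH_REMOVE) {
			size_t tail = *nr - pos - 1;

			memmove(cache + pos, cache + pos + 1,
				tail * sizeof(*cache));
			moved += tail;
			(*nr)--;        /* re-check the entry now at pos */
		} else {
			pos++;
		}
	}
	return moved;
}

/*
 * Single pass: walk the array once and copy each kept pointer down to
 * the next free slot, so each pointer moves at most once (O(N)).
 * Returns the number of pointers moved.
 */
static size_t remove_marked_single_pass(struct entry **cache, size_t *nr)
{
	size_t src, dst = 0, moved = 0;

	assert(cache && nr);
	for (src = 0; src < *nr; src++) {
		if (cache[src]->flags & SKETCH_REMOVE)
			continue;       /* drop marked entry */
		if (dst != src) {
			cache[dst] = cache[src];
			moved++;
		}
		dst++;
	}
	*nr = dst;
	return moved;
}
```

Both functions leave the kept entries in their original order. Only the number of pointer moves differs, and the gap widens as more entries are marked for removal.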
From a retest, the new remove_marked_cache_entries() from the patch below ended up with the following output line from oprofile:

    46 0.0105 15 0.0041 git remove_marked_cache_entries

If we can trust the numbers from oprofile in this case, we saved approximately ((7710 - 46) * 20000) / (2 * 1000 * 1000 * 1000) = 0.077 seconds of CPU time with this fix for this particular test. Notice also that the CPU now did only 46 / 15 = 3.1 cycles per instruction.

Signed-off-by: Kjetil Barvik <barvik@broadpark.no>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>