Age | Commit message (Collapse) | Author | Files | Lines |
|
* lt/tree-2:
tree_entry(): new tree-walking helper function
|
|
This adds a "tree_entry()" function that combines the common operation of
doing a "tree_entry_extract()" + "update_tree_entry()".
It also has a simplified calling convention, designed for simple loops
that traverse over a whole tree: the arguments are pointers to the tree
descriptor and a name_entry structure to fill in, and it returns a boolean
"true" if there was an entry left to be gotten in the tree.
This allows tree traversal with
struct tree_desc desc;
struct name_entry entry;
desc.buf = tree->buffer;
desc.size = tree->size;
while (tree_entry(&desc, &entry) {
... use "entry.{path, sha1, mode, pathlen}" ...
}
which is not only shorter than writing it out in full, it's hopefully less
error prone too.
[ It's actually a tad faster too - we don't need to recalculate the entry
pathlength in both extract and update, but need to do it only once.
Also, some callers can avoid doing a "strlen()" on the result, since
it's returned as part of the name_entry structure.
However, by now we're talking just 1% speedup on "git-rev-list --objects
--all", and we're definitely at the point where tree walking is no
longer the issue any more. ]
NOTE! Not everybody wants to use this new helper function, since some of
the tree walkers very much on purpose do the descriptor update separately
from the entry extraction. So the "extract + update" sequence still
remains as the core sequence, this is just a simplified interface.
We should probably add a silly two-line inline helper function for
initializing the descriptor from the "struct tree" too, just to cut down
on the noise from that common "desc" initializer.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
* master:
send-email: do not pass bogus address to local sendmail binary
Add a basic test case for git send-email, and fix some real bugs discovered.
Fix a bug in email extraction used in git-send-email.
Add support for --bcc to git-send-email.
git-send-email: Add References: headers to emails, in addition to In-Reply-To:
git-clean fails on files beginning with a dash
git-svn: remove assertion that broke with older versions of svn
git-svn: t0001: workaround a heredoc bug in old versions of dash
Documentation: fix a tutorial-2 typo
Documentation: retitle the git-core tutorial
documentation: add brief mention of cat-file to tutorial part I
documentation: mention gitk font adjustment in tutorial
Fix some documentation typoes
|
|
This makes t9001 test happy. Also fixes the warning on
uninitialized $references variable again.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Signed-off-by: Ryan Anderson <rda@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
(Also, kill off an accidentally created warning.)
Signed-off-by: Ryan Anderson <rda@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Signed-off-by: Ryan Anderson <rda@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Signed-off-by: Ryan Anderson <rda@google.com>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Reproducible with:
$ git init-db
$ echo "some text" >-file
$ git clean
Removing -file
rm: invalid option -- l
Try `rm --help' for more information.
Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
svn < 1.3.x would display changes to keywords lines as modified
if they aren't expanded in the working copy. We already check
for changes against the git tree here, so checking against the
svn one is probably excessive.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
The dash installed on my Debian Sarge boxes don't seem to like
<<'' as a heredoc starter. Recent versions of dash do not need
this fix.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Fix a typo.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Give the git-core tutorial a name that better reflects its intended
audience.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
I'd rather avoid git cat-file so early on, but the
git-cat-file -p old-commit:/path/to/file
trick is too useful....
Also fix a nearby typo while we're at it.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Kind of silly, but the font I get by default in gitk makes it mostly
unusable for me, so this is the first thing I'd want to know about.
(But maybe there's a better suggestion than just Ctrl-='ing until
satisfied.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Fix some typoes in Documentation/everyday.txt
Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
* jc/lt-tree-n-cache-tree:
adjust to the rebased series by Linus.
Remove last vestiges of generic tree_entry_list
Convert fetch.c: process_tree() to raw tree walker
Convert "mark_tree_uninteresting()" to raw tree walker
Remove unused "zeropad" entry from tree_list_entry
fsck-objects: avoid unnecessary tree_entry_list usage
Remove "tree->entries" tree-entry list from tree parser
builtin-read-tree.c: avoid tree_entry_list in prime_cache_tree_rec()
Switch "read_tree_recursive()" over to tree-walk functionality
Make "tree_entry" have a SHA1 instead of a union of object pointers
Make "struct tree" contain the pointer to the tree buffer
Make git-diff-tree indicate when it flushes
Remove unnecessary output from t3600-rm.
|
|
* jc/lt-tree-n-cache-tree:
adjust to the rebased series by Linus.
Remove "tree->entries" tree-entry list from tree parser
Switch "read_tree_recursive()" over to tree-walk functionality
Make "tree_entry" have a SHA1 instead of a union of object pointers
Add raw tree buffer info to "struct tree"
This results as if an "ours" merge absorbed the previous "next"
branch change into the 10-patch series, but it really is a result
of an honest merge.
nothing to commit
|
|
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
The old tree_entry_list is dead, long live the unified single tree
parser.
Yes, we now still have a compatibility function to create a bogus
tree_entry_list in builtin-read-tree.c, but that is now entirely local
to that very messy piece of code.
I'd love to clean read-tree.c up too, but I'm too scared right now, so
the best I can do is to just contain the damage, and try to make sure
that no new users of the tree_entry_list sprout up by not having it as
an exported interface any more.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This leaves only the horrid code in builtin-read-tree.c using the old
interface. Some day I will gather the strength to tackle that one too.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Not very many users to go..
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
That was a hack, only needed because 'git fsck-objects' didn't look at
the raw tree format. Now that fsck traverses the tree itself, we can
drop it.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Prime example of where the raw tree parser is easier for everybody.
[jc: "Aieee" one-liner fix from the list applied. ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Instead, just use the tree buffer directly, and use the tree-walk
infrastructure to walk the buffers instead of the tree-entry list.
The tree-entry list is inefficient, and generates tons of small
allocations for no good reason. The tree-walk infrastructure is
generally no harder to use than following a linked list, and allows
us to do most tree parsing in-place.
Some programs still use the old tree-entry lists, and are a bit
painful to convert without major surgery. For them we have a helper
function that creates a temporary tree-entry list on demand.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Use the raw tree walker instead.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Don't use the tree_entry list any more.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This is preparatory work for further cleanups, where we try to make
tree_entry look more like the more efficient tree-walk descriptor.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This allows us to avoid allocating information for names etc, because
we can just use the information from the tree buffer directly.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
There are times when gitk needs to know that the commits it has sent
to git-diff-tree --stdin did not match, and it needs to know in a
timely fashion even if none of them match. At the moment,
git-diff-tree outputs nothing for non-matching commits, so it is
impossible for gitk to distinguish between git-diff-tree being slow
and git-diff-tree saying no.
This makes git-diff-tree flush its output and echo back the
input line when it is not a valid-looking object name. Gitk, or
other users of git-diff-tree --stdin, can use a blank line or
any other "marker line" to indicate that git-diff-tree has
processed all the commits on its input up to the echoed back
marker line, and any commits that have not been output do not
match.
[jc: re-done after a couple of back-and-forth discussion on the list.]
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Moved the setup commands into test_expect_success blocks so their
output is hidden unless -v is used. This makes the test suite look
a little cleaner when the rm test-file setup step fails (and was
expected to fail for most cases).
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
* jc/lt-tree-n-cache-tree:
Remove "tree->entries" tree-entry list from tree parser
Switch "read_tree_recursive()" over to tree-walk functionality
Make "tree_entry" have a SHA1 instead of a union of object pointers
Add raw tree buffer info to "struct tree"
Don't use "sscanf()" for tree mode scanning
git-fetch: avoid using "case ... in (arm)"
|
|
* lt/apply:
apply: force matching at the beginning.
Add a test-case for git-apply trying to add an ending line
apply: treat EOF as proper context.
|
|
* jc/cache-tree: (26 commits)
builtin-rm: squelch compiler warnings.
git-write-tree writes garbage on sparc64
Fix crash when reading the empty tree
fsck-objects: do not segfault on missing tree in cache-tree
cache-tree: a bit more debugging support.
read-tree: invalidate cache-tree entry when a new index entry is added.
Fix test-dump-cache-tree in one-tree disappeared case.
fsck-objects: mark objects reachable from cache-tree
cache-tree: replace a sscanf() by two strtol() calls
cache-tree.c: typefix
test-dump-cache-tree: validate the cached data as well.
cache_tree_update: give an option to update cache-tree only.
read-tree: teach 1-way merege and plain read to prime cache-tree.
read-tree: teach 1 and 2 way merges about cache-tree.
update-index: when --unresolve, smudge the relevant cache-tree entries.
test-dump-cache-tree: report number of subtrees.
cache-tree: sort the subtree entries.
Teach fsck-objects about cache-tree.
index: make the index file format extensible.
cache-tree: protect against "git prune".
...
Conflicts:
Makefile, builtin.h, git.c: resolved the same way as in next.
|
|
* lt/tree: (98 commits)
Remove "tree->entries" tree-entry list from tree parser
Switch "read_tree_recursive()" over to tree-walk functionality
Make "tree_entry" have a SHA1 instead of a union of object pointers
Add raw tree buffer info to "struct tree"
Don't use "sscanf()" for tree mode scanning
git-fetch: avoid using "case ... in (arm)"
mailinfo: skip bogus UNIX From line inside body
mailinfo: More carefully parse header lines in read_one_header_line()
Allow in body headers beyond the in body header prefix.
More accurately detect header lines in read_one_header_line
In handle_body only read a line if we don't already have one.
Refactor commit messge handling.
Move B and Q decoding into check header.
Make read_one_header_line return a flag not a length.
Fix memory leak in "git rev-list --objects"
gitview: Move the console error messages to message dialog
gitview: Add key binding for F5.
Let git-clone to pass --template=dir option to git-init-db.
Make cvsexportcommit create parent directories as needed.
Document current cvsexportcommit limitations.
...
Conflicts:
Makefile, builtin.h, git.c are trivially resolved.
builtin-read-tree.c needed adjustment for the tree
parser change.
|
|
* jc/dirwalk-n-cache-tree: (212 commits)
builtin-rm: squelch compiler warnings.
Add builtin "git rm" command
Move pathspec matching from builtin-add.c into dir.c
Prevent bogus paths from being added to the index.
builtin-add: fix unmatched pathspec warnings.
Remove old "git-add.sh" remnants
builtin-add: warn on unmatched pathspecs
Do "git add" as a builtin
Clean up git-ls-file directory walking library interface
libify git-ls-files directory traversal
Add a conversion tool to migrate remote information into the config
fetch, pull: ask config for remote information
Fix build procedure for builtin-init-db
read-tree -m -u: do not overwrite or remove untracked working tree files.
apply --cached: do not check newly added file in the working tree
Implement a --dry-run option to git-quiltimport
Implement git-quiltimport
Revert "builtin-grep: workaround for non GNU grep."
builtin-grep: workaround for non GNU grep.
builtin-grep: workaround for non GNU grep.
...
|
|
This finally removes the tree-entry list from "struct tree", since most of
the users can just use the tree-walk infrastructure to walk the raw tree
buffers instead of the tree-entry list.
The tree-entry list is inefficient, and generates tons of small
allocations for no good reason. The tree-walk infrastructure is generally
no harder to use than following a linked list, and allows us to do most
tree parsing in-place.
Some programs still use the old tree-entry lists, and are a bit painful to
convert without major surgery. For them we have a helper function that
creates a temporary tree-entry list on demand. We can convert those too
eventually, but with this they no longer affect any users who don't need
the explicit lists.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Don't use the tree_entry list, it really had no major reason not to just
walk the raw tree instead.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This is preparatory work for further cleanups, where we try to make
tree_entry look more like the more efficient tree-walk descriptor.
Instead of having a union of pointers to blob/tree/objects, this just
makes "struct tree_entry" have the raw SHA1, and makes all the users use
that instead (often that implies adding a "lookup_tree(..)" on the sha1,
but sometimes the user just wanted the SHA1 in the first place, and it
just avoids an unnecessary indirection).
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
This allows us to avoid allocating information for names etc, because
we can just use the information from the tree buffer directly.
We still keep the old "tree_entry_list" in struct tree as well, so old
users aren't affected, apart from the fact that the allocations are
different (if you free a tree entry, you should no longer free the name
allocation for it, since it's allocated as part of "tree->buffer")
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Doing an oprofile run on the result of my git rev-list memory leak fixes
and tree parsing cleanups, I was surprised by the third-highest entry
being
samples % image name app name symbol name
179751 2.7163 libc-2.4.so libc-2.4.so _IO_vfscanf@@GLIBC_2.4
where that 2.7% is actually more than 5% of one CPU, because this was run
on a dual CPU setup with the other CPU just being idle.
That seems to all be from the use of 'sscanf(tree, "%o", &mode)' for the
tree buffer parsing.
So do the trivial octal parsing by hand, which also gives us where the
first space in the string is (and thus where the pathname starts) so we
can get rid of the "strchr(tree, ' ')" call too.
This brings the "git rev-list --all --objects" time down from 63 seconds
to 55 seconds on the historical kernel archive for me, so it's quite
noticeable - tree parsing is a lot of what we end up doing when following
all the objects.
[ I also see a 5% speedup on a full "git fsck-objects" on the current
kernel archive, so that sscanf() really does seem to have hurt our
performance by a surprising amount ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
NetBSD ash chokes on the optional open parenthesis for case arms. Inside
$(command substitution), however, bash barfs without. So adjust things
accordingly.
Originally pointed out by Dennis Stosberg.
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
* jc/cache-tree:
git-write-tree writes garbage on sparc64
|
|
* master:
Fix memory leak in "git rev-list --objects"
gitview: Move the console error messages to message dialog
gitview: Add key binding for F5.
Let git-clone to pass --template=dir option to git-init-db.
Make cvsexportcommit create parent directories as needed.
Document current cvsexportcommit limitations.
Do not call 'cmp' with non-existant -q flag.
Fix "--abbrev=xyz" for revision listing
t1002: use -U0 instead of --unified=0
format-patch: -n and -k are mutually exclusive.
|
|
* jc/mailinfo:
mailinfo: skip bogus UNIX From line inside body
|
|
* eb/mailinfo:
mailinfo: More carefully parse header lines in read_one_header_line()
Allow in body headers beyond the in body header prefix.
More accurately detect header lines in read_one_header_line
In handle_body only read a line if we don't already have one.
Refactor commit messge handling.
Move B and Q decoding into check header.
Make read_one_header_line return a flag not a length.
|
|
In the "next" branch, write_index_ext_header() writes garbage on a
64-bit big-endian machine; the written index file will be unreadable.
I noticed this on NetBSD/sparc64. Reproducible with:
$ git init-db
$ :>file
$ git-update-index --add file
$ git-write-tree
$ git-update-index
error: index uses extension, which we do not understand
fatal: index file corrupt
Signed-off-by: Dennis Stosberg <dennis@stosberg.net>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Martin Langhoff points out that "git repack -a" ends up using up a lot of
memory for big archives, and that git cvsimport probably should do only
incremental repacks in order to avoid having repacking flush all the
caches.
The big majority of the memory usage of repacking is from git rev-list
tracking all objects, and this patch should go a long way in avoiding the
excessive memory usage: the bulk of it was due to the object names being
leaked from the tree parser.
For the historic Linux kernel archive, this simple patch does:
Before:
/usr/bin/time git-rev-list --all --objects > /dev/null
72.45user 0.82system 1:13.55elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+125376minor)pagefaults 0swaps
After:
/usr/bin/time git-rev-list --all --objects > /dev/null
75.22user 0.48system 1:16.34elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+43921minor)pagefaults 0swaps
where we do end up wasting a bit of time on some extra strdup()s (which
could be avoided, but that would require tracking where the pathnames came
from), but we avoid a lot of memory usage.
Minor page faults track maximum RSS very closely (each page fault maps in
one page into memory), so the reduction from 125376 page faults to 43921
means a rough reduction of VM footprint from almost half a gigabyte to
about a third of that. Those numbers were also double-checked by looking
at "top" while the process was running.
(Side note: at least part of the remaining VM footprint is the mapping of
the 177MB pack-file, so the remaining memory use is at least partly "well
behaved" from a project caching perspective).
For the current git archive itself, the memory usage for a "--all
--objects" rev-list invocation dropped from 7128 pages to 2318 (27MB to
9MB), so the reduction seems to hold for much smaller projects too.
For regular "git-rev-list" usage (ie without the "--objects" flag) this
patch has no impact.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Signed-off-by: Junio C Hamano <junkio@cox.net>
|
|
Signed-off-by: Junio C Hamano <junkio@cox.net>
|