summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2012-02-02grep: load file data after checking binary-nessLibravatar Jeff King1-3/+3
Usually we load each file to grep into memory, check whether it's binary, and then either grep it (the default) or not (if "-I" was given). In the "-I" case, we can skip loading the file entirely if it is marked as binary via gitattributes. On my giant 3-gigabyte media repository, doing "git grep -I foo" went from: real 0m0.712s user 0m0.044s sys 0m4.780s to: real 0m0.026s user 0m0.016s sys 0m0.020s Obviously this is an extreme example. The repo is almost entirely binary files, and you can see that we spent all of our time asking the kernel to read() the data. However, with a cold disk cache, even avoiding a few binary files can have an impact. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02grep: respect diff attributes for binary-nessLibravatar Jeff King3-2/+39
There is currently no way for users to tell git-grep that a particular path is or is not a binary file; instead, grep always relies on its auto-detection (or the user specifying "-a" to treat all binary-looking files like text). This patch teaches git-grep to use the same attribute lookup that is used by git-diff. We could add a new "grep" flag, but that is unnecessarily complex and unlikely to be useful. Despite the name, the "-diff" attribute (or "diff=foo" and the associated diff.foo.binary config option) are really about describing the contents of the path. It's simply historical that diff was the only thing that cared about these attributes in the past. And if this simple approach turns out to be insufficient, we still have a backwards-compatible path forward: we can add a separate "grep" attribute, and fall back to respecting "diff" if it is unset. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02grep: cache userdiff_driver in grep_sourceLibravatar Jeff King2-6/+20
Right now, grep only uses the userdiff_driver for one thing: looking up funcname patterns for "-p" and "-W". As new uses for userdiff drivers are added to the grep code, we want to minimize attribute lookups, which can be expensive. It might seem at first that this would also optimize multiple lookups when the funcname pattern for a file is needed multiple times. However, the compiled funcname pattern is already cached in struct grep_opt's "priv" member, so multiple lookups are already suppressed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02grep: drop grep_buffer's "name" parameterLibravatar Jeff King3-4/+3
Before the grep_source interface existed, grep_buffer was used by two types of callers: 1. Ones which pulled a file into a buffer, and then wanted to supply the file's name for the output (i.e., git grep). 2. Ones which really just wanted to grep a buffer (i.e., git log --grep). Callers in set (1) should now be using grep_source. Callers in set (2) always pass NULL for the "name" parameter of grep_buffer. We can therefore get rid of this now-useless parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02convert git-grep to use grep_source interfaceLibravatar Jeff King1-119/+23
The grep_source interface (as opposed to grep_buffer) will eventually gives us a richer interface for telling the low-level grep code about our buffers. Eventually this will lead to things like better binary-file handling. For now, it lets us drop a lot of now-redundant code. The conversion is mostly straight-forward. One thing to note is that the memory ownership rules for "struct grep_source" are different than the "struct work_item" found here (the former will copy things like the filename, rather than taking ownership). Therefore you will also see some slight tweaking of when filename buffers are released. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02grep: refactor the concept of "grep source" into an objectLibravatar Jeff King2-34/+186
The main interface to the low-level grep code is grep_buffer, which takes a pointer to a buffer and a size. This is convenient and flexible (we use it to grep commit bodies, files on disk, and blobs by sha1), but it makes it hard to pass extra information about what we are grepping (either for correctness, like overriding binary auto-detection, or for optimizations, like lazily loading blob contents). Instead, let's encapsulate the idea of a "grep source", including the buffer, its size, and where the data is coming from. This is similar to the diff_filespec structure used by the diff code (unsurprising, since future patches will implement some of the same optimizations found there). The diffstat is slightly scarier than the actual patch content. Most of the modified lines are simply replacing access to raw variables with their counterparts that are now in a "struct grep_source". Most of the added lines were taken from builtin/grep.c, which partially abstracted the idea of grep sources (for file vs sha1 sources). Instead of dropping the now-redundant code, this patch leaves builtin/grep.c using the traditional grep_buffer interface (which now wraps the grep_source interface). That makes it easy to test that there is no change of behavior (yet). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02grep: move sha1-reading mutex into low-level codeLibravatar Jeff King3-23/+29
The multi-threaded git-grep code needs to serialize access to the thread-unsafe read_sha1_file call. It does this with a mutex that is local to builtin/grep.c. Let's instead push this down into grep.c, where it can be used by both builtin/grep.c and grep.c. This will let us safely teach the low-level grep.c code tricks that involve reading from the object db. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-02-02grep: make locking flag globalLibravatar Jeff King3-11/+13
The low-level grep code traditionally didn't care about threading, as it doesn't do any threading itself and didn't call out to other non-thread-safe code. That changed with 0579f91 (grep: enable threading with -p and -W using lazy attribute lookup, 2011-12-12), which pushed the lookup of funcname attributes (which is not thread-safe) into the low-level grep code. As a result, the low-level code learned about a new global "grep_attr_mutex" to serialize access to the attribute code. A multi-threaded caller (e.g., builtin/grep.c) is expected to initialize the mutex and set "use_threads" in the grep_opt structure. The low-level code only uses the lock if use_threads is set. However, putting the use_threads flag into the grep_opt struct is not the most logical place. Whether threading is in use is not something that matters for each call to grep_buffer, but is instead global to the whole program (i.e., if any thread is doing multi-threaded grep, every other thread, even if it thinks it is doing its own single-threaded grep, would need to use the locking). In practice, this distinction isn't a problem for us, because the only user of multi-threaded grep is "git-grep", which does nothing except call grep. This patch turns the opt->use_threads flag into a global flag. More important than the nit-picking semantic argument above is that this means that the locking functions don't need to actually have access to a grep_opt to know whether to lock. Which in turn can make adding new locks simpler, as we don't need to pass around a grep_opt. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-27Git 1.7.9Libravatar Junio C Hamano2-1/+6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-26INSTALL: warn about recent Fedora breakageLibravatar Junio C Hamano1-1/+5
Recent releases of Redhat/Fedora are reported to ship Perl binary package with some core modules stripped away (see http://lwn.net/Articles/477234/) against the upstream Perl5 people's wishes. The Time::HiRes module used by gitweb one of them. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-26git-completion: workaround zsh COMPREPLY bugLibravatar Felipe Contreras1-0/+8
zsh adds a backslash (foo\ ) for each item in the COMPREPLY array if IFS doesn't contain spaces. This issue has been reported[1], but there is no solution yet. This wasn't a problem due to another bug[2], which was fixed in zsh version 4.3.12. After this change, 'git checkout ma<tab>' would resolve to 'git checkout master\ '. Aditionally, the introduction of __gitcomp_nl in commit a31e626 (completion: optimize refs completion) in git also made the problem apparent, as Matthieu Moy reported. The simplest and most generic solution is to hide all the changes we do to IFS, so that "foo \nbar " is recognized by zsh as "foo bar". This works on versions of git before and after the introduction of __gitcomp_nl (a31e626), and versions of zsh before and after 4.3.12. Once zsh is fixed, we should conditionally disable this workaround to have the same benefits as bash users. [1] http://www.zsh.org/mla/workers/2012/msg00053.html [2] http://zsh.git.sourceforge.net/git/gitweb.cgi?p=zsh/zsh;a=commitdiff;h=2e25dfb8fd38dbef0a306282ffab1d343ce3ad8d Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-26docs: minor grammar fixes for v1.7.9 release notesLibravatar Jeff King1-6/+7
Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-23Fix typo in 1.7.9 release notesLibravatar Michael Haggerty1-1/+1
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-18Git 1.7.9-rc2Libravatar Junio C Hamano2-10/+2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-18Merge branch 'maint'Libravatar Junio C Hamano3-2/+12
* maint: Git 1.7.8.4 Git 1.7.7.6 diff-index: enable recursive pathspec matching in unpack_trees Conflicts: GIT-VERSION-GEN
2012-01-18Git 1.7.8.4Libravatar Junio C Hamano3-3/+9
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-18Merge branch 'maint-1.7.7' into maintLibravatar Junio C Hamano3-0/+14
* maint-1.7.7: Git 1.7.7.6 diff-index: enable recursive pathspec matching in unpack_trees Conflicts: GIT-VERSION-GEN
2012-01-18Git 1.7.7.6Libravatar Junio C Hamano2-1/+5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-18diff-index: enable recursive pathspec matching in unpack_treesLibravatar Nguyen Thai Ngoc Duy2-0/+10
The pathspec structure has a few bits of data to drive various operation modes after we unified the pathspec matching logic in various codepaths. For example, max_depth field is there so that "git grep" can limit the output for files found in limited depth of tree traversal. Also in order to show just the surface level differences in "git diff-tree", recursive field stops us from descending into deeper level of the tree structure when it is set to false, and this also affects pathspec matching when we have wildcards in the pathspec. The diff-index has always wanted the recursive behaviour, and wanted to match pathspecs without any depth limit. But we forgot to do so when we updated tree_entry_interesting() logic to unify the pathspec matching logic. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-18Merge branch 'jc/pull-signed-tag-doc'Libravatar Junio C Hamano2-1/+221
* jc/pull-signed-tag-doc: pulling signed tag: add howto document
2012-01-18pulling signed tag: add howto documentLibravatar Junio C Hamano2-1/+221
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-18Merge branch 'jk/credentials'Libravatar Junio C Hamano2-22/+92
* jk/credentials: credential-cache: ignore "connection refused" errors unix-socket: do not let close() or chdir() clobber errno during cleanup credential-cache: report more daemon connection errors unix-socket: handle long socket pathnames
2012-01-18Merge branch 'nd/pathspec-recursion-cleanup'Libravatar Junio C Hamano4-0/+16
* nd/pathspec-recursion-cleanup: diff-index: enable recursive pathspec matching in unpack_trees Document limited recursion pathspec matching with wildcards
2012-01-18Merge branch 'mh/maint-show-ref-doc'Libravatar Junio C Hamano1-3/+3
* mh/maint-show-ref-doc: git-show-ref doc: typeset regexp in fixed width font git-show-ref: fix escaping in asciidoc source
2012-01-18Merge branch 'tr/maint-word-diff-incomplete-line'Libravatar Junio C Hamano2-0/+23
* tr/maint-word-diff-incomplete-line: word-diff: ignore '\ No newline at eof' marker
2012-01-16credential-cache: ignore "connection refused" errorsLibravatar Jeff King1-1/+1
The credential-cache helper will try to connect to its daemon over a unix socket. Originally, a failure to do so was silently ignored, and we would either give up (if performing a "get" or "erase" operation), or spawn a new daemon (for a "store" operation). But since 8ec6c8d, we try to report more errors. We detect a missing daemon by checking for ENOENT on our connection attempt. If the daemon is missing, we continue as before (giving up or spawning a new daemon). For any other error, we die and report the problem. However, checking for ENOENT is not sufficient for a missing daemon. We might also get ECONNREFUSED if a dead daemon process left a stale socket. This generally shouldn't happen, as the daemon cleans up after itself, but the daemon may not always be given a chance to do so (e.g., power loss, "kill -9"). The resulting state is annoying not just because the helper outputs an extra useless message, but because it actually blocks the helper from spawning a new daemon to replace the stale socket. Fix it by checking for ECONNREFUSED. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-16Merge branch 'jn/maint-gitweb-grep-fix'Libravatar Junio C Hamano1-10/+10
* jn/maint-gitweb-grep-fix: gitweb: Harden "grep" search against filenames with ':' gitweb: Fix file links in "grep" search
2012-01-16diff-index: enable recursive pathspec matching in unpack_treesLibravatar Nguyen Thai Ngoc Duy2-0/+10
The pathspec structure has a few bits of data to drive various operation modes after we unified the pathspec matching logic in various codepaths. For example, max_depth field is there so that "git grep" can limit the output for files found in limited depth of tree traversal. Also in order to show just the surface level differences in "git diff-tree", recursive field stops us from descending into deeper level of the tree structure when it is set to false, and this also affects pathspec matching when we have wildcards in the pathspec. The diff-index has always wanted the recursive behaviour, and wanted to match pathspecs without any depth limit. But we forgot to do so when we updated tree_entry_interesting() logic to unify the pathspec matching logic. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-14Document limited recursion pathspec matching with wildcardsLibravatar Nguyễn Thái Ngọc Duy2-0/+6
It's actually unlimited recursion if wildcards are active regardless --max-depth Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-13git-show-ref doc: typeset regexp in fixed width fontLibravatar Michael Haggerty1-1/+1
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-13git-show-ref: fix escaping in asciidoc sourceLibravatar Michael Haggerty1-3/+3
Two "^" characters were incorrectly being interpreted as markup for superscripting. Fix them by writing them as attribute references "{caret}". Although a single "^" character in a paragraph cannot be misinterpreted in this way, also write other "^" characters as "{caret}" in the interest of good hygiene (unless they are in literal paragraphs, of course, in which context attribute references are not recognized). Spell "{}" consistently, namely *not* quoted as "\{\}". Since the braces are empty, they cannot be interpreted as an attribute reference, and either spelling is OK. So arbitrarily choose one variation and use it consistently. Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12Git 1.7.9-rc1Libravatar Junio C Hamano2-2/+6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12Merge branch 'jc/request-pull-show-head-4'Libravatar Junio C Hamano1-1/+1
* jc/request-pull-show-head-4: request-pull: use the real fork point when preparing the message
2012-01-12Merge branch 'tr/maint-mailinfo'Libravatar Junio C Hamano1-7/+18
* tr/maint-mailinfo: mailinfo documentation: accurately describe non -k case
2012-01-12Merge branch 'ss/maint-msys-cvsexportcommit'Libravatar Junio C Hamano2-3/+10
* ss/maint-msys-cvsexportcommit: git-cvsexportcommit: Fix calling Perl's rel2abs() on MSYS t9200: On MSYS, do not pass Windows-style paths to CVS
2012-01-12Merge branch 'jk/maint-upload-archive'Libravatar Junio C Hamano1-6/+15
* jk/maint-upload-archive: archive: re-allow HEAD:Documentation on a remote invocation
2012-01-12Merge branch 'maint'Libravatar Junio C Hamano4-2/+22
* maint: Update draft release notes to 1.7.8.4 Update draft release notes to 1.7.7.6 Update draft release notes to 1.7.6.6 thin-pack: try harder to use preferred base objects as base
2012-01-12Update draft release notes to 1.7.8.4Libravatar Junio C Hamano1-0/+5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12Merge branch 'maint-1.7.7' into maintLibravatar Junio C Hamano3-2/+17
* maint-1.7.7: Update draft release notes to 1.7.7.6 Update draft release notes to 1.7.6.6 thin-pack: try harder to use preferred base objects as base
2012-01-12Update draft release notes to 1.7.7.6Libravatar Junio C Hamano1-0/+5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12Merge branch 'maint-1.7.6' into maint-1.7.7Libravatar Junio C Hamano2-2/+12
* maint-1.7.6: Update draft release notes to 1.7.6.6 thin-pack: try harder to use preferred base objects as base
2012-01-12Update draft release notes to 1.7.6.6Libravatar Junio C Hamano1-0/+5
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12thin-pack: try harder to use preferred base objects as baseLibravatar Jeff King1-2/+7
When creating a pack using objects that reside in existing packs, we try to avoid recomputing futile delta between an object (trg) and a candidate for its base object (src) if they are stored in the same packfile, and trg is not recorded as a delta already. This heuristics makes sense because it is likely that we tried to express trg as a delta based on src but it did not produce a good delta when we created the existing pack. As the pack heuristics prefer producing delta to remove data, and Linus's law dictates that the size of a file grows over time, we tend to record the newest version of the file as inflated, and older ones as delta against it. When creating a thin-pack to transfer recent history, it is likely that we will try to send an object that is recorded in full, as it is newer. But the heuristics to avoid recomputing futile delta effectively forbids us from attempting to express such an object as a delta based on another object. Sending an object in full is often more expensive than sending a suboptimal delta based on other objects, and it is even more so if we could use an object we know the receiving end already has (i.e. preferred base object) as the delta base. Tweak the recomputation avoidance logic, so that we do not punt on computing delta against a preferred base object. The effect of this change can be seen on two simulated upload-pack workloads. The first is based on 44 reflog entries from my git.git origin/master reflog, and represents the packs that kernel.org sent me git updates for the past month or two. The second workload represents much larger fetches, going from git's v1.0.0 tag to v1.1.0, then v1.1.0 to v1.2.0, and so on. The table below shows the average generated pack size and the average CPU time consumed for each dataset, both before and after the patch: dataset | reflog | tags --------------------------------- before | 53358 | 2750977 size after | 32398 | 2668479 change | -39% | -3% --------------------------------- before | 0.18 | 1.12 CPU after | 0.18 | 1.15 change | +0% | +3% This patch makes a much bigger difference for packs with a shorter slice of history (since its effect is seen at the boundaries of the pack) though it has some benefit even for larger packs. Signed-off-by: Jeff King <peff@peff.net> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-12word-diff: ignore '\ No newline at eof' markerLibravatar Thomas Rast2-0/+23
The word-diff logic accumulates + and - lines until another line type appears (normally [ @\]), at which point it generates the word diff. This is usually correct, but it breaks when the preimage does not have a newline at EOF: $ printf "%s" "a a a" >a $ printf "%s\n" "a ab a" >b $ git diff --no-index --word-diff a b diff --git 1/a 2/b index 9f68e94..6a7c02f 100644 --- 1/a +++ 2/b @@ -1 +1 @@ [-a a a-] No newline at end of file {+a ab a+} Because of the order of the lines in a unified diff @@ -1 +1 @@ -a a a \ No newline at end of file +a ab a the '\' line flushed the buffers, and the - and + lines were never matched with each other. A proper fix would defer such markers until the end of the hunk. However, word-diff is inherently whitespace-ignoring, so as a cheap fix simply ignore the marker (and hide it from the output). We use a prefix match for '\ ' to parallel the logic in apply.c:parse_fragment(). We currently do not localize this string (just accept other variants of it in git-apply), but this should be future-proof. Noticed-by: Ivan Shirokoff <shirokoff@yandex-team.ru> Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-11archive: re-allow HEAD:Documentation on a remote invocationLibravatar Carlos Martín Nieto1-6/+15
The tightening done in (ee27ca4a: archive: don't let remote clients get unreachable commits, 2011-11-17) went too far and disallowed HEAD:Documentation as it would try to find "HEAD:Documentation" as a ref. Only DWIM the "HEAD" part to see if it exists as a ref. Once we're sure that we've been given a valid ref, we follow the normal code path. This still disallows attempts to access commits which are not branch tips. Signed-off-by: Carlos Martín Nieto <cmn@elego.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-11Merge branch 'maint'Libravatar Junio C Hamano2-1/+2
* maint: attr: fix leak in free_attr_elem t2203: fix wrong commit command
2012-01-11Merge branch 'maint-1.7.7' into maintLibravatar Junio C Hamano2-1/+2
* maint-1.7.7: attr: fix leak in free_attr_elem t2203: fix wrong commit command
2012-01-11Merge branch 'maint-1.7.6' into maint-1.7.7Libravatar Junio C Hamano2-1/+2
* maint-1.7.6: attr: fix leak in free_attr_elem t2203: fix wrong commit command
2012-01-11attr: fix leak in free_attr_elemLibravatar Jeff King1-0/+1
This function frees the individual "struct match_attr"s we have allocated, but forgot to free the array holding their pointers, leading to a minor memory leak (but it can add up after checking attributes for paths in many directories). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-11git-cvsexportcommit: Fix calling Perl's rel2abs() on MSYSLibravatar Sebastian Schuberth1-0/+7
Due to MSYS path mangling GIT_DIR contains a Windows-style path when checked inside a Perl script even if GIT_DIR was previously set to an MSYS-style path in a shell script. So explicitly convert to an MSYS-style path before calling Perl's rel2abs() to make it work. This fix was inspired by a very similar patch in WebKit: http://trac.webkit.org/changeset/76255/trunk/Tools/Scripts/commit-log-editor Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com> Tested-by: Pat Thoyts <patthoyts@users.sourceforge.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>