summaryrefslogtreecommitdiff
path: root/cache.h
AgeCommit message (Collapse)AuthorFilesLines
2011-08-17Merge branch 'js/ref-namespaces'Libravatar Junio C Hamano1-0/+3
* js/ref-namespaces: ref namespaces: tests ref namespaces: documentation ref namespaces: Support remote repositories via upload-pack and receive-pack ref namespaces: infrastructure Fix prefix handling in ref iteration functions
2011-08-11Merge branch 'cb/partial-commit-relative-pathspec'Libravatar Junio C Hamano1-0/+1
* cb/partial-commit-relative-pathspec: commit: allow partial commits with relative paths
2011-08-05Merge branch 'jc/pack-order-tweak'Libravatar Junio C Hamano1-0/+3
* jc/pack-order-tweak: pack-objects: optimize "recency order" core: log offset pack data accesses happened
2011-08-02commit: allow partial commits with relative pathsLibravatar Clemens Buchacher1-0/+1
In order to do partial commits, git-commit overlays a tree on the cache and checks pathspecs against the result. Currently, the overlaying is done using "prefix" which prevents relative pathspecs with ".." and absolute pathspec from matching when they refer to files not under "prefix" and absent from the index, but still in the tree (i.e. files staged for removal). The point of providing a prefix at all is performance optimization. If we say there is no common prefix for the files of interest, then we have to read the entire tree into the index. But even if we cannot use the working directory as a prefix, we can still figure out if there is a common prefix for all given paths, and use that instead. The pathspec_prefix() routine from ls-files.c does exactly that. Any use of global variables is removed from pathspec_prefix() so that it can be called from commit.c. Reported-by: Reuben Thomas <rrt@sc3d.org> Analyzed-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Clemens Buchacher <drizzd@aon.at> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-19Merge branch 'jc/index-pack'Libravatar Junio C Hamano1-1/+0
* jc/index-pack: verify-pack: use index-pack --verify index-pack: show histogram when emulating "verify-pack -v" index-pack: start learning to emulate "verify-pack -v" index-pack: a miniscule refactor index-pack --verify: read anomalous offsets from v2 idx file write_idx_file: need_large_offset() helper function index-pack: --verify write_idx_file: introduce a struct to hold idx customization options index-pack: group the delta-base array entries also by type Conflicts: builtin/verify-pack.c cache.h sha1_file.c
2011-07-19Merge branch 'jk/clone-cmdline-config'Libravatar Junio C Hamano1-0/+2
* jk/clone-cmdline-config: clone: accept config options on the command line config: make git_config_parse_parameter a public function remote: use new OPT_STRING_LIST parse-options: add OPT_STRING_LIST helper
2011-07-19Merge branch 'jc/zlib-wrap'Libravatar Junio C Hamano1-9/+23
* jc/zlib-wrap: zlib: allow feeding more than 4GB in one go zlib: zlib can only process 4GB at a time zlib: wrap deflateBound() too zlib: wrap deflate side of the API zlib: wrap inflateInit2 used to accept only for gzip format zlib: wrap remaining calls to direct inflate/inflateEnd zlib wrapper: refactor error message formatter Conflicts: sha1_file.c
2011-07-06core: log offset pack data accesses happenedLibravatar Junio C Hamano1-0/+3
In a workload other than "git log" (without pathspec nor any option that causes us to inspect trees and blobs), the recency pack order is said to cause the access jump around quite a bit. Add a hook to allow us observe how bad it is. "git config core.logpackaccess /var/tmp/pal.txt" will give you the log in the specified file. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-06ref namespaces: infrastructureLibravatar Josh Triplett1-0/+3
Add support for dividing the refs of a single repository into multiple namespaces, each of which can have its own branches, tags, and HEAD. Git can expose each namespace as an independent repository to pull from and push to, while sharing the object store, and exposing all the refs to operations such as git-gc. Storing multiple repositories as namespaces of a single repository avoids storing duplicate copies of the same objects, such as when storing multiple branches of the same source. The alternates mechanism provides similar support for avoiding duplicates, but alternates do not prevent duplication between new objects added to the repositories without ongoing maintenance, while namespaces do. To specify a namespace, set the GIT_NAMESPACE environment variable to the namespace. For each ref namespace, git stores the corresponding refs in a directory under refs/namespaces/. For example, GIT_NAMESPACE=foo will store refs under refs/namespaces/foo/. You can also specify namespaces via the --namespace option to git. Note that namespaces which include a / will expand to a hierarchy of namespaces; for example, GIT_NAMESPACE=foo/bar will store refs under refs/namespaces/foo/refs/namespaces/bar/. This makes paths in GIT_NAMESPACE behave hierarchically, so that cloning with GIT_NAMESPACE=foo/bar produces the same result as cloning with GIT_NAMESPACE=foo and cloning from that repo with GIT_NAMESPACE=bar. It also avoids ambiguity with strange namespace paths such as foo/refs/heads/, which could otherwise generate directory/file conflicts within the refs directory. Add the infrastructure for ref namespaces: handle the GIT_NAMESPACE environment variable and --namespace option, and support iterating over refs in a namespace. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Signed-off-by: Jamey Sharp <jamey@minilop.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-29Merge branch 'jc/streaming-filter' into nextLibravatar Junio C Hamano1-37/+1
* jc/streaming-filter: t0021: test application of both crlf and ident t0021-conversion.sh: fix NoTerminatingSymbolAtEOF test streaming: filter cascading streaming filter: ident filter Add LF-to-CRLF streaming conversion stream filter: add "no more input" to the filters Add streaming filter API convert.h: move declarations for conversion from cache.h
2011-06-29Merge branch 'jc/streaming' into nextLibravatar Junio C Hamano1-1/+35
* jc/streaming: sha1_file: use the correct type (ssize_t, not size_t) for read-style function streaming: read loose objects incrementally sha1_file.c: expose helpers to read loose objects streaming: read non-delta incrementally from a pack streaming_write_entry(): support files with holes convert: CRLF_INPUT is a no-op in the output codepath streaming_write_entry(): use streaming API in write_entry() streaming: a new API to read from the object store write_entry(): separate two helper functions out unpack_object_header(): make it public sha1_object_info_extended(): hint about objects in delta-base cache sha1_object_info_extended(): expose a bit more info packed_object_info_detail(): do not return a string
2011-06-29Merge branch 'ef/maint-win-verify-path'Libravatar Junio C Hamano1-1/+1
* ef/maint-win-verify-path: verify_dotfile(): do not assume '/' is the path seperator verify_path(): simplify check at the directory boundary verify_path: consider dos drive prefix real_path: do not assume '/' is the path seperator A Windows path starting with a backslash is absolute
2011-06-22config: make git_config_parse_parameter a public functionLibravatar Jeff King1-0/+2
We use this internally to parse "git -c core.foo=bar", but the general format of "key=value" is useful for other places. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: zlib can only process 4GB at a timeLibravatar Junio C Hamano1-13/+22
The size of objects we read from the repository and data we try to put into the repository are represented in "unsigned long", so that on larger architectures we can handle objects that weigh more than 4GB. But the interface defined in zlib.h to communicate with inflate/deflate limits avail_in (how many bytes of input are we calling zlib with) and avail_out (how many bytes of output from zlib are we ready to accept) fields effectively to 4GB by defining their type to be uInt. In many places in our code, we allocate a large buffer (e.g. mmap'ing a large loose object file) and tell zlib its size by assigning the size to avail_in field of the stream, but that will truncate the high octets of the real size. The worst part of this story is that we often pass around z_stream (the state object used by zlib) to keep track of the number of used bytes in input/output buffer by inspecting these two fields, which practically limits our callchain to the same 4GB limit. Wrap z_stream in another structure git_zstream that can express avail_in and avail_out in unsigned long. For now, just die() when the caller gives a size that cannot be given to a single zlib call. In later patches in the series, we would make git_inflate() and git_deflate() internally loop to give callers an illusion that our "improved" version of zlib interface can operate on a buffer larger than 4GB in one go. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap deflateBound() tooLibravatar Junio C Hamano1-3/+1
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap deflate side of the APILibravatar Junio C Hamano1-0/+6
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use of deflateInit2 in remote-curl.c to tell the library to use gzip header and trailer in git_deflate_init_gzip(). There is only one caller that cares about the status from deflateEnd(). Introduce git_deflate_end_gently() to let that sole caller retrieve the status and act on it (i.e. die) for now, but we would probably want to make inflate_end/deflate_end die when they ran out of memory and get rid of the _gently() kind. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10zlib: wrap inflateInit2 used to accept only for gzip formatLibravatar Junio C Hamano1-0/+1
http-backend.c uses inflateInit2() to tell the library that it wants to accept only gzip format. Wrap it in a helper function so that readers do not have to wonder what the magic numbers 15 and 16 are for. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-05verify-pack: use index-pack --verifyLibravatar Junio C Hamano1-1/+0
This finally gets rid of the inefficient verify-pack implementation that walks objects in the packfile in their object name order and replaces it with a call to index-pack --verify. As a side effect, it also removes packed_object_info_detail() API which is rather expensive. As this changes the way errors are reported (verify-pack used to rely on the usual runtime error detection routine unpack_entry() to diagnose the CRC errors in an entry in the *.idx file; index-pack --verify checks the whole *.idx file in one go), update a test that expected the string "CRC" to appear in the error message. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-01Merge branch 'jk/maint-config-alias-fix' into maintLibravatar Junio C Hamano1-2/+0
* jk/maint-config-alias-fix: handle_options(): do not miscount how many arguments were used config: always parse GIT_CONFIG_PARAMETERS during git_config git_config: don't peek at global config_parameters config: make environment parsing routines static
2011-05-30Merge branch 'jk/maint-config-alias-fix'Libravatar Junio C Hamano1-2/+0
* jk/maint-config-alias-fix: handle_options(): do not miscount how many arguments were used config: always parse GIT_CONFIG_PARAMETERS during git_config git_config: don't peek at global config_parameters config: make environment parsing routines static Conflicts: config.c
2011-05-27A Windows path starting with a backslash is absoluteLibravatar Theo Niessink1-1/+1
This fixes prefix_path() not recognizing e.g. \foo\bar as an absolute path on Windows. Signed-off-by: Theo Niessink <theo@taletn.com> Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-26convert.h: move declarations for conversion from cache.hLibravatar Junio C Hamano1-37/+1
Before adding the streaming filter API to the conversion layer, move the existing declarations related to the conversion to its own header file. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-26Merge branch 'jk/git-connection-deadlock-fix' into maint-1.7.4Libravatar Junio C Hamano1-0/+1
* jk/git-connection-deadlock-fix: test core.gitproxy configuration send-pack: avoid deadlock on git:// push with failed pack-objects connect: let callers know if connection is a socket connect: treat generic proxy processes like ssh processes Conflicts: connect.c
2011-05-26Merge branch 'jk/git-connection-deadlock-fix' into maintLibravatar Junio C Hamano1-0/+1
* jk/git-connection-deadlock-fix: test core.gitproxy configuration send-pack: avoid deadlock on git:// push with failed pack-objects connect: let callers know if connection is a socket connect: treat generic proxy processes like ssh processes Conflicts: connect.c
2011-05-25Merge branch 'jc/bigfile'Libravatar Junio C Hamano1-2/+5
* jc/bigfile: Bigfile: teach "git add" to send a large file straight to a pack index_fd(): split into two helper functions index_fd(): turn write_object and format_check arguments into one flag
2011-05-24config: make environment parsing routines staticLibravatar Jeff King1-2/+0
Nobody outside of git_config_from_parameters should need to use the GIT_CONFIG_PARAMETERS parsing functions, so let's make them private. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-23Merge branch 'mg/config-symbolic-constants'Libravatar Junio C Hamano1-0/+10
* mg/config-symbolic-constants: config: Give error message when not changing a multivar config: define and document exit codes
2011-05-23Merge branch 'jc/magic-pathspec'Libravatar Junio C Hamano1-4/+4
* jc/magic-pathspec: setup.c: Fix some "symbol not declared" sparse warnings t3703: Skip tests using directory name ":" on Windows revision.c: leave a note for "a lone :" enhancement t3703, t4208: add test cases for magic pathspec rev/path disambiguation: further restrict "misspelled index entry" diag fix overslow :/no-such-string-ever-existed diagnostics fix overstrict :<path> diagnosis grep: use get_pathspec() correctly pathspec: drop "lone : means no pathspec" from get_pathspec() Revert "magic pathspec: add ":(icase)path" to match case insensitively" magic pathspec: add ":(icase)path" to match case insensitively magic pathspec: futureproof shorthand form magic pathspec: add tentative ":/path/from/top/level" pathspec support
2011-05-20sha1_file.c: expose helpers to read loose objectsLibravatar Junio C Hamano1-0/+3
Make map_sha1_file(), parse_sha1_header() and unpack_sha1_header() available to the streaming read API by exporting them via cache.h header file. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20streaming_write_entry(): use streaming API in write_entry()Libravatar Junio C Hamano1-0/+1
When the output to a path does not have to be converted, we can read from the object database from the streaming API and write to the file in the working tree, without having to hold everything in the memory. The ident, auto- and safe- crlf conversions inherently require you to read the whole thing before deciding what to do, so while it is technically possible to support them by using a buffer of an unbound size or rewinding and reading the stream twice, it is less practical than the traditional "read the whole thing in core and convert" approach. Adding streaming filters for the other conversions on top of this should be doable by tweaking the can_bypass_conversion() function (it should be renamed to can_filter_stream() when it happens). Then the streaming API can be extended to wrap the git_istream streaming_write_entry() opens on the underlying object in another git_istream that reads from it, filters what is read, and let the streaming_write_entry() read the filtered result. But that is outside the scope of this series. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20unpack_object_header(): make it publicLibravatar Junio C Hamano1-0/+1
This function is used to read and skip over the per-object header in a packfile. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20sha1_object_info_extended(): hint about objects in delta-base cacheLibravatar Junio C Hamano1-1/+2
An object found in the delta-base cache is not guaranteed to stay there, but we know it came from a pack and it is likely to give us a quick access if we read_sha1_file() it right now, which is a piece of useful information. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-19Merge branch 'jc/replacing'Libravatar Junio C Hamano1-4/+14
* jc/replacing: read_sha1_file(): allow selective bypassing of replacement mechanism inline lookup_replace_object() calls read_sha1_file(): get rid of read_sha1_file_repl() madness t6050: make sure we test not just commit replacement Declare lookup_replace_object() in cache.h, not in commit.h Conflicts: environment.c
2011-05-19Merge branch 'jk/git-connection-deadlock-fix'Libravatar Junio C Hamano1-0/+1
* jk/git-connection-deadlock-fix: test core.gitproxy configuration send-pack: avoid deadlock on git:// push with failed pack-objects connect: let callers know if connection is a socket connect: treat generic proxy processes like ssh processes Conflicts: connect.c
2011-05-19sha1_object_info_extended(): expose a bit more infoLibravatar Junio C Hamano1-0/+28
The original interface for sha1_object_info() takes an object name and gives back a type and its size (the latter is given only when it was asked). The new interface wraps its implementation and exposes a bit more pieces of information that the interface used to discard, namely: - where the object is stored (loose? cached? packed?) - if packed, where in which packfile? Signed-off-by: Junio C Hamano <gitster@pobox.com> --- * In the earlier round, this used u.pack.delta to record the length of the delta chain, but the caller is not necessarily interested in the length of the delta chain per-se, but may only want to know if it is a delta against another object or is stored as a deflated data. Calling packed_object_info_detail() involves walking the reverse index chain to compute the store size of the object and is unnecessarily expensive. We could resurrect the code if a new caller wants to know, but I doubt it.
2011-05-17config: define and document exit codesLibravatar Michael J Gruber1-0/+10
The return codes of git_config_set() and friends are magic numbers right in the source. #define them in cache.h where the functions are declared, and use the constants in the source. Also, mention the resulting exit codes of "git config" in its man page (and complete the list). Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-16packed_object_info_detail(): do not return a stringLibravatar Junio C Hamano1-1/+1
Instead return an integer that can be given to typename() if the caller wants a string, just like everybody else does. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-16connect: let callers know if connection is a socketLibravatar Jeff King1-0/+1
They might care because they want to do a half-duplex close. With pipes, that means simply closing the output descriptor; with a socket, you must actually call shutdown. Instead of exposing the magic no_fork child_process struct, let's encapsulate the test in a function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15Merge branches 'jc/convert', 'jc/bigfile' and 'jc/replacing' into jc/streamingLibravatar Junio C Hamano1-7/+20
* jc/convert: convert: make it harder to screw up adding a conversion attribute convert: make it safer to add conversion attributes convert: give saner names to crlf/eol variables, types and functions convert: rename the "eol" global variable to "core_eol" * jc/bigfile: Bigfile: teach "git add" to send a large file straight to a pack index_fd(): split into two helper functions index_fd(): turn write_object and format_check arguments into one flag * jc/replacing: read_sha1_file(): allow selective bypassing of replacement mechanism inline lookup_replace_object() calls read_sha1_file(): get rid of read_sha1_file_repl() madness t6050: make sure we test not just commit replacement Declare lookup_replace_object() in cache.h, not in commit.h
2011-05-15read_sha1_file(): allow selective bypassing of replacement mechanismLibravatar Junio C Hamano1-1/+6
The way "object replacement" mechanism was tucked to the read_sha1_file() interface was suboptimal in a couple of ways: - Callers that want it to die with useful diagnosis upon seeing a corrupt object does not have a way to say that they do not want any object replacement. - Callers who do not want it to die but want to handle the errors themselves are told to arrange to call read_object(), but the function does not use the replacement mechanism, and also it is a file scope static function that not many callers can call to begin with. This adds a read_sha1_file_extended() that takes a set of flags; the callers of read_sha1_file() passes a flag READ_SHA1_FILE_REPLACE to ask for object replacement mechanism to kick in. Later, we could add another flag bit to tell the function to return an error instead of dying and then remove the misguided "call read_object() yourself". Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15inline lookup_replace_object() callsLibravatar Junio C Hamano1-2/+10
In a repository without object replacement, lookup_replace_object() should be a no-op. Check the flag "read_replace_refs" on the side of the caller, and bypess a function call when we know we are not dealing with replacement. Also, even when we are set up to replace objects, if we do not find any replacement defined, flip that flag off to avoid function call overhead for all the later object accesses. As this change the semantics of the flag from "do we need read the replacement definition?" to "do we need to check with the lookup table?" the flag needs to be renamed later to something saner, e.g. "use_replace", when the codebase is calmer, but not now. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15read_sha1_file(): get rid of read_sha1_file_repl() madnessLibravatar Junio C Hamano1-5/+1
Most callers want to silently get a replacement object, and they do not care what the real name of the replacement object is. Worse yet, no sane interface to return the underlying object without replacement is provided. Remove the function and make only the few callers that want the name of the replacement object find it themselves. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-15Declare lookup_replace_object() in cache.h, not in commit.hLibravatar Junio C Hamano1-0/+1
The declaration is misplaced as the replace API is supposed to affect not just commits, but all types of objects. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-10fix overslow :/no-such-string-ever-existed diagnosticsLibravatar Junio C Hamano1-4/+4
"git cmd :/no-such-string-ever-existed" runs an extra round of get_sha1() since 009fee4 (Detailed diagnosis when parsing an object name fails., 2009-12-07). Once without error diagnosis to see there is no commit with such a string in the log message (hence "it cannot be a ref"), and after seeing that :/no-such-string-ever-existed is not a filename (hence "it cannot be a path, either"), another time to give "better diagnosis". The thing is, the second time it runs, we already know that traversing the history all the way down to the root will _not_ find any matching commit. Rename misguided "gently" parameter, which is turned off _only_ when the "detailed diagnosis" codepath knows that it cannot be a ref and making the call only for the caller to die with a message. Flip its meaning (and adjust the callers) and call it "only_to_die", which is not a great name, but it describes far more clearly what the codepaths that switches their behaviour based on this variable do. On my box, the command spends ~1.8 seconds without the patch to make the report; with the patch it spends ~1.12 seconds. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09convert: rename the "eol" global variable to "core_eol"Libravatar Junio C Hamano1-1/+1
Yes, it is clear that "eol" wants to mean some sort of end-of-line thing, but as the name of a global variable, it is way too short to describe what kind of end-of-line thing it wants to represent. Besides, there are many codepaths that want to use their own local "char *eol" variable to point at the end of the current line they are processing. This global variable holds what we read from core.eol configuration variable. Name it as such. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-09index_fd(): turn write_object and format_check arguments into one flagLibravatar Junio C Hamano1-2/+5
The "format_check" parameter tucked after the existing parameters is too ugly an afterthought to live in any reasonable API. Combine it with the other boolean parameter "write_object" into a single "flags" parameter. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-06Merge branch 'im/hashcmp-optim'Libravatar Junio C Hamano1-4/+14
* im/hashcmp-optim: hashcmp(): inline memcmp() by hand to optimize
2011-05-06Merge branch 'nd/struct-pathspec'Libravatar Junio C Hamano1-1/+1
* nd/struct-pathspec: pathspec: rename per-item field has_wildcard to use_wildcard Improve tree_entry_interesting() handling code Convert read_tree{,_recursive} to support struct pathspec Reimplement read_tree_recursive() using tree_entry_interesting()
2011-05-04Merge branch 'jc/pack-objects-bigfile' into maintLibravatar Junio C Hamano1-0/+1
* jc/pack-objects-bigfile: Teach core.bigfilethreashold to pack-objects
2011-04-28hashcmp(): inline memcmp() by hand to optimizeLibravatar Ingo Molnar1-4/+14
This is reported to speed "git gc" by 18%. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Junio C Hamano <gitster@pobox.com>