Age | Commit message (Collapse) | Author | Files | Lines |
|
"git cat-file --batch-check=<format>" is added, primarily to allow
on-disk footprint of objects in packfiles (often they are a lot
smaller than their true size, when expressed as deltas) to be
reported.
* jk/in-pack-size-measurement:
pack-revindex: radix-sort the revindex
pack-revindex: use unsigned to store number of objects
cat-file: split --batch input lines on whitespace
cat-file: add %(objectsize:disk) format atom
cat-file: add --batch-check=<format>
cat-file: refactor --batch option parsing
cat-file: teach --batch to stream blob objects
t1006: modernize output comparisons
teach sha1_object_info_extended a "disk_size" query
zero-initialize object_info structs
|
|
Add a command to allow previewing the contents locally before
pushing it out, when working with a MediaWiki remote.
I personally do not think this belongs to Git. If you are working
on a set of AsciiDoc source files, you sure do want to locally
format to preview what you will be pushing out, and if you are
working on a set of C or Java source files, you do want to test it
before pushing it out, too. That kind of thing belongs to your
build script, not to your SCM.
But I'll let it pass, as this is only a contrib/ thing.
* bp/mediawiki-preview:
git-remote-mediawiki: add preview subcommand into git mw
git-remote-mediawiki: add git-mw command
git-remote-mediawiki: factoring code between git-remote-mediawiki and Git::Mediawiki
git-remote-mediawiki: update tests to run with the new bin-wrapper
git-remote-mediawiki: add a git bin-wrapper for developement
wrap-for-bin: make bin-wrappers chainable
git-remote-mediawiki: introduction of Git::Mediawiki.pm
|
|
Logic to auto-detect character encodings in the commit log message
did not reject overlong and invalid UTF-8 characters.
* bc/commit-invalid-utf8:
commit: reject non-characters
commit: reject overlong UTF-8 sequences
commit: reject invalid UTF-8 codepoints
|
|
* es/overlapping-range-set:
range_set: fix coalescing bug when range is a subset of another
t4211: fix broken test when one -L range is subset of another
|
|
"git clone -s/-l" is a filesystem level copy and does not offer any
protection against source repository being corrupt. While the
connectivity validation checks commits and trees being readable, it
made the otherwise instantaneous local modes of clone much more
expensive, without protecting blob data from bitflips.
* jk/maint-clone-shared-no-connectivity-validation:
clone: drop connectivity check for local clones
|
|
Pushing to repositories with many refs employed O(m*n) algorithm
where n is the number of refs on the receiving end.
* bc/push-match-many-refs:
remote.c: avoid O(m*n) behavior in match_push_refs
|
|
"git rebase [-i]" used to leave just "rebase" as its reflog message
for some operations. This rewords them to be more informative.
* rr/rebase-reflog-message-reword:
rebase -i: use a better reflog message
rebase: use a better reflog message
|
|
All surrounding examples are typeset as monospaced text. Follow suit.
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Acked-by: Thomas Rast <trast@inf.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git apply" parsed patches that add new files, generated by programs
other than Git, incorrectly. This is an old breakage in v1.7.11.
* tr/maint-apply-non-git-patch-parsefix:
apply: carefully strdup a possibly-NULL name
|
|
Older cURL wanted piece of memory we call it with to be stable, but
we updated the auth material after handing it to a call.
* bc/http-keep-memory-given-to-curl:
http.c: don't rewrite the user:passwd string multiple times
|
|
"git pull" into nothing trashed "local changes" that were in the
index.
* jk/pull-into-dirty-unborn:
pull: merge into unborn by fast-forwarding from empty tree
pull: update unborn branch tip after index
|
|
Many "git submodule" operations did not work on a submodule at a
path whose name is not in ASCII.
* fg/submodule-non-ascii-path:
t7400: test of UTF-8 submodule names pass under Mac OS
handle multibyte characters in name
|
|
"cherry-pick" had a small leak in its error codepath.
* fc/sequencer-plug-leak:
sequencer: avoid leaking message buffer when refusing to create an empty commit
sequencer: remove useless indentation
|
|
Logic used by git-send-email to suppress cc mishandled names like "A
U. Thor" <author@example.xz>, where the human readable part needs to
be quoted (the user input may not have the double quotes around the
name, and comparison was done between quoted and unquoted strings).
It also mishandled names that need RFC2047 quoting.
* mt/send-email-cc-match-fix:
send-email: sanitize author when writing From line
send-email: add test for duplicate utf8 name
test-send-email: test for pre-sanitized self name
t/send-email: test suppress-cc=self with non-ascii
t/send-email: add test with quoted sender
send-email: make --suppress-cc=self sanitize input
t/send-email: test suppress-cc=self on cccmd
send-email: fix suppress-cc=self on cccmd
t/send-email.sh: add test for suppress-cc=self
|
|
Pass port number as a separate argument when send-email initializes
Net::SMTP, instead of as a part of the hostname, i.e. host:port.
This allows GSSAPI codepath to match with the hostname given.
* bc/send-email-use-port-as-separate-param:
send-email: provide port separately from hostname
|
|
Allow shallow-cloning of submodules with "git submodule update".
* fg/submodule-clone-depth:
Add --depth to submodule update/add
|
|
In addition to the choice from "rebase, merge, or checkout-detach",
allow a custom command to be used in "submodule update" to update
the working tree of submodules.
* cp/submodule-custom-update:
submodule update: allow custom command to update submodule working tree
|
|
"git format-patch" learned "--from[=whom]" option, which sets the
"From: " header to the specified person (or the person who runs the
command, if "=whom" part is missing) and move the original author
information to an in-body From: header as necessary.
* jk/format-patch-from:
teach format-patch to place other authors into in-body "From"
pretty.c: drop const-ness from pretty_print_context
|
|
The configuration variable "merge.ff" was cleary a tri-state to
choose one from "favor fast-forward when possible", "always create
a merge even when the history could fast-forward" and "do not
create any merge, only update when the history fast-forwards", but
the command line parser did not implement the usual convention of
"last one wins, and command line overrides the configuration"
correctly.
* mv/merge-ff-tristate:
merge: handle --ff/--no-ff/--ff-only as a tri-state option
|
|
Fetching between repositories with many refs employed O(n^2)
algorithm to match up the common objects, which has been corrected.
* jk/fetch-pack-many-refs:
fetch-pack: avoid quadratic behavior in rev_list_push
commit.c: make compare_commits_by_commit_date global
fetch-pack: avoid quadratic list insertion in mark_complete
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
* jc/remote-http-argv-array:
remote-http: use argv-array
|
|
* rs/pickaxe-simplify:
diffcore-pickaxe: simplify has_changes and contains
|
|
* tr/test-lint-no-export-assignment-in-shell:
test-lint: detect 'export FOO=bar'
t9902: fix 'test A == B' to use = operator
|
|
* rr/name-rev-stdin-doc:
name-rev doc: rewrite --stdin paragraph
|
|
* ft/diff-rename-default-score-is-half:
diff-options: document default similarity index
|
|
* ml/cygwin-does-not-have-fifo:
test-lib.sh - cygwin does not have usable FIFOs
|
|
An Gitweb installation that is a part of larger site can optionally
show extra links that point at the levels higher than the Gitweb
pages itself in the link hierarchy of pages.
* tf/gitweb-extra-breadcrumbs:
gitweb: allow extra breadcrumbs to prefix the trail
|
|
* ms/remote-tracking-branches-in-doc:
Change "remote tracking" to "remote-tracking"
|
|
* jk/pull-to-integrate:
pull: change the description to "integrate" changes
push: avoid suggesting "merging" remote changes
|
|
* jk/maint-config-multi-order:
git-config(1): clarify precedence of multiple values
|
|
"log --format=" did not honor i18n.logoutputencoding configuration
and this attempts to fix it.
* as/log-output-encoding-in-user-format:
t4205 (log-pretty-formats): avoid using `sed`
t6006 (rev-list-format): add tests for "%b" and "%s" for the case i18n.commitEncoding is not set
t4205, t6006, t7102: make functions better readable
t4205 (log-pretty-formats): revert back single quotes
t4041, t4205, t6006, t7102: use iso8859-1 rather than iso-8859-1
t4205: replace .\+ with ..* in sed commands
pretty: --format output should honor logOutputEncoding
pretty: Add failing tests: --format output should honor logOutputEncoding
t4205 (log-pretty-formats): don't hardcode SHA-1 in expected outputs
t7102 (reset): don't hardcode SHA-1 in expected outputs
t6006 (rev-list-format): don't hardcode SHA-1 in expected outputs
|
|
The document says one cannot push from a shallow clone. But that is
not true (maybe it was at some point in the past). The client does not
stop such a push nor does it give any indication to the receiver that
this is a shallow push. If the receiver accepts it, it's in.
Since 52fed6e (receive-pack: check connectivity before concluding "git
push" - 2011-09-02), receive-pack is prepared to deal with broken
push, a shallow push can't cause any corruption. Update the document
to reflect that.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The pack revindex stores the offsets of the objects in the
pack in sorted order, allowing us to easily find the on-disk
size of each object. To compute it, we populate an array
with the offsets from the sha1-sorted idx file, and then use
qsort to order it by offsets.
That does O(n log n) offset comparisons, and profiling shows
that we spend most of our time in cmp_offset. However, since
we are sorting on a simple off_t, we can use numeric sorts
that perform better. A radix sort can run in O(k*n), where k
is the number of "digits" in our number. For a 64-bit off_t,
using 16-bit "digits" gives us k=4.
On the linux.git repo, with about 3M objects to sort, this
yields a 400% speedup. Here are the best-of-five numbers for
running
echo HEAD | git cat-file --batch-check="%(objectsize:disk)
on a fully packed repository, which is dominated by time
spent building the pack revindex:
before after
real 0m0.834s 0m0.204s
user 0m0.788s 0m0.164s
sys 0m0.040s 0m0.036s
This matches our algorithmic expectations. log(3M) is ~21.5,
so a traditional sort is ~21.5n. Our radix sort runs in k*n,
where k is the number of radix digits. In the worst case,
this is k=4 for a 64-bit off_t, but we can quit early when
the largest value to be sorted is smaller. For any
repository under 4G, k=2. Our algorithm makes two passes
over the list per radix digit, so we end up with 4n. That
should yield ~5.3x speedup. We see 4x here; the difference
is probably due to the extra bucket book-keeping the radix
sort has to do.
On a smaller repo, the difference is less impressive, as
log(n) is smaller. For git.git, with 173K objects (but still
k=2), we see a 2.7x improvement:
before after
real 0m0.046s 0m0.017s
user 0m0.036s 0m0.012s
sys 0m0.008s 0m0.000s
On even tinier repos (e.g., a few hundred objects), the
speedup goes away entirely, as the small advantage of the
radix sort gets erased by the book-keeping costs (and at
those sizes, the cost to generate the the rev-index gets
lost in the noise anyway).
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A packfile may have up to 2^32-1 objects in it, so the
"right" data type to use is uint32_t. We currently use a
signed int, which means that we may behave incorrectly for
packfiles with more than 2^31-1 objects on 32-bit systems.
Nobody has noticed because having 2^31 objects is pretty
insane. The linux.git repo has on the order of 2^22 objects,
which is hundreds of times smaller than necessary to trigger
the bug.
Let's bump this up to an "unsigned". On 32-bit systems, this
gives us the correct data-type, and on 64-bit systems, it is
probably more efficient to use the native "unsigned" than a
true uint32_t.
While we're at it, we can fix the binary search not to
overflow in such a case if our unsigned is 32 bits.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If we get an input line to --batch or --batch-check that
looks like "HEAD foo bar", we will currently feed the whole
thing to get_sha1(). This means that to use --batch-check
with `rev-list --objects`, one must pre-process the input,
like:
git rev-list --objects HEAD |
cut -d' ' -f1 |
git cat-file --batch-check
Besides being more typing and slightly less efficient to
invoke `cut`, the result loses information: we no longer
know which path each object was found at.
This patch teaches cat-file to split input lines at the
first whitespace. Everything to the left of the whitespace
is considered an object name, and everything to the right is
made available as the %(reset) atom. So you can now do:
git rev-list --objects HEAD |
git cat-file --batch-check='%(objectsize) %(rest)'
to collect object sizes at particular paths.
Even if %(rest) is not used, we always do the whitespace
split (which means you can simply eliminate the `cut`
command from the first example above).
This whitespace split is backwards compatible for any
reasonable input. Object names cannot contain spaces, so any
input with spaces would have resulted in a "missing" line.
The only input hurt is if somebody really expected input of
the form "HEAD is a fine-looking ref!" to fail; it will now
parse HEAD, and make "is a fine-looking ref!" available as
%(rest).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This atom is just like %(objectsize), except that it shows
the on-disk size of the object rather than the object's true
size. In other words, it makes the "disk_size" query of
sha1_object_info_extended available via the command-line.
This can be used for rough attribution of disk usage to
particular refs, though see the caveats in the
documentation.
This patch does not include any tests, as the exact numbers
returned are volatile and subject to zlib and packing
decisions. We cannot even reliably guarantee that the
on-disk size is smaller than the object content (though in
general this should be the case for non-trivial objects).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The `cat-file --batch-check` command can be used to quickly
get information about a large number of objects. However, it
provides a fixed set of information.
This patch adds an optional <format> option to --batch-check
to allow a caller to specify which items they are interested
in, and in which order to output them. This is not very
exciting for now, since we provide the same limited set that
you could already get. However, it opens the door to adding
new format items in the future without breaking backwards
compatibility (or forcing callers to pay the cost to
calculate uninteresting items).
Since the --batch option shares code with --batch-check, it
receives the same feature, though it is less likely to be of
interest there.
The format atom names are chosen to match their counterparts
in for-each-ref. Though we do not (yet) share any code with
for-each-ref's formatter, this keeps the interface as
consistent as possible, and may help later on if the
implementations are unified.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A test that should have failed but didn't revealed a bug that needs
to be corrected.
* jc/t1512-fix:
get_short_sha1(): correctly disambiguate type-limited abbreviation
t1512: correct leftover constants from earlier edition
|
|
Finishing touches to a topic that is already in master for the
upcoming release.
* tr/test-v-and-v-subtest-only:
t0000: do not use export X=Y
|
|
"git rebase -i" now honors --strategy and -X options.
* af/rebase-i-merge-options:
Do not ignore merge options in interactive rebase
|
|
"git stash save" is not just about "saving" the local changes, but
also is to restore the working tree state to that of HEAD. If you
changed a non-directory into a directory in the local change, you
may have untracked files in that directory, which have to be killed
while doing so, unless you run it with --include-untracked. Teach
the command to detect and error out before spreading the damage.
This needed a small fix to "ls-files --killed".
* pb/stash-refuse-to-kill:
git stash: avoid data loss when "git stash save" kills a directory
treat_directory(): do not declare submodules to be untracked
|
|
"git diff" refused to even show difference when core.safecrlf is
set to true (i.e. error out) and there are offending lines in the
working tree files.
* jc/maint-diff-core-safecrlf:
diff: demote core.safecrlf=true to core.safecrlf=warn
|
|
"git status" learned status.branch and status.short configuration
variables to use --branch and --short options by default (override
with --no-branch and --no-short options from the command line).
* jg/status-config:
status/commit: make sure --porcelain is not affected by user-facing config
commit: make it work with status.short
status: introduce status.branch to enable --branch by default
status: introduce status.short to enable --short by default
|
|
* jk/bash-completion:
completion: learn about --man-path
completion: handle unstuck form of base git options
|
|
Invocations of "git checkout" used internally by "git rebase" were
counted as "checkout", and affected later "git checkout -" to the
the user to an unexpected place.
* rr/rebase-checkout-reflog:
checkout: respect GIT_REFLOG_ACTION
status: do not depend on rebase reflog messages
t/t2021-checkout-last: "checkout -" should work after a rebase finishes
wt-status: remove unused field in grab_1st_switch_cbdata
t7512: test "detached from" as well
|