Age | Commit message (Collapse) | Author | Files | Lines |
|
The manual size computations here are correct, but using
strip_suffix makes that obvious, and hopefully communicates
the intent of the code more clearly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When working with paths in strbufs, we frequently want to
ensure that a directory contains a trailing slash before
appending to it. We can shorten this code (and make the
intent more obvious) by calling strbuf_complete.
Most of these cases are trivially identical conversions, but
there are two things to note:
- in a few cases we did not check that the strbuf is
non-empty (which would lead to an out-of-bounds memory
access). These were generally not triggerable in
practice, either from earlier assertions, or typically
because we would have just fed the strbuf to opendir(),
which would choke on an empty path.
- in a few cases we indexed the buffer with "original_len"
or similar, rather than the current sb->len, and it is
not immediately obvious from the diff that they are the
same. In all of these cases, I manually verified that
the strbuf does not change between the assignment and
the strbuf_complete call.
This does not convert cases which look like:
if (sb->len && !is_dir_sep(sb->buf[sb->len - 1]))
strbuf_addch(sb, '/');
as those are obviously semantically different. Some of these
cases arguably should be doing that, but that is out of
scope for this change, which aims purely for cleanup with no
behavior change (and at least it will make such sites easier
to find and examine in the future, as we can grep for
strbuf_complete).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since 27e1e22 (prune: factor out loose-object directory
traversal, 2014-10-15), we now have a generic callback
system for iterating over the loose object directories. This
is used by prune, count-objects, etc.
We did not convert git-fsck at the time because it
implemented an inode-sorting scheme that was not part of the
generic code. Now that the inode-sorting code is gone, we
can reuse the generic code. The result is shorter,
hopefully more readable, and drops some unchecked sprintf
calls.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fsck tries to access loose objects in order of inode number,
with the hope that this would make cold cache access faster
on a spinning disk. This dates back to 7e8c174 (fsck-cache:
sort entries by inode number, 2005-05-02), which predates
the invention of packfiles.
These days, there's not much point in trying to optimize
cold cache for a large number of loose objects. You are much
better off to simply pack the objects, which will reduce the
disk footprint _and_ provide better locality of data access.
So while you can certainly construct pathological cases
where this code might help, it is not worth the trouble
anymore.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
strncpy is known to be a confusing function because of its
termination semantics. These calls are all correct, but it
takes some examination to see why. In particular, every one
of them expects to copy up to the length limit, and then
makes some arrangement for terminating the result.
We can just use memcpy, along with noting explicitly how the
result is terminated (if it is not already obvious). That
should make it more clear to a reader that we are doing the
right thing.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we are going to launch "/path/to/konqueror", we instead
rewrite this into "/path/to/kfmclient" by duplicating the
original string and writing over the ending bits. This can
be done more obviously with strip_suffix and xstrfmt.
Note that we also fix a subtle bug with the "filename"
parameter, which is passed as argv[0] to the child. If the
user has configured a program name with no directory
component, we always pass the string "kfmclient", even if
your program is called something else. But if you give a
full path, we give the basename of that path. But more
bizarrely, if we rewrite "konqueror" to "kfmclient", we
still pass "konqueror".
The history of this function doesn't reveal anything
interesting, so it looks like just an oversight from
combining the suffix-munging with the basename-finding.
Let's just call basename on the munged path, which produces
consistent results (if you gave a program, whether a full
path or not, we pass its basename).
Probably this doesn't matter at all in practice, but it
makes the code slightly less confusing to read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
To generate "--keep=receive-pack $pid on $host", we write
progressively into a single buffer, which requires keeping
track of how much we've written so far. But since the result
is destined to go into our argv array, we can simply use
argv_array_pushf.
Unfortunately we still have to have a fixed-size buffer for
the gethostname() call, but at least it now doesn't involve
any extra size computation. And as a bonus, we drop an
sprintf and a strcpy call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.
But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.
It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:
- the call signature is a little bit unwieldy:
d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);
You need offsetof here instead of just writing to the
end of the base size, because we don't know how the
struct is packed (partially this is because FLEX_ARRAY
might not be zero, though we can account for that; but
the size of the struct may actually be rounded up for
alignment, and we can't know that).
- some sites do clever things, like over-allocating because
they know they will write larger things into the buffer
later (e.g., struct packed_git here).
So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This saves us some manual computation, and eliminates a call
to strcpy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Before sha1_to_hex_r() existed, a simple way to get hex
sha1 into a buffer was with:
strcpy(buf, sha1_to_hex(sha1));
This isn't wrong (assuming the buf is 41 characters), but it
makes auditing the code base for bad strcpy() calls harder,
as these become false positives.
Let's convert them to sha1_to_hex_r(), and likewise for
some calls to find_unique_abbrev(). While we're here, we'll
double-check that all of the buffers are correctly sized,
and use the more obvious GIT_SHA1_HEXSZ constant.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We generate range strings like "1234abcd...5678efab" for use
in the the fetch and push status tables. We use fixed-size
buffers along with strcat to do so. These aren't buggy, as
our manual size computation is correct, but there's nothing
checking that this is so. Let's switch them to strbufs
instead, which are obviously correct, and make it easier to
audit the code base for problematic calls to strcat().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We use manual computation and strcpy to allocate the "root"
variable. This would be much simpler using xstrfmt. But
since we store the length, too, we can just use a strbuf,
which handles that for us.
Note that we stop distinguishing between "no root" and
"empty root" in some cases, but that's OK; the results are
the same (e.g., inserting an empty string is a noop).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The init code predates strbufs, and uses PATH_MAX-sized
buffers along with many manual checks on intermediate sizes
(some of which make magic assumptions, such as that init
will not create a path inside .git longer than 50
characters).
We can simplify this greatly by using strbufs, which drops
some hard-to-verify strcpy calls in favor of git_path_buf.
While we're in the area, let's also convert existing calls
to git_path to the safer git_path_buf (our existing calls
were passed to pretty tame functions, and so were not a
problem, but it's easy to be consistent and safe here).
Note that we had an explicit test that "git init" rejects
long template directories. This comes from 32d1776 (init: Do
not segfault on big GIT_TEMPLATE_DIR environment variable,
2009-04-18). We can drop the test_must_fail here, as we now
accept this and need only confirm that we don't segfault,
which was the original point of the test.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we are initializing a .git directory, we may call
probe_utf8_pathname_composition to detect utf8 mangling. We
pass in a path buffer for it to use, and it blindly
strcpy()s into it, not knowing whether the buffer is large
enough to hold the result or not.
In practice this isn't a big deal, because the buffer we
pass in already contains "$GIT_DIR/config", and we append
only a few extra bytes to it. But we can easily do the right
thing just by calling git_path_buf ourselves. Technically
this results in a different pathname (before we appended our
utf8 characters to the "config" path, and now they get their
own files in $GIT_DIR), but that should not matter for our
purposes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We format a pkt-line into a heap buffer, which requires
manual computation of the required size, and uses some bare
sprintf calls. We could use a strbuf instead, which would
take care of the computation for us. But it's even easier
still to use packet_write(). Besides handling the formatting
and writing for us, it fixes two things:
1. Our manual max-size check used 0xFFFF, while technically
LARGE_PACKET_MAX is slightly smaller than this.
2. Our packet will now be output as part of
GIT_TRACE_PACKET debugging.
Unfortunately packet_write() does not let us build up the
buffer progressively, so we do have to repeat ourselves a
little depending on the "vhost" setting, but the end result
is still far more readable than the original.
Since there were no tests covering this feature at all,
we'll add a few into t5802.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we report an error to the client, we format it into a
fixed-size buffer using vsprintf(). This can't actually
overflow in practice, since we only format a very tame
subset of strings (mostly strerror() output). However, it's
hard to tell immediately, so let's just use a strbuf so
readers do not have to wonder.
We do add an allocation here, but the performance is not
important; the next step is to call die() anyway.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This function predates xstrfmt, and its functionality is a
subset. Let's just use xstrfmt.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We parse the INFINITE_DEPTH constant into a static,
fixed-size buffer using sprintf. This buffer is sufficiently
large for the current constant, but it's a suspicious
pattern, as the constant is defined far away, and it's not
immediately obvious that 12 bytes are large enough to hold
it.
We can just use xstrfmt here, which gets rid of any question
of the buffer size. It also removes any concerns with object
lifetime, which means we do not have to wonder why this
buffer deep within a conditional is marked "static" (we
never free our newly allocated result, of course, but that's
OK; it's global that lasts the lifetime of the whole program
anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We xmalloc a fixed-size buffer and sprintf into it; this is
OK because the size of our formatting types is finite, but
that's not immediately clear to a reader auditing sprintf
calls. Let's switch to xstrfmt, which is shorter and
obviously correct.
Note that just dropping the common xmalloc here causes gcc
to complain with -Wmaybe-uninitialized. That's because if
"types" does not match any of our known types, we never
write anything into the "normalized" pointer. With the
current code, gcc doesn't notice because we always return a
valid pointer (just one which might point to uninitialized
data, but the compiler doesn't know that). In other words,
the current code is potentially buggy if new types are added
without updating this spot.
So let's take this opportunity to clean up the function a
bit more. We can drop the "normalized" pointer entirely, and
just return directly from each code path. And then add an
assertion at the end in case we haven't covered any cases.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It's a common pattern to do:
foo = xmalloc(strlen(one) + strlen(two) + 1 + 1);
sprintf(foo, "%s %s", one, two);
(or possibly some variant with strcpy()s or a more
complicated length computation). We can switch these to use
xstrfmt, which is shorter, involves less error-prone manual
computation, and removes many sprintf and strcpy calls which
make it harder to audit the code for real buffer overflows.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This strncpy is pointless; we pass the strlen() of the src
string, meaning that it works just like a memcpy. Worse,
though, is that the size has no relation to the destination
buffer, meaning it is a potential overflow. In practice,
it's not. We pass only short constant strings like
"warning: " and "error: ", which are much smaller than the
destination buffer.
We can make this much simpler by just using xsnprintf, which
will check for overflow and return the size for our next
vsnprintf, without us having to run a separate strlen().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We generally use 32-byte buffers to format git's "type size"
header fields. These should not generally overflow unless
you can produce some truly gigantic objects (and our types
come from our internal array of constant strings). But it is
a good idea to use xsnprintf to make sure this is the case.
Note that we slightly modify the interface to
write_sha1_file_prepare, which nows uses "hdrlen" as an "in"
parameter as well as an "out" (on the way in it stores the
allocated size of the header, and on the way out it returns
the ultimate size of the header).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.
However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are several PATH_MAX-sized buffers in mailsplit, along
with some questionable uses of sprintf. These are not
really of security interest, as local mailsplit pathnames
are not typically under control of an attacker, and you
could generally only overflow a few numbers at the end of a
path that approaches PATH_MAX (a longer path would choke
mailsplit long before). But it does not hurt to be careful,
and as a bonus we lift some limits for systems with
too-small PATH_MAX varibles.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When fsck-ing alternates, we make a copy of the alternate
directory in a fixed PATH_MAX buffer. We memcpy directly,
without any check whether we are overflowing the buffer.
This is OK if PATH_MAX is a true representation of the
maximum path on the system, because any path here will have
already been vetted by the alternates subsystem. But that is
not true on every system, so we should be more careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 02976bf (fsck: introduce `git fsck --connectivity-only`,
2015-06-22) recently gave fsck an option to perform only a
subset of the checks, by skipping the fsck_object_dir()
call. However, it does so only for the local object
directory, and we still do expensive checks on any alternate
repos. We should skip them in this case, too.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If we encounter an error while splitting a maildir, we exit
the function early, leaking the open filehandle. This isn't
a big deal, since we exit the program soon after, but it's
easy enough to be careful.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When no branch is given to the "--reflog" option, we resolve
HEAD to get the default branch. However, if HEAD points to
an unborn branch, resolve_ref returns NULL, and we later
segfault trying to access it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Both "git show-ref -h" and "git show-ref --help" illustrated that the
"--exclude-existing" option makes the command read list of refs
from its standard input. Change only the "show-ref -h" output to
have a pair of "<>" around the placeholder that designate an input
file, i.e. "git show-ref --exclude-existing < <ref-list>".
* ah/show-ref-usage-string:
show-ref: place angle brackets around variables in usage string
|
|
* rt/help-strings-fix:
tag, update-ref: improve description of option "create-reflog"
pull: don't mark values for option "rebase" for translation
|
|
* gb/apply-comment-typofix:
apply: comment grammar fix
|
|
The description of option "create-reflog" is "create_reflog", which
is neither a good description, nor a sensible string to translate.
Change it to a more meaningful message.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"false|true|preserve" are actual values for option "rebase"
of the "git-pull" command and should therefore not be marked
for translation.
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
|
|
Recent "git am" had regression when adding a Signed-off-by line
with its "-s" option by an unintended tightening of how an existing
trailer block is detected.
* jc/builtin-am-signoff-regression-fix:
am: match --signoff to the original scripted version
|
|
Linus noticed that the recently reimplemented "git am -s" defines
the trailer block too rigidly, resulting in an unnecessary blank
line between the existing sign-offs and his new sign-off. An e-mail
submission sent to Linus in real life ends with mixture of sign-offs
and commentaries, e.g.
title here
message here
Signed-off-by: Original Author <original@auth.or>
[rv: tweaked frotz and nitfol]
Signed-off-by: Re Viewer <rv@ew.er>
Signed-off-by: Other Reviewer <other@rev.ewer>
---
patch here
Because the reimplementation reused append_signoff() helper that is
used by other codepaths, which is unaware that people intermix such
comments with their sign-offs in the trailer block, such a message
was judged to end with a non-trailer, resulting in an extra blank
line before adding a new sign-off.
The original scripted version of "git am" used a lot looser
definition, i.e. "if and only if there is no line that begins with
Signed-off-by:, add a blank line before adding a new sign-off". For
the upcoming release, stop using the append_signoff() in "git am"
and reimplement the looser definition used by the scripted version
to use only in "git am" to fix this regression in "am" while
avoiding new regressions to other users of append_signoff().
In the longer term, we should look into loosening append_signoff()
so that other codepaths that add a new sign-off behave the same way
as "git am -s", but that is a task for post-release.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
|
|
|
|
|
|
When we show "branch@{0}", we format into a fixed-size
buffer using sprintf. This can overflow if you have long
branch names. We can fix it by using a temporary strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'dt/commit-preserve-base-index-upon-opportunistic-cache-tree-update' into maint
When re-priming the cache-tree opportunistically while committing
the in-core index as-is, we mistakenly invalidated the in-core
index too aggressively, causing the experimental split-index code
to unnecessarily rewrite the on-disk index file(s).
* dt/commit-preserve-base-index-upon-opportunistic-cache-tree-update:
commit: don't rewrite shared index unnecessarily
|
|
Minor code cleanup.
* jc/calloc-pathspec:
ps_matched: xcalloc() takes nmemb and then element size
|
|
"git rev-list" does not take "--notes" option, but did not complain
when one is given.
* jk/rev-list-has-no-notes:
rev-list: make it obvious that we do not support notes
|
|
An off-by-one error made "git remote" to mishandle a remote with a
single letter nickname.
* mh/get-remote-group-fix:
get_remote_group(): use skip_prefix()
get_remote_group(): eliminate superfluous call to strcspn()
get_remote_group(): rename local variable "space" to "wordlen"
get_remote_group(): handle remotes with single-character names
|
|
Recent "git am" introduced a double-locking failure when used with
the "--3way" option that invokes rerere machinery.
* jk/am-rerere-lock-fix:
rerere: release lockfile in non-writing functions
|
|
'dt/commit-preserve-base-index-upon-opportunistic-cache-tree-update'
When re-priming the cache-tree opportunistically while committing
the in-core index as-is, we mistakenly invalidated the in-core
index too aggressively, causing the experimental split-index code
to unnecessarily rewrite the on-disk index file(s).
* dt/commit-preserve-base-index-upon-opportunistic-cache-tree-update:
commit: don't rewrite shared index unnecessarily
|
|
Error string fix.
* ah/reflog-typofix-in-error:
reflog: add missing single quote to error message
|
|
Usage string fix.
* ah/read-tree-usage-string:
read-tree: replace bracket set with parentheses to clarify usage
|
|
Usage string fix.
* ah/pack-objects-usage-strings:
pack-objects: place angle brackets around placeholders in usage strings
|
|
There's a bug in builtin/am.c in which we take a lock on
MERGE_RR recursively. But rather than fix am.c, this patch
fixes the confusing interface from rerere.c that caused the
bug. Read on for the gory details.
The setup_rerere() function both reads the existing MERGE_RR
file, and takes MERGE_RR.lock. In the rerere() and
rerere_forget() functions, we end up in write_rr(), which
will then commit the lock file.
But for functions like rerere_clear() that do not write to
MERGE_RR, we expect the caller to have handled
setup_rerere(). That caller would then need to release the
lockfile, but it can't; the lock struct is local to
rerere.c.
For builtin/rerere.c, this is OK. We run a single rerere
operation and then exit immediately, which has the side
effect of rolling back the lockfile.
But in builtin/am.c, this is actively wrong. If we run "git
am -3 --skip", we call setup-rerere twice without releasing
the lock:
1. The "--skip" causes us to call am_rerere_clear(), which
calls setup_rerere(), but never drops the lock.
2. We then proceed to the next patch.
3. The "--3way" may cause us to call rerere() to handle
conflicts in that patch, but we are already holding the
lock. The lockfile code dies with:
BUG: prepare_tempfile_object called for active object
We could fix this by having rerere_clear() call
rollback_lock_file(). But it feels a bit odd for it to roll
back a lockfile that it did not itself take. So let's
simplify the interface further, and handle setup_rerere in
the function itself, taking away the question from the
caller over whether they need to do so.
We can give rerere_gc() the same treatment, as well (even
though it doesn't have any callers besides builtin/rerere.c
at this point). Note that these functions don't take flags
from their callers to pass along to setup_rerere; that's OK,
because the flags would not be meaningful for what they are
doing.
Both of those functions need to hold the lock because even
though they do not write to MERGE_RR, they are still writing
and should be protected from a simultaneous "rerere" run.
But rerere_remaining(), "rerere diff", and "rerere status"
are all read-only operations. They want to setup_rerere(),
but do not care about taking the lock in the first place.
Since our update of MERGE_RR is the usual atomic rename done
by commit_lock_file, they can just do a lockless read. For
that, we teach setup_rerere a READONLY flag to avoid the
lock.
As a bonus, this pushes builtin/rerere.c's setup_rerere call
closer to the functions that use it. Which means that "git
rerere totally-bogus-command" will no longer silently
exit(0) in a repository without rerere enabled.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|