Age | Commit message (Collapse) | Author | Files | Lines |
|
The internal implementation of "git grep" has seen some clean-up.
* ab/grep-preparatory-cleanup: (31 commits)
grep: assert that threading is enabled when calling grep_{lock,unlock}
grep: given --threads with NO_PTHREADS=YesPlease, warn
pack-objects: fix buggy warning about threads
pack-objects & index-pack: add test for --threads warning
test-lib: add a PTHREADS prerequisite
grep: move is_fixed() earlier to avoid forward declaration
grep: change internal *pcre* variable & function names to be *pcre1*
grep: change the internal PCRE macro names to be PCRE1
grep: factor test for \0 in grep patterns into a function
grep: remove redundant regflags assignments
grep: catch a missing enum in switch statement
perf: add a comparison test of log --grep regex engines with -F
perf: add a comparison test of log --grep regex engines
perf: add a comparison test of grep regex engines with -F
perf: add a comparison test of grep regex engines
perf: emit progress output when unpacking & building
perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
grep: add tests to fix blind spots with \0 patterns
grep: prepare for testing binary regexes containing rx metacharacters
grep: add a test helper function for less verbose -f \0 tests
...
|
|
The result from "git diff" that compares two blobs, e.g. "git diff
$commit1:$path $commit2:$path", used to be shown with the full
object name as given on the command line, but it is more natural to
use the $path in the output and use it to look up .gitattributes.
* jk/diff-blob:
diff: use blob path for blob/file diffs
diff: use pending "path" if it is available
diff: use the word "path" instead of "name" for blobs
diff: pass whole pending entry in blobinfo
handle_revision_arg: record paths for pending objects
handle_revision_arg: record modes for "a..b" endpoints
t4063: add tests of direct blob diffs
get_sha1_with_context: dynamically allocate oc->path
get_sha1_with_context: always initialize oc->symlink_path
sha1_name: consistently refer to object_context as "oc"
handle_revision_arg: add handle_dotdot() helper
handle_revision_arg: hoist ".." check out of range parsing
handle_revision_arg: stop using "dotdot" as a generic pointer
handle_revision_arg: simplify commit reference lookups
handle_revision_arg: reset "dotdot" consistently
|
|
Tag objects, which are not reachable from any ref, that point at
missing objects were mishandled by "git gc" and friends (they
should silently be ignored instead)
* jk/ignore-broken-tags-when-ignoring-missing-links:
revision.c: ignore broken tags with ignore_missing_links
|
|
Conversion from uchar[20] to struct object_id continues.
* bc/object-id: (53 commits)
object: convert parse_object* to take struct object_id
tree: convert parse_tree_indirect to struct object_id
sequencer: convert do_recursive_merge to struct object_id
diff-lib: convert do_diff_cache to struct object_id
builtin/ls-tree: convert to struct object_id
merge: convert checkout_fast_forward to struct object_id
sequencer: convert fast_forward_to to struct object_id
builtin/ls-files: convert overlay_tree_on_cache to object_id
builtin/read-tree: convert to struct object_id
sha1_name: convert internals of peel_onion to object_id
upload-pack: convert remaining parse_object callers to object_id
revision: convert remaining parse_object callers to object_id
revision: rename add_pending_sha1 to add_pending_oid
http-push: convert process_ls_object and descendants to object_id
refs/files-backend: convert many internals to struct object_id
refs: convert struct ref_update to use struct object_id
ref-filter: convert some static functions to struct object_id
Convert struct ref_array_item to struct object_id
Convert the verify_pack callback to struct object_id
Convert lookup_tag to struct object_id
...
|
|
If the revision parser sees an argument like tree:path, we
parse it down to the correct blob (or tree), but throw away
the "path" portion. Let's ask get_sha1_with_context() to
record it, and pass it along in the pending array.
This will let programs like git-diff which rely on the
revision-parser show more accurate paths.
Note that the implementation is a little tricky; we have to
make sure we free oc.path in all code paths. For handle_dotdot(),
we can piggy-back on the existing cleanup-wrapper pattern.
The real work happens in handle_dotdot_1(), but the
handle_dotdot() wrapper makes sure that the path is freed no
matter how we exit the function (and for that reason we make
sure that the object_context struct is zero'd, so if we fail
to even get to the get_sha1_with_context() call, we just end
up calling free(NULL)).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "a..b" revision syntax was designed to handle commits,
so it doesn't bother to record any mode we find while
traversing a "tree:path" endpoint. These days "git diff" can
diff blobs using either "a:path..b:path" (with dots) or
"a:path b:path" (without), but the two behave
inconsistently, as the with-dots version fails to notice the
mode.
Let's teach the dot-dot range parser to record modes; it
doesn't cost us anything, and it makes this case work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The handle_revision_arg function is rather long, and a big
chunk of it is handling the range operators. Let's pull that
out to a separate helper. While we're doing so, we can clean
up a few of the rough edges that made the flow hard to
follow:
- instead of manually restoring *dotdot (that we overwrote
with a NUL), do the real work in a sub-helper, which
makes it clear that the munge/restore lines are a
matched pair
- eliminate a goto which wasn't actually used for control
flow, but only to avoid duplicating a few lines
(instead, those lines are pushed into another helper
function)
- use early returns instead of deep nesting
- consistently name all variables for the left-hand side
of the range as "a" (rather than "this" or "from") and
the right-hand side as "b" (rather than "next", or using
the unadorned "sha1" or "flags" from the main function).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Since 003c84f6d (specifying ranges: we did not mean to make
".." an empty set, 2011-05-02), we treat the argument ".."
specially. We detect it by noticing that both sides of the
range are empty, and that this is a non-symmetric two-dot
range.
While correct, this makes the code overly complicated. We
can just detect ".." up front before we try to do further
parsing. This avoids having to de-munge the NUL from dotdot,
and lets us eliminate an extra const array (which we needed
only to do direct pointer comparisons).
It also removes the one code path from the range-parsing
conditional that requires us to return -1. That will make it
simpler to pull the dotdot parsing out into its own
function.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The handle_revision_arg() function has a "dotdot" variable
that it uses to find a ".." or "..." in the argument. If we
don't find one, we look for other marks, like "^!". But we
just keep re-using the "dotdot" variable, which is
confusing.
Let's introduce a separate "mark" variable that can be used
for these other marks. They still reuse the same variable,
but at least the name is no longer actively misleading.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "dotdot" range parser avoids calling
lookup_commit_reference() if we are directly fed two
commits. But its casts are unnecessarily complex; that
function will just return a commit we pass into it.
Just calling the function all the time is much simpler, and
doesn't do any significant extra work (the object is already
parsed, and deref_tag() on a non-tag is a noop; we do incur
one extra lookup_object() call, but that's fairly trivial).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we are parsing a range like "a..b", we write a
temporary NUL over the first ".", so that we can access the
names "a" and "b" as C strings. But our restoration of the
original "." is done at inconsistent times, which can lead
to confusing results.
For most calls, we restore the "." after we resolve the
names, but before we call verify_non_filename(). This means
that when we later call add_pending_object(), the name for
the left-hand "a" has been re-expanded to "a..b". You can
see this with:
git log --source a...b
where "b" will be correctly marked with "b", but "a" will be
marked with "a...b". Likewise with "a..b" (though you need
to use --boundary to even see "a" at all in that case).
To top off the confusion, when the REVARG_CANNOT_BE_FILENAME
flag is set, we skip the non-filename check, and leave the
NUL in place.
That means we do report the correct name for "a" in the
pending array. But some code paths try to show the whole
"a...b" name in error messages, and these erroneously show
only "a" instead of "a...b". E.g.:
$ git cherry-pick HEAD:foo...HEAD:foo
error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit
error: object d95f3ad14dee633a758d2e331151e950dd13e4ed is a blob, not a commit
fatal: Invalid symmetric difference expression HEAD:foo
(That last message should be "HEAD:foo...HEAD:foo"; I used
cherry-pick because it passes the CANNOT_BE_FILENAME flag).
As an interesting side note, cherry-pick actually looks at
and re-resolves the arguments from the pending->name fields.
So it would have been visibly broken by the first bug, but
the effect was canceled out by the second one.
This patch makes the whole function consistent by re-writing
the NUL immediately after calling verify_non_filename(), and
then restoring the "." as appropriate in some error-printing
and early-return code paths.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Make the --regexp-ignore-case option work with --perl-regexp. This
never worked, and there was no test for this. Fix the bug and add a
test.
When PCRE support was added in commit 63e7e9d8b6 ("git-grep: Learn
PCRE", 2011-05-09) compile_pcre_regexp() would only check
opt->ignore_case, but when the --perl-regexp option was added in
commit 727b6fc3ed ("log --grep: accept --basic-regexp and
--perl-regexp", 2012-10-03) the code didn't set the opt->ignore_case.
Change the test suite to test for -i and --invert-regexp with
basic/extended/perl patterns in addition to fixed, which was the only
patternType that was tested for before in combination with those
options.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When peeling a tag for prepare_revision_walk(), we do not
respect the ignore_missing_links flag. This can lead to a
bogus error when pack-objects walks the possibly-broken
unreachable-but-recent part of the object graph.
The other link-following all happens via traverse_commit_list(),
which explains why this case was missed. And our tests
covered only broken links from commits. Let's be more
comprehensive and cover broken tree entries (which do work)
and tags (which shows off this bug).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Make parse_object, parse_object_or_die, and parse_object_buffer take a
pointer to struct object_id. Remove the temporary variables inserted
earlier, since they are no longer necessary. Transform all of the
callers using the following semantic patch:
@@
expression E1;
@@
- parse_object(E1.hash)
+ parse_object(&E1)
@@
expression E1;
@@
- parse_object(E1->hash)
+ parse_object(E1)
@@
expression E1, E2;
@@
- parse_object_or_die(E1.hash, E2)
+ parse_object_or_die(&E1, E2)
@@
expression E1, E2;
@@
- parse_object_or_die(E1->hash, E2)
+ parse_object_or_die(E1, E2)
@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1.hash, E2, E3, E4, E5)
+ parse_object_buffer(&E1, E2, E3, E4, E5)
@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1->hash, E2, E3, E4, E5)
+ parse_object_buffer(E1, E2, E3, E4, E5)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rename this function and convert it to take a pointer to struct
object_id.
This is a prerequisite for converting get_reference, which is needed to
convert parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert the lookup_tree function to take a pointer to struct object_id.
The commit was created with manual changes to tree.c, tree.h, and
object.c, plus the following semantic patch:
@@
@@
- lookup_tree(EMPTY_TREE_SHA1_BIN)
+ lookup_tree(&empty_tree_oid)
@@
expression E1;
@@
- lookup_tree(E1.hash)
+ lookup_tree(&E1)
@@
expression E1;
@@
- lookup_tree(E1->hash)
+ lookup_tree(E1)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert lookup_blob to take a pointer to struct object_id.
The commit was created with manual changes to blob.c and blob.h, plus
the following semantic patch:
@@
expression E1;
@@
- lookup_blob(E1.hash)
+ lookup_blob(&E1)
@@
expression E1;
@@
- lookup_blob(E1->hash)
+ lookup_blob(E1)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.
Introduce a temporary in parse_object buffer in order to convert this
function. This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted. Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.
parse_object_buffer will lose this temporary in a later patch.
This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)
@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)
@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)
@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)
@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)
@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
This is a caller of lookup_commit_or_die, which we will convert later
on.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert the sha1 member of struct cache_tree to struct object_id by
changing the definition and applying the following semantic patch, plus
the standard object_id transforms:
@@
struct cache_tree E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_tree *E1;
@@
- E1->sha1
+ E1->oid.hash
Fix up one reference to active_cache_tree which was not automatically
caught by Coccinelle. These changes are prerequisites for converting
parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"uchar [40]" to "struct object_id" conversion continues.
* bc/object-id:
wt-status: convert to struct object_id
builtin/merge-base: convert to struct object_id
Convert object iteration callbacks to struct object_id
sha1_file: introduce an nth_packed_object_oid function
refs: simplify parsing of reflog entries
refs: convert each_reflog_ent_fn to struct object_id
reflog-walk: convert struct reflog_info to struct object_id
builtin/replace: convert to struct object_id
Convert remaining callers of resolve_refdup to object_id
builtin/merge: convert to struct object_id
builtin/clone: convert to struct object_id
builtin/branch: convert to struct object_id
builtin/grep: convert to struct object_id
builtin/fmt-merge-message: convert to struct object_id
builtin/fast-export: convert to struct object_id
builtin/describe: convert to struct object_id
builtin/diff-tree: convert to struct object_id
builtin/commit: convert to struct object_id
hex: introduce parse_oid_hex
|
|
The interpret_branch_name() function converts names like
@{-1} and @{upstream} into branch names. The expanded ref
names are not fully qualified, and may be outside of the
refs/heads/ namespace (e.g., "@" expands to "HEAD", and
"@{upstream}" is likely to be in "refs/remotes/").
This is OK for callers like dwim_ref() which are primarily
interested in resolving the resulting name, no matter where
it is. But callers like "git branch" treat the result as a
branch name in refs/heads/. When we expand to a ref outside
that namespace, the results are very confusing (e.g., "git
branch @" tries to create refs/heads/HEAD, which is
nonsense).
Callers can't know from the returned string how the
expansion happened (e.g., did the user really ask for a
branch named "HEAD", or did we do a bogus expansion?). One
fix would be to return some out-parameters describing the
types of expansion that occurred. This has the benefit that
the caller can generate precise error messages ("I
understood @{upstream} to mean origin/master, but that is a
remote tracking branch, so you cannot create it as a local
name").
However, out-parameters make the function interface somewhat
cumbersome. Instead, let's do the opposite: let the caller
tell us which elements to expand. That's easier to pass in,
and none of the callers give more precise error messages
than "@{upstream} isn't a valid branch name" anyway (which
should be sufficient).
The strbuf_branchname() function needs a similar parameter,
as most of the callers access interpret_branch_name()
through it.
We can break the callers down into two groups:
1. Callers that are happy with any kind of ref in the
result. We pass "0" here, so they continue to work
without restrictions. This includes merge_name(),
the reflog handling in add_pending_object_with_path(),
and substitute_branch_name(). This last is what powers
dwim_ref().
2. Callers that have funny corner cases (mostly in
git-branch and git-checkout). These need to make use of
the new parameter, but I've left them as "0" in this
patch, and will address them individually in follow-on
patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Make each_reflog_ent_fn take two struct object_id pointers instead of
two pointers to unsigned char. Convert the various callbacks to use
struct object_id as well. Also, rename fsck_handle_reflog_sha1 to
fsck_handle_reflog_oid.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git log rev^..rev" is an often-used revision range specification
to show what was done on a side branch merged at rev. This has
gained a short-hand "rev^-1". In general "rev^-$n" is the same as
"^rev^$n rev", i.e. what has happened on other branches while the
history leading to nth parent was looking the other way.
* vn/revision-shorthand-for-side-branch-log:
revision: new rev^-n shorthand for rev^n..rev
|
|
"git log rev^..rev" is commonly used to show all work done on and merged
from a side branch. This patch introduces a shorthand "rev^-" for this
and additionally allows "rev^-$n" to mean "reachable from rev, excluding
what is reachable from the nth parent of rev". For example, for a
two-parent merge, you can use rev^-2 to get the set of commits which were
made to the main branch while the topic branch was prepared.
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Convert struct cache_entry to use struct object_id by applying the
following semantic patch and the object_id transforms from contrib, plus
the actual change to the struct:
@@
struct cache_entry E1;
@@
- E1.sha1
+ E1.oid.hash
@@
struct cache_entry *E1;
@@
- E1->sha1
+ E1->oid.hash
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "git rebase" tries to compare set of changes on the updated
upstream and our own branch, it computes patch-id for all of these
changes and attempts to find matches. This has been optimized by
lazily computing the full patch-id (which is expensive) to be
compared only for changes that touch the same set of paths.
* kw/patch-ids-optim:
rebase: avoid computing unnecessary patch IDs
patch-ids: add flag to create the diff patch id using header only data
patch-ids: replace the seen indicator with a commit pointer
patch-ids: stop using a hand-rolled hashmap implementation
|
|
"git -c grep.patternType=extended log --basic-regexp" misbehaved
because the internal API to access the grep machinery was not
designed well.
* jc/grep-commandline-vs-configuration:
grep: further simplify setting the pattern type
|
|
The cherry_pick_list was looping through the original side checking the
seen indicator and setting the cherry_flag on the commit. If we save
off the commit in the patch_id we can set the cherry_flag on the correct
commit when running through the other side when a patch_id match is found.
Signed-off-by: Kevin Willford <kcwillford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When c5c31d33 (grep: move pattern-type bits support to top-level
grep.[ch], 2012-10-03) introduced grep_commit_pattern_type() helper
function, the intention was to allow the users of grep API to having
to fiddle only with .pattern_type_option (which can be set to "fixed",
"basic", "extended", and "pcre"), and then immediately before compiling
the pattern strings for use, call grep_commit_pattern_type() to have
it prepare various bits in the grep_opt structure (like .fixed,
.regflags, etc.).
However, grep_set_pattern_type_option() helper function the grep API
internally uses were left as an external function by mistake. This
function shouldn't have been made callable by the users of the API.
Later when the grep API was used in revision traversal machinery,
the caller then mistakenly started calling the function around
34a4ae55 (log --grep: use the same helper to set -E/-F options as
"git grep", 2012-10-03), instead of setting the .pattern_type_option
field and letting the grep_commit_pattern_type() to take care of the
details.
This caused an unnecessary bug that made a configured
grep.patternType take precedence over the command line options
(e.g. --basic-regexp, --fixed-strings) in "git log" family of
commands.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git log" learns log.showSignature configuration variable, and a
command line option "--no-show-signature" to countermand it.
* mj/log-show-signature-conf:
log: add log.showSignature configuration variable
log: add "--no-show-signature" command line option
t4202: refactor test
|
|
If an user creates an alias with "--show-signature" early in command
line, e.g.
[alias] logss = log --show-signature
then there is no way to countermand it through command line.
Teach git-log and related commands about "--no-show-signature" command
line option. This will make "git logss --no-show-signature" run
without showing GPG signature.
Signed-off-by: Mehul Jain <mehul.jain2029@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The function takes a pointer to a pathspec structure, and releases
the resources held by it, but does not free() the structure itself.
Such a function should be called "clear", not "free".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Move from unsigned char[20] to struct object_id continues.
* bc/object-id:
match-trees: convert several leaf functions to use struct object_id
tree-walk: convert tree_entry_extract() to use struct object_id
struct name_entry: use struct object_id instead of unsigned char sha1[20]
match-trees: convert shift_tree() and shift_tree_by() to use object_id
test-match-trees: convert to use struct object_id
sha1-name: introduce a get_oid() function
|
|
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "git log" shows the log message indented by 4-spaces, the
remainder of a line after a HT does not align in the way the author
originally intended. The command now expands tabs by default in
such a case, and allows the users to override it with a new option,
'--no-expand-tabs'.
* lt/pretty-expand-tabs:
pretty: test --expand-tabs
pretty: allow tweaking tabwidth in --expand-tabs
pretty: enable --expand-tabs by default for selected pretty formats
pretty: expand tabs in indented logs to make things line up properly
|
|
When the local convention of the project is to use tab width that is
not 8, it may make sense to allow "git log --expand-tabs=<n>" to
tweak the output to match it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
"git log --pretty={medium,full,fuller}" and "git log" by default
prepend 4 spaces to the log message, so it makes sense to enable
the new "expand-tabs" facility by default for these formats.
Add --no-expand-tabs option to override the new default.
The change alone breaks a test in t4201 that runs "git shortlog"
on the output from "git log", and expects that the output from
"git log" does not do such a tab expansion. Adjust the test to
explicitly disable expand-tabs with --no-expand-tabs.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
A commit log message sometimes tries to line things up using tabs,
assuming fixed-width font with the standard 8-place tab settings.
Viewing such a commit however does not work well in "git log", as
we indent the lines by prefixing 4 spaces in front of them.
This should all line up:
Column 1 Column 2
-------- --------
A B
ABCD EFGH
SPACES Instead of Tabs
Even with multi-byte UTF8 characters:
Column 1 Column 2
-------- --------
Ä B
åäö 100
A Møøse once bit my sister..
Tab-expand the lines in "git log --expand-tabs" output before
prefixing 4 spaces.
This is based on the patch by Linus Torvalds, but at this step, we
require an explicit command line option to enable the behaviour.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When "git rev-list" shows an object with its associated path
name, it does so by walking the name_path linked list and
printing each component (stopping at any embedded NULs or
newlines).
We'd like to eventually get rid of name_path entirely in
favor of a single buffer, and dropping this custom printing
code is part of that. As a first step, let's use path_name()
to format the list into a single buffer, and print that.
This is strictly less efficient than the original, but it's
a temporary step in the refactoring; our end game will be to
get the fully formatted name in the first place.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Update various codepaths to avoid manually-counted malloc().
* jk/tighten-alloc: (22 commits)
ewah: convert to REALLOC_ARRAY, etc
convert ewah/bitmap code to use xmalloc
diff_populate_gitlink: use a strbuf
transport_anonymize_url: use xstrfmt
git-compat-util: drop mempcpy compat code
sequencer: simplify memory allocation of get_message
test-path-utils: fix normalize_path_copy output buffer size
fetch-pack: simplify add_sought_entry
fast-import: simplify allocation in start_packfile
write_untracked_extension: use FLEX_ALLOC helper
prepare_{git,shell}_cmd: use argv_array
use st_add and st_mult for allocation size computation
convert trivial cases to FLEX_ARRAY macros
use xmallocz to avoid size arithmetic
convert trivial cases to ALLOC_ARRAY
convert manual allocations to argv_array
argv-array: add detach function
add helpers for allocating flex-array structs
harden REALLOC_ARRAY and xcalloc against size_t overflow
tree-diff: catch integer overflow in combine_diff_path allocation
...
|
|
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we find a blob at "a/b/c", we currently pass this to
our show_object_fn callbacks as two components: "a/b/" and
"c". Callbacks which want the full value then call
path_name(), which concatenates the two. But this is an
inefficient interface; the path is a strbuf, and we could
simply append "c" to it temporarily, then roll back the
length, without creating a new copy.
So we could improve this by teaching the callsites of
path_name() this trick (and there are only 3). But we can
also notice that no callback actually cares about the
broken-down representation, and simply pass each callback
the full path "a/b/c" as a string. The callback code becomes
even simpler, then, as we do not have to worry about freeing
an allocated buffer, nor rolling back our modification to
the strbuf.
This is theoretically less efficient, as some callbacks
would not bother to format the final path component. But in
practice this is not measurable. Since we use the same
strbuf over and over, our work to grow it is amortized, and
we really only pay to memcpy a few bytes.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In the previous commit, we left name_path as a thin wrapper
around a strbuf. This patch drops it entirely. As a result,
every show_object_fn callback needs to be adjusted. However,
none of their code needs to be changed at all, because the
only use was to pass it to path_name(), which now handles
the bare strbuf.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The "struct name_path" data is examined in only two places:
we generate it in process_tree(), and we convert it to a
single string in path_name(). Everyone else just passes it
through to those functions.
We can further note that process_tree() already keeps a
single strbuf with the leading tree path, for use with
tree_entry_interesting().
Instead of building a separate name_path linked list, let's
just use the one we already build in "base". This reduces
the amount of code (especially tricky code in path_name()
which did not check for integer overflows caused by deep
or large pathnames).
It is also more efficient in some instances. Any time we
were using tree_entry_interesting, we were building up the
strbuf anyway, so this is an immediate and obvious win
there. In cases where we were not, we trade off storing
"pathname/" in a strbuf on the heap for each level of the
path, instead of two pointers and an int on the stack (with
one pointer into the tree object). On a 64-bit system, the
latter is 20 bytes; so if path components are less than that
on average, this has lower peak memory usage. In practice
it probably doesn't matter either way; we are already
holding in memory all of the tree objects leading up to each
pathname, and for normal-depth pathnames, we are only
talking about hundreds of bytes.
This patch leaves "struct name_path" as a thin wrapper
around the strbuf, to avoid disrupting callbacks. We should
fix them, but leaving it out makes this diff easier to view.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|