summaryrefslogtreecommitdiff
path: root/t/t4212-log-corrupt.sh
AgeCommit message (Collapse)AuthorFilesLines
2014-04-09Merge branch 'jk/commit-dates-parsing-fix' into maintLibravatar Junio C Hamano1-4/+2
* jk/commit-dates-parsing-fix: t4212: loosen far-in-future test for AIX date: recognize bogus FreeBSD gmtime output
2014-04-01t4212: loosen far-in-future test for AIXLibravatar Jeff King1-4/+2
One of the tests in t4212 checks our behavior when we feed gmtime a date so far in the future that it gives up and returns NULL. Some implementations, like AIX, may actually just provide us a bogus result instead. It's not worth it for us to come up with heuristics that guess whether the return value is sensible or not. On good platforms where gmtime reports the problem to us with NULL, we will print the epoch value. On bad platforms, we will print garbage. But our test should be written for the lowest common denominator so that it passes everywhere. Reported-by: Charles Bailey <cbailey32@bloomberg.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-18Merge branch 'jk/commit-dates-parsing-fix' into maintLibravatar Junio C Hamano1-0/+45
Codepaths that parse timestamps in commit objects have been tightened. * jk/commit-dates-parsing-fix: show_ident_date: fix tz range check log: do not segfault on gmtime errors log: handle integer overflow in timestamps date: check date overflow against time_t fsck: report integer overflow in author timestamps t4212: test bogus timestamps with git-log
2014-02-24log: do not segfault on gmtime errorsLibravatar Jeff King1-0/+8
Many code paths assume that show_date and show_ident_date cannot return NULL. For the most part, we handle missing or corrupt timestamps by showing the epoch time t=0. However, we might still return NULL if gmtime rejects the time_t we feed it, resulting in a segfault. Let's catch this case and just format t=0. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24log: handle integer overflow in timestampsLibravatar Jeff King1-0/+16
If an ident line has a ridiculous date value like (2^64)+1, we currently just pass ULONG_MAX along to the date code, which can produce nonsensical dates. On systems with a signed long time_t (e.g., 64-bit glibc systems), this actually doesn't end up too bad. The ULONG_MAX is converted to -1, we apply the timezone field to that, and the result ends up somewhere between Dec 31, 1969 and Jan 1, 1970. However, there is still a few good reasons to detect the overflow explicitly: 1. On systems where "unsigned long" is smaller than time_t, we get a nonsensical date in the future. 2. Even where it would produce "Dec 31, 1969", it's easier to recognize "midnight Jan 1" as a consistent sentinel value for "we could not parse this". 3. Values which do not overflow strtoul but do overflow a signed time_t produce nonsensical values in the past. For example, on a 64-bit system with a signed long time_t, a timestamp of 18446744073000000000 produces a date in 1947. We also recognize overflow in the timezone field, which could produce nonsensical results. In this case we show the parsed date, but in UTC. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24t4212: test bogus timestamps with git-logLibravatar Jeff King1-0/+21
When t4212 was originally added by 9dbe7c3d (pretty: handle broken commit headers gracefully, 2013-04-17), it tested our handling of commits with broken ident lines in which the timestamps could not be parsed. It does so using a bogus line like "Name <email>-<> 1234 -0000", because that simulates an error that was seen in the wild. Later, 03818a4 (split_ident: parse timestamp from end of line, 2013-10-14) made our parser smart enough to actually find the timestamp on such a line, and t4212 was adjusted to match. While it's nice that we handle this real-world case, this meant that we were not actually testing the bogus-timestamp case anymore. This patch adds a test with a totally incomprehensible timestamp to make sure we are testing the code path. Note that the behavior is slightly different between regular log output and "--format=%ad". In the former case, we produce a sentinel value and in the latter, we produce an empty string. While at first this seems unnecessarily inconsistent, it matches the original behavior given by 9dbe7c3d. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-15split_ident: parse timestamp from end of lineLibravatar Jeff King1-2/+7
Split_ident currently parses left to right. Given this input: Your Name <email@example.com> 123456789 -0500\n We assume the name starts the line and runs until the first "<". That starts the email address, which runs until the first ">". Everything after that is assumed to be the timestamp. This works fine in the normal case, but is easily broken by corrupted ident lines that contain an extra ">". Some examples seen in the wild are: 1. Name <email>-<> 123456789 -0500\n 2. Name <email> <Name<email>> 123456789 -0500\n 3. Name1 <email1>, Name2 <email2> 123456789 -0500\n Currently each of these produces some email address (which is not necessarily the one the user intended) and end up with a NULL date (which is generally interpreted as the epoch by "git log" and friends). But in each case we could get the correct timestamp simply by parsing from the right-hand side, looking backwards for the final ">", and then reading the timestamp from there. In general, it's a losing battle to try to automatically guess what the user meant with their broken crud. But this particular workaround is probably worth doing. One, it's dirt simple, and can't impact non-broken cases. Two, it doesn't catch a single breakage we've seen, but rather a large class of errors (i.e., any breakage inside the email angle brackets may affect the email, but won't spill over into the timestamp parsing). And three, the timestamp is arguably more valuable to get right, because it can affect correctness (e.g., in --until cutoffs). This patch implements the right-to-left scheme described above. We adjust the tests in t4212, which generate a commit with such a broken ident, and now gets the timestamp right. We also add a test that fsck continues to detect the breakage. For reference, here are pointers to the breakages seen (as numbered above): [1] http://article.gmane.org/gmane.comp.version-control.git/221441 [2] http://article.gmane.org/gmane.comp.version-control.git/222362 [3] http://perl5.git.perl.org/perl.git/commit/13b79730adea97e660de84bbe67f9d7cbe344302 Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-17pretty: handle broken commit headers gracefullyLibravatar René Scharfe1-0/+42
Centralize the parsing of the date and time zone strings in the new helper function show_ident_date() and make sure it checks the pointers provided by split_ident_line() for NULL before use. Reported-by: Ivan Lyapunov <dront78@gmail.com> Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx> Signed-off-by: Junio C Hamano <gitster@pobox.com>