summaryrefslogtreecommitdiff
path: root/gitweb
AgeCommit message (Collapse)AuthorFilesLines
2009-06-27Merge branch 'maint'Libravatar Junio C Hamano1-1/+9
* maint: gitweb/README: fix AliasMatch in example Test grep --and/--or/--not Test git archive --remote fread does not return negative on error
2009-06-27gitweb/README: fix AliasMatch in exampleLibravatar Giuseppe Bilotta1-1/+9
When combining "dumb client" and human-friendly access by using the '.git' extension to switch between the two, make sure the AliasMatch covers the entire request. Without a full match, a request for http://git.example.com/project/shortlog/branch..gitsomething would result in a 404 because the server would try to access the the project 'project/shortlog/branch.' The solution is still not bulletproof, so document the possible failing case. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-23Merge branch 'jn/gitweb-cleanup'Libravatar Junio C Hamano1-42/+42
* jn/gitweb-cleanup: gitweb: Remove unused $hash_base parameter from normalize_link_target gitweb: Simplify snapshot format detection logic in evaluate_path_info gitweb: Use capturing parentheses only when you intend to capture gitweb: Replace wrongly added tabs with spaces gitweb: Use block form of map/grep in a few cases more gitweb: Always use three argument form of open gitweb: Always use three argument form of open gitweb: Do not use bareword filehandles
2009-05-22gitweb: Sanitize title attribute in format_subject_htmlLibravatar Jakub Narebski1-1/+2
Replace control characters with question mark '?' (like in chop_and_esc_str). A little background: some web browsers turn on strict (and unforgiving) XML validating mode for XHTML documents served using application/xhtml+xml content type. This means among others that control characters are forbidden to appear in gitweb output. CGI.pm does by default slight escaping (using simple_escape subroutine from CGI::Util) of all _attribute_ values (depending on the value of autoEscape, by default on). This escaping, at least in CGI.pm version 3.10 (most current version at CPAN is 3.43), is minimal: only '"', '&', '<' and '>' are escaped using named HTML entity references (&quot;, &amp;, &lt; and &gt; respectively). But simple_escape does not do escaping of control characters such as ^X which are invalid in XHTML (in strict mode). If by some accident commit message do contain some control character in first 50 characters (more or less) of first line of commit message, and this line is longer than 50 characters (so gitweb shortens it for display), then gitweb would put this control character in title attribute (and CGI.pm would not remove them). The tag _contents_ is safe because it is escaped using esc_html() explicitly, and it replaces control characters by their printable representation. While at it: chop_and_escape_str doesn't need capturing group. Noticed-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-13gitweb: Remove unused $hash_base parameter from normalize_link_targetLibravatar Jakub Narebski1-5/+2
...since it was decided for normalize_link_target to only mangle pathname, and do not try to check if target is present in $hash_base tree, for performance reasons. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-13gitweb: Simplify snapshot format detection logic in evaluate_path_infoLibravatar Jakub Narebski1-3/+4
This issue was caught by perlcritic in harsh severity level noticing that catch variable was used outside conditional thanks to the Perl::Critic::Policy::RegularExpressions::ProhibitCaptureWithoutTest policy. See "Perl Best Practices", chapter 12. Regular Expressions, section 12.15. Captured Values: Pattern matches that fail never assign anything to $1, $2, etc., nor do they leave those variables undefined. After an unsuccessful pattern match, the numeric capture variables remain exactly as they were before the match was attempted. New version is in my opinion much easier to understand; previous version worked correctly due to the fact that we returned from loop on first found match. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Acked-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-13gitweb: Use capturing parentheses only when you intend to captureLibravatar Jakub Narebski1-1/+1
Non-capturing groups are useful because they have better runtime performance and do not copy strings to the magic global capture variables. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-13gitweb: Replace wrongly added tabs with spacesLibravatar Jakub Narebski1-2/+2
In two places there was hard tab character instead of space. Fix this. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-10gitweb: Use block form of map/grep in a few cases moreLibravatar Jakub Narebski1-3/+3
Use block form of 'grep' i.e. 'grep {BLOCK} LIST' rather than 'grep(EXPR, LIST)' in filter_snapshot_fmts subroutine. This makes code more readable, as expression is rather long, and statement above there is 'map' with very similar expression also in the block form. Remove unnecessary and misleading parentheses around block form 'map' arguments in quote_command subroutine. The inner "map" in format_snapshot_links was left alone, as it is not clear whether adding parentheses or changing it into block form would improve readibility and clarity of this code. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-10gitweb: Always use three argument form of openLibravatar Jakub Narebski1-11/+14
From 94638fb6edf3ea693228c680a6a30271ccd77522 Mon Sep 17 00:00:00 2001 From: Jakub Narebski <jnareb@gmail.com> Date: Mon, 11 May 2009 03:25:55 +0200 Subject: [PATCH] gitweb: Localize magic variable $/ Instead of undefining and then restoring magic variable $/ (input record separator) for 'slurp mode', localize it. While at it, state explicitely that "local $/;" makes it undefined, by using explicit "local $/ = undef;". Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-10gitweb: Always use three argument form of openLibravatar Jakub Narebski1-6/+6
In most cases (except insert_file() subroutine) we used old two argument form of 'open' to open files for reading. This can cause subtle bugs when $projectroot or $projects_list file starts with mode characters ('>', '<', '+<', '|', etc.) or with leading whitespace; and also when $projects_list file or $mimetypes_file or ctags files end with trailing whitespace or '|'. Additionally it is also more clear to explicitly state that we open those files for reading. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-10gitweb: Do not use bareword filehandlesLibravatar Jakub Narebski1-13/+12
gitweb: Do not use bareword filehandles The script was using bareword filehandles. This is considered a bad practice so they have been changed to indirect filehandles. Changes touch git_get_project_ctags and mimetype_guess_file; while at it rename local variable from $mime to $mimetype (in mimetype_guess_file) to better reflect its value (its contents). Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-09gitweb: Remove function prototypes (cleanup)Libravatar Jakub Narebski1-7/+5
Use of function prototypes is considered bad practice in Perl. The ones used here didn't accomplish anything anyhow, so they've been removed. >From perlsub(1): [...] the intent of this feature [prototypes] is primarily to let you define subroutines that work like built-in functions [...] you can generate new syntax with it [...] We don't want to have subroutines behaving exactly like built-in functions, we don't want to define new syntax / syntactic sugar, so prototypes in gitweb are not needed... and they can have unintended consequences. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-20Documentation: fix typos / spelling mistakesLibravatar Mike Ralphson1-1/+1
Signed-off-by: Mike Ralphson <mike@abacus.co.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-20gitweb: Fix snapshots requested via PATH_INFOLibravatar Holger Weiß1-2/+2
Fix the detection of the requested snapshot format, which failed for PATH_INFO URLs since the references to the hashes which describe the supported snapshot formats weren't dereferenced appropriately. Signed-off-by: Holger Weiß <holger@zedat.fu-berlin.de> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-19gitweb: Hyperlink multiple git hashes on the same commit message lineLibravatar Marcel M. Cary1-7/+5
The current implementation only hyperlinks the first hash on a given line of the commit message. It seems sensible to highlight all of them if there are multiple, and it seems plausible that there would be multiple even with a tidy line length limit, because they can be abbreviated as short as 8 characters. Benchmark: I wanted to make sure that using the 'e' switch to the Perl regex wasn't going to kill performance, since this is called once per commit message line displayed. In all three A/B scenarios I tried, the A and B yielded the same results within 2%, where A is the version of code before this patch and B is the version after. 1: View a commit message containing the last 1000 commit hashes 2: View a commit message containing 1000 lines of 40 dots to avoid hyperlinking at the same message length 3: View a short merge commit message with a few lines of text and no hashes All were run in CGI mode on my sub-production hardware on a recent clone of git.git. Numbers are the average of 10 reqests per second with the first request discarded, since I expect this change to affect primarily CPU usage. Measured with ApacheBench. Note that the web page rendered was the same; while the new code supports multiple hashes per line, there was at most one per line. The primary purpose of scenarios 2 and 3 were to verify that the addition of 1000 commit messages had an impact on how much of the time was spent rendering commit messages. They were all within 2% of 0.80 requests per second (much faster). So I think the patch has no noticeable effect on performance. Signed-off-by: Marcel M. Cary <marcel@oak.homeunix.org> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-18gitweb: Fix warnings with override permitted but no repo overrideLibravatar Marcel M. Cary1-6/+10
When a feature like "blame" is permitted to be overridden in the repository configuration but it is not actually set in the repository, a warning is emitted due to the undefined value of the repository configuration, even though it's a perfectly normal condition. Emitting warning is grounds for test failure in the gitweb test script. This error was caused by rewrite of git_get_project_config from using "git config [<type>] <name>" for each individual configuration variable checked to parsing "git config --list --null" output in commit b201927 (gitweb: Read repo config using 'git config -z -l'). Earlier version of git_get_project_config was returning empty string if variable do not exist in config; newer version is meant to return undef in this case, therefore change in feature_bool was needed. Additionally config_to_* subroutines were meant to be invoked only if configuration variable exists; therefore we added early return to git_get_project_config: it now returns no value if variable does not exists in config. Otherwise config_to_* subroutines (config_to_bool in paryicular) wouldn't be able to distinguish between the case where variable does not exist and the case where variable doesn't have value (the "[section] noval" case, which evaluates to true for boolean). While at it fix bug in config_to_bool, where checking if $val is defined (if config variable has value) was done _after_ stripping leading and trailing whitespace, which lead to 'Use of uninitialized value' warning. Add test case for features overridable but not overriden in repo config, and case for no value boolean configuration variable. Signed-off-by: Marcel M. Cary <marcel@oak.homeunix.org> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-16gitweb: fix wrong base URL when non-root DirectoryIndexLibravatar Giuseppe Bilotta1-6/+22
CGI::url() has some issues when rebuilding the script URL if the script is a DirectoryIndex. One of these issue is the inability to strip PATH_INFO, which is why we had to do it ourselves. Another issue is that the resulting URL cannot be used for the <base> tag: it works if we're the DirectoryIndex at the root level, but not otherwise. We fix this by building the proper base URL ourselves, and improve the comment about the need to strip PATH_INFO manually while we're at it. Additionally t/t9500-gitweb-standalone-no-errors.sh had to be modified to set SCRIPT_NAME variable (CGI standard states that it MUST be set, and now gitweb uses it if PATH_INFO is not empty, as is the case for some of tests in t9500). Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-15Merge branch 'jn/gitweb-committag'Libravatar Junio C Hamano1-1/+1
* jn/gitweb-committag: gitweb: Better regexp for SHA-1 committag match
2009-02-08Merge branch 'maint'Libravatar Junio C Hamano2-3/+27
* maint: gitweb: add $prevent_xss option to prevent XSS by repository content rev-list: fix showing distance when using --bisect-all
2009-02-08gitweb: add $prevent_xss option to prevent XSS by repository contentLibravatar Matt McCutchen2-3/+27
Add a gitweb configuration variable $prevent_xss that disables features to prevent content in repositories from launching cross-site scripting (XSS) attacks in the gitweb domain. Currently, this option makes gitweb ignore README.html (a better solution may be worked out in the future) and serve a blob_plain file of an untrusted type with "Content-Disposition: attachment", which tells the browser not to show the file at its original URL. The XSS prevention is currently off by default. Signed-off-by: Matt McCutchen <matt@mattmccutchen.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-07gitweb: Better regexp for SHA-1 committag matchLibravatar Jakub Narebski1-1/+1
Make SHA-1 regexp to be turned into hyperlink (the SHA-1 committag) to match word boundary at the beginning and the end. This way we reduce number of false matches, for example we now don't match 0x74a5cd01 which is hex decimal (for example memory address), but is not SHA-1. Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-01gitweb: Update README that gitweb works better with PATH_INFOLibravatar Jakub Narebski1-6/+4
One had to configure gitweb for it to find static files (stylesheets, images) when using path_info URLs. Now that it is not necessary thanks to adding BASE element to HTML head if needed, update README to reflect this fact. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-30gitweb: align comments to codeLibravatar Giuseppe Bilotta1-4/+4
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-30gitweb: webserver config for PATH_INFOLibravatar Giuseppe Bilotta1-0/+76
Document some possible Apache configurations when the path_info feature is enabled in gitweb. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-30gitweb: make static files accessible with PATH_INFOLibravatar Giuseppe Bilotta1-0/+5
Gitweb links to a number of static files such as CSS stylesheets, favicon or the git logo. When, such as with the default Makefile, the paths to these files are relative (i.e. doesn't start with a "/"), the files become inaccessible in any view other tha project list and summary page if gitweb is invoked with a non-empty PATH_INFO. Fix this by adding a <base> element pointing to the script's own URL, which ensure that all relative paths will be resolved correctly. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28gitweb: check if-modified-since for feedsLibravatar Giuseppe Bilotta1-1/+19
Offering Last-modified header for feeds is only half the work, even if we bail out early on HEAD requests. We should also check that same date against If-modified-since, and bail out early with 304 Not Modified if that's the case. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28gitweb: last-modified time should be commiter, not authorLibravatar Giuseppe Bilotta1-1/+1
The last-modified time header added by RSS to increase cache hits from readers should be set to the date the repository was last modified. The author time in this respect is not a good guess because the last commit might come from a oldish patch. Use the committer time for the last-modified header to ensure a more correct guess of the last time the repository was modified. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28gitweb: rss channel dateLibravatar Giuseppe Bilotta1-0/+4
The RSS 2.0 specifications defines not one but _two_ dates for its channel element! Woohoo! Luckily, it seems that consensus seems to be that if both are present they should be equal, except for some very obscure and discouraged cases. Since lastBuildDate would make more sense for us and pubDate seems to be the most commonly used, we defined both and make them equal. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28gitweb: rss feed managingEditorLibravatar Giuseppe Bilotta1-1/+3
The RSS 2.0 specification allows an optional managingEditor tag for the channel, containing the "email address for person responsible for editorial content", which is basically the project owner. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28gitweb: feed generator metadataLibravatar Giuseppe Bilotta1-0/+2
Add <generator> tag to RSS and Atom feed. Versioning info (gitweb/git core versions, separated by a literal slash) is stored in the appropriate attribute for the Atom feed, and in the tag content for the RSS feed. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-28gitweb: channel image in rss feedLibravatar Giuseppe Bilotta1-0/+10
Define the channel image for the rss feed when the logo or favicon are defined, preferring the former to the latter. As suggested in the RSS 2.0 specifications, the image's title and link as set to the same as the channel's. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17Merge branch 'gb/gitweb-opml'Libravatar Junio C Hamano1-3/+7
* gb/gitweb-opml: gitweb: suggest name for OPML view gitweb: don't use pathinfo for global actions
2009-01-17Merge branch 'gb/gitweb-patch'Libravatar Junio C Hamano1-3/+110
* gb/gitweb-patch: gitweb: link to patch(es) view in commit(diff) and (short)log view gitweb: add patches view gitweb: change call pattern for git_commitdiff gitweb: add patch view Conflicts: gitweb/gitweb.perl
2009-01-10gitweb: suggest name for OPML viewLibravatar Giuseppe Bilotta1-1/+5
Suggest opml.xml as name for OPML view by providing the appropriate header, consistently with similar usage in project_index view. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-07Merge branch 'mk/gitweb-feature'Libravatar Junio C Hamano1-32/+9
* mk/gitweb-feature: gitweb: unify boolean feature subroutines
2009-01-07Merge branch 'jn/gitweb-blame'Libravatar Junio C Hamano1-35/+51
* jn/gitweb-blame: gitweb: cache $parent_commit info in git_blame() gitweb: A bit of code cleanup in git_blame() gitweb: Move 'lineno' id from link to row element in git_blame
2009-01-06gitweb: don't use pathinfo for global actionsLibravatar Giuseppe Bilotta1-2/+2
With PATH_INFO urls, actions for the projects list (e.g. opml, project_index) were being put in the URL right after the base. The resulting URL is not properly parsed by gitweb itself, since it expects a project name as first component of the URL. Accepting global actions in use_pathinfo is not a very robust solution due to possible present and future conflicts between project names and global actions, therefore we just refuse to create PATH_INFO URLs when the project is not defined. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-05gitweb: use href() when generating URLs in OPMLLibravatar Giuseppe Bilotta1-2/+2
Since the OPML project list view was hand-coding the RSS and HTML URLs, it didn't respect global options such as use_pathinfo. Make it use href() to ensure consistency with the rest of the gitweb setup. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-27gitweb: Fix export check in git_get_projects_listLibravatar Devin Doucette1-2/+3
When $filter was empty, the path passed to check_export_ok would contain an extra '/', which some implementations of export_auth_hook are sensitive to. It makes more sense to fix this here than to handle the special case in each implementation of export_auth_hook. Signed-off-by: Devin Doucette <devin@doucette.cc> Acked-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21gitweb: link to patch(es) view in commit(diff) and (short)log viewLibravatar Giuseppe Bilotta1-2/+28
We link to patch view in commit and commitdiff view, and to patches view in log and shortlog view. In (short)log view, the link is only offered when the number of commits shown is no more than the allowed maximum number of patches. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21gitweb: add patches viewLibravatar Giuseppe Bilotta1-1/+14
The only difference between patch and patches view is in the treatement of single commits: the former only displays a single patch, whereas the latter displays a patchset leading to the specified commit. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21gitweb: change call pattern for git_commitdiffLibravatar Giuseppe Bilotta1-3/+4
Since we are going to introduce an additional parameter for git_commitdiff to tune patch view, we switch to named/hash-based parameter passing for clarity and robustness. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-21gitweb: add patch viewLibravatar Giuseppe Bilotta1-1/+68
The output of commitdiff_plain is not intended for git-am: * when given a range of commits, commitdiff_plain publishes a single patch with the message from the first commit, instead of a patchset * the hand-built email format replicates the commit summary both as email subject and as first line of the email itself, resulting in a duplication if the output is used with git-am. We thus create a new view that can be fed to git-am directly, allowing patch exchange via gitweb. The new view exposes the output of git format-patch directly, limiting it to a single patch in the case of a single commit. A configurable upper limit defaulting to 16 is imposed on the number of commits which will be included in a patchset, to prevent DoS attacks on the server. Setting the limit to 0 will disable the patch view, setting it to a negative number will remove the limit. Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-16gitweb: do not run "git diff" that is PorcelainLibravatar Junio C Hamano1-37/+3
Jakub says that legacy-style URI to view two blob differences are never generated since 1.4.3. This codepath runs "git diff" Porcelain from the gitweb, which is a no-no. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-15gitweb: unify boolean feature subroutinesLibravatar Matt Kraai1-32/+9
The boolean feature subroutines behaved identically except for the name of the configuration option, so make that a parameter and unify them. Signed-off-by: Matt Kraai <kraai@ftbfs.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-10gitweb: cache $parent_commit info in git_blame()Libravatar Jakub Narebski1-5/+11
Luben Tuikov changed 'lineno' link from leading to commit which gave current version of given block of lines, to leading to parent of this commit in 244a70e (Blame "linenr" link jumps to previous state at "orig_lineno"). This made possible data mining using 'blame' view. The current implementation calls rev-parse once per each blamed line to find parent revision of blamed commit, even when the same commit appears more than once, which is inefficient. This patch mitigates this issue by caching $parent_commit info in %metainfo, which makes gitweb call rev-parse only once per each unique commit in the output from "git blame". In the tables below you can see simple benchmark comparing gitweb performance before and after this patch File | L[1] | C[2] || Time0[3] | Before[4] | After[4] ==================================================================== blob.h | 18 | 4 || 0m1.727s | 0m2.545s | 0m2.474s GIT-VERSION-GEN | 42 | 13 || 0m2.165s | 0m2.448s | 0m2.071s README | 46 | 6 || 0m1.593s | 0m2.727s | 0m2.242s revision.c | 1923 | 121 || 0m2.357s | 0m30.365s | 0m7.028s gitweb/gitweb.perl | 6291 | 428 || 0m8.080s | 1m37.244s | 0m20.627s File | L/C | Before/After ========================================= blob.h | 4.5 | 1.03 GIT-VERSION-GEN | 3.2 | 1.18 README | 7.7 | 1.22 revision.c | 15.9 | 4.32 gitweb/gitweb.perl | 14.7 | 4.71 As you can see the greater ratio of lines in file to unique commits in blame output, the greater gain from the new implementation. Legend: [1] Number of lines: $ wc -l <file> [2] Number of unique commits in the blame output: $ git blame -p <file> | grep author-time | wc -l [3] Time for running "git blame -p" (user time, single run): $ time git blame -p <file> >/dev/null [4] Time to run gitweb as Perl script from command line: $ gitweb-run.sh "p=.git;a=blame;f=<file>" > /dev/null 2>&1 The gitweb-run.sh script includes slightly modified (with adjusted pathnames) code from gitweb_run() function from the test script t/t9500-gitweb-standalone-no-errors.sh; gitweb config file gitweb_config.perl contents (again up to adjusting pathnames; in particular $projectroot variable should point to top directory of git repository) can be found in the same place. Discussion ~~~~~~~~~~ A possible future improvement would be to open a bidi pipe to "git cat-file --batch-check", (like in Git::Repo in gitweb caching by Lea Wiemann), feed $long_rev^ to it, and parse its output, which is in the following form: 926b07e694599d86cec668475071b32147c95034 commit 637 This would mean one call to git-cat-file for the whole 'blame' view, instead of one call to git-rev-parse per each unique commit in blame output. Yet another solution would be to change use of validate_refname() to validate_revision() when checking script parameters (CGI query or path_info), with validate_revision being something like the following: sub validate_revision { my $rev = shift; return validate_refname(strip_rev_suffixes($rev)); } so we don't need to calculate $long_rev^, but can pass "$long_rev^" as 'hb' parameter. This solution has the advantage that it can be easily adapted to future incremental blame output. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Acked-by: Luben Tuikov <ltuikov@yahoo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-10gitweb: A bit of code cleanup in git_blame()Libravatar Jakub Narebski1-28/+39
Among others, here are the highlights: * move variable declaration closer to the place it is set and used, if possible, * uniquify and simplify coding style a bit, which includes removing unnecessary '()'. * check type only if $hash was defined, as otherwise from the way git_get_hash_by_path() is called (and works), we know that it is a blob, * use modern calling convention for git-blame, * remove unused variable, * don't use implicit variables ($_), * add some comments Signed-off-by: Jakub Narebski <jnareb@gmail.com> Acked-by: Luben Tuikov <ltuikov@yahoo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-09gitweb: Move 'lineno' id from link to row element in git_blameLibravatar Jakub Narebski1-2/+1
Move l<line number> ID from <a> link element inside table row (inside cell element for column with line numbers), to encompassing <tr> table row element. It was done to make it easier to manipulate result HTML with DOM, and to be able write 'blame_incremental' view with the same, or nearly the same result. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Acked-by: Luben Tuikov <ltuikov@yahoo.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-08gitweb: Fix bug in insert_file() subroutineLibravatar Jakub Narebski1-1/+1
In insert_file() subroutine (which is used to insert HTML fragments as custom header, footer, hometext (for projects list view), and per project README.html (for summary view)) we used: map(to_utf8, <$fd>); This doesn't work, and other form has to be used: map { to_utf8($_) } <$fd>; Now with test for t9600 added, for $GIT_DIR/README.html. Signed-off-by: Jakub Narebski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>