summaryrefslogtreecommitdiff
path: root/contrib/mw-to-git/git-remote-mediawiki.perl
AgeCommit message (Collapse)AuthorFilesLines
2017-11-08remote-mediawiki: show progress while fetching namespacesLibravatar Antoine Beaupré1-0/+1
Without this, the fetch process seems hanged while we fetch page listings across the namespaces. Obviously, it should be possible to silence this with -q, but that's an issue already present everywhere in the code and should be fixed separately: https://github.com/Git-Mediawiki/Git-Mediawiki/issues/30 Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08remote-mediawiki: process namespaces in orderLibravatar Antoine Beaupré1-1/+1
Ideally, we'd process them in numeric order since that is more logical, but we can't do that yet since this is where we find the numeric identifiers in the first place. Lexicographic order is a good compromise. Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08remote-mediawiki: support fetching from (Main) namespaceLibravatar Antoine Beaupré1-1/+6
When we specify a list of namespaces to fetch from, by default the MW API will not fetch from the default namespace, refered to as "(Main)" in the documentation: https://www.mediawiki.org/wiki/Manual:Namespace#Built-in_namespaces I haven't found a way to address that "(Main)" namespace when getting the namespace ids: indeed, when listing namespaces, there is no "canonical" field for the main namespace, although there is a "*" field that is set to "" (empty). So in theory, we could specify the empty namespace to get the main namespace, but that would make specifying namespaces harder for the user: we would need to teach users about the "empty" default namespace. It would also make the code more complicated: we'd need to parse quotes in the configuration. So we simply override the query here and allow the user to specify "(Main)" since that is the publicly documented name. Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08remote-mediawiki: skip virtual namespacesLibravatar Antoine Beaupré1-1/+4
Virtual namespaces do not correspond to pages in the database and are automatically generated by MediaWiki. It makes little sense, therefore, to fetch pages from those namespaces and the MW API doesn't support listing those pages. According to the documentation, those virtual namespaces are currently "Special" (-1) and "Media" (-2) but we treat all negative namespaces as "virtual" as a future-proofing mechanism. Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08remote-mediawiki: show known namespace choices on failureLibravatar Antoine Beaupré1-1/+2
If we fail to find a requested namespace, we should tell the user which ones we know about, since those were already fetched. This allows users to fetch all namespaces by specifying a dummy namespace, failing, then copying the list of namespaces in the config. Eventually, we should have a flag that allows fetching all namespaces automatically. Reviewed-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-07remote-mediawiki: allow fetching namespaces with spacesLibravatar Ingo Ruhnke1-0/+1
we still want to use spaces as separators in the config, but we should allow the user to specify namespaces with spaces, so we use underscore for this. Reviewed-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-07remote-mediawiki: add namespace supportLibravatar Kevin1-0/+25
This introduces a new remote.origin.namespaces argument that is a space-separated list of namespaces. The list of pages extract is then taken from all the specified namespaces. Reviewed-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Antoine Beaupré <anarcat@debian.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-27Spelling fixesLibravatar Ville Skyttä1-1/+1
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-11Spelling fixesLibravatar Ville Skyttä1-1/+1
<BAD> <CORRECTED> accidently accidentally commited committed dependancy dependency emtpy empty existance existence explicitely explicitly git-upload-achive git-upload-archive hierachy hierarchy indegee indegree intial initial mulitple multiple non-existant non-existent precendence. precedence. priviledged privileged programatically programmatically psuedo-binary pseudo-binary soemwhere somewhere successfull successful transfering transferring uncommited uncommitted unkown unknown usefull useful writting writing Signed-off-by: Ville Skyttä <ville.skytta@iki.fi> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-23git-remote-mediawiki: fix encoding issue for UTF-8 media filesLibravatar Matthieu Moy1-1/+6
When a media file contains valid UTF-8, git-remote-mediawiki tried to be too clever about the encoding, and the call to utf8::downgrade() on the downloaded content was failing with Wide character in subroutine entry at git-remote-mediawiki line 583. Instead, use $response->decode() to apply decoding linked to the Content-Encoding: header, and return the content without attempting any charset decoding. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-11-12contrib: typofixesLibravatar Masanari Iida1-1/+1
Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-24Merge branch 'maint'Libravatar Jonathan Nieder1-2/+12
* maint: git-remote-mediawiki: bugfix for pages w/ >500 revisions
2013-09-24git-remote-mediawiki: bugfix for pages w/ >500 revisionsLibravatar Benoit Person1-2/+12
Mediawiki introduces a new API for queries w/ more than 500 results in version 1.21. That change triggered an infinite loop while cloning a mediawiki with such a page. The latest API renamed and moved the "continuing" information in the response, necessary to build the next query. The code failed to retrieve that information but still detected that it was in a "continuing query". As a result, it launched the same query over and over again. If a "continuing" information is detected in the response (old or new), the next query is updated accordingly. If not, we quit assuming it's not a continuing query. Reported-by: Benjamin Cathey Signed-off-by: Benoit Person <benoit.person@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-03git-remote-mediawiki: no need to update private ref in non-dumb pushLibravatar Matthieu Moy1-1/+0
We used to update the private ref ourselves, but this update is now done by default since 664059fb (transport-helper: update remote helper namespace, 2013-04-17). Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-03git-remote-mediawiki: use no-private-update capability on dumb pushLibravatar Matthieu Moy1-0/+3
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-18Merge branch 'bp/mediawiki-preview'Libravatar Junio C Hamano1-72/+13
Add a command to allow previewing the contents locally before pushing it out, when working with a MediaWiki remote. I personally do not think this belongs to Git. If you are working on a set of AsciiDoc source files, you sure do want to locally format to preview what you will be pushing out, and if you are working on a set of C or Java source files, you do want to test it before pushing it out, too. That kind of thing belongs to your build script, not to your SCM. But I'll let it pass, as this is only a contrib/ thing. * bp/mediawiki-preview: git-remote-mediawiki: add preview subcommand into git mw git-remote-mediawiki: add git-mw command git-remote-mediawiki: factoring code between git-remote-mediawiki and Git::Mediawiki git-remote-mediawiki: update tests to run with the new bin-wrapper git-remote-mediawiki: add a git bin-wrapper for developement wrap-for-bin: make bin-wrappers chainable git-remote-mediawiki: introduction of Git::Mediawiki.pm
2013-07-08git-remote-mediawiki: factoring code between git-remote-mediawiki and ↵Libravatar Benoit Person1-72/+13
Git::Mediawiki For now, Git::Mediawiki contains nothing. This first patch moves some of git-remote-mediawiki.perl's factorisable code into Git::Mediawiki. In the same time, it removes the side effects of that code and renames the fucntions and constants moved to expose a better API. Signed-off-by: Benoit Person <benoit.person@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-03git-remote-mediawiki: un-brace file handles in binmode callsLibravatar Matthieu Moy1-2/+2
Commit e83d36b66fc turned "print STDOUT" into "print {*STDOUT}", as suggested by perlcritic. Unfortunately, it also changed two "binmode STDOUT" calls the same way, which does not work and yield a "Not a GLOB reference" error. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-23Merge branch 'cm/remote-mediawiki-perlcritique'Libravatar Junio C Hamano1-247/+290
* cm/remote-mediawiki-perlcritique: (31 commits) git-remote-mediawiki: make error message more precise git-remote-mediawiki: add a perlcritic rule in Makefile git-remote-mediawiki: add a .perlcriticrc file git-remote-mediawiki: clearly rewrite double dereference git-remote-mediawiki: fix a typo ("mediwiki" instead of "mediawiki") git-remote-mediawiki: put non-trivial numeric values in constants. git-remote-mediawiki: don't use quotes for empty strings git-remote-mediawiki: replace "unless" statements with negated "if" statements git-remote-mediawiki: brace file handles for print for more clarity git-remote-mediawiki: modify strings for a better coding-style git-remote-mediawiki: put long code into a subroutine git-remote-mediawiki: remove import of unused open2 git-remote-mediawiki: check return value of open git-remote-mediawiki: assign a variable as undef and make proper indentation git-remote-mediawiki: rename a variable ($last) which has the name of a keyword git-remote-mediawiki: remove unused variable $entry git-remote-mediawiki: turn double-negated expressions into simple expressions git-remote-mediawiki: change the name of a variable git-remote-mediawiki: add newline in the end of die() error messages git-remote-mediawiki: change style in a regexp ...
2013-06-20Merge branch 'cm/remote-mediawiki'Libravatar Junio C Hamano1-0/+15
* cm/remote-mediawiki: git-remote-mediawiki: display message when launched directly
2013-06-14git-remote-mediawiki: make error message more preciseLibravatar Célestin Matte1-3/+7
In subroutine parse_command, error messages were not correct. For the "import" function, having too much or incorrect arguments displayed both "invalid arguments", while it displayed "too many arguments" for the "option" functions under the same conditions. Separate the two error messages in both cases. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: clearly rewrite double dereferenceLibravatar Célestin Matte1-4/+4
@$var structures are re-written in the following way: @{$var} It makes them more readable. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: fix a typo ("mediwiki" instead of "mediawiki")Libravatar Célestin Matte1-2/+2
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: put non-trivial numeric values in constants.Libravatar Célestin Matte1-6/+14
Non-trivial numeric values (e.g., different from 0, 1 and 2) are placed in constants at the top of the code to be easily modifiable and to make more sense Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: don't use quotes for empty stringsLibravatar Célestin Matte1-8/+10
Empty strings are replaced by an $EMPTY constant. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: replace "unless" statements with negated "if" statementsLibravatar Célestin Matte1-6/+6
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: brace file handles for print for more clarityLibravatar Célestin Matte1-95/+95
This follows the following rule: InputOutput::RequireBracedFileHandleWithPrint (Severity: 1) The `print' and `printf' functions have a unique syntax that supports an optional file handle argument. Conway suggests wrapping this argument in braces to make it visually stand out from the other arguments. When you put braces around any of the special package-level file handles like `STDOUT', `STDERR', and `DATA', you must the `'*'' sigil or else it won't compile under `use strict 'subs''. print $FH "Mary had a little lamb\n"; #not ok print {$FH} "Mary had a little lamb\n"; #ok print STDERR $foo, $bar, $baz; #not ok print {STDERR} $foo, $bar, $baz; #won't compile under 'strict' print {*STDERR} $foo, $bar, $baz; #perfect! Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: modify strings for a better coding-styleLibravatar Célestin Matte1-120/+119
- strings which don't need interpolation are single-quoted for more clarity and slight gain of performance - interpolation is preferred over concatenation in many cases, for more clarity - variables are always used with the ${} operator inside strings - strings including double-quotes are written with qq() so that the quotes do not have to be escaped Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: put long code into a subroutineLibravatar Célestin Matte1-24/+32
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: remove import of unused open2Libravatar Célestin Matte1-1/+0
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: check return value of openLibravatar Célestin Matte1-1/+2
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: assign a variable as undef and make proper indentationLibravatar Célestin Matte1-1/+4
Explicitly assign local variable $/ as undef and make a proper one-instruction-by-line indentation Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: rename a variable ($last) which has the name of a keywordLibravatar Célestin Matte1-4/+4
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: remove unused variable $entryLibravatar Célestin Matte1-1/+0
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: turn double-negated expressions into simple expressionsLibravatar Célestin Matte1-4/+4
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: change the name of a variableLibravatar Célestin Matte1-3/+3
Local variable $url has the same name as a global variable. Changing the name of the local variable prevents future possible misunderstanding. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: add newline in the end of die() error messagesLibravatar Célestin Matte1-13/+13
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: change style in a regexpLibravatar Célestin Matte1-1/+1
Change '[\n]' to '\n': brackets are useless here. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: change style in a regexpLibravatar Célestin Matte1-1/+1
In this regexp, ' |\n' is used, whereas its equivalent '[ \n]', which is clearer, is used elsewhere. Make the style coherent. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: change separator of some regexpsLibravatar Célestin Matte1-3/+3
Use {}{} instead of /// when slashes are used inside the regexp so as not to escape it. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: change the behaviour of a splitLibravatar Célestin Matte1-1/+1
A "split ' '" is turned into a "split / /", which changes its behaviour: the old method matched a run of whitespaces (/\s*/), while the new one will match a single space, which is what we want here. Indeed, in other contexts, changing split(' ') to split(/ /) could potentially be a regression, however, here, when parsing the output of "rev-list --parents", whose output SHA-1's are each separated by a single space, splitting on a single space is perfectly correct. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: remove useless regexp modifier (m)Libravatar Célestin Matte1-3/+3
m// and // is used randomly. It is better to use the m modifier only when needed, e.g., when the regexp uses another separator than //. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: rewrite unclear line of instructionsLibravatar Célestin Matte1-1/+2
Subroutines' parameters should be assigned to variable before doing anything else Besides, existing instruction affected a variable inside a "if", which break Git's coding style Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: change syntax of map callsLibravatar Célestin Matte1-6/+8
Put first parameter of map inside a block, for better readability. Follow BuiltinFunctions::RequireBlockMap Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: move a variable declaration at the top of the codeLibravatar Célestin Matte1-3/+3
%basetimestamps declaration was lost in the middle of subroutines Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: always end a subroutine with a returnLibravatar Célestin Matte1-0/+18
Follow Subroutines::RequireFinalReturn Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: replace :utf8 by :encoding(UTF-8)Libravatar Célestin Matte1-3/+3
Follow perlcritic's InputOutput::RequireEncodingWithUTF8Layer policy Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: move "use warnings;" before any instructionLibravatar Célestin Matte1-2/+1
Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14git-remote-mediawiki: make a regexp clearerLibravatar Célestin Matte1-1/+1
Perl's split function takes a regex pattern argument. You can also feed it an expression, which is then compiled into a regex at runtime. It therefore works to pass your pattern via single quotes, but it is much less obvious to a reader that the argument is meant to be a regex, not a static string. Using the traditional slash-delimiters makes this easier to read. Signed-off-by: Célestin Matte <celestin.matte@ensimag.fr> Signed-off-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-14Merge branch 'bp/mediawiki-credential'Libravatar Junio C Hamano1-57/+9
The bridge to MediaWiki has been updated to use the credential helper interface in Git.pm, losing its own and the original implementation the former was based on. * bp/mediawiki-credential: git-remote-mediawiki: use Git.pm functions for credentials