523 Commits (d395433513029a5015130b7728da80045c23959b)
 

Author SHA1 Message Date
nemobis 31ea06ff86 Remove also wiki/[A-Z].+$
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@889 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 664fa18ea3 Remove sourceforge wikis and URLs with parameters to index.php
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@888 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 3624c02852 Remove Wikimedia Foundation wikis, other 'wikimedia' URLs cleanup
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@887 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ccfc95e9f4 Issue 59: Add first dirty list of possible MediaWiki sites
First passes of the script, now going on with all TLDs.
Just sorted and cleaned of biggest noises like mailing lists, github and stackoverflow.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@886 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 002a8d6702 Add first list of sourceforge wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@885 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz d5cf4fe592 Deleting things that have no usecase
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@884 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 7c4cb64859 Revert r874: breaks backwards compatibility, no usecase provided
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@883 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6efe406ea5 Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b940293136 Add more wikkii wikis from mutante's wikistats
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@881 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 0dd9c5f299 Follow up r874, forgot to uncomment accesskey and secretkey after testing
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@879 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 611d13f8c5 Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 64bd837cab (Issue 34) XML integry check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.

git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 5bed9cb28e Adding MediaWiki Export XML Schema Descriptions to track changes and add support accordingly
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@876 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 8680a55e24 Fix identifier name so that it can be more human-readable and backward compatible
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@874 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 79047a3ded (Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz a0726eed04 Rename revnum.txt to revnum.json and fix downloadurl from GitHub.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@871 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz e14fb0fe1b Adding revnum.txt, a file used by dumpgenerator.py version 2.0.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@870 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 26873ad495 Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 626118cfab Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz df1e7efafd Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis e1d4de3179 Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 5d34d9512a Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 82ba173739 Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis c6546ff935 Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 776038666f Issue 72: revert r857, just define everything in launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6113fa3340 Add delay to getPageTitlesScraper
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave

urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily

git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 9e1b13e173 Correct --help: format is --delay=5, not --delay:5
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@858 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6430ac5f47 Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@857 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 69b00964c8 Add a comment on delay
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@856 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ef7d527e86 Add some advice about editthis.info for usage via launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@855 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 4820339d10 Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@854 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 577e8034e6 ahem syntax
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@853 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6966ba5918 subprocess.check_output is not supported in python 2.6, let's not die
Yes, some are still on python 2.6; even 3.2.46-grbfs-kapsi it seems.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@852 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 7c94815e2c Issue 68: Use GET, not POST, to download images; some harm and no? good
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@851 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 0eb4913f89 Requires python 2.7 or more (for subprocess.check_output)
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@850 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 8ef08bbe28 Issue 67: launcher.py fails to 7z completed dumps (also get rid of some os.system)
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@849 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis d289c563b1 Link to https://code.google.com/p/wikiteam/wiki/NewTutorial#Download_a_list_of_wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@848 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ad5d468824 Add some documentation; would need a proper --help
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@846 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis e1b34b7f6b Fix whitespace
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@844 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 57e226c049 Use urllib2 and set user agent in some more places; some webhosts block urllib.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@842 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 414fb1988f Add a little more singular/plural support.
Along with this commit:
* Changed original %s to %d for number of edits, since it is recognised as an integer.
* Directly defined the number of edits to be 1 when the if condition is true, to optimise performance.

git-svn-id: https://wikiteam.googlecode.com/svn/trunk@841 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 13ebee4c28 Issue 60: Add authentication cookie support, patch by Fredrik Roubert
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@840 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b34f01646a Use getUserAgent in one more place, urllib is blocked by some
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@839 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 1a8e393c11 Google, not Gmail account
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@838 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 0150b93dcd Issue #64: Improve speed by some orders of magnitude
Patch by Betacommand, many thanks.

git-svn-id: https://wikiteam.googlecode.com/svn/trunk@837 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 8b4480be64 Adding plural support for number of edits saved for a page
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@831 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 5e826c272d Use recursive option for zip
The * does not scale, I get errors from sh that the arguments list is too long.

git-svn-id: https://wikiteam.googlecode.com/svn/trunk@830 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ed8d174d5a Issue #61: some skins hide that stuff, use meta tag generator
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@829 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp 5648586e31
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@826 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 9a5f904261 Issue #45, workaround: ignore OSError in the size check
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@825 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago