Emilio J. Rodríguez-Posada
3854a344fe
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
2014-06-29 10:01:25 +02:00
Emilio J. Rodríguez-Posada
1c1f0dbb86
replacing XML with JSON in image downloading
2014-06-29 10:01:09 +02:00
balr0g
481323c7f7
Don't try to download sites with disabled API
2014-06-27 15:19:54 -04:00
nemobis
1933db8a94
Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
2014-06-27 18:06:51 +02:00
balr0g
62be069026
Fix scraper for sites with Unicode titles
2014-06-27 11:50:59 -04:00
nemobis
62d961fa97
Fix typo, unused variable spotted by balrog
2014-06-27 17:33:29 +02:00
nemobis
95bc2dec38
Link GitHub issue tracker
2014-06-27 15:25:28 +02:00
balr0g
d60e560571
Add Content-Encoding: gzip support
2014-06-27 09:02:35 -04:00
Emilio J. Rodríguez-Posada
5261811fa4
only if api exists
2014-06-26 10:39:45 +02:00
Emilio J. Rodríguez-Posada
610764619a
add saveSiteInfo() to download meta=siteinfo data from API to a file
2014-06-26 10:38:59 +02:00
Emilio J. Rodríguez-Posada
d395433513
comments and newlines
2014-06-26 10:26:57 +02:00
Emilio J. Rodríguez-Posada
5eff4bd072
comments and tabs
2014-06-26 10:17:00 +02:00
Emilio J. Rodríguez-Posada
0b0c40f5da
adding more user-agents, but keeping the first as the default for now
2014-06-26 10:13:59 +02:00
Emilio J. Rodríguez-Posada
81468c4a7c
using JSON to retrieve namespaces via API
2014-06-25 23:54:05 +02:00
Emilio J. Rodríguez-Posada
703eb9011b
improving checkAPI() using JSON properly loaded
2014-06-25 23:17:39 +02:00
Emilio J. Rodríguez-Posada
44d3fe1e36
Merge pull request #117 from nemobis/bug/48
Issue 48: Check that API actually works
2014-06-25 23:12:29 +02:00
Emilio J. Rodríguez-Posada
fc80556d8a
merging...
2014-06-25 22:57:56 +02:00
Emilio J. Rodríguez-Posada
f474deb71f
now we use JSON properly in getPageTitlesAPI(), instead of XML; fixing some wrong prints, now supports UTF-8
2014-06-25 22:53:46 +02:00
Federico Leva
997276110c
Issue 46: dumpgenerator should follow redirects
Patch by @balr0g from libsonic (GPLv3+).
2014-06-25 21:33:02 +02:00
Federico Leva
a8e1575879
Issue 48: Check that API actually works
2014-06-25 20:35:36 +02:00
Emilio J. Rodríguez-Posada
c9aa165504
fixing header with the new year, info and documentation link
2014-06-25 18:08:33 +02:00
nemobis
ac4c93c12a
Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-02-26 23:22:53 +00:00
nemobis
403dc213ef
Issue 71: English-only match for an older case
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@942 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-02-02 23:06:10 +00:00
nemobis
0ede45b7cf
Special:BadTitle works only in English wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-26 17:26:20 +00:00
nemobis
034866a32e
Handle permissions-errors for wikis requiring login or whatever
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-26 17:21:46 +00:00
nemobis
6c69d9800f
Followup, delay needs config; should be BC
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-25 22:29:42 +00:00
nemobis
55185467e1
Add delay to all checking and listing functions, crappy hosts die on them
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-23 16:05:19 +00:00
nemobis
6efe406ea5
Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-17 11:18:42 +00:00
Hydriz
611d13f8c5
Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-16 14:01:24 +00:00
Hydriz
64bd837cab
(Issue 34) XML integrity check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-16 13:39:49 +00:00
Hydriz
79047a3ded
(Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-15 07:43:37 +00:00
nemobis
26873ad495
Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-11 07:33:25 +00:00
nemobis
626118cfab
Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-10 09:17:06 +00:00
Hydriz
df1e7efafd
Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-10 02:47:18 +00:00
nemobis
e1d4de3179
Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-08 22:20:21 +00:00
nemobis
5d34d9512a
Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-08 22:01:36 +00:00
nemobis
82ba173739
Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-08 21:57:04 +00:00
nemobis
c6546ff935
Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-07 22:59:28 +00:00
nemobis
776038666f
Issue 72: revert r857, just define everything in launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-07 12:24:50 +00:00
nemobis
6113fa3340
Add delay to getPageTitlesScraper
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave
urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-06 08:21:57 +00:00
nemobis
9e1b13e173
Correct --help: format is --delay=5, not --delay:5
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@858 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-06 08:13:07 +00:00
nemobis
6430ac5f47
Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@857 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-06 07:49:12 +00:00
nemobis
ef7d527e86
Add some advice about editthis.info for usage via launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@855 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-05 18:11:23 +00:00
nemobis
4820339d10
Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@854 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-05 17:10:44 +00:00
nemobis
7c94815e2c
Issue 68: Use GET, not POST, to download images; some harm and no? good
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@851 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-01 23:07:27 +00:00
nemobis
e1b34b7f6b
Fix whitespace
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@844 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-10-14 19:10:10 +00:00
nemobis
57e226c049
Use urllib2 and set user agent in some more places; some webhosts block urllib.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@842 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-10-14 09:27:30 +00:00
Hydriz
414fb1988f
Add a little more singular/plural support.
Along with this commit:
* Changed original %s to %d for number of edits, since it is recognised as an integer.
* Directly defined the number of edits to be 1 when the if condition is true, to optimise performance.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@841 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-10-13 09:35:48 +00:00
nemobis
13ebee4c28
Issue 60: Add authentication cookie support, patch by Fredrik Roubert
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@840 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-10-09 20:11:01 +00:00
nemobis
b34f01646a
Use getUserAgent in one more place, urllib is blocked by some
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@839 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-10-09 19:48:02 +00:00