Emilio J. Rodríguez-Posada
3854a344fe
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
10 years ago
Emilio J. Rodríguez-Posada
1c1f0dbb86
replacing XML with JSON in image downloading
10 years ago
balr0g
481323c7f7
Don't try to download sites with disabled API
10 years ago
nemobis
1933db8a94
Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
10 years ago
balr0g
62be069026
Fix scraper for sites with Unicode titles
10 years ago
nemobis
62d961fa97
Fix typo, unused variable spotted by balrog
10 years ago
nemobis
95bc2dec38
Link GitHub issue tracker
10 years ago
balr0g
d60e560571
Add Content-Encoding: gzip support
10 years ago
Emilio J. Rodríguez-Posada
5261811fa4
only if api exists
10 years ago
Emilio J. Rodríguez-Posada
610764619a
add saveSiteInfo() to download meta=siteinfo data from API to a file
10 years ago
Emilio J. Rodríguez-Posada
d395433513
comments and newlines
10 years ago
Emilio J. Rodríguez-Posada
5eff4bd072
comments and tabs
10 years ago
Emilio J. Rodríguez-Posada
0b0c40f5da
adding more user-agents, but keeping the first as default for now
10 years ago
Emilio J. Rodríguez-Posada
81468c4a7c
using JSON to retrieve namespaces via API
10 years ago
Emilio J. Rodríguez-Posada
703eb9011b
improving checkAPI() using JSON properly loaded
10 years ago
Emilio J. Rodríguez-Posada
44d3fe1e36
Merge pull request #117 from nemobis/bug/48
Issue 48: Check that API actually works
10 years ago
Emilio J. Rodríguez-Posada
fc80556d8a
merging...
10 years ago
Emilio J. Rodríguez-Posada
f474deb71f
now we use JSON properly in getPageTitlesAPI(), instead of XML; fixing some wrong prints, now support utf-8
10 years ago
Federico Leva
997276110c
Issue 46: dumpgenerator should follow redirects
Patch by @balr0g from libsonic (GPLv3+).
10 years ago
Federico Leva
a8e1575879
Issue 48: Check that API actually works
10 years ago
Emilio J. Rodríguez-Posada
c9aa165504
fixing header with the new year, info and documentation link
10 years ago
nemobis
ac4c93c12a
Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
403dc213ef
Issue 71: English-only match for an older case
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@942 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
0ede45b7cf
Special:BadTitle works only in English wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
034866a32e
Handle permissions-errors for wikis requiring login or whatever
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6c69d9800f
Followup, delay needs config; should be BC
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
55185467e1
Add delay to all checking and listing functions, crappy hosts die on them
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6efe406ea5
Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
611d13f8c5
Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
64bd837cab
(Issue 34) XML integrity check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
79047a3ded
(Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
26873ad495
Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
626118cfab
Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
df1e7efafd
Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
e1d4de3179
Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
5d34d9512a
Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
82ba173739
Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
c6546ff935
Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
776038666f
Issue 72: revert r857, just define everything in launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6113fa3340
Add delay to getPageTitlesScraper
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave
urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
9e1b13e173
Correct --help: format is --delay=5, not --delay:5
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@858 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6430ac5f47
Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@857 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
ef7d527e86
Add some advice about editthis.info for usage via launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@855 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
4820339d10
Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@854 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
7c94815e2c
Issue 68: Use GET, not POST, to download images; some harm and no? good
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@851 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
e1b34b7f6b
Fix whitespace
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@844 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
57e226c049
Use urllib2 and set user agent in some more places; some webhosts block urllib.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@842 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
414fb1988f
Add a little more singular/plural support.
Along with this commit:
* Changed original %s to %d for number of edits, since it is recognised as an integer.
* Directly defined the number of edits to be 1 when the if condition is true, to optimise performance.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@841 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
13ebee4c28
Issue 60: Add authentication cookie support, patch by Fredrik Roubert
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@840 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
b34f01646a
Use getUserAgent in one more place, urllib is blocked by some
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@839 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago