balr0g
fd6ea19b4b
config['api'] is set but empty; properly handle this
10 years ago
nemobis
1ff96238eb
Denote as alpha until revamp is tested
...
Per emijrp who asked not to run dumps with this, at https://github.com/WikiTeam/wikiteam/issues/104#issuecomment-48039143
Currently proposed things to fix or check: https://github.com/WikiTeam/wikiteam/issues?milestone=1&state=open
10 years ago
Emilio J. Rodríguez-Posada
89e3c3e462
standardize getImage* function names
10 years ago
Emilio J. Rodríguez-Posada
aaa1822759
improving image list downloader
10 years ago
Emilio J. Rodríguez-Posada
88c9468c0e
improving image list downloader
10 years ago
balr0g
3929e4eb9c
Cleanups and error fixes suggested by flake8 (pep8 + pyflakes)
10 years ago
Emilio J. Rodríguez-Posada
c07b527e5d
adding session to getWikiEngine()
10 years ago
Emilio J. Rodríguez-Posada
30c153ce1f
chg: using 'with open' for files
10 years ago
balr0g
9aa3c4a0e1
Removed all traces of urllib except for encode/decode; more bugs fixed.
10 years ago
balr0g
c8e11a949b
Initial port to Requests
10 years ago
Emilio J. Rodríguez-Posada
9553e3550c
adding wiki engine detector
10 years ago
Emilio J. Rodríguez-Posada
eb97cf1adf
version 0.2.2 and tiny bits in --help
10 years ago
balr0g
50b011f90d
Initial port to argparse
10 years ago
Emilio J. Rodríguez-Posada
568deef081
adding comments for clarification
10 years ago
Emilio J. Rodríguez-Posada
d4eed1f738
fixing #127 and #134: now works with APIs that return a 'name' field for images and those that don't (in this case we unquote over ascii); also fixing bug that re-downloaded the image list when it was completed previously
10 years ago
Emilio J. Rodríguez-Posada
005de23c1d
adding gzip to siteinfo downloader
10 years ago
Emilio J. Rodríguez-Posada
d79ea64d41
fixing issue #97 pretty siteinfo json saving, indenting 4 chars
10 years ago
Emilio J. Rodríguez-Posada
3854a344fe
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
10 years ago
Emilio J. Rodríguez-Posada
1c1f0dbb86
replacing XML with JSON in image downloading
10 years ago
balr0g
481323c7f7
Don't try to download sites with disabled API
10 years ago
nemobis
1933db8a94
Merge pull request #124 from balr0g/scraper-unicode-title-fix
...
Fix scraper for sites with Unicode titles
10 years ago
balr0g
62be069026
Fix scraper for sites with Unicode titles
10 years ago
nemobis
62d961fa97
Fix typo, unused variable spotted by balrog
10 years ago
nemobis
95bc2dec38
Link GitHub issue tracker
10 years ago
balr0g
d60e560571
Add Content-Encoding: gzip support
10 years ago
Emilio J. Rodríguez-Posada
5261811fa4
only if api exists
10 years ago
Emilio J. Rodríguez-Posada
610764619a
add saveSiteInfo() to download meta=siteinfo data from API to a file
10 years ago
Emilio J. Rodríguez-Posada
d395433513
comments and newlines
10 years ago
Emilio J. Rodríguez-Posada
5eff4bd072
comments and tabs
10 years ago
Emilio J. Rodríguez-Posada
0b0c40f5da
adding more user-agents, but keeping the first as default for now
10 years ago
Emilio J. Rodríguez-Posada
81468c4a7c
using JSON to retrieve namespaces via API
10 years ago
Emilio J. Rodríguez-Posada
703eb9011b
improving checkAPI() using JSON properly loaded
10 years ago
Emilio J. Rodríguez-Posada
44d3fe1e36
Merge pull request #117 from nemobis/bug/48
...
Issue 48: Check that API actually works
10 years ago
Emilio J. Rodríguez-Posada
fc80556d8a
merging...
10 years ago
Emilio J. Rodríguez-Posada
f474deb71f
now we use JSON properly in getPageTitlesAPI(), instead of XML; fixing some wrong prints, now support utf-8
10 years ago
Federico Leva
997276110c
Issue 46: dumpgenerator should follow redirects
...
Patch by @balr0g from libsonic (GPLv3+).
10 years ago
Federico Leva
a8e1575879
Issue 48: Check that API actually works
10 years ago
Emilio J. Rodríguez-Posada
c9aa165504
fixing header with the new year, info and documentation link
10 years ago
nemobis
ac4c93c12a
Issue 85: more cross-platform shebang on all scripts
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
403dc213ef
Issue 71: English-only match for an older case
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@942 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
0ede45b7cf
Special:BadTitle works only in English wikis
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
034866a32e
Handle permissions-errors for wikis requiring login or whatever
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6c69d9800f
Followup, delay needs config; should be BC
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
55185467e1
Add delay to all checking and listing functions, crappy hosts die on them
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6efe406ea5
Followup r877, first check most common conditions for shortcut performance
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
611d13f8c5
Follow up r877, check the number of revision tags
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
64bd837cab
(Issue 34) XML integrity check inside the code
...
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
79047a3ded
(Issue 71) Use a better check for private wikis
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
26873ad495
Fix typo, make domain2prefix quiet again
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
626118cfab
Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
df1e7efafd
Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
e1d4de3179
Uncomment appended index.php for guess in most configurations
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
5d34d9512a
Needs to be non-matching group
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
82ba173739
Issue 22: allimages now uses aicontinue, not aifrom
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
c6546ff935
Issue 71: Don't try to download private wikis, first workaround
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
776038666f
Issue 72: revert r857, just define everything in launcher.py
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6113fa3340
Add delay to getPageTitlesScraper
...
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave
urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
9e1b13e173
Correct --help: format is --delay=5, not --delay:5
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@858 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6430ac5f47
Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@857 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
ef7d527e86
Add some advice about editthis.info for usage via launcher.py
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@855 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
4820339d10
Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@854 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
7c94815e2c
Issue 68: Use GET, not POST, to download images; some harm and no? good
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@851 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
e1b34b7f6b
Fix whitespace
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@844 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
57e226c049
Use urllib2 and set user agent in some more places; some webhosts block urllib.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@842 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
414fb1988f
Add a little more singular/plural support.
...
Along with this commit:
* Changed original %s to %d for number of edits, since it is recognised as an integer.
* Directly defined the number of edits to be 1 when the if condition is true, to optimise performance.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@841 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
13ebee4c28
Issue 60: Add authentication cookie support, patch by Fredrik Roubert
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@840 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
b34f01646a
Use getUserAgent in one more place, urllib is blocked by some
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@839 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
8b4480be64
Adding plural support for number of edits saved for a page
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@831 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
ed8d174d5a
Issue #61 : some skins hide that stuff, use meta tag generator
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@829 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
f1874656ed
comments
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@818 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
af81adebeb
comments
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@817 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
424588a55c
comments
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@816 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
360d1c1fa1
fixing createnewdump() and resumepreviousdump()
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@815 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
8295990df0
moving code to functions; tiny changes in comments
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@814 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
79a310c470
tiny changes in comments, some clarifications
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@813 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
0d7d73d420
ignore case
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@812 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
5dddc9c502
tiny changes in comments, some clarifications
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@811 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
73bdc5a5dc
caps
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@810 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
5f4d10dd5a
2013
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@809 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
a78f390557
fixing issue 56
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@806 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis
0080197b8f
WARNING: API support is experimental (faster but broken).
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@805 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
Hydriz
240b9f9dac
Add support for Wikidata and Wikivoyage.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@804 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
a850194829
rewriting the inline help a bit
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@798 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
a57a490513
fixing file description bug
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@796 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
36ea489313
fixing file description bug
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@795 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
e40eb9b826
disable check integrity function for now
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@758 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
c8ca525ff7
just adding documentation and comments
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@743 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp
d616bcc2e6
just adding documentation and comments
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@742 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
Hydriz
bbffb24a0c
(Issue 34) XML integrity check inside the code
...
An attempt to integrate an XML integrity checker into the script. If
the dump integrity check fails, the wiki will be redumped from the
start.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@710 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis
7e083de24b
Partial revert of r675: the script was always downloading only the last revision, as reported by Hydriz... *facepalms*
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@709 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis
9f58ca6a9b
Fix Issue 26: Error loop "XML for ... is wrong"
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@675 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
nemobis
a8479f9936
Get namespaces and images info from API. Code badly copied around, we should probably use modules such as http://packages.python.org/simplemediawiki/ to use API. Tested, seems to work.
...
Fixed some typos.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@668 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
c73a535ea7
adding &templates param for Special:Export, useful when other software use getXML() function to export wiki pages, e.g.: WikiEvidens
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@610 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
nemobis
721eb8c596
Added link to api.php or index.php being checked, so that if you called the script via another script you can understand something.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@588 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
nemobis
fac6c33846
google, not gmail; upload docs
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@572 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
Hydriz
b680b07709
Fixing sentence to avoid having 2 commas too close to each other
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@506 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
7e0ebdbff4
link to doc
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@461 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
4a5e1bc880
converting api check and indexphp check to functions
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@395 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
a5655e7885
notebook, simple menu
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@364 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
8e03c6ccf3
years
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@344 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
0c40b02aac
new version caring about long filenames and weird chars in url
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@343 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
15189ce31a
header
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@263 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
3c43b3cde3
fixing issue #39 wikia images error
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@261 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
d408c06b77
randomtitle = 'Main_Page' #previously AMF5LKE43MNFGHKSDMRTJ
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@225 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
76897b09b2
retries for image list retrieve;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@224 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
31ed2aff33
moving some more GET requests to POST;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@223 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
60f7fa88a1
\n
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@222 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
e933c780b5
doc
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@221 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
fb4aa9e9dd
doc
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@220 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
d1b2c2cd96
doc
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@219 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
2caf754ef4
moving getImageFilenamesURL() from GET to POST; some comments;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@218 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
0746f49db0
moving some GET requests to POST: index.php check and namespaces get
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@217 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
a865564713
.lower()
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@216 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
e61f1ef974
doc
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@215 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
6c42144f89
adding --force parameter to skip wiki[mp]edia projects alert
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@214 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
13dfc9be88
adding empty docstrings;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@200 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
c489de6533
issue #14
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@197 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
df3bd03336
prints
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@194 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
b25bf51a59
prints
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@193 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
42640f3781
image limit errors improved
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@192 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
4a0fb86751
index.php detector improved
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@191 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
778a8ad7ae
adding support to download images on old mediawikis; regexp4;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@165 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
5b7674edb7
params in main() to call it from external scripts (using import dumpgenerator; dumpgenerator.main(params=params))
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@161 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
e4d0de09d6
params in main() to call it from external scripts (using import dumpgenerator; dumpgenerator.main(params=params))
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@160 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp
da6c4d1468
print error output when splitters error occurs; 5000 -> 500 in image list parser
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@158 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
c5b6b8a866
md5.new->md5; new uploaded wikis;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@157 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
835158b79b
more wikis; md5 or hashlib
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@154 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
c7b511ee38
more wikis; new content splitter;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@152 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
9a20327939
undoHTMLEntities for titles when scraped from Allpages; protocol http https;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@150 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
af64c6b6c5
http or https
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@149 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
e9f379b888
fixing print;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@147 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
b26a86cdd5
verbose false for imagedump;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@144 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
17235cffd9
better comments;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@140 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
6e45398878
when full history fails, retrieve only the last version; various server errors handled
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@139 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
e4b233cc37
print verbose; seconds numbers to variables;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@138 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
7b85e243b0
:
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@137 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
3c49a30764
fixing issue #12 and issue #13 ;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@136 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
3388c7e83b
print
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@135 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
101cd62e3d
print
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@134 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
46303d780d
fixing issue #11 ;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@133 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
1f91b4c63e
fixing issue #11 ;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@132 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
e13b4a6428
removing http/https for file prefixes;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@126 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
79f961aef5
more wikis;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@122 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
d6c1a773a0
removing some old #fix comments; moving TODO to Issues section at Google Code;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@119 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
88cbc0e871
removing some old #fix comments;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@118 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
23c9a06a31
removing some old #fix comments; removing thread option not implemented and better not to be done;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@117 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
05d1fb97c4
sorting titles; fixing issue #9 ;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@116 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
200baf406c
fixing ampersand & and ' issues while retrieving XML (issue #2 )
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@91 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
727e4dbd7c
pause and reload xml when no </siteinfo> is found
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@90 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp
6cb2575b1b
empty lines
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@85 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago