315 Commits (80288cf49e42f90d6017cb939d5c190be8dc3811)

Author SHA1 Message Date
balr0g fd6ea19b4b config['api'] is set but empty; properly handle this 10 years ago
nemobis 1ff96238eb Denote as alpha until revamp is tested
Per emijrp who asked not to run dumps with this, at https://github.com/WikiTeam/wikiteam/issues/104#issuecomment-48039143
Currently proposed things to fix or check: https://github.com/WikiTeam/wikiteam/issues?milestone=1&state=open
10 years ago
Emilio J. Rodríguez-Posada 89e3c3e462 standarize getImage* functions names 10 years ago
Emilio J. Rodríguez-Posada aaa1822759 improving image list downloader 10 years ago
Emilio J. Rodríguez-Posada 88c9468c0e improving image list downloader 10 years ago
balr0g 3929e4eb9c Cleanups and error fixes suggested by flake8 (pep8 + pyflakes) 10 years ago
Emilio J. Rodríguez-Posada c07b527e5d adding session to getWikiEngine() 10 years ago
Emilio J. Rodríguez-Posada 30c153ce1f chg: using 'with open' for files 10 years ago
balr0g 9aa3c4a0e1 Removed all traces of urllib except for encode/decode; more bugs fixed. 10 years ago
balr0g c8e11a949b Initial port to Requests 10 years ago
Emilio J. Rodríguez-Posada 9553e3550c adding wiki engine detector 10 years ago
Emilio J. Rodríguez-Posada eb97cf1adf version 0.2.2 and tiny bits in --help 10 years ago
balr0g 50b011f90d Initial port to argparse 10 years ago
Emilio J. Rodríguez-Posada 568deef081 adding comments for clarification 10 years ago
Emilio J. Rodríguez-Posada d4eed1f738 fixing #127 and #134 , now works with APIs that returns 'name' field for images and those that don't do it (in this case we unquote over ascii); also fixing bug that re-download image list when it was completed previously 10 years ago
Emilio J. Rodríguez-Posada 005de23c1d adding gzip to siteinfo downloader 10 years ago
Emilio J. Rodríguez-Posada d79ea64d41 fixing issue #97 pretty siteinfo json saving, indenting 4 chars 10 years ago
Emilio J. Rodríguez-Posada 3854a344fe Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada 1c1f0dbb86 replacing XML with JSON in image downloading 10 years ago
balr0g 481323c7f7 Don't try to download sites with disabled API 10 years ago
nemobis 1933db8a94 Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
10 years ago
balr0g 62be069026 Fix scraper for sites with Unicode titles 10 years ago
nemobis 62d961fa97 Fix typo, unused variable spotted by balrog 10 years ago
nemobis 95bc2dec38 Link GitHub issue tracker 10 years ago
balr0g d60e560571 Add Content-Encoding: gzip support 10 years ago
Emilio J. Rodríguez-Posada 5261811fa4 only if api exists 10 years ago
Emilio J. Rodríguez-Posada 610764619a add saveSiteInfo() to download meta=siteinfo data from API to a file 10 years ago
Emilio J. Rodríguez-Posada d395433513 comments and newlines 10 years ago
Emilio J. Rodríguez-Posada 5eff4bd072 comments and tabs 10 years ago
Emilio J. Rodríguez-Posada 0b0c40f5da adding more user-agents, but keeps the first as default by now 10 years ago
Emilio J. Rodríguez-Posada 81468c4a7c using JSON to retrieve namespaces via API 10 years ago
Emilio J. Rodríguez-Posada 703eb9011b improving checkAPI() using JSON properly loaded 10 years ago
Emilio J. Rodríguez-Posada 44d3fe1e36 Merge pull request #117 from nemobis/bug/48
Issue 48: Check that API actually works
10 years ago
Emilio J. Rodríguez-Posada fc80556d8a merging... 10 years ago
Emilio J. Rodríguez-Posada f474deb71f now we use JSON properly in getPageTitlesAPI(), instead of XML; fixing some wrong prints, now support utf-8 10 years ago
Federico Leva 997276110c Issue 46: dumpgenerator should follow redirects
Patch by @balr0g from libsonic (GPLv3+).
10 years ago
Federico Leva a8e1575879 Issue 48: Check that API actually works 10 years ago
Emilio J. Rodríguez-Posada c9aa165504 fixing header with the new year, info and documentation link 10 years ago
nemobis ac4c93c12a Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 403dc213ef Issue 71: English-only match for an older case
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@942 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 0ede45b7cf Special:BadTitle works only in English wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 034866a32e Handle permissions-errors for wikis requiring login or whatever
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6c69d9800f Followup, delay needs config; should be BC
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 55185467e1 Add delay to all checking and listing functions, crappy hosts die on them
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6efe406ea5 Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 611d13f8c5 Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 64bd837cab (Issue 34) XML integry check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 79047a3ded (Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 26873ad495 Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 626118cfab Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz df1e7efafd Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis e1d4de3179 Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 5d34d9512a Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 82ba173739 Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis c6546ff935 Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 776038666f Issue 72: revert r857, just define everything in launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6113fa3340 Add delay to getPageTitlesScraper
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave

urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 9e1b13e173 Correct --help: format is --delay=5, not --delay:5
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@858 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6430ac5f47 Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@857 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ef7d527e86 Add some advice about editthis.info for usage via launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@855 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 4820339d10 Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@854 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 7c94815e2c Issue 68: Use GET, not POST, to download images; some harm and no? good
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@851 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis e1b34b7f6b Fix whitespace
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@844 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 57e226c049 Use urllib2 and set user agent in some more places; some webhosts block urllib.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@842 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 414fb1988f Add a little more singular/plural support.
Along with this commit:
* Changed original %s to %d for number of edits, since it is recognised as an integer.
* Directly defined the number of edits to be 1 when the if condition is true, to optimise performance.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@841 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 13ebee4c28 Issue 60: Add authentication cookie support, patch by Fredrik Roubert
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@840 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b34f01646a Use getUserAgent in one more place, urllib is blocked by some
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@839 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 8b4480be64 Adding plural support for number of edits saved for a page
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@831 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ed8d174d5a Issue #61: some skins hide that stuff, use meta tag generator
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@829 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp f1874656ed comments
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@818 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp af81adebeb comments
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@817 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 424588a55c comments
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@816 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 360d1c1fa1 fixing createnewdump() and resumepreviousdump()
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@815 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 8295990df0 moving code to functions; tiny changes in comments
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@814 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 79a310c470 tiny changes in comments, some clarifications
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@813 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 0d7d73d420 ignore case
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@812 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 5dddc9c502 tiny changes in comments, some clarifications
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@811 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 73bdc5a5dc caps
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@810 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 5f4d10dd5a 2013
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@809 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp a78f390557 fixing issue 56
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@806 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis 0080197b8f WARNING: API support is experimental (faster but broken).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@805 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
Hydriz 240b9f9dac Add support for Wikidata and Wikivoyage.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@804 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp a850194829 rewriting the inline help a bit
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@798 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp a57a490513 fixing file description bug
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@796 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 36ea489313 fixing file description bug
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@795 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp e40eb9b826 disable check integrity function by now
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@758 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp c8ca525ff7 just adding documentation and comments
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@743 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp d616bcc2e6 just adding documentation and comments
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@742 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
Hydriz bbffb24a0c (Issue 34) XML integry check inside the code
An attempt to integrate a XML integrity checker into the script. If
the dump integrity check fails, the script will be redumped from the
start.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@710 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis 7e083de24b Partial revert of r675: the script was always downloading only the last revision, as reported by Hydriz... *facepalms*
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@709 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis 9f58ca6a9b Fix Issue 26: Error loop "XML for ... is wrong"
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@675 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
nemobis a8479f9936 Get namespaces and images info from API. Code badly copied around, we should probably use modules such as http://packages.python.org/simplemediawiki/ to use API. Tested, seems to work.
Fixed some typos.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@668 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp c73a535ea7 adding &templates param for Special:Export, useful when other software use getXML() function to export wiki pages, e.g.: WikiEvidens
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@610 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
nemobis 721eb8c596 Added link to api.php or index.php being checked, so that if you called the script via another script you can understand something.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@588 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
nemobis fac6c33846 google, not gmail; upload docs
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@572 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
Hydriz b680b07709 Fixing sentence to avoid having 2 commas too close to each other
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@506 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 7e0ebdbff4 link to doc
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@461 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 4a5e1bc880 converting api check and indexphp check to fuctions
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@395 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp a5655e7885 notebook, simple menu
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@364 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 8e03c6ccf3 years
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@344 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 0c40b02aac new version caring about long filenames and weird chars in url
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@343 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 15189ce31a header
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@263 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 3c43b3cde3 fixing issue #39 wikia images error
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@261 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp d408c06b77 randomtitle = 'Main_Page' #previously AMF5LKE43MNFGHKSDMRTJ
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@225 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 76897b09b2 retries for image list retrieve;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@224 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 31ed2aff33 moving some more GET requests to POST;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@223 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 60f7fa88a1 \n
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@222 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp e933c780b5 doc
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@221 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp fb4aa9e9dd doc
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@220 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp d1b2c2cd96 doc
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@219 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 2caf754ef4 moving getImageFilenamesURL() from GET to POST; some comments;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@218 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 0746f49db0 moving some GET requests to POST: index.php check and namespaces get
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@217 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp a865564713 .lower()
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@216 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp e61f1ef974 doc
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@215 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 6c42144f89 adding --force parameter to skip wiki[mp]edia projects alert
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@214 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 13dfc9be88 adding empty docstrings;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@200 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp c489de6533 issue #14
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@197 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp df3bd03336 prints
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@194 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp b25bf51a59 prints
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@193 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 42640f3781 image limit errors improved
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@192 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 4a0fb86751 index.php detector improved
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@191 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 778a8ad7ae adding support to download images on old mediawikis; regexp4;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@165 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 5b7674edb7 params in main() to call it from external scripts (using import dumpgenerator; dumpgenerator.main(params=params))
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@161 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp e4d0de09d6 params in main() to call it from external scripts (using import dumpgenerator; dumpgenerator.main(params=params))
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@160 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp da6c4d1468 print error output when splitters error occurs; 5000 -> 500 in image list parser
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@158 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp c5b6b8a866 md5.new->md5; new uploaded wikis;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@157 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 835158b79b more wikis; md5 or hashlib
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@154 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp c7b511ee38 more wikis; new content spliter;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@152 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 9a20327939 undoHTMLEntities for titles when scrapped from Allpages; protocol http https;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@150 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp af64c6b6c5 http or https
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@149 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp e9f379b888 fixing print;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@147 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp b26a86cdd5 verbose false for imagedump;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@144 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 17235cffd9 better comments;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@140 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 6e45398878 when full history fails, retrieve only the last version; various server errors handled
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@139 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp e4b233cc37 print verbose; seconds numbers to variables;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@138 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 7b85e243b0 :
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@137 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 3c49a30764 fixing issue #12 and issue #13;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@136 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 3388c7e83b print
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@135 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 101cd62e3d print
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@134 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 46303d780d fixing issue #11;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@133 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 1f91b4c63e fixing issue #11;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@132 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp e13b4a6428 removing http/https for file prefixes;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@126 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 79f961aef5 more wikis;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@122 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp d6c1a773a0 removing some old #fix comments; moving TODO to Issues section at Google Code;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@119 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 88cbc0e871 removing some old #fix comments;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@118 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 23c9a06a31 removing some old #fix comments;removing thread option not implemented and better not to be done;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@117 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 05d1fb97c4 sorting titles; fixing issue #9;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@116 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 200baf406c fixing ampersand & and ' issues while retrieving XML (issue #2)
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@91 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 727e4dbd7c pause and reload xml when no </siteinfo> is found
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@90 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago
emijrp 6cb2575b1b empty lines
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@85 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
14 years ago