mirror of https://github.com/WikiTeam/wikiteam synced 2024-11-10 13:10:27 +00:00
Commit Graph

204 Commits

Author SHA1 Message Date
Emilio J. Rodríguez-Posada
eb97cf1adf version 0.2.2 and tiny bits in --help 2014-07-03 16:02:11 +02:00
balr0g
50b011f90d Initial port to argparse 2014-06-30 19:41:03 -04:00
Emilio J. Rodríguez-Posada
568deef081 adding comments for clarification 2014-06-30 20:09:36 +02:00
Emilio J. Rodríguez-Posada
d4eed1f738 fixing #127 and #134 , now works with APIs that returns 'name' field for images and those that don't do it (in this case we unquote over ascii); also fixing bug that re-download image list when it was completed previously 2014-06-30 20:03:32 +02:00
Emilio J. Rodríguez-Posada
005de23c1d adding gzip to siteinfo downloader 2014-06-29 13:26:06 +02:00
Emilio J. Rodríguez-Posada
d79ea64d41 fixing issue #97 pretty siteinfo json saving, indenting 4 chars 2014-06-29 10:46:26 +02:00
Emilio J. Rodríguez-Posada
3854a344fe Merge branch 'master' of https://github.com/WikiTeam/wikiteam 2014-06-29 10:01:25 +02:00
Emilio J. Rodríguez-Posada
1c1f0dbb86 replacing XML with JSON in image downloading 2014-06-29 10:01:09 +02:00
balr0g
481323c7f7 Don't try to download sites with disabled API 2014-06-27 15:19:54 -04:00
nemobis
1933db8a94 Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
2014-06-27 18:06:51 +02:00
balr0g
62be069026 Fix scraper for sites with Unicode titles 2014-06-27 11:50:59 -04:00
nemobis
62d961fa97 Fix typo, unused variable spotted by balrog 2014-06-27 17:33:29 +02:00
nemobis
95bc2dec38 Link GitHub issue tracker 2014-06-27 15:25:28 +02:00
balr0g
d60e560571 Add Content-Encoding: gzip support 2014-06-27 09:02:35 -04:00
Emilio J. Rodríguez-Posada
5261811fa4 only if api exists 2014-06-26 10:39:45 +02:00
Emilio J. Rodríguez-Posada
610764619a add saveSiteInfo() to download meta=siteinfo data from API to a file 2014-06-26 10:38:59 +02:00
Emilio J. Rodríguez-Posada
d395433513 comments and newlines 2014-06-26 10:26:57 +02:00
Emilio J. Rodríguez-Posada
5eff4bd072 comments and tabs 2014-06-26 10:17:00 +02:00
Emilio J. Rodríguez-Posada
0b0c40f5da adding more user-agents, but keeps the first as default by now 2014-06-26 10:13:59 +02:00
Emilio J. Rodríguez-Posada
81468c4a7c using JSON to retrieve namespaces via API 2014-06-25 23:54:05 +02:00
Emilio J. Rodríguez-Posada
703eb9011b improving checkAPI() using JSON properly loaded 2014-06-25 23:17:39 +02:00
Emilio J. Rodríguez-Posada
44d3fe1e36 Merge pull request #117 from nemobis/bug/48
Issue 48: Check that API actually works
2014-06-25 23:12:29 +02:00
Emilio J. Rodríguez-Posada
fc80556d8a merging... 2014-06-25 22:57:56 +02:00
Emilio J. Rodríguez-Posada
f474deb71f now we use JSON properly in getPageTitlesAPI(), instead of XML; fixing some wrong prints, now support utf-8 2014-06-25 22:53:46 +02:00
Federico Leva
997276110c Issue 46: dumpgenerator should follow redirects
Patch by @balr0g from libsonic (GPLv3+).
2014-06-25 21:33:02 +02:00
Federico Leva
a8e1575879 Issue 48: Check that API actually works 2014-06-25 20:35:36 +02:00
Emilio J. Rodríguez-Posada
c9aa165504 fixing header with the new year, info and documentation link 2014-06-25 18:08:33 +02:00
nemobis
ac4c93c12a Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-02-26 23:22:53 +00:00
nemobis
403dc213ef Issue 71: English-only match for an older case
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@942 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-02-02 23:06:10 +00:00
nemobis
0ede45b7cf Special:BadTitle works only in English wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-26 17:26:20 +00:00
nemobis
034866a32e Handle permissions-errors for wikis requiring login or whatever
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-26 17:21:46 +00:00
nemobis
6c69d9800f Followup, delay needs config; should be BC
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-25 22:29:42 +00:00
nemobis
55185467e1 Add delay to all checking and listing functions, crappy hosts die on them
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2014-01-23 16:05:19 +00:00
nemobis
6efe406ea5 Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-17 11:18:42 +00:00
Hydriz
611d13f8c5 Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-16 14:01:24 +00:00
Hydriz
64bd837cab (Issue 34) XML integry check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-16 13:39:49 +00:00
Hydriz
79047a3ded (Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-15 07:43:37 +00:00
nemobis
26873ad495 Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-11 07:33:25 +00:00
nemobis
626118cfab Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-10 09:17:06 +00:00
Hydriz
df1e7efafd Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-10 02:47:18 +00:00
nemobis
e1d4de3179 Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-08 22:20:21 +00:00
nemobis
5d34d9512a Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-08 22:01:36 +00:00
nemobis
82ba173739 Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-08 21:57:04 +00:00
nemobis
c6546ff935 Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-07 22:59:28 +00:00
nemobis
776038666f Issue 72: revert r857, just define everything in launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-07 12:24:50 +00:00
nemobis
6113fa3340 Add delay to getPageTitlesScraper
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave

urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-06 08:21:57 +00:00
nemobis
9e1b13e173 Correct --help: format is --delay=5, not --delay:5
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@858 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-06 08:13:07 +00:00
nemobis
6430ac5f47 Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@857 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-06 07:49:12 +00:00
nemobis
ef7d527e86 Add some advice about editthis.info for usage via launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@855 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-05 18:11:23 +00:00
nemobis
4820339d10 Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@854 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
2013-11-05 17:10:44 +00:00