balr0g
|
3929e4eb9c
|
Cleanups and error fixes suggested by flake8 (pep8 + pyflakes)
|
2014-07-04 11:40:59 -04:00 |
|
Emilio J. Rodríguez-Posada
|
c07b527e5d
|
adding session to getWikiEngine()
|
2014-07-03 19:33:09 +02:00 |
|
Emilio J. Rodríguez-Posada
|
30c153ce1f
|
chg: using 'with open' for files
|
2014-07-03 19:24:28 +02:00 |
|
balr0g
|
9aa3c4a0e1
|
Removed all traces of urllib except for encode/decode; more bugs fixed.
|
2014-07-03 12:35:40 -04:00 |
|
balr0g
|
c8e11a949b
|
Initial port to Requests
|
2014-07-03 12:32:27 -04:00 |
|
Emilio J. Rodríguez-Posada
|
9553e3550c
|
adding wiki engine detector
|
2014-07-03 18:17:02 +02:00 |
|
Emilio J. Rodríguez-Posada
|
eb97cf1adf
|
version 0.2.2 and tiny bits in --help
|
2014-07-03 16:02:11 +02:00 |
|
balr0g
|
50b011f90d
|
Initial port to argparse
|
2014-06-30 19:41:03 -04:00 |
|
Emilio J. Rodríguez-Posada
|
568deef081
|
adding comments for clarification
|
2014-06-30 20:09:36 +02:00 |
|
Emilio J. Rodríguez-Posada
|
d4eed1f738
|
fixing #127 and #134: now works with APIs that return the 'name' field for images and those that don't (in this case we unquote over ASCII); also fixing a bug that re-downloaded the image list when it had already been completed
|
2014-06-30 20:03:32 +02:00 |
|
Emilio J. Rodríguez-Posada
|
005de23c1d
|
adding gzip to siteinfo downloader
|
2014-06-29 13:26:06 +02:00 |
|
Emilio J. Rodríguez-Posada
|
d79ea64d41
|
fixing issue #97 pretty siteinfo json saving, indenting 4 chars
|
2014-06-29 10:46:26 +02:00 |
|
Emilio J. Rodríguez-Posada
|
3854a344fe
|
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
|
2014-06-29 10:01:25 +02:00 |
|
Emilio J. Rodríguez-Posada
|
1c1f0dbb86
|
replacing XML with JSON in image downloading
|
2014-06-29 10:01:09 +02:00 |
|
balr0g
|
481323c7f7
|
Don't try to download sites with disabled API
|
2014-06-27 15:19:54 -04:00 |
|
nemobis
|
1933db8a94
|
Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
|
2014-06-27 18:06:51 +02:00 |
|
balr0g
|
62be069026
|
Fix scraper for sites with Unicode titles
|
2014-06-27 11:50:59 -04:00 |
|
nemobis
|
62d961fa97
|
Fix typo, unused variable spotted by balrog
|
2014-06-27 17:33:29 +02:00 |
|
nemobis
|
95bc2dec38
|
Link GitHub issue tracker
|
2014-06-27 15:25:28 +02:00 |
|
balr0g
|
d60e560571
|
Add Content-Encoding: gzip support
|
2014-06-27 09:02:35 -04:00 |
|
Emilio J. Rodríguez-Posada
|
5261811fa4
|
only if api exists
|
2014-06-26 10:39:45 +02:00 |
|
Emilio J. Rodríguez-Posada
|
610764619a
|
add saveSiteInfo() to download meta=siteinfo data from API to a file
|
2014-06-26 10:38:59 +02:00 |
|
Emilio J. Rodríguez-Posada
|
d395433513
|
comments and newlines
|
2014-06-26 10:26:57 +02:00 |
|
Emilio J. Rodríguez-Posada
|
5eff4bd072
|
comments and tabs
|
2014-06-26 10:17:00 +02:00 |
|
Emilio J. Rodríguez-Posada
|
0b0c40f5da
|
adding more user-agents, but keeping the first as default for now
|
2014-06-26 10:13:59 +02:00 |
|
Emilio J. Rodríguez-Posada
|
81468c4a7c
|
using JSON to retrieve namespaces via API
|
2014-06-25 23:54:05 +02:00 |
|
Emilio J. Rodríguez-Posada
|
703eb9011b
|
improving checkAPI() using JSON properly loaded
|
2014-06-25 23:17:39 +02:00 |
|
Emilio J. Rodríguez-Posada
|
44d3fe1e36
|
Merge pull request #117 from nemobis/bug/48
Issue 48: Check that API actually works
|
2014-06-25 23:12:29 +02:00 |
|
Emilio J. Rodríguez-Posada
|
fc80556d8a
|
merging...
|
2014-06-25 22:57:56 +02:00 |
|
Emilio J. Rodríguez-Posada
|
f474deb71f
|
now we use JSON properly in getPageTitlesAPI(), instead of XML; fixing some wrong prints, now supports UTF-8
|
2014-06-25 22:53:46 +02:00 |
|
Federico Leva
|
997276110c
|
Issue 46: dumpgenerator should follow redirects
Patch by @balr0g from libsonic (GPLv3+).
|
2014-06-25 21:33:02 +02:00 |
|
Federico Leva
|
a8e1575879
|
Issue 48: Check that API actually works
|
2014-06-25 20:35:36 +02:00 |
|
Emilio J. Rodríguez-Posada
|
c9aa165504
|
fixing header with the new year, info and documentation link
|
2014-06-25 18:08:33 +02:00 |
|
nemobis
|
ac4c93c12a
|
Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2014-02-26 23:22:53 +00:00 |
|
nemobis
|
403dc213ef
|
Issue 71: English-only match for an older case
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@942 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2014-02-02 23:06:10 +00:00 |
|
nemobis
|
0ede45b7cf
|
Special:BadTitle works only in English wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2014-01-26 17:26:20 +00:00 |
|
nemobis
|
034866a32e
|
Handle permissions-errors for wikis requiring login or whatever
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2014-01-26 17:21:46 +00:00 |
|
nemobis
|
6c69d9800f
|
Followup, delay needs config; should be BC
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2014-01-25 22:29:42 +00:00 |
|
nemobis
|
55185467e1
|
Add delay to all checking and listing functions, crappy hosts die on them
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2014-01-23 16:05:19 +00:00 |
|
nemobis
|
6efe406ea5
|
Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-17 11:18:42 +00:00 |
|
Hydriz
|
611d13f8c5
|
Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-16 14:01:24 +00:00 |
|
Hydriz
|
64bd837cab
|
(Issue 34) XML integrity check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-16 13:39:49 +00:00 |
|
Hydriz
|
79047a3ded
|
(Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-15 07:43:37 +00:00 |
|
nemobis
|
26873ad495
|
Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-11 07:33:25 +00:00 |
|
nemobis
|
626118cfab
|
Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-10 09:17:06 +00:00 |
|
Hydriz
|
df1e7efafd
|
Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-10 02:47:18 +00:00 |
|
nemobis
|
e1d4de3179
|
Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-08 22:20:21 +00:00 |
|
nemobis
|
5d34d9512a
|
Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-08 22:01:36 +00:00 |
|
nemobis
|
82ba173739
|
Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-08 21:57:04 +00:00 |
|
nemobis
|
c6546ff935
|
Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
|
2013-11-07 22:59:28 +00:00 |
|