Emilio J. Rodríguez-Posada
3a8fa1f0cc
adding test for getImageFilenamesURL(), the image scraper
2014-07-01 17:29:07 +02:00
Emilio J. Rodríguez-Posada
5f0ca74d87
adding simple test for delay(), 0 and 3 seconds
2014-07-01 17:22:27 +02:00
balr0g
50b011f90d
Initial port to argparse
2014-06-30 19:41:03 -04:00
Antoine Musso
8a8b87225b
Easily run tests in a virtualenv with tox and nose
tox is a nice wrapper around virtualenv. It sets up a new env that makes
sure you have a clean install without any modules besides the standard
ones. If you need more modules, simply add them to 'deps =' and tox will
install them from PyPI.
nose makes testing easier: it finds any files that look like tests and
runs them for you.
The envs are created in /.tox, so ignore that directory entirely.
2014-07-01 00:52:04 +02:00
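As a rough illustration of what "files that look like a test" means for the commit above, here is a minimal sketch of a test module that a collector such as nose would pick up, assuming it is saved under a name like test_delay.py. The delay() function below is a hypothetical stand-in written for this example, not the project's own implementation, and the durations are shortened so the sketch runs quickly.

```python
import time
import unittest


def delay(seconds):
    """Hypothetical stand-in (an assumption for this sketch): sleep, then return the time waited."""
    start = time.monotonic()
    time.sleep(seconds)
    return time.monotonic() - start


class TestDelay(unittest.TestCase):
    """Test collectors match on names like Test* / test_* to find these."""

    def test_delay_zero(self):
        # A zero-second delay should return almost immediately.
        self.assertLess(delay(0), 0.5)

    def test_delay_short(self):
        # A short delay should wait at least roughly the requested time.
        self.assertGreaterEqual(delay(0.05), 0.04)
```

Because the class is a plain unittest.TestCase, the same file can also be run with `python -m unittest` when nose is not installed.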
Emilio J. Rodríguez-Posada
09be8b5f22
prints
2014-06-30 23:09:13 +02:00
Emilio J. Rodríguez-Posada
3da6bc222f
ignoring .pyc files
2014-06-30 22:57:09 +02:00
Emilio J. Rodríguez-Posada
d04c0e5bce
first test
2014-06-30 22:55:29 +02:00
Emilio J. Rodríguez-Posada
568deef081
adding comments for clarification
2014-06-30 20:09:36 +02:00
Emilio J. Rodríguez-Posada
d4eed1f738
fixing #127 and #134; now works with APIs that return a 'name' field for images and those that don't (in this case we unquote over ascii); also fixing a bug that re-downloaded the image list when it had been completed previously
2014-06-30 20:03:32 +02:00
Emilio J. Rodríguez-Posada
005de23c1d
adding gzip to siteinfo downloader
2014-06-29 13:26:06 +02:00
Emilio J. Rodríguez-Posada
d79ea64d41
fixing issue #97 pretty siteinfo json saving, indenting 4 chars
2014-06-29 10:46:26 +02:00
Emilio J. Rodríguez-Posada
3854a344fe
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
2014-06-29 10:01:25 +02:00
Emilio J. Rodríguez-Posada
1c1f0dbb86
replacing XML with JSON in image downloading
2014-06-29 10:01:09 +02:00
Emilio J. Rodríguez-Posada
6bbdf42ed0
Merge pull request #125 from balr0g/check-for-disabled-api
Don't try to download sites with disabled API
2014-06-28 10:35:31 +02:00
balr0g
481323c7f7
Don't try to download sites with disabled API
2014-06-27 15:19:54 -04:00
Emilio J. Rodríguez-Posada
084ccc6456
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
2014-06-27 18:52:46 +02:00
nemobis
1933db8a94
Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
2014-06-27 18:06:51 +02:00
balr0g
62be069026
Fix scraper for sites with Unicode titles
2014-06-27 11:50:59 -04:00
nemobis
dbfc1b598a
Ask Wikimedia Commons reseed
2014-06-27 17:41:20 +02:00
nemobis
c41cc5235c
Fix typo, add balrog
Because he always saves wikiteam from my silly mistakes, helping out on #wikiteam.
2014-06-27 17:39:50 +02:00
nemobis
62d961fa97
Fix typo, unused variable spotted by balrog
2014-06-27 17:33:29 +02:00
Emilio J. Rodríguez-Posada
7d00cfa0de
adding list info file for tropicalwikis
2014-06-27 17:07:37 +02:00
Emilio J. Rodríguez-Posada
ecd539f1ae
adding scribblewiki list info file
2014-06-27 17:06:17 +02:00
Emilio J. Rodríguez-Posada
e95d8ba6e1
sort list
2014-06-27 17:04:59 +02:00
Emilio J. Rodríguez-Posada
c420d4d843
adding spider for wikkii, updating the list (10 diff wikis, 2 new, 8 dead), adding info for list
2014-06-27 16:59:12 +02:00
Emilio J. Rodríguez-Posada
c7fc194f0d
add info file for wikkii.com list
2014-06-27 16:48:02 +02:00
Emilio J. Rodríguez-Posada
29a64507c2
last update date
2014-06-27 16:42:50 +02:00
Emilio J. Rodríguez-Posada
d90127e9cc
adding details to shoutwiki info
2014-06-27 16:16:14 +02:00
Emilio J. Rodríguez-Posada
90c442a5b7
updating shoutwiki list and uploading basic spider
2014-06-27 16:15:29 +02:00
Emilio J. Rodríguez-Posada
75e2234c2c
adding details to referata list
2014-06-27 16:15:00 +02:00
Emilio J. Rodríguez-Posada
db9bcb68ca
adding license to referata-spider.py
2014-06-27 16:14:44 +02:00
Emilio J. Rodríguez-Posada
fa6f603fe8
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
2014-06-27 15:33:10 +02:00
Emilio J. Rodríguez-Posada
eaec1afa83
adding info about shoutwiki list
2014-06-27 15:32:57 +02:00
nemobis
95bc2dec38
Link GitHub issue tracker
2014-06-27 15:25:28 +02:00
Emilio J. Rodríguez-Posada
a917a68020
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
2014-06-27 15:24:48 +02:00
Emilio J. Rodríguez-Posada
1befbabb02
updating info for referata list
2014-06-27 15:23:48 +02:00
nemobis
fb02da7f4d
Merge pull request #122 from balr0g/gzip
Add Content-Encoding: gzip support
2014-06-27 15:09:29 +02:00
balr0g
d60e560571
Add Content-Encoding: gzip support
2014-06-27 09:02:35 -04:00
Emilio J. Rodríguez-Posada
7a6ef18339
add more wikis to referata list; uploading basic referata-spider.py
2014-06-27 14:51:05 +02:00
Emilio J. Rodríguez-Posada
d97c46afd1
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
2014-06-27 13:39:39 +02:00
Emilio J. Rodríguez-Posada
a988968e7e
adding info for referata list
2014-06-27 13:39:23 +02:00
Emilio J. Rodríguez-Posada
a1a76ac198
issue #111
2014-06-27 12:58:03 +02:00
nemobis
4ef1d45e35
Merge pull request #119 from nemobis/2014/siteinfo
Issue 97: Add new siteinfo.json to the archived 7z
2014-06-26 11:14:11 +02:00
Federico Leva
f46b60a4b6
Issue 97: Add new siteinfo.json to the archived 7z
2014-06-26 11:11:19 +02:00
Emilio J. Rodríguez-Posada
5261811fa4
only if api exists
2014-06-26 10:39:45 +02:00
Emilio J. Rodríguez-Posada
610764619a
add saveSiteInfo() to download meta=siteinfo data from API to a file
2014-06-26 10:38:59 +02:00
Emilio J. Rodríguez-Posada
d395433513
comments and newlines
2014-06-26 10:26:57 +02:00
Emilio J. Rodríguez-Posada
5eff4bd072
comments and tabs
2014-06-26 10:17:00 +02:00
Emilio J. Rodríguez-Posada
0b0c40f5da
adding more user-agents, but keeping the first as default for now
2014-06-26 10:13:59 +02:00
Emilio J. Rodríguez-Posada
81468c4a7c
using JSON to retrieve namespaces via API
2014-06-25 23:54:05 +02:00