669 Commits (f022b02e47f462fa0142683ebef8dca5eea18adb)
 

Author SHA1 Message Date
Emilio J. Rodríguez-Posada 3a8fa1f0cc adding test for getImageFilenamesURL(), the image scraper 10 years ago
Emilio J. Rodríguez-Posada 5f0ca74d87 adding simple test for delay(), 0 and 3 seconds 10 years ago
balr0g 50b011f90d Initial port to argparse 10 years ago
Antoine Musso 8a8b87225b Easily run tests in a virtualenv with tox and nose
tox is a nice wrapper around virtualenv. It sets up a new env that makes
sure you have a clean install without any modules besides the standard
ones. If you need more modules, simply add them to 'deps =' and tox will
install them from PyPI.

nose makes testing easier: it finds any files that look like tests and
runs them for you.

The envs are created in /.tox so ignore that directory entirely.
10 years ago
Emilio J. Rodríguez-Posada 09be8b5f22 prints 10 years ago
Emilio J. Rodríguez-Posada 3da6bc222f ignoring .pyc files 10 years ago
Emilio J. Rodríguez-Posada d04c0e5bce first test 10 years ago
Emilio J. Rodríguez-Posada 568deef081 adding comments for clarification 10 years ago
Emilio J. Rodríguez-Posada d4eed1f738 fixing #127 and #134; now works with APIs that return a 'name' field for images and those that don't (in this case we unquote over ascii); also fixing a bug that re-downloaded the image list when it had been completed previously 10 years ago
Emilio J. Rodríguez-Posada 005de23c1d adding gzip to siteinfo downloader 10 years ago
Emilio J. Rodríguez-Posada d79ea64d41 fixing issue #97 pretty siteinfo json saving, indenting 4 chars 10 years ago
Emilio J. Rodríguez-Posada 3854a344fe Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada 1c1f0dbb86 replacing XML with JSON in image downloading 10 years ago
Emilio J. Rodríguez-Posada 6bbdf42ed0 Merge pull request #125 from balr0g/check-for-disabled-api
Don't try to download sites with disabled API
10 years ago
balr0g 481323c7f7 Don't try to download sites with disabled API 10 years ago
Emilio J. Rodríguez-Posada 084ccc6456 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
nemobis 1933db8a94 Merge pull request #124 from balr0g/scraper-unicode-title-fix
Fix scraper for sites with Unicode titles
10 years ago
balr0g 62be069026 Fix scraper for sites with Unicode titles 10 years ago
nemobis dbfc1b598a Ask Wikimedia Commons reseed 10 years ago
nemobis c41cc5235c Fix typo, add balrog
Because he always saves wikiteam from my silly mistakes, helping out on #wikiteam.
10 years ago
nemobis 62d961fa97 Fix typo, unused variable spotted by balrog 10 years ago
Emilio J. Rodríguez-Posada 7d00cfa0de adding list info file for tropicalwikis 10 years ago
Emilio J. Rodríguez-Posada ecd539f1ae adding scribblewiki list info file 10 years ago
Emilio J. Rodríguez-Posada e95d8ba6e1 sort list 10 years ago
Emilio J. Rodríguez-Posada c420d4d843 adding spider for wikkii, updating the list (10 diff wikis, 2 new, 8 dead), adding info for list 10 years ago
Emilio J. Rodríguez-Posada c7fc194f0d add info file for wikkii.com list 10 years ago
Emilio J. Rodríguez-Posada 29a64507c2 last update date 10 years ago
Emilio J. Rodríguez-Posada d90127e9cc adding details to shoutwiki info 10 years ago
Emilio J. Rodríguez-Posada 90c442a5b7 updating shoutwiki list and uploading basic spider 10 years ago
Emilio J. Rodríguez-Posada 75e2234c2c adding details to referata list 10 years ago
Emilio J. Rodríguez-Posada db9bcb68ca adding license to referata-spider.py 10 years ago
Emilio J. Rodríguez-Posada fa6f603fe8 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada eaec1afa83 adding info about shoutwiki list 10 years ago
nemobis 95bc2dec38 Link GitHub issue tracker 10 years ago
Emilio J. Rodríguez-Posada a917a68020 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada 1befbabb02 updating info for referata list 10 years ago
nemobis fb02da7f4d Merge pull request #122 from balr0g/gzip
Add Content-Encoding: gzip support
10 years ago
balr0g d60e560571 Add Content-Encoding: gzip support 10 years ago
Emilio J. Rodríguez-Posada 7a6ef18339 add more wikis to referata list; uploading basic referata-spider.py 10 years ago
Emilio J. Rodríguez-Posada d97c46afd1 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada a988968e7e adding info for referata list 10 years ago
Emilio J. Rodríguez-Posada a1a76ac198 issue #111 10 years ago
nemobis 4ef1d45e35 Merge pull request #119 from nemobis/2014/siteinfo
Issue 97: Add new siteinfo.json to the archived 7z
10 years ago
Federico Leva f46b60a4b6 Issue 97: Add new siteinfo.json to the archived 7z 10 years ago
Emilio J. Rodríguez-Posada 5261811fa4 only if api exists 10 years ago
Emilio J. Rodríguez-Posada 610764619a add saveSiteInfo() to download meta=siteinfo data from API to a file 10 years ago
Emilio J. Rodríguez-Posada d395433513 comments and newlines 10 years ago
Emilio J. Rodríguez-Posada 5eff4bd072 comments and tabs 10 years ago
Emilio J. Rodríguez-Posada 0b0c40f5da adding more user-agents, but keeping the first as default for now 10 years ago
Emilio J. Rodríguez-Posada 81468c4a7c using JSON to retrieve namespaces via API 10 years ago