Commit Graph

  • 6113fa3340 Add delay to getPageTitlesScraper nemobis 2013-11-06 08:21:57 +0000
  • 9e1b13e173 Correct --help: format is --delay=5, not --delay:5 nemobis 2013-11-06 08:13:07 +0000
  • 6430ac5f47 Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6 nemobis 2013-11-06 07:49:12 +0000
  • 69b00964c8 Add a comment on delay nemobis 2013-11-05 18:24:08 +0000
  • ef7d527e86 Add some advice about editthis.info for usage via launcher.py nemobis 2013-11-05 18:11:23 +0000
  • 4820339d10 Fix r842, patch by balrog; Schbirid reported python error in CleanHTML nemobis 2013-11-05 17:10:44 +0000
  • 577e8034e6 ahem syntax nemobis 2013-11-02 10:09:36 +0000
  • 6966ba5918 subprocess.check_output is not supported in python 2.6, let's not die. Yes, some are still on python 2.6; even 3.2.46-grbfs-kapsi it seems. nemobis 2013-11-02 09:50:49 +0000
  • 7c94815e2c Issue 68: Use GET, not POST, to download images; some harm and no? good nemobis 2013-11-01 23:07:27 +0000
  • 0eb4913f89 Requires python 2.7 or more (for subprocess.check_output) nemobis 2013-10-28 18:39:50 +0000
  • 8ef08bbe28 Issue 67: launcher.py fails to 7z completed dumps (also get rid of some os.system) nemobis 2013-10-28 10:20:38 +0000
  • d289c563b1 Link to https://code.google.com/p/wikiteam/wiki/NewTutorial#Download_a_list_of_wikis nemobis 2013-10-16 18:08:45 +0000
  • ad5d468824 Add some documentation; would need a proper --help nemobis 2013-10-16 17:57:02 +0000
  • e1b34b7f6b Fix whitespace nemobis 2013-10-14 19:10:10 +0000
  • 57e226c049 Use urllib2 and set user agent in some more places; some webhosts block urllib. nemobis 2013-10-14 09:27:30 +0000
  • 414fb1988f Add a little more singular/plural support. Hydriz 2013-10-13 09:35:48 +0000
  • 13ebee4c28 Issue 60: Add authentication cookie support, patch by Fredrik Roubert nemobis 2013-10-09 20:11:01 +0000
  • b34f01646a Use getUserAgent in one more place, urllib is blocked by some nemobis 2013-10-09 19:48:02 +0000
  • 1a8e393c11 Google, not Gmail account nemobis 2013-09-19 10:11:43 +0000
  • 0150b93dcd Issue #64: Improve speed by some orders of magnitude nemobis 2013-09-09 17:27:50 +0000
  • 8b4480be64 Adding plural support for number of edits saved for a page Hydriz 2013-08-24 06:10:04 +0000
  • 5e826c272d Use recursive option for zip nemobis 2013-08-22 20:53:46 +0000
  • ed8d174d5a Issue #61: some skins hide that stuff, use meta tag generator nemobis 2013-08-17 07:34:11 +0000
  • 5648586e31 git-svn-id: https://wikiteam.googlecode.com/svn/trunk@826 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95 emijrp 2013-08-06 15:15:48 +0000
  • 9a5f904261 Issue #45, workaround: ignore OSError in the size check nemobis 2013-08-04 16:20:59 +0000
  • 75746af1c9 Update to current list of about 340k wikis nemobis 2013-05-04 10:05:29 +0000
  • f1874656ed comments emijrp 2013-03-27 22:11:51 +0000
  • af81adebeb comments emijrp 2013-03-27 21:50:23 +0000
  • 424588a55c comments emijrp 2013-03-27 21:30:52 +0000
  • 360d1c1fa1 fixing createnewdump() and resumepreviousdump() emijrp 2013-03-27 21:26:20 +0000
  • 8295990df0 moving code to functions; tiny changes in comments emijrp 2013-03-27 21:19:46 +0000
  • 79a310c470 tiny changes in comments, some clarifications emijrp 2013-03-27 20:58:55 +0000
  • 0d7d73d420 ignore case emijrp 2013-03-27 20:58:15 +0000
  • 5dddc9c502 tiny changes in comments, some clarifications emijrp 2013-03-27 20:57:30 +0000
  • 73bdc5a5dc caps emijrp 2013-03-27 20:36:03 +0000
  • 5f4d10dd5a 2013 emijrp 2013-03-27 20:33:06 +0000
  • 810c94723e update to current list of 300k wikis, got from API, without http:// protocol nemobis 2012-11-18 19:12:37 +0000
  • a78f390557 fixing issue 56 emijrp 2012-11-09 18:55:39 +0000
  • 0080197b8f WARNING: API support is experimental (faster but broken). nemobis 2012-11-09 11:55:06 +0000
  • 240b9f9dac Add support for Wikidata and Wikivoyage. Hydriz 2012-11-07 13:36:33 +0000
  • a850194829 rewriting the inline help a bit emijrp 2012-09-08 17:18:25 +0000
  • a57a490513 fixing file description bug emijrp 2012-08-14 19:09:34 +0000
  • 36ea489313 fixing file description bug emijrp 2012-08-14 19:08:50 +0000
  • 51d775c214 light improvements of checkalive.py script emijrp 2012-08-14 18:11:37 +0000
  • 866334aab7 renaming lists directory to taskforce emijrp 2012-08-14 17:27:00 +0000
  • 06fb988438 Upload my logs. nemobis 2012-08-12 16:21:23 +0000
  • 29ce623af7 api url typo abuie 2012-08-10 02:22:13 +0000
  • ccf6ecf5ed Extra large indentation (oops) abuie 2012-08-08 22:38:54 +0000
  • 1990ef0d6a Fix bug introduced in r676. nemobis 2012-08-08 22:31:09 +0000
  • cb77d8738d TODO: bug - translate relative copyright URLs as in http://archive.org/details/wiki-wikipovrayorg now linking http://archive.org/content/POV-Wiki:Copyrights nemobis 2012-08-08 17:27:40 +0000
  • f80bd3e0b6 git-svn-id: https://wikiteam.googlecode.com/svn/trunk@772 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95 emijrp 2012-08-08 08:46:42 +0000
  • 4325d60b4b fixing wikipediadownloader.py, dumps url emijrp 2012-08-07 08:27:45 +0000
  • e40eb9b826 disable check integrity function by now emijrp 2012-08-07 08:03:14 +0000
  • 7bbc0aa387 maybe it's something else nemobis 2012-08-07 06:41:53 +0000
  • 22a163c183 more todos nemobis 2012-08-07 05:47:16 +0000
  • 953cfbfc8a Added some todos. nemobis 2012-08-06 22:24:50 +0000
  • 10a9ed4df7 uploader: lang parameter emijrp 2012-08-06 18:39:27 +0000
  • d5ae56bd24 uploader emijrp 2012-08-06 15:57:11 +0000
  • f62e2686de uploader emijrp 2012-08-06 15:41:50 +0000
  • c8ca525ff7 just adding documentation and comments emijrp 2012-08-06 14:54:31 +0000
  • d616bcc2e6 just adding documentation and comments emijrp 2012-08-06 14:48:30 +0000
  • 277d9a9ec9 uploader emijrp 2012-08-06 14:07:46 +0000
  • 8c2cfa80e9 uploader emijrp 2012-08-06 14:05:45 +0000
  • 6bdd1a492c uploader emijrp 2012-08-06 13:33:04 +0000
  • d61ea1b77d uploader emijrp 2012-08-06 12:57:41 +0000
  • bc24877a72 uploader emijrp 2012-08-06 12:12:37 +0000
  • 1d972d10d8 git-svn-id: https://wikiteam.googlecode.com/svn/trunk@731 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95 emijrp 2012-08-03 11:07:29 +0000
  • bbffb24a0c (Issue 34) XML integry check inside the code Hydriz 2012-06-22 11:34:27 +0000
  • 7e083de24b Partial revert of r675: the script was always downloading only the last revision, as reported by Hydriz... *facepalms* nemobis 2012-06-22 07:07:59 +0000
  • 821bdaf974 duplicate nemobis 2012-06-18 14:28:54 +0000
  • 080670ca48 duuplicate nemobis 2012-06-18 14:27:10 +0000
  • a62117ed28 duplicate nemobis 2012-06-18 14:26:15 +0000
  • a1eb913d9d duplicate nemobis 2012-06-18 14:25:21 +0000
  • 11c977582a duplicate nemobis 2012-06-18 14:24:17 +0000
  • 79c9223e77 duplicate nemobis 2012-06-18 14:23:31 +0000
  • c46a8c7b90 done nemobis 2012-06-18 14:21:14 +0000
  • 979fa84ac3 duplicate nemobis 2012-06-18 14:19:33 +0000
  • 79a4d81e06 duplicate nemobis 2012-05-31 21:53:10 +0000
  • 61897f76c6 duplicate nemobis 2012-05-31 21:51:38 +0000
  • da005e6c88 duplicate nemobis 2012-05-31 21:50:57 +0000
  • 8d8a9e3dcc duplicate nemobis 2012-05-31 21:47:34 +0000
  • e132dcf1c7 duplicate nemobis 2012-05-31 21:46:30 +0000
  • f28242a57b duplicate nemobis 2012-05-31 21:40:29 +0000
  • 26cd289ba1 duplicate nemobis 2012-05-31 21:39:44 +0000
  • cac6130ea1 duplicate nemobis 2012-05-31 21:36:17 +0000
  • d6d13e1047 duplicate nemobis 2012-05-31 21:35:20 +0000
  • 8cad43dfb6 Does not compress, but still, moving %s-images.txt to the fulldump compressing command as it rightfully belongs there Hydriz 2012-05-31 14:32:21 +0000
  • 9f58ca6a9b Fix Issue 26: Error loop "XML for ... is wrong" nemobis 2012-05-28 09:35:56 +0000
  • 260f4e6be2 duplicates nemobis 2012-05-27 15:54:50 +0000
  • 373208b92f duplicate (without www. and without www.en.) nemobis 2012-05-27 15:53:17 +0000
  • f6992312bc duplicates nemobis 2012-05-27 15:51:11 +0000
  • 9a067bbe35 duplicate nemobis 2012-05-27 15:46:20 +0000
  • 4a5597865f duplicates nemobis 2012-05-27 15:44:06 +0000
  • a8479f9936 Get namespaces and images info from API. Code badly copied around, we should probably use modules such as http://packages.python.org/simplemediawiki/ to use API. Tested, seems to work. Fixed some typos. nemobis 2012-05-20 08:08:28 +0000
  • b8b2dbcf4a Moving the collection parameter to the top of the script so as to allow the user to easily configure that setting. Hydriz 2012-05-11 03:56:27 +0000
  • 78a07544ed Per emijrp, removing config.txt from archive. nemobis 2012-05-06 08:06:03 +0000
  • a28d9d1929 duplicates or done nemobis 2012-05-05 21:58:52 +0000
  • 143d7fba7c duplicates nemobis 2012-05-05 21:57:28 +0000
  • ccfd566a36 duplicates nemobis 2012-05-05 21:55:51 +0000
  • 114ec7db8c duplicate and one huge downloaded nemobis 2012-05-05 21:54:00 +0000