6113fa3340
Add delay to getPageTitlesScraper
nemobis
2013-11-06 08:21:57 +0000
9e1b13e173
Correct --help: format is --delay=5, not --delay:5
nemobis
2013-11-06 08:13:07 +0000
6430ac5f47
Check for the existence of the array in domain2prefix instead; uploader.py failed on python 2.6
nemobis
2013-11-06 07:49:12 +0000
69b00964c8
Add a comment on delay
nemobis
2013-11-05 18:24:08 +0000
ef7d527e86
Add some advice about editthis.info for usage via launcher.py
nemobis
2013-11-05 18:11:23 +0000
4820339d10
Fix r842, patch by balrog; Schbirid reported python error in CleanHTML
nemobis
2013-11-05 17:10:44 +0000
577e8034e6
ahem syntax
nemobis
2013-11-02 10:09:36 +0000
6966ba5918
subprocess.check_output is not supported in python 2.6, let's not die. Yes, some are still on python 2.6; even 3.2.46-grbfs-kapsi it seems.
nemobis
2013-11-02 09:50:49 +0000
7c94815e2c
Issue 68: Use GET, not POST, to download images; some harm and no? good
nemobis
2013-11-01 23:07:27 +0000
0eb4913f89
Requires python 2.7 or more (for subprocess.check_output)
nemobis
2013-10-28 18:39:50 +0000
8ef08bbe28
Issue 67: launcher.py fails to 7z completed dumps (also get rid of some os.system)
nemobis
2013-10-28 10:20:38 +0000
d289c563b1
Link to https://code.google.com/p/wikiteam/wiki/NewTutorial#Download_a_list_of_wikis
nemobis
2013-10-16 18:08:45 +0000
ad5d468824
Add some documentation; would need a proper --help
nemobis
2013-10-16 17:57:02 +0000
e1b34b7f6b
Fix whitespace
nemobis
2013-10-14 19:10:10 +0000
57e226c049
Use urllib2 and set user agent in some more places; some webhosts block urllib.
nemobis
2013-10-14 09:27:30 +0000
414fb1988f
Add a little more singular/plural support.
Hydriz
2013-10-13 09:35:48 +0000
13ebee4c28
Issue 60: Add authentication cookie support, patch by Fredrik Roubert
nemobis
2013-10-09 20:11:01 +0000
b34f01646a
Use getUserAgent in one more place, urllib is blocked by some
nemobis
2013-10-09 19:48:02 +0000
1a8e393c11
Google, not Gmail account
nemobis
2013-09-19 10:11:43 +0000
0150b93dcd
Issue #64 : Improve speed by some orders of magnitude
nemobis
2013-09-09 17:27:50 +0000
8b4480be64
Adding plural support for number of edits saved for a page
Hydriz
2013-08-24 06:10:04 +0000
5e826c272d
Use recursive option for zip
nemobis
2013-08-22 20:53:46 +0000
ed8d174d5a
Issue #61 : some skins hide that stuff, use meta tag generator
nemobis
2013-08-17 07:34:11 +0000
5648586e31
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@826 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
emijrp
2013-08-06 15:15:48 +0000
9a5f904261
Issue #45 , workaround: ignore OSError in the size check
nemobis
2013-08-04 16:20:59 +0000
75746af1c9
Update to current list of about 340k wikis
nemobis
2013-05-04 10:05:29 +0000
f1874656ed
comments
emijrp
2013-03-27 22:11:51 +0000
af81adebeb
comments
emijrp
2013-03-27 21:50:23 +0000
424588a55c
comments
emijrp
2013-03-27 21:30:52 +0000
360d1c1fa1
fixing createnewdump() and resumepreviousdump()
emijrp
2013-03-27 21:26:20 +0000
8295990df0
moving code to functions; tiny changes in comments
emijrp
2013-03-27 21:19:46 +0000
79a310c470
tiny changes in comments, some clarifications
emijrp
2013-03-27 20:58:55 +0000
0d7d73d420
ignore case
emijrp
2013-03-27 20:58:15 +0000
5dddc9c502
tiny changes in comments, some clarifications
emijrp
2013-03-27 20:57:30 +0000
73bdc5a5dc
caps
emijrp
2013-03-27 20:36:03 +0000
5f4d10dd5a
2013
emijrp
2013-03-27 20:33:06 +0000
810c94723e
update to current list of 300k wikis, got from API, without http:// protocol
nemobis
2012-11-18 19:12:37 +0000
a78f390557
fixing issue 56
emijrp
2012-11-09 18:55:39 +0000
0080197b8f
WARNING: API support is experimental (faster but broken).
nemobis
2012-11-09 11:55:06 +0000
240b9f9dac
Add support for Wikidata and Wikivoyage.
Hydriz
2012-11-07 13:36:33 +0000
a850194829
rewriting the inline help a bit
emijrp
2012-09-08 17:18:25 +0000
a57a490513
fixing file description bug
emijrp
2012-08-14 19:09:34 +0000
36ea489313
fixing file description bug
emijrp
2012-08-14 19:08:50 +0000
51d775c214
light improvements of checkalive.py script
emijrp
2012-08-14 18:11:37 +0000
866334aab7
renaming lists directory to taskforce
emijrp
2012-08-14 17:27:00 +0000
06fb988438
Upload my logs.
nemobis
2012-08-12 16:21:23 +0000
29ce623af7
api url typo
abuie
2012-08-10 02:22:13 +0000
ccf6ecf5ed
Extra large indentation (oops)
abuie
2012-08-08 22:38:54 +0000
1990ef0d6a
Fix bug introduced in r676.
nemobis
2012-08-08 22:31:09 +0000
cb77d8738d
TODO: bug - translate relative copyright URLs as in http://archive.org/details/wiki-wikipovrayorg now linking http://archive.org/content/POV-Wiki:Copyrights
nemobis
2012-08-08 17:27:40 +0000
f80bd3e0b6
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@772 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
emijrp
2012-08-08 08:46:42 +0000
4325d60b4b
fixing wikipediadownloader.py, dumps url
emijrp
2012-08-07 08:27:45 +0000
e40eb9b826
disable check integrity function for now
emijrp
2012-08-07 08:03:14 +0000
7bbc0aa387
maybe it's something else
nemobis
2012-08-07 06:41:53 +0000
22a163c183
more todos
nemobis
2012-08-07 05:47:16 +0000
953cfbfc8a
Added some todos.
nemobis
2012-08-06 22:24:50 +0000
10a9ed4df7
uploader: lang parameter
emijrp
2012-08-06 18:39:27 +0000
d5ae56bd24
uploader
emijrp
2012-08-06 15:57:11 +0000
f62e2686de
uploader
emijrp
2012-08-06 15:41:50 +0000
c8ca525ff7
just adding documentation and comments
emijrp
2012-08-06 14:54:31 +0000
d616bcc2e6
just adding documentation and comments
emijrp
2012-08-06 14:48:30 +0000
277d9a9ec9
uploader
emijrp
2012-08-06 14:07:46 +0000
8c2cfa80e9
uploader
emijrp
2012-08-06 14:05:45 +0000
6bdd1a492c
uploader
emijrp
2012-08-06 13:33:04 +0000
d61ea1b77d
uploader
emijrp
2012-08-06 12:57:41 +0000
bc24877a72
uploader
emijrp
2012-08-06 12:12:37 +0000
1d972d10d8
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@731 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
emijrp
2012-08-03 11:07:29 +0000
bbffb24a0c
(Issue 34) XML integrity check inside the code
Hydriz
2012-06-22 11:34:27 +0000
7e083de24b
Partial revert of r675: the script was always downloading only the last revision, as reported by Hydriz... *facepalms*
nemobis
2012-06-22 07:07:59 +0000
821bdaf974
duplicate
nemobis
2012-06-18 14:28:54 +0000
080670ca48
duplicate
nemobis
2012-06-18 14:27:10 +0000
a62117ed28
duplicate
nemobis
2012-06-18 14:26:15 +0000
a1eb913d9d
duplicate
nemobis
2012-06-18 14:25:21 +0000
11c977582a
duplicate
nemobis
2012-06-18 14:24:17 +0000
79c9223e77
duplicate
nemobis
2012-06-18 14:23:31 +0000
c46a8c7b90
done
nemobis
2012-06-18 14:21:14 +0000
979fa84ac3
duplicate
nemobis
2012-06-18 14:19:33 +0000
79a4d81e06
duplicate
nemobis
2012-05-31 21:53:10 +0000
61897f76c6
duplicate
nemobis
2012-05-31 21:51:38 +0000
da005e6c88
duplicate
nemobis
2012-05-31 21:50:57 +0000
8d8a9e3dcc
duplicate
nemobis
2012-05-31 21:47:34 +0000
e132dcf1c7
duplicate
nemobis
2012-05-31 21:46:30 +0000
f28242a57b
duplicate
nemobis
2012-05-31 21:40:29 +0000
26cd289ba1
duplicate
nemobis
2012-05-31 21:39:44 +0000
cac6130ea1
duplicate
nemobis
2012-05-31 21:36:17 +0000
d6d13e1047
duplicate
nemobis
2012-05-31 21:35:20 +0000
8cad43dfb6
Does not compress, but still, moving %s-images.txt to the fulldump compressing command as it rightfully belongs there
Hydriz
2012-05-31 14:32:21 +0000
9f58ca6a9b
Fix Issue 26: Error loop "XML for ... is wrong"
nemobis
2012-05-28 09:35:56 +0000
260f4e6be2
duplicates
nemobis
2012-05-27 15:54:50 +0000
373208b92f
duplicate (without www. and without www.en.)
nemobis
2012-05-27 15:53:17 +0000
f6992312bc
duplicates
nemobis
2012-05-27 15:51:11 +0000
9a067bbe35
duplicate
nemobis
2012-05-27 15:46:20 +0000
4a5597865f
duplicates
nemobis
2012-05-27 15:44:06 +0000
a8479f9936
Get namespaces and images info from API. Code badly copied around, we should probably use modules such as http://packages.python.org/simplemediawiki/ to use API. Tested, seems to work. Fixed some typos.
nemobis
2012-05-20 08:08:28 +0000
b8b2dbcf4a
Moving the collection parameter to the top of the script so as to allow the user to easily configure that setting.
Hydriz
2012-05-11 03:56:27 +0000
78a07544ed
Per emijrp, removing config.txt from archive.
nemobis
2012-05-06 08:06:03 +0000
a28d9d1929
duplicates or done
nemobis
2012-05-05 21:58:52 +0000
143d7fba7c
duplicates
nemobis
2012-05-05 21:57:28 +0000
ccfd566a36
duplicates
nemobis
2012-05-05 21:55:51 +0000
114ec7db8c
duplicate and one huge downloaded
nemobis
2012-05-05 21:54:00 +0000