nemobis
01b177bcf2
Update raw list with scraper run by odder
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@921 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56
0ac7e477f5
Re-formatting of readme-checkalive.txt
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@920 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56
fc9207291a
readme for checkalive.pl
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@919 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56
d9247aa0ba
Lots of changes - improved error handling, progress reporting and other minor changes
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@918 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
0ede45b7cf
Special:BadTitle works only in English wikis
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
034866a32e
Handle permissions-errors for wikis requiring login or whatever
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
2bdf8da30c
Add orain and gamepedia lists, might have mistakes
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@915 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6c69d9800f
Followup, delay needs config; should be BC
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
7e412d1f17
update with export of http://www.shoutwiki.com/wiki/Category:Flat_list_of_all_wikis
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@911 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
2f9985ac6f
instructions to compile LaTeX paper;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@909 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
a88797717c
first draft of paper;
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@908 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
95c2228f36
creating directory for paper
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@905 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
6912c8bc71
creating directory for research, papers, etc
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@904 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
55185467e1
Add delay to all checking and listing functions, crappy hosts die on them
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56
fb87cd9951
This is the first publicly available version of this Perl script.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@901 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
91cf5b4d08
Commit skeleton for Scott's use
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@900 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
72d43634d5
Update Wikia list
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@899 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
9152b6861b
list of wikis extracted from WikiIndex dump, excluding most wikifarms
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@898 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp
29fc882754
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@897 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
f7c50f8ee5
Add retroshare
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@896 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
1fb5865166
Remove some obvious false positives
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@895 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
67eeed88e4
Add silly RSD discovery to checkalive.py and update wikis list
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@892 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
b8c90df787
Upload separately some already checked with the script
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@891 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
500e8ef350
Remove some more obvious duplicates including trailing slash
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@890 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
31ea06ff86
Remove also wiki/[A-Z].+$
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@889 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
664fa18ea3
Remove sourceforge wikis and URLs with parameters to index.php
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@888 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
3624c02852
Remove Wikimedia Foundation wikis, other 'wikimedia' URLs cleanup
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@887 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
ccfc95e9f4
Issue 59: Add first dirty list of possible MediaWiki sites. First passes of the script, now going on with all TLDs. Just sorted and cleaned of biggest noises like mailing lists, github and stackoverflow.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@886 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
002a8d6702
Add first list of sourceforge wikis
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@885 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
d5cf4fe592
Deleting things that have no usecase
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@884 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
7c4cb64859
Revert r874: breaks backwards compatibility, no usecase provided
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@883 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6efe406ea5
Followup r877, first check most common conditions for shortcut performance
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
b940293136
Add more wikkii wikis from mutante's wikistats
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@881 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
0dd9c5f299
Follow up r874, forgot to uncomment accesskey and secretkey after testing
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@879 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
611d13f8c5
Follow up r877, check the number of revision tags
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
64bd837cab
(Issue 34) XML integrity check inside the code
...
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
5bed9cb28e
Adding MediaWiki Export XML Schema Descriptions to track changes and add support accordingly
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@876 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
8680a55e24
Fix identifier name so that it can be more human-readable and backward compatible
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@874 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
79047a3ded
(Issue 71) Use a better check for private wikis
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
a0726eed04
Rename revnum.txt to revnum.json and fix downloadurl from GitHub.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@871 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
e14fb0fe1b
Adding revnum.txt, a file used by dumpgenerator.py version 2.0.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@870 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
26873ad495
Fix typo, make domain2prefix quiet again
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
626118cfab
Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz
df1e7efafd
Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
e1d4de3179
Uncomment appended index.php for guess in most configurations
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
5d34d9512a
Needs to be non-matching group
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
82ba173739
Issue 22: allimages now uses aicontinue, not aifrom
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
c6546ff935
Issue 71: Don't try to download private wikis, first workaround
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
776038666f
Issue 72: revert r857, just define everything in launcher.py
...
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis
6113fa3340
Add delay to getPageTitlesScraper
...
We must be nice here too, or naughty hosts fail badly; for instance, wikkii.com gave:
urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago