Commit Graph

847 Commits (33a93fd76a9a6b1df79c9776e8f4f55ec6f0dea0)
 

Author SHA1 Message Date
nemobis 01b177bcf2 Update raw list with scraper run by odder
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@921 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 0ac7e477f5 Re-formatting of readme-checkalive.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@920 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 fc9207291a readme for checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@919 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 d9247aa0ba Lots of changes - improved error handling, progress reporting and other minor changes
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@918 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 0ede45b7cf Special:BadTitle works only in English wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@917 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 034866a32e Handle permissions-errors for wikis requiring login or whatever
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@916 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 2bdf8da30c Add orain and gamepedia lists, might have mistakes
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@915 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6c69d9800f Followup, delay needs config; should be BC
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@914 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 7e412d1f17 update with export of http://www.shoutwiki.com/wiki/Category:Flat_list_of_all_wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@911 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp 2f9985ac6f instructions to compile LaTeX paper;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@909 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp a88797717c first draft of paper;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@908 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp 95c2228f36 creating directory for paper
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@905 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp 6912c8bc71 creating directory for research, papers, etc
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@904 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 55185467e1 Add delay to all checking and listing functions, crappy hosts die on them
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@902 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 fb87cd9951 This is the first publicly available version of this Perl script.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@901 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 91cf5b4d08 Commit skeleton for Scott's use
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@900 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 72d43634d5 Update Wikia list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@899 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp 9152b6861b list of wikis extracted from WikiIndex dump, excluding most wikifarms
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@898 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
emijrp 29fc882754
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@897 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis f7c50f8ee5 Add retroshare
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@896 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 1fb5865166 Remove some obvious false positives
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@895 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 67eeed88e4 Add silly RSD discovery to checkalive.py and update wikis list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@892 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b8c90df787 Upload separately some already checked with the script
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@891 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 500e8ef350 Remove some more obvious duplicates including trailing slash
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@890 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 31ea06ff86 Remove also wiki/[A-Z].+$
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@889 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 664fa18ea3 Remove sourceforge wikis and URLs with parameters to index.php
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@888 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 3624c02852 Remove Wikimedia Foundation wikis, other 'wikimedia' URLs cleanup
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@887 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ccfc95e9f4 Issue 59: Add first dirty list of possible MediaWiki sites
First passes of the script, now going on with all TLDs.
Just sorted and cleaned of biggest noises like mailing lists, github and stackoverflow.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@886 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 002a8d6702 Add first list of sourceforge wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@885 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz d5cf4fe592 Deleting things that have no usecase
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@884 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 7c4cb64859 Revert r874: breaks backwards compatibility, no usecase provided
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@883 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6efe406ea5 Followup r877, first check most common conditions for shortcut performance
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@882 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b940293136 Add more wikkii wikis from mutante's wikistats
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@881 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 0dd9c5f299 Follow up r874, forgot to uncomment accesskey and secretkey after testing
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@879 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 611d13f8c5 Follow up r877, check the number of revision tags
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@878 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 64bd837cab (Issue 34) XML integry check inside the code
This *really* fixes the issue and asks the user whether or not to regenerate a dump.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@877 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 5bed9cb28e Adding MediaWiki Export XML Schema Descriptions to track changes and add support accordingly
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@876 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 8680a55e24 Fix identifier name so that it can be more human-readable and backward compatible
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@874 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz 79047a3ded (Issue 71) Use a better check for private wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@873 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz a0726eed04 Rename revnum.txt to revnum.json and fix downloadurl from GitHub.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@871 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz e14fb0fe1b Adding revnum.txt, a file used by dumpgenerator.py version 2.0.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@870 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 26873ad495 Fix typo, make domain2prefix quiet again
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@869 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 626118cfab Let's call it 0.2 then, a bump to 1 would require announcements etc. We're not there yet (API support etc.).
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@867 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
Hydriz df1e7efafd Change version of dumpgenerator.py to 1.1. Using 0.1 is rather confusing.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@866 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis e1d4de3179 Uncomment appended index.php for guess in most configurations
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@864 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 5d34d9512a Needs to be non-matching group
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@863 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 82ba173739 Issue 22: allimages now uses aicontinue, not aifrom
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@862 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis c6546ff935 Issue 71: Don't try to download private wikis, first workaround
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@861 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 776038666f Issue 72: revert r857, just define everything in launcher.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@860 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 6113fa3340 Add delay to getPageTitlesScraper
We must be nice here too or naughty hosts fail badly, for instance wikkii.com gave

urllib2.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Temporarily
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@859 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago