187 Commits (master)

Author SHA1 Message Date
Emilio J. Rodríguez-Posada 374cf83c54 adding info file for orain.org wikifarm list 10 years ago
Emilio J. Rodríguez-Posada 7d00cfa0de adding list info file for tropicalwikis 10 years ago
Emilio J. Rodríguez-Posada ecd539f1ae adding scribblewiki list info file 10 years ago
Emilio J. Rodríguez-Posada e95d8ba6e1 sort list 10 years ago
Emilio J. Rodríguez-Posada c420d4d843 adding spider for wikkii, updating the list (10 diff wikis, 2 new, 8 dead), adding info for list 10 years ago
Emilio J. Rodríguez-Posada c7fc194f0d add info file for wikkii.com list 10 years ago
Emilio J. Rodríguez-Posada 29a64507c2 last update date 10 years ago
Emilio J. Rodríguez-Posada d90127e9cc adding details to shoutwiki info 10 years ago
Emilio J. Rodríguez-Posada 90c442a5b7 updating shoutwiki list and uploading basic spider 10 years ago
Emilio J. Rodríguez-Posada 75e2234c2c adding details to referata list 10 years ago
Emilio J. Rodríguez-Posada db9bcb68ca adding license to referata-spider.py 10 years ago
Emilio J. Rodríguez-Posada eaec1afa83 adding info about shoutwiki list 10 years ago
Emilio J. Rodríguez-Posada 1befbabb02 updating info for referata list 10 years ago
Emilio J. Rodríguez-Posada 7a6ef18339 add more wikis to referata list; uploading basic referata-spider.py 10 years ago
Emilio J. Rodríguez-Posada d97c46afd1 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada a988968e7e adding info for referata list 10 years ago
Emilio J. Rodríguez-Posada a1a76ac198 issue #111 10 years ago
Emilio J. Rodríguez-Posada 6442b8734d renaming 10 years ago
scottdb56 9b8673768f mediawikis_2013_byothers.txt have been filtered
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@974 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 ac60918e7b Uploaded BHW-alive_wikis.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@972 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 0b2875756f Updated alive list from other 2013 sources
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@971 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 1b0949ce7b other.list has been filtered and added as other-alive_wikis.list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@970 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 ab1e13207f Filtered out duplicates from mediawikis_2013-alive.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@968 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 17bc29b66f mediawikis_2013.txt filtered by checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@967 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 b9be07936a A major update - checkalive.pl now checks for api.php and writes it to the list if found.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@965 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis ac4c93c12a Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 7eba594a2c Use api.php URLs where we found them to be working, step 2
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@959 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 92f77ab085 Use api.php URLs where we found them to be working
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@958 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 395408e5a6 Also neoseeker and sourceforge
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@956 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 3fde888394 Remove some wikifarms wikis and a couple duplicates
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@955 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 04e55c6622 Remove a hundred index.php redundant with api.php URLs
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@954 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 18a7f42086 Update with last raw list and checkalive.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@953 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis cbd0905cba Add 2k more URLs from another crawl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@950 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis eb60580e91 New URLs from Incola
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@946 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis a1c89623a4 Another intermediate update with results from one more run
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@943 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 2bb3fd7a50 Add neoseeker.com, from mutante's wikistats farm list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@937 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 43074360b7 Experience shows 30 seconds is a more realistic timeout
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@932 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 8cf60a3285 Re-updated pavlo list with 30 s timeout
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@930 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis e2956e60a0 Replace some index.php with api.php where available
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@929 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 57da61aac6 a minor syntax-error fix in checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@928 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 8da3f15b35 Some manual filtering
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@927 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 84a2f9d6dc Add index.php discovery and other fixes, update checked lists consequently; bad input makes it spit ugly errors but it keeps going
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@926 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis eba5f2d54e More gamepedia from their homepage
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@925 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 51ee9e9847 added a user-agent and another search string
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@924 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 009682a037 Sync API check needle with checkalive.pl, </api> is unreliable
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@923 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis ac8ed21b0c Add Terraria, 270 wikis might be missing but where is the list?
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@922 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 01b177bcf2 Update raw list with scraper run by odder
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@921 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 0ac7e477f5 Re-formatting of readme-checkalive.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@920 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 fc9207291a readme for checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@919 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 d9247aa0ba Lots of changes - improved error handling, progress reporting and other minor changes
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@918 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 2bdf8da30c Add orain and gamepedia lists, might have mistakes
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@915 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 7e412d1f17 update with export of http://www.shoutwiki.com/wiki/Category:Flat_list_of_all_wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@911 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 fb87cd9951 This is the first publicly available version of this Perl script.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@901 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 91cf5b4d08 Commit skeleton for Scott's use
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@900 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 72d43634d5 Update Wikia list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@899 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
emijrp 9152b6861b list of wikis extracted from WikiIndex dump, excluding most wikifarms
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@898 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis f7c50f8ee5 Add retroshare
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@896 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 1fb5865166 Remove some obvious false positives
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@895 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 67eeed88e4 Add silly RSD discovery to checkalive.py and update wikis list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@892 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b8c90df787 Upload separately some already checked with the script
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@891 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 500e8ef350 Remove some more obvious duplicates including trailing slash
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@890 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 31ea06ff86 Remove also wiki/[A-Z].+$
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@889 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 664fa18ea3 Remove sourceforge wikis and URLs with parameters to index.php
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@888 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 3624c02852 Remove Wikimedia Foundation wikis, other 'wikimedia' URLs cleanup
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@887 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ccfc95e9f4 Issue 59: Add first dirty list of possible MediaWiki sites. First passes of the script, now going on with all TLDs. Just sorted and cleaned of biggest noises like mailing lists, github and stackoverflow.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@886 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 002a8d6702 Add first list of sourceforge wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@885 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis b940293136 Add more wikkii wikis from mutante's wikistats
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@881 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 75746af1c9 Update to current list of about 340k wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@823 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 810c94723e update to current list of 300k wikis, got from API, without http:// protocol
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@807 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 36ea489313 fixing file description bug
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@795 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 51d775c214 light improvements of checkalive.py script
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@794 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
nemobis 06fb988438 Upload my logs.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@787 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp d31828dfaa mediawikis pavlo
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@464 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 657577cc37 checkalive for wiki lists
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@439 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
12 years ago
emijrp 5cc010c675 updating listofwikis directory [removing dupes];
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@316 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp ed0a3d08a2 filling referata list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@314 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp d3b748e7f2 mediawiki list from Andrew Pavlo
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@313 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp b3e33380d6 wikkii.com list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@257 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp de29904341 adding shoutwiki.com wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@190 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 2acd8415c7 removing dupes;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@155 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp e1cf136a97 wikia list, > 200000 wikis
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@104 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp d4bc4cba9a readme for wikilists
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@103 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 24ea30f29f scribblewiki
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@101 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp eb6f7c2639 adding some more empty lists to do
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@100 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp f920be3e3c adding some more to wiki-site; have to remove domain dupes '.a'
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@99 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp db0af570c5 sorting
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@98 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago
emijrp 39fc6d2bf4 adding lists of wikis;
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@94 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
13 years ago