187 Commits (master)

Author SHA1 Message Date
emijrp 60a0ba2e54 sleep 6 years ago
emijrp 061709d9e6 50,000 wikis, do not use this list, use wikispacesXY instead 6 years ago
emijrp 30a6dc268b wikispaces lists 6 years ago
emijrp 145b040784 update, 10000 wikis, still more arriving 6 years ago
Federico Leva 293da80da9 Add alive MediaWikis from the WikiTeam acrhive.org collection 6 years ago
Federico Leva 6a34bf65ea Wikia dumps now use 7z, not gz
Note that existence doesn't mean the dump is usable.
6 years ago
Mirko Sertic c9fc4d2105 http://www.mirkosertic.de is no longer powered by DokuWiki
Removed http://www.mirkosertic.de from the list.
7 years ago
emijrp 0e20be9a6e sort 7 years ago
emijrp bbdaf7723b update neoseeker 7 years ago
emijrp fc48c895ae update info 7 years ago
emijrp c7d5f9bb2e update, 2244 wikis 7 years ago
emijrp 75e7628a11 now get ALL wikis, even closed ones 7 years ago
Hydriz a8270a7769 Update Miraheze wiki farm 7 years ago
Hydriz 9fd6df7a3c Scan for closed wikis as well 7 years ago
Hydriz Scholz 9f97e21503 Update Miraheze wiki farm 8 years ago
emijrp fea6ab3b86 more 8 years ago
emijrp 01ccacd138 first version of wikispaces spider 8 years ago
Alexia E. Smith cb766de5ff Update gamepedia.com wikis.
This is current as of 2016-04-07 and is correct at 1,120 wikis.
8 years ago
emijrp dde7eb90ba wiki.wiki info 9 years ago
emijrp 8048b92029 adding wiki.wiki wikifarm list 9 years ago
emijrp e30cd44384 new wikifarm list of wikis 9 years ago
emijrp d44db951c2 update date 9 years ago
emijrp 64c30f2b50 updating neoseeker list and sorting, +1 new wiki 9 years ago
Southparkfan ebffb99f48 Add Miraheze wiki farm 9 years ago
Hydriz Scholz 1550d3755d Update orain.org wiki list 9 years ago
Federico Leva a1921f0919 Update list of wikia.com unarchived wikis
The list of unarchived wikis was compared to the list of wikis that we
managed to download with dumpgenerator.py:
https://archive.org/details/wikia_dump_20141219
To allow the comparison, the naming format was aligned to the format
used by dumpgenerator.py for 7z files.
9 years ago
Federico Leva ce6fbfee55 Use curl --fail instead and other fixes; add list
Now tested and used to produce the list of some 300k Wikia wikis
which don't yet have a public dump. Will soon be archived.
10 years ago
Federico Leva 7471900e56 It's easier if the list has the actual domains 10 years ago
Federico Leva 8bd3373960 Add wikia.py, to list Wikia wikis we'll dump ourselves 10 years ago
Federico Leva 8cf4d4e6ea Add 30k domains from another crawler
11011 were found alive by checkalive.py (though there could be more
if one checks more subdomains and subdirectories), some thousands
more by checklive.pl (but mostly or all false positives).

Of the alive ones, about 6245 were new to WikiApiary!
https://wikiapiary.com/wiki/Category:Oct_2014_Import
10 years ago
Federico Leva 7e0071ae7f Add some UseModWiki-looking domains 10 years ago
nemobis 6b11cef9dc A few thousands more doku.php URLs from own scraping 10 years ago
Southparkfan 8ca9eb8757 Update date of Orain wikilist 10 years ago
Southparkfan 2e2fe9b818 Update list of Orain wikis 10 years ago
nemobis 23a60fa850 MediaWiki CamelCase 10 years ago
nemobis 31112b3a80 checkalive.py: more checks before accessing stuff 10 years ago
nemobis 225c3eb478 A thousand more doku.php URLs from search 10 years ago
nemobis 3fc7dcb5de Add some more doku.php URLs 10 years ago
PiRSquared17 56c2177106 Add (incomplete) list of dokuwikis 10 years ago
PiRSquared17 03ddde3702 Move wiki lists to mediawiki subdirectory 10 years ago
etesp 1309e89d45 Added information 10 years ago
etesp dab53ea491 Added WikiApiary tropicalwiki api urls 10 years ago
Federico Leva c1e6c0ead3 Merge remote-tracking branch 'upstream/master' 10 years ago
Federico Leva 86c65fc9be Issue 161: add shodan export 10 years ago
Emilio J. Rodríguez-Posada 91cb7bef0c adding info file for wiki-site list 10 years ago
Emilio J. Rodríguez-Posada 603f1aefad adding spider for wiki-site 10 years ago
Emilio J. Rodríguez-Posada 767123e89d updating wiki-site.com list 10 years ago
Emilio J. Rodríguez-Posada 636c6a91df adding spider for neoseeker, updating list, adding info file 10 years ago
Emilio J. Rodríguez-Posada 514d5fea0e removing unused modules 10 years ago
Emilio J. Rodríguez-Posada a3e69666fe adding spider for orain wikifarm, updating list too 10 years ago
Emilio J. Rodríguez-Posada 374cf83c54 adding info file for orain.org wikifarm list 10 years ago
Emilio J. Rodríguez-Posada 7d00cfa0de adding list info file for tropicalwikis 10 years ago
Emilio J. Rodríguez-Posada ecd539f1ae adding scribblewiki list info file 10 years ago
Emilio J. Rodríguez-Posada e95d8ba6e1 sort list 10 years ago
Emilio J. Rodríguez-Posada c420d4d843 adding spider for wikkii, updating the list (10 diff wikis, 2 new, 8 dead), adding info for list 10 years ago
Emilio J. Rodríguez-Posada c7fc194f0d add info file for wikkii.com list 10 years ago
Emilio J. Rodríguez-Posada 29a64507c2 last update date 10 years ago
Emilio J. Rodríguez-Posada d90127e9cc adding details to shoutwiki info 10 years ago
Emilio J. Rodríguez-Posada 90c442a5b7 updating shoutwiki list and uploading basic spider 10 years ago
Emilio J. Rodríguez-Posada 75e2234c2c adding details to referata list 10 years ago
Emilio J. Rodríguez-Posada db9bcb68ca adding license to referata-spider.py 10 years ago
Emilio J. Rodríguez-Posada eaec1afa83 adding info about shoutwiki list 10 years ago
Emilio J. Rodríguez-Posada 1befbabb02 updating info for referata list 10 years ago
Emilio J. Rodríguez-Posada 7a6ef18339 add more wikis to referata list; uploading basic referata-spider.py 10 years ago
Emilio J. Rodríguez-Posada d97c46afd1 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 10 years ago
Emilio J. Rodríguez-Posada a988968e7e adding info for referata list 10 years ago
Emilio J. Rodríguez-Posada a1a76ac198 issue #111 10 years ago
Emilio J. Rodríguez-Posada 6442b8734d renaming 10 years ago
scottdb56 9b8673768f mediawikis_2013_byothers.txt have been filtered
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@974 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 ac60918e7b Uploaded BHW-alive_wikis.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@972 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 0b2875756f Updated alive list from other 2013 sources
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@971 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 1b0949ce7b other.list has been filtered and added as other-alive_wikis.list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@970 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 ab1e13207f Filtered out duplicates from mediawikis_2013-alive.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@968 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 17bc29b66f mediawikis_2013.txt filtered by checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@967 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
scottdb56 b9be07936a A major update - checkalive.pl now checks for api.php and writes it to the list if found.
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@965 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis ac4c93c12a Issue 85: more cross-platform shebang on all scripts
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@962 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 7eba594a2c Use api.php URLs where we found them to be working, step 2
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@959 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 92f77ab085 Use api.php URLs where we found them to be working
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@958 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 395408e5a6 Also neoseeker and sourceforge
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@956 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 3fde888394 Remove dome wikifarms wikis and a couple duplicates
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@955 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 04e55c6622 Remove a hundred index.php redundant with api.php URLs
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@954 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 18a7f42086 Update with last raw list and checkalive.py
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@953 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis cbd0905cba Add 2k more URLs from another crawl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@950 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis eb60580e91 New URLs from Incola
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@946 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis a1c89623a4 Another intermediate update with results from one more run
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@943 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 2bb3fd7a50 Add neoseeker.com, from mutante's wikistats farm list
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@937 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
10 years ago
nemobis 43074360b7 Experience shows 30 seconds is a more realistic timeout
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@932 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 8cf60a3285 Re-updated pavlo list with 30 s timeout
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@930 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis e2956e60a0 Replace some index.php with api.php where available
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@929 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 57da61aac6 a minor syntax-error fix in checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@928 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 8da3f15b35 Some manual filtering
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@927 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 84a2f9d6dc Add index.php discovery and other fixes, update checked lists consequently; bad input makes it spit ugly errors but it keeps going
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@926 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis eba5f2d54e More gamepedia from their homepage
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@925 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 51ee9e9847 added a user-agent and another search string
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@924 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 009682a037 Sync API check needle with checkalive.pl, </api> is unreliable
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@923 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis ac8ed21b0c Add Terraria, 270 wikis might be missing but where is the list?
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@922 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
nemobis 01b177bcf2 Update raw list with scraper run by odder
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@921 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 0ac7e477f5 Re-formatting of readme-checkalive.txt
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@920 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 fc9207291a readme for checkalive.pl
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@919 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago
scottdb56 d9247aa0ba Lots of changes - improved error handling, progress reporting and other minor changes
git-svn-id: https://wikiteam.googlecode.com/svn/trunk@918 31edc4fc-5e31-b4c4-d58b-c8bc928bcb95
11 years ago