187 Commits (master)

Author SHA1 Message Date
Federico Leva 54d9d8051e Remove dead Miraheze wikis per checkalive.py
Closes issue #465
11 months ago
Federico Leva c09db669c9 Update checkalive.pl documentation 11 months ago
Federico Leva 1b02cee1d5 Revert "Update miraheze.org list with checkalive.py"
Some 70% of the removed wikis still return an HTTP 200, although they
may be frozen or closed.

Tested with:

git show | grep ^- | cut -f3 -d/ | sed --regexp-extended 's,(.+),https://\1/wiki/,g' | sort | shuf -n 100 | xargs -I§ -P10 sh -c "curl -Is -w '%{stderr}%{http_code}\n' § > /dev/null" 2>&1 | sort | uniq -c

This reverts commit 0a3dc23f98.
11 months ago
Federico Leva 0a3dc23f98 Update miraheze.org list with checkalive.py
Addresses issue #465
11 months ago
Federico Leva 40a1f35dae Update miraheze.org list of wikis 11 months ago
Liu d9885e0845 Update shoutwiki-spider to remove duplicates 2 years ago
Liu fcc4080b23 Update neoseeker.com.info instructions 2 years ago
Liu e7f7266550 Update fandom.com spider and remove duplicates 2 years ago
Liu 9c5c55342d Update miraheze.org spider and remove duplicates 2 years ago
Liu 4c970e358d Remove duplicates from wiki-site.com 2 years ago
Liu 74a8e9609f Update wiki-site.com spider and list 2 years ago
Liu ba7fab2e96 Add fandom-spider and update metadata and lists 2 years ago
Liu 49e41ee75d Update neoseeker.com spider and list 2 years ago
Liu 6346fd6553 Update shoutwiki.com spider and list 2 years ago
Liu f93988e9c6 Update fandom.com to HTTPS 2 years ago
Liu 91faa34529 Update shoutwiki.com list 2 years ago
Liu d6fe1d9ff8 Update battlestarwiki.org list 2 years ago
Liu 6f8f160d75 Update fandom.com list 2 years ago
Liu 6b39402ebf Update miraheze.org list 2 years ago
Liu f755153de9 Update neoseeker.com list 2 years ago
Federico Leva 10ee80ca3b Rename wikia list to fandom 2 years ago
RhinosF1 3b28efab80 Update miraheze.org list
Using https://gist.github.com/RhinosF1/18c83dfbfadb84e28ee083628c029b41
4 years ago
Federico Leva 8fb2b44fdb Update list of Wikia wikis with today's list from the API 4 years ago
Federico Leva ed46725a89 Sort list of Wikia wikis again
No change in content.
4 years ago
Federico Leva 7dad9a44cd Give up on Wikia-made dumps
There are fewer than 500 available right now, out of 400k active wikis.
4 years ago
Federico Leva accc7db019 Update list of MediaWikis
* Run checkalive.py on the "originalurl" URLs from existing items in the
  WikiTeam collection on the Internet Archive, minus dead wiki farms.
* Downloaded the list of unarchived wikis from WikiApiary.
4 years ago
Federico Leva aa0b133c1d Minimal update to list of Wikia wikis
* Change API URL to HTTPS and fandom.com.
* New output of the script (403k wikis), changed to wikia.com for diff purposes.
4 years ago
emijrp 7b03096ace update wikidot list 5 years ago
emijrp 714c9ea1f7 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 5 years ago
emijrp 6aac36ce57 wikidot wiki list 5 years ago
emijrp 0cd4efb51c better spider for wikidot 5 years ago
emijrp ffb39afd1e 800 wikidot sites 6 years ago
emijrp 28158f9b04 wikis 6 years ago
emijrp 7c72c27f2a wikidot 6 years ago
emijrp 4e8c92b6d2 Merge branch 'master' of https://github.com/WikiTeam/wikiteam 6 years ago
emijrp 0ebf86caf6 update, 1.8M users, 400K wikis 6 years ago
TyIsI 09fac2aeeb Updated with vancouver.hackspace.ca domain change 6 years ago
emijrp 5aac17ea03 update 6 years ago
emijrp 72b67c74f1 randomize saving 6 years ago
emijrp a359984932 ++ 6 years ago
emijrp 5525a3cc4a ++ 6 years ago
Federico Leva baae839a38 Complete update of the Wikia lists
* Reduce the offset to 100, the new limit for non-bots.
* Continue listing even when we get an empty request because all
  the wikis in a batch have become inactive and are filtered out.
* Print less from curl's requests.
* Automatically write the domain names to the files here.
6 years ago
Emilio 3a56037279 Merge pull request #310 from nemobis/master
Update Wikia list with wikia.py
6 years ago
emijrp 811a325756 update 6 years ago
emijrp aec3a14b7b update spider incomplete results, still running; userwikispacesXY lists instead 6 years ago
emijrp 51ebefa1c4 100,000 wikispaces 6 years ago
emijrp 7280c89b3b duckduckgo spider 6 years ago
emijrp 83158d4506 70k wikis by spider 6 years ago
emijrp 60704e3303 searching wikis with duckduckgo 6 years ago
Federico Leva b8909baa3d Update Wikia list with wikia.py 6 years ago