emijrp
a359984932
++
6 years ago
emijrp
5525a3cc4a
++
6 years ago
Federico Leva
baae839a38
Complete update of the Wikia lists
* Reduce the offset to 100, the new limit for non-bots.
* Continue listing even when we get an empty request because all
the wikis in a batch have become inactive and are filtered out.
* Print less from curl's requests.
* Automatically write the domain names to the files here.
6 years ago
Emilio
3a56037279
Merge pull request #310 from nemobis/master
Update Wikia list with wikia.py
6 years ago
emijrp
811a325756
update
6 years ago
emijrp
aec3a14b7b
update spider incomplete results, still running; userwikispacesXY lists instead
6 years ago
emijrp
51ebefa1c4
100,000 wikispaces
6 years ago
emijrp
7280c89b3b
duckduckgo spider
6 years ago
emijrp
83158d4506
70k wikis by spider
6 years ago
emijrp
60704e3303
searching wikis with duckduckgo
6 years ago
Federico Leva
b8909baa3d
Update Wikia list with wikia.py
6 years ago
emijrp
60a0ba2e54
sleep
6 years ago
emijrp
061709d9e6
50,000 wikis, do not use this list, use wikispacesXY instead
6 years ago
emijrp
30a6dc268b
wikispaces lists
6 years ago
emijrp
145b040784
update, 10000 wikis, still more arriving
6 years ago
Federico Leva
293da80da9
Add alive MediaWikis from the WikiTeam archive.org collection
6 years ago
Federico Leva
6a34bf65ea
Wikia dumps now use 7z, not gz
Note that existence doesn't mean the dump is usable.
6 years ago
Mirko Sertic
c9fc4d2105
http://www.mirkosertic.de is no longer powered by DokuWiki
Removed http://www.mirkosertic.de from the list.
7 years ago
emijrp
0e20be9a6e
sort
7 years ago
emijrp
bbdaf7723b
update neoseeker
7 years ago
emijrp
fc48c895ae
update info
7 years ago
emijrp
c7d5f9bb2e
update, 2244 wikis
7 years ago
emijrp
75e7628a11
now get ALL wikis, even closed ones
7 years ago
Hydriz
a8270a7769
Update Miraheze wiki farm
7 years ago
Hydriz
9fd6df7a3c
Scan for closed wikis as well
7 years ago
Hydriz Scholz
9f97e21503
Update Miraheze wiki farm
8 years ago
emijrp
fea6ab3b86
more
8 years ago
emijrp
01ccacd138
first version of wikispaces spider
8 years ago
Alexia E. Smith
cb766de5ff
Update gamepedia.com wikis.
This is current as of 2016-04-07 and is correct at 1,120 wikis.
8 years ago
emijrp
dde7eb90ba
wiki.wiki info
9 years ago
emijrp
8048b92029
adding wiki.wiki wikifarm list
9 years ago
emijrp
e30cd44384
new wikifarm list of wikis
9 years ago
emijrp
d44db951c2
update date
9 years ago
emijrp
64c30f2b50
updating neoseeker list and sorting, +1 new wiki
9 years ago
Southparkfan
ebffb99f48
Add Miraheze wiki farm
9 years ago
Hydriz Scholz
1550d3755d
Update orain.org wiki list
9 years ago
Federico Leva
a1921f0919
Update list of wikia.com unarchived wikis
The list of unarchived wikis was compared to the list of wikis that we
managed to download with dumpgenerator.py:
https://archive.org/details/wikia_dump_20141219
To allow the comparison, the naming format was aligned to the format
used by dumpgenerator.py for 7z files.
9 years ago
Federico Leva
ce6fbfee55
Use curl --fail instead and other fixes; add list
Now tested and used to produce the list of some 300k Wikia wikis
which don't yet have a public dump. Will soon be archived.
10 years ago
Federico Leva
7471900e56
It's easier if the list has the actual domains
10 years ago
Federico Leva
8bd3373960
Add wikia.py, to list Wikia wikis we'll dump ourselves
10 years ago
Federico Leva
8cf4d4e6ea
Add 30k domains from another crawler
11011 were found alive by checkalive.py (though there could be more
if one checks more subdomains and subdirectories), some thousands
more by checklive.pl (but mostly or all false positives).
Of the alive ones, about 6245 were new to WikiApiary!
https://wikiapiary.com/wiki/Category:Oct_2014_Import
10 years ago
Federico Leva
7e0071ae7f
Add some UseModWiki-looking domains
10 years ago
nemobis
6b11cef9dc
A few thousand more doku.php URLs from own scraping
10 years ago
Southparkfan
8ca9eb8757
Update date of Orain wikilist
10 years ago
Southparkfan
2e2fe9b818
Update list of Orain wikis
10 years ago
nemobis
23a60fa850
MediaWiki CamelCase
10 years ago
nemobis
31112b3a80
checkalive.py: more checks before accessing stuff
10 years ago
nemobis
225c3eb478
A thousand more doku.php URLs from search
10 years ago
nemobis
3fc7dcb5de
Add some more doku.php URLs
10 years ago
PiRSquared17
56c2177106
Add (incomplete) list of dokuwikis
10 years ago