emijrp
527401560c
2020
4 years ago
emijrp
7b03096ace
update wikidot list
4 years ago
emijrp
714c9ea1f7
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
4 years ago
emijrp
6aac36ce57
wikidot wiki list
4 years ago
emijrp
61b0b1b80b
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
5 years ago
emijrp
0cd4efb51c
better spider for wikidot
5 years ago
emijrp
f6c57d59e7
.
5 years ago
emijrp
5fd980c6b7
delay 1 second
5 years ago
emijrp
aecee2dc53
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
5 years ago
emijrp
33a93fd76a
delay 1 second
5 years ago
emijrp
966df37c54
new url https://www.archiveteam.org/
5 years ago
emijrp
d43d017075
Update README.md
5 years ago
Emilio
080b723334
Update wikiapiary-update-ia-params.py
5 years ago
nemobis
be0dcd8e55
Merge pull request #337 from zerote000/master
Wikiapiary update script - Change Internet Archive search string to search using both API URL and Index URL.
5 years ago
Christoffer Popp Nørskov
83f72db6cd
Wikiapiary update script - Change Internet Archive search string to search using both API URL and Index URL.
5 years ago
Emilio
287b8b88a3
250,000 wikis
5 years ago
emijrp
ffb39afd1e
800 wikidot sites
6 years ago
emijrp
28158f9b04
wikis
6 years ago
emijrp
7c72c27f2a
wikidot
6 years ago
emijrp
4e8c92b6d2
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
6 years ago
emijrp
0ebf86caf6
update, 1.8M users, 400K wikis
6 years ago
nemobis
bee34f4b1b
Merge pull request #319 from TyIsI/patch-1
Updated with vancouver.hackspace.ca -> vanhack.ca domain change
6 years ago
TyIsI
09fac2aeeb
Updated with vancouver.hackspace.ca domain change
6 years ago
emijrp
5aac17ea03
update
6 years ago
emijrp
72b67c74f1
randomize saving
6 years ago
emijrp
ca672426bb
quotes issues in titles
6 years ago
emijrp
a69f44caab
ignore expired wikis
6 years ago
emijrp
a359984932
++
6 years ago
emijrp
5525a3cc4a
++
6 years ago
emijrp
3361e4d09f
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
6 years ago
emijrp
94ebe5e1a3
skipping deactivated wikispaces
6 years ago
Federico Leva
83af47d6c0
Catch and raise PageMissingError when query() returns no pages
6 years ago
Federico Leva
73902d39c0
For old MediaWiki releases, use rawcontinue and wikitools query()
Otherwise the query continuation may fail and only the top revisions
will be exported. Tested with Wikia:
http://clubpenguin.wikia.com/api.php?action=query&prop=revisions&titles=Club_Penguin_Wiki
Also add parentid since it's available after all.
https://github.com/WikiTeam/wikiteam/issues/311#issuecomment-391957783
6 years ago
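The continuation problem described in commit 73902d39c0 can be illustrated with a small sketch. It is not the project's code and does not use wikitools. It assumes the legacy response format, in which the API returns a top-level `query-continue` object keyed by module name; the helper name `next_request_params` is hypothetical.

```python
def next_request_params(params, response):
    """Return the parameters for the next request, or None when done.

    Assumption: `response` is a decoded legacy-format API reply in which
    continuation data sits under the "query-continue" key, grouped by module.
    """
    continuation = response.get("query-continue")
    if not continuation:
        return None  # nothing left to fetch
    merged = dict(params)  # do not mutate the caller's dict
    for module_values in continuation.values():
        merged.update(module_values)
    return merged


# Example with a hand-written response, no network access involved.
first_params = {
    "action": "query",
    "prop": "revisions",
    "titles": "Club_Penguin_Wiki",
    "rawcontinue": "",  # ask for the legacy continuation format
}
fake_response = {
    "query": {"pages": {}},
    "query-continue": {"revisions": {"rvcontinue": "12345"}},
}
second_params = next_request_params(first_params, fake_response)
```

If the continuation values are dropped instead of being merged into the next request, the loop ends after the first batch, which matches the symptom in the commit message: only the top revisions are exported.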
emijrp
d11df60516
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
6 years ago
emijrp
de7822cd37
duckduckgo parser; remove .zip after upload
6 years ago
Federico Leva
bf4781eeea
Merge branch 'master' of github.com:WikiTeam/wikiteam
6 years ago
Federico Leva
da64349a5d
Avoid UnboundLocalError: local variable 'reply' referenced before assignment
6 years ago
emijrp
273f1b33cb
Merge branch 'master' of https://github.com/WikiTeam/wikiteam
6 years ago
emijrp
70eefcc945
skipping deleted wikis
6 years ago
Federico Leva
3b74173e0f
launcher.py style and minor changes
6 years ago
Federico Leva
6fbde766c4
Further reduce os.walk() in launcher.py to speed up
6 years ago
Federico Leva
b7789751fc
UnboundLocalError: local variable 'reply' referenced before assignment
Warning!: "./tdicampswikiacom-20180522-wikidump" path exists
Traceback (most recent call last):
  File "./dumpgenerator.py", line 2321, in <module>
    main()
  File "./dumpgenerator.py", line 2283, in main
    while reply.lower() not in ['yes', 'y', 'no', 'n']:
UnboundLocalError: local variable 'reply' referenced before assignment
6 years ago
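The UnboundLocalError in commit b7789751fc is the usual result of a local variable that is assigned on only one branch and then read unconditionally. A minimal reproduction, assuming a simplified prompt loop rather than the actual dumpgenerator.py logic (the function names and the `resume` flag are hypothetical):

```python
VALID = ['yes', 'y', 'no', 'n']


def confirm_buggy(resume, ask):
    if not resume:
        reply = ask()  # 'reply' is only bound on this branch
    # When resume is True, 'reply' was never assigned, so this line raises.
    while reply.lower() not in VALID:
        reply = ask()
    return reply.lower() in ['yes', 'y']


def confirm_fixed(resume, ask):
    reply = ''  # bind the name before any branch can skip it
    if not resume:
        reply = ask()
    while reply.lower() not in VALID:
        reply = ask()
    return reply.lower() in ['yes', 'y']
```

Initialising the variable to an empty string also forces the loop to prompt at least once, because `''` is not a valid answer.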
Federico Leva
d76b4b4e01
Raise and catch PageMissingError when revisions API result is incomplete
https://github.com/WikiTeam/wikiteam/issues/317
6 years ago
Federico Leva
7a655f0074
Check for sha1 presence in makeXmlFromPage()
6 years ago
Federico Leva
baae839a38
Complete update of the Wikia lists
* Reduce the offset to 100, the new limit for non-bots.
* Continue listing even when we get an empty request because all
  the wikis in a batch have become inactive and are filtered out.
* Print less from curl's requests.
* Automatically write the domain names to the files here.
6 years ago
Federico Leva
4bc41c3aa2
Actually keep track of listed titles and stop when duplicates are returned
https://github.com/WikiTeam/wikiteam/issues/309
6 years ago
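The idea in commit 4bc41c3aa2, remembering which titles were already listed and stopping once the server only repeats them, can be sketched as follows. This is an illustration under assumptions, not the code from the commit: `fetch_batch` is a hypothetical callable that takes a continuation token and returns a list of titles plus the next token, and the loop stops when a batch adds nothing new.

```python
def list_titles(fetch_batch):
    """Collect titles batch by batch, stopping on repeats.

    fetch_batch(token) -> (titles, next_token); next_token is None when
    the server reports that there is nothing more to list.
    """
    seen = set()
    titles = []
    token = None
    while True:
        batch, token = fetch_batch(token)
        new = [t for t in batch if t not in seen]
        if not new:
            break  # only duplicates (or nothing) came back: stop looping
        seen.update(new)
        titles.extend(new)
        if token is None:
            break
    return titles


# A fake server that keeps returning the same continuation token forever.
def stuck_server(token):
    if token is None:
        return ["A", "B"], "B"
    return ["B", "C"], "B"


result = list_titles(stuck_server)
```

Without the `seen` set, a server that keeps handing back the same continuation token would make the loop run indefinitely.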
Federico Leva
80288cf49e
Catch allpages and namespaces API without query results
6 years ago
Federico Leva
e47f638a24
Define "check" before running checkAPI()
Traceback (most recent call last):
  File "./dumpgenerator.py", line 2294, in <module>
    main()
  File "./dumpgenerator.py", line 2239, in main
    config, other = getParameters(params=params)
  File "./dumpgenerator.py", line 1587, in getParameters
    if api and check:
UnboundLocalError: local variable 'check' referenced before assignment
6 years ago
Federico Leva
dd32202a55
Merge branch 'master' of github.com:WikiTeam/wikiteam
6 years ago
Federico Leva
fcdc1b5cf2
Use os.listdir('.')
6 years ago