Emilio J. Rodríguez-Posada 7592c43e6d renaming gpl.txt to LICENSE 10 years ago
batchdownload Sort index.php URL together 10 years ago
listsofwikis mediawikis_2013_byothers.txt have been filtered 10 years ago
research/paper-wikiteam-2014 instructions to compile LaTeX paper; 10 years ago
rewrite Adding rewrite code so others can build on top of it 10 years ago
LICENSE renaming gpl.txt to LICENSE 10 years ago
README.md Update README.md 10 years ago
commonschecker.py Issue 85: more cross-platform shebang on all scripts 10 years ago
commonsdownloader.py Issue 85: more cross-platform shebang on all scripts 10 years ago
commonssql.py Issue 85: more cross-platform shebang on all scripts 10 years ago
dumpgenerator.py Issue 85: more cross-platform shebang on all scripts 10 years ago
gui.py Issue 85: more cross-platform shebang on all scripts 10 years ago
uploadedwikis.txt https://hampedia.org/w/api.php 13 years ago
uploader.py Issue 85: more cross-platform shebang on all scripts 10 years ago
wikiadownloader.py Issue 85: more cross-platform shebang on all scripts 10 years ago
wikipediadownloader.py Issue 85: more cross-platform shebang on all scripts 10 years ago

README.md

WikiTeam

We archive wikis, from Wikipedia to the tiniest wikis

WikiTeam software is a set of tools for archiving wikis. They work on MediaWiki wikis, but we want to expand to other wiki engines. As of June 2014, WikiTeam has preserved more than 13,000 stand-alone wikis, several wikifarms, regular Wikipedia dumps and 24TB of Wikimedia Commons images.

There are thousands of wikis on the Internet. Every day some of them stop being publicly available and, for lack of backups, are lost forever. Millions of people download tons of media files (movies, music, books, etc.) from the Internet, which amounts to a kind of distributed backup. Wikis, most of them under free licenses, disappear from time to time because nobody grabbed a copy of them. That is a shame, and one we would like to fix.

WikiTeam is the Archive Team (GitHub) subcommittee on wikis. It was founded and originally developed by Emilio J. Rodríguez-Posada, a veteran Wikipedia editor and amateur archivist. Many people have helped by sending suggestions, reporting bugs, writing documentation, providing help on the mailing list and making wiki backups. Thanks to all, especially to: Federico Leva, Alex Buie, Scott Boyd, Hydriz, Platonides, Ian McEwen and Mike Dupont.

Documentation | Source code | Download available backups | Community | Follow us on Twitter

Quick guide

For downloading a wiki, including the complete XML history and all images, use:

python dumpgenerator.py --api=http://en.wikipedia.org/w/api.php --xml --images
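
To back up several wikis in one go, the command above can be wrapped in a small script. This is only a minimal sketch, not part of WikiTeam: the helper names `build_dump_command` and `dump_wikis` are hypothetical, it assumes `dumpgenerator.py` sits in the current directory, and it uses only the `--api`, `--xml` and `--images` options shown above.

```python
import subprocess


def build_dump_command(api_url, xml=True, images=True, script="dumpgenerator.py"):
    """Build the argument list for one dumpgenerator.py run (hypothetical helper)."""
    command = ["python", script, "--api=" + api_url]
    if xml:
        command.append("--xml")      # complete XML history
    if images:
        command.append("--images")   # all images
    return command


def dump_wikis(api_urls):
    """Run dumpgenerator.py once per API URL and return the URLs that failed."""
    failed = []
    for api_url in api_urls:
        result = subprocess.run(build_dump_command(api_url))
        if result.returncode != 0:
            failed.append(api_url)
    return failed


if __name__ == "__main__":
    # Same wiki as in the quick guide; add more api.php URLs as needed.
    for url in dump_wikis(["http://en.wikipedia.org/w/api.php"]):
        print("dump failed for", url)
```

Building the command as a list, rather than as one shell string, avoids quoting problems when an API URL contains special characters.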