

More information about the archiving project can be found on the ArchiveTeam wiki: Reddit

Setup instructions

Be sure to replace YOURNICKHERE with the nickname you want shown on the tracker. You don't need to register it; just pick a nickname you like.

In most of the cases below, there will be a web interface running at http://localhost:8001/. If you don't know or care what this is, you can ignore it; otherwise, it gives you a fancy view of what's going on.

If anything goes wrong while running the commands below, please scroll down to the bottom of this page. There's troubleshooting information there.

Running with a warrior

Follow the instructions on the ArchiveTeam wiki for installing the Warrior, and select the "Reddit" project in the Warrior interface.

Running with Docker

The recommended way to run these projects is with Docker. The instructions below are a short overview. For more information and detailed explanations of the commands, follow the Docker instructions on the Archive Team wiki.

It is advised to use watchtower to update the project automatically. Start watchtower first:

docker run --name watchtower --restart=unless-stopped -v /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --label-enable --cleanup --interval 3600

after which the project can be run:

docker run --name archiveteam --label=com.centurylinklabs.watchtower.enable=true --restart=unless-stopped --concurrent 1 YOURNICKHERE
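
Once both containers have been started, it can be useful to confirm that they are actually running. The following is a minimal sketch, not part of the project: the function name container_is_running and the DOCKER override variable are hypothetical, and the container names are the ones passed to --name above.

```shell
# container_is_running NAME
# Succeeds when a running container is named exactly NAME.
# DOCKER is a hypothetical override so the function can be tried
# without a real Docker daemon; by default the docker CLI is used.
container_is_running() {
    ${DOCKER:-docker} ps --format '{{.Names}}' | grep -qx "$1"
}

# Example (not run here):
#   container_is_running watchtower  || echo "watchtower is not running"
#   container_is_running archiveteam || echo "archiveteam is not running"
#   docker logs --tail 20 archiveteam
```

If a container is reported as missing, docker ps -a shows stopped containers as well, which helps tell "never started" apart from "exited".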

Running without a warrior or Docker

To run this outside the warrior, clone this repository, cd into its directory and run:

python3 -m pip install setuptools wheel
python3 -m pip install --upgrade seesaw zstandard requests

then start downloading with:

run-pipeline3 --concurrent 2 YOURNICKHERE

For more options, run:

run-pipeline3 --help

If you don't have root access and/or your version of pip is very old, you can replace "pip install --upgrade seesaw" with:

wget ; python3 --user ; ~/.local/bin/pip3 install --upgrade --user seesaw

so that pip and seesaw are installed in your home, then run

~/.local/bin/run-pipeline3 --concurrent 2 YOURNICKHERE

Running multiple instances on different IPs

This feature requires seesaw version 0.0.16 or greater. Use pip install --upgrade seesaw to upgrade.
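
To check the version requirement before upgrading, a comparison along these lines may help. This is a sketch under assumptions: the helper name version_ge is hypothetical, and it relies on sort -V (version sort), which GNU coreutils provides.

```shell
# version_ge INSTALLED REQUIRED
# Succeeds when INSTALLED is the same as or newer than REQUIRED.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    # The smaller of the two versions sorts first; if that is REQUIRED,
    # then INSTALLED is at least as new.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (not run here): read the version that pip reports for seesaw.
#   installed=$(python3 -m pip show seesaw | sed -n 's/^Version: //p')
#   version_ge "$installed" 0.0.16 || python3 -m pip install --upgrade seesaw
```

A version sort is used rather than a plain string comparison because, compared as text, 0.0.9 would wrongly rank above 0.0.16.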

Use the --context-value argument to pass in bind_address= (replace the IP address with your own).

Example of running 2 threads, no web interface, and Wget binding of IP address:

run-pipeline3 --concurrent 2 YOURNICKHERE --disable-web-server --context-value bind_address=
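
With several addresses, one instance per IP can be started from a loop. The sketch below only prints the commands so they can be reviewed first. The function name print_instance_commands is hypothetical, the command line mirrors the example above, and the addresses in the usage comment are documentation-only placeholders.

```shell
# print_instance_commands NICK IP...
# Prints one run-pipeline3 command per IP address; nothing is executed.
print_instance_commands() {
    nick=$1
    shift
    for ip in "$@"; do
        printf 'run-pipeline3 --concurrent 2 %s --disable-web-server --context-value bind_address=%s\n' \
            "$nick" "$ip"
    done
}

# Example (placeholder addresses, replace with your own):
#   print_instance_commands YOURNICKHERE 192.0.2.10 192.0.2.11
```

Each printed line can then be started in its own screen session, as in the distribution-specific instructions.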

Distribution-specific setup

For Debian/Ubuntu:

Version 1.4.4 of the libzstd-dev package is required; it is currently available from buster-backports.

adduser --system --group --shell /bin/bash archiveteam
echo deb buster-backports main contrib > /etc/apt/sources.list.d/backports.list
apt-get update \
&& apt-get install -y git-core libgnutls-dev lua5.1 liblua5.1-0 liblua5.1-0-dev screen bzip2 zlib1g-dev flex autoconf autopoint texinfo gperf lua-socket lua-filesystem lua-sec lua-zip rsync automake pkg-config python3-dev python3-pip build-essential poppler-utils \
&& apt-get -t buster-backports install zstd libzstd-dev libzstd1
python3 -m pip install setuptools wheel
python3 -m pip install --upgrade seesaw zstandard requests
su -c "cd /home/archiveteam; git clone; cd reddit-grab; ./" archiveteam
screen su -c "cd /home/archiveteam/reddit-grab/; run-pipeline3 --concurrent 2 --address '' YOURNICKHERE" archiveteam
[... ctrl+A D to detach ...]

In Debian Jessie, Ubuntu 18.04 Bionic and above, the libgnutls-dev package was renamed to libgnutls28-dev. So, you need to do the following instead:

adduser --system --group --shell /bin/bash archiveteam
echo deb buster-backports main contrib > /etc/apt/sources.list.d/backports.list
apt-get update \
&& apt-get install -y git-core libgnutls28-dev lua5.1 liblua5.1-0 liblua5.1-0-dev screen bzip2 zlib1g-dev flex autoconf autopoint texinfo gperf lua-socket lua-filesystem lua-sec lua-zip rsync automake pkg-config python3-dev python3-pip build-essential poppler-utils \
&& apt-get -t buster-backports install zstd libzstd-dev libzstd1
[... pretty much the same as above ...]
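
If you are unsure which of the two package names your release uses, the choice can be made by probing for the newer name. This is only a sketch: pick_gnutls_pkg is a hypothetical helper, and it assumes that the probe command (for example apt-cache show) exits with a non-zero status when the package is unknown.

```shell
# pick_gnutls_pkg PROBE...
# Runs "PROBE... libgnutls28-dev" and prints libgnutls28-dev when the probe
# succeeds, otherwise the older name libgnutls-dev.
pick_gnutls_pkg() {
    if "$@" libgnutls28-dev >/dev/null 2>&1; then
        echo libgnutls28-dev
    else
        echo libgnutls-dev
    fi
}

# Example (not run here):
#   apt-get install -y "$(pick_gnutls_pkg apt-cache show)"
```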

Wget-lua is also available on ArchiveTeam's PPA for Ubuntu.

For CentOS:


Ensure that you have the CentOS equivalent of bzip2 installed as well. You will need the EPEL repository to be enabled.

yum -y groupinstall "Development Tools"
yum -y install gnutls-devel lua-devel python-pip zlib-devel zstd libzstd-devel git-core gperf lua-socket luarocks texinfo git rsync gettext-devel
pip install --upgrade seesaw
[... pretty much the same as above ...]

Tested with EL7 repositories.

For Fedora:

The same as CentOS but with "dnf" instead of "yum". Compiling has not been tested successfully so far.

For openSUSE:


zypper install liblua5_1 lua51 lua51-devel screen python-pip libgnutls-devel bzip2 python-devel gcc make
pip install --upgrade seesaw
[... pretty much the same as above ...]

For OS X:


You need Homebrew. Ensure that you have the OS X equivalent of bzip2 installed as well.

brew install python lua gnutls
pip install --upgrade seesaw
[... pretty much the same as above ...]

There is a known issue with some packaged versions of rsync. If you get errors during the upload stage, reddit-grab will not work with your rsync version.

This supposedly fixes it:

alias rsync=/usr/local/bin/rsync

For Arch Linux:

Ensure that you have the Arch equivalent of bzip2 installed as well.

  1. Make sure you have python2-pip installed.
  2. Install the wget-lua package from the AUR.
  3. Run pip2 install --upgrade seesaw.
  4. Modify the run-pipeline script in seesaw to point at #!/usr/bin/python2 instead of #!/usr/bin/python.
  5. useradd --system --group users --shell /bin/bash --create-home archiveteam
  6. screen su -c "cd /home/archiveteam/reddit-grab/; run-pipeline --concurrent 2 --address '' YOURNICKHERE" archiveteam

For Alpine Linux:

Install the dependencies for Zstd and Wget-AT:

apk update && apk add lua5.1 lua5.1-socket lua5.1-sec python3 py3-pip git gcc libc-dev \
    lua5.1-dev lua5.1-filesystem zlib-dev gnutls-dev automake autoconf make bash \
    bzip2 rsync flex gettext gettext-dev xz gperf texinfo wget coreutils ca-certificates

Version 1.4.4 of Zstd is required, so this needs to be built from source for compatibility with arguments provided from other Alpine builds:

# Source:
git clone --depth 1 --branch v1.4.4
cd zstd && export CFLAGS="-O2" && \
    make -C lib HAVE_PTHREAD=1 HAVE_ZLIB=0 HAVE_LZMA=0 HAVE_LZ4=0 lib-mt && \
    make -C programs HAVE_PTHREAD=1 HAVE_ZLIB=0 HAVE_LZMA=0 HAVE_LZ4=0 && \
    make -C contrib/pzstd && \
    make PREFIX="/usr" install && \
    cd ..
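
After the build, it may be worth confirming which Zstd version ended up installed. The sketch below extracts the version number from the banner printed by zstd --version. The helper name zstd_version_from is hypothetical, and the banner format (a "v" followed by X.Y.Z) is an assumption based on typical zstd output.

```shell
# zstd_version_from BANNER
# Prints the X.Y.Z version found after a "v" in BANNER, or nothing
# when no such version is present.
zstd_version_from() {
    printf '%s\n' "$1" |
        sed -n 's/.*v\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Example (not run here):
#   zstd_version_from "$(zstd --version)"
```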

Clone the repo and build Wget-AT:

git clone
cd reddit-grab; ./

Run the project with

# uncomment the next line if you want to use a virtualenv (sh/bash example)
#python3 -m venv --prompt at .venv && source .venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install --upgrade seesaw zstandard requests
run-pipeline3 --concurrent 2 --address '' YOURNICKHERE

For FreeBSD:

Honestly, I have no idea. ./ supposedly doesn't work due to differences in the tar that ships with FreeBSD. Another problem is the apparent absence of Lua 5.1 development headers. If you figure this out, please do let us know on IRC (#archiveteam).


Broken? These are some of the possible solutions:

wget-lua was not successfully built

If you get errors about wget.pod or something similar, the documentation failed to compile, but wget-lua itself compiled fine. Try this:

cd get-wget-lua.tmp
mv src/wget ../wget-lua
cd ..

The get-wget-lua.tmp name may be inaccurate. If you have a folder with a similar but different name, use that instead and please let us know on IRC what folder name you had!
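
Because the folder name can vary, the recovery step can also be written so that it searches for any folder whose name starts with get-wget-lua. This is a sketch, not part of the project: rescue_wget_lua is a hypothetical helper, and it copies the binary instead of moving it so the build folder stays untouched.

```shell
# rescue_wget_lua [DIR]
# Looks in DIR (default: current directory) for a build folder whose name
# starts with "get-wget-lua" and copies its src/wget to DIR/wget-lua.
rescue_wget_lua() {
    base=${1:-.}
    # The trailing slash restricts the glob to directories.
    for build_dir in "$base"/get-wget-lua*/; do
        if [ -x "${build_dir}src/wget" ]; then
            cp "${build_dir}src/wget" "$base/wget-lua"
            echo "copied ${build_dir}src/wget"
            return 0
        fi
    done
    echo "no built wget found under $base/get-wget-lua*" >&2
    return 1
}

# Example (not run here), from the repository directory:
#   rescue_wget_lua
```

The printed path shows which folder name was found, which is the detail worth reporting on IRC.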

Optionally, if you know what you're doing, you may want to use wgetpod.patch.

Problem with gnutls or openssl during get-wget-lua

Please ensure that gnutls-dev(el) and openssl-dev(el) are installed.

ImportError: No module named seesaw

If you're sure that you followed the steps to install seesaw, permissions on your module directory may be set incorrectly. Try the following:

chmod o+rX -R /usr/local/lib/python2.7/dist-packages

run-pipeline: command not found

Install seesaw using pip2 instead of pip.

pip2 install seesaw
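
Before reinstalling, it may be worth checking whether the script exists but is simply not on your PATH, which can happen with the --user install described earlier. A minimal sketch, assuming the script is named run-pipeline3 as in the commands above; find_run_pipeline is a hypothetical helper.

```shell
# find_run_pipeline
# Prints the location of run-pipeline3, looking first on PATH and then
# in ~/.local/bin (where a --user install places it).
find_run_pipeline() {
    if command -v run-pipeline3 >/dev/null 2>&1; then
        command -v run-pipeline3
    elif [ -x "$HOME/.local/bin/run-pipeline3" ]; then
        echo "$HOME/.local/bin/run-pipeline3"
    else
        echo "run-pipeline3 not found on PATH or in ~/.local/bin" >&2
        return 1
    fi
}

# Example (not run here):
#   "$(find_run_pipeline)" --help
```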

Issues in the code

If you notice a bug and want to file a bug report, please use the GitHub issues tracker.

Are you a developer? Help write code for us! Look at our developer documentation for details.

Other problems

Have an issue not listed here? Join us on IRC and ask! We can be found at hackint IRC #shreddit.