reddit-grab

More information about the archiving project can be found on the ArchiveTeam wiki: Reddit

Setup instructions

General instructions

Data integrity is very important in Archive Team projects. Please note the following important rules:

We strongly encourage you to join the IRC channel associated with this project in order to be informed about project updates and other important announcements, as well as to be reachable in the event of an issue. The Archive Team Wiki has more information about IRC. We can be found at hackint IRC #shreddit.

If you have any questions or issues during setup, please review the wiki pages or contact us on IRC for troubleshooting information.

Running the project

This and other archiving projects can easily be run using the Archive Team Warrior virtual machine. Follow the instructions on the Archive Team wiki for installing the Warrior, and from the web interface running at http://localhost:8001/, enter the nickname that you want to be shown as on the tracker. There is no registration, just pick a nickname you like. Then, select the Reddit project in the Warrior interface.

Project-specific Docker container (for more advanced users)

Alternatively, more advanced users can run projects using Docker. While users of the Warrior can switch between projects using a web interface, Docker containers are specific to each project. On the other hand, the Warrior supports a maximum of 6 concurrent items, whereas a Docker container supports a maximum of 20 concurrent items. The instructions below are a short overview. For more information and detailed explanations of the commands, follow the Docker instructions on the Archive Team wiki.

It is advised to use Watchtower to automatically update the project container:

docker run -d --name watchtower --restart=unless-stopped -v /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --label-enable --cleanup --interval 3600 --include-restarting

after which the project container can be run:

docker run -d --name archiveteam --label=com.centurylinklabs.watchtower.enable=true --log-driver json-file --log-opt max-size=50m --restart=unless-stopped atdr.meo.ws/archiveteam/reddit-grab --concurrent 1 YOURNICKHERE

Be sure to replace YOURNICKHERE with the nickname that you want to be shown as on the tracker. There is no registration, just pick a nickname you like.
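
To avoid starting the container with the placeholder nickname or an out-of-range concurrency, the command above can be assembled by a small shell helper. The following is only a sketch: the function name `build_grab_cmd` is made up for illustration and is not part of the project, and the 1 to 20 range check simply mirrors the Docker limit mentioned earlier. The helper prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper (not part of reddit-grab): prints the docker command
# for the project container after basic sanity checks on its arguments.
build_grab_cmd() {
    nick="$1"
    concurrent="${2:-1}"   # default matches the README example (--concurrent 1)

    # Refuse an empty nickname or the unreplaced placeholder.
    if [ -z "$nick" ] || [ "$nick" = "YOURNICKHERE" ]; then
        echo "error: pass the nickname you want shown on the tracker" >&2
        return 1
    fi

    # The concurrency value must be a plain non-negative integer.
    case "$concurrent" in
        ''|*[!0-9]*)
            echo "error: concurrency must be a whole number" >&2
            return 1
            ;;
    esac

    # Assumed range: 1 to 20, the Docker maximum stated in this README.
    if [ "$concurrent" -lt 1 ] || [ "$concurrent" -gt 20 ]; then
        echo "error: concurrency must be between 1 and 20" >&2
        return 1
    fi

    # Same flags as the README command: the Watchtower label enables automatic
    # updates, and the log options cap the json-file log at 50 MB.
    printf '%s\n' "docker run -d --name archiveteam --label=com.centurylinklabs.watchtower.enable=true --log-driver json-file --log-opt max-size=50m --restart=unless-stopped atdr.meo.ws/archiveteam/reddit-grab --concurrent $concurrent $nick"
}
```

For example, `build_grab_cmd mynick 4` prints the full `docker run` line with `--concurrent 4 mynick` at the end; copy the output into the terminal once it looks right.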

Supporting Archive Team

Behind the scenes, Archive Team maintains infrastructure to run the projects and process the data. If you would like to help with the costs of this infrastructure, a donation on our Open Collective would be very welcome.

Issues in the code

If you notice a bug and want to file a bug report, please use the GitHub issue tracker.

Are you a developer? Help write code for us! Look at our developer documentation for details.

Other problems

Have an issue not listed here? Join us on IRC and ask! We can be found at hackint IRC #shreddit.