.github/workflows Add Github action support 2020-04-05 19:28:45 +02:00
build Add basic scheduler implementation 2020-04-05 19:24:19 +02:00
cmd Add basic scheduler implementation 2020-04-05 19:24:19 +02:00
deployments Setup crawler to use tor proxy to reach hidden services 2020-04-05 19:57:56 +02:00
internal Scheduler: normalized received URLs 2020-04-05 20:55:56 +02:00
pkg/proto Run gofmt over the project 2020-04-05 20:02:29 +02:00
scripts Add basic scheduler implementation 2020-04-05 19:24:19 +02:00
.gitignore Initial commit 2020-04-03 17:43:59 +02:00
go.mod Scheduler: normalized received URLs 2020-04-05 20:55:56 +02:00
go.sum Scheduler: normalized received URLs 2020-04-05 20:55:56 +02:00
LICENSE Initial commit 2020-04-03 17:43:59 +02:00
README.md Implement crawler process 2020-04-05 18:45:58 +02:00

Trandoshan dark web crawler

This repository is a complete rewrite of the Trandoshan dark web crawler. Everything has been written inside a single Git repository to ease maintenance.

Why a rewrite?

The first version of Trandoshan (available here) works well but is not really professional: the code is becoming messy and is hard to manage because it is split across multiple repositories.

I have therefore decided to create and maintain the project in this single repository, where the code for every process will be available (as a Go module).

How to start the crawler

Since the Docker images are not available yet, you must run the following script to build the crawler:

./scripts/build.sh

The crawler can be started using the start script:

./scripts/start.sh