@@ -1,6 +1,6 @@
# Trandoshan dark web crawler

[![trandoshan](https://snapcraft.io//trandoshan/badge.svg)](https://snapcraft.io/trandoshan)
![CI](https://github.com/creekorful/trandoshan/workflows/CI/badge.svg)

This repository is a complete rewrite of the Trandoshan dark web crawler. Everything has been written inside a single
Git repository to ease maintenance.
@@ -8,15 +8,15 @@ Git repository to ease maintenance.
## Why a rewrite?

The first version of Trandoshan [(available here)](https://github.com/trandoshan-io) is working great but
not really professional, the code start to be a mess, hard to manage since split in multiple repositories, etc..
not really professional, the code start to be a mess, hard to manage since split in multiple repositories, etc.

I have therefore decided to create & maintain the project in this specific directory, where all process code will be available
(as a Go module).
I have therefore decided to create & maintain the project in this specific repository,
where all process code will be available (as a Go module).

# How to start the crawler

Execute the ``/scripts/start.sh`` and wait for all containers to start.
You can start the crawler in detached mode by passing --detach to start.sh
You can start the crawler in detached mode by passing --detach to it.

## Note
@@ -29,15 +29,17 @@ Since the API is exposed on localhost:15005, one can use it to start the crawlin
using trandoshanctl executable:

```sh
trandoshanctl schedule https://www.facebookcorewwwi.onion
$ trandoshanctl schedule https://www.facebookcorewwwi.onion
```

or using the docker image:

```sh
docker run creekorful/trandoshanctl schedule https://www.facebookcorewwwi.onion
$ docker run creekorful/trandoshanctl --api-uri <uri> schedule https://www.facebookcorewwwi.onion
```

(you'll need to specify the api uri if you use the docker container)

this will schedule given URL for crawling.

# How to view results