# About

## What is this?

If you don't know what Torrent Paradise is, see the website: https://torrent-paradise.ml.

This is a repository of all the tools I use to build and run torrent-paradise.ml. The 'code name' of the project is nextgen (next gen torrent search), so don't be surprised if it comes up somewhere.

## Can you help me?

Maybe. Open an issue, and be sure to show that you've made an effort to solve the problem yourself first.

## Setup

Here's what the setup looks like right now:

- VPS, Debian Stretch, 2 GB RAM
  - PostgreSQL 9.6. `pg_hba.conf` contains this:

    ```
    local   all             all                                      peer
    # IPv4 local connections:
    host    nextgen         nextgen          localhost               md5
    ```

  - IPFS v0.4.18
  - user with username `nextgen` on the server
- my laptop w/ Linux
  - Go toolchain installed
  - node v10.15 & npm
  - Python 3 (required only for `index-generator/fix-metajson.py`)

The programs create the tables they need in the database themselves. The database name is `nextgen`.
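
For reference, here's a minimal sketch of the server-side provisioning this implies. None of these commands are part of the repo, so treat them as an assumption-laden starting point:

```sh
# hypothetical one-time setup on a stock Debian server
sudo adduser nextgen                              # system user the services run as
sudo -u postgres createuser --pwprompt nextgen    # DB role; md5 auth needs a password
sudo -u postgres createdb -O nextgen nextgen      # the "nextgen" database
```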

What I did first after getting the server up and running was to import the TPB dump. Download https://thepiratebay.org/static/dump/csv/torrent_dump_full.csv.gz into the import-tpb-dump directory and run the importer with `go run`, roughly as sketched below.
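
Concretely (the entry-point file name here is an assumption; use whatever `.go` file actually sits in that directory):

```sh
cd import-tpb-dump
wget https://thepiratebay.org/static/dump/csv/torrent_dump_full.csv.gz
go run main.go    # hypothetical entry point
```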

## Usage

### Generate the index

See `update-index.sh`. Before running it for the first time, you should create the materialized view `fresh`; for instructions, see the first paragraph of `snippets.sql`.
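
The first run might therefore look like this sketch (the actual `CREATE MATERIALIZED VIEW` statement lives in `snippets.sql`, not here):

```sh
# one-time: create the 'fresh' materialized view
psql -U nextgen -d nextgen    # paste the statement from snippets.sql at the prompt
# every subsequent index rebuild:
./update-index.sh
```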

### Spider the DHT

Run `go build` in spider/ to compile, then `scp` the binary to the server. You can use the systemd service file in spider/spider.service to start the spider on server boot.
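
A rough deploy sequence, assuming the server is reachable as `server` (a placeholder) and the unit is installed system-wide; paths inside spider.service may need adjusting:

```sh
cd spider
go build                            # produces the 'spider' binary
scp spider nextgen@server:          # 'server' is a placeholder hostname
scp spider.service nextgen@server:
# then, on the server:
sudo mv spider.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now spider
```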

### Scraping trackers for seed/leech data

Run `go build` in tracker-scraper/ to compile, then `scp` the binary to the server. Run it every time you want to fetch new seed/leech data for all torrents.
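
Deployment mirrors the spider (again, `server` is a placeholder):

```sh
cd tracker-scraper
go build
scp tracker-scraper nextgen@server:
ssh nextgen@server ./tracker-scraper    # run on demand, or from cron if you prefer
```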

tracker-scraper saves the results into the `trackerdata` table in the database.
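
To sanity-check a run, a row count is the safe query, since this README doesn't document the table's columns:

```sh
psql -U nextgen -d nextgen -c 'SELECT count(*) FROM trackerdata;'
```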

## Contributing

Before working on something, open an issue to ask if it would be okay. I would love to KISS (keep it simple, stupid).