First re-org commit.

pull/3/head
Heretic 4 years ago
commit 0be7ce9a54

.gitignore

@@ -0,0 +1 @@
target

rustfmt.toml

@@ -0,0 +1,5 @@
tab_spaces = 2
edition = "2018"
imports_layout = "HorizontalVertical"
merge_imports = true
reorder_imports = true

Cargo.lock generated

(File diff suppressed because it is too large.)

Cargo.toml

@@ -0,0 +1,29 @@
[package]
name = "torrents-csv-service"
version = "0.1.0"
authors = ["Dessalines <happydooby@gmail.com>"]
edition = "2018"

[dependencies]
actix-web = "3.1"
actix-files = "0.4"
env_logger = "0.8"
serde = "1.0"
serde_json = "1.0"
serde_derive = "1.0"
futures = "0.3"
failure = "0.1"
r2d2 = "0.8"
r2d2_sqlite = "0.17"

[dependencies.rusqlite]
version = "0.24"
features = ["bundled"]

[dev-dependencies.cargo-husky]
version = "1"
default-features = false # Disable features which are enabled by default
features = ["precommit-hook", "run-cargo-fmt", "run-cargo-clippy"]

README.md

@@ -0,0 +1,140 @@
# Torrents.csv
<!-- Torrents.csv - An open source, collaborative repository of torrents, with a self-hostable web server. -->
[Demo Server](https://torrents-csv.ml)

`Torrents.csv` is a *collaborative* repository of torrents and their files, consisting of a searchable `torrents.csv` and `torrent_files.csv`. With it you can search for torrents, or for files within torrents. It aims to be a universal file system for popular data.

It was initially populated with a January 2017 backup of The Pirate Bay, and new torrents are periodically added from various torrent sites. It comes with a self-hostable [Torrents.csv webserver](https://torrents-csv.ml), a command-line search, and a folder scanner to add torrents and their files.

To keep the file small, `Torrents.csv` only stores torrents with at least one seeder; it is periodically purged of non-seeded torrents and sorted by infohash.

![img](https://i.imgur.com/yTFuwpv.png)

To request more torrents, or to add your own, go [here](https://gitlab.com/dessalines/torrents.csv/issues).

Made with [Rust](https://www.rust-lang.org), [ripgrep](https://github.com/BurntSushi/ripgrep), [Actix](https://actix.rs/), [Inferno](https://www.infernojs.org), and [Typescript](https://www.typescriptlang.org/).
## Webserver
`Torrents.csv` comes with a simple webserver. [Demo Server](https://torrents-csv.ml)
### Docker
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv
cd scripts && ./build_sqlite.sh -f && cd ..
cd docker/prod
docker-compose up -d
```
### Docker Development
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv/scripts && ./build_sqlite.sh && cd ..
cd docker/dev
docker-compose up -d
```
### Local
#### Requirements
- [Rust](https://www.rust-lang.org/)
- [Yarn](https://yarnpkg.com/en/)
- [SQLite3](https://www.sqlite.org/index.html)
#### Running
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv/scripts
./webserver.sh
```
and go to http://localhost:8902.

If running on a different host, run `export TORRENTS_CSV_ENDPOINT=http://whatever.com` to change the hostname, or use a reverse proxy such as nginx or apache2.

The torrent data is updated daily. To pull in those updates, run the following (or put it in a crontab):

`cd scripts && ./git_update.sh`

This updates the repository and rebuilds the sqlite cache necessary for searching.
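The daily update above can live in cron. A minimal sketch of a crontab entry, assuming the repo was cloned to `/path/to/torrents.csv` (an illustrative path, adjust to your machine):

```sh
# Hypothetical crontab line: run the update script nightly at 03:00.
# "/path/to/torrents.csv" is an assumed clone location, not a real default.
0 3 * * * cd /path/to/torrents.csv/scripts && ./git_update.sh
```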
To re-build the frontend assets, use `cd server/ui && yarn build`. There is no need to restart the `./webserver.sh` script.
## Command Line Searching
### Requirements
- [ripgrep](https://github.com/BurntSushi/ripgrep)
### Running
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv
./search.sh "bleh season 1"
bleh season 1 (1993-)
seeders: 33
size: 13GiB
link: magnet:?xt=urn:btih:INFO_HASH_HERE
```
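`search.sh` relies on ripgrep, but the core idea is a case-insensitive pattern match over `torrents.csv`. A rough plain-`grep` equivalent, using a made-up sample file (the header matches the real column layout; the data rows are invented; the real script also formats seeders, size, and a magnet link, as the sample output above shows):

```sh
# Build a tiny sample torrents.csv (invented rows, real column layout).
cat > /tmp/sample-torrents.csv << 'EOF'
infohash;name;size_bytes;created_unix;seeders;leechers;completed;scraped_date
aaa111;Bleh Season 1;13000000000;1480000000;33;2;100;1540000000
bbb222;Other Show;500000000;1490000000;5;1;20;1540000000
EOF
# Case-insensitive search, with spaces in the query loosened to wildcards:
grep -i 'bleh.*season.*1' /tmp/sample-torrents.csv
```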
## Uploading / Adding Torrents from a Directory
An *upload* consists of making a pull request after running the `scan_torrents.sh` script, which adds torrents from a directory you choose to the `.csv` file, after checking that they aren't already there and that they have seeders. It also adds their files to `torrent_files.csv`.
### Requirements
- [Torrent-Tracker-Health Dessalines branch](https://github.com/dessalines/torrent-tracker-health)
- `npm i -g dessalines/torrent-tracker-health`
- [jq command-line JSON parser](https://stedolan.github.io/jq/) (needs at least jq-1.6)
- [NodeJS](https://nodejs.org/en/)
- [Gnu Parallel](https://www.gnu.org/software/parallel/)
### Running
[Click here](https://gitlab.com/dessalines/torrents.csv/forks/new) to fork this repo.
```sh
git clone https://gitlab.com/[MY_USER]/torrents.csv
cd torrents.csv/scripts
./scan_torrents.sh MY_TORRENTS_DIR # `MY_TORRENTS_DIR` is `~/.local/share/data/qBittorrent/BT_backup/` for qBittorrent on linux, but you can search for where your torrents are stored for your client.
git commit -am "Adding [MY_USER] torrents"
git push
```
Then [click here](https://gitlab.com/dessalines/torrents.csv/merge_requests/new) to do a pull/merge request to my branch.
## Web scraping torrents
`Torrents.csv` includes a `Rust` scraper, in the `new_torrents_fetcher` folder, that fetches new and top torrents from several torrent sites. It currently scrapes skytorrents, magnetdl, and leetx.
### Requirements
- Rust
- [Cloudflare Scrape](https://github.com/Anorov/cloudflare-scrape)
- `sudo pip install cfscrape`
### Running
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv/scripts
./update.sh SAVE_TORRENT_DIR
```
## API
A JSON output of search results is available at:

`http://localhost:8902/service/search?q=[QUERY]&size=[NUMBER_OF_RESULTS]&offset=[PAGE]&type=[torrent | file]`

New torrents are at:

`http://localhost:8902/service/new?size=[NUMBER_OF_RESULTS]&offset=[PAGE]&type=[torrent | file]`
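Spaces in `[QUERY]` must be URL-encoded. A small shell sketch building a search URL from the pattern above (the `curl`/`jq` call is left commented out since it needs a running server):

```sh
# URL-encode spaces in the query and assemble the search URL.
query="bleh season 1"
encoded=$(printf '%s' "$query" | sed 's/ /%20/g')
url="http://localhost:8902/service/search?q=${encoded}&size=25&offset=1&type=torrent"
echo "$url"
# With the webserver running:
# curl -s "$url" | jq .
```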
## How the torrents.csv file looks
```sh
infohash;name;size_bytes;created_unix;seeders;leechers;completed;scraped_date
# torrents here...
```
## How the torrent_files.csv looks
```sh
infohash;index;path;size_bytes
```
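Both `.csv` files are semicolon-delimited plain text, so standard tools can slice columns directly. A quick sketch with an invented row (field order as in the headers above):

```sh
# Pull name and seeders out of an invented torrents.csv row with awk.
line='aaa111;Example Torrent;13000000000;1480000000;33;2;100;1540000000'
echo "$line" | awk -F';' '{ print $2 " (seeders: " $5 ")" }'
# → Example Torrent (seeders: 33)
```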

src/main.rs

@@ -0,0 +1,231 @@
extern crate actix_files;
extern crate actix_web;
extern crate serde;
extern crate serde_json;
#[macro_use]
extern crate serde_derive;
extern crate rusqlite;
#[macro_use]
extern crate failure;
extern crate r2d2;
extern crate r2d2_sqlite;
use actix_files as fs;
use actix_files::NamedFile;
use actix_web::{middleware, web, App, HttpResponse, HttpServer};
use failure::Error;
use r2d2_sqlite::SqliteConnectionManager;
use rusqlite::params;
use serde_json::Value;
use std::{cmp, env, io, ops::Deref};
const DEFAULT_SIZE: usize = 25;
#[actix_web::main]
async fn main() -> io::Result<()> {
  println!("Access me at {}", endpoint());
  std::env::set_var("RUST_LOG", "actix_web=debug");
  env_logger::init();

  let manager = SqliteConnectionManager::file(torrents_db_file());
  let pool = r2d2::Pool::builder().max_size(15).build(manager).unwrap();

  HttpServer::new(move || {
    App::new()
      .data(pool.clone())
      .wrap(middleware::Logger::default())
      .service(fs::Files::new("/static", front_end_dir()))
      .route("/", web::get().to(index))
      .route("/service/search", web::get().to(search))
  })
  .keep_alive(None)
  .bind(endpoint())?
  .run()
  .await
}

async fn index() -> Result<NamedFile, actix_web::error::Error> {
  Ok(NamedFile::open(front_end_dir() + "/index.html")?)
}
fn front_end_dir() -> String {
  env::var("TORRENTS_CSV_FRONT_END_DIR").unwrap_or_else(|_| "../ui/dist".to_string())
}

fn torrents_db_file() -> String {
  env::var("TORRENTS_CSV_DB_FILE").unwrap_or_else(|_| "../../torrents.db".to_string())
}

fn endpoint() -> String {
  env::var("TORRENTS_CSV_ENDPOINT").unwrap_or_else(|_| "0.0.0.0:8080".to_string())
}

#[derive(Deserialize)]
struct SearchQuery {
  q: String,
  page: Option<usize>,
  size: Option<usize>,
  type_: Option<String>,
}
async fn search(
  db: web::Data<r2d2::Pool<SqliteConnectionManager>>,
  query: web::Query<SearchQuery>,
) -> Result<HttpResponse, actix_web::Error> {
  let res = web::block(move || {
    let conn = db.get().unwrap();
    search_query(query, conn)
  })
  .await
  .map(|body| {
    HttpResponse::Ok()
      .header("Access-Control-Allow-Origin", "*")
      .json(body)
  })
  .map_err(actix_web::error::ErrorBadRequest)?;
  Ok(res)
}
fn search_query(
  query: web::Query<SearchQuery>,
  conn: r2d2::PooledConnection<SqliteConnectionManager>,
) -> Result<Value, Error> {
  let q = query.q.trim();
  // Reject empty or too-short queries (and the explicitly blocked term "2020").
  if q.is_empty() || q.len() < 3 || q == "2020" {
    return Err(format_err!(
      "{{\"error\": \"{}\"}}",
      "Query must be at least 3 characters"
    ));
  }

  let page = query.page.unwrap_or(1);
  let size = cmp::min(100, query.size.unwrap_or(DEFAULT_SIZE));
  let type_ = query.type_.as_ref().map_or("torrent", String::deref);
  let offset = size * (page - 1);

  println!(
    "query = {}, type = {}, page = {}, size = {}",
    q, type_, page, size
  );

  let res = if type_ == "file" {
    let results = torrent_file_search(conn, q, size, offset)?;
    serde_json::to_value(&results).unwrap()
  } else {
    let results = torrent_search(conn, q, size, offset)?;
    serde_json::to_value(&results).unwrap()
  };

  Ok(res)
}
#[derive(Debug, Serialize, Deserialize)]
struct Torrent {
  infohash: String,
  name: String,
  size_bytes: isize,
  created_unix: u32,
  seeders: u32,
  leechers: u32,
  completed: Option<u32>,
  scraped_date: u32,
}

fn torrent_search(
  conn: r2d2::PooledConnection<SqliteConnectionManager>,
  query: &str,
  size: usize,
  offset: usize,
) -> Result<Vec<Torrent>, Error> {
  // SQLite's "limit ?2, ?3" is "limit <offset>, <count>".
  let stmt_str = "select * from torrents where name like '%' || ?1 || '%' limit ?2, ?3";
  let mut stmt = conn.prepare(stmt_str)?;
  let torrent_iter = stmt.query_map(
    params![
      query.replace(" ", "%"),
      offset.to_string(),
      size.to_string(),
    ],
    |row| {
      Ok(Torrent {
        infohash: row.get(0)?,
        name: row.get(1)?,
        size_bytes: row.get(2)?,
        created_unix: row.get(3)?,
        seeders: row.get(4)?,
        leechers: row.get(5)?,
        completed: row.get(6)?,
        scraped_date: row.get(7)?,
      })
    },
  )?;

  let mut torrents = Vec::new();
  for torrent in torrent_iter {
    torrents.push(torrent?);
  }
  Ok(torrents)
}
#[derive(Debug, Serialize, Deserialize)]
struct File {
  infohash: String,
  index_: u32,
  path: String,
  size_bytes: isize,
  created_unix: u32,
  seeders: u32,
  leechers: u32,
  completed: Option<u32>,
  scraped_date: u32,
}

fn torrent_file_search(
  conn: r2d2::PooledConnection<SqliteConnectionManager>,
  query: &str,
  size: usize,
  offset: usize,
) -> Result<Vec<File>, Error> {
  let stmt_str = "select * from files where path like '%' || ?1 || '%' limit ?2, ?3";
  let mut stmt = conn.prepare(stmt_str)?;
  let file_iter = stmt.query_map(
    params![
      query.replace(" ", "%"),
      offset.to_string(),
      size.to_string(),
    ],
    |row| {
      Ok(File {
        infohash: row.get(0)?,
        index_: row.get(1)?,
        path: row.get(2)?,
        size_bytes: row.get(3)?,
        created_unix: row.get(4)?,
        seeders: row.get(5)?,
        leechers: row.get(6)?,
        completed: row.get(7)?,
        scraped_date: row.get(8)?,
      })
    },
  )?;

  let mut files = Vec::new();
  for file in file_iter {
    files.push(file?);
  }
  Ok(files)
}
#[cfg(test)]
mod tests {
  use r2d2_sqlite::SqliteConnectionManager;

  #[test]
  fn test() {
    let manager = SqliteConnectionManager::file(super::torrents_db_file());
    let pool = r2d2::Pool::builder().max_size(15).build(manager).unwrap();
    let conn = pool.get().unwrap();
    let results = super::torrent_search(conn, "sherlock", 10, 0);
    assert!(results.unwrap().len() > 2);
  }
}