First re-org commit.

pull/3/head
Heretic 4 years ago
commit 0be7ce9a54

.gitignore

@@ -0,0 +1 @@
target

rustfmt.toml

@@ -0,0 +1,5 @@
tab_spaces = 2
edition = "2018"
imports_layout = "HorizontalVertical"
merge_imports = true
reorder_imports = true

Cargo.lock generated

(File diff suppressed because it is too large.)

Cargo.toml

@@ -0,0 +1,29 @@
[package]
name = "torrents-csv-service"
version = "0.1.0"
authors = ["Dessalines <happydooby@gmail.com>"]
edition = "2018"

[dependencies]
actix-web = "3.1"
actix-files = "0.4"
env_logger = "0.8"
serde = "1.0"
serde_json = "1.0"
serde_derive = "1.0"
futures = "0.3"
failure = "0.1"
r2d2 = "0.8"
r2d2_sqlite = "0.17"

[dependencies.rusqlite]
version = "0.24"
features = ["bundled"]

[dev-dependencies.cargo-husky]
version = "1"
default-features = false # Disable features which are enabled by default
features = ["precommit-hook", "run-cargo-fmt", "run-cargo-clippy"]

README.md

@@ -0,0 +1,140 @@
# Torrents.csv
<!-- Torrents.csv - An open source, collaborative repository of torrents, with a self-hostable web server. -->
[Demo Server](https://torrents-csv.ml)

`Torrents.csv` is a *collaborative* repository of torrents and their files, consisting of a searchable `torrents.csv` and `torrent_files.csv`. With it you can search for torrents, or for files within torrents. It aims to be a universal file system for popular data.

It was initially populated with a January 2017 backup of The Pirate Bay, and new torrents are periodically added from various torrent sites. It comes with a self-hostable [Torrents.csv webserver](https://torrents-csv.ml), a command-line search, and a folder scanner to add torrents and their files.

To keep the file small, `Torrents.csv` only stores torrents with at least one seeder; it is periodically purged of non-seeded torrents and sorted by infohash.

![img](https://i.imgur.com/yTFuwpv.png)

To request more torrents, or to add your own, go [here](https://gitlab.com/dessalines/torrents.csv/issues).

Made with [Rust](https://www.rust-lang.org), [ripgrep](https://github.com/BurntSushi/ripgrep), [Actix](https://actix.rs/), [Inferno](https://www.infernojs.org), and [Typescript](https://www.typescriptlang.org/).
## Webserver
`Torrents.csv` comes with a simple webserver. [Demo Server](https://torrents-csv.ml)
### Docker
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv
cd scripts && ./build_sqlite.sh -f && cd ..
cd docker/prod
docker-compose up -d
```
### Docker Development
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv/scripts && ./build_sqlite.sh && cd ..
cd docker/dev
docker-compose up -d
```
### Local
#### Requirements
- [Rust](https://www.rust-lang.org/)
- [Yarn](https://yarnpkg.com/en/)
- [SQLite3](https://www.sqlite.org/index.html)
#### Running
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv/scripts
./webserver.sh
```
and go to http://localhost:8902.

If running on a different host, run `export TORRENTS_CSV_ENDPOINT=http://whatever.com` to change the hostname, or use a reverse proxy such as nginx or apache2.

The torrent data is updated daily. To pull in those updates, run the following (or put it in a crontab):

`cd scripts && ./git_update.sh`

This updates the repository and rebuilds the sqlite cache necessary for searching.
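The daily update above can live in cron. A minimal sketch of a crontab entry, assuming the repo was cloned to `/path/to/torrents.csv` (an illustrative path, adjust to your machine):

```sh
# Hypothetical crontab line: run the update script nightly at 03:00.
# "/path/to/torrents.csv" is an assumed clone location, not a real default.
0 3 * * * cd /path/to/torrents.csv/scripts && ./git_update.sh
```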
To re-build the frontend assets, use `cd server/ui && yarn build`. There is no need to restart the `./webserver.sh` script.
## Command Line Searching
### Requirements
- [ripgrep](https://github.com/BurntSushi/ripgrep)
### Running
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv
./search.sh "bleh season 1"
bleh season 1 (1993-)
seeders: 33
size: 13GiB
link: magnet:?xt=urn:btih:INFO_HASH_HERE
```
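`search.sh` relies on ripgrep, but the core idea is a case-insensitive pattern match over `torrents.csv`. A rough plain-`grep` equivalent, using a made-up sample file (the header matches the real column layout; the data rows are invented; the real script also formats seeders, size, and a magnet link, as the sample output above shows):

```sh
# Build a tiny sample torrents.csv (invented rows, real column layout).
cat > /tmp/sample-torrents.csv << 'EOF'
infohash;name;size_bytes;created_unix;seeders;leechers;completed;scraped_date
aaa111;Bleh Season 1;13000000000;1480000000;33;2;100;1540000000
bbb222;Other Show;500000000;1490000000;5;1;20;1540000000
EOF
# Case-insensitive search, with spaces in the query loosened to wildcards:
grep -i 'bleh.*season.*1' /tmp/sample-torrents.csv
```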
## Uploading / Adding Torrents from a Directory
An *upload* consists of making a pull request after running the `scan_torrents.sh` script, which adds torrents from a directory you choose to the `.csv` file, after checking that they aren't already there and that they have seeders. It also adds their files to `torrent_files.csv`.
### Requirements
- [Torrent-Tracker-Health Dessalines branch](https://github.com/dessalines/torrent-tracker-health)
- `npm i -g dessalines/torrent-tracker-health`
- [jq command-line JSON parser](https://stedolan.github.io/jq/) (needs at least jq-1.6)
- [NodeJS](https://nodejs.org/en/)
- [Gnu Parallel](https://www.gnu.org/software/parallel/)
### Running
[Click here](https://gitlab.com/dessalines/torrents.csv/forks/new) to fork this repo.
```sh
git clone https://gitlab.com/[MY_USER]/torrents.csv
cd torrents.csv/scripts
./scan_torrents.sh MY_TORRENTS_DIR # `MY_TORRENTS_DIR` is `~/.local/share/data/qBittorrent/BT_backup/` for qBittorrent on linux, but you can search for where your torrents are stored for your client.
git commit -am "Adding [MY_USER] torrents"
git push
```
Then [click here](https://gitlab.com/dessalines/torrents.csv/merge_requests/new) to do a pull/merge request to my branch.
## Web scraping torrents
`Torrents.csv` includes a `Rust` scraper, in the `new_torrents_fetcher` folder, that fetches new and top torrents from several torrent sites. It currently scrapes skytorrents, magnetdl, and leetx.
### Requirements
- Rust
- [Cloudflare Scrape](https://github.com/Anorov/cloudflare-scrape)
- `sudo pip install cfscrape`
### Running
```sh
git clone https://gitlab.com/dessalines/torrents.csv
cd torrents.csv/scripts
./update.sh SAVE_TORRENT_DIR
```
## API
A JSON output of search results is available at:

`http://localhost:8902/service/search?q=[QUERY]&size=[NUMBER_OF_RESULTS]&offset=[PAGE]&type=[torrent | file]`

New torrents are at:

`http://localhost:8902/service/new?size=[NUMBER_OF_RESULTS]&offset=[PAGE]&type=[torrent | file]`
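Spaces in `[QUERY]` must be URL-encoded. A small shell sketch building a search URL from the pattern above (the `curl`/`jq` call is left commented out since it needs a running server):

```sh
# URL-encode spaces in the query and assemble the search URL.
query="bleh season 1"
encoded=$(printf '%s' "$query" | sed 's/ /%20/g')
url="http://localhost:8902/service/search?q=${encoded}&size=25&offset=1&type=torrent"
echo "$url"
# With the webserver running:
# curl -s "$url" | jq .
```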
## How the torrents.csv file looks
```sh
infohash;name;size_bytes;created_unix;seeders;leechers;completed;scraped_date
# torrents here...
```
## How the torrent_files.csv looks
```sh
infohash;index;path;size_bytes
```
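Both `.csv` files are semicolon-delimited plain text, so standard tools can slice columns directly. A quick sketch with an invented row (field order as in the headers above):

```sh
# Pull name and seeders out of an invented torrents.csv row with awk.
line='aaa111;Example Torrent;13000000000;1480000000;33;2;100;1540000000'
echo "$line" | awk -F';' '{ print $2 " (seeders: " $5 ")" }'
# → Example Torrent (seeders: 33)
```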

src/main.rs

@@ -0,0 +1,231 @@
extern crate actix_files;
extern crate actix_web;
extern crate serde;
extern crate serde_json;
#[macro_use]
extern crate serde_derive;
extern crate rusqlite;
#[macro_use]
extern crate failure;
extern crate r2d2;
extern crate r2d2_sqlite;
use actix_files as fs;
use actix_files::NamedFile;
use actix_web::{middleware, web, App, HttpResponse, HttpServer};
use failure::Error;
use r2d2_sqlite::SqliteConnectionManager;
use rusqlite::params;
use serde_json::Value;
use std::{cmp, env, io, ops::Deref};
const DEFAULT_SIZE: usize = 25;
#[actix_web::main]
async fn main() -> io::Result<()> {
  println!("Access me at {}", endpoint());
  std::env::set_var("RUST_LOG", "actix_web=debug");
  env_logger::init();

  let manager = SqliteConnectionManager::file(torrents_db_file());
  let pool = r2d2::Pool::builder().max_size(15).build(manager).unwrap();

  HttpServer::new(move || {
    App::new()
      .data(pool.clone())
      .wrap(middleware::Logger::default())
      .service(fs::Files::new("/static", front_end_dir()))
      .route("/", web::get().to(index))
      .route("/service/search", web::get().to(search))
  })
  .keep_alive(None)
  .bind(endpoint())?
  .run()
  .await
}

async fn index() -> Result<NamedFile, actix_web::error::Error> {
  Ok(NamedFile::open(front_end_dir() + "/index.html")?)
}
fn front_end_dir() -> String {
  env::var("TORRENTS_CSV_FRONT_END_DIR").unwrap_or_else(|_| "../ui/dist".to_string())
}

fn torrents_db_file() -> String {
  env::var("TORRENTS_CSV_DB_FILE").unwrap_or_else(|_| "../../torrents.db".to_string())
}

fn endpoint() -> String {
  env::var("TORRENTS_CSV_ENDPOINT").unwrap_or_else(|_| "0.0.0.0:8080".to_string())
}

#[derive(Deserialize)]
struct SearchQuery {
  q: String,
  page: Option<usize>,
  size: Option<usize>,
  type_: Option<String>,
}
async fn search(
  db: web::Data<r2d2::Pool<SqliteConnectionManager>>,
  query: web::Query<SearchQuery>,
) -> Result<HttpResponse, actix_web::Error> {
  let res = web::block(move || {
    let conn = db.get().unwrap();
    search_query(query, conn)
  })
  .await
  .map(|body| {
    HttpResponse::Ok()
      .header("Access-Control-Allow-Origin", "*")
      .json(body)
  })
  .map_err(actix_web::error::ErrorBadRequest)?;
  Ok(res)
}
fn search_query(
  query: web::Query<SearchQuery>,
  conn: r2d2::PooledConnection<SqliteConnectionManager>,
) -> Result<Value, Error> {
  let q = query.q.trim();
  // Reject empty or too-short queries (and the explicitly blocked term "2020").
  if q.is_empty() || q.len() < 3 || q == "2020" {
    return Err(format_err!(
      "{{\"error\": \"{}\"}}",
      "Query must be at least 3 characters"
    ));
  }

  let page = query.page.unwrap_or(1);
  let size = cmp::min(100, query.size.unwrap_or(DEFAULT_SIZE));
  let type_ = query.type_.as_ref().map_or("torrent", String::deref);
  let offset = size * (page - 1);

  println!(
    "query = {}, type = {}, page = {}, size = {}",
    q, type_, page, size
  );

  let res = if type_ == "file" {
    let results = torrent_file_search(conn, q, size, offset)?;
    serde_json::to_value(&results).unwrap()
  } else {
    let results = torrent_search(conn, q, size, offset)?;
    serde_json::to_value(&results).unwrap()
  };

  Ok(res)
}
#[derive(Debug, Serialize, Deserialize)]
struct Torrent {
  infohash: String,
  name: String,
  size_bytes: isize,
  created_unix: u32,
  seeders: u32,
  leechers: u32,
  completed: Option<u32>,
  scraped_date: u32,
}

fn torrent_search(
  conn: r2d2::PooledConnection<SqliteConnectionManager>,
  query: &str,
  size: usize,
  offset: usize,
) -> Result<Vec<Torrent>, Error> {
  // SQLite's "limit ?2, ?3" is "limit <offset>, <count>".
  let stmt_str = "select * from torrents where name like '%' || ?1 || '%' limit ?2, ?3";
  let mut stmt = conn.prepare(stmt_str)?;
  let torrent_iter = stmt.query_map(
    params![
      query.replace(" ", "%"),
      offset.to_string(),
      size.to_string(),
    ],
    |row| {
      Ok(Torrent {
        infohash: row.get(0)?,
        name: row.get(1)?,
        size_bytes: row.get(2)?,
        created_unix: row.get(3)?,
        seeders: row.get(4)?,
        leechers: row.get(5)?,
        completed: row.get(6)?,
        scraped_date: row.get(7)?,
      })
    },
  )?;

  let mut torrents = Vec::new();
  for torrent in torrent_iter {
    torrents.push(torrent?);
  }
  Ok(torrents)
}
#[derive(Debug, Serialize, Deserialize)]
struct File {
  infohash: String,
  index_: u32,
  path: String,
  size_bytes: isize,
  created_unix: u32,
  seeders: u32,
  leechers: u32,
  completed: Option<u32>,
  scraped_date: u32,
}

fn torrent_file_search(
  conn: r2d2::PooledConnection<SqliteConnectionManager>,
  query: &str,
  size: usize,
  offset: usize,
) -> Result<Vec<File>, Error> {
  let stmt_str = "select * from files where path like '%' || ?1 || '%' limit ?2, ?3";
  let mut stmt = conn.prepare(stmt_str)?;
  let file_iter = stmt.query_map(
    params![
      query.replace(" ", "%"),
      offset.to_string(),
      size.to_string(),
    ],
    |row| {
      Ok(File {
        infohash: row.get(0)?,
        index_: row.get(1)?,
        path: row.get(2)?,
        size_bytes: row.get(3)?,
        created_unix: row.get(4)?,
        seeders: row.get(5)?,
        leechers: row.get(6)?,
        completed: row.get(7)?,
        scraped_date: row.get(8)?,
      })
    },
  )?;

  let mut files = Vec::new();
  for file in file_iter {
    files.push(file?);
  }
  Ok(files)
}
#[cfg(test)]
mod tests {
  use r2d2_sqlite::SqliteConnectionManager;

  #[test]
  fn test() {
    let manager = SqliteConnectionManager::file(super::torrents_db_file());
    let pool = r2d2::Pool::builder().max_size(15).build(manager).unwrap();
    let conn = pool.get().unwrap();
    let results = super::torrent_search(conn, "sherlock", 10, 0);
    assert!(results.unwrap().len() > 2);
  }
}