diff --git a/README.md b/README.md index 442ca47..c96ac24 100644 --- a/README.md +++ b/README.md @@ -1,79 +1,87 @@ +

+ +

+ +[![license](https://shields.io/badge/license-MIT-green)](https://github.com/terhechte/postsack/blob/main/LICENSE.md) +![Rust CI](https://github.com/terhechte/postsack/actions/workflows/rust.yml/badge.svg) + + # Postsack -## Provides a high level visual overview of swaths of email +## A high level visual overview of swaths of email + +[TLDR! A web demo that shows how Postsack clusters a set of 10.000 fake emails](https://terhech.de/postsack_demo) + +Do you have a lot of mail? I have a lot of mail. There're inbox zero or archive zero people. I'm not one of them. I recently realized that my Gmail account contains roughly 650.000 emails. I looked at that and I began +to wonder.. *Why?*.. +Sure, I've been using Gmail since 2004 but still, that's 38.000 emails per year which strikes me as a bit +on the crazy side of things. Curious as I am, I wanted to figure out where all these mails came from. + +Turns out that is a tricky problem because Gmail doesn't offer any tooling for such a use case. Hence I +built my own. This tool (lovingly named `Postsack` which is German for a bag full of mail) parses all your +emails and allows you to dig into them with a treemap (see screenshot below): + +- Import all your local mails (currently, only MBox, Apple Mail and Gmail Backups are supported) +- Build up clustered visualizations of your mails to see and understand what kind of emails you have +- Cluster the emails by sender domain / name, month, day, year, name, and some more +- Additional filters for seen mails or tags / labels +- See all the mails for the current set of filters / current cluster +- Save the generated database as a SQLite file so you can do additional queries yourself (or open it again) +- Very fast email parsing / import. My **650k mails are imported in ~1 Minute** on a MacBook M1 Pro Max and ~2 Minutes on an Intel Core i7-8700B 3.2 GHz. 
+- Cross platform (macOS, Windows, Linux and a [Web Demo](https://terhech.de/postsack_demo)) -### Performance +## It looks like this -It currently parses 632383 emails in ~160 seconds, so roughly `4.000` emails per second. This excludes (for now) attachments. -Update: It currently parses 632115 emails in ~56 seconds, so roughly `11.000` emails per second. This excludes (for now) attachments. (on M1) +![Example](resources/animation.gif) -## Open Issues +The look is similar on all platforms as it uses the [Rust egui](https://github.com/emilk/egui) GUI library. -- [ ] check for unused dependencies in each crate -- [ ] `pub use ...` some deps in core so that they don't need to be re-imported -- [ ] check the feature.lru to see if it compiles without LRU -- [ ] build static linux binary via docker: Via Github Actions? -- [ ] try to build a static windows binary: Via Github Actions? -- [ ] try to build a macos binary: Via Github Actions? -- [ ] Demo Video -- [ ] Documentation -- [ ] wasm build? +## Videos +Here's a video showing the UI in action (e.g. me selecting some mail clusters) -## Windows Issues + -- [ ] No Outlook support -- [ ] The `apple importer` fails -- [ ] Very much untested (it does run though) +Here's another video where you can see the importer import 650k mails (it is a bit boring, though) -## Future Options + -- [ ] Add additional UI based on Druid, Iced or Native Cocoa -- [ ] maybe add blocking versions of the calls, too (in model) -- [ ] abstract over `Fields` and backend to have a generic way to display groupable information -- [ ] apply the window changes (no status etc) on startup, not just when loading main -- [ ] split up into multiple crates -- [ ] action when clicking an email? 
-- [ ] support light theme -- [ ] allow diving into splits/segments until there're no gropu bys anymore, but the last split can be opened full (to see the mails) -- [ ] remove unneeded dependencies and features -- [ ] add support for generating mail deletion rules based on the visible mails -- [ ] support more mail formats: - - [ ] outlook - - [ ] notmuch - - [ ] maildir +## Web Demo -## Development +In addition to that, you can also [play around with some fake data in this Postsack Web Demo](https://terhech.de/postsack_demo) -Generate a macOS bundle (requires [Cargo Bundle](https://github.com/burtonageo/cargo-bundle)) +## Using It -``` sh -./build_mac.sh -``` +Currently, Postsack supports three different types of mail storage: -### Linux Dependencies +- MBox files [though with some issues](https://github.com/terhechte/postsack/issues/19) +- Apple Mail +- [GMVault](http://gmvault.org) GMail backups [though that could also be included natively](https://github.com/terhechte/postsack/issues/23) -In order to build (and or run) on Linux, the following dependencies are needed: +There're open issues for other formats such as [maildir](https://github.com/terhechte/postsack/issues/18), [notmuch](https://github.com/terhechte/postsack/issues/17) or [Outlook](https://github.com/terhechte/postsack/issues/3), but if you use one of these formats, your best bet would be to export your emails as MBox, which seems to be something most mail apps support. Alternatively, I'd be more than happy for PRs implementing additional mail storage formats. -#### Fedora +If you have mails in any of the archives above, you can start Postsack, select the folder with the emails, +and it will do the rest. 
-``` sh -# Development -sudo dnf install @development-tools glib cairo-devel pango-devel gdk-pixbux2-devel atk-devel gtk3 gtk3-devel libsqlite3x-devel -``` +## Current State -#### Ubuntu +I've created issues for some of the missing functionality. Most importantly, more email formats (as explained above). 
Beyond that, there're probably bugs, there's a certain lack of tests and documentation, the [Windows build is on shaky grounds](https://github.com/terhechte/postsack/issues/20), [the light theme is wonky](https://github.com/terhechte/postsack/issues/15), [some parts need a healthy refactoring to be useful beyond Postsack](https://github.com/terhechte/postsack/issues/11), [and it would be great if the binaries could be generated from the GitHub Actions](https://github.com/terhechte/postsack/issues/10). -``` sh -# Development -sudo apt-get install libxcb-render0-dev libxcb-shape0-dev libxcb-xfixes0-dev libspeechd-dev libxkbcommon-dev libssl-dev libsqlite3-dev -``` +## Deploying to crates.io for `cargo install` +One of the issues I ran into was that many of the emails I had received over the years were not +properly standards compliant. I forked [`email-parser`, the email parser that Postsack is using](https://github.com/Mubelotix/email-parser/pull/11) in order to support all the weird issues I encountered. However, +this PR still needs a couple of enhancements before it will be merged into `email-parser`. Therefore, +Postsack is currently dependent on a fork of a crates.io crate. This means that I can't deploy this to +crates.io yet. Once the aforementioned pull request has been improved and merged, I will subsequently draft +a crates.io release. -#### Windows +## Overview -Windows supor is a bit shaky. +Here's an overview of the different crates in the Postsack Workspace: -- [cargo bundle](https://github.com/burtonageo/cargo-bundle/issues/77) doesn't currently work on Windows -- [cargo wix should work](https://github.com/volks73/cargo-wix), but I could not get it to work -- `cargo build --release` works, but then the binary has no icon. 
\ No newline at end of file +- [ps-core](ps-core/src/lib.rs): Core types, traits and imports +- [ps-importer](ps-importer/src/lib.rs): Import different email formats into a database +- [ps-database](ps-database/src/lib.rs): Implements the `ps_core::DatabaseLike` trait on top of SQLite +- [postsack-native](postsack-native): Builds the native (macOS, Linux, Windows) versions of Postsack +- [postsack-web](postsack-web): Builds [the web demo of Postsack](https://terhech.de/postsack_demo) diff --git a/postsack-native/tests/test_engine.rs b/postsack-native/tests/test_engine.rs index 0fb3eda..91e6b81 100644 --- a/postsack-native/tests/test_engine.rs +++ b/postsack-native/tests/test_engine.rs @@ -1,7 +1,7 @@ use ps_core::{ self, model::{self, Engine, Rect}, - Config, DatabaseLike, DatabaseQuery, Field, Filter, FormatType, Importerlike, ValueField, + Config, DatabaseLike, Field, Filter, FormatType, Importerlike, ValueField, }; use ps_database::Database; use ps_importer::mbox_importer; diff --git a/ps-core/src/importer.rs b/ps-core/src/importer.rs index 811124d..41ae326 100644 --- a/ps-core/src/importer.rs +++ b/ps-core/src/importer.rs @@ -23,7 +23,7 @@ pub enum Message { ReadTotal(usize), /// Whenever an item out of the total is read, this message will be emitted ReadOne, - /// Similar to [`ReadTotal`] + /// Similar to [`Message::ReadTotal`] WriteTotal(usize), /// Similar to `ReadOne` WriteOne, diff --git a/ps-core/src/lib.rs b/ps-core/src/lib.rs index 4830697..0d1f76d 100644 --- a/ps-core/src/lib.rs +++ b/ps-core/src/lib.rs @@ -1,3 +1,48 @@ +//! # Core types, traits and imports +//! +//! This crate is responsible for the core query constructs and core types. +//! It also re-exports the crates that need to be used in the rest of the workspace. +//! +//! It contains the following modules: +//! +//! ## database +//! +//! Query and Query result abstractions. These are the types that are used by the +//! 
gui to request data from the database and from the database to send the data +//! back to the gui. +//! Also, the required traits to implement a generic database type for the importer +//! and the gui. +//! +//! ## importer +//! +//! Types and traits that define how a data importer works. Types conforming to these +//! traits are used in the gui to import data into a database. +//! +//! ## model +//! +//! All the functionality related to the view-related model requirements. Query data, +//! generate 2d segmentation rectangles out of the data, set filters, select segments, +//! basically all the processing of the data. +//! +//! ## message_adapter +//! +//! An abstraction on top of any `importer` to simplify using it. +//! +//! ## types +//! +//! Multiple types which are needed across the codebase, such as the `Configuration` or +//! the representation of an email. +//! +//! # Usage +//! +//! The core library itself needs a database and an importer to be useful. Once these +//! types exist, core will use the importer to fill the database and then the types +//! in `model` (e.g. `engine.rs`) can be used to perform segmentations of the data. +//! +//! Currently the model / engine is implemented in a non-intuitive asynchronous way (not the +//! Rust `async` way), but this is due to finding a solution that would work with +//! egui. 
See: + mod database; mod importer; pub mod message_adapter; diff --git a/ps-core/src/message_adapter.rs b/ps-core/src/message_adapter.rs index 9af96d2..82b1461 100644 --- a/ps-core/src/message_adapter.rs +++ b/ps-core/src/message_adapter.rs @@ -33,7 +33,7 @@ pub struct State { pub missing_permissions: bool, } -/// This can be initialized with a [`MessageSender`] and it will +/// This can be initialized with a [`crate::MessageSender`] and it will /// automatically tally up the information into a thread-safe /// datastructure pub struct Adapter { @@ -57,7 +57,7 @@ impl Adapter { } /// Starts up a thread that handles the `MessageReceiver` messages - /// into state that can be accessed via [`read_count`], [`write_count`] and [`finished`] + /// into state that can be accessed via [`Adapter::read_count`], [`Adapter::write_count`] and [`Adapter::finished`] pub fn process( &self, database: Database, diff --git a/ps-core/src/model/engine.rs b/ps-core/src/model/engine.rs index 31559ed..a390b36 100644 --- a/ps-core/src/model/engine.rs +++ b/ps-core/src/model/engine.rs @@ -2,8 +2,8 @@ //! displayed in Segmentations. //! See [`Engine`] for more information. //! See also: -//! - [`segmentations::`] -//! - [`items::`] +//! - [`crate::model::segmentations`] +//! - [`crate::model::items`] use eyre::{bail, Result}; use lru::LruCache; @@ -139,7 +139,7 @@ impl Engine { } /// Pop the current `Segmentation` from the stack. - /// The opposite of [`engine::push`] + /// The opposite of [`crate::model::Engine::push`] pub fn pop(&mut self) { if self.group_by_stack.is_empty() || self.segmentations.is_empty() @@ -169,7 +169,7 @@ impl Engine { } /// Call this continously to retrieve calculation results and apply them. 
- /// Any mutating function on [`Engine`], such as [`Engine::push`] or [`items::items`] + /// Any mutating function on [`crate::model::Engine`], such as [`crate::model::Engine::push`] or [`crate::model::items`] /// require calling this method to apply there results once they're /// available from the asynchronous backend. /// This method is specifically non-blocking for usage in diff --git a/ps-core/src/model/mod.rs b/ps-core/src/model/mod.rs index ab50f27..b3dd01f 100644 --- a/ps-core/src/model/mod.rs +++ b/ps-core/src/model/mod.rs @@ -5,4 +5,4 @@ pub mod segmentations; mod types; pub use engine::Engine; -pub use types::{Rect, Segment}; +pub use types::{Rect, Segment, Segmentation}; diff --git a/ps-core/src/model/segmentations.rs b/ps-core/src/model/segmentations.rs index 1c9d8e7..0cc33d8 100644 --- a/ps-core/src/model/segmentations.rs +++ b/ps-core/src/model/segmentations.rs @@ -2,16 +2,21 @@ //! //! `Segmentations` are collections of `Segments` based on an aggregation of `Items`. //! -//! A `Segmentation` can be changed to be aggregated on a different `Field. -//! - [`aggregations`] -//! - [`aggregated_by`] -//! - [`set_aggregation`] +//! A `Segmentation` can be changed to be aggregated on a different `Field`. +//! +//! - [`crate::model::segmentations::aggregation_fields`] +//! - [`crate::model::segmentations::aggregated_by`] +//! - [`crate::model::segmentations::set_aggregation`] +//! //! A `Segmentation` can be changed to only return a `Range` of segments. -//! - [`current_range`] -//! - [`set_current_range`] -//! A `Segmentation` has multiple `Segments` which each can be layouted +//! +//! - [`crate::model::segmentations::segments_range`] +//! - [`crate::model::segmentations::set_segments_range`] +//! +//! A [`crate::model::Segmentation`] has multiple [`crate::model::Segment`]s which each can be layouted //! to fit into a rectangle. -//! - [`layouted_segments] +//! +//! 
- [`crate::model::segmentations::layouted_segments`] use eyre::{eyre, Result}; @@ -140,7 +145,7 @@ pub fn aggregated_by(engine: &Engine) -> Vec { /// The `Aggregation` will identify the `Segmentation` to use. So this function /// can be used to change the way a `Segmentation` is the aggregated. /// -/// Retrieve the available aggregations with [`segmentation::aggregated_by`]. +/// Retrieve the available aggregations with [`crate::model::segmentations::aggregated_by`]. /// /// # Arguments /// @@ -179,8 +184,7 @@ pub fn layouted_segments(engine: &mut Engine, bounds: types::Rect) -> Option<&[S Some(segmentation.items()) } -/// Can another level of aggregation be performed? Based on -/// [`Engine::default_group_by_stack`] +/// Can another level of aggregation be performed? pub fn can_aggregate_more(engine: &Engine) -> bool { let index = engine.group_by_stack.len(); super::engine::default_group_by_stack(index).is_some() diff --git a/ps-database/src/lib.rs b/ps-database/src/lib.rs index e6b86d1..48864dc 100644 --- a/ps-database/src/lib.rs +++ b/ps-database/src/lib.rs @@ -1,3 +1,7 @@ +//! # Database +//! +//! An implementation of the [`ps_core::DatabaseLike`] trait on top of SQLite. + mod conversion; mod db; mod sql; diff --git a/ps-gui/src/widgets/color_utils.rs b/ps-gui/src/widgets/color_utils.rs index b28bbba..b353c68 100644 --- a/ps-gui/src/widgets/color_utils.rs +++ b/ps-gui/src/widgets/color_utils.rs @@ -1,5 +1,5 @@ //! Some functions to generate beautiful colors based on a hash. -//! loosely based on https://github.com/BrandtM/colourado/blob/master/src/lib.rs +//! loosely based on <https://github.com/BrandtM/colourado/blob/master/src/lib.rs> use eframe::egui::Color32; diff --git a/ps-gui/src/widgets/popover.rs b/ps-gui/src/widgets/popover.rs index fd30e2b..3efd9d4 100644 --- a/ps-gui/src/widgets/popover.rs +++ b/ps-gui/src/widgets/popover.rs @@ -1,6 +1,6 @@ //! A popover is a popup that only closes when clicking outside. //! It is lifted from: -//! 
https://github.com/emilk/egui/blob/a1bf5aff47a7f6f3d698e6ccfb7b62b65ef2de5b/egui/src/widgets/color_picker.rs +//! <https://github.com/emilk/egui/blob/a1bf5aff47a7f6f3d698e6ccfb7b62b65ef2de5b/egui/src/widgets/color_picker.rs> //! Line 355. //! use eframe::egui::{self, Id, Response, Ui}; diff --git a/ps-gui/src/widgets/table.rs b/ps-gui/src/widgets/table.rs index 69fe0a8..9b4c996 100644 --- a/ps-gui/src/widgets/table.rs +++ b/ps-gui/src/widgets/table.rs @@ -1,4 +1,4 @@ -//! This is a fork of `https://raw.githubusercontent.com/sagebind/smplinfo/master/src/ui/widgets/table.rs` with +//! This is a fork of <https://raw.githubusercontent.com/sagebind/smplinfo/master/src/ui/widgets/table.rs> with //! some modifications. use std::ops::Range; diff --git a/ps-importer/src/formats/mbox/mod.rs b/ps-importer/src/formats/mbox/mod.rs index 84929fe..f307a64 100644 --- a/ps-importer/src/formats/mbox/mod.rs +++ b/ps-importer/src/formats/mbox/mod.rs @@ -1,9 +1,3 @@ -//! FIXME: Implement our own Mailbox reader that better implements the spec. -//! use jetsci for efficient searching: -//! https://github.com/shepmaster/jetscii -//! (or aho corasick) -//! MBox parsing is also not particularly fast as it currently doesn't use parallelism - use ps_core::eyre::eyre; use ps_core::tracing; use rayon::prelude::*; diff --git a/ps-importer/src/lib.rs b/ps-importer/src/lib.rs index bb985c9..5c70ece 100644 --- a/ps-importer/src/lib.rs +++ b/ps-importer/src/lib.rs @@ -1,3 +1,24 @@ +//! # Importer +//! +//! This crate is responsible for importing different email formats (or email storage formats) +//! by reading and parsing the data and writing it into a database (which is defined as a +//! generic type but most probably the `ps-database` module). +//! +//! Currently, the importer requires the construction of a specific type as well as the +//! configuration of the importer format in a configuration. +//! +//! ``` rs +//! https://github.com/terhechte/postsack/issues/11 +//! let path = "tests/resources/mbox"; +//! let config = +//! 
ps_core::Config::new(None, path, vec!["".to_string()], ps_core::FormatType::Mbox).expect("Config"); +//! let importer = mbox_importer(config.clone()); +//! 
// Next, create a database and run the importer +//! // let database = Database::new(&config.database_path).unwrap(); +//! // let (_receiver, handle) = importer.import(database).unwrap(); +//! ``` + use ps_core::eyre::Result; pub(crate) mod formats; diff --git a/resources/animation.gif b/resources/animation.gif new file mode 100644 index 0000000..8a0463f Binary files /dev/null and b/resources/animation.gif differ diff --git a/resources/github_logo.png b/resources/github_logo.png new file mode 100644 index 0000000..95aabea Binary files /dev/null and b/resources/github_logo.png differ diff --git a/resources/postsack_importer_video.mp4 b/resources/postsack_importer_video.mp4 new file mode 100644 index 0000000..4e2e9e7 Binary files /dev/null and b/resources/postsack_importer_video.mp4 differ diff --git a/resources/postsack_video.mp4 b/resources/postsack_video.mp4 new file mode 100644 index 0000000..54518b2 Binary files /dev/null and b/resources/postsack_video.mp4 differ diff --git a/resources/screenshot_app.jpg b/resources/screenshot_app.jpg new file mode 100644 index 0000000..4be395b Binary files /dev/null and b/resources/screenshot_app.jpg differ diff --git a/resources/screenshot_importer.jpg b/resources/screenshot_importer.jpg new file mode 100644 index 0000000..db82c2c Binary files /dev/null and b/resources/screenshot_importer.jpg differ