diff --git a/README.md b/README.md index 442ca47..c96ac24 100644 --- a/README.md +++ b/README.md @@ -1,79 +1,87 @@ +
+ +
+ +[![license](https://shields.io/badge/license-MIT-green)](https://github.com/terhechte/postsack/blob/main/LICENSE.md) +![Rust CI](https://github.com/terhechte/postsack/actions/workflows/rust.yml/badge.svg) + + # Postsack -## Provides a high level visual overview of swaths of email +## A high level visual overview of swaths of email + +[TLDR! A web demo that shows how Postsack clusters a set of 10.000 fake emails](https://terhech.de/postsack_demo) + +Do you have a lot of mail? I have a lot of mail. There are inbox zero or archive zero people. I'm not one of them. I recently realized that my Gmail account contains roughly 650.000 emails. I looked at that and I began +to wonder.. *Why?*.. +Sure, I've been using Gmail since 2004 but still, that's 38.000 emails per year which strikes me as a bit +on the crazy side of things. Curious as I am, I wanted to figure out where all these mails came from. + +Turns out that is a tricky problem because Gmail doesn't offer any tooling for such a use case. Hence I +built my own. This tool (lovingly named `Postsack`, which is German for a bag full of mail) parses all your +emails and allows you to dig into them with a treemap (see screenshot below): + +- Import all your local mails (currently, only MBox, Apple Mail and Gmail Backups are supported) +- Build up clustered visualizations of your mails to see and understand what kind of emails you have +- Cluster the emails by sender domain / name, month, day, year, name, and some more +- Additional filters for seen mails or tags / labels +- See all the mails for the current set of filters / current cluster +- Save the generated database as a SQLite file so you can do additional queries yourself (or open it again) +- Very fast email parsing / import. My **650k mails are imported in ~1 Minute** on a MacBook M1 Pro Max and ~2 Minutes on an Intel Core i7-8700B 3.2 GHz. 
+- Cross-platform (macOS, Windows, Linux and a [Web Demo](https://terhech.de/postsack_demo)) -### Performance +## It looks like this -It currently parses 632383 emails in ~160 seconds, so roughly `4.000` emails per second. This excludes (for now) attachments. -Update: It currently parses 632115 emails in ~56 seconds, so roughly `11.000` emails per second. This excludes (for now) attachments. (on M1) +![Example](resources/animation.gif) -## Open Issues +The look is similar on all platforms as it uses the [Rust egui](https://github.com/emilk/egui) GUI library. -- [ ] check for unused dependencies in each crate -- [ ] `pub use ...` some deps in core so that they don't need to be re-imported -- [ ] check the feature.lru to see if it compiles without LRU -- [ ] build static linux binary via docker: Via Github Actions? -- [ ] try to build a static windows binary: Via Github Actions? -- [ ] try to build a macos binary: Via Github Actions? -- [ ] Demo Video -- [ ] Documentation -- [ ] wasm build? +## Videos +Here's a video showing the UI in action (e.g. me selecting some mail clusters) -## Windows Issues + -- [ ] No Outlook support -- [ ] The `apple importer` fails -- [ ] Very much untested (it does run though) +Here's another video where you can see the importer import 650k mails (it is a bit boring, though) -## Future Options + -- [ ] Add additional UI based on Druid, Iced or Native Cocoa -- [ ] maybe add blocking versions of the calls, too (in model) -- [ ] abstract over `Fields` and backend to have a generic way to display groupable information -- [ ] apply the window changes (no status etc) on startup, not just when loading main -- [ ] split up into multiple crates -- [ ] action when clicking an email? 
-- [ ] support light theme -- [ ] allow diving into splits/segments until there're no gropu bys anymore, but the last split can be opened full (to see the mails) -- [ ] remove unneeded dependencies and features -- [ ] add support for generating mail deletion rules based on the visible mails -- [ ] support more mail formats: - - [ ] outlook - - [ ] notmuch - - [ ] maildir +## Web Demo -## Development +In addition to that, you can also [play around with some fake data in this Postsack Web Demo](https://terhech.de/postsack_demo) -Generate a macOS bundle (requires [Cargo Bundle](https://github.com/burtonageo/cargo-bundle)) +## Using It -``` sh -./build_mac.sh -``` +Currently, Postsack supports three different types of mail storage: -### Linux Dependencies +- MBox files [though with some issues](https://github.com/terhechte/postsack/issues/19) +- Apple Mail +- [GMVault](http://gmvault.org) Gmail backups [though that could also be included natively](https://github.com/terhechte/postsack/issues/23) -In order to build (and or run) on Linux, the following dependencies are needed: +There are open issues for other formats such as [maildir](https://github.com/terhechte/postsack/issues/18), [notmuch](https://github.com/terhechte/postsack/issues/17) or [Outlook](https://github.com/terhechte/postsack/issues/3), but if you use one of these formats, your best bet would be to export your emails as MBox, which seems to be something most mail apps support. Alternatively, I'd be more than happy to receive PRs implementing additional mail storage formats. -#### Fedora +If you have mails in any of the archives above, you can start Postsack, select the folder with the emails, +and it will do the rest. -``` sh -# Development -sudo dnf install @development-tools glib cairo-devel pango-devel gdk-pixbux2-devel atk-devel gtk3 gtk3-devel libsqlite3x-devel -``` +## Current State -#### Ubuntu +I've created issues for some of the missing functionality, most importantly support for more email formats (as explained above). 
Beyond that, there are probably bugs, there's a certain lack of tests and documentation, the [Windows build is on shaky ground](https://github.com/terhechte/postsack/issues/20), [the light theme is wonky](https://github.com/terhechte/postsack/issues/15), [some parts need a healthy refactoring to be useful beyond Postsack](https://github.com/terhechte/postsack/issues/11), [and it would be great if the binaries could be generated from the GitHub Actions](https://github.com/terhechte/postsack/issues/10). -``` sh -# Development -sudo apt-get install libxcb-render0-dev libxcb-shape0-dev libxcb-xfixes0-dev libspeechd-dev libxkbcommon-dev libssl-dev libsqlite3-dev -``` +## Deploying to crates.io for `cargo install` +One of the issues I ran into was that many of the emails I had received over the years were not +properly standards compliant. I forked [`email-parser`, the email parser that Postsack is using](https://github.com/Mubelotix/email-parser/pull/11) in order to support all the weird issues I encountered. However, +this PR still needs a couple of enhancements before it will be merged into `email-parser`. Therefore, +Postsack is currently dependent on a fork of a crates.io crate. This means that I can't deploy this to +crates.io yet. Once the aforementioned pull request has been improved and merged, I will draft +a crates.io release. -#### Windows +## Overview -Windows supor is a bit shaky. +Here's an overview of the different crates in the Postsack Workspace: -- [cargo bundle](https://github.com/burtonageo/cargo-bundle/issues/77) doesn't currently work on Windows -- [cargo wix should work](https://github.com/volks73/cargo-wix), but I could not get it to work -- `cargo build --release` works, but then the binary has no icon. 
\ No newline at end of file +- [ps-core](ps-core/src/lib.rs): Core types, traits and imports +- [ps-importer](ps-importer/src/lib.rs): Import different email formats into a database +- [ps-database](ps-database/src/lib.rs): Implements the `ps_core::DatabaseLike` trait on top of SQLite +- [postsack-native](postsack-native): Builds the native (macOS, Linux, Windows) versions of Postsack +- [postsack-web](postsack-web): Builds [the web demo of Postsack](https://terhech.de/postsack_demo) diff --git a/postsack-native/tests/test_engine.rs b/postsack-native/tests/test_engine.rs index 0fb3eda..91e6b81 100644 --- a/postsack-native/tests/test_engine.rs +++ b/postsack-native/tests/test_engine.rs @@ -1,7 +1,7 @@ use ps_core::{ self, model::{self, Engine, Rect}, - Config, DatabaseLike, DatabaseQuery, Field, Filter, FormatType, Importerlike, ValueField, + Config, DatabaseLike, Field, Filter, FormatType, Importerlike, ValueField, }; use ps_database::Database; use ps_importer::mbox_importer; diff --git a/ps-core/src/importer.rs b/ps-core/src/importer.rs index 811124d..41ae326 100644 --- a/ps-core/src/importer.rs +++ b/ps-core/src/importer.rs @@ -23,7 +23,7 @@ pub enum Message { ReadTotal(usize), /// Whenever an item out of the total is read, this message will be emitted ReadOne, - /// Similar to [`ReadTotal`] + /// Similar to [`Message::ReadTotal`] WriteTotal(usize), /// Similar to `ReadOne` WriteOne, diff --git a/ps-core/src/lib.rs b/ps-core/src/lib.rs index 4830697..0d1f76d 100644 --- a/ps-core/src/lib.rs +++ b/ps-core/src/lib.rs @@ -1,3 +1,48 @@ +//! # Core types, traits and imports +//! +//! This crate is responsible for the core query constructs and core types. +//! It also re-exports the crates that need to be used in the rest of the workspace. +//! +//! It contains the following modules: +//! +//! ## database +//! +//! Query and query result abstractions. These are the types that are used by the +//! 
gui to request data from the database and from the database to send the data +//! back to the gui. +//! Also, the required traits to implement a generic database type for the importer +//! and the gui. +//! +//! ## importer +//! +//! Types and traits that define how a data importer works. Types conforming to these +//! traits are used in the gui to import data into a database. +//! +//! ## model +//! +//! All the functionality related to the model requirements of the view. Query data, +//! generate 2D segmentation rectangles out of the data, set filters, select segments: +//! basically all the processing of the data. +//! +//! ## message_adapter +//! +//! An abstraction on top of any `importer` to simplify using it. +//! +//! ## types +//! +//! Multiple types which are needed across the codebase, such as the `Configuration` or +//! the representation of an email. +//! +//! # Usage +//! +//! The core library itself needs a database and an importer to be useful. Once these +//! types exist, core will use the importer to fill the database and then the types +//! in `model` (e.g. `engine.rs`) can be used to perform segmentations of the data. +//! +//! Currently the model / engine is implemented in a non-intuitive asynchronous way (not +//! the Rust `async` way), but this is due to finding a solution that would work with +//! egui. See: