Documentation (#24)

Improved Documentation
pull/30/head
Benedikt Terhechte 2 years ago committed by GitHub
parent e8f9d9fdb5
commit daf650bab9

@ -1,79 +1,87 @@
<p align="center">
<img src="resources/github_logo.png" width="241" height="287" />
</p>
[![license](https://shields.io/badge/license-MIT-green)](https://github.com/terhechte/postsack/blob/main/LICENSE.md)
![Rust CI](https://github.com/terhechte/postsack/actions/workflows/rust.yml/badge.svg)
# Postsack
## Provides a high level visual overview of swaths of email
## A high level visual overview of swaths of email
[TLDR! A web demo that shows how Postsack clusters a set of 10.000 fake emails](https://terhech.de/postsack_demo)
Do you have a lot of mail? I have a lot of mail. There are inbox zero and archive zero people. I'm not one of them. I recently realized that my Gmail account contains roughly 650.000 emails. I looked at that and I began
to wonder.. *Why?*..
Sure, I've been using Gmail since 2004 but still, that's 38.000 Emails per year which strikes me as a bit
on the crazy side of things. Curious as I am I wanted to figure out where all these mails came from.
Turns out that is a tricky problem because Gmail doesn't offer any tooling for such a use case. Hence I
built my own. This tool (lovingly named `Postsack`, which is German for a bag full of mail) parses all your
emails and allows you to dig into them with a treemap (see screenshot below):
- Import all your local mails (currently, only MBox, Apple Mail and Gmail Backups are supported)
- Build up clustered visualizations of your mails to see and understand what kind of emails you have
- Cluster the emails by sender domain / name, month, day, year, name, and some more
- Additional filters for seen mails or tags / labels
- See all the mails for the current set of filters / current cluster
- Save the generated database as a SQLite file so you can do additional queries yourself (or open it again)
- Very fast email parsing / import. My **650k mails are imported in ~1 Minute** on a MacBook M1 Pro Max and ~2 Minutes on an Intel Core i7-8700B 3.2 GHz.
- Cross platform (macOS, Windows, Linux and a [Web Demo](https://terhech.de/postsack_demo))
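
To make the clustering idea from the list above concrete, here is a minimal sketch of grouping mails by sender domain. It is not Postsack's implementation: the function name `count_by_domain`, the plain-string sender list and the `"unknown"` fallback bucket are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Count how many mails came from each sender domain.
/// `senders` is a hypothetical list of raw sender addresses.
fn count_by_domain(senders: &[&str]) -> Vec<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for sender in senders {
        // Everything after the last '@' is treated as the domain.
        // Addresses without an '@' land in an "unknown" bucket (an assumption).
        let domain = match sender.rsplit_once('@') {
            Some((_, domain)) => domain.to_lowercase(),
            None => "unknown".to_string(),
        };
        *counts.entry(domain).or_insert(0) += 1;
    }
    let mut clusters: Vec<(String, usize)> = counts.into_iter().collect();
    // Largest cluster first; ties are broken alphabetically for a stable order.
    clusters.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    clusters
}
```

Each `(domain, count)` pair corresponds to one rectangle in a treemap, with the count deciding how much area it gets.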
### Performance
## It looks like this
It currently parses 632383 emails in ~160 seconds, so roughly `4.000` emails per second. This excludes (for now) attachments.
Update: It currently parses 632115 emails in ~56 seconds, so roughly `11.000` emails per second. This excludes (for now) attachments. (on M1)
![Example](resources/animation.gif)
## Open Issues
The look is similar on all platforms as it uses the [Rust egui](https://github.com/emilk/egui) GUI library.
- [ ] check for unused dependencies in each crate
- [ ] `pub use ...` some deps in core so that they don't need to be re-imported
- [ ] check the feature.lru to see if it compiles without LRU
- [ ] build static linux binary via docker: Via Github Actions?
- [ ] try to build a static windows binary: Via Github Actions?
- [ ] try to build a macos binary: Via Github Actions?
- [ ] Demo Video
- [ ] Documentation
- [ ] wasm build?
## Videos
Here's a video showing the UI in action (e.g. me selecting some mail clusters)
## Windows Issues
<a href="resources/postsack_video.mp4"><img src="resources/screenshot_app.jpg" width="50%" height="50%" /></a>
- [ ] No Outlook support
- [ ] The `apple importer` fails
- [ ] Very much untested (it does run though)
Here's another video where you can see the importer importing 650k mails (it is a bit boring, though).
## Future Options
<a href="resources/postsack_importer_video.mp4"><img src="resources/screenshot_importer.jpg" width="50%" height="50%" /></a>
- [ ] Add additional UI based on Druid, Iced or Native Cocoa
- [ ] maybe add blocking versions of the calls, too (in model)
- [ ] abstract over `Fields` and backend to have a generic way to display groupable information
- [ ] apply the window changes (no status etc) on startup, not just when loading main
- [ ] split up into multiple crates
- [ ] action when clicking an email?
- [ ] support light theme
- [ ] allow diving into splits/segments until there're no group bys anymore, but the last split can be opened full (to see the mails)
- [ ] remove unneeded dependencies and features
- [ ] add support for generating mail deletion rules based on the visible mails
- [ ] support more mail formats:
- [ ] outlook
- [ ] notmuch
- [ ] maildir
## Web Demo
## Development
In addition to that, you can also [play around with some fake data in this Postsack Web Demo](https://terhech.de/postsack_demo)
Generate a macOS bundle (requires [Cargo Bundle](https://github.com/burtonageo/cargo-bundle))
## Using It
``` sh
./build_mac.sh
```
Currently, Postsack supports three different types of mail storage:
### Linux Dependencies
- MBox files [though with some issues](https://github.com/terhechte/postsack/issues/19)
- Apple Mail
- [GMVault](http://gmvault.org) GMail backups [though that could also be included natively](https://github.com/terhechte/postsack/issues/23)
In order to build (and or run) on Linux, the following dependencies are needed:
There're open issues for other formats such as [maildir](https://github.com/terhechte/postsack/issues/18), [notmuch](https://github.com/terhechte/postsack/issues/17) or [Outlook](https://github.com/terhechte/postsack/issues/3), but if you use one of these formats your best bet would be to export your emails as MBox, which seems to be something most mail apps support. Alternatively, I'd be more than happy for PRs implementing additional Mail Storage Formats.
#### Fedora
If you have mails in any of the archives above, you can start Postsack, select the folder with the emails
and it will do the rest.
``` sh
# Development
sudo dnf install @development-tools glib cairo-devel pango-devel gdk-pixbuf2-devel atk-devel gtk3 gtk3-devel libsqlite3x-devel
```
## Current State
#### Ubuntu
I've created issues for some of the missing functionality. Most importantly more email formats (as explained above). Beyond that, there're probably bugs, there's a certain lack of tests and documentation, the [windows build is on shaky grounds](https://github.com/terhechte/postsack/issues/20), [the light theme is wonky](https://github.com/terhechte/postsack/issues/15), [some parts need a healthy refactoring to be useful beyond Postsack](https://github.com/terhechte/postsack/issues/11), [and it would be great if the binaries could be generated from the Github actions](https://github.com/terhechte/postsack/issues/10).
``` sh
# Development
sudo apt-get install libxcb-render0-dev libxcb-shape0-dev libxcb-xfixes0-dev libspeechd-dev libxkbcommon-dev libssl-dev libsqlite3-dev
```
## Deploying to crates.io for `cargo install`
One of the issues I ran into was that many of the emails I had received over the years were not
properly standards compliant. I forked [`email-parser`, the email parser that Postsack is using](https://github.com/Mubelotix/email-parser/pull/11) in order to support all the weird issues I encountered. However,
this PR still needs a couple of enhancements before it will be merged into `email-parser`. Therefore,
Postsack is currently dependent on a fork of a crates.io crate. This means that I can't deploy this to
crates yet. Once the aforementioned pull request has been improved and merged, I will subsequently draft
a crates.io release.
#### Windows
## Overview
Windows support is a bit shaky.
Here's an overview of the different crates in the Postsack Workspace:
- [cargo bundle](https://github.com/burtonageo/cargo-bundle/issues/77) doesn't currently work on Windows
- [cargo wix should work](https://github.com/volks73/cargo-wix), but I could not get it to work
- `cargo build --release` works, but then the binary has no icon.
- [ps-core](ps-core/src/lib.rs): Core types, traits and imports
- [ps-importer](ps-importer/src/lib.rs): Import different email formats into a database
- [ps-database](ps-database/src/lib.rs): Implements the `ps-core::DatabaseLike` trait on top of SQLite
- [postsack-native](postsack-native): Builds the native (macOS, Linux, Windows) versions of Postsack
- [postsack-web](postsack-web): Builds [the web demo of Postsack](https://terhech.de/postsack_demo)

@ -1,7 +1,7 @@
use ps_core::{
self,
model::{self, Engine, Rect},
Config, DatabaseLike, DatabaseQuery, Field, Filter, FormatType, Importerlike, ValueField,
Config, DatabaseLike, Field, Filter, FormatType, Importerlike, ValueField,
};
use ps_database::Database;
use ps_importer::mbox_importer;

@ -23,7 +23,7 @@ pub enum Message {
ReadTotal(usize),
/// Whenever an item out of the total is read, this message will be emitted
ReadOne,
/// Similar to [`ReadTotal`]
/// Similar to [`Message::ReadTotal`]
WriteTotal(usize),
/// Similar to `ReadOne`
WriteOne,

@ -1,3 +1,48 @@
//! # Core types, traits and imports
//!
//! This crate is responsible for the core query constructs and core types.
//! It also re-exports the crates that need to be used in the rest of the workspace.
//!
//! It contains the following modules:
//!
//! ## database
//!
//! Query and Query result abstractions. These are the types that are used by the
//! gui to request data from the database and from the database to send the data
//! back to the gui.
//! Also, the required traits to implement a generic database type for the importer
//! and the gui.
//!
//! ## importer
//!
//! Types and traits that define how a data importer works. Types conforming to these
//! traits are used in the gui to import data into a database.
//!
//! ## model
//!
//! All the functionality related to the view related model requirements. Query data,
//! generate 2d segmentation rectangles out of the data, set filters, select segments,
//! basically all the processing of the data.
//!
//! ## message_adapter
//!
//! An abstraction on top of any `importer` to simplify using them.
//!
//! ## types
//!
//! Multiple types which are needed across the codebase, such as the `Configuration` or
//! the representation of an email.
//!
//! # Usage
//!
//! The core library itself needs a database and an importer to be useful. Once these
//! types exist, core will use the importer to fill the database and then the types
//! in `model` (e.g. `engine.rs`) can be used to perform segmentations of the data.
//!
//! Currently the model / engine is implemented in a non-intuitive async (not in the
//! Rust async way) way but this is due to finding a solution that would work with
//! egui. See: <https://github.com/terhechte/postsack/issues/11>
mod database;
mod importer;
pub mod message_adapter;
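
The "async, but not in the Rust async way" style mentioned in the Usage notes above can be sketched as a request/poll pair on top of a worker thread. This is only an illustration of the pattern under stated assumptions, not the actual `Engine`: the `PollingEngine` type, its `request` and `process` methods, and the squaring "query" are all hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver, Sender, TryRecvError};
use std::thread;

/// Hypothetical stand-in for an engine that computes results on a worker thread.
struct PollingEngine {
    requests: Sender<u64>,
    responses: Receiver<u64>,
    /// Results that have been applied so far.
    results: Vec<u64>,
}

impl PollingEngine {
    fn new() -> Self {
        let (requests, request_rx) = channel::<u64>();
        let (response_tx, responses) = channel::<u64>();
        // The worker answers each request; squaring stands in for a database query.
        thread::spawn(move || {
            for n in request_rx {
                if response_tx.send(n * n).is_err() {
                    break; // The engine was dropped, so stop working.
                }
            }
        });
        PollingEngine { requests, responses, results: Vec::new() }
    }

    /// Queue work and return immediately.
    fn request(&self, n: u64) {
        // Sending only fails if the worker is gone; ignored in this sketch.
        let _ = self.requests.send(n);
    }

    /// Non-blocking: apply any finished results and return how many were applied.
    fn process(&mut self) -> usize {
        let mut applied = 0;
        loop {
            match self.responses.try_recv() {
                Ok(value) => {
                    self.results.push(value);
                    applied += 1;
                }
                Err(TryRecvError::Empty) | Err(TryRecvError::Disconnected) => break,
            }
        }
        applied
    }
}
```

An immediate-mode GUI can call `process` once per frame: the call never waits, so the UI stays responsive while results trickle in.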

@ -33,7 +33,7 @@ pub struct State {
pub missing_permissions: bool,
}
/// This can be initialized with a [`MessageSender`] and it will
/// This can be initialized with a [`crate::MessageSender`] and it will
/// automatically tally up the information into a thread-safe
/// datastructure
pub struct Adapter {
@ -57,7 +57,7 @@ impl Adapter {
}
/// Starts up a thread that handles the `MessageReceiver` messages
/// into state that can be accessed via [`read_count`], [`write_count`] and [`finished`]
/// into state that can be accessed via [`Adapter::read_count`], [`Adapter::write_count`] and [`Adapter::finished`]
pub fn process<Database: DatabaseLike + 'static, Importer: Importerlike + 'static>(
&self,
database: Database,
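
The idea behind the adapter, a thread that folds progress messages into shared state, can be sketched roughly as follows. This is a simplified illustration, not the code of `Adapter`: the `Progress` enum mirrors the variants visible in this diff, while the `Done` variant, the `Tally` struct and the `spawn_tally` function are assumptions.

```rust
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Simplified progress messages, modelled on the variants shown in this diff.
/// `Done` is a hypothetical addition to signal the end of the import.
enum Progress {
    ReadTotal(usize),
    ReadOne,
    WriteTotal(usize),
    WriteOne,
    Done,
}

/// Hypothetical thread-safe tally of the import progress.
#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct Tally {
    total_read: usize,
    read: usize,
    total_write: usize,
    written: usize,
    finished: bool,
}

/// Spawn a thread that folds incoming messages into shared state.
fn spawn_tally() -> (Sender<Progress>, Arc<RwLock<Tally>>, JoinHandle<()>) {
    let (sender, receiver) = channel();
    let state = Arc::new(RwLock::new(Tally::default()));
    let shared = Arc::clone(&state);
    let handle = thread::spawn(move || {
        // The loop ends on `Done` or when every sender has been dropped.
        for message in receiver {
            let mut tally = shared.write().expect("lock poisoned");
            match message {
                Progress::ReadTotal(n) => tally.total_read = n,
                Progress::ReadOne => tally.read += 1,
                Progress::WriteTotal(n) => tally.total_write = n,
                Progress::WriteOne => tally.written += 1,
                Progress::Done => {
                    tally.finished = true;
                    break;
                }
            }
        }
    });
    (sender, state, handle)
}
```

A GUI would keep the `Arc<RwLock<Tally>>` and read it on every frame to draw a progress bar, while the importer holds the `Sender`.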

@ -2,8 +2,8 @@
//! displayed in Segmentations.
//! See [`Engine`] for more information.
//! See also:
//! - [`segmentations::`]
//! - [`items::`]
//! - [`crate::model::segmentations`]
//! - [`crate::model::items`]
use eyre::{bail, Result};
use lru::LruCache;
@ -139,7 +139,7 @@ impl Engine {
}
/// Pop the current `Segmentation` from the stack.
/// The opposite of [`engine::push`]
/// The opposite of [`crate::model::Engine::push`]
pub fn pop(&mut self) {
if self.group_by_stack.is_empty()
|| self.segmentations.is_empty()
@ -169,7 +169,7 @@ impl Engine {
}
/// Call this continuously to retrieve calculation results and apply them.
/// Any mutating function on [`Engine`], such as [`Engine::push`] or [`items::items`]
/// Any mutating function on [`crate::model::Engine`], such as [`crate::model::Engine::push`] or [`crate::model::items`]
/// require calling this method to apply their results once they're
/// available from the asynchronous backend.
/// This method is specifically non-blocking for usage in

@ -5,4 +5,4 @@ pub mod segmentations;
mod types;
pub use engine::Engine;
pub use types::{Rect, Segment};
pub use types::{Rect, Segment, Segmentation};

@ -2,16 +2,21 @@
//!
//! `Segmentations` are collections of `Segments` based on an aggregation of `Items`.
//!
//! A `Segmentation` can be changed to be aggregated on a different `Field.
//! - [`aggregations`]
//! - [`aggregated_by`]
//! - [`set_aggregation`]
//! A `Segmentation` can be changed to be aggregated on a different `Field`.
//!
//! - [`crate::model::segmentations::aggregation_fields`]
//! - [`crate::model::segmentations::aggregated_by`]
//! - [`crate::model::segmentations::set_aggregation`]
//!
//! A `Segmentation` can be changed to only return a `Range` of segments.
//! - [`current_range`]
//! - [`set_current_range`]
//! A `Segmentation` has multiple `Segments` which each can be layouted
//!
//! - [`crate::model::segmentations::segments_range`]
//! - [`crate::model::segmentations::set_segments_range`]
//!
//! A [`crate::model::Segmentation`] has multiple [`crate::model::Segment`]s which each can be layouted
//! to fit into a rectangle.
//! - [`layouted_segments]
//!
//! - [`crate::model::segmentations::layouted_segments`]
use eyre::{eyre, Result};
@ -140,7 +145,7 @@ pub fn aggregated_by(engine: &Engine) -> Vec<Aggregation> {
/// The `Aggregation` will identify the `Segmentation` to use. So this function
/// can be used to change the way a `Segmentation` is aggregated.
///
/// Retrieve the available aggregations with [`segmentation::aggregated_by`].
/// Retrieve the available aggregations with [`crate::model::segmentations::aggregated_by`].
///
/// # Arguments
///
@ -179,8 +184,7 @@ pub fn layouted_segments(engine: &mut Engine, bounds: types::Rect) -> Option<&[S
Some(segmentation.items())
}
/// Can another level of aggregation be performed? Based on
/// [`Engine::default_group_by_stack`]
/// Can another level of aggregation be performed?
pub fn can_aggregate_more(engine: &Engine) -> bool {
let index = engine.group_by_stack.len();
super::engine::default_group_by_stack(index).is_some()
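
To illustrate what laying out segments "to fit into a rectangle" means, here is a deliberately simple slice layout that gives each segment a vertical strip proportional to its item count. It is a sketch of a simpler technique than a real treemap and does not reproduce the algorithm behind `layouted_segments`. The `Bounds` type and the `slice_layout` function are hypothetical names.

```rust
/// Axis-aligned rectangle; a hypothetical stand-in for the crate's `Rect`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Bounds {
    x: f64,
    y: f64,
    width: f64,
    height: f64,
}

/// Split `bounds` into vertical strips whose widths are proportional to `counts`.
/// Returns an empty list if there is nothing to lay out.
fn slice_layout(counts: &[usize], bounds: Bounds) -> Vec<Bounds> {
    let total: usize = counts.iter().sum();
    if total == 0 {
        return Vec::new();
    }
    // `x` tracks the left edge of the next strip.
    let mut x = bounds.x;
    counts
        .iter()
        .map(|&count| {
            let width = bounds.width * count as f64 / total as f64;
            let rect = Bounds { x, y: bounds.y, width, height: bounds.height };
            x += width;
            rect
        })
        .collect()
}
```

Treemap algorithms refine this idea by alternating directions or balancing aspect ratios so the rectangles stay closer to squares.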

@ -1,3 +1,7 @@
//! # Database
//!
//! An implementation of the [`ps_core::DatabaseLike`] trait on top of SQLite.
mod conversion;
mod db;
mod sql;

@ -1,5 +1,5 @@
//! Some functions to generate beautiful colors based on a hash.
//! loosely based on https://github.com/BrandtM/colourado/blob/master/src/lib.rs
//! loosely based on <https://github.com/BrandtM/colourado/blob/master/src/lib.rs>
use eframe::egui::Color32;

@ -1,6 +1,6 @@
//! A popover is a popup that only closes when clicking outside.
//! It is lifted from:
//! https://github.com/emilk/egui/blob/a1bf5aff47a7f6f3d698e6ccfb7b62b65ef2de5b/egui/src/widgets/color_picker.rs
//! <https://github.com/emilk/egui/blob/a1bf5aff47a7f6f3d698e6ccfb7b62b65ef2de5b/egui/src/widgets/color_picker.rs>
//! Line 355.
//!
use eframe::egui::{self, Id, Response, Ui};

@ -1,4 +1,4 @@
//! This is a fork of `https://raw.githubusercontent.com/sagebind/smplinfo/master/src/ui/widgets/table.rs` with
//! This is a fork of <https://raw.githubusercontent.com/sagebind/smplinfo/master/src/ui/widgets/table.rs> with
//! some modifications.
use std::ops::Range;

@ -1,9 +1,3 @@
//! FIXME: Implement our own Mailbox reader that better implements the spec.
//! use jetsci for efficient searching:
//! https://github.com/shepmaster/jetscii
//! (or aho corasick)
//! MBox parsing is also not particularly fast as it currently doesn't use parallelism
use ps_core::eyre::eyre;
use ps_core::tracing;
use rayon::prelude::*;

@ -1,3 +1,24 @@
//! # Importer
//!
//! This crate is responsible for importing different email formats (or email storage formats)
//! by reading and parsing the data and writing it into a database (which is defined as a
//! generic type but most probably the `ps-database` module).
//!
//! Currently, the importer requires the construction of a specific type as well as the
//! configuration of the importer format in a configuration.
//!
//! ``` rs
//! // See also: https://github.com/terhechte/postsack/issues/11
//! let path = "tests/resources/mbox";
//! let config =
//! ps_core::Config::new(None, path, vec!["".to_string()], ps_core::FormatType::Mbox).expect("Config");
//! let importer = mbox_importer(config.clone());
//!
//! // Next, create a database and run the importer
//! // let database = Database::new(&config.database_path).unwrap();
//! // let (_receiver, handle) = importer.import(database).unwrap();
//! ```
use ps_core::eyre::Result;
pub(crate) mod formats;
