Initial work on mbox extractor
parent 5fa777605d
commit 9642552fa3
@@ -0,0 +1,217 @@
To: submit.t4eseGWSvG1JST3r@spam.spamcop.net
From: 2012gdwu <2012gdwu@posteo.de>
Subject: Postbank Spam
Autocrypt: addr=2012gdwu@posteo.de; keydata=
mDMEXXjwiRYJKwYBBAHaRw8BAQdAmjXRazNXXy5tK05Dwl5mSRbdth9JkQq92V/QVyqjdgm0
I0FybmUgS2VsbGVyIDxhcm5lLmtlbGxlckBwb3N0ZW8uZGU+iJYEExYIAD4WIQR2UN3HoAGx
KI0B7Eih+UCxBQvPLgUCXXjwiQIbAwUJCWYBgAULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAK
CRCh+UCxBQvPLpPfAP4gs6Oky3+UO2LU2XxweeQO+YEWXK0QtM2+ajzrGaF3HAD+LBfmyB9+
Wom2KP0CwxUzI4d6zmiAMSKOnGGgzd65igm4OARdePCJEgorBgEEAZdVAQUBAQdAncxZ3Rox
wmvm+/qCkCm9+PU2HmWr08M3qdqkf2L4IngDAQgHiH4EGBYIACYWIQR2UN3HoAGxKI0B7Eih
+UCxBQvPLgUCXXjwiQIbDAUJCWYBgAAKCRCh+UCxBQvPLpQkAQCgYOlOftMNi+sfn+XQvfOc
ULQWp+cgOBMcyVCdpJEQCwD9HBuwuHobl8FPm0PbRtlCn/7GY4WK+Hh4+3BKmhRn8wU=
Message-ID: <1530ae05-33a7-fa40-9473-ca625a14385a@posteo.de>
Date: Mon, 20 Jul 2020 07:35:55 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
Thunderbird/68.10.0
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="------------6670F92201FB126ED9472803"
Content-Language: de-DE

This is a multi-part message in MIME format.
--------------6670F92201FB126ED9472803
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

here you go


--------------6670F92201FB126ED9472803
Content-Type: message/rfc822;
name="postbank.eml"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
filename="postbank.eml"

Return-Path: <gxnwgddl@carcarry.de>
Delivered-To: arne.keller@posteo.de
Received: from proxy02.posteo.name ([127.0.0.1])
by dovecot12 (Dovecot) with LMTP id EaKBGxv9FF+9mwEAJesNpQ
for <arne.keller@posteo.de>; Mon, 20 Jul 2020 04:15:27 +0200
Received: from proxy02.posteo.de ([127.0.0.1])
by proxy02.posteo.name (Dovecot) with LMTP id 31UFGtHsFF+T4gMAGFAyLg
; Mon, 20 Jul 2020 04:15:27 +0200
Received: from mailin05.posteo.de (unknown [10.0.1.5])
by proxy02.posteo.de (Postfix) with ESMTPS id 4B950v2JYGz11fk
for <arne.keller@posteo.de>; Mon, 20 Jul 2020 04:15:27 +0200 (CEST)
Received: from mx03.posteo.de (mailin05.posteo.de [127.0.0.1])
by mailin05.posteo.de (Postfix) with ESMTPS id 4270120F15
for <arne.keller@posteo.de>; Mon, 20 Jul 2020 04:15:27 +0200 (CEST)
X-Virus-Scanned: amavisd-new at posteo.de
X-Spam-Flag: NO
X-Spam-Score: 2.639
X-Spam-Level: **
X-Spam-Status: No, score=2.639 tagged_above=-1000 required=8
tests=[AV:Heuristics.Phishing.Email.SpoofedDomain=0.1, ALL_TRUSTED=-1,
FROM_LOCAL_NOVOWEL=0.5, HK_RANDOM_ENVFROM=0.626, HK_RANDOM_FROM=0.999,
HTML_FONT_LOW_CONTRAST=0.001, HTML_IMAGE_ONLY_24=1.282,
HTML_MESSAGE=0.001, HTTPS_HTTP_MISMATCH=0.1, POSTEO_GENERICS_IO=0.01,
T_FILL_THIS_FORM_SHORT=0.01, T_REMOTE_IMAGE=0.01] autolearn=disabled
Received: from mout.web.de (mout.web.de [212.227.15.14])
by mx03.posteo.de (Postfix) with ESMTPS id 4B950t696Mz10nB
for <arne.keller@posteo.de>; Mon, 20 Jul 2020 04:15:26 +0200 (CEST)
Authentication-Results: mx03.posteo.de; dmarc=none (p=none dis=none) header.from=carcarry.de
Received: from [212.227.15.17] ([212.227.15.17]) by mx-ha.web.de (mxweb010
[212.227.15.17]) with ESMTPS (Nemesis) id 1MRloE-1kQNT22I4w-00T9hm for
<arne.keller@posteo.de>; Mon, 20 Jul 2020 04:15:26 +0200
Received: from mout.kundenserver.de ([212.227.17.24]) by mx-ha.web.de
(mxweb010 [212.227.15.17]) with ESMTPS (Nemesis) id 1MINbE-1k0aRm2Hzw-00EOVM
for <2012gdwu@web.de>; Mon, 20 Jul 2020 04:15:26 +0200
Received: from 217.160.251.109 ([217.160.251.109]) by mrelayeu.kundenserver.de
(mreue107 [212.227.15.183]) with ESMTPSA (Nemesis) id
1MPoPd-1kBHRt0o2F-00MqkS for <2012gdwu@web.de>; Mon, 20 Jul 2020 04:15:26
+0200
From: "=?utf-8?B?UE9TVEJBTs2fS82f?=" <gxnwgddl@carcarry.de>
Subject: BsetSign App : Y7P32-HTXU2-FRDG7
To: "2012gdwu" <2012gdwu@web.de>
Content-Type: multipart/alternative; boundary="QHebeB08yNTYquFAhtQnxv=_cOW4Xd528c"
MIME-Version: 1.0
Date: Mon, 20 Jul 2020 02:15:26 +0000
Message-ID: <1M3lHZ-1jyAPt0pTn-000u1I@mrelayeu.kundenserver.de>
X-Provags-ID: V03:K1:68TECBVA88ZKh8HcSl/N+ElwlecL1tc+1AuDDyqm9em66WO295R
IfuHqA9uG7+Vlyr99v+OneGltnr43KfsgRKj9GgOpDj2QelHphKFGPILAvvsQ8vOq6ucC2W
BW3NEOh3JhitB6o4xLEmj+dbivC0ie728/cPMcjj6TwyBzw5nT1or8mBZWoEMSF/zcu+PIr
gGpFY2puzzURN4oKX82/w==
X-Spam-Flag: NO
X-UI-Out-Filterresults: notjunk:1;V03:K0:c01ZANnvlk8=:ouSMGue72FUx2PJOSNnmEW
qI8A89gf6q3aAdJBhLX1Bhd70xio64ljpha9X5ArOYg6Q2RH1JYyvfBSMoTo3HMy37H3L8kaq
ReRCdSPOMD8+llZ/rRpPLl+7PofGOv+Hu3UO7gzgm9v0YqwLZIwh9P2w9TIu+GqVJWeDdmxrs
RDPeHY8lsRL+8AFeSGNiWBYMEHDxKofTqS5Zh7mal1Bm4JbgEEIP36V4oL3c6V1olMHQZzEH9
7D0T8U6LyLyfSbuu5M6QN2FZ+F6IDJNDUG1uwNt9K12ESY6TweMR3xInFabiZ9fMPmrjPaNwW
hlyKg67tDYL2lfk2fpa/LbhLnlfKEDqSvkgK54CZh+xbIQetju66cZUEFQyCIcGdAOWI8+nty
FdbNUzxhNpZTPBrA7H95gRuc0u2GJBfZZsxdp46jpBwG65yqmJ32pkJrATo8CNbBO9A6hpdyL
UNu5bavZBJp9dsyY6Cnm6vMOIjJ8qMy/vNkrtRXNWBrnVHhuQZ3B+osG8XWLiyq7s4hFOwDxY
WLRgjKL6HgIj+2DLParwiuSsX8TVy5+WhxDUou0UJDzD3C1JmYiryTlo4Vu4CIZFXkgAuAsEq
c55M6L2eUmD3xQNaqgMEJFksT2qXWaSb2Qw6HM7mtLBbSUhuWtSv2oeVrNwgx8XWexWYYZYFv
KAZzICpkVhxpYIntoKRiDtQZxBDejPwGmne2iG81rn34pGJwOOYojf9dFghodE5bZEqVh6KbA
f/38x9FIoYewzA2WuyngX/bXTdkLQM49W1vdlF5DQOlgYuM8Ni7NeJG888VhDZxcUn6vIIJs3
xH0jOWrWCUz0gK9uyyagjcfdXr54Zv1E7i936CTlRq5QnDKN2C9jQFH5ymD4G1W5zX6Xj/05O
M7VaU9Y3mvOM/+82zsKc5zJOFOf9MoI5JBhnPjHWeqaJgpYhNoKgGvPo3QfZFwzk/MHH2PgB1
PLGvjSE8u/cpYeGhJdzTXM00J9ai5yGRNFD71zHoHBOFGCpmZVnJJ8SD+qUd4K4BfSD+DJ5Qd
t1wsCpH5bgodnXgMcN6Zj0q3P/ODk3dnah1hsYMyIWDBFZ0cTlp2QkYhAKZh1HM5WcfSc5UwU
SrcK9HHiG7BKOFYA1r6Rx5YYqwGWeGxr9mlH7MLyfCwI8PlWtfeB7Pj4eEI1hLy9GMnHBCJDj
W8o1yDeE54rgWHR7CtIF6w+qF+quA3ZdwVSPOHwQeH7vS4OaJjeEyeeT4YOJdIMI7UknEasAG
LfMS/PKWx7+YcUNaz0xvO70NwZj1FKJuWqDS6ZTciMSvGkEFTWVOqn5nPlHi8hDbBTVn70aPa
BQi3U68hgdDpJIHlVLLvRcaCYYly3L60NQBgJroag4fRiIvDUSXfDatrDYOv+L4xBYdB3GP+s
wqtsPY82YOwXP5KlRMPVEZcuWX5tWiOuaNjePbEkXpE2iQZUqfkDQTYNUGZR+TTBqHOWjO7R3
hORQB0gOwe85gZv80G1EL32EtRjVxJxQfrHGPCGXb8HRXbvGGV3Xu3wZEE8iuJngBUJtWeDBq
q61rYwZxVuml72lfRM6Lo+OGLAsyqvobxujY9BHpokZH4FNlUstjUoPANTGoAhM+MyQb0fSAV
8HA/r6n0oJh0B8+2AxJvVokbhEbL/RlJIZIYpCeRceeA+jjBaR7EvuglUoLN3CcB9CrdDH/qz
ymHzEjPVnFar3/sqRjeKyIk71z4yotOKCPQcdD1gTbYWehZiIJwAlDFSpfPdFTQLOJMWd3wuD
0mHLep6tLtCY+hjhCYWlTyKKQ8CWiBWPTql21bPp7XVWCfc+4u8kZi5Y3dg3pvpSwwmcyRisX
+7+8a+pBzN4VOEuX+dzglKDrNd6h2OL0tBMnk1yqAV27dX9cMRrO941IvtiaZO90BjZtV92oP
XkGxvKnGQuynHus/3yblaw==

This is a multi-part message in MIME format

--QHebeB08yNTYquFAhtQnxv=_cOW4Xd528c
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline







Sehr geehrter Herr / Frau =E2=80=A6,

Ab dem 20. Jul 2020 aktualisiert die Postbank alle BestSign-Anwendung=
en.



=C3=96ffnen Sie den unten stehenden Aktivierungslink, um am Upgrade t=
eilzunehmen. Verkn=C3=BCpfung




https://meine.postbank.de/#/login




Wir empfehlen dringend, dieses Upgrade durchzuf=C3=BChren.

Reundliche Gr=C3=BC=C3=9Fe,

=C2=A9 2020 Postbank=E2=80=93 eine Niederlassung der Deutsche Bank AG=


Hypnotiseur/zertifizierter Hypnosecoach (DVH)
Burnoutpr=C3=A4ventionscoach
Modeberater f=C3=BCr Ma=C3=9Fhemden/Ma=C3=9Fblusen
Kurs/Seminarleiter Waldbaden/Waldcoach
Am Wiesengrund 5
24980 Schafflund
Tel.: 04639-98475
Mob.: 015117317305
Home : www.hypnosepraxis-im-norden.de
Home : www.masshemden-im-norden.de
Home : www.waldbaden-zwischen-den-meeren.de


--QHebeB08yNTYquFAhtQnxv=_cOW4Xd528c
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline

<html><head></head><body><p><img width=3D"174" height=3D"51" alt=3D"" =
src=3D"https://upload.wikimedia.org/wikipedia/commons/thumb/d/d1/Postb=
ank-Logo.svg/1200px-Postbank-Logo.svg.png"></p><p><br></p>
<div>
<div> Sehr geehrter Herr / Frau =E2=80=A6,</div>
<div> Ab dem 20. Jul 2020 aktualisiert die Postbank alle BestSign=
-Anwendungen.<br><br></div>
<div> =C3=96ffnen Sie den unten stehenden Aktivierungslink, um am=
Upgrade teilzunehmen. Verkn=C3=BCpfung</div><div><br></div>
<div> <a href=3D"https://www.astcdubai.com/.well-known/.re/">http=
s://meine.postbank.de/#/login</a></div><div><br></div>
<div> Wir empfehlen dringend, dieses Upgrade durchzuf=C3=BChren.<=
/div>
<div> Reundliche Gr=C3=BC=C3=9Fe,</div>
<div> <strong>=C2=A9</strong> 2020 <strong>Postbank</strong>=E2=80=
=93 eine Niederlassung der Deutsche Bank AG<br><br> <span style=3D"col=
or: rgb(255, 255, 255);">Hypnotiseur/zertifizierter Hypnosecoach (DVH)=
</span><br><span style=3D"color: rgb(255, 255, 255);"> Burnoutpr=C3=A4=
ventionscoach</span><br><span style=3D"color: rgb(255, 255, 255);"> Mo=
deberater f=C3=BCr Ma=C3=9Fhemden/Ma=C3=9Fblusen</span><br><span style=
=3D"color: rgb(255, 255, 255);"> Kurs/Seminarleiter Waldbaden/Waldcoac=
h</span><br><span style=3D"color: rgb(255, 255, 255);"> Am Wiesengrund=
5</span><br><span style=3D"color: rgb(255, 255, 255);"> 24980 Schaffl=
und</span><br><span style=3D"color: rgb(255, 255, 255);"> Tel.: 04639-=
98475</span><br><span style=3D"color: rgb(255, 255, 255);"> Mob.: 0151=
17317305</span><br><span style=3D"color: rgb(255, 255, 255);"> Home : =
<a style=3D"color: rgb(255, 255, 255);" href=3D"https://deref-gmx.net/=
mail/client/Pk7kcpLwLpI/dereferrer/?redirectUrl=3Dhttp%3A%2F%2Fwww.hyp=
nosepraxis-im-norden.de" target=3D"_blank" rel=3D"noopener">www.hypnos=
epraxis-im-norden.de</a></span><br><span style=3D"color: rgb(255, 255,=
255);"> Home : <a style=3D"color: rgb(255, 255, 255);" href=3D"https:=
//deref-gmx.net/mail/client/KR0VAuy5YPo/dereferrer/?redirectUrl=3Dhttp=
%3A%2F%2Fwww.masshemden-im-norden.de" target=3D"_blank" rel=3D"noopene=
r">www.masshemden-im-norden.de</a></span><br><span style=3D"color: rgb=
(255, 255, 255);"> Home : <a style=3D"color: rgb(255, 255, 255);" href=
=3D"https://deref-gmx.net/mail/client/QTybHixMVsI/dereferrer/?redirect=
Url=3Dhttp%3A%2F%2Fwww.waldbaden-zwischen-den-meeren.de" target=3D"_bl=
ank" rel=3D"noopener">www.waldbaden-zwischen-den-meeren.de</a></span><=
/div>
</div></body></html>


--QHebeB08yNTYquFAhtQnxv=_cOW4Xd528c--


--------------6670F92201FB126ED9472803--
File diff suppressed because it is too large
@@ -0,0 +1,184 @@
use crate::adapted_iter::one_file;

use super::*;

use anyhow::Result;
use async_stream::stream;
use lazy_static::lazy_static;
use tokio::io::{BufReader, AsyncReadExt};

use std::{path::{Path, PathBuf}, sync::Mutex, io::Cursor};

static EXTENSIONS: &[&str] = &["mbox", "mbx"];
static MIME_TYPES: &[&str] = &[
    "application/mbox",
];
lazy_static! {
    static ref METADATA: AdapterMeta = AdapterMeta {
        name: "mbox".to_owned(),
        version: 1,
        description:
            "Reads mailbox files and runs extractors on the contents and attachments."
                .to_owned(),
        recurses: true,
        fast_matchers: EXTENSIONS
            .iter()
            .map(|s| FastFileMatcher::FileExtension(s.to_string()))
            .collect(),
        slow_matchers: Some(
            MIME_TYPES
                .iter()
                .map(|s| FileMatcher::MimeType(s.to_string()))
                .collect()
        ),
        disabled_by_default: true,
        keep_fast_matchers_if_accurate: true
    };
}
#[derive(Default)]
pub struct MboxAdapter;

impl MboxAdapter {
    pub fn new() -> MboxAdapter {
        MboxAdapter
    }
}
impl GetMetadata for MboxAdapter {
    fn metadata(&self) -> &AdapterMeta {
        &METADATA
    }
}

fn get_inner_filename(filename: &Path) -> PathBuf {
    let extension = filename
        .extension()
        .map(|e| e.to_string_lossy())
        .unwrap_or(Cow::Borrowed(""));
    let stem = filename
        .file_stem()
        .expect("no filename given?")
        .to_string_lossy();
    let new_extension = match extension.as_ref() {
        "tgz" | "tbz" | "tbz2" => ".tar",
        _other => "",
    };
    filename.with_file_name(format!("{}{}", stem, new_extension))
}

impl FileAdapter for MboxAdapter {
    fn adapt(&self, ai: AdaptInfo, _detection_reason: &FileMatcher) -> Result<AdaptedFilesIterBox> {
        println!("running mbox adapter");
        let AdaptInfo {
            filepath_hint,
            mut inp,
            line_prefix,
            archive_recursion_depth,
            config,
            postprocess,
            ..
        } = ai;

        let mut content = String::new();
        let s = stream! {
            // Read the whole mailbox into memory before splitting it.
            inp.read_to_string(&mut content).await?;

            let mut ais = vec![];
            // Every message starts with a "From " separator line.
            for mail in content.split("\nFrom ") {

                let mail_bytes = mail.as_bytes(); // &content[offset..offset2];
                // Skip the separator line; this panics if the chunk contains no newline.
                let mail_content = mail_bytes.splitn(2, |x| *x == b'\n').nth(1).unwrap();
                let mail = mailparse::parse_mail(mail_content)?;
                let mail_body = mail.get_body()?;
                println!("body {:?}", mail_body);

                // Name the inner file after the top-level content type of the message.
                let mut path = filepath_hint.clone();
                println!("{:?}", mail.ctype.mimetype);
                match &*mail.ctype.mimetype {
                    "text/html" => {
                        path.push("mail.html");
                    },
                    _ => {
                        path.push("mail.txt");
                    }
                }

                let mut config = config.clone();
                config.accurate = true;

                let ai2: AdaptInfo = AdaptInfo {
                    filepath_hint: path,
                    is_real_file: false,
                    archive_recursion_depth: archive_recursion_depth + 1,
                    inp: Box::pin(Cursor::new(mail_body.into_bytes())),
                    line_prefix: line_prefix.to_string(),
                    config,
                    postprocess,
                };
                ais.push(ai2);
            }
            for a in ais {
                yield(Ok(a));
            }
        };
        Ok(Box::pin(s))
    }
}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::preproc::loop_adapt;
    use crate::test_utils::*;
    use pretty_assertions::assert_eq;
    use tokio::fs::File;

    #[test]
    fn test_inner_filename() {
        for (a, b) in &[
            ("hi/test.tgz", "hi/test.tar"),
            ("hi/hello.gz", "hi/hello"),
            ("a/b/initramfs", "a/b/initramfs"),
            ("hi/test.tbz2", "hi/test.tar"),
            ("hi/test.tbz", "hi/test.tar"),
            ("hi/test.hi.bz2", "hi/test.hi"),
            ("hello.tar.gz", "hello.tar"),
        ] {
            assert_eq!(get_inner_filename(&PathBuf::from(a)), PathBuf::from(*b));
        }
    }

    #[tokio::test]
    async fn gz() -> Result<()> {
        let adapter = MboxAdapter;

        let filepath = test_data_dir().join("hello.gz");

        let (a, d) = simple_adapt_info(&filepath, Box::pin(File::open(&filepath).await?));
        let r = adapter.adapt(a, &d)?;
        let o = adapted_to_vec(r).await?;
        assert_eq!(String::from_utf8(o)?, "hello\n");
        Ok(())
    }

    #[tokio::test]
    async fn pdf_gz() -> Result<()> {
        let adapter = MboxAdapter;

        let filepath = test_data_dir().join("short.pdf.gz");

        let (a, d) = simple_adapt_info(&filepath, Box::pin(File::open(&filepath).await?));
        let r = loop_adapt(&adapter, d, a)?;
        let o = adapted_to_vec(r).await?;
        assert_eq!(
            String::from_utf8(o)?,
            "PREFIX:Page 1: hello world
PREFIX:Page 1: this is just a test.
PREFIX:Page 1:
PREFIX:Page 1: 1
PREFIX:Page 1:
PREFIX:Page 1:
"
        );
        Ok(())
    }
}
|
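The adapter above reads the whole mailbox into a `String` and splits it on `"\nFrom "`. `read_to_string` returns an error on mail that is not valid UTF-8, and the `unwrap()` panics on a chunk without a newline (for example an empty file). As a rough sketch of one alternative, not the code in this commit, the hypothetical helper `split_mbox` below works on raw bytes, treats every line that starts with `From ` as a separator, and returns each message without its separator line. It assumes the writer has already quoted body lines that start with `From `.

```rust
/// Split raw mbox bytes into messages at "From " separator lines.
/// Each returned slice excludes the separator line itself.
/// `split_mbox` is a hypothetical helper, not part of the adapter above.
fn split_mbox(data: &[u8]) -> Vec<&[u8]> {
    // Byte offsets of every line that starts with "From ".
    let mut starts = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        if data[pos..].starts_with(b"From ") {
            starts.push(pos);
        }
        // Advance to the start of the next line, or stop at the last line.
        match data[pos..].iter().position(|&b| b == b'\n') {
            Some(i) => pos += i + 1,
            None => break,
        }
    }

    let mut messages = Vec::new();
    for (i, &start) in starts.iter().enumerate() {
        // A message runs until the next separator or the end of the input.
        let end = starts.get(i + 1).copied().unwrap_or(data.len());
        let chunk = &data[start..end];
        // Drop the separator line; a chunk without a newline yields an empty message.
        let body_start = chunk
            .iter()
            .position(|&b| b == b'\n')
            .map_or(chunk.len(), |i| i + 1);
        messages.push(&chunk[body_start..]);
    }
    messages
}
```

Each returned slice could then be handed to `mailparse::parse_mail`, which takes a `&[u8]`, so no UTF-8 conversion of the whole mailbox is needed.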
@@ -0,0 +1,86 @@
From
Message-ID: <55a23774-4da7-057c-77a7-ec390fed487b@posteo.de>
Date: Mon, 27 Feb 2023 12:05:46 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
From: Arne Keller <2012gdwu@web.de>
Subject: From encoding test
To: arne.keller@posteo.de
Content-Language: de-DE
X-Enigmail-Draft-Status: N00200
X-Mozilla-Draft-Info: internal/draft; vcard=0; receipt=0; DSN=0; uuencode=0;
attachmentreminder=0; deliveryformat=0
X-Identity-Key: id2
Fcc: imap://2012gdwu@imap.web.de/Gesendet
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>>From</p>
<p>Another word >From<br>
</p>
</body>
</html>
From
Message-ID: <55a23774-4da7-057c-77a7-ec390fed487b@posteo.de>
Date: Mon, 27 Feb 2023 12:06:56 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
From: Arne Keller <2012gdwu@web.de>
Subject: From encoding test
To: arne.keller@posteo.de
Content-Language: de-DE
X-Enigmail-Draft-Status: N00200
X-Mozilla-Draft-Info: internal/draft; vcard=0; receipt=0; DSN=0; uuencode=0;
attachmentreminder=0; deliveryformat=1
X-Identity-Key: id2
Fcc: imap://2012gdwu@imap.web.de/Gesendet
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>>From</p>
<p>Another word >From<br>
</p>
</body>
</html>
From - Mon Feb 27 12:06:57 2023
X-Mozilla-Status: 0001
X-Mozilla-Status2: 00000000
Message-ID: <55a23774-4da7-057c-77a7-ec390fed487b@posteo.de>
Date: Mon, 27 Feb 2023 12:06:56 +0100
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
From: Arne Keller <2012gdwu@web.de>
Subject: From encoding test
To: arne.keller@posteo.de
Content-Language: de-DE
X-Enigmail-Draft-Status: N00200
X-Mozilla-Draft-Info: internal/draft; vcard=0; receipt=0; DSN=0; uuencode=0;
attachmentreminder=0; deliveryformat=1
X-Identity-Key: id2
Fcc: imap://2012gdwu@imap.web.de/Gesendet
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: 7bit

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>>From</p>
<p>Another word >From<br>
</p>
</body>
</html>
|
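The test mailbox above carries the subject "From encoding test". One common convention, usually called mboxrd, has the writer prefix every body line matching `>*From ` with one extra `>`, and the reader strip one `>` again after splitting. The following is a minimal sketch of the reader side, assuming mboxrd-style quoting. `unquote_from_lines` is a hypothetical name and not part of this commit, and whether this test file actually uses that convention is not stated here.

```rust
/// Undo mboxrd-style quoting: a line of the form `>+From ...` loses one leading `>`.
/// Lines where `>` or `From` appear later in the line are left untouched.
/// `unquote_from_lines` is a hypothetical helper, shown for illustration only.
fn unquote_from_lines(message: &str) -> String {
    let mut out = String::with_capacity(message.len());
    // split_inclusive keeps the trailing '\n' on each line, so line endings survive.
    for line in message.split_inclusive('\n') {
        let rest = line.trim_start_matches('>');
        // Quoted means: at least one leading '>' directly followed by "From ".
        let quoted = rest.len() < line.len() && rest.starts_with("From ");
        out.push_str(if quoted { &line[1..] } else { line });
    }
    out
}
```

With this rule a body line such as `<p>>From</p>` stays unchanged, because the `>` is not at the start of the line.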