priv: mvp module plugins

Signed-off-by: blob42 <contact@blob42.xyz>
blob42 2024-10-10 18:37:11 +02:00
parent 78f292cf80
commit fbaab11bab
17 changed files with 649 additions and 1 deletions

188
browsers/firefox/readme.md Normal file

@ -0,0 +1,188 @@
# Links
- [Mozilla places database](https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/Database)
- [places explorations](https://gist.github.com/olejorgenb/9418bef65c65cd1f489557cfc08dde96)
- [Firefox nsNavBookmarks.cpp](https://searchfox.org/mozilla-esr60/source/toolkit/components/places/nsNavBookmarks.cpp)
- Running firefox in test mode:
`MOZILLA_DISABLE_PLUGINS=1 ./firefox -safe-mode -no-remote \
-profile $(mktemp -d tmp-firefox-profile-XXXXXXXXXX) -offline about:blank`
- https://shallowsky.com/blog/tech/web/modifying-omni.ja.html
# File locking & VFS
- SQLite supports file locking of the database through the local file system VFS.
Previous versions of Firefox allowed external processes to access the file.
- Since Firefox v(?) this has changed: when initializing the database, Firefox checks
the preference `storage.multiProcessAccess.enabled`, which is not
officially documented.
Source code links:
- https://dxr.mozilla.org/mozilla-central/source/storage/TelemetryVFS.cpp#884
- https://dxr.mozilla.org/mozilla-central/source/storage/mozStorageService.cpp#377
- [the change on github](https://github.com/mozilla/gecko-dev/commit/a543f35d4be483b19446304f52e4781d7a4a0a2f)
*UPDATE (2020-09-01)*
- Mozilla keeps changing this option; a new change in VFS locks [happened in this commit](https://github.com/mozilla/gecko-dev/commit/14784dc42d7994ea9fc8ff279e5f685501289d60)
- We need a strategy that does not leave us at the mercy of Mozilla's
changes. The following is proposed:
1. Main strategy: try to make Firefox use a non-exclusive lock.
2. If a non-exclusive lock is not possible, fall back to the copy method:
- Copy the `places.sqlite*` database files to a temporary destination
- Parse bookmarks from the temporary destination
# Queries
See inside ./queries.sql
## Find duplicate links
select metadata,url from bookmarks join (select url,metadata, count(url) as x
from bookmarks group by url having x > 1) using (metadata,url) order by url
limit 10
## Find all places for a tag
SELECT moz_places.id, moz_places.url, moz_places.title, moz_bookmarks.parent
FROM moz_places
LEFT OUTER JOIN moz_bookmarks
ON moz_places.id = moz_bookmarks.fk
WHERE moz_bookmarks.parent = :tag_id -- id of the tag's own row in moz_bookmarks
# Mozilla places.sqlite
## Update Events
When firefox updates `places.sqlite` it emits multiple `WRITE` events. We need
to implement a debounce system to avoid reparsing everything.
## Table moz_bookmarks
- keyword_id is not used
- folder_type is not used
columns:
id
type
fk
parent
position
title
### Column Types
`type`
1: URL
2: Tag or Folder
3:
`parent` (main ids)
-- Root FOLDERS
1. Root
2. Bookmarks Menu
3. Bookmarks Toolbar
4. Tags
5. Other Bookmarks
6. Mobile Bookmarks
-- Mozilla Folders
7. Moz Firefox Folder
8. Help & Tutorials Folder
9..
.. 12 Last mozilla reserved id
-- User bookmarks
13. User Bookmark XX
*user bookmarks start at id=13*
NOTES:
- Mozilla reserves ids 1-12 for its own use; user-defined bookmarks start at
id=13 (?)
For example, to find all tags:
`select * from moz_bookmarks where parent = 4`
* Regarding mobile bookmarks:
By default Firefox hides the mobile bookmarks folder.
It can be shown by setting the boolean option
`browser.bookmarks.showMobileBookmarks` to `true` in prefs.js or about:config.
## How are bookmarks represented
### 1. Bookmarks with tags
|----|------|----------|----------------|-------------|
| id | type | fk | title | parent |
|----|------|----------|----------------|-------------|
| 42 | 2 | NULL | Tag Name | 4 (Tags) | <- A
| xx | 1 | place_id | Bookmark Title | 5 (unfiled) | <- B
| yy | 1 | place_id | NULL | 42 | <- C
|----|------|----------|----------------|-------------|
A. An entry for the tag itself
B. The bookmark entry (type=1), whose `fk` points to `moz_places.id` and whose
parent is the *dummy* `unfiled` folder.
C. A link between Tag(A) and Bookmark (B)
### 2. Bookmarks with folders
|----|------|----------|----------------|------------------|
| id | type | fk | title | parent |
|----|------|----------|----------------|------------------|
| xx | 1 | place_id | Bookmark Title | parent_folder_id | <- A
|----|------|----------|----------------|------------------|
A. An entry for the bookmark with a link to places and the folder id as parent.
Note: the parent folder can itself be a child of other folders, recursively, until
reaching one of the root folders: menu(2), toolbar(3) or mobile(6)
### 3. Bookmarks with tags AND within folders
|----|------|----------|----------------|------------------|
| id | type | fk | title | parent |
|----|------|----------|----------------|------------------|
| 42 | 2 | NULL | Tag Name | 4 (Tags) | <- A
| xx | 1 | place_id | Bookmark Title | parent_folder_id | <- B
| yy | 1 | place_id | NULL | 42 | <- C
|----|------|----------|----------------|------------------|
A. An entry for the tag itself
B. The bookmark entry (type=1), whose `fk` points to `moz_places.id`.
Note that, compared to bookmarks with tags only, this link's parent is a folder
and not the **unfiled** dummy folder.
C. A link between Tag(A) and Bookmark (B)
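To make the A/B/C layout concrete, here is a small in-memory sketch that resolves the tags of each place from `moz_bookmarks`-like rows. The `Row` struct and `tagsByPlace` are hypothetical names; representing a NULL `fk` as 0 is an assumption of this sketch, and the ids in `main` are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// tagsRootID is the parent id shared by all tag entries (4 = Tags).
const tagsRootID = 4

// Row mirrors the moz_bookmarks columns shown in the tables above.
// FK is 0 when the column is NULL (assumption of this sketch).
type Row struct {
	ID     int
	Type   int
	FK     int
	Title  string
	Parent int
}

// tagsByPlace returns, for each moz_places id, the sorted names of the
// tags attached to it.
func tagsByPlace(rows []Row) map[int][]string {
	// Rows of kind A: the tag entries themselves (type=2, parent=Tags).
	tagNames := make(map[int]string)
	for _, r := range rows {
		if r.Type == 2 && r.Parent == tagsRootID {
			tagNames[r.ID] = r.Title
		}
	}

	// Rows of kind C: type=1 entries whose parent is a tag entry.
	out := make(map[int][]string)
	for _, r := range rows {
		if r.Type != 1 {
			continue
		}
		if name, ok := tagNames[r.Parent]; ok {
			out[r.FK] = append(out[r.FK], name)
		}
	}
	for _, tags := range out {
		sort.Strings(tags)
	}
	return out
}

func main() {
	rows := []Row{
		{ID: 42, Type: 2, Title: "golang", Parent: 4},                              // A
		{ID: 43, Type: 2, Title: "programming", Parent: 4},                         // A
		{ID: 50, Type: 1, FK: 7, Title: "The Go Programming Language", Parent: 20}, // B
		{ID: 51, Type: 1, FK: 7, Parent: 42},                                       // C
		{ID: 52, Type: 1, FK: 7, Parent: 43},                                       // C
	}
	fmt.Println(tagsByPlace(rows)[7]) // prints [golang programming]
}
```

Row B is ignored by the tag lookup because its parent is a folder rather than a tag entry; only the C rows carry the tag relationship.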
## Parsing Algorithm
- Create a main Root node
- Parse recursively from the **Selected Root Folders** (id = 1), in the same way as for Chrome
- To distinguish a URL from a folder, use the type (2=folder, 1=URL)
## Root folders to parse from
- `Bookmarks Toolbar`: id = 3 (where parent = 3)
- `Tags`: id = 4 (where parent = 4)
- `Mobile Bookmarks`: id = 6 (where parent = 6)
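The recursive parse from the selected root folders could be sketched as below, working on rows already loaded in memory. `Row`, `Node`, `buildTree` and `printTree` are hypothetical names, and the sketch assumes the parent links contain no cycles.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	typeURL    = 1
	typeFolder = 2 // also used for tag entries, per the notes above
)

// Row holds the moz_bookmarks columns needed to rebuild the hierarchy.
type Row struct {
	ID, Type, Parent int
	Title            string
}

// Node is a simplified tree node (hypothetical, not gosuki's tree.Node).
type Node struct {
	Name     string
	Type     int
	Children []*Node
}

// buildTree creates a main root node and attaches, for each selected root
// folder id, the subtree found by following parent links downwards.
// Assumes the parent links form no cycles.
func buildTree(rows []Row, rootIDs []int) *Node {
	byID := make(map[int]Row, len(rows))
	childrenOf := make(map[int][]Row)
	for _, r := range rows {
		byID[r.ID] = r
		childrenOf[r.Parent] = append(childrenOf[r.Parent], r)
	}

	var descend func(r Row) *Node
	descend = func(r Row) *Node {
		n := &Node{Name: r.Title, Type: r.Type}
		if r.Type == typeFolder { // only folders have children
			for _, c := range childrenOf[r.ID] {
				n.Children = append(n.Children, descend(c))
			}
		}
		return n
	}

	root := &Node{Name: "root", Type: typeFolder}
	for _, id := range rootIDs {
		if r, ok := byID[id]; ok {
			root.Children = append(root.Children, descend(r))
		}
	}
	return root
}

// printTree prints one node per line, indented by depth.
func printTree(n *Node, depth int) {
	fmt.Printf("%s%d <%s>\n", strings.Repeat("  ", depth), n.Type, n.Name)
	for _, c := range n.Children {
		printTree(c, depth+1)
	}
}

func main() {
	rows := []Row{
		{ID: 3, Type: typeFolder, Parent: 1, Title: "Bookmarks Toolbar"},
		{ID: 4, Type: typeFolder, Parent: 1, Title: "Tags"},
		{ID: 6, Type: typeFolder, Parent: 1, Title: "Mobile Bookmarks"},
		{ID: 13, Type: typeFolder, Parent: 3, Title: "cooking"},
		{ID: 14, Type: typeURL, Parent: 13, Title: "Based Cooking"},
	}
	printTree(buildTree(rows, []int{3, 4, 6}), 0)
}
```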
## Sqlite WAL
- WAL automatic checkpoints may prevent us from seeing bookmark changes
- We may need a way to force a checkpoint of the WAL file, or reverse engineer it
with a tool like [this one](https://github.com/n0fate/walitean) ?
## moz_bookmarks timestamps
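- Firefox stores `moz_bookmarks` timestamps (`dateAdded`, `lastModified`) as integer microseconds since the Unix epoch
- sqlite3 `strftime('%s', ...)` returns seconds, so values must be scaled before comparing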


@ -35,7 +35,7 @@ import (
"github.com/urfave/cli/v2"
// Load firefox browser modules
-// _ "github.com/blob42/gosuki/browsers/firefox"
+_ "github.com/blob42/gosuki/browsers/firefox"
// Load chrome browser module
// _ "github.com/blob42/gosuki/browsers/chrome"
// github module


@ -0,0 +1,68 @@
# Buffers and Cache
### Cache
- The **Cache** in code is `cacheDB` and uses **sqlite**
- It represents the bookmarks across all browsers.
- It is periodically synced to the gosuki database on the local disk.
- It is preloaded into memory when the program starts.
### Buffer
- Is an in-memory **sqlite** db called `bufferDB`
- Represents *one browser* state across all jobs
- Is synced with `cacheDB`
### URLIndex
This is a [RedBlack Tree Hashmap](https://github.com/sp4ke/hashmap).
- It holds in memory the state of the tree as of its last build.
- Used as a fast query db
- Each URL holds a pointer to a node in [nodeTree](#nodeTree)
### nodeTree
- Is a tree data structure, stored in a browser, that represents the bookmarks as nodes.
- It allows for easy parsing of Folders and URLs and keeps the parent/child relationship.
## Architecture
- Insertion happens when
1. New bookmarks are detected
2. Initial loading of bookmarks when the program starts
- The index needs to have very fast lookups, since every bookmark is checked to see if its tags were modified.
- The check needs to be as fast as possible. The best approach is to store a fast hash of the data in the index the first time the browser bookmarks are loaded, then test the hash again whenever the bookmarks are reloaded.
## Data structures
### Hash Map
[hash map](https://github.com/DusanKasan/hashmap)
- The hash map maps `hash_url` --> `hash_content`.
- The URL becomes the main index for lookups.
- The `hash_content` is the hash of the content, used to test whether it changed.
### Hash function
[xxhash](https://github.com/OneOfOne/xxhash)
- This is the hash function used to generate the `hash_url` as well as the `hash_content`.
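As a rough illustration of the `hash_url` --> `hash_content` idea, the sketch below uses a plain Go map and FNV-1a from the standard library as stand-ins for the hashmap and xxhash libraries linked above. `Bookmark`, `URLIndex`, `Check` and the choice of hashing title plus tags are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// hash64 is a stand-in for xxhash: FNV-1a from the standard library.
func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// Bookmark is a hypothetical, minimal bookmark record.
type Bookmark struct {
	URL   string
	Title string
	Tags  []string
}

// contentHash hashes the fields whose change we want to detect.
func contentHash(b Bookmark) uint64 {
	return hash64(b.Title + "\x00" + strings.Join(b.Tags, ","))
}

// URLIndex maps hash_url --> hash_content.
type URLIndex map[uint64]uint64

type Status int

const (
	New Status = iota
	Unchanged
	Changed
)

// Check looks the bookmark up by URL hash, reports whether it is new,
// unchanged or changed, and stores the current content hash.
func (idx URLIndex) Check(b Bookmark) Status {
	key := hash64(b.URL)
	sum := contentHash(b)
	old, seen := idx[key]
	idx[key] = sum
	switch {
	case !seen:
		return New
	case old != sum:
		return Changed
	default:
		return Unchanged
	}
}

func main() {
	idx := URLIndex{}
	b := Bookmark{URL: "https://go.dev", Title: "The Go Programming Language", Tags: []string{"golang"}}

	fmt.Println(idx.Check(b) == New)       // first load
	fmt.Println(idx.Check(b) == Unchanged) // reload, nothing changed
	b.Tags = append(b.Tags, "programming")
	fmt.Println(idx.Check(b) == Changed) // reload after a tag was added
}
```

Comparing two 64-bit hashes on reload avoids comparing titles and tag lists field by field for every bookmark.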
# Parsing Bookmarks and Tags
- Run program
- Preload bookmark to `cacheDB`
- Bookmarks changed event
- Create a new `nodeTree building job` and for each bookmark do:
- If BK is not in URLIndex, add it to the index as a pointer to the BK node, then continue
- If BK is in URLIndex and its name changed, mark BK as `has_changed`
- Run parsing hooks on BK
- Rebuild the Index to mirror the nodeTree
- Sync the `nodeTree` to `bufferDB`
- Sync `bufferDB` to `cacheDB`
- Flush cache to disk
### Subsequent changes
#### Updating a bookmark
#### Deleting a bookmark

79
docs/development.md Normal file

@ -0,0 +1,79 @@
# Development
## Dependencies
### 1. Make dependencies:
Use the Makefile to install dependencies with `make deps`.
*caddy v2* is needed as well
#### dependencies:
(arch linux)
- inotify-tools (watch file changes)
- libnotify (linux desktop notifications)
# Quick Intro
Gosuki is a blazing fast real time bookmarks sniffer and synchronizer.
It detects installed browsers in the system and automatically watches for
bookmark changes without relying on any external plugin. This is achieved by
manually reading the corresponding bookmark files. This solution allows for a
fully passive way to copy bookmarks without installing any plugin.
# Design And Architecture
## Sync Strategy
Since gosuki might run against multiple running browsers, handling various
scenarios for when bookmarks are deleted would be very complex. The chosen
solution is to do *Read Only* operations on the browser bookmark files.
Any bookmark deleted from the Gosuki database will not be synced back to the browser.
Likewise, bookmarks deleted in a browser are not removed from gosuki. This
means gosuki is a read-only backup of all bookmarks ever added in any
registered browser.
This should not be an issue, as Gosuki provides its own web interface that
gives access to all bookmarks gathered from all browsers. Modification and deletion
can be carried out directly on that database.
*Note* that changes to bookmarks ARE detected and synced from the browser to
gosuki.
If there is a real need to sync bookmarks from Gosuki back to the browser,
browser extensions could easily be developed using the gosuki API.
## The Gosuki database is compatible with BUKU
The sqlite3 database format used in gosuki was designed to be fully compatible
with [buku](https://github.com/jarun/Buku). This means:
- All bookmarks saved by gosuki can be directly accessed and manipulated using
buku.
- All bookmarks saved by buku can be loaded in gosuki.
## Readings
### Tree Diff (Graph Isomorphism)
- [SO](https://stackoverflow.com/questions/5894879/detect-differences-between-tree-structures)
- [React Diffing](https://reactjs.org/docs/reconciliation.html)
- [Change Distilling](http://www.merlin.uzh.ch/contributionDocument/download/2162)
- [ A congruence theorem for trees ](https://msp.org/pjm/1957/7-1/p14.xhtml)
- [The Design and Analysis of Computer Algorithms](https://www.amazon.com/Design-Analysis-Computer-Algorithms/dp/0201000296)
idea: Rebuild the target tree on change ?
# Related - Libraries
- https://godoc.org/github.com/shirou/gopsutil (Process utils)
- [sqlite queries for common programs](https://github.com/kacos2000/Queries)
## Other Crawlers
- github shiori (golang)
- https://github.com/spyglass-search/netrunner (rust)
- https://github.com/a5huynh/spyglass (rust)

BIN
docs/firefox-startup.png Normal file

Binary file not shown.

Width:  |  Height:  |  Size: 18 KiB

32
docs/firefox-startup.puml Normal file

@ -0,0 +1,32 @@
#
# Copyright ⓒ 2023 Chakib Ben Ziane <contact@blob42.xyz> and [`GoSuki` contributors](https://github.com/blob42/gosuki/graphs/contributors).
#
# All rights reserved.
#
# SPDX-License-Identifier: AGPL-3.0-or-later
#
# This file is part of GoSuki.
#
# GoSuki is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
#
# GoSuki is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License along with gosuki. If not, see <http://www.gnu.org/licenses/>.
@startuml
Gosuki --> Gosuki: start server firefox
Gosuki --> Gosuki: init
Gosuki --> Config: init
Gosuki <-- Firefox: import
Firefox --> Firefox: init
Firefox --> Config: register conf
note over Config
reg cmd global flags
reg config ready hook
end note
note over Config
set profile gosuki
end note
Gosuki --> Config: app:before:RunConfHook
Firefox --> Firefox: provision browser
Gosuki --> Firefox: Run()
@enduml

30
docs/home.md Normal file

@ -0,0 +1,30 @@
## References and links:
### Buku bookmark manager
Buku is a Python bookmark manager; it implements most of the algorithms and functions needed for our project. The `buku.py` file is included in the repository as a reference for functions to implement.
### Other links
- [Algorithms and data structures used](Algorithms-and-Data-Structres)
- [Fast json parsing library in golang (not depending on encoding/json)](https://github.com/buger/jsonparser)
- [Parsing and loading firefox/chrome bookmarks in buku](https://github.com/jarun/Buku/issues/175)
- [Sqlite indexes](https://www.tutorialspoint.com/sqlite/sqlite_indexes.htm) and [sqlite planner](https://www.sqlite.org/queryplanner.html)
- [Sqlite quick tutorial](http://tech.marksblogg.com/sqlite3-tutorial-and-guide.html)
#### Misc
- https://github.com/unode/firefox_decrypt
### Project Code Architecture
### Database types
#### Cache
Name: memcache
Used as memory buffer between Gosuki and the local gosuki database `gosuki.db`
#### Buffer
Name: buffer_<browser>
Used as memory buffer between gosuki and <browser> bookmarks

37
docs/module-loading.d2 Normal file

@ -0,0 +1,37 @@
modTypes: module types {
browser
simple
}
modules.shape: package
modules: modules package {
mod1
mod2
".."
}
gosuki <- modules.mod1: RegisterModule()
gosuki -> daemon: start
isBrowser: is browser
isBrowser.shape: diamond
modules -> daemon: GetModules()
daemon -> isBrowser: each mod
setupBrowser: browser setup
setupMod: simple setup
isBrowser -> setupBrowser: yes
isBrowser -> setupMod: no
loadRun: load and run
loadRun.shape: oval
setupBrowser -> loadRun
setupMod -> loadRun

BIN
docs/module-loading.png Normal file

Binary file not shown.

Width:  |  Height:  |  Size: 406 KiB


@ -0,0 +1,75 @@
title: |md
# modules & interfaces
| {near: top-center}
modules.shape: package
profiles.shape: package
watch.shape: package
classes: {
if: {
shape: page
}
}
initializer.class: if
loader.class: if
shutdowner.class: if
explanation_left: |md
# ProfileManager
For modules that want to support multiple profiles.
See browser modules for reference.
| {near: center-left}
explanation_right: |md
# Initializer / ProfileInitializer
The very first code run by gosuki after importing the module.
Use it to initialize the state of the module at the earliest stage.
# Loader
Implement this if you want to preload bookmarks and data before the
module enters the main loop.
# Runners
The core logic of bookmark loading happens within runners.
Every module is `required` to implement one of the runner interfaces.
| { near: center-right }
runners {
watchRunner.class: if
intervalFetcher.class: if
}
watch -- runners.watchRunner
watch -- runners.intervalFetcher
profileManager.class: if
profileInitializer.class: if
profiles -- profileManager
profiles -- profileInitializer
modules -- profileInitializer
modules -- initializer
profileInitializer -- loader
initializer -- loader
loader -- runners
runners -- shutdowner
legend: {
style: {
stroke: transparent
fill: transparent
}
near: bottom-left
packages.shape: package
interfaces.shape: page
}

Binary file not shown.

Width:  |  Height:  |  Size: 368 KiB

BIN
docs/mozilla-places.png Normal file

Binary file not shown.

Width:  |  Height:  |  Size: 53 KiB


@ -0,0 +1,14 @@
- https://www.reddit.com/r/firefox/comments/dez7hh/does_anyone_actually_use_mobile_bookmarks/
~~https://xyne.archlinux.ca/projects/moz2fs/~~
- https://xyne.dev/projects/moz2fs/
- https://github.com/a5huynh/spyglass/blob/main/plugins/firefox-importer/src/main.rs
# Miniflux discussions on bookmarking and RSS Feeds
- https://github.com/miniflux/v2/issues/329
- https://github.com/miniflux/v2/issues/1484
* Discussion about how people use bookmarks https://lemmy.ml/post/579361

BIN
docs/search-page-mockup.png Normal file

Binary file not shown.

Width:  |  Height:  |  Size: 28 KiB


@ -0,0 +1,66 @@
# linking urls to tags and folders at the same time
options:
1. store every bookmark under a tag node as well as the corresponding folder nodes
- a lot of code changes required, Node.Parent would point to multiple parents
- will it break tree algorithms or not ? it should not
1.2 store url nodes under tag nodes but keep their parent as folders only
2. Store a pointer to tags for each node
- if a node has no tags (folders) its tag list is empty
- requires changing Node.Tags from []string to []*Node
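Option 2 could look roughly like the sketch below. The `Node` type here is a simplified, hypothetical version and not the actual `tree.Node`; the numeric node types follow the PrintTree output (0 root, 1 URL, 2 folder, 3 tag), while `AddChild`, `AddTag` and `TagNames` are invented for the example.

```go
package main

import "fmt"

type NodeType int

const (
	RootNode   NodeType = iota // 0
	URLNode                    // 1
	FolderNode                 // 2
	TagNode                    // 3
)

// Node keeps a single Parent (its folder) and a list of pointers to tag nodes.
type Node struct {
	Name     string
	Type     NodeType
	Parent   *Node
	Children []*Node
	Tags     []*Node // empty for nodes without tags, such as folders
}

// AddChild attaches c under n and sets its parent link.
func (n *Node) AddChild(c *Node) {
	c.Parent = n
	n.Children = append(n.Children, c)
}

// AddTag links n to a tag node, ignoring duplicates.
func (n *Node) AddTag(tag *Node) {
	for _, t := range n.Tags {
		if t == tag {
			return
		}
	}
	n.Tags = append(n.Tags, tag)
}

// TagNames recovers the old []string view of the tags.
func (n *Node) TagNames() []string {
	names := make([]string, 0, len(n.Tags))
	for _, t := range n.Tags {
		names = append(names, t.Name)
	}
	return names
}

func main() {
	root := &Node{Name: "root", Type: RootNode}
	toolbar := &Node{Name: "Bookmarks Toolbar", Type: FolderNode}
	root.AddChild(toolbar)

	golang := &Node{Name: "golang", Type: TagNode}
	programming := &Node{Name: "programming", Type: TagNode}

	goSite := &Node{Name: "The Go Programming Language", Type: URLNode}
	toolbar.AddChild(goSite)
	goSite.AddTag(golang)
	goSite.AddTag(programming)
	goSite.AddTag(golang) // duplicate, ignored

	fmt.Println(goSite.Parent.Name, goSite.TagNames())
	// prints: Bookmarks Toolbar [golang programming]
}
```

With this layout `Node.Parent` stays a single pointer, so tree traversals keep working, and code that currently appends `tagNode.Name` would append the tag node itself.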
```
| PrintTree
| .
| └── 0 <root>
| ├── 2 <Bookmarks Menu>
| │   ├── 2 <Mozilla Firefox>
| │   └── 2 <GosukiMenu>
| ├── 2 <Bookmarks Toolbar>
| │   ├── 2 <cooking>
| │   │   └── 2 <indian>
| │   └── 2 <Travel>
| ├── 2 <Other Bookmarks>
| ├── 2 <Mobile Bookmarks>
| └── 0 <TAGS>
| ├── 3 <golang>
| │   └── 1 <The Go Programming Language>
| ├── 3 <programming>
| │   ├── 1 <The Go Programming Language>
| │   └── 1 <Rust Programming Language>
| ├── 3 <>
| │   └── 1 <Indian Cooking at Home: A Beginner's Guide | Taste of Home>
| ├── 3 <based>
| │   └── 1 <Based Cooking>
| ├── 3 <rust>
| │   └── 1 <Rust Programming Language>
| ├── 3 <systems>
| │   └── 1 <Rust Programming Language>
| ├── 3 <budapest>
| │   └── 1 <Budapest - Official tourist information >
| ├── 3 <gosuki>
| │   └── 1 <universal bookmark tracker>
| └── 3 <libre>
| └── 1 <Front Page Free Software Foundation working together for free software>
```
## References to Node.Tags:
firefox/firefox.go|265 col 43| ./firefox/firefox.go Tags Reference 6 of: 11
firefox/firefox.go|265 col 43| 265: urlNode.Tags = append(urlNode.Tags, tagNode.Name) scanBookmarks()
firefox/firefox.go|536 col 15| 536: urlNode.Tags = utils.Extends(urlNode.Tags, tagNode.Name) Run()
firefox/firefox.go|787 col 33| 787: urlNode.Tags = append(urlNode.Tags, tagNode.Name) loadBookmarks()
parsing/parse.go|55 col 8| ./parsing/parse.go Reference 3 of: 11
parsing/parse.go|55 col 8| 55: node.Tags = append(node.Tags, m[1]) ParseTags()
parsing/parse.go|59 col 14| 59: if len(node.Tags) > 0 ParseTags()
parsing/parse.go|60 col 58| 60: log.Debugf("[in title] found following tags: %s", node.Tags) ParseTags()
tree/tree.go|31 col 2| ./tree/tree.go Reference 1
tree/tree.go|31 col 2| 31: Tags []string

58
tmp_readme Normal file

@ -0,0 +1,58 @@
# Gosuki: Your Universal, Plugin-Free, Multi-Browser Bookmark Manager
## What's this ? Oh, just another bookmark organizer. No big deal.
Ever feel like your bookmarks are a chaotic mess? Ever find yourself swapping browsers when they can't keep up with your demands? Have past bookmark managers let you down, locking you in or abandoning you in your time of need? Maybe you're cautious about entrusting your bookmarks to unreliable cloud companies? Or perhaps you keep multiple browser profiles for work, research, streaming, and web development? Or are you a monk using an "obscure" browser that doesn't support plugins?
If you're nodding your head to any of the above, then look no further - Gosuki is a blazing fast real-time bookmarks sniffer and synchronizer.
It doesn't just detect installed browsers in the system, oh no. It magically watches for bookmark changes without needing any of those pesky external plugins. How? By manually reading the bookmark files. Because who needs plugins when you can do it the hard way, right?
But its brilliance doesn't just stop at detecting browsers installed in your system. Gosuki is always on guard, identifying bookmark modifications by reading the bookmark files directly. 'Coz why take the conventional route when you can opt for the more exciting one? Brace for bookmark ease like you've never experienced before.
## Features in a nutshell
- Standalone Functionality: Gosuki is a single binary with no dependencies or external plugins necessary. It's designed to just work right out of the box.
- Quick Bookmarking: Gosuki leverages the universal shortcut `ctrl+d` with native bookmarks UI that exists in all browsers.
- Constant Monitoring: Gosuki keeps track of your bookmarks, spotting any changes as they happen.
- Tagging: You can tag your bookmarks in any browser. In Chrome, for example, you can include `#tag1 #tag2` in your bookmark title.
- Customizable: You can add commands in your bookmark title to initiate certain Gosuki actions, like archiving a bookmark with `:archive`.
- Multi Browser: Detects which browsers you have installed and watches for changes in all of them.
- Profile Support: Also handles multiple profiles for each browser.
- Buku Compatibility: Gosuki uses a [Buku] compatible sqlite database, so you can use any program that was made for buku.
## What's this ? Oh, just another bookmark organizer. No big deal.
Gosuki is a blazing fast real-time bookmarks sniffer and synchronizer.
Do you have your bookmarks all over the place ? Do you often hop to new browsers
because they stopped fulfilling your needs ? Did you try multiple bookmark
managers where the company abandoned the project and left you vendor-locked ? Do you
refuse to leave your precious bookmarks in the hands of shady cloud companies ?
Do you use multiple browsing profiles for work, research, streaming, web
development ? Are you a monk that uses some "arcane" browser for which no plugins exist ?
If you answered `yes` to any of the previous questions then Gosuki is for you!
Gosuki is a bookmark organizer and manager that works with any browser without
requiring any plugins to be installed. It doesn't just detect installed browsers
in the system, oh no. It magically watches for bookmark changes without needing
any of those pesky external plugins. How? By manually reading the bookmark
files. Because who needs plugins when you can do it the hard way, right?
- Runs in the background as a service and detects bookmark changes in real-time
- Add bookmarks anywhere using just `ctrl+d`: the universal bookmark shortcut.
- Adds `tag` support to browsers that don't support it natively.
In Chrome you can add tags by adding `#tag1 #tag2` to the bookmark title.
- Pass custom commands in the native bookmark window of your browser to
trigger Gosuki actions. For example, using `:archive` in the bookmark title
will archive the bookmark.
- Detects installed browsers in the system
- Supports multiple profiles for each browser
- Uses a [Buku] compatible sqlite database: Use Buku or any other compatible
program to manage your bookmarks.

1
web/.nvmrc Normal file

@ -0,0 +1 @@
v14.13.0