# Duplicate finder checks included and excluded directories
# Other tools just check the number of deleted files
# Windows build has some elements changed, like powershell before wget and unzip, python instead of python3, rm -r instead of rm -r, and one additional test (at the top) which checks the size of characters
- Improve custom selecting of records (allows using Rust regex) - [#489](https://github.com/qarmin/czkawka/pull/478)
- Add additional info to printed errors [#446](https://github.com/qarmin/czkawka/pull/446)
- Add support for multiple image filters, hashes and sizes in similar images tool [#447](https://github.com/qarmin/czkawka/pull/447), [#448](https://github.com/qarmin/czkawka/pull/448)
- Button to move files/folders to provided location [#449](https://github.com/qarmin/czkawka/pull/449)
- Add non-clickable button to fix white theme [#450](https://github.com/qarmin/czkawka/pull/450)
- Fixed freeze when opening file/folder in the same thread [#448](https://github.com/qarmin/czkawka/pull/448)
- Tool to check performance of different image filters, hash types and sizes [#447](https://github.com/qarmin/czkawka/pull/447)
- Add scheduled CI and pin it to support Rust 1.53.0 [7bb](https://github.com/qarmin/czkawka/commit/7bbdf742739a513b80d0cc06ba61dfafec976b23), [#431](https://github.com/qarmin/czkawka/pull/431)
- Add support for CRC32 and XXH3 hash - [#243](https://github.com/qarmin/czkawka/pull/243)
- Add delete method to replace duplicate files with hard links - [#236](https://github.com/qarmin/czkawka/pull/236)
- Add checking for broken music opt-in - [#249](https://github.com/qarmin/czkawka/pull/249)
- Allow saving similar images results to a file - [10156ccfd3](https://github.com/qarmin/czkawka/commit/10156ccfd3ba880d26d4bbad1e025b0050d7753b)
- Keep original file if replacing duplicate with hardlink fails - [#256](https://github.com/qarmin/czkawka/pull/256)
- Fix Windows theme - [#265](https://github.com/qarmin/czkawka/pull/265)
- Windows taskbar progress support - [#264](https://github.com/qarmin/czkawka/pull/264)
- Add support for delete button - [#159](https://github.com/qarmin/czkawka/pull/159)
- Allow to select multiple entries in File Chooser - [#154](https://github.com/qarmin/czkawka/pull/154)
- Add cache support for similar images - [#139](https://github.com/qarmin/czkawka/pull/139)
- Add selecting images with their size - [#138](https://github.com/qarmin/czkawka/pull/138)
- Modernize popovers code and simplify later changes - [#137](https://github.com/qarmin/czkawka/pull/137)
## Version 2.0.0 - 23.12.2020r
## Version 1.3.0 - 02.11.2020r
- Appimage support - [#77](https://github.com/qarmin/czkawka/pull/77)
- Removed warnings about non-existent excluded directories - [#79](https://github.com/qarmin/czkawka/pull/79)
- Added prehash support (speedup for searching big duplicates) - [#83](https://github.com/qarmin/czkawka/pull/83)
- Support for searching duplicates by file name - [#84](https://github.com/qarmin/czkawka/pull/84)
## Version 1.0.1 - 06.10.2020r
- Replaced default argument parser with StructOpt [#37](https://github.com/qarmin/czkawka/pull/37)
- Added all builds (except the macOS GTK build) to CI, where they can be freely downloaded [#41](https://github.com/qarmin/czkawka/pull/41) [#39](https://github.com/qarmin/czkawka/pull/39)
- App can also be downloaded from Arch AUR and Cargo [#36](https://github.com/qarmin/czkawka/pull/36)
- Fixed crash with invalid file modification date [#33](https://github.com/qarmin/czkawka/issues/33)
- Upper tabs can hide and show when this is necessary [#38](https://github.com/qarmin/czkawka/pull/38)
- Basic search in GTK
- Cleaned core from println
- Core functions no longer use process::exit (everything is done with the help of messages/errors/warnings)
- Added support for non-recursive search
- Improved finding number and size of duplicated files
- Saving results to file
- Print how much data was read by duplicate finder(debug only)
**Czkawka** (_tch•kav•ka_ (IPA: [ʈ͡ʂkafka]), "hiccup" in Polish) is a simple, fast and free app to remove unnecessary files from your computer.
## UNSTABLE WARNING
**Currently, the master branch of this repository contains an unstable GTK 4 port, so multiple regressions, broken features etc. are expected.**
**You can use the old stable version built with GTK 3 - [4.1.0](https://github.com/qarmin/czkawka/releases/tag/4.1.0) - or compile the app from git before the GTK 4 PR merge.**
Since Czkawka is written in Rust and it aims to be a faster alternative to FSlint or DupeGuru which are written in Python, we need to compare the speed of these tools.
I tested it on a 256 GB SSD and an i7-4770 CPU.
I prepared a disk and performed a test without any folder exceptions and with disabled ignoring of hard links. The disk contained 363 215 files, took 221,8 GB and had 62093 duplicate files in 31790 groups which occupied 4,1 GB.
### CLI
Due to limited time, the biggest emphasis is on the GUI version so if you are looking for really good and feature-packed console apps, then take a look at these:
- [Fclones](https://github.com/pkolaczk/fclones) - One of the fastest tools to find duplicates; it is also written in Rust
- [Rmlint](https://github.com/sahib/rmlint) - Nice console interface and also feature-packed
- [RdFind](https://github.com/pauldreik/rdfind) - Fast, but written in C++ ¯\\\_(ツ)\_/¯
## Contributions
Thanks also to all the people who create patches for this program, make it available on other systems, create videos, articles about it etc.
Also, I really appreciate the work of people who create the crates on which Czkawka is based, and in return I try to report bugs to make them even better.
## Donations
If you are using the app, I would appreciate a donation for its further development, which can be done [here](https://github.com/sponsors/qarmin).
file_entry_clone.error_string="Audio crashes due parsing, please report bug here - https://github.com/qarmin/audio_checker/issues".to_string();
Some(Some(file_entry_clone))
}
},
}
Err(_inspected)=>Some(None),
},
if let Err(e) = pdf::file::File::from_data(content) {
    file_entry.error_string = e.to_string();
    let error = unpack_pdf_error(e);
    if let PdfError::InvalidPassword = error {
        // Password-protected PDFs are skipped instead of being reported as broken.
        return Some(None);
    }
}
file_entry_clone.error_string="PDF-rs library crashed when opening pdf, and if it is not reported, please report bug here - https://github.com/pdf-rs/pdf".to_string();
Some(Some(file_entry_clone))
}
},
}
Err(_inspected)=>Some(None)
}
}
For now, at least 10 languages are supported (some were translated by computers)
### Opening/Manipulating files
It is possible to open selected files by double-clicking on them.
To open multiple files, select the desired files with the CTRL key pressed and, while still holding it, double-click the selected items with the left mouse button.
To open the folder containing a selected file, just click twice on it with the right mouse button.
To invert a selection of files, click on a file with the middle mouse button, and it will invert the selection of the other files in the same group.
### Adding directories
By default, the current path is loaded as the included directory and the excluded directories are filled with default paths.
It is possible to override this by adding arguments when opening the app, e.g. `czkawka_gui /home /usr --/home/rafal --/home/zaba`, which means that the `/home` and `/usr` directories will be checked and `/home/rafal` and `/home/zaba` will be excluded.
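For illustration, a minimal sketch of this argument convention, assuming only that paths prefixed with `--` are excluded and everything else is included (`split_directories` is a hypothetical helper, not czkawka's actual parser):

```rust
use std::env;
use std::path::PathBuf;

// Hypothetical helper: plain paths become included directories,
// paths prefixed with `--` become excluded directories.
fn split_directories(args: &[String]) -> (Vec<PathBuf>, Vec<PathBuf>) {
    let mut included = Vec::new();
    let mut excluded = Vec::new();
    for arg in args {
        if let Some(stripped) = arg.strip_prefix("--") {
            excluded.push(PathBuf::from(stripped));
        } else {
            included.push(PathBuf::from(arg));
        }
    }
    (included, excluded)
}

fn main() {
    // e.g. `czkawka_gui /home /usr --/home/rafal --/home/zaba`
    let args: Vec<String> = env::args().skip(1).collect();
    let (included, excluded) = split_directories(&args);
    println!("included: {:?}, excluded: {:?}", included, excluded);
}
```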
## Config/Cache files
Currently, Czkawka stores a few config and cache files on disk:
- `czkawka_gui_config.txt` - stores configuration of GUI which may be loaded at startup
- `cache_similar_image_SIZE_HASH_FILTER.bin/json` - stores cache data and hashes which may be used later without needing to compute the image hash again. Each algorithm uses its own file, because hashes are completely different in each.
- `cache_broken_files.txt` - stores cache data of broken files
- `cache_duplicates_HASH.txt` - stores cache data of duplicated files; to avoid too big a performance hit when saving/loading the file, only already fully hashed files bigger than 5MB are stored. Similar files with `Blake3` replaced by e.g. `SHA256` may be shown when support for new hashes is introduced in Czkawka.
- `cache_similar_videos.bin/json` - stores cache data of video files.
Editing `bin` files may cause strange crashes, so in case of having any, removing these files should help.
It is possible to modify files with the JSON extension (this may be helpful when moving files to a different disk or trying to use a cache file on a different computer). To do this, it is required to enable the settings option that also generates a cache JSON file. Then the file can be changed/modified. By default, cache files with the `bin` extension are loaded, but if one is missing (it can be renamed or removed), then data from the JSON file is loaded if it exists.
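To illustrate this bin-first, JSON-fallback behaviour, here is a minimal sketch assuming the serde, bincode and serde_json crates; `CacheEntry` and `load_cache` are hypothetical names, and the real cache entries store more metadata:

```rust
use std::collections::HashMap;
use std::fs::File;
use std::io::BufReader;
use std::path::Path;

use serde::{Deserialize, Serialize};

// Hypothetical, simplified cache entry; real caches store more fields.
#[derive(Serialize, Deserialize)]
struct CacheEntry {
    size: u64,
    modified_date: u64,
    hash: String,
}

// Try the binary cache first; if it is missing or unreadable, fall back to the JSON copy.
fn load_cache(dir: &Path, name: &str) -> Option<HashMap<String, CacheEntry>> {
    let bin_path = dir.join(format!("{}.bin", name));
    if let Ok(file) = File::open(&bin_path) {
        if let Ok(cache) = bincode::deserialize_from(BufReader::new(file)) {
            return Some(cache);
        }
    }
    let json_path = dir.join(format!("{}.json", name));
    if let Ok(file) = File::open(&json_path) {
        if let Ok(cache) = serde_json::from_reader(BufReader::new(file)) {
            return Some(cache);
        }
    }
    None
}
```

With such a helper, renaming or removing the `bin` file simply makes the loader fall through to the editable JSON copy.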
Config files are located in this path:
- It is possible to open the parent folder of selected items by double-clicking with the right mouse button (RMB); it is also possible to open such an item by double-clicking with the left mouse button (LMB).
- **Faster scanning for a big number of duplicates**
By default, a partial hash (computed from only the first 2 KB of each file; see the sketch after this list) is calculated for all files grouped by the same size. Such a hash is usually computed very fast, especially on SSDs and fast multicore processors, but when scanning hundreds of thousands or millions of files with an HDD or a slow processor, this step can take much time. In settings there is an option `Use prehash cache` which enables caching such hashes. It is disabled by default because it can increase the time of loading/saving the cache with a big number of entries.
- **Permanent store of cache entries**
After each scan, entries in the cache are validated and outdated ones (which point at non-existent files) are removed. This may be problematic when scanning external drives (like pendrives, disks etc.) and later unplugging and plugging them again. In settings there is an option `Delete outdated cache entries automatically` which clears this automatically, but it can be disabled. Disabling this option may create big cache files, so the button `Remove outdated results` allows doing it manually.
- **Partial scanning**
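To illustrate the prehash step from the "Faster scanning" bullet above, here is a minimal sketch assuming the blake3 crate; `prehash` is a hypothetical helper, not czkawka's actual code:

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

// Only the first 2 KB of each file are hashed in the prehash step.
const PREHASH_SIZE: usize = 2 * 1024;

// Files grouped by identical size whose prehashes differ cannot be duplicates,
// so the full (expensive) hash is skipped for them.
fn prehash(path: &Path) -> std::io::Result<String> {
    let mut file = File::open(path)?;
    let mut buffer = vec![0u8; PREHASH_SIZE];
    // A single read is enough for a sketch; real code would handle short reads.
    let read = file.read(&mut buffer)?;
    let mut hasher = blake3::Hasher::new();
    hasher.update(&buffer[..read]);
    Ok(hasher.finalize().to_hex().to_string())
}
```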
### Invalid Symlinks
To find invalid symlinks, we must first find symlinks.
After searching for them, you should check which element each one points to, and if it does not exist, add the symlink to the list of invalid symlinks as pointing to a non-existent path.
The second mode is to detect recursive symlinks. Unfortunately, this mode does not work properly and reports an error about a non-existent target element, but it is implemented by counting the jumps of the symlink; after exceeding a certain number (e.g. 20), the given symlink is considered recursive.
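A minimal sketch of such a check, assuming the jump limit of 20 mentioned above; `check_symlink` is a hypothetical helper, not czkawka's actual implementation:

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Threshold from the text above; after this many jumps the symlink is treated as recursive.
const MAX_SYMLINK_JUMPS: u32 = 20;

// Returns Some(reason) when the symlink is considered invalid, None when it looks fine.
fn check_symlink(path: &Path) -> Option<&'static str> {
    let mut current: PathBuf = path.to_path_buf();
    for _ in 0..MAX_SYMLINK_JUMPS {
        match fs::read_link(&current) {
            Ok(target) => {
                // Relative targets are resolved against the symlink's parent directory.
                current = if target.is_absolute() {
                    target
                } else {
                    match current.parent() {
                        Some(parent) => parent.join(&target),
                        None => target,
                    }
                };
            }
            Err(_) => {
                // `current` is no longer a symlink: either it is a real file/directory
                // (the chain is fine) or it does not exist (the chain is broken).
                return if current.exists() {
                    None
                } else {
                    Some("non-existent destination")
                };
            }
        }
    }
    Some("probably a recursive symlink (too many jumps)")
}
```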
### Same Music
This is a mode to find identical music files through tags.
The tool first collects images with specific extensions that can be checked - `[".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif", ".pnm", ".tga", ".ff", ".gif", ".jif", ".jfi", ".ico", ".webp", ".avif"]`.
Next, cached data is loaded from file to prevent hashing the same file twice.
Cache entries which point to non-existing data are deleted automatically by default.
Then a perceptual hash is created for each image which isn't available in cache.
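For illustration, a minimal sketch of computing such a perceptual hash, assuming the img_hash crate (with a compatible version of the image crate); the filter and hash size here are illustrative, not necessarily Czkawka's defaults:

```rust
use img_hash::{HashAlg, HasherConfig};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Illustrative settings: the Gradient filter with an 8x8 hash.
    let hasher = HasherConfig::new()
        .hash_alg(HashAlg::Gradient)
        .hash_size(8, 8)
        .to_hasher();

    let image = image::open("test.jpg")?;
    let hash = hasher.hash_image(&image);
    println!("perceptual hash: {}", hash.to_base64());
    Ok(())
}
```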
You can test each algorithm with the provided CLI tool: just put a `test.jpg` file into a folder and run this command inside it: `czkawka_cli tester -i`
The faster compare option allows comparing results only once, so checking should work a lot faster when using a higher similarity number.
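Similarity here boils down to the Hamming distance between two perceptual hashes; a minimal, self-contained sketch with a hypothetical threshold value:

```rust
// Hamming distance between two perceptual hashes stored as raw bytes:
// the number of bits in which they differ.
fn hamming_distance(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    let hash_a = [0b1010_1010u8, 0b1111_0000];
    let hash_b = [0b1010_1000u8, 0b0111_0000];
    // Hypothetical threshold: two images count as similar when the distance
    // does not exceed the chosen similarity level.
    let similarity = 10;
    let distance = hamming_distance(&hash_a, &hash_b);
    println!("distance = {}, similar = {}", distance, distance <= similarity);
}
```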
Some tidbits:
- A smaller hash size does not always mean that calculating it will take more time