We started to use lazycell because syntect already used it. But syntect has
changed to use once_cell. So we should also do that to prepare for using the
upcoming version of syntect.
I had to use `lazy_static` because the clap API only accepts a
reference to a version string, and our code requires that reference to be
`'static`.
The code could probably be refactored to accept a "normal" reference, but that would
be a major undertaking.
Forward the task of finding the `Plain Text` syntax to `assets`. Not only does
the code become simpler; we also get rid of a call to `self.get_syntax_set()`,
which is beneficial to the long-term goal of replacing `syntaxes.bin` with
`minimal_syntaxes.bin`.
Note that the use of `.expect()` is not a regression in error handling. It was
previously hidden in `.find_syntax_plain_text()`.
This information is useful when you want to build several SyntaxSets, but
without having to duplicate SyntaxDefinitions. For example:
"Rust" has no dependencies. But "Markdown" depends on "Rust". With the data
structures this code adds, we know that "Rust" is a dependent syntax for
"Markdown", and can construct a SyntaxSet that takes that into account.
Note that the code has a temporary environment flag to ignore any information about
dependents when constructing SyntaxSets. Code that makes use of the new data
structure will be added later.
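As a rough illustration of the "Rust"/"Markdown" example above, the sketch below collects a syntax together with everything it transitively depends on. The `Dependencies` type and both function names are hypothetical and only stand in for the real data structures; plain strings are used instead of SyntaxDefinitions.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical map from a syntax name to the syntaxes it depends on.
type Dependencies = BTreeMap<&'static str, Vec<&'static str>>;

/// Collects `syntax` and everything it transitively depends on.
fn syntaxes_needed(syntax: &'static str, deps: &Dependencies) -> BTreeSet<&'static str> {
    let mut needed = BTreeSet::new();
    let mut queue = vec![syntax];
    while let Some(name) = queue.pop() {
        // `insert` returns false for names already visited, which also
        // protects against dependency cycles.
        if needed.insert(name) {
            if let Some(children) = deps.get(name) {
                queue.extend(children.iter().copied());
            }
        }
    }
    needed
}

/// The example from the text: "Rust" stands alone, "Markdown" needs "Rust".
fn example_dependencies() -> Dependencies {
    let mut deps = Dependencies::new();
    deps.insert("Rust", vec![]);
    deps.insert("Markdown", vec!["Rust"]);
    deps
}
```

A set built for "Rust" would then contain only "Rust", while a set built for "Markdown" would contain both.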
This significantly speeds up the startup time of bat, since only a single
linked SyntaxDefinition is loaded for each file. The size increase of the
binary is just ~400 kB.
In order for startup time to be improved, the --language arg must be used, and
it must match one of the following names:
"Plain Text", "ActionScript", "AppleScript", "Batch File", "NAnt Build File",
"C#", "C", "CSS", "D", "Diff", "Erlang", "Go", "Haskell", "JSON", "Java
Properties", "BibTeX", "LaTeX Log", "TeX", "Lisp", "Lua", "MATLAB", "Pascal",
"R", "Regular Expression", "Rust", "SQL", "Scala", "Tcl", "XML", "YAML", "Apache
Conf", "ARM Assembly", "Assembly (x86_64)", "CMakeCache", "Comma Separated
Values", "Cabal", "CoffeeScript", "CpuInfo", "Dart Analysis Output", "Dart",
"Dockerfile", "DotENV", "F#", "Friendly Interactive Shell (fish)", "Fortran
(Fixed Form)", "Fortran (Modern)", "Fortran Namelist", "fstab", "GLSL",
"GraphQL", "Groff/troff", "group", "hosts", "INI", "Jinja2", "jsonnet",
"Kotlin", "Less", "LLVM", "Lean", "MemInfo", "Nim", "Ninja", "Nix", "passwd",
"PowerShell", "Protocol Buffer (TEXT)", "Puppet", "Rego", "resolv", "Robot
Framework", "SML", "Strace", "Stylus", "Solidity", "Vyper", "Swift",
"SystemVerilog", "TOML", "Terraform", "TypeScript", "TypeScriptReact",
"Verilog", "VimL", "Zig", "gnuplot", "log", "requirements.txt", "Highlight
non-printables", "Private Key", "varlink"
Later commits will improve startup time for more code paths.
* fix some typos and misspellings
* CHANGELOG.md: Add Performance section (preliminary)
* Add a CHANGELOG.md entry for this PR
This will be needed to later support zero-copy deserialization of independent
syntax sets, but is interesting and useful on its own.
Instead of deferring serialization and deserialization to syntect, we implement it
ourselves in the same way, but make compression optional.
We can't use #[from] on Error::Msg(String) because String does not implement Error.
(Which it shouldn't; see e.g. https://internals.rust-lang.org/t/impl-error-for-string/8881.)
So we implement From manually for Error::Msg, since our current code was written
in that way for error-chain.
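A minimal sketch of the manual conversion, written without any derive macro so that it is self-contained. The variants and the `require_non_empty` helper are illustrative assumptions, not the actual error type:

```rust
use std::fmt;

#[derive(Debug)]
enum Error {
    Io(std::io::Error),
    Msg(String),
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::Io(err) => write!(f, "{}", err),
            Error::Msg(msg) => write!(f, "{}", msg),
        }
    }
}

impl std::error::Error for Error {}

// Roughly what a derived conversion would provide for a source error type.
impl From<std::io::Error> for Error {
    fn from(err: std::io::Error) -> Self {
        Error::Io(err)
    }
}

// Written by hand, because `String` is not itself an error type.
impl From<String> for Error {
    fn from(msg: String) -> Self {
        Error::Msg(msg)
    }
}

// Hypothetical caller in the style inherited from error-chain:
// a formatted message is turned into an error with `.into()`.
fn require_non_empty(name: &str) -> Result<(), Error> {
    if name.is_empty() {
        return Err(format!("'{}' is not a valid name", name).into());
    }
    Ok(())
}
```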
Move code to build assets to its own file. That results in better modularity and flexibility.
It also allows us to simplify HighlightingAssets a lot, since it will now always
be initialized with a SerializedSyntaxSet.
To improve startup performance, we will later load smaller `SyntaxSet`s instead
of one giant one. However, the current API assumes that only one `SyntaxSet` is ever
used, and that it is implicitly the `SyntaxSet` from which returned `SyntaxReference`s
come.
This change updates the API to reflect that `SyntaxSet` and `SyntaxReference`
are tightly coupled, and enables the use of several `SyntaxSet`s.
Instead of 50 ms - 100 ms, startup takes 5 ms - 10 ms.
HighlightingAssets::get_syntax_set() is never called when e.g. piping the bat
output to a file (see Config::loop_through), so by loading the SyntaxSet only
when needed, we radically improve startup time when it is not needed.
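The idea can be sketched with the standard library's `OnceCell` (used here as a stand-in; the struct, its fields, and the line-based "deserialization" are assumptions made for illustration). The expensive step runs on first access only, so code paths that never ask for the syntax set never pay for it:

```rust
use std::cell::{Cell, OnceCell};

struct LazyAssets {
    serialized: Vec<u8>,
    // Stand-in for a SyntaxSet: a list of syntax names.
    syntax_set: OnceCell<Vec<String>>,
    // Counts how often the expensive step ran (for demonstration only).
    loads: Cell<u32>,
}

impl LazyAssets {
    fn new(serialized: Vec<u8>) -> Self {
        LazyAssets {
            serialized,
            syntax_set: OnceCell::new(),
            loads: Cell::new(0),
        }
    }

    /// Deserializes on first use and returns the cached value afterwards.
    fn get_syntax_set(&self) -> &[String] {
        self.syntax_set.get_or_init(|| {
            self.loads.set(self.loads.get() + 1);
            String::from_utf8_lossy(&self.serialized)
                .lines()
                .map(str::to_owned)
                .collect()
        })
    }

    fn load_count(&self) -> u32 {
        self.loads.get()
    }
}
```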
They are just a way to get access to data embedded in the binary, so they don't
conceptually belong inside HighlightingAssets.
This has the nice side effect of placing HighlightingAssets::from_cache() and
::from_binary(), which are closely related, next to each other.
Or rather, introduce new versions of these methods and deprecate the old ones.
This is preparation to enable robust and user-friendly support for lazy-loading.
With lazy-loading, we don't know if the SyntaxSet is valid until after we try to
use it, so wherever we try to use it, we need to return a Result. See discussion
about panics in #1747.
Using BufReader makes sense for large files, but assets are never large enough
to require buffering. It is significantly faster to load the file contents in
one go, so let's do that instead.
Closes #1753
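For illustration, the two ways of reading a small file look roughly like this. Both function names are hypothetical, and the sketch only shows the reading step, not the deserialization that follows it:

```rust
use std::fs::{self, File};
use std::io::{self, BufReader, Read};
use std::path::Path;

/// Reads the whole file in one go; a good fit for small asset files.
fn load_asset(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}

/// The buffered variant, which pays off mainly for large or streamed inputs.
fn load_asset_buffered(path: &Path) -> io::Result<Vec<u8>> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut contents = Vec::new();
    reader.read_to_end(&mut contents)?;
    Ok(contents)
}
```

Both return the same bytes; the difference is only in how the data gets from disk into memory.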
It already reduces code duplication slightly, but will become even more
useful in the future when we add more complicated logic such as lazy-loading.
Since we only modify `pub(crate)` items, the stable bat-as-a-library API is not
affected.
This takes us one step closer to making SyntaxSet lazy-loaded, which in turn
takes us one step closer to solving #951.
This fixes a bug on Windows where `Command::new` would also run
executables from the current working directory, possibly resulting in
accidental runs of programs called `less`.
Otherwise Rust 1.53.0 gets confused during `cargo doc` because it thinks
we want an actual URL:
    warning: this URL is not a hyperlink
       --> src/pretty_printer.rs:331:40
        |
    331 |     /// The title for the input (e.g. "http://example.com/example.txt")
        |                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ help: use an automatic link instead: `<http://example.com/example.txt>`
        |
        = note: `#[warn(rustdoc::bare_urls)]` on by default
        = note: bare URLs are not automatically turned into clickable links
It was perhaps also a bit confusing to give a URL as an example in the
first place, because according to our own API example
`examples/inputs.rs` it is meant to be more of a free-text field.
Less 581.2 is here, and it has a ".2" in the version string, which can't
be parsed as a usize.
Update the check to find a non-digit character rather than a space. This
ignores the minor version, but parses the major version correctly.
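A minimal sketch of such a check, assuming the version output starts with `less ` followed by the version number. The function name and the exact input format are assumptions made for illustration:

```rust
/// Extracts the major version from output such as "less 581.2 (...)".
fn parse_less_major_version(output: &str) -> Option<usize> {
    let rest = output.strip_prefix("less ")?;
    // Stop at the first non-digit character (a space, a '.', ...),
    // or take everything if the string consists of digits only.
    let end = rest
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(rest.len());
    rest[..end].parse().ok()
}
```

With this approach, "less 581.2 ..." and "less 551 ..." both yield a major version, while output that does not start with `less ` yields `None`.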
Do not ignore `BAT_CONFIG_PATH` if the file it points to doesn't exist, both
when generating a new config file with `--generate-config-file` and
when attempting to read the config.
Also, provide a better error message in case the file cannot
be created.
closes #1550
closes #1510
The change in `create_highlighted_versions.py` fixes an "unknown theme"
warning for the theme name "'1337'". The single quotes were wrong. `bat` was
always falling back to the default theme, so let's use that for now.
Fixed by implementing the proposal by sharkdp:
* Allow PAGER=bat, but ignore the setting in bat and simply default to
less, unless of course BAT_PAGER or --pager is used to override the
value of PAGER.
* Disallow the usage of bat within BAT_PAGER and --pager.
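The two rules above could be sketched as follows. This is not the actual implementation: the function names, the error text, and the assumption that `--pager` takes precedence over BAT_PAGER are all illustrative.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// True if the command's program name is `bat` (e.g. "bat", "/usr/bin/bat -p").
fn is_bat(command: &str) -> bool {
    let program = command.split_whitespace().next().unwrap_or("");
    Path::new(program).file_stem() == Some(OsStr::new("bat"))
}

/// Picks the pager from `--pager`, BAT_PAGER and PAGER (in that assumed order).
fn resolve_pager(
    cli_pager: Option<&str>,
    bat_pager: Option<&str>,
    pager: Option<&str>,
) -> Result<String, String> {
    // Explicit bat-specific settings: using bat here is an error.
    if let Some(explicit) = cli_pager.or(bat_pager) {
        if is_bat(explicit) {
            return Err(format!("'{}' cannot be used as a pager for bat", explicit));
        }
        return Ok(explicit.to_string());
    }
    // Generic PAGER: silently ignore PAGER=bat and fall back to less.
    match pager {
        Some(p) if !is_bat(p) => Ok(p.to_string()),
        _ => Ok("less".to_string()),
    }
}
```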
This will fix #614 by making it clear what is wrong by showing the
following error message:
    Failed to load one or more themes from
    '/Users/me/.config/bat/themes' (reason: 'Invalid syntax theme
    settings')
We also need to check theme_dir.exists(), otherwise an absent
dir will look like an error:
    Failed to load one or more themes from
    '/Users/me/.config/bat/themes' (reason: 'IO error for
    operation on /Users/me/.config/bat/themes: No such file or
    directory (os error 2)')
(This is the same check we already have for syntax_dir.)
To trigger/verify the changed code, run
    bat --list-languages # or -L
This is the last clippy warning in the code that you get if you run
    cargo clippy --all-targets --all-features -- --allow clippy::style
so fixing it makes it easier to spot when a new warning is
introduced (one that does not belong to the clippy category clippy::style).
And by making it easy to spot new warnings, we increase the chance that such
regressions do not end up in the code base.
This macro is intended to be package-internal and is not to be
considered part of the public lib API.
Use it in three places to reduce code duplication. However, the main reason
for this refactoring is to allow us to fix #1063 without duplicating the
code yet another time.
The macro can also be used for the "Binary content from {} will not be
printed to the terminal" message if that message starts to use eprintln!
instead (if ever).
To trigger/verify the changed code, the following commands can be used:
    cargo run -- --theme=ansi-light tests/examples/single-line.txt
    cargo run -- --theme=does-not-exist tests/examples/single-line.txt
    cargo run -- --style=grid,rule tests/examples/single-line.txt
This combines ansi-light and ansi-dark into a single theme that works
with both light and dark backgrounds. Instead of specifying white/black,
the ansi theme uses the terminal's default foreground/background color
by setting alpha=01, i.e. #00000001. This is in addition to the alpha=00
encoding where red contains an ANSI color palette number.
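To make the encoding concrete, here is a small sketch of how a theme color could be interpreted based on its alpha channel. The enum and function are hypothetical names introduced for illustration only:

```rust
#[derive(Debug, PartialEq)]
enum TerminalColor {
    /// alpha = 01: use the terminal's default foreground/background.
    Default,
    /// alpha = 00: the red channel holds an ANSI palette number.
    Palette(u8),
    /// Any other alpha: an ordinary RGB color.
    Rgb(u8, u8, u8),
}

fn decode_theme_color(r: u8, g: u8, b: u8, a: u8) -> TerminalColor {
    match a {
        0x00 => TerminalColor::Palette(r),
        0x01 => TerminalColor::Default,
        _ => TerminalColor::Rgb(r, g, b),
    }
}
```

Under this reading, `#00000001` selects the terminal default, and `#02000000` selects palette entry 2.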
Now, `--theme ansi-light` and `--theme ansi-dark` will print a
deprecation notice and use ansi instead (unless the user has a custom
theme named ansi-light or ansi-dark, which would take precedence).
This fixes #1438.
Note, however, that using a pager such as less will add a newline itself.
So to actually not print a newline for such files, you need to either
disable paging:
    bat --style=plain --paging=never no-newline-at-end-of-file.txt
or use a "pager" that does not add a newline:
    bat --style=plain --pager=cat no-newline-at-end-of-file.txt
Note that we also update the syntax test files, since a bunch of them were
missing a newline on the last line.
Since it has a functional role, we cannot just replace it; we must keep
it around. This also allows us to simplify the code slightly.
We must fix this before we fix #1438, since otherwise the \n will be
missing with --style=plain once we stop adding it when it is
missing.
As mentioned in #1334, `bat` should not be used as a value for `pager`.
This change checks for the value `bat` whether it is provided as a parameter or
as an environment variable.
Prevent less from wrapping lines by setting the proper flag when `--wrap=never`.
If the user set a custom value for `--pager`, no additional flag is set.
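A possible shape for this logic, as a sketch only: the `Wrap` enum and function name are hypothetical, and `--chop-long-lines` (the long form of less's `-S` option) is assumed to be the flag in question, since the text does not name it.

```rust
#[derive(Clone, Copy, PartialEq)]
enum Wrap {
    Character,
    Never,
}

/// Extra flags to pass to less, given the wrapping mode and whether the
/// user supplied their own `--pager` value.
fn extra_less_flags(wrap: Wrap, custom_pager: bool) -> Vec<&'static str> {
    if custom_pager {
        // Respect the user's pager command as-is.
        return Vec::new();
    }
    match wrap {
        Wrap::Never => vec!["--chop-long-lines"],
        Wrap::Character => Vec::new(),
    }
}
```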
Bat already has a base16 theme. The new base16-256 theme is for users
of base16-shell, who configure their terminal with a 256-color variant
of a base16 theme. These variants put some of the base16 colors
elsewhere in the 256-color table to avoid clobbering bright color slots
(ansi codes 8 to 15) with colors that don't respect the ordinary meaning
of that slot (e.g. bright green in ordinary base16 is not green).
For more details, see https://github.com/chriskempson/base16-shell
## [why]
For Windows platforms, directly spawning a process (e.g., called PATHNAME) bypasses the
usual Windows shell machinery for determining which process to execute. Specifically,
the extensions in PATHEXT will not be used to determine the final executable. So,
`PATHNAME.bat`, `PATHNAME.cmd`, ... will *not* be executed even if they exist on the
PATH, which is counter to the usual expectation of a Windows user. Additionally,
built-in commands, such as `echo` and `dir`, will never be accessible: they have no
executable on the PATH and so will never be found.
To use the usual machinery, giving access to `PATHNAME.bat` and `echo`, execute PATHNAME
using the Windows shell, e.g. `cmd /d/c PATHNAME`. Note that this may expose the constructed
command line to the Windows shell's quoting vagaries (sadly, that may be part of the price).
Following Windows standards, the ComSpec environment variable is used to determine which
shell to use, with a fallback to the "modern", built-in `cmd` shell.
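A rough sketch of the shell selection described above. The function name is hypothetical, the ComSpec value is passed in as a parameter to keep the example testable, and the switches are given as separate `/d` and `/c` arguments rather than the combined `/d/c` spelling:

```rust
/// Returns the shell program and its arguments for running `pathname`
/// through the Windows shell.
fn windows_shell_invocation(comspec: Option<&str>, pathname: &str) -> (String, Vec<String>) {
    // Prefer ComSpec; fall back to the built-in `cmd` shell if it is
    // unset or empty.
    let shell = comspec
        .filter(|value| !value.is_empty())
        .unwrap_or("cmd")
        .to_string();
    let args = vec!["/d".to_string(), "/c".to_string(), pathname.to_string()];
    (shell, args)
}
```

In real code the first argument would typically come from something like `std::env::var("ComSpec").ok().as_deref()`, and the result would be handed to `std::process::Command::new(shell).args(args)`.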
This changes the base16 theme back from #RRGGBB0f to #RRGGBB00,
reverting part of #934. That PR used the 0f encoding to produce ANSI
escape sequences 30-37 and 40-47 rather than 38;5 and 48;5, which require
256-color support. Unfortunately, it resulted in base16 using the wrong
colors because ansi_term does not support the bright variants (90-97 and
100-107), so it simply mapped them to the non-bright colors.
This PR combines the 00 and 0f alpha encodings into 00, and makes
them use the Color enum for the first 8 colors and Fixed otherwise. This
means the ansi-light and ansi-dark themes will work on terminals without
256-color support, and base16 will render bright colors correctly.
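The effect on the emitted escape sequences can be sketched without any terminal library. The function below is a hypothetical illustration, not the code in this PR: palette numbers 0-7 map to the basic sequences (30-37 for foreground, 40-47 for background), and everything else uses the 256-color forms (38;5;n and 48;5;n).

```rust
/// Builds the escape sequence that selects palette color `n`.
fn palette_escape(n: u8, background: bool) -> String {
    if n < 8 {
        // Basic colors: understood even without 256-color support.
        let base: u8 = if background { 40 } else { 30 };
        format!("\x1b[{}m", base + n)
    } else {
        // Extended colors: require 256-color support.
        let selector = if background { 48 } else { 38 };
        format!("\x1b[{};5;{}m", selector, n)
    }
}
```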