
readability-cli

Firefox Reader View in your terminal!

readability-cli takes any HTML page and strips out unnecessary bloat by using Mozilla's Readability library. As a result, you get a web page which contains only the core content and nothing more. The resulting HTML is suitable for terminal browsers, text readers, and other uses.

Here is a before-and-after comparison, using an article from The Guardian as a test subject.

Standard view in W3M

An article from The Guardian in W3M

So much useless stuff that the main article does not even fit on the screen!

readability-cli + W3M

An article from The Guardian in W3M using readability-cli

Ah, much better.

Installation

The best way to install readability-cli is through NPM:

npm install -g @gardenapple/readability-cli

Usage

readable [SOURCE] [options]

readable [options] -- [SOURCE]

where SOURCE is a file, an http(s) URL, or '-' for standard input

See readable --help for more information.
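
The three kinds of SOURCE described above can be told apart with a small amount of logic. The following is only a rough sketch of that idea, not code taken from readability-cli itself; `classifySource` is a hypothetical helper name, and the real tool may decide differently.

```javascript
// Hypothetical helper, NOT readability-cli's actual implementation.
// Decides how a SOURCE argument could be interpreted.
function classifySource(source) {
  // A lone dash conventionally means "read from standard input".
  if (source === '-') return 'stdin';
  // Anything starting with http:// or https:// is treated as a URL.
  if (/^https?:\/\//i.test(source)) return 'url';
  // Everything else is assumed to be a path to a local file.
  return 'file';
}

console.log(classifySource('-'));
console.log(classifySource('https://en.wikipedia.org/wiki/Special:Random'));
console.log(classifySource('index.html'));
```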

Examples

Read HTML from a file and output the result to the console:

readable index.html

Fetch a random Wikipedia article, get its title and an excerpt:

readable https://en.wikipedia.org/wiki/Special:Random -p title,excerpt

Fetch a web page and read it in W3M:

readable https://www.nytimes.com/2020/01/18/technology/clearview-privacy-facial-recognition.html | w3m -T text/html

Download a web page using cURL, parse it and save it into a file:

curl https://github.com/mozilla/readability | readable --url=https://github.com/mozilla/readability > example.html

It's a good idea to supply the --url parameter when piping input; otherwise readable won't know the document's URL, and things like relative links won't work.
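
To see why the document's URL matters, here is a minimal sketch of relative link resolution using the standard `URL` class built into Node.js. It illustrates the general mechanism only and makes no claim about how readability-cli handles links internally; the relative path is an illustrative value, not something taken from the tool.

```javascript
// Base URL, as it would be passed via --url in the cURL example.
const base = 'https://github.com/mozilla/readability';

// A relative link of the kind found in a page's HTML (illustrative value).
const relative = '/mozilla/readability/issues';

// With a base URL, the relative link resolves to an absolute address.
const resolved = new URL(relative, base).href;
console.log(resolved);

// Without a base URL, the same link cannot be resolved and the constructor throws.
let failedWithoutBase = false;
try {
  new URL(relative);
} catch (err) {
  failedWithoutBase = true;
}
console.log(failedWithoutBase);
```

When the HTML arrives on standard input, there is no address to use as the base, which is the situation the --url parameter addresses.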

Why Node.js? It's so slow!

I know that it's slow, but JavaScript is the most sensible option for this, since Mozilla's Readability library is written in JavaScript. There have been ports of the Readability algorithm to other languages, but Mozilla's version is the only one that's actively maintained as of 2020.