The arm/v7 builds have caused lots of problems due to the lack of
support from the cryptography library, and now issues related to
installing the latest version of cffi. As a result, this build variant
has been removed for now. It may or may not come back later, since the
amount of work just to figure out which library is broken and how to fix
it doesn't feel worth it anymore.
The redirects portion of the error page is only needed in scenarios
where the instance is rate limited, in which case the user's query is
provided to the error template. If this isn't provided, it should just
display the error and allow the user to redirect to the home page.
Fixes#1122
* Bump cryptography to 42.0.4
* Bump pyopenssl to 24.0.0
* Squashed commit of the following:
commit 2395bb7a6a
Author: Ben Busby <contact@benbusby.com>
Date: Wed Mar 6 09:35:48 2024 -0700
Remove version from DDG bangs url
Including the version portion of the URL now redirects to search results
for the name of the bang file, rather than returning the bang file
itself. Removing the version from the URL returns the correct bang file.
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Ben Busby <contact@benbusby.com>
Including the version portion of the URL now redirects to search results
for the name of the bang file, rather than returning the bang file
itself. Removing the version from the URL returns the correct bang file.
The error template previously only included the option to continue a
user's search via Farside (whoogle or searxng), and would only appear
when an instance was ratelimited. This has been updated to display
anytime an exception has occurred, and now includes other options for
continuing a search, such as Kagi, DDG, Brave, Ecosia, etc.
Closes#1099
- WHOOGLE_SHOW_FAVICONS: Default on, can be set to 0 to hide favicons
and skip the request for fetching them
- WHOOGLE_UPDATE_CHECK: Default on, can be set to 0 to disable the
daily check for new versions released on github
Closes#1098Closes#1059
The default unix socket permissions of 600 is too restrictive for many use
cases.
Added a new argument --unix-socket-perms which is passed to waitress to allow
for user configurable socket permissions
There are certain links (such as the age verification link mentioned in
issue #1083) that should trigger removal of the entire container div on
the results page, rather than just hiding the link itself.
This introduces a new `unsupported_g_divs` list that holds links that
will trigger a removal of the result div on the result page.
Fixes#1083
Since POST requests are now redirected to GET requests (with an
encrypted query string), POST searches are no longer the correct
approach to use for testing purposes.
This should fix the annoyance with browsers like Firefox not caching
POST request responses. By redirecting a POST search to be a GET request
instead (with an encrypted query string), the page can be cached and
successfully navigated back to after visiting a result.
DDG provides favicons using the url format
icons.duckduckgo.com/ip2/{site}.ico
This can be used to fetch favicons in the event that the default
"/favicon.ico" path does not work.
Scroller results (like the "latest from ___" or "top stories" results)
shouldn't have a site icon associated with them. This extracts the class
that those types of results have and skips over the process of inserting
an icon.
Audio controls are now always shown by default (mostly found in searches
that contain word pronunciation guides).
Site icons were moved to the left side of the results.
This improves the search result icon feature by "hiding" the site's icon
if one was not found. This happens in scenarios where a site doesn't
have a /favicon.ico due to having a unique path or using javascript to
load the icon.
This appends an icon element to each search result, using the result
domain's "/favicon.ico" path.
Note that some sites do not have a standard /favicon.ico, but have a
unique path to a specifically sized favicon instead. Worse still, some
sites use javascript to load their favicon, which would make it even
more difficult for Whoogle to figure out.
For now this approach is fine, but can be expanded upon in the future
if desired.
Domains were previously not validated before being handled, leading to a
potential scenario where someone could pass something like
"element_url=127.0.0.1:<port>/<resource>" to access other resources on a
machine running Whoogle. This change ensures that the resource used in
both endpoints is a valid domain.
This also includes validation of config names to prevent names from
including path values such as "../../(etc)".
When starting whoogle from another directory, the path to the calculator
widget was previously invalid. It now specifies the path relative to the widget
loader file.
The calculator was previously triggered for partial matches with words
like "calc", which meant searches containing the word "calcium" would
cause the calculator widget to appear.
Redirects to alternative frontends can now be defined using the
WHOOGLE_REDIRECTS environment variable. Usage is documented in the
readme, but is basically defined as <parent>:<new>.
Closes#988
This relates to an issue with an unknown cause (unable to reproduce on
my end) where the preferences string does not contain the correct amount
of padding on a base64 encoded value. This is mediated by appending
padding to the end of the encoded value, since any extra padding is
removed anyways.
Fixes#987
Defines separate environment variables for setting mobile vs desktop user
agents
Defines an environment variable for using the client's User-Agent
Co-authored-by: Ben Busby <contact@benbusby.com>
Get Google search results, but without any ads, javascript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile.
Get Google search results, but without any ads, JavaScript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile.
- Downtime after periods of inactivity \([solution 1](https://repl.it/talk/ask/use-this-pingmat1replco-just-enter/28821/101298), [solution 2](https://repl.it/talk/learn/How-to-use-and-setup-UptimeRobot/9003)\)
- Downtime after periods of inactivity ([solution](https://repl.it/talk/learn/How-to-use-and-setup-UptimeRobot/9003)\)
___
### [Fly.io](https://fly.io)
You will need a **PAID**[Fly.io](https://fly.io) account to deploy Whoogle.
You will need a [Fly.io](https://fly.io) account to deploy Whoogle. The [free allowances](https://fly.io/docs/about/pricing/#free-allowances) are enough for personal use.
#### Install the CLI: https://fly.io/docs/hands-on/installing/
@ -398,6 +413,10 @@ There are a few optional environment variables available for customizing a Whoog
| WHOOGLE_PROXY_PASS | The password of the proxy server. |
| WHOOGLE_PROXY_TYPE | The type of the proxy server. Can be "socks5", "socks4", or "http". |
| WHOOGLE_PROXY_LOC | The location of the proxy server (host or ip). |
| WHOOGLE_USER_AGENT | The desktop user agent to use. Defaults to a randomly generated one. |
| WHOOGLE_USER_AGENT_MOBILE | The mobile user agent to use. Defaults to a randomly generated one. |
| WHOOGLE_USE_CLIENT_USER_AGENT | Enable to use your own user agent for all requests. Defaults to false. |
| WHOOGLE_REDIRECTS | Specify sites that should be redirected elsewhere. See [custom redirecting](#custom-redirecting). |
| EXPOSE_PORT | The port where Whoogle will be exposed. |
| HTTPS_ONLY | Enforce HTTPS. (See [here](https://github.com/benbusby/whoogle-search#https-enforcement)) |
| WHOOGLE_ALT_TW | The twitter.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
@ -406,7 +425,7 @@ There are a few optional environment variables available for customizing a Whoog
| WHOOGLE_ALT_TL | The Google Translate alternative to use. This is used for all "translate ____" searches. Set to "" to disable. |
| WHOOGLE_ALT_MD | The medium.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
| WHOOGLE_ALT_IMG | The imgur.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
| WHOOGLE_ALT_WIKI | The wikipedia.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
| WHOOGLE_ALT_WIKI | The wikipedia.org alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
| WHOOGLE_ALT_IMDB | The imdb.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
| WHOOGLE_ALT_QUORA | The quora.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
| WHOOGLE_AUTOCOMPLETE | Controls visibility of autocomplete/search suggestions. Default on -- use '0' to disable. |
@ -416,6 +435,8 @@ There are a few optional environment variables available for customizing a Whoog
| WHOOGLE_TOR_SERVICE | Enable/disable the Tor service on startup. Default on -- use '0' to disable. |
| WHOOGLE_TOR_USE_PASS | Use password authentication for tor control port. |
| WHOOGLE_TOR_CONF | The absolute path to the config file containing the password for the tor control port. Default: ./misc/tor/control.conf WHOOGLE_TOR_PASS must be 1 for this to work.|
| WHOOGLE_SHOW_FAVICONS | Show/hide favicons next to search result URLs. Default on. |
| WHOOGLE_UPDATE_CHECK | Enable/disable the automatic daily check for new versions of Whoogle. Default on. |
### Config Environment Variables
These environment variables allow setting default config values, but can be overwritten manually by using the home page config menu. These allow a shortcut for destroying/rebuilding an instance to the same config state every time.
@ -441,6 +462,7 @@ These environment variables allow setting default config values, but can be over
| WHOOGLE_CONFIG_STYLE | The custom CSS to use for styling (should be single line) |
| WHOOGLE_CONFIG_PREFERENCES_KEY | Key to encrypt preferences in URL (REQUIRED to show url) |
| WHOOGLE_CONFIG_ANON_VIEW | Include the "anonymous view" option for each search result |
## Usage
Same as most search engines, with the exception of filtering by time range.
@ -448,6 +470,7 @@ Same as most search engines, with the exception of filtering by time range.
To filter by a range of time, append ":past <time>" to the end of your search, where <time> can be `hour`, `day`, `month`, or `year`. Example: `coronavirus updates :past hour`
## Extra Steps
### Set Whoogle as your primary search engine
*Note: If you're using a reverse proxy to run Whoogle Search, make sure the "Root URL" config option on the home page is set to your URL before going through these steps.*
@ -492,6 +515,40 @@ Browser settings:
- Manual
- Under search engines > manage search engines > add, manually enter your Whoogle instance details with a `<whoogle url>/search?q=%s` formatted search URL.
### Custom Redirecting
You can set custom site redirects using the `WHOOGLE_REDIRECTS` environment
variable. A lot of sites, such as Twitter, Reddit, etc, have built-in redirects
to [Farside links](https://sr.ht/~benbusby/farside), but you may want to define
your own.
To do this, you can use the following syntax:
```
WHOOGLE_REDIRECTS="<parent_domain>:<new_domain>"
```
For example, if you want to redirect from "badsite.com" to "goodsite.com":
```
WHOOGLE_REDIRECTS="badsite.com:goodsite.com"
```
This can be used for multiple sites as well, with comma separation:
NOTE: Do not include "http(s)://" when defining your redirect.
### Custom Bangs
You can create your own custom bangs. By default, bangs are stored in
`app/static/bangs`. See [`00-whoogle.json`](https://github.com/benbusby/whoogle-search/blob/main/app/static/bangs/00-whoogle.json)
for an example. These are parsed in alphabetical order with later files
overriding bangs set in earlier files, with the exception that DDG bangs
(downloaded to `app/static/bangs/bangs.json`) are always parsed first. Thus,
any custom bangs will always override the DDG ones.
### Prevent Downtime (Heroku only)
Part of the deal with Heroku's free tier is that you're allocated 550 hours/month (meaning it can't stay active 24/7), and the app is temporarily shut down after 30 minutes of inactivity. Once it becomes inactive, any Whoogle searches will still work, but it'll take an extra 10-15 seconds for the app to come back online before displaying the result, which can be frustrating if you're in a hurry.
@ -568,7 +625,7 @@ Under the hood, Whoogle is a basic Flask app with the following structure:
- `opensearch.xml`: A template used for supporting [OpenSearch](https://developer.mozilla.org/en-US/docs/Web/OpenSearch).
- `imageresults.html`: An "experimental" template used for supporting the "Full Size" image feature on desktop.
- `static/<css|js>`
- CSS/Javascript files, should be self-explanatory
- CSS/JavaScript files, should be self-explanatory
- `static/settings`
- Key-value JSON files for establishing valid configuration values
@ -607,7 +664,7 @@ I'm a huge fan of Searx though and encourage anyone to use that instead if they
**Why does the image results page look different?**
A lot of the app currently piggybacks on Google's existing support for fetching results pages with Javascript disabled. To their credit, they've done an excellent job with styling pages, but it seems that the image results page - particularly on mobile - is a little rough. Moving forward, with enough interest, I'd like to transition to fetching the results and parsing them into a unique Whoogle-fied interface that I can style myself.
A lot of the app currently piggybacks on Google's existing support for fetching results pages with JavaScript disabled. To their credit, they've done an excellent job with styling pages, but it seems that the image results page - particularly on mobile - is a little rough. Moving forward, with enough interest, I'd like to transition to fetching the results and parsing them into a unique Whoogle-fied interface that I can style myself.
## Public Instances
@ -621,15 +678,22 @@ A lot of the app currently piggybacks on Google's existing support for fetching
| [https://s.tokhmi.xyz](https://s.tokhmi.xyz) | 🇺🇸 US | Multi-choice | ✅ |
| [https://search.sethforprivacy.com](https://search.sethforprivacy.com) | 🇩🇪 DE | English | |
| [https://whoogle.dcs0.hu](https://whoogle.dcs0.hu) | 🇭🇺 HU | Multi-choice | |
| [https://whoogle.esmailelbob.xyz](https://whoogle.esmailelbob.xyz) | 🇨🇦 CA | Multi-choice | |
| [https://gowogle.voring.me](https://gowogle.voring.me) | 🇺🇸 US | Multi-choice | |
| [https://whoogle.privacydev.net](https://whoogle.privacydev.net) | 🇳🇱 NL | English | |
| [https://whoogle.privacydev.net](https://whoogle.privacydev.net) | 🇫🇷 FR | English | |
| [https://wg.vern.cc](https://wg.vern.cc) | 🇺🇸 US | English | |
| [https://whoogle.hxvy0.gq](https://whoogle.hxvy0.gq) | 🇨🇦 CA | Turkish Only | ✅ |
* A checkmark in the "Cloudflare" category here refers to the use of the reverse proxy, [Cloudflare](https://cloudflare.com). The checkmark will not be listed for a site which uses Cloudflare DNS but rather the proxying service which grants Cloudflare the ability to monitor traffic to the website.
@ -640,7 +704,8 @@ A lot of the app currently piggybacks on Google's existing support for fetching
| [http://whoglqjdkgt2an4tdepberwqz3hk7tjo4kqgdnuj77rt7nshw2xqhqad.onion](http://whoglqjdkgt2an4tdepberwqz3hk7tjo4kqgdnuj77rt7nshw2xqhqad.onion) | 🇺🇸 US | Multi-choice
| [http://nuifgsnbb2mcyza74o7illtqmuaqbwu4flam3cdmsrnudwcmkqur37qd.onion](http://nuifgsnbb2mcyza74o7illtqmuaqbwu4flam3cdmsrnudwcmkqur37qd.onion) | 🇩🇪 DE | English
| [http://whoogle.vernccvbvyi5qhfzyqengccj7lkove6bjot2xhh5kajhwvidqafczrad.onion](http://whoogle.vernccvbvyi5qhfzyqengccj7lkove6bjot2xhh5kajhwvidqafczrad.onion/) | 🇺🇸 US | English |
| [http://whoogle.g4c3eya4clenolymqbpgwz3q3tawoxw56yhzk4vugqrl6dtu3ejvhjid.onion](http://whoogle.g4c3eya4clenolymqbpgwz3q3tawoxw56yhzk4vugqrl6dtu3ejvhjid.onion/) | 🇳🇱 NL | English |
| [http://whoogle.g4c3eya4clenolymqbpgwz3q3tawoxw56yhzk4vugqrl6dtu3ejvhjid.onion](http://whoogle.g4c3eya4clenolymqbpgwz3q3tawoxw56yhzk4vugqrl6dtu3ejvhjid.onion/) | 🇫🇷 FR | English |
| [http://whoogle.daturab6drmkhyeia4ch5gvfc2f3wgo6bhjrv3pz6n7kxmvoznlkq4yd.onion](http://whoogle.daturab6drmkhyeia4ch5gvfc2f3wgo6bhjrv3pz6n7kxmvoznlkq4yd.onion/) | 🇩🇪 DE | Multi-choice | |