Commit Graph

453 Commits

Author SHA1 Message Date
Ben Busby
cdbe550737
Add env vars for hiding favicons and removing daily update check
- WHOOGLE_SHOW_FAVICONS: Default on, can be set to 0 to hide favicons
  and skip the request for fetching them
- WHOOGLE_UPDATE_CHECK: Default on, can be set to 0 to disable the
  daily check for new versions released on github

Closes #1098
Closes #1059
2023-12-20 11:28:00 -07:00
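A minimal sketch of the "default on, set to 0 to disable" behavior described in this commit. The helper name `env_flag` is hypothetical and not taken from the Whoogle source; only the two variable names come from the commit message.

```python
import os


def env_flag(name: str, default: bool = True) -> bool:
    """Read an on/off environment variable.

    An unset variable falls back to `default`; otherwise any value
    other than "0" counts as on.
    """
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip() != "0"


# Hypothetical call sites: both features stay on unless explicitly set to 0.
show_favicons = env_flag("WHOOGLE_SHOW_FAVICONS")
update_check = env_flag("WHOOGLE_UPDATE_CHECK")
```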
Vivek
70dc750c7a
Add arg for configuring unix socket perms (#1103)
The default unix socket permissions of 600 are too restrictive for many use
cases.

Added a new argument --unix-socket-perms which is passed to waitress to allow
for user-configurable socket permissions
2023-12-20 11:05:38 -07:00
Ben Busby
b97f3dd4c0
Bump version to 0.8.4
2023-11-01 14:45:42 -06:00
Ben Busby
9f68c843d6
Specify links that should trigger div removal from results
There are certain links (such as the age verification link mentioned in
issue #1083) that should trigger removal of the entire container div on
the results page, rather than just hiding the link itself.

This introduces a new `unsupported_g_divs` list that holds links that
will trigger a removal of the result div on the result page.

Fixes #1083
2023-11-01 14:30:23 -06:00
Gautam Korlam
9cc1004fb8
fix: correctly handle skip_prefix logic for site_alts (#1092)
Fixes #1091
2023-11-01 14:07:45 -06:00
Ben Busby
2950aa869b
Redirect POST search -> enc GET request
This should fix the annoyance with browsers like Firefox not caching
POST request responses. By redirecting a POST search to be a GET request
instead (with an encrypted query string), the page can be cached and
successfully navigated back to after visiting a result.
2023-10-16 16:28:36 -06:00
Ben Busby
7bda165ca3
Fetch fallback site icons from DDG
DDG provides favicons using the url format
icons.duckduckgo.com/ip2/{site}.ico

This can be used to fetch favicons in the event that the default
"/favicon.ico" path does not work.
2023-10-11 17:26:12 -06:00
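The URL format mentioned in this commit could be built roughly as follows. This is a sketch only: the function name `fallback_icon_url` is hypothetical, and the `https://` scheme is an assumption, since the commit message gives the format without one.

```python
from typing import Optional
from urllib.parse import urlparse


def fallback_icon_url(result_url: str) -> Optional[str]:
    """Build a DDG favicon URL for the host of `result_url`.

    Returns None when no host can be extracted from the input.
    """
    host = urlparse(result_url).hostname
    if not host:
        return None
    # Format from the commit message: icons.duckduckgo.com/ip2/{site}.ico
    # (the https scheme is assumed here).
    return f"https://icons.duckduckgo.com/ip2/{host}.ico"
```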
Ben Busby
81b7fd1876
Encrypt site icon requests
Paths to favicons are now encrypted with the user's Fernet key, the same
as any other external result page element
2023-10-11 17:18:25 -06:00
Ben Busby
a7e937f7c6
Skip scrollers when applying site icons to results
Scroller results (like the "latest from ___" or "top stories" results)
shouldn't have a site icon associated with them. This extracts the class
that those types of results have and skips over the process of inserting
an icon.
2023-10-11 15:58:52 -06:00
Ben Busby
c2873190c9
Display audio controls, refactor site icon placement
Audio controls are now always shown by default (mostly found in searches
that contain word pronunciation guides).

Site icons were moved to the left side of the results.
2023-10-11 15:41:48 -06:00
Ben Busby
330ae964f3
Only sanitize result content on main result page
The other result tabs (images/maps/videos/news) don't have text content
that needs sanitizing.

Fixes #1080
2023-10-11 11:09:09 -06:00
Ben Busby
67b6110087
Display an empty img if a site icon can't be found
This improves the search result icon feature by "hiding" the site's icon
if one was not found. This happens in scenarios where a site doesn't
have a /favicon.ico due to having a unique path or using javascript to
load the icon.
2023-10-11 11:05:53 -06:00
Ben Busby
4292ec7f63
Add icons for each search result
This appends an icon element to each search result, using the result
domain's "/favicon.ico" path.

Note that some sites do not have a standard /favicon.ico, but have a
unique path to a specifically sized favicon instead. Worse still, some
sites use javascript to load their favicon, which would make it even
more difficult for Whoogle to figure out.

For now this approach is fine, but can be expanded upon in the future
if desired.
2023-10-11 11:05:53 -06:00
Ben Busby
c36396e9cb
Sanitize valid html in result text content
This inspects the text content of each individual result div and strips
out valid 'script' or 'iframe' tags from the result.

Closes #1076
2023-10-10 16:38:13 -06:00
Ben Busby
3a2e0b262e
Validate urls in element and window endpoints
Domains were previously not validated before being handled, leading to a
potential scenario where someone could pass something like
"element_url=127.0.0.1:<port>/<resource>" to access other resources on a
machine running Whoogle. This change ensures that the resource used in
both endpoints is a valid domain.

This also includes validation of config names to prevent names from
including path values such as "../../(etc)".
2023-09-13 15:50:04 -06:00
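The two checks described in this commit might look something like the sketch below. It is not the project's actual validation code: both function names and both regular expressions are assumptions made for illustration, and a check like this does not cover hostnames that resolve to private addresses.

```python
import ipaddress
import re
from urllib.parse import urlparse

# Assumed rule: config names are limited to letters, digits, "_" and "-".
_CONFIG_NAME = re.compile(r"[A-Za-z0-9_-]+")
# One DNS label: 1-63 chars, no leading or trailing hyphen.
_HOST_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_config_name(name: str) -> bool:
    """Reject names containing path pieces such as "../../etc"."""
    return _CONFIG_NAME.fullmatch(name) is not None


def is_plausible_domain(url: str) -> bool:
    """Accept only URLs whose host is a dotted domain name, not an IP."""
    try:
        # urlparse only fills in hostname when a scheme or "//" is present.
        parsed = urlparse(url if "//" in url else f"//{url}")
    except ValueError:
        return False
    host = parsed.hostname
    if not host or "." not in host:
        return False  # rejects empty hosts and bare names like "localhost"
    try:
        ipaddress.ip_address(host)
    except ValueError:
        pass  # not an IP literal, keep checking
    else:
        return False  # IP literals such as 127.0.0.1 are refused
    return all(_HOST_LABEL.fullmatch(label) for label in host.split("."))
```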
MoistCat
693ca3a9a8
Fix invalid calculator widget path (#1064)
When starting whoogle from another directory, the path to the calculator
widget was previously invalid. It now specifies the path relative to the widget
loader file.
2023-09-13 14:13:21 -06:00
Ben Busby
a35b1dabbc
Use filtered query for map tab
The map tab should only ever pass the raw query (i.e. no "-site:..."
strings), otherwise the maps page will return an error.

Fixes #1048
2023-09-08 16:44:04 -06:00
Ben Busby
a623210244
Match exact words to trigger calculator widget
The calculator was previously triggered for partial matches with words
like "calc", which meant searches containing the word "calcium" would
cause the calculator widget to appear.
2023-09-08 16:19:39 -06:00
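One common way to match whole words only is a regex with `\b` word boundaries, sketched below. The trigger words and the name `wants_calculator` are assumptions for illustration; the commit message only confirms that "calc" previously matched inside "calcium".

```python
import re

# Assumed trigger words; \b ensures they match only as whole words.
_CALC_WORDS = re.compile(r"\b(?:calc|calculator)\b", re.IGNORECASE)


def wants_calculator(query: str) -> bool:
    """Return True when the query contains a trigger word on its own."""
    return _CALC_WORDS.search(query) is not None
```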
Ben Busby
92e8ede24e
Bump version to 0.8.3
2023-08-21 15:06:17 -06:00
Ahmad Alkadri
4a0089686e
Fix: keep_blank_values = True to handle blank q input (#1052)
2023-08-21 14:53:10 -06:00
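For illustration, this is how `keep_blank_values` changes query-string parsing in Python's standard library. Whether the commit calls `urllib.parse.parse_qs` directly or passes the flag through another helper is an assumption here, and the sample query string is made up.

```python
from urllib.parse import parse_qs

query_string = "q=&tbm=nws"  # blank q, as submitted by an empty search box

# By default, parameters with blank values are dropped entirely.
without_flag = parse_qs(query_string)

# With keep_blank_values=True, the blank q is kept as an empty string.
with_flag = parse_qs(query_string, keep_blank_values=True)

print(without_flag)  # {'tbm': ['nws']}
print(with_flag)     # {'q': [''], 'tbm': ['nws']}
```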
Fabian Wunsch
a40b98341b
Change the consent cookies (#1054)
* Changed the consent cookies

* Shorter cookie thanks to @Imaskiller
2023-08-21 14:50:38 -06:00
Ben Busby
4962659acb
Serve basic robots.txt to avoid indexing
Closes #1015
2023-06-26 16:16:45 -06:00
Cx
8a2a6f3265
Update translations.json [skip ci] (#1025)
2023-06-26 15:49:19 -06:00
Andiru
29992985bc
Fix incorrect link replacements (#1016)
Fix link/result description getting replaced when alternative is disabled
(set to empty string)

Replace medium.com links with value from constant
2023-06-26 15:47:43 -06:00
Ben Busby
f65529f328
Allow defining custom redirects with WHOOGLE_REDIRECTS
Redirects to alternative frontends can now be defined using the
WHOOGLE_REDIRECTS environment variable. Usage is documented in the
readme, but is basically defined as <parent>:<new>.

Closes #988
2023-05-19 12:15:15 -06:00
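A rough sketch of parsing the `<parent>:<new>` format described in this commit. The comma separator between multiple entries and the function name `parse_redirects` are assumptions; the readme remains the authoritative description of the format.

```python
from typing import Dict


def parse_redirects(raw: str) -> Dict[str, str]:
    """Parse "<parent>:<new>" pairs into a {parent: new} mapping.

    Entries are assumed to be comma-separated. Each entry is split on
    its first colon only, so the replacement may itself contain a colon.
    """
    redirects: Dict[str, str] = {}
    for entry in raw.split(","):
        parent, sep, new = entry.strip().partition(":")
        if sep and parent.strip() and new.strip():
            redirects[parent.strip()] = new.strip()
    return redirects
```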
Ben Busby
f213a2a64a
Ensure b64 prefs always have min padding
This relates to an issue with an unknown cause (unable to reproduce on
my end) where the preferences string does not contain the correct amount
of padding on a base64 encoded value. This is mitigated by appending
padding to the end of the encoded value, since any extra padding is
removed anyway.

Fixes #987
2023-05-19 11:39:51 -06:00
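A small sketch of the padding idea. Note that this is a variant: instead of appending extra padding and relying on the decoder to ignore the surplus, it computes the exact number of `=` characters needed. The function name `decode_prefs` and the use of URL-safe base64 are assumptions.

```python
import base64


def decode_prefs(encoded: str) -> bytes:
    """Decode a base64 string whose trailing "=" padding may be missing."""
    # Base64 text must be a multiple of 4 characters long; add what is missing.
    padded = encoded + "=" * (-len(encoded) % 4)
    return base64.urlsafe_b64decode(padded)
```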
Roman Štefko
2cb4b9e3ca
Allow setting mobile/desktop UAs using env vars (#1003)
Defines separate environment variables for setting mobile vs desktop user
agents

Defines an environment variable for using the client's User-Agent

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-05-19 11:32:05 -06:00
Abhishek M J
349b87ec18
Fix unsupported_g_pages in result list (#996)
Closes #995
2023-05-01 10:23:57 -06:00
João
bf6c2505b1
Update handling of custom css (#965)
"Default" css is no longer required to be part of the request
when updating styles.

Closes #934
2023-04-13 14:19:36 -06:00
Ben Busby
9d9022ed99
Bump version to 0.8.2
2023-04-13 13:03:30 -06:00
Ben Busby
b39ba0533a
Suppress spurious warnings from bs4
More MarkupResemblesLocatorWarning warnings have been appearing. This
seems to be caused by parsing HTML content that contains a URL.

This new change suppresses the warning at the root level of the app
before any content has been parsed, so this error shouldn't appear
again.

Fixes #968
2023-03-22 12:29:05 -06:00
xatier
f970b62f12
Update zh-tw translation (#973)
* Add translation for new strings from 7041b43db9
  Use same terms as Google's zh-tw interface.
* Fix missing period
* Sync string order with en (easier for future updates)
2023-03-20 13:08:18 -06:00
xatier
b1e468ff01
Fix bug in title/url blocking regex (#969)
Fix the exception `AttributeError: 'Filter' object has no attribute 'block_url'`
introduced in this commit [1].

`self.block_title` and `self.block_url` were members of the Filter
object[2], but not anymore after commit [1].

This bug can be reproduced with setting WHOOGLE_CONFIG_BLOCK_URL to a
non-empty string.

[1] 10a15e06e1
[2] 284a8102c8
2023-03-14 11:22:53 -06:00
Ben Busby
8c426ab180
Suppress invalid warning from bs4, add 404 handler
An invalid parsing warning was being thrown by the latest version of the
bs4 library. This suppresses that warning from being shown in the
console.

A 404 handler was added to move logging from the console to the error
template, since a lot of users assumed that 404 errors from the result
page were problems with Whoogle itself.

Fixes #967
2023-03-07 11:28:55 -07:00
Ben Busby
f7c4381ba6
Remove preferences arg from opensearch template
When a browser adds a search engine using the opensearch template, it
does not have the correct context necessary to autofill the
`preferences` arg with the user's session prefs. As a result, queries
made using the browser bar will have the instance's default preferences
filled into the template.

Removing this shouldn't have any side effects, since queries made on the
same machine will have the correct session associated with the user.

Fixes #929
2023-03-06 15:33:28 -07:00
João
baa8bd0eb4
Add auth to cookie (#964)
When authenticated, the cookie set will allow the user to stay connected even
if the browser is restarted.

Fixes #951
2023-03-01 09:58:59 -07:00
Ben Busby
1759c119a8
Replace Python 3.10 match with if/else
Some distributions require manually installing Python 3.10, which makes
it less convenient than just using whatever version of Python 3.x the
package manager supports. Since the only 3.10 feature being used was
"match", and it was a very small change, it's been replaced with an
if/else statement to ensure compatibility with older versions of Python
3.
2023-03-01 09:42:30 -07:00
Ben Busby
fb8a2ea325
Include prefs arg in footer navigation
Navigating between pages of results now includes the user's preferences
string, which allows them to retain their config for a particular
instance between result pages.

Fixes #960
2023-02-21 09:57:44 -07:00
Ben Busby
6b56dab4c1
Remove ig->bibliogram redirects
Bibliogram has been discontinued, and the remaining instances aren't
very reliable. As a result, all instagram redirects have been removed.

Fixes #955
2023-02-21 09:42:42 -07:00
elliot
7ca69e752d
Add calculator widget (#956)
This adds a simple calculator widget, somewhat similar to the one presented
when searching calculator on Google.

Also, it adds somewhat of a template for making the addition of new widgets
easier via the app/utils/widgets.py file. My eventual plan is to use this to
create more widgets that appear in Google, such as a color picker, timer, etc.

---------

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-02-21 09:36:38 -07:00
Ben Busby
991fe6d910
Exclude subdomain in Medium->Scribe redirects
Medium redirects needed further cleanup to account for instances where a
link contains a subdomain that would not make sense in a Farside
redirect link.

Fixes #947
2023-02-04 16:36:16 -07:00
test2user-aqil
09db4ff730
Add Azerbaijani translation (#944)
* Add Azerbaijani translation

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-01-30 12:46:10 -07:00
Ahmad Alkadri
16794df68d
Add Indonesian translation (#940)
2023-01-30 12:33:17 -07:00
George T
94b208dd3f
Update greek translations (#943)
Add Hellenic Language

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-01-30 12:32:01 -07:00
Ben Busby
12ce174b9a
Include url prefix for reverse proxied instances
The url prefix was not included when reconstructing the root url using
X-Forwarded-* headers, causing some elements to fail to load properly.

Fixes #937
2023-01-30 12:13:46 -07:00
Ahmad Alkadri
e5a5aad997
Always bold CN/JA/KO search terms (#928)
Add a function to check if target_word contains CJK characters

If a search term contains Chinese, Japanese, or Korean characters,
the term is bolded in search results regardless of whitespace.

CJK characters: Chinese, Japanese (hiragana, katakana, kanji), 
and Korean (hangul syllables, hangul jamo)

Co-authored-by: Ben Busby <contact@benbusby.com>
2023-01-09 12:54:41 -07:00
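A check like the one this commit describes can be sketched with Unicode code point ranges. The function name `has_cjk` and the exact set of blocks are assumptions based on the scripts listed in the commit message; the real implementation may cover more or fewer ranges.

```python
# Unicode blocks for the scripts named in the commit message.
_CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs (Chinese characters, kanji)
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0xAC00, 0xD7AF),  # Hangul Syllables
    (0x1100, 0x11FF),  # Hangul Jamo
)


def has_cjk(text: str) -> bool:
    """Return True if any character falls inside one of the ranges above."""
    return any(
        low <= ord(char) <= high
        for char in text
        for low, high in _CJK_RANGES
    )
```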
Ben Busby
fdc63b862e
Autoload whoogle.env if it exists
The whoogle.env file previously needed to be created and enabled using
the WHOOGLE_DOTENV var. This removes the second step and loads the env
file if it's found during app init.

The Dockerfile has also been updated to copy in whoogle.env if it
exists.

Fixes #909
2023-01-04 10:35:42 -07:00
Ben Busby
aa54491ae0
Log rate-limiting errors
Rate limiting is now reported to the console as an error message.

Fixes #914
2023-01-04 10:21:16 -07:00
Charles Zawacki
cec10e81d3
Don't prepend to services that have schemes with '//' (#925)
2023-01-04 10:10:32 -07:00
Charles Zawacki
a760476d1b
Omit 'mobile.' and 'm.' in site alt replacements (#922)
Resolves #921
2023-01-03 10:19:39 -07:00
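Dropping the mobile subdomains before a site alternative is applied could be sketched as below. The function name `strip_mobile_prefix` is hypothetical, and the sketch operates on a bare hostname rather than on a full result link.

```python
def strip_mobile_prefix(host: str) -> str:
    """Remove a leading "mobile." or "m." label from a hostname."""
    for prefix in ("mobile.", "m."):
        if host.startswith(prefix):
            return host[len(prefix):]
    return host
```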