The header template was using Google's classes for the "Whoogle" logo,
which meant keeping up with their list of colors used in the logo. The
template was updated to only ever use the Whoogle logo color.
Accordingly, the logo-specific styling in filter.py was removed, since
it is no longer needed.
Also removes all links to the shopping tab, as it seems that the
majority of the links to items are Google specific links (usually
google.com/aclk links without any discernible param for determining the
true location for the link). The shopping page should be addressed
separately with unique filtering/formatting. That work is tracked in
#136.
Initialization of the app now includes generation of a ddg-bang json
file, which is used for all bang style searches afterwards.
Also added search suggestion handling for bang json lookup. Queries
beginning with "!" now reference the bang json file to pull all keys
that match.
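A minimal sketch of the prefix lookup described above. The layout of the
generated bang json file is not shown in this log, so the bang -> url
mapping, the sample entries, and the function name are all assumptions
made for illustration.

```python
import json

# Hypothetical stand-in for the generated ddg-bang json file. The real
# file's structure is assumed here, and the URLs are placeholders.
SAMPLE_BANGS = json.loads("""
{
  "!w": {"url": "https://example.org/wiki?search={}"},
  "!wa": {"url": "https://example.org/alpha?i={}"},
  "!yt": {"url": "https://example.org/video?q={}"}
}
""")


def bang_suggestions(query, bangs):
    """Return every bang key that starts with the typed query."""
    # Only queries beginning with "!" are treated as bang lookups.
    if not query.startswith("!"):
        return []
    prefix = query.lower()
    return sorted(key for key in bangs if key.startswith(prefix))


print(bang_suggestions("!w", SAMPLE_BANGS))  # ['!w', '!wa']
```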
Updated test suite to include basic tests for bang functionality.
Updated gitignore to exclude bang subdir.
The javascript controller has been updated to include a call to focus
the cursor on the search field. The field failing to focus had
previously only been seen on Firefox, and was assumed to be a weird
FF-specific bug. Adding a timeout to allow elements to finish loading
allows the field to be focused as expected.
Also updated the README to include clarification for IP address
tracking.
Improves clarity of the meaning behind the "Country" filter -- Google
seemingly uses this value to only return results that are hosted in a
particular country, as evidenced in the search differences highlighted
in #123. It now mentions that the results are filtered by website
hosting location.
Also, now that invidio.us is shut down, the fallback URL (invidiou.site)
is now used instead.
* Implemented new dark theme
Now uses a dedicated css file for all dark theme color changes, rather
than replacing color codes directly.
Color theme is from discussion in #60.
* Minor link color update
Reconfigured template to only use method parameter if set to search via
POST request (which is the default).
Apparently Chrome/Chromium based browsers don't like non-GET request
searches, and specifying a method caused Chrome to reject the template
altogether.
Arrow key navigation through search suggestions now populates the input
field with text content from the active selection. Navigating "down"
past the end of the suggestions list returns the active cursor to position 0,
while navigating "up" before the list of suggestions restores the
original search query and removes the active highlight from element 0.
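The navigation rules above can be sketched as a small state machine. The
real logic lives in the javascript controller; this Python version only
models the index handling, and the class and attribute names are
hypothetical.

```python
class SuggestionCursor:
    """Models arrow key movement through a list of search suggestions."""

    def __init__(self, query, suggestions):
        self.original_query = query
        self.suggestions = list(suggestions)
        self.index = -1          # -1 means no suggestion is highlighted
        self.input_text = query  # what the search field currently shows

    def down(self):
        if not self.suggestions:
            return
        self.index += 1
        # Moving past the last suggestion returns the cursor to position 0.
        if self.index >= len(self.suggestions):
            self.index = 0
        self.input_text = self.suggestions[self.index]

    def up(self):
        if not self.suggestions:
            return
        if self.index <= 0:
            # Moving above element 0 clears the highlight and restores
            # the query the user originally typed.
            self.index = -1
            self.input_text = self.original_query
        else:
            self.index -= 1
            self.input_text = self.suggestions[self.index]


cursor = SuggestionCursor("py", ["python", "pytest"])
cursor.down()
print(cursor.input_text)  # python
```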
Full implementation of social media alt redirects (twitter/youtube/instagram -> nitter/invidious/bibliogram) depending on configuration.
Verbatim search and option to ignore search autocorrect are now supported as well.
Also cleaned up the javascript side of whoogle config so that it now
uses arrays of available fields for parsing config values instead of manually assigning each
one to a variable.
This doesn't include support for Google Maps -> Open Street Maps. That
seems a bit more involved than the social media redirects were, so it
should likely be a separate effort.
Added support for choosing the search language and the interface language separately (allowing a default given by Google).
Co-authored-by: Joao <ramos.joao@protonmail.com>
This is a proof of concept! The code works, but uses hardcoded operators
and may be placed in the wrong file/class.
The best-case scenario would be the possibility to use the 13,000+ ddg
operators, but I don't know if that's possible without having to
redirect to duckduckgo first.
* Project refactor (#85)
* Major refactor of requests and session management
- Switches from pycurl to requests library
- Allows for less janky decoding, especially with non-latin character
sets
- Adds session level management of user configs
- Allows for each session to set its own config -- users with blocked cookies fall back to the "default" profile (same usage as before)
- Updates key gen/regen to more aggressively swap out keys after each
request
* Added ability to save/load configs by name
- New PUT method for config allows changing config with specified name
- New methods in js controller to handle loading/saving of configs
* Result formatting and removal of unused elements
- Fixed question section formatting from results page (added appropriate
padding and made questions styled as italic)
- Removed user agent display from main config settings
* Minor change to save config button label (now "Save As...")
* Fixed issue with "de-pickling" of flask session
Having a gitignore-everything ("*") file within a flask session folder seems to cause a
weird bug where the state of the app becomes unusable from continuously
trying to prune files listed in the gitignore (and it can't prune '*').
* Switched to pickling saved configs
* Updated ad/sponsored content filter and conf naming
Configs are now named with a .conf extension to allow for easier manual
cleanup/modification of named config files
Sponsored content now removed by basic string matching of span content
* Version bump to 0.2.0
* Fixed request.send return style
* Moved custom conf files to their own directory
* Refactored whoogle session mgmt
Now allows a fallback "default" session to be used if a user's browser
is blocking cookies
* Reworked pytest client fixture to support new session mgmt
* Added better multilingual support, updated filter
Results page now includes method for switching to "All Languages" from
whichever language is specified as the primary in the config (see #74).
Also removes the non-Whoogle links from the page footer, leaving only
the page navigation controls
Added support for the date range filter on the results page, though I'd
still recommend using the ":past <unit>" query instead.
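For illustration, a rough sketch of how a ":past <unit>" suffix could be
split from a query. The translation to Google's "tbs=qdr:" time range
value is an assumption, since this log does not say how the query is
handled internally, and the function name is hypothetical.

```python
def split_past_filter(query):
    """Split 'cats :past week' into ('cats', 'qdr:w').

    Returns (query, None) when no usable ':past <unit>' suffix is present.
    """
    if ":past" not in query:
        return query, None
    text, _, unit = query.partition(":past")
    unit = unit.strip().lower()
    # Assumed mapping: hour/day/week/month/year -> h/d/w/m/y
    if not unit or unit[0] not in "hdwmy":
        return text.strip(), None
    return text.strip(), "qdr:" + unit[0]


print(split_past_filter("cats :past week"))  # ('cats', 'qdr:w')
```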
* Removed no-cache enforcement, minor styling/formatting improvements
* Improving ad filtering for non-English languages
* Added footer to results page
Added enter key submit on results page
Added results type carryover for subsequent searches on results page
Removed redundant header on image search results
Basic autocomplete/search suggestion functionality added
* Adds new GET and POST routes for '/autocomplete' that accept a string query and return an array of suggestions
* Adds new autoscript.js file for handling queries on the main page and results view
* Updated requests class to include autocomplete method
* Updated opensearch template to handle search suggestions
* Added header template to allow for autocomplete on results view
* Updated readme to mention autocomplete feature
* Added country and safe search config options
* Updated handling of parser error in results test
* Improved handling of default country
* Added 1px empty gif fallback as a replacement for images that fail to load
* Putting '! ' at the beginning of the query now redirects to the first search result
Signed-off-by: Paul Rothrock <paul@movetoiceland.com>
* Moved get_first_url outside of filter class
Signed-off-by: Paul Rothrock <paul@movetoiceland.com>
See https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections
301 redirections do not keep the request method intact, so a POST request can occasionally be changed to GET
308 redirections always keep the request method, which is necessary for all POST search requests
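A small sketch of choosing 308 for a redirect, assuming the redirect in
question is an http -> https upgrade (this log does not confirm which
redirect was changed). The function name is hypothetical, and the app
itself would issue the redirect through Flask rather than return a tuple.

```python
from http import HTTPStatus
from urllib.parse import urlsplit, urlunsplit


def https_redirect(url):
    """Return (location, status) for an http URL, or None if already https."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return None
    location = urlunsplit(("https",) + tuple(parts[1:]))
    # 308 tells the client to repeat the request with the same method and
    # body, so a POST search stays a POST after the redirect.
    return location, int(HTTPStatus.PERMANENT_REDIRECT)


print(https_redirect("http://example.com/search"))
# ('https://example.com/search', 308)
```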
* Adding HTTPS enforcement
Command line runs of Whoogle Search through pip/pipx/etc will need the
`--https-only` flag appended to the run command.
Docker runs require the `use_https` build arg applied.
* Update README.md
Moved https-only note to top of docker run command, updated pip runner help output
* Dockerfile: removed HTTPS enforcement, updated PORT setting
Dockerfile no longer enforces an HTTPS connection, but still allows for
setting via a build arg. The Flask server port is now configurable as a
build arg as well, by setting a port number to "whoogle_port"
* Fixed incorrect port assignment
This addresses #18, which brought up the issue of searching with Whoogle
with the search instance set to always use a specific container in
Firefox Container Tabs.
Could also be useful if you want to share your search results or
something, I guess. Though nobody likes when people do that.
* Added language configuration support
Main page now has a dropdown for selecting preferred language of
results.
Refactored config to be its own model with language constants.
* Added more language support
Interface language is now updated using the "hl" arg
Fixed Chinese traditional and simplified values
Updated decoding of characters to gb2312
* Updated to use conditional decoding dependent on language
* Updated filter to not rely on valid config to work properly
* Ignore venv when building docker file
* Remove reference to 8888 port
It wasn't really used anywhere, and setting it to 5000 everywhere removes ambiguity, and makes things easier to track and reason about
* Use waitress rather than Flask's built in web server
Flask's built-in server is not production grade
* Actually add waitress to requirements
Woops!
Pushing straight to master since this is an extremely simple fix, with
a pretty large performance benefit.
The Phyme library used for generating a User Agent rhyme was consuming
an absolute unit of memory. Now that it's removed, it's using about 10x
less memory, at the cost of User Agents being not as funny anymore.
Config options now allow setting a "root url", which defaults to the
request url root. Saving a new url in this field will allow for proper
redirects and usage of the opensearch element.
Also provides a possible solution for #17, where the default flask redirect method redirects to
http instead of https.
Now implemented as a flask global variable that reads from the same json
file as before, but doesn't crash if it does not find an existing file.
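A minimal sketch of the "doesn't crash" behavior, assuming the value in
question is the user config and that a missing file should yield an empty
config. The function name and file name are hypothetical.

```python
import json
import os
import tempfile


def load_config(path):
    """Read the json config at path, or return {} if no file exists yet."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # First run: nothing has been saved, so fall back to an empty config.
        return {}


with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, "config.json")
    print(load_config(config_path))  # {}
```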
Removed user config creation from run script
Added <meta name="referrer" content="no-referrer"> to all whoogle
templates
Refactored search route to conditionally use either request.args or
request.form, depending on rest call (get vs post respectively)
Switched encoding from utf-8 to unicode-escape in an effort to support multiple
languages besides English.
Updated image results page formatting to fix bad image links (added TODO
for adding full res image link for each image result).
Updated README to include libcurl and libssl install instructions for
manual setup.
The implementation of POST search support comes with a few benefits. The
most apparent is the avoidance of search queries appearing in web server
logs -- instead of the prior GET approach (i.e.
/search?q=my+search+query), using POST requests with the query stored in
the request body creates logs that simply appear as "/search".
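To make the logging difference concrete, the sketch below builds both
request styles with the standard library and prints the request line a
web server would typically log. Nothing is sent over the network; the
localhost address is a placeholder for a running instance.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:5000/search"  # placeholder instance address
params = {"q": "my search query"}

# GET: the query travels in the URL itself.
get_req = Request(BASE + "?" + urlencode(params))
# POST: the query travels in the request body instead.
post_req = Request(BASE, data=urlencode(params).encode("ascii"))


def request_line(req):
    """Method and path as they would appear in a typical access log."""
    return f"{req.get_method()} {req.selector}"


print(request_line(get_req))   # GET /search?q=my+search+query
print(request_line(post_req))  # POST /search
```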
Since a lot of relative links are generated in the results page, I came
up with a way to generate a unique key at run time that is used to
encrypt any query strings before sending to the user. This benefits both
regular text queries as well as fetching of image links and means that
web logs will only show an encrypted string where a link or query
string might slip through.
Unfortunately, GET search requests still need to be supported, as it
doesn't seem that Firefox (on iOS) supports loading search engines by
their opensearch.xml file, but instead relies on manual entry of a
search query string. Once this is updated, I'll probably remove GET
request search support.
Images were previously directly fetched from google search results,
which was a potential privacy hazard. All image sources are now modified
to be passed through shoogle's routing first, which will then fetch raw
image data and pass it through to the user.
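A rough sketch of the source rewriting idea: each image src is replaced
with a link back to the app, which carries the original location as a
parameter. The "/image" path, the "url" parameter name, and both function
names are hypothetical; the actual route is not named in this log.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

PROXY_PATH = "/image"  # hypothetical route on the local instance


def proxied_src(original_src):
    """Rewrite an image src so the browser asks the app, not the origin."""
    return PROXY_PATH + "?" + urlencode({"url": original_src})


def original_from_proxy(proxied):
    """Recover the original src inside the route, before fetching it."""
    return parse_qs(urlsplit(proxied).query)["url"][0]


src = "https://example.com/thumb.jpg?id=1&size=2"
rewritten = proxied_src(src)
print(rewritten)
print(original_from_proxy(rewritten) == src)  # True
```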
Filter class was refactored to split the primary clean method into
smaller, more manageable submethods.
The image results page seems to have different formatting from non-image
results pages. Should probably revisit this at some point and try to
style the image results page to be more in line with other result types.
- Updated Dockerfile to include chmod of run script
- Added app.json for Heroku quick deploy
- Removed unused function var in js controller
- Moved requirements back to root of repo
- Added Codebeat report to readme
Curl requests and user agent related functionality was moved to its own
request class.
Routes was refactored to only include strictly routing related
functionality.
Filter class was cleaned up (had routing/request related logic in here,
which didn't make sense)