whoogle-search

mirror of https://github.com/benbusby/whoogle-search synced 2024-11-12 07:10:29 +00:00

Author	SHA1	Message	Date
Ben Busby	d894bd347d	Handle error when parsing image result url	2021-06-16 10:40:18 -04:00
Ben Busby	614dceeb70	Add fallback interface/search lang + cleanup Since the interface language defaults to IP geolocation by google, the default language is now set to english. Still not sure if this is the best solution, but at least temporarily should clear up some confusion for users with instances deployed in countries outside of their own. Also performed some minor cleanup: - Updated name of strip_blocked_sites to clean_query - Added clean_query to list of jinja template functions - Ensured site block list doesn't contain duplicate filters	2021-06-04 11:09:30 -04:00
Ben Busby	cbe32a081e	Hotfix: extract only 'q' element from query string Occasionally the search results will contain links with arguments such as 'dq', which was being erroneously used in attempts to extract the 'q' element from query strings. This enforces that only links with '?q=' or '&q=' (elements with a standalone 'q' arg) will have the element extracted. I also refactored the naming of this element once extracted to be just 'q'. Although this seems counterintuitive, it makes a little more sense since this element is the one we're extracting. It's a vague url arg name, but it is what it is. Bump version to 0.5.2 for hotfix release	2021-05-29 12:22:37 -04:00
Ben Busby	43faaee77f	Hotfix: remove site filter for maps links The new site filter breaks links to Maps results, so filter.py needed to be updated to handle these links as a unique case. A new method was introduced to easily remove any "-site:..." filters from the query, which is now also used to format queries in the header template rather than manually removing the blocked site list within the template itself. Bumps version to 0.5.1 for releasing the bugfix Fixes #329	2021-05-27 12:01:57 -04:00
Joao A. Candido Ramos	448efb8f2a	Add "view image" functionality (#268 ) * add view image option * prevent whoogle links from opening in a new tab. * remove view image template on mobile requests * change loop values to be more robust to the number of images * Update app/templates/imageresults.html * fix "Basically the .cvifge class needs width: 100%; in order to expand the search input to fit the form width." * Update app/templates/imageresults.html * remove hardcoded string from template * Add view image config var to app.json * Add view image config var to whoogle.env Co-authored-by: jacr13 <ramos.joao@protonmail.com> Co-authored-by: Ben Busby <benbusby@protonmail.com>	2021-05-21 11:19:45 -04:00
Ben Busby	1030118d0b	Expand custom css theming support Also adds new default dark theme designed by @gripped.	2021-04-09 11:00:02 -04:00
Ben Busby	0b9600b564	Expand custom css variables and functionality Squashed commit of the following: commit `37e22d2945` Author: Ben Busby <benbusby@protonmail.com> Date: Mon Apr 5 10:27:05 2021 -0400 Pass user config to logo template commit `2406fee05c` Author: Ben Busby <benbusby@protonmail.com> Date: Mon Apr 5 10:24:54 2021 -0400 Fix incorrect contrast text in dark theme commit `91dd677e22` Author: Ben Busby <benbusby@protonmail.com> Date: Fri Apr 2 17:21:38 2021 -0400 Remove inline onclicks, fix svg sizing commit `91bbf9c0fa` Merge: `72637df` `b1227bd` Author: Ben Busby <benbusby@protonmail.com> Date: Fri Apr 2 15:35:37 2021 -0400 Merge remote-tracking branch 'origin/develop' into custom-css-tweaks commit `72637df213` Author: Ben Busby <benbusby@protonmail.com> Date: Fri Apr 2 11:38:38 2021 -0400 Use svg logo w/ custom styling on results pages commit `666a7ceac4` Author: Ben Busby <benbusby@protonmail.com> Date: Fri Apr 2 11:10:37 2021 -0400 Split whoogle-accent into whoogle-element-bg and whoogle-logo See discussion on #247	2021-04-05 11:00:56 -04:00
Ben Busby	df0b7afa50	Switch to single Fernet key per session This moves away from the previous (messy) approach of using two separate keys for decrypting text and element URLs separately and regenerating them for new searches. The current implementation of sessions is not very reliable, which led to keys being regenerated too soon, which would break page navigation. Until that can be addressed, the single key per session approach should work a lot better. Fixes #250 Fixes #90	2021-04-05 11:00:56 -04:00
Ben Busby	8ad8e66d37	Improve static typing throughout repo Eventually this should be part of a separate mypy ci build, but right now it's just a general guideline. Future commits and PRs should be validated for static typing wherever possible. For reference, the testing commands used for this commit were: mypy --ignore-missing-imports --pretty --disallow-untyped-calls app/ mypy --ignore-missing-imports --pretty --disallow-untyped-calls test/	2021-04-05 11:00:56 -04:00
Ben Busby	f8dfc78539	Improve naming of _utils files, update fn/class doc The app/utils/_utils weren't named very well, and all have been updated to have more accurate names. Function and class documentation for the utils has been updated as well, as part of the effort to improve overall documentation for the project.	2021-04-05 11:00:56 -04:00
Ben Busby	64567a63ea	Ensure G logo doesn't appear in mobile img results Adds a separate check to remove all images sourced from www.gstatic.com, which is where the mobile logo in particular is coming from.	2021-04-05 11:00:56 -04:00
Ben Busby	440c4e9c50	Remove lxml dependency The lxml dependency in the project was fairly unnecessary, and made the initial build time for the project considerably slower. This replaces all instances of lxml with either the default html parser (for bs4 constructors) or the built in xml.etree package (for search suggestion parsing).	2020-12-29 18:43:42 -05:00
Ben Busby	6e7ec9918a	Move language/country settings to app config Moves the language and country dicts from the config model to json files that are loaded during app init and stored in the app config dict. This substantially improves the readability of the config model and allows for much more sensible loading of the language/country options.	2020-12-17 16:42:05 -05:00
Ben Busby	375f4ee9fd	PEP-8: Fix formatting issues, add CI workflow (#161) Enforces PEP-8 formatting for all python code Adds a github action build for checking pep8 formatting using pycodestyle	2020-12-17 16:06:47 -05:00
Ben Busby	b695179c79	Add ability to collapse "people also ask" This adds a step in the filter process to wrap the "people also ask" section in a <details> element, which automatically collapses the contents of the section. Clicking/tapping the details element expands the view as normal. See #113	2020-12-15 11:09:48 -05:00
Ben Busby	e6db3112f7	Fix pagination bug for pages > 3 The pagination footer on the results page after page 2 has three actions (beginning, next, previous). The footer filter was updated to remove items with more than three actions to fix this. See #131	2020-12-07 20:38:57 -05:00
Ben Busby	72cbc342af	Add ability to set temp config in search query Dark mode, country, interface language, and search language configs can now be set in the search query by appending each option as a url parameter. Supported args are: 'dark', 'lang_search', 'lang_interface', and 'ctry' Ex: /search?q=%s&dark=1&lang_search=lang_en... These config settings persist across page navigation and switching result type, but will be reset if the main search bar is used. See #144	2020-11-11 00:40:49 -05:00
bugbounce	1148a7fb8d	Use relative links instead of absolute (#139) * Use relative links instead of absolute This allows for hosting under a subpath. For example if you want to host whoogle at example.com/whoogle, it should work better with a reverse proxy. * Use relative link for opensearch.xml	2020-10-29 11:09:31 -04:00
Ben Busby	f3bb1e22b4	Fix improper header styling, remove shopping tab links The header template was using Google's classes for the "Whoogle" logo, which meant keeping up with their list of colors used in the logo. The template was updated to only ever use the Whoogle logo color. Accordingly, the logo specific styling in filter.py was removed, since it is no longer needed. Also removes all links to the shopping tab, as it seems that the majority of the links to items are Google specific links (usually google.com/aclk links without any discernible param for determining the true location for the link). The shopping page should be addressed separately with unique filtering/formatting. Further tracking of this task will be followed in #136.	2020-10-25 13:52:30 -04:00
Ben Busby	9afe5f81bd	Updated dark theme (#121) * Implemented new dark theme Now uses a dedicated css file for all dark theme color changes, rather than replacing color codes directly. Color theme is from discussion in #60. * Minor link color update	2020-09-14 15:29:58 -04:00
Ben Busby	975ece8cd0	Privacy respecting alternatives in results view (#106) Full implementation of social media alt redirects (twitter/youtube/instagram -> nitter/invidious/bibliogram) depending on configuration. Verbatim search and option to ignore search autocorrect are now supported as well. Also cleaned up the javascript side of whoogle config so that it now uses arrays of available fields for parsing config values instead of manually assigning each one to a variable. This doesn't include support for Google Maps -> Open Street Maps, that seems a bit more involved than the social media redirects were, so it should likely be a separate effort.	2020-07-26 11:53:59 -06:00
Ben Busby	f7380ae15d	Improving ad filtering for non-English languages	2020-06-11 13:21:40 -06:00
Ben Busby	4324fcd8f8	Added better multilingual support, updated filter Results page now includes method for switching to "All Languages" from whichever language is specified as the primary in the config (see #74). Also removes the non-Whoogle links from the page footer, leaving only the page navigation controls Added support for the date range filter on the results page, though I'd still recommend using the ":past <unit>" query instead.	2020-06-07 14:06:49 -06:00
Ben Busby	b6fb4723f9	Project refactor (#85 ) * Major refactor of requests and session management - Switches from pycurl to requests library - Allows for less janky decoding, especially with non-latin character sets - Adds session level management of user configs - Allows for each session to set its own config (people are probably going to complain about this, though not sure if it'll be the same number of people who are upset that their friends/family have to share their config) - Updates key gen/regen to more aggressively swap out keys after each request * Added ability to save/load configs by name - New PUT method for config allows changing config with specified name - New methods in js controller to handle loading/saving of configs * Result formatting and removal of unused elements - Fixed question section formatting from results page (added appropriate padding and made questions styled as italic) - Removed user agent display from main config settings * Minor change to button label * Fixed issue with "de-pickling" of flask session Having a gitignore-everything ("") file within a flask session folder seems to cause a weird bug where the state of the app becomes unusable from continuously trying to prune files listed in the gitignore (and it can't prune ''). * Switched to pickling saved configs * Updated ad/sponsored content filter and conf naming Configs are now named with a .conf extension to allow for easier manual cleanup/modification of named config files Sponsored content now removed by basic string matching of span content * Version bump to 0.2.0 * Fixed request.send return style	2020-06-02 12:54:47 -06:00
Ben Busby	71ba00785f	Quick improvement to ad removal	2020-05-29 13:21:53 -06:00
Ben Busby	78939e7fb4	Reworked google url routing	2020-05-26 10:47:40 -06:00
Ben Busby	98d639883c	Fixing styling/url/safe mode inconsistencies	2020-05-26 10:39:19 -06:00
Ben Busby	21012f5265	Feature: autocomplete/search suggestions (#72) Basic autocomplete/search suggestion functionality added * Adds new GET and POST routes for '/autocomplete' that accept a string query and returns an array of suggestions * Adds new autoscript.js file for handling queries on the main page and results view * Updated requests class to include autocomplete method * Updated opensearch template to handle search suggestions * Added header template to allow for autocomplete on results view * Updated readme to mention autocomplete feature	2020-05-24 14:03:11 -06:00
Ben Busby	3dbe51e9e7	Removing google's filter card from results	2020-05-24 12:53:21 -06:00
Ben Busby	c51f186419	Added version footer, minor PEP 8 refactoring	2020-05-20 11:02:30 -06:00
Paul Rothrock	0e39b8f97b	Added "I'm feeling lucky" function (#46) * Putting '! ' at the beginning of the query now redirects to the first search result Signed-off-by: Paul Rothrock <paul@movetoiceland.com> * Moved get_first_url outside of filter class Signed-off-by: Paul Rothrock <paul@movetoiceland.com>	2020-05-18 10:28:23 -06:00
Ben Busby	3123789584	Added config option for opening links in new tab (#49)	2020-05-15 16:10:31 -06:00
Ben Busby	afd5b9aa83	Minor fix to dark mode on img results	2020-05-15 14:17:16 -06:00
Ben Busby	a11ceb0a57	Feature: language config (#27) * Added language configuration support Main page now has a dropdown for selecting preferred language of results. Refactored config to be its own model with language constants. * Added more language support Interface language is now updated using the "hl" arg Fixed chinese traditional and simplified values Updated decoding of characters to gb2312 * Updated to use conditional decoding dependent on language * Updated filter to not rely on valid config to work properly	2020-05-12 17:15:53 -06:00
Ben Busby	708769f682	Minor styling refactor, updated app name	2020-05-04 18:00:43 -06:00
Ben Busby	0300eab6df	Updated formatting and setup instructions Switched encoding from utf-8 to unicode-escape in an effort to support multiple languages besides English. Updated image results page formatting to fix bad image links (added TODO for adding full res image link for each image result). Updated README to include libcurl and libssl install instructions for manual setup.	2020-05-03 19:32:47 -06:00
Ben Busby	39c475af21	Using urlencode "doseq" option for url args	2020-04-29 20:31:03 -06:00
Ben Busby	c30f21f950	Minor conditional fix in filter	2020-04-29 14:46:00 -06:00
Ben Busby	b83f14be26	Fixed image href filter Needed to be checking against img attrs, not just the img object itself	2020-04-29 11:18:07 -06:00
Ben Busby	dcd93d4869	Fixed filter params, updated search button text	2020-04-29 10:03:34 -06:00
Ben Busby	5fe308956b	Cleaned up filter class, updated js config tool	2020-04-29 09:46:18 -06:00
Ben Busby	1cbe394e6f	Updated tests, fixed a few bugs Added opensearch routes test and individual tests for searching via GET and POST separately. Fixed incorrect assignment in gen_query.	2020-04-28 18:59:33 -06:00
Ben Busby	0c0ebb8917	Added POST search, encrypted query strings, refactoring The implementation of POST search support comes with a few benefits. The most apparent is the avoidance of search queries appearing in web server logs -- instead of the prior GET approach (i.e. /search?q=my+search+query), using POST requests with the query stored in the request body creates logs that simply appear as "/search". Since a lot of relative links are generated in the results page, I came up with a way to generate a unique key at run time that is used to encrypt any query strings before sending to the user. This benefits both regular text queries as well as fetching of image links and means that web logs will only show an encrypted string where a link or query string might slip through. Unfortunately, GET search requests still need to be supported, as it doesn't seem that Firefox (on iOS) supports loading search engines by their opensearch.xml file, but instead relies on manual entry of a search query string. Once this is updated, I'll probably remove GET request search support.	2020-04-28 18:19:34 -06:00
Ben Busby	4180aedd87	Added image proxying, refactored filter class Images were previously directly fetched from google search results, which was a potential privacy hazard. All image sources are now modified to be passed through shoogle's routing first, which will then fetch raw image data and pass it through to the user. Filter class was refactored to split the primary clean method into smaller, more manageable submethods.	2020-04-27 20:21:36 -06:00
Ben Busby	b0e6167733	Improved bad url arg filtering	2020-04-26 18:48:40 -06:00
Ben Busby	3bc58b64be	Small update to filter class The image results page seems to have different formatting from non-image results pages. Should probably revisit this at some point and try to style the image results page to be more in line with other result types.	2020-04-25 11:32:43 -06:00
Ben Busby	1f6bfa092e	Complete refactoring of opensearch Refactored opensearch.xml to only exist as a template that is served by a flask route, which is then populated with the necessary url root.	2020-04-24 18:45:57 -06:00
Ben Busby	a7005c012e	Refactoring of user requests and routing Curl requests and user agent related functionality was moved to its own request class. Routes was refactored to only include strictly routing related functionality. Filter class was cleaned up (had routing/request related logic in here, which didn't make sense)	2020-04-23 20:59:43 -06:00
Ben Busby	6a150092a2	Fixed config bug in filter, updated run script to work on mac os	2020-04-16 18:50:31 -06:00
Ben Busby	e72ccc4988	Small change to mobile styling	2020-04-16 10:10:18 -06:00
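
Two of the 2021 hotfixes above (cbe32a081e and 43faaee77f) describe small query-handling rules: only a standalone 'q' argument should be extracted from a result link, and any "-site:..." filters should be removable from a query before it is displayed. The following is a minimal sketch of how those two rules could look, not the repository's actual code. The function names extract_q and strip_site_filters, the regular expression, and the example URLs are all assumptions made for illustration.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_q(url: str) -> str:
    """Return the value of the standalone 'q' argument, or '' if absent.

    Hypothetical helper: parse_qs keys are exact argument names, so an
    argument such as 'dq' is never mistaken for 'q'.
    """
    query = urlparse(url).query
    values = parse_qs(query).get('q', [])
    return values[0] if values else ''


def strip_site_filters(query: str) -> str:
    """Remove every '-site:<domain>' filter from a search query.

    Hypothetical helper: the pattern assumes each filter is a single
    whitespace-free token of the form '-site:example.com'.
    """
    return re.sub(r'\s*-site:\S+', '', query).strip()


if __name__ == '__main__':
    # 'dq' appears first but only the standalone 'q' value is returned
    print(extract_q('/url?dq=other&q=https://example.com&sa=U'))
    # Blocked-site filters are dropped before showing the query to the user
    print(strip_site_filters('cats -site:pinterest.com -site:quora.com'))
```

Parsing the query string into named arguments, rather than searching for the substring "q=", is what avoids the 'dq' confusion described in the hotfix message.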


53 Commits