Commit Graph

1289 Commits (288e3086d2a67aaee45576f13b032f31ab782b45)

Author SHA1 Message Date
Émilien Devos 7d3e8118b0
Update the XPath for fetching the Google results 3 years ago
Markus Heiser 906a0a99cd [fix] openstreatmap: load thumbnail from uploads.wikimedia.org
Openstreatmap images are now loaded from uploads.wikimedia.org instead of
commons.wikimedia.org to prevent redirects.

With `image_proxy` enabled images from commons.wikimedia.org cant be loaded
since they are redirected.  We already discussed this issue [875] and
@tiekoetter fixed this issue in PR [878].

Related-to:
- [875] https://github.com/searxng/searxng/issues/875
- [878] https://github.com/searxng/searxng/pull/878
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser a967e59590 [pylint] searx/engines/wikidata.py (no functional change)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Léon Tiekötter 1c151ae92b
[fix] wikidata: URL decoding and file extension handling
Add '.png' to the second img_src_name if it has the extension '.svg'.
Use urllib.parse.unquote for URL decoding.
3 years ago
Markus Heiser a13c5d70c7 [fix] wikidata engine: select image with higher (not lower) priority
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Léon Tiekötter a50f32bcfc
wikidata: load thumbnail instead of full image 3 years ago
Léon Tiekötter 560a14e77b
[fix] wikidata info box images
Wikidata info box images are now loaded from uploads.wikimedia.org instead of commons.wikimedia.org to prevent redirects

Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser b35ef9789b [pylint] engines/invidious.py
Fix remarks from pylint and remove usless comments

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser e2ec6b4211 [fix] invidious engine: store random base_url in param
Two different threads ( = two different user queries) can call the request
function in a row and then the response function.  The namespace will be same
since this is the same engine.

To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser ddc2102a07 [fix] solidtorrents engine: store random bas_url in param
Two different threads ( = two different user queries) can call the request
function in a row and then the response function.  The namespace will be same
since this is the same engine.

To keep exactly the same value ``base_url`` must be stored in params and then
retrieve using ``resp.search_params["base_url"]``.

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#discussion_r799324861
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser d6061b7c8a [mod] solidtorrents engine: add metadata & torrentfile
BTW: define min_len in eval_xpath_list of 'stats' list

Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#pullrequestreview-872910744
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser f9c4868142 [fix] solidtorrents engine: use get_torrent_size from searx.utils
Suggested-by: @dalf https://github.com/searxng/searxng/pull/862#pullrequestreview-872858489
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser d92b3d96fd [fix] solidtorrents engine: JSON API no longer exists
The API endpoint, we where using does not exist anymore.  This patch is a
rewrite that parses the HTML page.

Related: https://github.com/paulgoio/searxng/issues/17
Closes: https://github.com/searxng/searxng/issues/858

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 50a56532c4 [pylint] engines/currency_convert.py
Fix remarks from pylint

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 15320b5eec [fix] engines description - currency_convert.py
Currency engine has DuckDuckGo metadata

In the engine selector of the preferences window, the currency search engine has
the same metadata and wikidata url as duckduckgo, I'd assume there should be a
difference of some sort there clarifying what source the currency uses or, if
it's a duckduckgo service, at least clarifying that it's a currency service by
duck duck go.

Closes: https://github.com/searxng/searxng/issues/787
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 60e7fee47a
Merge pull request #475 from return42/tineye
[enh] engine - add Tineye reverse image search
3 years ago
Alexandre Flament ebd3013a1a [mod] tineye engine: minor changes
* remove "disable: false" in settings.yml
* use the json() method from httpx.Response (faster character encoding detection)
3 years ago
Léon Tiekötter a6673a1a94 [fix] 1x engine
1x changed the XML result layout.
3 years ago
Markus Heiser a6b879f19c [mod] tineye engine: set engine_type to 'online_url_search'
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Alexandre Flament 116802852d [fix] ina engine
based on a45408e8e2
3 years ago
Markus Heiser b7f74fbe42 [mod] tineye - add some documentation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Allen 880555e263 [enh] engine - add Tineye reverse image search
Other optional parameter ..

`&sort=crawl_date`
    can be appended to search_string to sort results by date.

`&domain=example.org`
    can be implemented to search_string to get results from just one domain.

Public instances could get relatively fast timed-out for 3600s.

--

Merged from @allendema's commit [1] and slightly modfied / see [2].

Related-to: [1] 455b2b4460
Related-to: [2] https://github.com/searx/searx/pull/3040
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Léon Tiekötter 0cbf73a1f4
Allow 'using_tor_proxy' to be set for each engine individually
Check 'using_tor_proxy' for each engine individually instead of checking globally

[fix] searx.network: update _rdns test to the last httpx version

Co-authored-by: Alexandre Flament <alex@al-f.net>
3 years ago
Markus Heiser 1a0760c10a [fix] googel engine - "some results are invalids: invalid content"
Fix google issues listet in the `/stats?engine=google` and message::

    some results are invalids: invalid content

The log is::

    DEBUG   searx                         : result: invalid content: {'url': 'https://de.wikipedia.org/wiki/Foo', 'title': 'Foo - Wikipedia', 'content': None, 'engine': 'google'}
    WARNING searx.engines.google          : ErrorContext('searx/search/processors/abstract.py', 111, 'result_container.extend(self.engine_name, search_results)', None, 'some results are invalids: invalid content', ()) True

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser f0102a95c9 [fix] google engine: remove adds and fix mobile_ui selector
1. Fix issue reported in comment [1]
2. Fix XPath selector for the response of google's mobile UI, reported in
   comment [2]

[1] https://github.com/searxng/searxng/pull/777#issuecomment-1015121322
[2] https://github.com/searxng/searxng/pull/777#issuecomment-1015236238

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Émilien Devos 6670063e0d
Update XPath for Google engine 3 years ago
Alexandre Flament e07417848f
Merge pull request #695 from return42/fix-sp
[fix] startpage engine / modified API
3 years ago
Alexandre Flament f9271d595f [fix] startpage: workaround to use the startpage network
workaround for the issue #762
3 years ago
Markus Heiser bf593af423 [mod] engine mysql_server: make port configurable
Cherry piked from https://github.com/searx/searx/commit/82ac634070

Suggested-by: https://github.com/searx/searx/issues/3117
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser df238e944c [mod] starpage engine: add comment about Startpage's FFox add-on
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 21e884f369 [fix] startpage engine: fetch CAPTCHA & issues related to PR-695
In case of CAPTCHA raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails raise a SearxEngineResponseException and suspend for 7
days.

[1] https://github.com/searxng/searxng/pull/695

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 2f4e567e90 [fix] Get an actual `sc` argument from startpage's home page.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 1cbcddb3f7 [pylint] Startpage engine
Fix remarks from pylint

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser f1f5e69c42 [fix] startpage engine - avoid captcha
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:

1. some arguments has been removed and a new `sc` has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed

Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Martin Fischer 576e19dad1 [fix] add default for "about" engine property
Fixes #732.
3 years ago
Markus Heiser 4fc5e5299c [fix] ccengine engine - avoid unwanted redirects
api.openverse.engineering is a little picky and wants to have a trailing slash
in the path:

    /v1/images? -->/ v1/images/?

otherwise it redirects, here is the debug log:

    DEBUG   searx.network.openverse       : HTTP Request: GET https://api.openverse.engineering/v1/images?&page=1&page_size=20&format=json&q=foo "HTTP/2 301 Moved Permanently" (text/html; charset=utf-8)
    DEBUG   searx.network.openverse       : HTTP Request: GET https://api.openverse.engineering/v1/images/?&page=1&page_size=20&format=json&q=foo "HTTP/2 200 OK" (application/json)
    WARNING searx.engines.openverse       : ErrorContext('searx/search/processors/online.py', 105, 'count_error(', None, '1 redirects, maximum: 0', ('200', 'OK', 'api.openverse.engineering')) True

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Léon Tiekötter 37baf46ece [fix] Rename ccengine engine to openverse
The CC engine was merged with WordPress and renamed to Openverse

Source: https://wordpress.org/news/2021/05/welcome-to-openverse/
3 years ago
Léon Tiekötter 4be6deb0a1 [fix] ccengine engine
Change domain to api.openverse.engineering
3 years ago
Markus Heiser ced656606f
Merge pull request #709 from return42/drop-etools
[fix] drop etools engine module
3 years ago
Markus Heiser 5dd3442f83 [fix] drop etools engine module
The implementation of the etools engine is poor.  No date-range support, no
language support and it is broken by a CAPTCHA.

etools is a metasearch engine, the major search engines it supports (google,
bing, wikipedia, Yahoo) are already available in SeaarXNG.

While etools does support several engines we currently don't support directly,
support for them should be added directly to SearXNG if there is demand.

In practice: in SearXNG the worse etools results will be mixed with good results
from other engines we have (as long as there is no captcha).

At best case, what we win with etools is in e.g. results from de.ask.com in a
query from a german request .. in all other cases worse results are bubble up in
SearXNG's result list.

[1] https://github.com/searxng/searxng/issues/696#issuecomment-1005855499

Closes: https://github.com/searxng/searxng/issues/696
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Martin Fischer e12525a1fa
Merge pull request #708 from not-my-profile/pref-refactor
Refactor `preferences`
3 years ago
Léon Tiekötter 3ab826de22 Drop microsoft academic engine
Microsoft academic was discontinued on 2021-12-31.

Source: https://www.microsoft.com/en-us/research/project/academic/articles/microsoft-academic-to-expand-horizons-with-community-driven-approach/
3 years ago
Martin Fischer bb06758a7b [refactor] add type hints & remove Setting._post_init
Previously the Setting classes used a horrible _post_init
hack that prevented proper type checking.
3 years ago
Alexandre Flament aedd6279b3
Merge pull request #634 from not-my-profile/powered-by
Introduce `categories_as_tabs` & group engines in tabs
3 years ago
Alexandre Flament d3ecadd3f8
Merge pull request #679 from dalf/brand-searxng
searxng.org: update setup.py & settings.yml
3 years ago
Martin Fischer d01e8aa8cc [mod] introduce searx.engines.Engine for type hinting 3 years ago
Martin Fischer 1e195f5b95 [mod] move group_engines_in_tab to searx.webutils 3 years ago
Martin Fischer 5d74bf3820 [enh] move dictionaries, Erowid & IMDb out of general category
The general category is the category that is searched by default.
From a privacy standpoint it doesn't make sense to send all general
queries to specialized search engines that cannot deal with those
queries anyway.
3 years ago
Martin Fischer ab90e2ac49 [enh] show categories not in any tab category in "Other" preferences tab
Previously we didn't have a good place to put search engines that don't
fit into any of the tab categories. This commit automatically puts
search engines that don't belong to any tab category in an "other"
category, that is only displayed in the user preferences (and not above
search results).
3 years ago
Martin Fischer b02f762687 [enh] add more categories 3 years ago