1068 Commits (d2faea423a33a82867e097505cd88b84c24a7931)

Author SHA1 Message Date
Alexandre Flament a08df82574 [fix] scanr_structure engine: fix import 4 years ago
Alexandre Flament 95bd6033fa [mod] wikidata engine: use one SPARQL request instead of 2 HTTP requests. 4 years ago
Alexandre Flament ca593728af [mod] duckduckgo_definitions: display only user friendly attributes / URL
various bug fixes
4 years ago
a01200356 c3daa08537 [enh] Add onions category with Ahmia, Not Evil and Torch
Xpath engine and results template changed to account for the fact that
archive.org doesn't cache .onions, though some onion engines might have
their own cache.

Disabled by default. Can be enabled by setting the SOCKS proxies to
wherever Tor is listening and setting using_tor_proxy as True.

Requires Tor and updating packages.

To avoid manually adding the timeout on each engine, you can set
extra_proxy_timeout to account for Tor's (or whatever proxy used) extra
time.
4 years ago
Nicholas Kegler 8e15d3e4c1 Open Semantic Search Engine 4 years ago
Noémi Ványi e158eeee4b Propagate error messages from YouTube API 4 years ago
Adam Tauber 835d16cbb1
Merge pull request #2255 from kvch/yacy-improvements
Add yacy improvements: HTTP digest auth, category checking
4 years ago
Alexandre Flament cfd21bc475 [fix] fix duckduckgo engine
- remove paging support: a "vqd" parameter is required between each request. This parameter is unique for each request
- update the URL (no redirect), use the POST method
- language support: works if there is no more than one request per minute, otherwise it is ignored!
4 years ago
Noémi Ványi 72c7fd25fe Add yacy improvements: HTTP digest auth, category checking 4 years ago
Noémi Ványi f0278d41fc add ebay engine to shopping category 4 years ago
Alexandre Flament a9dc54bebc [mod] Add searx.data module
Instead of loading the data/*.json in different location,
load these files in the new searx.data module.
4 years ago
Alexandre Flament 8659212f5a [fix] drop Python 2: use collections.abc.Iterable instead of collections.Iterable 4 years ago
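To illustrate the change named in 8659212f5a: the abstract base classes live in collections.abc, and the old aliases in collections were removed in Python 3.10. The sketch below only shows the import in use; flatten_once is a hypothetical helper, not code from this repository.

```python
from collections.abc import Iterable  # not collections.Iterable


def flatten_once(items):
    """Hypothetical helper: expand one level of nesting, keeping strings whole."""
    flat = []
    for item in items:
        # str and bytes are iterable too, so exclude them explicitly
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            flat.extend(item)
        else:
            flat.append(item)
    return flat
```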
Alexandre Flament b728cb610b
Merge pull request #2241 from dalf/move-extract-text-and-url
Move the extract_text and extract_url functions to searx.utils
4 years ago
Finn 53c8d945b4
[enh] Add SepiaSearch engine (#2227)
supported_languages values: see https://framagit.org/framasoft/peertube/search-index/-/blob/master/client/src/views/Search.vue#L618-641
4 years ago
Alexandre Flament 2006eb4680 [mod] move extract_text, extract_url to searx.utils 4 years ago
Markus Heiser 8162d7aff4 [fix] google engine - div classes have been renamed in HTML result
Since 1. October 2020 google has changed the 'class' attribute of the HTML
result page.

Fix the xpath expressions and ignore <div class="g" ../> sections which do not
match to title's xpath expression.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
4 years ago
Alexandre Flament f204e4903d [fix] migration from github.com/asciimoo/searx to github.com/searx/searx : fix URLs 4 years ago
Marc Abonce Seguin ecf5899153 fetch google's search langs rather than ui langs 4 years ago
Marc Abonce Seguin 41800835f9 fetch supported languages for startpage engine 4 years ago
Marc Abonce Seguin ea9d979cc3 add language names in qwant's fetch languages function 4 years ago
Dalf c225db45c8 Drop Python 2 (4/n): SearchQuery.query is a str instead of bytes 4 years ago
Dalf 1022228d95 Drop Python 2 (1/n): remove unicode string and url_utils 4 years ago
Marc Abonce Seguin ab20ca182c use Wikipedia's REST v1 API 4 years ago
Noémi Ványi f0ca1c3483
[enh] Add command line engines: git grep, find, etc. (#2128)
A new "base" engine called command is introduced. It is the foundation for all command line engines for now.
You can use this engine to create your own command line engine.

Add some engines (commented out to make sure no one enables anything accidentally):
* git grep: This engine lets you grep in the searx repo.
* locate: If locate is installed and initialized, you can search on the FS.
* find: You can find files with a specific name from where you started searx.
* pattern search in files: This engine utilizes the command fgrep.
* regex search in files: This engine runs `grep` to find a file based on its contents.
4 years ago
Alexandre Flament 3397382754
[enh] stop searx when an engine raises a SyntaxError exception (#2177)
and some other exceptions:
* KeyboardInterrupt
* SystemExit
* RuntimeError
* SystemError
* ImportError: an engine with an unmet dependency will stop everything.
4 years ago
Alexandre Flament b329058c1a Revert "[enh] test: load each engine to check for syntax errors"
This reverts commit 4fb3ed2c63.
4 years ago
Adam Tauber 6f9aa0e258
Merge pull request #2160 from dalf/test_load_engine
[enh] test: load each engine to check for syntax errors
4 years ago
Adam Tauber 6ded6e7a9a [fix] skip incomplete image results - closes #1496 4 years ago
Dalf 4fb3ed2c63 [enh] test: load each engine to check for syntax errors 4 years ago
Marc Abonce Seguin 0d8970c8f2
only return one url per "type" in Wikidata (#2151)
i.e. only one official website, one Twitter, etc.
4 years ago
Émilien Devos 27d74826f1
[enh] add yggtorrent engine (#2135) 4 years ago
Emilien Devos c15a91a534 [fix] piratebay engine date and pep8 indentation 4 years ago
Emilien Devos 52d78d8418 [fix] piratebay engine 4 years ago
Adam Tauber 77103c7874
Merge pull request #2116 from mikeri/invidiousres
Include author and video length in Invidious results
4 years ago
Vlad f678388dbc
Fix google images 'get image' button bug from issue #2103 (#2115)
Closes #2103
4 years ago
Michael Ilsaas a1ce141c99
add peertube engine (#2109) 4 years ago
Michael Ilsaas 2ed8ad7691 include length in invidious results 4 years ago
Michael Ilsaas 0305fe0dd5 include author in invidious results 4 years ago
Marc Abonce Seguin 77b9faa8df fix Wikipedia's paragraph extraction 4 years ago
Michael Ilsaas 98cb6b6701 Update torrentz2 URL from .eu to .is 4 years ago
xywei 1d4657b714
Fix relative urls that do not start with '/' 4 years ago
Gaspard d'Hautefeuille 4e346e741a
fix python 3 support 4 years ago
Adam Tauber 52eba0c721 [fix] pep8 4 years ago
Markus Heiser 16f8ec894a [fix] revise google images engine
this commit is picked from #1985
4 years ago
Markus Heiser 410c2f903d [fix] revise google engine
this commit is picked from #1985
4 years ago
Markus Heiser 8d318ee142
Merge branch 'master' into gigablast 4 years ago
Sophie Tauchert 71db7b1238
Fix YaCy text results returned as images 4 years ago
Noémi Ványi 93cbd85b8a
Merge branch 'master' into duckduckgo_correction 4 years ago
Markus Heiser 5fac6cffa2
Merge branch 'master' into gigablast 4 years ago
Markus Heiser 5293e58032 [fix] yahoo engine - changed content_xpath
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
4 years ago
Markus Heiser 223430ff30
Merge branch 'master' into gigablast 4 years ago
Adam Tauber 32f7877235 [fix] resolve flickr_noapi encoding issues 4 years ago
Gordon Quad 385e9b5c9e add correction support for duckduckgo 4 years ago
Markus Heiser ee0da61cbb
Merge branch 'master' into gigablast 4 years ago
Adam Tauber aa7c043ff4 [fix] resolve pep8 errors 4 years ago
Adam Tauber 29960aa1d9 [enh] add official site link to the top of the infobox - closes #1644 4 years ago
Adam Tauber 6c06286251 [enh] add length and author details to youtube videos
closes #775
4 years ago
Adam Tauber 2c6531b233 [enh] add routing directions to osm search - closes #254 4 years ago
Markus Heiser 74135007eb
Merge branch 'master' into gigablast 4 years ago
Noémi Ványi e3282748d0 add display_error_messages option to engine settings
A new option is added to engines to hide error messages from users. It
is called `display_error_messages` and by default it is set to `True`.
If it is set to `False` error messages do not show up on the UI.

Keep in mind that engines are still suspended if needed regardless of
this setting.

Closes #1828
4 years ago
Markus Heiser ee5d2b319b [fix] gigablast requires a random extra parameter
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
4 years ago
Markus Heiser a18760b322 [fix] revise of the gigablast engine (WIP)
The gigablast API has changed and seems to have some quirks, this is the first
revise.  More work (hacks) are needed.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
4 years ago
Markus Heiser 57c7b90edd [fix] gigablast does no longer support *supported_languages_url*
Since there are zero results, we can remove it:

    $ make engines.languages
    fetch languages ..
    ...
    fetched 0 languages from engine gigablast

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
4 years ago
Markus Heiser de179ecc5b [fix] remove debug print from commit e5305f8
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
4 years ago
Markus Heiser 9302d1fc17
Merge branch 'master' into master 5 years ago
Noémi Ványi fcb44c6542
Merge branch 'master' into fix_startpage_ValueError_on_spanish_datetime 5 years ago
HLFH 3a26093c46
Remove discontinued faroo engine 5 years ago
Spühler Stefan 4f90fb6a92 [Fix] Startpage ValueError on Spanish date format
dateutil's parser.parse() does not know the Spanish date format which
leads to a ValueError. Fixes #1870

Traceback (most recent call last):
  File "/usr/local/searx/searx/search.py", line 160, in search_one_http_request_safe
    search_results = search_one_http_request(engine, query, request_params)
  File "/usr/local/searx/searx/search.py", line 97, in search_one_http_request
    return engine.response(response)
  File "/usr/local/searx/searx/engines/startpage.py", line 102, in response
    published_date = parser.parse(date_string, dayfirst=True)
  File "/usr/local/searx/searx-ve/lib/python3.6/site-packages/dateutil/parser/_parser.py", line 1358, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/usr/local/searx/searx-ve/lib/python3.6/site-packages/dateutil/parser/_parser.py", line 649, in parse
    raise ValueError("Unknown string format:", timestr)
ValueError: ('Unknown string format:', '24 Ene 2013')
5 years ago
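The failure mode in the traceback above can be guarded against by catching the ValueError. The commit text does not show the actual fix, so the following is only a sketch of that general idea: it uses the standard library's datetime.strptime as a stand-in for dateutil, and parse_published_date is a hypothetical name. It assumes the default C locale, where month abbreviations are English.

```python
from datetime import datetime


def parse_published_date(date_string):
    """Return a datetime, or None if the string is not in 'DD Mon YYYY' form.

    Hypothetical helper; stands in for the dateutil call in the traceback.
    """
    try:
        return datetime.strptime(date_string, "%d %b %Y")
    except ValueError:
        # e.g. '24 Ene 2013': a Spanish month abbreviation is not recognized
        return None
```

A caller can then omit the published date from a result when None is returned instead of letting the whole engine response fail.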
Markus Heiser ad7a6e6e10 bugfix(!biv) : bing-video do not like "older" User-Agents
When selecting other languages than 'en', bing-video did not handle the language
correctly and gave very bad results.  Since User-Agent is normally rotated in
searx, the behavior of a !biv search was unpredictable and paging was broken.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Markus Heiser 1c853f9573 bing_news: partial rollback of c89c05bc
The bing_news bug (discussed in #1838) was caused by wrong language tags, which
was fixed e0c99d9d / no need to change the bing_news search string.

closes: https://github.com/asciimoo/searx/issues/1838

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Markus Heiser e0c99d9dcb bugfix: fetch_supported_languages bing, -news, -videos, -images
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Markus Heiser c89c05bceb bugfix: google-news and bing-news has changed the language parameter
closes: https://github.com/asciimoo/searx/issues/1838

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
piplongrun f0684a5bb5
Add eTools engine 5 years ago
Noémi Ványi 99435381a8 [enh] introduce private engines
This PR adds a new setting to engines named `tokens`.
It expects a list of tokens which lets searx validate
if the request should be accepted or not.
5 years ago
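One way to read the `tokens` setting described in 99435381a8 is as an allow-list check. The sketch below is an assumption about the idea, not the project's implementation: is_request_allowed and both parameter names are hypothetical, and treating an engine without tokens as public is also an assumption.

```python
def is_request_allowed(engine_tokens, request_tokens):
    """Hypothetical check: accept the request if it presents a known token.

    engine_tokens: tokens configured for the engine (empty means public,
                   which is an assumption of this sketch)
    request_tokens: tokens supplied with the incoming request
    """
    if not engine_tokens:
        return True
    allowed = set(engine_tokens)
    return any(token in allowed for token in request_tokens)
```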
frankdelange db9d7d47bd Fix double-encode error (fixes #1799) 5 years ago
Adam Tauber 17b6faa4c3 [fix] pep8 5 years ago
Adam Tauber ad5bb994b1 [fix] add py3 compatibility 5 years ago
Adam Tauber 1e6253ce16 [fix] handle empty response 5 years ago
Adam Tauber 86a378bd01 [fix] handle missing thumbnail 5 years ago
Adam Tauber 2dc2e1e8f9 [fix] skip invalid encoded attributes 5 years ago
Adam Tauber 2292e6e130 [fix] handle missing result size 5 years ago
Markus Heiser 36e72a4619
Merge branch 'master' into fix-engine-spotify 5 years ago
Marc Abonce Seguin 5706c12fba remove empty parenthesis in wikipedia's summary
They're usually IPA pronunciations which are removed
by the API.
5 years ago
Marc Abonce Seguin c18048e045 exclude disambiguation pages from wikipedia infobox 5 years ago
Adam Tauber 34ad3d6b34 [enh] display error message if gigablast extra param expired 5 years ago
Adam Tauber fc457569f7 [fix] pep8 5 years ago
Adam Tauber 00512e36c1 [fix] handle empty response from wikipedia engine - closes #1114 5 years ago
Adam Tauber f8713512be [fix] convert byte query to string in osm engine - fixes #1220 5 years ago
Adam Tauber e5305f886c [fix] fetch extra search param of gigablast - fixes #1293 5 years ago
Adam Tauber 8850036ded [fix] add explicit useragent header to requests - closes #1459 5 years ago
Marc Abonce Seguin ccaf6ca02c [fix] update xpaths for new google results page 5 years ago
Adam Tauber 731e34299d
Merge pull request #1744 from dalf/optimizations
[mod] speed optimization
5 years ago
Adam Tauber 574cb25a16
Merge pull request #1758 from return42/ddd-fix
[fix] duckduckgo_definitions
5 years ago
Markus Heiser 30ad0c666d duckduckgo_definitions: remove the debug message
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Adam Tauber 20da8f2cbf
Merge pull request #1754 from MarcAbonce/seedpeer
Add Seedpeer again
5 years ago
Markus Heiser b6d9f5aa71 [fix] duckduckgo_definition issues reported by 'manage.sh test'
Fix this error while travis build::

  /home/travis/build/asciimoo/searx/searx/engines/duckduckgo_definitions.py:21:44: E225 missing whitespace around operator

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Markus Heiser 4998e9ec85 [fix] duckduckgo_definitions - where 'AnswerType' is 'calc'
Do not try to get text when 'AnswerType' is 'calc'.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Markus Heiser 2aa95c16e3 [fix] soundcloud: URLs of JS sources has been moved
The client_id is found under (new) URL:

  https://a-v2.sndcdn.com/assets/49-a0c01933-3.js

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
5 years ago
Adam Tauber 789d71350d
Merge pull request #1745 from lorddavidiii/python3.8-fix
Fix python 3.8 compatibility
5 years ago
Adam Tauber 05033ea8d8
Merge pull request #1689 from MarcAbonce/images_fixes
[fix] Google Images
5 years ago
Marc Abonce Seguin 9299355570 add seedpeer again 5 years ago
Emilien Devos 8f51430f5c [fix] Force Google old UI with a new user agent 5 years ago
lorddavidiii 5e5ff0cbf8 webapp.py: use html.escape if cgi.escape is not available
- cgi.escape was removed in python 3.8
- also use html.escape in framalibre.py
5 years ago
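The fallback described in 5e5ff0cbf8 can be sketched as a guarded import. This is an illustration, not the code from the commit; escape_text is a hypothetical wrapper. Note that html.escape also escapes quotes by default, while the old cgi.escape did not.

```python
try:
    from html import escape  # available on Python 3.2+
except ImportError:
    from cgi import escape  # older interpreters only; removed in Python 3.8


def escape_text(text):
    """Hypothetical wrapper: escape &, < and > for safe inclusion in HTML."""
    return escape(text)
```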
Dalf 85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
5 years ago
Noémi Ványi 5796dc60c9 fix pep 8 check 5 years ago
Noémi Ványi a6f20caf32 add initial support for offline engines && command engine 5 years ago
Adam Tauber 7d8fd4b95e [fix] pep8 5 years ago
Adam Tauber bbe4442a86 [fix] update gigablast engine 5 years ago
Adam Tauber 1057e42cfd [fix] update digg engine 5 years ago
Adam Tauber 7177c9e12f [fix] update deviantart engine 5 years ago
Adam Tauber 6ca1622378 [fix] update 1x engine 5 years ago
Adam Tauber c98a2df36d [fix] enable paging support for arxiv engine 5 years ago
Adam Tauber ed1c1bdb04 [fix] pep8 5 years ago
Adam Tauber 77a70fe541 [fix] update startpage engine - closes #1601 5 years ago
Adam Tauber 94ea9d6622 [fix] duckduckgo paging - closes #1677 5 years ago
Marc Abonce Seguin bb4d223770 [fix] google images 5 years ago
Léo Bourrel 88261e111c Fix bing engine results count (#1387)
This PR fixes the result count from bing, which was throwing a (hidden) error, and adds a validation to avoid reading more results than available.

For example :
If there are 100 results from some search and we try to get results from 120 to 130, Bing will send back the results from 0 to 10 and no error. If we compare the result count with the first parameter of the request, we can avoid these "invalid" results.
5 years ago
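The validation described in 88261e111c amounts to comparing the requested offset with the reported total. A minimal sketch of that comparison follows; filter_out_of_range and its parameter names are hypothetical and do not come from the engine's code.

```python
def filter_out_of_range(results, result_count, first_index):
    """Hypothetical check: drop a page requested past the reported total.

    results:      results parsed from the response
    result_count: total number of results reported by the search engine
    first_index:  index of the first result that was requested
    """
    if first_index > result_count:
        # The server silently answered with the first page instead of an error.
        return []
    return results
```

With the numbers from the commit message, a request starting at 120 against a total of 100 yields an empty list instead of a repeated first page.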
Dalf 1cee2c1796 [fix] bing engine
before this commit, sometimes there are no results
use a generic user-agent instead of one with the OS "Windows NT 6.3; WOW64"
5 years ago
Dalf fcc9587ee9 [fix] fdroid engine 5 years ago
Dalf fbf6b689dd [fix] dictzone engine 5 years ago
Dalf 9ff5001816 [fix] arxiv engine 5 years ago
Alexandre Flament 2179079a91
[fix] fix flickr_noapi decoding (#1655)
Characters that were not ASCII were incorrectly decoded.
Add a helper function: searx.utils.ecma_unescape (Python implementation of the JavaScript unescape function).
5 years ago
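For readers unfamiliar with JavaScript's unescape: it decodes %XX and %uXXXX sequences into the characters with those code units. The sketch below shows the idea with a single regular expression; it is not the code of searx.utils.ecma_unescape, and the name ecma_unescape_sketch is chosen here to make that explicit.

```python
import re

# %uXXXX (four hex digits) or %XX (two hex digits)
_ESCAPE_RE = re.compile(r"%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})")


def ecma_unescape_sketch(text):
    """Decode %XX and %uXXXX sequences; leave everything else untouched."""

    def _decode(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _ESCAPE_RE.sub(_decode, text)
```

Decoding both forms in one pass avoids unescaping the output of an earlier substitution a second time.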
cy8aer 4dc792e1e2 [enh] add invidious engine. (#1657)
closes #1372
5 years ago
0xhtml b2e1ee8d35 Fix some more errors with none/wrong credentials 5 years ago
0xhtml 275b37cc7c Fix error if the user hasn't set api credentials 5 years ago
0xhtml c329ea135e Fix spotify engine 5 years ago
Dalf 0c032c8429 [fix] youtube_noapi engine: fix the title 5 years ago
Dalf 8b7ac56669 [fix] google_videos engine: some results don't have a thumbnail 5 years ago
Dalf d44677e226 [fix] dailymotion engine: remove HTML tags from the description 5 years ago
Dalf 6e0285b2db [fix] wikidata engine: faster processing, remove one HTTP redirection.
* Search URL is https://www.wikidata.org/w/index.php?{query}&ns0=1 (with ns0=1 at the end to avoid an HTTP redirection)
* url_detail: remove the disabletidy=1 deprecated parameter
* Add eval_xpath function: compile once for all xpath.
* Add get_id_cache: retrieve all HTML with an id, avoid the slow to process dynamic xpath '//div[@id="{propertyid}"]'.replace('{propertyid}')
* Create an etree.HTMLParser() instead of using the global one (see #1575)
5 years ago
Frank de Lange cbc5e13275 [enh] flickr_noapi: use complete JSON data block, add 'content', 'img_format', 'source', etc. (#1571)
Fetch complete JSON data block, use legend to extract images. 
Unquote urlencoded strings.
Add image description as 'content'. 
Add 'img_format' and 'source' data (needs PR #1567 to enable this data to be displayed). 
Show images which lack ownerid instead of discarding them.
5 years ago
Frank de Lange 204a2cbbf0 [fix] bing_videos (#1579)
use JSON where possible, compose 'content' using all available data, use correct 'url' (direct to source instead of redirect through bing)
5 years ago
Dalf 23611897ec [fix] make sure the engine name is lower case
Minor fix: "%s engine initialized" display the right engine name
5 years ago
Frank de Lange 11fc9913e9 [enh] bing_images: use data from embedded JSON to improve results (e.g. real page title) (#1568)
use data from embedded JSON to improve results (e.g. real page title), add image format and source info (see PR #1567), improve paging logic (it now works)
5 years ago
Alexandre Flament f34b5cedb1
[fix] fixes google play engines (#1651)
update commit 87baa74a86
5 years ago
volth eb182df132 [mod] restore btdigg engine as btdig.com (#1515) 5 years ago
rachmadani haryono 3b1122c5fa [fix] fix duden engine (#1594) 5 years ago
Venca24 87baa74a86 [fix] fixes google play engines and adds thumbnails to their results (#1612)
fix google play apps, google play apps, google play music engines

xpath engine: thumbnail_xpath can define an optional thumbnail
5 years ago
Dalf da0ce5880f [fix] fix soundcloud engine, speed up searx start time 5 years ago
Dalf 45702b77ca embedded iframe (youtube, dailymotion, vimeo): use https 5 years ago
Emilien Devos cbd1ebdce8 [fix] Force Google old UI (#1597) 5 years ago
Frank de Lange 4b7332286a Use string formatter to create source and img_format labels (#1566)
google_images :  use JSON embedded in HTML (engine expected pure JSON)
5 years ago
Dalf ffe0972f91 Remove some engines : subtitleseeker, seedpeer, swisscows
http://www.subtitleseeker.com and http://www.seedpeer.eu don't exist anymore.
https://swisscows.ch/ has changed: the engine needs to be updated
5 years ago
Alexandre Flament df2b9a76f7
Merge branch 'master' into ne/fix-google-image-search 6 years ago
Nick Espig 1c6ab79b9f
Fix google image search
- Because there is not full image url in the dom, we replace "image_url" with the same url as the "url" (url of source).
  See example HTML https://gist.github.com/Nachtalb/2dea8a4d2c723c49226ad9645838121f
- Remove unused import
- Fix google image search title
- Keep google image safe value up to date
6 years ago
Marc Abonce Seguin 3e1c2153f7 [fix] duckduckgo images requests 6 years ago
Marc Abonce Seguin f2d49a6971 [fix] get youtube results from js object
Results are not appearing in the html document anymore,
instead they are found inside an object embedded in a script.
6 years ago
Jonas Zohren f7bdd827c4 [enh] adds apkmirror search engine 6 years ago
Léo Bourrel bf4a38ad66 Remove asksteem 6 years ago
d-tux f1814079f0
Merge branch 'master' into engines/unsplash 6 years ago