Commit Graph

12 Commits (ba8959ad7c18ce4165d29b7a472d845bd96f4735)

Author SHA1 Message Date
Alexandre FLAMENT e36f85b836 Science category: update the engines
* use the paper.html template
* fetch more data from the engines
* add crossref.py
2 years ago
Markus Heiser 8df1f0c47e [mod] add 'Accept-Language' HTTP header to online processores
Most engines that support languages (and regions) use the Accept-Language from
the WEB browser to build a response that fits to the language (and region).

- add new engine option: send_accept_language_header

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2 years ago
Emilien Devos 5fb2071cb2 [fix] google & youtube - set EU consent cookie
This change the previous bypass method for Google consent using
``ucbcb=1`` (6face215b8) to accept the consent using ``CONSENT=YES+``.

The youtube_noapi and google have a similar API, at least for the consent[1].

Get CONSENT cookie from google reguest::

    curl -i "https://www.google.com/search?q=time&tbm=isch" \
         -A "Mozilla/5.0 (X11; Linux i686; rv:102.0) Gecko/20100101 Firefox/102.0" \
         | grep -i consent
    ...
    location: https://consent.google.com/m?continue=https://www.google.com/search?q%3Dtime%26tbm%3Disch&gl=DE&m=0&pc=irp&uxe=eomtm&hl=en-US&src=1
    set-cookie: CONSENT=PENDING+936; expires=Wed, 24-Jul-2024 11:26:20 GMT; path=/; domain=.google.com; Secure
    ...

PENDING & YES [2]:

  Google change the way for consent about YouTube cookies agreement in EU
  countries. Instead of showing a popup in the website, YouTube redirects the
  user to a new webpage at consent.youtube.com domain ...  Fix for this is to
  put a cookie CONSENT with YES+ value for every YouTube request

[1] https://github.com/iv-org/invidious/pull/2207
[2] https://github.com/TeamNewPipe/NewPipeExtractor/issues/592

Closes: https://github.com/searxng/searxng/issues/1432
2 years ago
Emilien Devos 6face215b8 bypass google consent with ucbcb=1 2 years ago
Markus Heiser 3d96a9839a [format.python] initial formatting of the python code
This patch was generated by black [1]::

    make format.python

[1] https://github.com/psf/black

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2 years ago
Markus Heiser f0059b80ed [pylint] engines: drop no longer needed 'missing-function-docstring'
Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914168470
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser cd033b5416 [fix] drop useless pylint: disable=undefined-variable
Since 7b235a1 (see line 591) it is no longer needed to disable
'undefined-variable' for names defined in::

   PYLINT_ADDITIONAL_BUILTINS_FOR_ENGINES

Suggested-by: @dalf https://github.com/searxng/searxng/issues/102#issuecomment-914068609
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser aecfb2300d [mod] one logger per engine - drop obsolete logger.getChild
Remove the no longer needed `logger = logger.getChild(...)` from engines.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser 2ac3e5b20b [fix] log messages from: google- images, news, scholar, videos
- HTTP header Accept-Language --> lang_info['headers']['Accept-Language']
- remove obsolete query_url log messages which is already logged by
  httpx._client:HTTP request

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Alexandre Flament 1c67b6aece [enh] google engine: supports "default language"
Same behaviour behaviour than Whoogle [1].  Only the google engine with the
"Default language" choice "(all)"" is changed by this patch.

When searching for a locate place, the result are in the expect language,
without missing results [2]:

  > When a language is not specified, the language interpretation is left up to
  > Google to decide how the search results should be delivered.

The query parameters are copied from Whoogle.  With the ``all`` language:

- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add a ``Accept-Language`` HTTP header.

The new signature of function ``get_lang_info()`` is:

    lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)

Argument ``supported_any_language`` is True for google.py and False for the other
google engines.  With this patch the function now returns:

- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
  - ``lang_info['subdomain']``
  - ``lang_info['country']``
  - ``lang_info['language']``

[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
3 years ago
Markus Heiser dc29f1d826 [pylint] tag PYLINT_FILES by comment `# lint: pylint`
These py files are linted by `test.pylint`, all other files are linted by
`test.pep8`.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago
Markus Heiser d48e2e7b0b [enh] google scholar - python implementation of the engine
The old xpath configuration for google scholar did not work and is replaced by a
python implementation.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
3 years ago