[docs] revision of the article "Offline engines"

This patch is a complete revision of the article "Offline engines", which also
merges the content from the searx-wiki [1] into this article.

[1] https://github.com/searx/searx/wiki/Offline-engines

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
pull/97/head
Markus Heiser 3 years ago
parent 1c8cf1d3a8
commit 274288ea99

@ -1,77 +1,78 @@
===============================
Preparation for offline engines
===============================
.. _offline engines:
Offline engines
===============
Offline Engines
===============
.. sidebar:: offline engines
- :ref:`demo offline engine`
- :ref:`sql engines`
- :ref:`command line engines`
- :origin:`Redis <searx/engines/redis_server.py>`
To extend the functionality of searx, offline engines are going to be
To extend the functionality of SearxNG, offline engines are going to be
introduced. An offline engine is an engine which does not need an Internet
connection to perform a search and does not use HTTP to communicate.
Offline engines can be configured as online engines, by adding those to the
`engines` list of :origin:`settings.yml <searx/settings.yml>`. Thus, searx
finds the engine file and imports it.
Offline engines can be configured by adding them to the `engines` list of
:origin:`settings.yml <searx/settings.yml>`. An example skeleton for offline
engines can be found in :ref:`demo offline engine` (:origin:`demo_offline.py
<searx/engines/demo_offline.py>`).
Example skeleton for the new engines:
.. code:: python
Programming Interface
=====================
from subprocess import PIPE, Popen
:py:func:`init(engine_settings=None) <searx.engines.demo_offline.init>`
All offline engines can have their own init function to set up the engine before
accepting requests. The function gets the settings from settings.yml as a
parameter. This function can be omitted if there is no need to set up anything
in advance.
categories = ['general']
offline = True
:py:func:`search(query, params) <searx.engines.demo_offline.search>`
def init(settings):
pass
Each offline engine has a function named ``search``. This function is
responsible for performing a search and returning the results in a presentable
format. (Where *presentable* means presentable by the selected result
template.)
def search(query, params):
process = Popen(['ls', query], stdout=PIPE)
return_code = process.wait()
if return_code != 0:
raise RuntimeError('non-zero return code', return_code)
The return value is a list of results retrieved by the engine.
results = []
line = process.stdout.readline()
while line:
result = parse_line(line)
results.append(result)
Engine representation in ``/config``
If an engine is offline, the attribute ``offline`` is set to ``True``.
line = process.stdout.readline()
.. _offline requirements:
return results
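As a rough illustration of the ``init`` / ``search`` interface described in this article, the following is a minimal sketch of an offline engine that searches an in-memory list. It is not the code of ``demo_offline.py``: the ``_ROWS`` data set and the result keys are hypothetical, and the module attributes simply mirror the skeleton quoted in this article.

```python
# Minimal sketch of an offline engine module (not the real demo_offline.py).
# Everything except the init/search function names is an assumption.

categories = ['general']
offline = True  # attribute as used in the skeleton quoted in this article

# Hypothetical in-memory data set standing in for a local data source.
_ROWS = []


def init(engine_settings=None):
    """Prepare the engine once, before it accepts requests."""
    _ROWS.clear()
    _ROWS.extend([
        {'name': 'apple', 'color': 'red'},
        {'name': 'banana', 'color': 'yellow'},
        {'name': 'pineapple', 'color': 'brown'},
    ])


def search(query, params):
    """Return a list of results; each result is a dict of key/value pairs."""
    needle = query.lower()
    # Copy each matching row so callers cannot mutate the data set.
    return [dict(row) for row in _ROWS if needle in row['name'].lower()]
```

Because each result here is a plain dictionary, it would suit a template that renders arbitrary key/value pairs; which template a real engine selects is left out of this sketch.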
Extra Dependencies
==================
If an offline engine depends on an external tool, SearxNG does not install it by
default. When an administrator configures such an engine and starts the instance,
the process returns an error with the list of missing dependencies. Also,
required dependencies will be added to the comment/description of the engine, so
admins can install packages in advance.
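One way an engine could report missing external tools at startup is sketched here. This is only an illustration of the idea, not how SearxNG implements the check: the ``required_commands`` settings key and the ``missing_dependencies`` helper are hypothetical names introduced for this example.

```python
import shutil


def missing_dependencies(commands):
    """Return the entries of `commands` that are not found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


def init(engine_settings=None):
    # 'required_commands' is a hypothetical settings key used only in this sketch.
    required = (engine_settings or {}).get('required_commands', [])
    missing = missing_dependencies(required)
    if missing:
        # Fail early and list every missing tool in a single error.
        raise RuntimeError('missing dependencies: ' + ', '.join(missing))
```

Raising from ``init`` surfaces the problem when the instance starts rather than on the first search request.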
Development progress
====================
To install additional packages in the *Python virtual environment* of your
SearxNG instance, first switch into the environment (:ref:`searx-src`). You can
use :ref:`searx.sh` for this::
First, a proposal was created as a GitHub issue. Then it was moved to the
wiki as a design document. You can read it here: :wiki:`Offline-engines`.
$ sudo utils/searx.sh shell
(searx-pyenv)$ pip install ...
In this development step, searx core was prepared to accept and perform offline
searches. Offline search requests are scheduled together with regular online
requests.
As offline searches can return arbitrary results depending on the engine, the
current result templates were insufficient to present such results. Thus, a new
template is introduced which is capable of presenting arbitrary key value pairs
as a table. For more details, see the pull request :pull-searx:`1700`.
Private engines (Security)
==========================
Next steps
==========
To limit the access to offline engines, if an instance is available publicly,
administrators can set token(s) for each of the :ref:`private engines`. If a
query contains a valid token, then SearxNG performs the requested private
search. If not, requests from offline engines return errors.
Today, it is possible to create/run an offline engine. However, such an engine is publicly available to everyone who knows the searx instance. So the next step is to introduce token-based access for engines. This way administrators are able to limit the access to private engines.
Acknowledgement
===============
This development was sponsored by `Search and Discovery Fund`_ of `NLnet Foundation`_ .
.. _Search and Discovery Fund: https://nlnet.nl/discovery
.. _NLnet Foundation: https://nlnet.nl/
| Happy hacking.
| kvch // 2019.10.21 17:03
This development was sponsored by `Search and Discovery Fund
<https://nlnet.nl/discovery>`_ of `NLnet Foundation <https://nlnet.nl/>`_ .
