|
|
|
|
|
<!DOCTYPE html>
|
|
|
|
|
|
<html lang="en">
|
|
|
<head>
|
|
|
<meta charset="utf-8" />
|
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
|
|
|
|
|
<title>Search language plugin — SearXNG Documentation (2023.1.23+522ba9a1)</title>
|
|
|
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
|
|
|
<link rel="stylesheet" type="text/css" href="../_static/searxng.css" />
|
|
|
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
|
|
|
<script data-url_root="../" id="documentation_options" src="../_static/documentation_options.js"></script>
|
|
|
<script src="../_static/jquery.js"></script>
|
|
|
<script src="../_static/underscore.js"></script>
|
|
|
<script src="../_static/_sphinx_javascript_frameworks_compat.js"></script>
|
|
|
<script src="../_static/doctools.js"></script>
|
|
|
<script src="../_static/sphinx_highlight.js"></script>
|
|
|
<link rel="index" title="Index" href="../genindex.html" />
|
|
|
<link rel="search" title="Search" href="../search.html" />
|
|
|
<link rel="next" title="Limiter Plugin" href="searx.plugins.limiter.html" />
|
|
|
<link rel="prev" title="Locales" href="searx.locales.html" />
|
|
|
</head><body>
|
|
|
|
|
|
|
|
|
<div class="document">
|
|
|
<div class="documentwrapper">
|
|
|
<div class="bodywrapper">
|
|
|
<div class="body" role="main">
|
|
|
|
|
|
<section id="module-searx.plugins.autodetect_search_language">
|
|
|
<span id="search-language-plugin"></span><span id="autodetect-search-language"></span><h1>Search language plugin<a class="headerlink" href="#module-searx.plugins.autodetect_search_language" title="Permalink to this heading">¶</a></h1>
|
|
|
<p>Plugin to detect the search language from the search query.</p>
|
|
|
<p>The language detection is done by using the <a class="reference external" href="https://fasttext.cc/">fastText</a> library (<a class="reference external" href="https://pypi.org/project/fasttext/">python
|
|
|
fasttext</a>). <a class="reference external" href="https://fasttext.cc/">fastText</a> distributes the <a class="reference external" href="https://fasttext.cc/docs/en/language-identification.html">language identification model</a>, for
|
|
|
reference:</p>
|
|
|
<ul class="simple">
|
|
|
<li><p><a class="reference external" href="https://arxiv.org/abs/1612.03651">FastText.zip: Compressing text classification models</a></p></li>
|
|
|
<li><p><a class="reference external" href="https://arxiv.org/abs/1607.01759">Bag of Tricks for Efficient Text Classification</a></p></li>
|
|
|
</ul>
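<p>As an illustration of how a fastText prediction can be turned into a language code, here is a minimal sketch. It assumes the usual shape of a fastText prediction (labels prefixed with <code class="docutils literal notranslate"><span class="pre">__label__</span></code> plus matching probabilities). The helper name <code class="docutils literal notranslate"><span class="pre">label_to_lang</span></code> and the <code class="docutils literal notranslate"><span class="pre">min_probability</span></code> threshold are hypothetical and are not taken from the plugin.</p>

```python
from typing import Optional, Sequence

# Prefix that fastText puts in front of every class label.
LABEL_PREFIX = "__label__"


def label_to_lang(
    labels: Sequence[str],
    probabilities: Sequence[float],
    min_probability: float = 0.3,  # hypothetical threshold, not from the plugin
) -> Optional[str]:
    """Return the language code of the top prediction, or None if unusable."""
    if not labels or not probabilities:
        return None
    top_label, top_probability = labels[0], probabilities[0]
    # Reject unexpected labels and low-confidence predictions.
    if not top_label.startswith(LABEL_PREFIX) or top_probability < min_probability:
        return None
    return top_label[len(LABEL_PREFIX):]


# Values shaped like the (labels, probabilities) pair of a single prediction.
print(label_to_lang(("__label__de",), (0.97,)))
print(label_to_lang(("__label__en",), (0.12,)))
```

<p>The sketch works on plain sequences so that it runs without the fastText model; in a real setup the two arguments would come from the model's prediction for the query string.</p>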
|
|
|
<p>The <a class="reference external" href="https://fasttext.cc/docs/en/language-identification.html">language identification model</a> supports the following language codes (ISO-639-3):</p>
|
|
|
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">af</span> <span class="n">als</span> <span class="n">am</span> <span class="n">an</span> <span class="n">ar</span> <span class="n">arz</span> <span class="k">as</span> <span class="n">ast</span> <span class="n">av</span> <span class="n">az</span> <span class="n">azb</span> <span class="n">ba</span> <span class="n">bar</span> <span class="n">bcl</span> <span class="n">be</span> <span class="n">bg</span> <span class="n">bh</span> <span class="n">bn</span> <span class="n">bo</span> <span class="n">bpy</span> <span class="n">br</span> <span class="n">bs</span> <span class="n">bxr</span>
|
|
|
<span class="n">ca</span> <span class="n">cbk</span> <span class="n">ce</span> <span class="n">ceb</span> <span class="n">ckb</span> <span class="n">co</span> <span class="n">cs</span> <span class="n">cv</span> <span class="n">cy</span> <span class="n">da</span> <span class="n">de</span> <span class="n">diq</span> <span class="n">dsb</span> <span class="n">dty</span> <span class="n">dv</span> <span class="n">el</span> <span class="n">eml</span> <span class="n">en</span> <span class="n">eo</span> <span class="n">es</span> <span class="n">et</span> <span class="n">eu</span> <span class="n">fa</span>
|
|
|
<span class="n">fi</span> <span class="n">fr</span> <span class="n">frr</span> <span class="n">fy</span> <span class="n">ga</span> <span class="n">gd</span> <span class="n">gl</span> <span class="n">gn</span> <span class="n">gom</span> <span class="n">gu</span> <span class="n">gv</span> <span class="n">he</span> <span class="n">hi</span> <span class="n">hif</span> <span class="n">hr</span> <span class="n">hsb</span> <span class="n">ht</span> <span class="n">hu</span> <span class="n">hy</span> <span class="n">ia</span> <span class="nb">id</span> <span class="n">ie</span> <span class="n">ilo</span> <span class="n">io</span>
|
|
|
<span class="ow">is</span> <span class="n">it</span> <span class="n">ja</span> <span class="n">jbo</span> <span class="n">jv</span> <span class="n">ka</span> <span class="n">kk</span> <span class="n">km</span> <span class="n">kn</span> <span class="n">ko</span> <span class="n">krc</span> <span class="n">ku</span> <span class="n">kv</span> <span class="n">kw</span> <span class="n">ky</span> <span class="n">la</span> <span class="n">lb</span> <span class="n">lez</span> <span class="n">li</span> <span class="n">lmo</span> <span class="n">lo</span> <span class="n">lrc</span> <span class="n">lt</span> <span class="n">lv</span>
|
|
|
<span class="n">mai</span> <span class="n">mg</span> <span class="n">mhr</span> <span class="nb">min</span> <span class="n">mk</span> <span class="n">ml</span> <span class="n">mn</span> <span class="n">mr</span> <span class="n">mrj</span> <span class="n">ms</span> <span class="n">mt</span> <span class="n">mwl</span> <span class="n">my</span> <span class="n">myv</span> <span class="n">mzn</span> <span class="n">nah</span> <span class="n">nap</span> <span class="n">nds</span> <span class="n">ne</span> <span class="n">new</span> <span class="n">nl</span> <span class="n">nn</span>
|
|
|
<span class="n">no</span> <span class="n">oc</span> <span class="ow">or</span> <span class="n">os</span> <span class="n">pa</span> <span class="n">pam</span> <span class="n">pfl</span> <span class="n">pl</span> <span class="n">pms</span> <span class="n">pnb</span> <span class="n">ps</span> <span class="n">pt</span> <span class="n">qu</span> <span class="n">rm</span> <span class="n">ro</span> <span class="n">ru</span> <span class="n">rue</span> <span class="n">sa</span> <span class="n">sah</span> <span class="n">sc</span> <span class="n">scn</span> <span class="n">sco</span> <span class="n">sd</span>
|
|
|
<span class="n">sh</span> <span class="n">si</span> <span class="n">sk</span> <span class="n">sl</span> <span class="n">so</span> <span class="n">sq</span> <span class="n">sr</span> <span class="n">su</span> <span class="n">sv</span> <span class="n">sw</span> <span class="n">ta</span> <span class="n">te</span> <span class="n">tg</span> <span class="n">th</span> <span class="n">tk</span> <span class="n">tl</span> <span class="n">tr</span> <span class="n">tt</span> <span class="n">tyv</span> <span class="n">ug</span> <span class="n">uk</span> <span class="n">ur</span> <span class="n">uz</span> <span class="n">vec</span> <span class="n">vep</span>
|
|
|
<span class="n">vi</span> <span class="n">vls</span> <span class="n">vo</span> <span class="n">wa</span> <span class="n">war</span> <span class="n">wuu</span> <span class="n">xal</span> <span class="n">xmf</span> <span class="n">yi</span> <span class="n">yo</span> <span class="n">yue</span> <span class="n">zh</span>
|
|
|
</pre></div>
|
|
|
</div>
|
|
|
<p>The <a class="reference external" href="https://fasttext.cc/docs/en/language-identification.html">language identification model</a> is harmonized with SearXNG’s language
|
|
|
(locale) model. General conditions of SearXNG’s locale model are:</p>
|
|
|
<ol class="loweralpha simple">
|
|
|
<li><p>SearXNG’s locale of a query is passed to the
|
|
|
<a class="reference internal" href="searx.locales.html#searx.locales.get_engine_locale" title="searx.locales.get_engine_locale"><code class="xref py py-obj docutils literal notranslate"><span class="pre">searx.locales.get_engine_locale</span></code></a> to get a language and/or region
|
|
|
code that is used by an engine.</p></li>
|
|
|
<li><p>SearXNG and most of the engines do not support all the languages from the
language model, and there might also be a discrepancy in the ISO-639-3 and
ISO-639-2 handling (<a class="reference internal" href="searx.locales.html#searx.locales.get_engine_locale" title="searx.locales.get_engine_locale"><code class="xref py py-obj docutils literal notranslate"><span class="pre">searx.locales.get_engine_locale</span></code></a>).
Furthermore, in SearXNG locales like <code class="docutils literal notranslate"><span class="pre">zh-TW</span></code> (<code class="docutils literal notranslate"><span class="pre">zh-CN</span></code>) are mapped to
<code class="docutils literal notranslate"><span class="pre">zh_Hant</span></code> (<code class="docutils literal notranslate"><span class="pre">zh_Hans</span></code>).</p></li>
|
|
|
</ol>
|
|
|
<p>Conclusion: This plugin only auto-detects the languages a user can select in
|
|
|
the language menu (<a class="reference internal" href="#searx.plugins.autodetect_search_language.supported_langs" title="searx.plugins.autodetect_search_language.supported_langs"><code class="xref py py-obj docutils literal notranslate"><span class="pre">supported_langs</span></code></a>).</p>
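<p>The restriction to selectable languages can be sketched as a simple membership test. This is only an illustration: the set below is a small subset of the <code class="docutils literal notranslate"><span class="pre">supported_langs</span></code> value documented on this page, and the function name <code class="docutils literal notranslate"><span class="pre">keep_if_supported</span></code> is hypothetical.</p>

```python
from typing import Optional, Set

# Small subset of the documented supported_langs value, for illustration only.
SUPPORTED_LANGS: Set[str] = {"de", "en", "fr", "zh"}


def keep_if_supported(
    detected: Optional[str], supported: Set[str] = SUPPORTED_LANGS
) -> Optional[str]:
    """Return the detected code only if a user could also select it."""
    if detected in supported:
        return detected
    return None


print(keep_if_supported("de"))   # selectable language, kept
print(keep_if_supported("als"))  # known to the model, but not selectable
```

<p>A code such as <code class="docutils literal notranslate"><span class="pre">als</span></code> is known to the language identification model but is dropped here because it is not part of the supported set.</p>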
|
|
|
<p>SearXNG’s locale of a query comes from (<em>highest wins</em>):</p>
|
|
|
<ol class="arabic simple">
|
|
|
<li><p>The <code class="docutils literal notranslate"><span class="pre">Accept-Language</span></code> header from the user’s HTTP client.</p></li>
<li><p>The user selects a locale in the preferences.</p></li>
<li><p>The user selects a locale from the menu in the query form (e.g. <code class="docutils literal notranslate"><span class="pre">:zh-TW</span></code>).</p></li>
<li><p>This plugin is activated in the preferences and the locale (only the language
code, no region code) comes from fastText’s language detection.</p></li>
|
|
|
</ol>
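<p>The precedence above can be sketched as follows. This is not SearXNG's implementation: the function name and its parameters are hypothetical, each source is assumed to be an already parsed locale string or <code class="docutils literal notranslate"><span class="pre">None</span></code>, and reading the list so that a later entry overrides an earlier one is an assumption.</p>

```python
from typing import Optional


def resolve_query_locale(
    accept_language: Optional[str] = None,  # 1. Accept-Language header
    preference: Optional[str] = None,       # 2. locale from the preferences
    query_form: Optional[str] = None,       # 3. locale selected in the query form
    detected: Optional[str] = None,         # 4. language detected by this plugin
) -> Optional[str]:
    """Return the locale of the highest-numbered source that is set."""
    result = None
    # Walk the sources from lowest to highest priority; later ones override.
    for candidate in (accept_language, preference, query_form, detected):
        if candidate is not None:
            result = candidate
    return result


# The user selects de-DE in the query form, but the detection reports English.
print(resolve_query_locale("en-US", "fr-FR", "de-DE", "en"))
# Without a detection result, the selection from the query form is used.
print(resolve_query_locale("en-US", "fr-FR", "de-DE"))
```

<p>Under these assumptions, the first call returns the detected language even though the user explicitly selected another locale.</p>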
|
|
|
<p>Conclusion: There can be a conflict between the language selected by the user and
the language detected by this plugin. For example, the user
|
|
|
explicitly selects the German locale via the search syntax to search for a term
|
|
|
that is identified as an English term (try <code class="docutils literal notranslate"><span class="pre">:de-DE</span> <span class="pre">thermomix</span></code>, for example).</p>
|
|
|
<div class="admonition hint">
|
|
|
<p class="admonition-title">Hint</p>
|
|
|
<p>To SearXNG maintainers, please take into account: under some circumstances
the language auto-detection of this plugin could be detrimental to
users’ expectations. It is not recommended to activate this plugin by
|
|
|
default. It should always be the user’s decision whether to activate this
|
|
|
plugin or not.</p>
|
|
|
</div>
|
|
|
<dl class="py data">
|
|
|
<dt class="sig sig-object py" id="searx.plugins.autodetect_search_language.supported_langs">
|
|
|
<span class="sig-prename descclassname"><span class="pre">searx.plugins.autodetect_search_language.</span></span><span class="sig-name descname"><span class="pre">supported_langs</span></span><em class="property"><span class="w"> </span><span class="p"><span class="pre">=</span></span><span class="w"> </span><span class="pre">{'af',</span> <span class="pre">'ar',</span> <span class="pre">'be',</span> <span class="pre">'bg',</span> <span class="pre">'ca',</span> <span class="pre">'cs',</span> <span class="pre">'da',</span> <span class="pre">'de',</span> <span class="pre">'el',</span> <span class="pre">'en',</span> <span class="pre">'es',</span> <span class="pre">'et',</span> <span class="pre">'fa',</span> <span class="pre">'fi',</span> <span class="pre">'fil',</span> <span class="pre">'fr',</span> <span class="pre">'he',</span> <span class="pre">'hi',</span> <span class="pre">'hr',</span> <span class="pre">'hu',</span> <span class="pre">'id',</span> <span class="pre">'is',</span> <span class="pre">'it',</span> <span class="pre">'ja',</span> <span class="pre">'ko',</span> <span class="pre">'lt',</span> <span class="pre">'lv',</span> <span class="pre">'nl',</span> <span class="pre">'no',</span> <span class="pre">'pl',</span> <span class="pre">'pt',</span> <span class="pre">'ro',</span> <span class="pre">'ru',</span> <span class="pre">'sk',</span> <span class="pre">'sl',</span> <span class="pre">'sr',</span> <span class="pre">'sv',</span> <span class="pre">'sw',</span> <span class="pre">'th',</span> <span class="pre">'tr',</span> <span class="pre">'uk',</span> <span class="pre">'vi',</span> <span class="pre">'zh'}</span></em><a class="headerlink" href="#searx.plugins.autodetect_search_language.supported_langs" title="Permalink to this definition">¶</a></dt>
|
|
|
<dd><p>Languages supported by most SearXNG engines (<code class="xref py py-obj docutils literal notranslate"><span class="pre">searx.languages.language_codes</span></code>).</p>
|
|
|
</dd></dl>
|
|
|
|
|
|
</section>
|
|
|
|
|
|
|
|
|
<div class="clearer"></div>
|
|
|
</div>
|
|
|
</div>
|
|
|
</div>
|
|
|
|
|
|
<div class="clearer"></div>
|
|
|
</div>
|
|
|
|
|
|
<div class="footer" role="contentinfo">
|
|
|
© Copyright 2021 SearXNG team, 2015-2021 Adam Tauber, Noémi Ványi.
|
|
|
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 5.3.0.
|
|
|
</div>
|
|
|
<script src="../_static/version_warning_offset.js"></script>
|
|
|
|
|
|
</body>
|
|
|
</html> |