Support Redis Sentinel database connections (#5196)

# Support Redis Sentinel database connections

This PR adds support for connecting not only to Redis standalone servers
but also to High Availability replication sets
(https://redis.io/docs/management/sentinel/).
A Redis replica set has one master that accepts writes and 2+ replicas
with read-only access to the data. The additional Redis Sentinel
instances monitor all servers and reconfigure the read-write master on
the fly if it becomes unavailable.

Therefore all connections must be made through the Sentinels, which are
queried for the current master to obtain a read-write connection. This PR
adds basic support for a redis connection url that specifies a Sentinel
instead of a Redis server.
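
For readers unfamiliar with the Sentinel flow, a minimal sketch with redis-py follows. It assumes a Sentinel listening on `localhost:26379` and a monitored service named `mymaster`. Both values and the function name `connect_via_sentinel` are illustrative, and this is not the code added by this PR.

```python
from typing import Any


def connect_via_sentinel(
    host: str = "localhost", port: int = 26379, service_name: str = "mymaster"
) -> Any:
    """Ask a Sentinel for the current master and return a client for it."""
    # imported lazily so the sketch can be loaded without redis-py installed
    from redis.sentinel import Sentinel

    # the Sentinel object only knows the Sentinel address(es), not the master
    sentinel = Sentinel([(host, port)], socket_timeout=0.5)
    # master_for() returns a client that looks up the current read-write
    # master through the Sentinel when it connects
    return sentinel.master_for(service_name, socket_timeout=0.5)
```

The returned object behaves like a regular redis-py client, which is why callers that previously used `redis.Redis.from_url` need no further changes.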

The Redis documentation and the Jupyter notebook with Redis examples are
updated to describe how to connect to a Redis replica set with Sentinels.


Remark: I did not find existing test cases for Redis server connections
to which new cases could be added. Therefore I tested the new utility
class locally with different kinds of setups to make sure the different
connection urls work as expected. No test cases are included in this PR.
sseide 2023-07-17 16:18:51 +02:00 committed by GitHub
parent 2e47412073
commit 25e3d3f283
6 changed files with 268 additions and 30 deletions

View File

@ -8,6 +8,36 @@ It is broken into two parts: installation and setup, and then references to spec
## Wrappers
All wrappers that need a redis url connection string to connect to the database support either a standalone Redis server
or a High-Availability setup with Replication and Redis Sentinels.
### Redis Standalone connection url
For a standalone Redis server, the official redis connection url formats can be used as described in the python redis module's
"from_url()" method [Redis.from_url](https://redis-py.readthedocs.io/en/stable/connections.html#redis.Redis.from_url)
Example: `redis_url = "redis://:secret-pass@localhost:6379/0"`
### Redis Sentinel connection url
For [Redis sentinel setups](https://redis.io/docs/management/sentinel/) the connection scheme is "redis+sentinel".
This is an unofficial extension to the official IANA registered protocol schemes, used as long as there is no official
connection url for Sentinels available.
Example: `redis_url = "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"`
The format is `redis+sentinel://[[username]:[password]]@[host-or-ip]:[port]/[service-name]/[db-number]`
with the default values of "service-name = mymaster" and "db-number = 0" if not set explicitly.
The service-name is the redis server monitoring group name as configured within the Sentinel.
The current url format limits the connection string to one sentinel host only (no list can be given) and
both Redis server and sentinel must have the same password set (if used).
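
As a rough illustration of how the parts of such a url map to connection settings, the following sketch splits a `redis+sentinel` url with Python's standard `urllib.parse`. The helper name `split_sentinel_url` is hypothetical and not part of LangChain. The defaults (`mymaster`, db `0`) follow the description above, and `26379` is assumed as the default Sentinel port.

```python
from typing import Any, Dict
from urllib.parse import urlparse


def split_sentinel_url(redis_url: str) -> Dict[str, Any]:
    """Split a redis+sentinel url into its parts (hypothetical helper)."""
    parsed = urlparse(redis_url)
    # the path looks like "/mymaster/0": service name first, db number second
    parts = [p for p in parsed.path.split("/") if p]
    return {
        "host": parsed.hostname or "localhost",
        "port": parsed.port or 26379,
        # an empty user name (as in "://:password@host") is reported as None
        "username": parsed.username or None,
        "password": parsed.password or None,
        "service_name": parts[0] if len(parts) > 0 else "mymaster",
        "db": int(parts[1]) if len(parts) > 1 else 0,
    }


settings = split_sentinel_url(
    "redis+sentinel://:secret-pass@sentinel-host:26379/mymaster/0"
)
```

Because the url carries a single set of credentials, the same user name and password end up being used for both the Sentinel and the Redis server.
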
### Redis Cluster connection url
Redis Cluster is currently not supported by any method that requires a "redis_url" parameter.
The only way to use a Redis Cluster is with LangChain classes that accept a preconfigured Redis client, like `RedisCache`
(example below).
### Cache
The Cache wrapper allows for [Redis](https://redis.io) to be used as a remote, low-latency, in-memory cache for LLM prompts and responses.

View File

@ -8,7 +8,11 @@
"\n",
">[Redis (Remote Dictionary Server)](https://en.wikipedia.org/wiki/Redis) is an in-memory data structure store, used as a distributed, in-memory keyvalue database, cache and message broker, with optional durability.\n",
"\n",
"This notebook shows how to use functionality related to the [Redis vector database](https://redis.com/solutions/use-cases/vector-database/)."
"This notebook shows how to use functionality related to the [Redis vector database](https://redis.com/solutions/use-cases/vector-database/).\n",
"\n",
"As database either Redis standalone server or Redis Sentinel HA setups are supported for connections with the \"redis_url\"\n",
"parameter. More information about the different formats of the redis connection url can be found in the LangChain\n",
"[Redis Readme](../../../../integrations/redis.md) file"
]
},
{
@ -258,6 +262,52 @@
"source": [
"Redis.delete(keys, redis_url=\"redis://localhost:6379\")"
]
},
{
"cell_type": "markdown",
"source": [
"### Redis connection Url examples\n",
"\n",
"Valid Redis Url scheme are:\n",
"1. `redis://` - Connection to Redis standalone, unencrypted\n",
"2. `rediss://` - Connection to Redis standalone, with TLS encryption\n",
"3. `redis+sentinel://` - Connection to Redis server via Redis Sentinel, unencrypted\n",
"4. `rediss+sentinel://` - Connection to Redis server via Redis Sentinel, booth connections with TLS encryption\n",
"\n",
"More information about additional connection parameter can be found in the redis-py documentation at https://redis-py.readthedocs.io/en/stable/connections.html"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"# connection to redis standalone at localhost, db 0, no password\n",
"redis_url=\"redis://localhost:6379\"\n",
"# connection to host \"redis\" port 7379 with db 2 and password \"secret\" (old style authentication scheme without username / pre 6.x)\n",
"redis_url=\"redis://:secret@redis:7379/2\"\n",
"# connection to host redis on default port with user \"joe\", pass \"secret\" using redis version 6+ ACLs\n",
"redis_url=\"redis://joe:secret@redis/0\"\n",
"\n",
"# connection to sentinel at localhost with default group mymaster and db 0, no password\n",
"redis_url=\"redis+sentinel://localhost:26379\"\n",
"# connection to sentinel at host redis with default port 26379 and user \"joe\" with password \"secret\" with default group mymaster and db 0\n",
"redis_url=\"redis+sentinel://joe:secret@redis\"\n",
"# connection to sentinel, no auth with sentinel monitoring group \"zone-1\" and database 2\n",
"redis_url=\"redis+sentinel://redis:26379/zone-1/2\"\n",
"\n",
"# connection to redis standalone at localhost, db 0, no password but with TLS support\n",
"redis_url=\"rediss://localhost:6379\"\n",
"# connection to redis sentinel at localhost and default port, db 0, no password\n",
"# but with TLS support for booth Sentinel and Redis server\n",
"redis_url=\"rediss+sentinel://localhost\"\n"
],
"metadata": {
"collapsed": false
}
}
],
"metadata": {

View File

@ -6,6 +6,7 @@ from langchain.schema import (
BaseChatMessageHistory,
)
from langchain.schema.messages import BaseMessage, _message_to_dict, messages_from_dict
from langchain.utilities.redis import get_client
logger = logging.getLogger(__name__)
@ -29,7 +30,7 @@ class RedisChatMessageHistory(BaseChatMessageHistory):
)
try:
self.redis_client = redis.Redis.from_url(url=url)
self.redis_client = get_client(redis_url=url)
except redis.exceptions.ConnectionError as error:
logger.error(error)

View File

@ -15,6 +15,7 @@ from langchain.memory.utils import get_prompt_input_key
from langchain.schema import BasePromptTemplate
from langchain.schema.language_model import BaseLanguageModel
from langchain.schema.messages import BaseMessage, get_buffer_string
from langchain.utilities.redis import get_client
logger = logging.getLogger(__name__)
@ -99,7 +100,7 @@ class RedisEntityStore(BaseEntityStore):
super().__init__(*args, **kwargs)
try:
self.redis_client = redis.Redis.from_url(url=url, decode_responses=True)
self.redis_client = get_client(redis_url=url, decode_responses=True)
except redis.exceptions.ConnectionError as error:
logger.error(error)

View File

@ -0,0 +1,140 @@
from __future__ import annotations

import logging
from typing import (
    TYPE_CHECKING,
    Any,
)
from urllib.parse import urlparse

if TYPE_CHECKING:
    from redis.client import Redis as RedisType

logger = logging.getLogger(__name__)


def get_client(redis_url: str, **kwargs: Any) -> RedisType:
    """Get a redis client from the connection url given. This helper accepts
    urls for Redis server (TCP with/without TLS or UnixSocket) as well as
    Redis Sentinel connections.

    Redis Cluster is not supported.

    Before creating a connection the existence of the database driver is checked
    and a ValueError is raised if it is missing.

    To use, you should have the ``redis`` python package installed.

    Example:
        .. code-block:: python

            from langchain.utilities.redis import get_client
            redis_client = get_client(
                redis_url="redis://username:password@localhost:6379"
            )

    To use a redis replication setup with multiple redis servers and redis sentinels
    set "redis_url" to "redis+sentinel://" scheme. With this url format a path is
    needed holding the name of the redis service within the sentinels to get the
    correct redis server connection. The default service name is "mymaster". The
    optional second part of the path is the redis db number to connect to.

    An optional username or password is used for both connections to the redis
    server and the sentinel, different passwords for server and sentinel are not
    supported. And as another constraint only one sentinel instance can be given:

    Example:
        .. code-block:: python

            from langchain.utilities.redis import get_client
            redis_client = get_client(
                redis_url="redis+sentinel://username:password@sentinelhost:26379/mymaster/0"
            )
    """

    # Initialize with necessary components.
    try:
        import redis
    except ImportError:
        raise ValueError(
            "Could not import redis python package. "
            "Please install it with `pip install redis>=4.1.0`."
        )

    # check if normal redis:// or redis+sentinel:// url
    if redis_url.startswith("redis+sentinel"):
        redis_client = _redis_sentinel_client(redis_url, **kwargs)
    elif redis_url.startswith("rediss+sentinel"):  # sentinel with TLS support enabled
        kwargs["ssl"] = True
        if "ssl_cert_reqs" not in kwargs:
            kwargs["ssl_cert_reqs"] = "none"
        redis_client = _redis_sentinel_client(redis_url, **kwargs)
    else:
        # connect to redis server from url
        redis_client = redis.from_url(redis_url, **kwargs)
    return redis_client


def _redis_sentinel_client(redis_url: str, **kwargs: Any) -> RedisType:
    """Helper method to parse an (unofficial) redis+sentinel url
    and create a Sentinel connection to fetch the final redis client
    connection to a replica-master for read-write operations.

    If username and/or password for authentication is given the
    same credentials are used for the Redis Sentinel as well as Redis Server.
    With this implementation using a redis url only it is not possible
    to use different data for authentication on both systems.
    """
    import redis

    parsed_url = urlparse(redis_url)
    # sentinel needs list with (host, port) tuple, use default port if none available
    sentinel_list = [(parsed_url.hostname or "localhost", parsed_url.port or 26379)]
    if parsed_url.path:
        # "/mymaster/0" first part is service name, optional second part is db number
        path_parts = parsed_url.path.split("/")
        service_name = path_parts[1] or "mymaster"
        if len(path_parts) > 2:
            kwargs["db"] = path_parts[2]
    else:
        service_name = "mymaster"

    sentinel_args = {}
    if parsed_url.password:
        sentinel_args["password"] = parsed_url.password
        kwargs["password"] = parsed_url.password
    if parsed_url.username:
        sentinel_args["username"] = parsed_url.username
        kwargs["username"] = parsed_url.username

    # check for all SSL related properties and copy them into sentinel_kwargs too,
    # add client_name also
    for arg in kwargs:
        if arg.startswith("ssl") or arg == "client_name":
            sentinel_args[arg] = kwargs[arg]

    # sentinel user/pass is part of sentinel_kwargs, user/pass for redis server
    # connection as direct parameter in kwargs
    sentinel_client = redis.sentinel.Sentinel(
        sentinel_list, sentinel_kwargs=sentinel_args, **kwargs
    )

    # redis server might have password but not sentinel - fetch this error and try
    # again without pass, everything else cannot be handled here -> user needed
    try:
        sentinel_client.execute_command("ping")
    except redis.exceptions.AuthenticationError as ae:
        if "no password is set" in ae.args[0]:
            logger.warning(
                "Redis sentinel connection configured with password but Sentinel "
                "answered NO PASSWORD NEEDED - Please check Sentinel configuration"
            )
            sentinel_client = redis.sentinel.Sentinel(sentinel_list, **kwargs)
        else:
            raise ae

    return sentinel_client.master_for(service_name)
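
The scheme handling in `get_client` can be summarised by the following standalone sketch. The function `describe_redis_url` is hypothetical and not part of this PR. It only mirrors the dispatch on the url prefix and opens no connection.

```python
from typing import Tuple


def describe_redis_url(redis_url: str) -> Tuple[str, bool]:
    """Return (connection kind, TLS enabled) for a redis url (illustrative only)."""
    # the TLS variant is checked first, although the two sentinel prefixes
    # ("redis+sentinel" and "rediss+sentinel") cannot match the same url
    if redis_url.startswith("rediss+sentinel://"):
        return ("sentinel", True)
    if redis_url.startswith("redis+sentinel://"):
        return ("sentinel", False)
    if redis_url.startswith("rediss://"):
        return ("standalone", True)
    # everything else (e.g. redis:// or unix://) is treated as a plain
    # standalone url that redis-py's from_url() parses itself
    return ("standalone", False)


kinds = [
    describe_redis_url(url)
    for url in (
        "redis://localhost:6379",
        "rediss://localhost:6379",
        "redis+sentinel://localhost:26379",
        "rediss+sentinel://localhost",
    )
]
```

Each url must match exactly one branch. If a sentinel url also fell through to the standalone branch, redis-py would reject the unknown scheme.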

View File

@ -28,6 +28,7 @@ from langchain.callbacks.manager import (
)
from langchain.docstore.document import Document
from langchain.embeddings.base import Embeddings
from langchain.utilities.redis import get_client
from langchain.utils import get_from_dict_or_env
from langchain.vectorstores.base import VectorStore, VectorStoreRetriever
@ -111,6 +112,24 @@ class Redis(VectorStore):
index_name="my-index",
embedding_function=embeddings.embed_query,
)
To use a redis replication setup with multiple redis servers and redis sentinels
set "redis_url" to "redis+sentinel://" scheme. With this url format a path is
needed holding the name of the redis service within the sentinels to get the
correct redis server connection. The default service name is "mymaster".
An optional username or password is used for both connections to the redis server
and the sentinel, different passwords for server and sentinel are not supported.
And as another constraint only one sentinel instance can be given:
Example:
.. code-block:: python
vectorstore = Redis(
redis_url="redis+sentinel://username:password@sentinelhost:26379/mymaster/0",
index_name="my-index",
embedding_function=embeddings.embed_query,
)
"""
def __init__(
@ -126,19 +145,10 @@ class Redis(VectorStore):
**kwargs: Any,
):
"""Initialize with necessary components."""
try:
import redis
except ImportError:
raise ValueError(
"Could not import redis python package. "
"Please install it with `pip install redis>=4.1.0`."
)
self.embedding_function = embedding_function
self.index_name = index_name
try:
# connect to redis from url
redis_client = redis.from_url(redis_url, **kwargs)
redis_client = get_client(redis_url=redis_url, **kwargs)
# check if redis has redisearch module installed
_check_redis_module_exist(redis_client, REDIS_REQUIRED_MODULES)
except ValueError as e:
@ -280,13 +290,13 @@ class Redis(VectorStore):
query (str): The query text for which to find similar documents.
k (int): The number of documents to return. Default is 4.
score_threshold (float): The minimum matching score required for a document
to be considered a match. Defaults to 0.2.
Because the similarity calculation algorithm is based on cosine similarity,
the smaller the angle, the higher the similarity.
to be considered a match. Defaults to 0.2.
Because the similarity calculation algorithm is based on cosine
similarity, the smaller the angle, the higher the similarity.
Returns:
List[Document]: A list of documents that are most similar to the query text,
including the match score for each document.
including the match score for each document.
Note:
If there are no documents that satisfy the score_threshold value,
@ -373,13 +383,16 @@ class Redis(VectorStore):
) -> Tuple[Redis, List[str]]:
"""Create a Redis vectorstore from raw documents.
This is a user-friendly interface that:
1. Embeds documents.
2. Creates a new index for the embeddings in Redis.
3. Adds the documents to the newly created Redis index.
4. Returns the keys of the newly created documents.
1. Embeds documents.
2. Creates a new index for the embeddings in Redis.
3. Adds the documents to the newly created Redis index.
4. Returns the keys of the newly created documents.
This is intended to be a quick way to get started.
Example:
.. code-block:: python
from langchain.vectorstores import Redis
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
@ -434,12 +447,15 @@ class Redis(VectorStore):
) -> Redis:
"""Create a Redis vectorstore from raw documents.
This is a user-friendly interface that:
1. Embeds documents.
2. Creates a new index for the embeddings in Redis.
3. Adds the documents to the newly created Redis index.
1. Embeds documents.
2. Creates a new index for the embeddings in Redis.
3. Adds the documents to the newly created Redis index.
This is intended to be a quick way to get started.
Example:
.. code-block:: python
from langchain.vectorstores import Redis
from langchain.embeddings import OpenAIEmbeddings
embeddings = OpenAIEmbeddings()
@ -481,7 +497,7 @@ class Redis(VectorStore):
raise ValueError("'ids' (keys)() were not provided.")
try:
import redis
import redis # noqa: F401
except ImportError:
raise ValueError(
"Could not import redis python package. "
@ -492,7 +508,7 @@ class Redis(VectorStore):
# otherwise passing it to Redis will result in an error.
if "redis_url" in kwargs:
kwargs.pop("redis_url")
client = redis.from_url(url=redis_url, **kwargs)
client = get_client(redis_url=redis_url, **kwargs)
except ValueError as e:
raise ValueError(f"Your redis connected error: {e}")
# Check if index exists
@ -522,7 +538,7 @@ class Redis(VectorStore):
"""
redis_url = get_from_dict_or_env(kwargs, "redis_url", "REDIS_URL")
try:
import redis
import redis # noqa: F401
except ImportError:
raise ValueError(
"Could not import redis python package. "
@ -533,7 +549,7 @@ class Redis(VectorStore):
# otherwise passing it to Redis will result in an error.
if "redis_url" in kwargs:
kwargs.pop("redis_url")
client = redis.from_url(url=redis_url, **kwargs)
client = get_client(redis_url=redis_url, **kwargs)
except ValueError as e:
raise ValueError(f"Your redis connected error: {e}")
# Check if index exists
@ -558,7 +574,7 @@ class Redis(VectorStore):
"""Connect to an existing Redis index."""
redis_url = get_from_dict_or_env(kwargs, "redis_url", "REDIS_URL")
try:
import redis
import redis # noqa: F401
except ImportError:
raise ValueError(
"Could not import redis python package. "
@ -569,7 +585,7 @@ class Redis(VectorStore):
# otherwise passing it to Redis will result in an error.
if "redis_url" in kwargs:
kwargs.pop("redis_url")
client = redis.from_url(url=redis_url, **kwargs)
client = get_client(redis_url=redis_url, **kwargs)
# check if redis has redisearch module installed
_check_redis_module_exist(client, REDIS_REQUIRED_MODULES)
# ensure that the index already exists