langchain/docs/integrations/myscale.md
Leonid Ganeline e2d7677526
docs: compound ecosystem and integrations (#4870)
# Docs: compound ecosystem and integrations

**Problem statement:** We have a big overlap between the
References/Integrations and Ecosystem/LongChain Ecosystem pages. It
confuses users. It creates a situation when new integration is added
only on one of these pages, which creates even more confusion.
- removed References/Integrations page (but move all its information
into the individual integration pages - in the next PR).
- renamed Ecosystem/LongChain Ecosystem into Integrations/Integrations.
I like the Ecosystem term. It is more generic and semantically richer
than the Integration term. But it mentally overloads users. The
`integration` term is more concrete.
UPDATE: after discussion, the Ecosystem is the term.
Ecosystem/Integrations is the page (in place of Ecosystem/LongChain
Ecosystem).

As a result, a user gets a single place to start with the individual
integration.
2023-05-18 09:29:57 -07:00

2.6 KiB

MyScale

This page covers how to use MyScale vector database within LangChain. It is broken into two parts: installation and setup, and then references to specific MyScale wrappers.

With MyScale, you can manage both structured and unstructured (vectorized) data, and perform joint queries and analytics on both types of data using SQL. Plus, MyScale's cloud-native OLAP architecture, built on top of ClickHouse, enables lightning-fast data processing even on massive datasets.

Introduction

Overview to MyScale and High performance vector search

You can now register on our SaaS and start a cluster now!

If you are also interested in how we managed to integrate SQL and vector, please refer to this document for further syntax reference.

We also deliver with live demo on huggingface! Please checkout our huggingface space! They search millions of vector within a blink!

Installation and Setup

  • Install the Python SDK with pip install clickhouse-connect

Setting up envrionments

There are two ways to set up parameters for myscale index.

  1. Environment Variables

    Before you run the app, please set the environment variable with export: export MYSCALE_URL='<your-endpoints-url>' MYSCALE_PORT=<your-endpoints-port> MYSCALE_USERNAME=<your-username> MYSCALE_PASSWORD=<your-password> ...

    You can easily find your account, password and other info on our SaaS. For details please refer to this document Every attributes under MyScaleSettings can be set with prefix MYSCALE_ and is case insensitive.

  2. Create MyScaleSettings object with parameters

    from langchain.vectorstores import MyScale, MyScaleSettings
    config = MyScaleSetting(host="<your-backend-url>", port=8443, ...)
    index = MyScale(embedding_function, config)
    index.add_documents(...)
    

Wrappers

supported functions:

  • add_texts
  • add_documents
  • from_texts
  • from_documents
  • similarity_search
  • asimilarity_search
  • similarity_search_by_vector
  • asimilarity_search_by_vector
  • similarity_search_with_relevance_scores

VectorStore

There exists a wrapper around MyScale database, allowing you to use it as a vectorstore, whether for semantic search or similar example retrieval.

To import this vectorstore:

from langchain.vectorstores import MyScale

For a more detailed walkthrough of the MyScale wrapper, see this notebook