langchain/tests/integration_tests
zhaoshengbo ab44c24333
Add Alibaba Cloud OpenSearch as a new vector store (#6154)
Hello Folks,

Thanks for creating and maintaining this great project. I'm excited to
submit this PR to add Alibaba Cloud OpenSearch as a new vector store.

OpenSearch is a one-stop platform to develop intelligent search
services. OpenSearch was built based on the large-scale distributed
search engine developed by Alibaba. OpenSearch serves more than 500
business cases in Alibaba Group and thousands of Alibaba Cloud
customers. OpenSearch helps develop search services in different search
scenarios, including e-commerce, O2O, multimedia, the content industry,
communities and forums, and big data query in enterprises.

OpenSearch provides the vector search feature. In specific scenarios,
especially test question search and image search scenarios, you can use
the vector search feature together with the multimodal search feature to
improve the accuracy of search results.


This PR includes:

- An AlibabaCloudOpenSearch class that can connect to an Alibaba Cloud
  OpenSearch instance.
- Adding embeddings and metadata to an OpenSearch data source.
- Querying by squared Euclidean distance and metadata.
- Integration tests.
- An IPython notebook and docs.
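
For readers unfamiliar with the query mode mentioned above, here is a minimal, library-free sketch of ranking by squared Euclidean distance combined with an exact-match metadata filter. It is only an illustration of the idea, not the AlibabaCloudOpenSearch implementation; the names `Record`, `squared_euclidean`, and `search` are hypothetical and do not appear in this PR.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory record: (embedding, metadata, text).
Record = Tuple[List[float], Dict[str, str], str]


def squared_euclidean(a: List[float], b: List[float]) -> float:
    """Sum of squared component differences (no square root taken)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(
    records: List[Record],
    query: List[float],
    k: int = 2,
    metadata_filter: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return the texts of the k records closest to the query.

    Records are first restricted to those whose metadata matches every
    key/value pair in metadata_filter (assumed exact-match semantics),
    then sorted by ascending squared Euclidean distance.
    """
    wanted = metadata_filter or {}
    candidates = [
        record
        for record in records
        if all(record[1].get(key) == value for key, value in wanted.items())
    ]
    candidates.sort(key=lambda record: squared_euclidean(record[0], query))
    return [record[2] for record in candidates[:k]]


records: List[Record] = [
    ([0.0, 0.0], {"lang": "en"}, "a"),
    ([1.0, 1.0], {"lang": "zh"}, "b"),
    ([3.0, 4.0], {"lang": "en"}, "c"),
]

nearest = search(records, [0.0, 0.0], k=2)
nearest_en = search(records, [0.0, 0.0], k=2, metadata_filter={"lang": "en"})
```

Because the square root is monotonic, ranking by squared distance gives the same order as ranking by plain Euclidean distance while skipping the root computation. A real vector store would typically use an index rather than the linear scan shown here.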

I have read your contributing guidelines, and the checks below all
pass:

- [x]  make format
- [x]  make lint
- [x]  make coverage
- [x]  make test

---------

Co-authored-by: zhaoshengbo <shengbo.zsb@alibaba-inc.com>
2023-06-20 10:07:40 -07:00
agent Add Multi-CSV/DF support in CSV and DataFrame Toolkits (#5009) 2023-05-25 14:23:11 -07:00
cache feat: add Momento as a standard cache and chat message history provider (#5221) 2023-05-25 19:13:21 -07:00
callbacks Add support for tags (#5898) 2023-06-13 12:30:59 -07:00
chains fix neo4j schema query (#6381) 2023-06-19 22:48:35 -07:00
chat_models fix anthropic chat model mutating input list (#6457) 2023-06-19 21:30:52 -07:00
document_loaders Harrison/unstructured page number (#6464) 2023-06-19 22:31:43 -07:00
embeddings add dashscope text embedding (#5929) 2023-06-11 21:14:20 -07:00
examples feat: Add UnstructuredXMLLoader for .xml files (#5955) 2023-06-10 16:24:42 -07:00
llms Baseten integration (#5862) 2023-06-08 23:05:57 -07:00
memory feat: add Momento as a standard cache and chat message history provider (#5221) 2023-05-25 19:13:21 -07:00
prompts Cleanup integration test dir (#3308) 2023-04-21 09:44:09 -07:00
retrievers DocArray as a Retriever (#6031) 2023-06-17 09:09:33 -07:00
utilities ArxivAPIWrapper - doc_content_chars_max (#6063) 2023-06-15 22:16:42 -07:00
vectorstores Add Alibaba Cloud OpenSearch as a new vector store (#6154) 2023-06-20 10:07:40 -07:00
__init__.py
.env.example adding MongoDBAtlasVectorSearch (#5338) 2023-05-30 07:59:01 -07:00
conftest.py feat: improve pinecone tests (#2806) 2023-04-13 21:49:31 -07:00
test_document_transformers.py Contextual compression retriever (#2915) 2023-04-20 17:01:14 -07:00
test_nebulagraph.py Harrison/nebula graph (#5865) 2023-06-07 21:56:43 -07:00
test_nlp_text_splitters.py OptimizedPrompt -- k-shot example choice backed by semantic search (#91) 2022-11-09 21:15:42 -08:00
test_pdf_pagesplitter.py cleanup: unify 3 different pdf loaders, rename PagedPDFSplitter (#1615) 2023-03-13 23:06:50 -07:00
test_schema.py Add 'get_token_ids' method (#4784) 2023-05-22 13:17:26 +00:00
test_text_splitter.py chore: spedd up integration test by using smaller model (#6044) 2023-06-12 13:27:10 -07:00