Commit Graph

1169 Commits (searx_updates)

Author SHA1 Message Date
blob42 48e642c353 searx: update doc, rate limiter whitelist feature on searx merged 12 months ago
Lance Martin 370becdfc2
Add self query retriever example with MD header splitting (#6359)
Flesh out the notebook example for `MarkdownHeaderTextSplitter`
12 months ago
Lance Martin 2c97fbabbd
Update MD header text splitter notebook (#6339)
Highlight use case for maintaining header groups when splitting.
12 months ago
Harrison Chase a2bbe3dda4
Harrison/mmr support for opensearch (#6349)
Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>
12 months ago
Davis Chase 2eea5d4cb4
Add ignore vercel preview script (#6320)
Skip building the docs preview for any branch that doesn't start
with `__docs__`. This will eventually be updated to look at code diff
directories, but this is a patch for now.
12 months ago
Harrison Chase 680d6bbbf8 fix titles in documentation 12 months ago
Harrison Chase 8cfb52ddbb fix spelling 12 months ago
lonestriker 6f36f0f930
Add oobabooga/text-generation-webui support as a llm (#5997)
Add oobabooga/text-generation-webui support as an LLM. It currently
supports text-generation-webui's non-streaming API interface.
This allows users who already have text-gen running to use the same models
with langchain.

#### Before submitting

Simple usage, similar to the existing supported LLMs:

```python
from langchain.llms import TextGen

# Point the wrapper at a running text-generation-webui instance.
llm = TextGen(model_url="http://localhost:5000")
```
#### Who can review?

 @hwchase17 - project lead

---------

Co-authored-by: Hien Ngo <Hien.Ngo@adia.ae>
12 months ago
Saba Sturua 427551eabf
DocArray as a Retriever (#6031)
## DocArray as a Retriever

[DocArray](https://github.com/docarray/docarray) is an open-source tool
for managing your multi-modal data. It offers flexibility to store and
search through your data using various document index backends. This PR
introduces `DocArrayRetriever` - which works with any available backend
and serves as a retriever for Langchain apps.

Also, I added 2 notebooks:
- DocArray Backends: an intro to all 5 currently supported backends and how to
  initialize, index, and use them as a retriever
- DocArray Usage: showcases the additional search parameters you can
  pass to create versatile retrievers

Example:
```python
from docarray.index import InMemoryExactNNIndex
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.retrievers import DocArrayRetriever


# define document schema
class MyDoc(BaseDoc):
    description: str
    description_embedding: NdArray[1536]


embeddings = OpenAIEmbeddings()
# create documents
descriptions = ["description 1", "description 2"]
desc_embeddings = embeddings.embed_documents(texts=descriptions)
docs = DocList[MyDoc](
    [
        MyDoc(description=desc, description_embedding=embedding)
        for desc, embedding in zip(descriptions, desc_embeddings)
    ]
)

# initialize document index with data
db = InMemoryExactNNIndex[MyDoc](docs)

# create a retriever
retriever = DocArrayRetriever(
    index=db,
    embeddings=embeddings,
    search_field="description_embedding",
    content_field="description",
)

# find the relevant document
doc = retriever.get_relevant_documents("action movies")
print(doc)
```

#### Who can review?

@dev2049

---------

Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
12 months ago
Masafumi Mori 7bb437146d
fix links to prompt templates and example selectors (#6332)
Fixes # 
links to prompt templates and example selectors on the
[Prompts](https://python.langchain.com/docs/modules/model_io/prompts/)
page are invalid.

#### Before submitting
Just a small note: before opening this PR I tried to run `make docs_clean` and
the other related commands described
[here](https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md#build-documentation-locally),
but it gives me an error:
```bash
langchain % make docs_clean
Traceback (most recent call last):
  File "/Users/masafumi/Downloads/langchain/.venv/bin/make", line 5, in <module>
    from scripts.proto import main
ModuleNotFoundError: No module named 'scripts'
make: *** [docs_clean] Error 1
# Poetry (version 1.5.1)
# Python 3.9.13
```
I couldn't figure out how to fix this, so I didn't run those commands.
But the links should work.

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17

Similar issue #6323

Co-authored-by: masafumimori <m.masafumimori@outlook.com>
12 months ago
Francisco Ingham 83eea230f3
changed height in the nb example (#6327)
changed height in the example to a more reasonable number (from 9 feet
to 6 feet)
12 months ago
Harrison Chase af18413d97
Harrison/deeplake new features (#6263)
Co-authored-by: adilkhan <adilkhan.sarsen@nu.edu.kz>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
12 months ago
Davis Chase 6640293087
fix eval guide links (#6319) 12 months ago
ljeagle ad324a39ae
Improve the performance of add_texts interface and upgrade the AwaDB from 0.3.2 to 0.3.3 (#6316)
1. Changed the implementation of add_texts interface for the AwaDB
vector store in order to improve the performance
2. Upgrade the AwaDB from 0.3.2 to 0.3.3

---------

Co-authored-by: vincent <awadb.vincent@gmail.com>
12 months ago
Davis Chase 24b2af5218
nit (#6305) 12 months ago
Davis Chase 03b5891cf7
more redirect (#6314) 12 months ago
Davis Chase eaee492dbc
basic redirect (#6309) 12 months ago
Davis Chase 2f47e5c766
update api link (#6303) 12 months ago
Davis Chase d558bcfad8
rm ignore_vercel (#6302) 12 months ago
Davis Chase 87e502c6bc
Doc refactor (#6300)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Harrison Chase 6aafb46807
Harrison/openai functions (#6223)
Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>
12 months ago
Alon Roth 0013256e81
Support chat history persistence in AutoGPT (#5716)
**Short Description**
Added a new argument to the AutoGPT class that allows persisting the chat
history to a file.

**Changes**
1. Removed the `self.full_message_history: List[BaseMessage] = []`
2. Replaced it with `chat_history_memory`, which can take any subclass
of `BaseChatMessageHistory`

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Martin Antos 1913320cbe
Feature/add acreom loader (#5780)
Adds a new loader for [acreom](https://acreom.com) vaults. It's based on
the Obsidian loader, with some additional text processing for
acreom-specific Markdown elements.

 @eyurtsev please take a look!

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
12 months ago
Harrison Chase e82687ddf4
Harrison/use functions agent (#6185)
Co-authored-by: Francisco Ingham <24279597+fpingham@users.noreply.github.com>
12 months ago
Ryo Kanazawa 7d2b946d0b
Fix typo `pandocs` to `pandoc` (#6203)
Fixes https://github.com/hwchase17/langchain/issues/6204

### Context

A typo issue with `pandoc`.

#### Who can review?
@hwchase17
12 months ago
0xJordan c5a46e7435
feat: Add support for the Solidity language (#6054)
## Add Solidity programming language support for code splitter.

Twitter: @0xjord4n_

#### Who can review?

Tag maintainers/contributors who might be interested:
@hwchase17

12 months ago
Nuno Campos 17c4ec4812
Add docs for tags (#6155)
12 months ago
thiswillbeyourgithub 4a649e3b14
typo: 'following following' to 'following' (#6163)
Co-authored-by: thiswillbeyourgithub <github@32mail.33mail.com>
12 months ago
Maciej Bryński 8a44c879c6
Update readthedocs_documentation.ipynb (#6148)
Minor fix in documentation. 
Change URL in wget call to proper one.
12 months ago
Harrison Chase 6ac120f299
bump ver to 200 (#6130) 12 months ago
Harrison Chase e41f0b341c
add functions agent (#6113) 12 months ago
Harrison Chase 1281fdf0f2
Harrison/notebook functions (#6103) 12 months ago
Wenchen Li f9edf76e7c
Implement `max_marginal_relevance_search` in `VectorStore` of Pinecone (#6056)
This adds implementation of MMR search in pinecone; and I have two
semi-related observations about this vector store class:
- Maybe we should also have a
`similarity_search_by_vector_returning_embeddings` like in supabase, but
it's not in the base `VectorStore` class, so I didn't implement it
- Talking about the base class, there's
`similarity_search_with_relevance_scores`, but in pinecone it is called
`similarity_search_with_score`; maybe we should consider renaming it to
align with other `VectorStore` base and sub classes (or add that as an
alias for backward compatibility)
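
For readers unfamiliar with MMR, the selection rule can be sketched in plain Python. This is a minimal illustration of maximal marginal relevance in general, not the code from this PR; the function names and the `k` and `lambda_mult` parameters are assumptions chosen for the example.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates, balancing relevance and diversity.

    lambda_mult=1.0 ranks purely by similarity to the query; lower values
    penalize candidates that resemble ones already selected.
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected candidate.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected
```

In a vector store, `candidates` would typically be the embeddings of an initial, larger similarity search, and the returned indices pick the documents to keep.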

#### Who can review?

Tag maintainers/contributors who might be interested:
 - VectorStores / Retrievers / Memory - @dev2049
12 months ago
Lance Martin ee3d0513ad
Add tests and update notebook for MarkdownHeaderTextSplitter (#6069)
Add test and update notebook for `MarkdownHeaderTextSplitter`.
12 months ago
Julius Lipp 5b6bbf4ab2
Add embaas document extraction api endpoints (#6048)
# Introduces embaas document extraction api endpoints

In this PR, we add support for embaas document extraction endpoints to
Text Embedding Models (with LLMs, in different PRs coming). We currently
offer the MTEB leaderboard top performers, will continue to add top
embedding models, and will soon add support for customers to deploy their own
models. Additional documentation and information can be found
[here](https://embaas.io).

While developing this integration, I closely followed the patterns
established by other langchain integrations. Nonetheless, if there are
any aspects that require adjustments or if there's a better way to
present a new integration, let me know! :)

Additionally, I fixed some docs in the embeddings integration.

Related PR: #5976 

#### Who can review?
  DataLoaders
  - @eyurtsev
12 months ago
Lance Martin b023f0c0f2
Text splitter for Markdown files by header (#5860)
This creates a new kind of text splitter for markdown files.

The user can supply a set of headers that they want to split the file
on.

We define a new text splitter class, `MarkdownHeaderTextSplitter`, that
does a few things:

(1) For each line, it determines the associated set of user-specified
headers
(2) It groups lines with common headers into splits

See notebook for example usage and test cases.
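
The two steps above can be sketched in plain Python. This is a simplified illustration of the idea, not the `MarkdownHeaderTextSplitter` implementation; the function name, the `(marker, name)` tuple format, and the output shape are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def split_markdown_by_headers(
    text: str, headers_to_split_on: List[Tuple[str, str]]
) -> List[Dict]:
    """Group the lines of `text` by the headers in effect for each line."""
    # Check longer markers first so "##" is not mistaken for "#".
    markers = sorted(headers_to_split_on, key=lambda h: len(h[0]), reverse=True)
    active: Dict[str, Tuple[int, str]] = {}  # header name -> (level, title)
    splits: List[Dict] = []

    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        is_header = False
        for marker, name in markers:
            if stripped.startswith(marker + " "):
                level = len(marker)
                # A new header closes every header at the same or a deeper level.
                active = {n: v for n, v in active.items() if v[0] < level}
                active[name] = (level, stripped[len(marker):].strip())
                is_header = True
                break
        if is_header:
            continue
        # Step (1): the headers associated with this line.
        metadata = {n: title for n, (_, title) in active.items()}
        # Step (2): merge consecutive lines that share the same headers.
        if splits and splits[-1]["metadata"] == metadata:
            splits[-1]["content"] += "\n" + stripped
        else:
            splits.append({"content": stripped, "metadata": metadata})
    return splits
```

Calling it with `[("#", "h1"), ("##", "h2")]` yields one split per run of lines under the same header combination, each carrying its header titles as metadata.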
12 months ago
Harrison Chase 5922742d56 comment out 12 months ago
Harrison Chase 681ba6d520 embaas title 12 months ago
Ben Flast 7a5e36f3f5
Mongo db doc fix (#6042)
I missed a few errors in my initial fix @hwchase1.  Thanks!
12 months ago
Harrison Chase d1561b74eb
Harrison/cognitive search (#6011)
Co-authored-by: Fabrizio Ruocco <ruoccofabrizio@gmail.com>
12 months ago
wenmeng zhou bb7ac9edb5
add dashscope text embedding (#5929)
#### What I do
Adding embedding api for
[DashScope](https://help.aliyun.com/product/610100.html), which is the
DAMO Academy's multilingual text unified vector model based on the LLM
base. It caters to multiple mainstream languages worldwide and offers
high-quality vector services, helping developers quickly transform text
data into high-quality vector data. Currently supported languages
include Chinese, English, Spanish, French, Portuguese, Indonesian, and
more.

#### Who can review?

  Models
  - @hwchase17
  - @agola11

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
Ben Flast 010d0bfeea
Update MongoDB Atlas support docs (#6022)
Updating the MongoDB Atlas support docs. @hwchase17, let me know if you have
any questions.
12 months ago
Harrison Chase e05997c25e
Harrison/hologres (#6012)
Co-authored-by: Changgeng Zhao <changgeng@nyu.edu>
Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>
12 months ago
ju-bezdek 18f5c985d9
Langchain decorators (#6017)
Added a description of LangChain Decorators to the integration section

#### Who can review?

Tag maintainers/contributors who might be interested:

@hwchase17 

12 months ago
Harrison Chase a7227ee01b
Harrison/embaas (#6010)
Co-authored-by: Julius Lipp <43986145+juliuslipp@users.noreply.github.com>
12 months ago
Akhil Vempali d7d629911b
feat: Added filtering option to FAISS vectorstore (#5966)
Inspired by the filtering capability available in ChromaDB, this adds the
same functionality to the FAISS vectorstore. Since FAISS does
not have a built-in method of filtering, this uses the approach suggested in
this [thread](https://github.com/facebookresearch/faiss/issues/1079).
Langchain Issue inspiration:
https://github.com/hwchase17/langchain/issues/4572

- [x] Added filtering capability to semantic similarity and MMR
- [x] Added test cases for filtering in
`tests/integration_tests/vectorstores/test_faiss.py`
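
One common way to layer metadata filtering on top of an index that cannot filter natively is to over-fetch neighbors and drop the ones whose metadata does not match. The sketch below assumes that is the general idea here; it is not the code from this PR, and `filtered_search`, `fetch_k`, and the hit tuple layout are hypothetical names used only for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical hit layout for this example: (text, distance, metadata).
Hit = Tuple[str, float, Dict[str, Any]]


def filtered_search(
    search: Callable[[int], List[Hit]],
    k: int,
    metadata_filter: Dict[str, Any],
    fetch_k: int = 20,
) -> List[Hit]:
    """Return up to k hits whose metadata matches every key in metadata_filter.

    `search(n)` stands in for an unfiltered nearest-neighbor query that
    returns the n closest hits, nearest first.
    """
    hits = search(fetch_k)  # over-fetch, since some hits will be discarded
    matches = [
        hit
        for hit in hits
        if all(hit[2].get(key) == value for key, value in metadata_filter.items())
    ]
    return matches[:k]
```

The trade-off of this style of filtering is that fewer than `k` results can come back when too few of the `fetch_k` neighbors match, so `fetch_k` usually needs to be noticeably larger than `k`.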

#### Who can review?

Tag maintainers/contributors who might be interested:

  VectorStores / Retrievers / Memory
  - @dev2049
  - @hwchase17
12 months ago
Ikko Eltociear Ashimine c868a3eef3
Update databricks.md (#6006)
HuggingFace -> Hugging Face


12 months ago
Satheesh Valluru d2270a2261
Fix: Grammer fix in documentation (#5925)
Fix for grammatical errors in the documentation of `vectorstore`.  
@vowelparrot
12 months ago
Ofer Mendelevitch f8cf09a230
Update to Vectara integration (#5950)
This PR updates the Vectara integration (@hwchase17 ):
* Adds reuse of requests.session to improve efficiency and speed.
* Utilizes Vectara's low-level API (instead of standard API) to better
match user's specific chunking with LangChain
* Now add_texts puts all the texts into a single Vectara document so
indexing is much faster.
* Updated variable names from alpha to lambda_val (to be consistent
with Vectara docs) and added n_context_sentence so it's available to use
if needed.
* Updates to documentation and tests

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
12 months ago
qued e4224a396b
feat: Add `UnstructuredXMLLoader` for `.xml` files (#5955)
# Unstructured XML Loader
Adds an `UnstructuredXMLLoader` class for .xml files. Works with
unstructured>=0.6.7. A plain text representation of the text with the
XML tags will be available under the `page_content` attribute in the
doc.

### Testing
```python
from langchain.document_loaders import UnstructuredXMLLoader

loader = UnstructuredXMLLoader(
    "example_data/factbook.xml",
)
docs = loader.load()
```


## Who can review?

@hwchase17 
@eyurtsev
12 months ago