Commit Graph

906 Commits

Author SHA1 Message Date
了空
f7e3d97b19
Remove unnecessary spaces from document object’s page_content of BiliBiliLoader (#4619)
- Remove unnecessary spaces from document object’s page_content of
BiliBiliLoader
- Fix BiliBiliLoader document and test file
2023-05-16 13:13:57 -04:00
Eugene Yurtsev
f47ec5b4b6
Docugami docs: First cell should be a title cell (#4735)
# Make first cell a title in docugami docs

This makes the first cell a title cell in docugami notebook
2023-05-16 13:12:14 -04:00
Harrison Chase
a7af32c274
Cassandra support for chat history (#4378) (#4764)
# Cassandra support for chat history

### Description

- Store chat messages in cassandra

### Dependency

- cassandra-driver - Python Module

## Before submitting

- Added Integration Test

## Who can review?

@hwchase17
@agola11

# Your PR Title (What it does)

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Fixes # (issue)

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:

<!-- For a quicker response, figure out the right person to tag with @

        @hwchase17 - project lead

        Tracing / Callbacks
        - @agola11

        Async
        - @agola11

        DataLoaders
        - @eyurtsev

        Models
        - @hwchase17
        - @agola11

        Agents / Tools / Toolkits
        - @vowelparrot
        
        VectorStores / Retrievers / Memory
        - @dev2049
        
 -->

Co-authored-by: Jinto Jose <129657162+jj701@users.noreply.github.com>
2023-05-15 23:43:09 -07:00
Leonid Ganeline
a6f3ec94bc
docs: added additional_resources folder (#4748)
# docs: added `additional_resources` folder

The additional resource files were inside the doc top-level folder,
which polluted the top-level folder.
- added the `additional_resources` folder and moved correspondent files
to this folder;
- fixed a broken link to the "Model comparison" page (model_laboratory
notebook)
- fixed a broken link to one of the YouTube videos (sorry, it is not
directly related to this PR)

## Who can review?

@dev2049
2023-05-15 17:12:47 -07:00
Zander Chase
580861e7f2
Revert "Make serpapi base url configurable via env (#4402)" (#4750)
This reverts commit 5111bec540.

This PR introduced a bug in the async API (the `url` param isn't bound);
it also didn't update the synchronous API correctly, which makes it
error-prone (the behavior of the async and sync endpoints would be
different)
2023-05-15 16:17:16 -07:00
shiyu22
21b9397342
Update the milvus example (#4706)
# Fix issue when running example

- add the query content
- update the `user` parameter with Zilliz

Signed-off-by: shiyu22 <shiyu.chen@zilliz.com>
2023-05-15 16:16:57 -07:00
Davis Chase
36c9fd1af7
Dev2049/docs edit0 (#4699) 2023-05-15 15:20:37 -07:00
Jinto Jose
1e467d9fc4
Jupyter Notebook Example for using Mongodb to store Chat Message History (#4436)
# Jupyter Notebook Example for using Mongodb Chat Message History

@dev2049
2023-05-15 14:33:42 -07:00
Leonid Ganeline
6060505a9d
Add new links to Tutorials and YouTube pages (#4746)
- added an official LangChain YouTube channel :)
- added new tutorials and videos (only videos with enough subscriber or
view numbers)
- added a "New video" icon 

## Who can review?

@dev2049
2023-05-15 14:32:48 -07:00
vinoyang
5111bec540
Make serpapi base url configurable via env (#4402)
Fixes #4328

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-15 14:25:25 -07:00
Roma
cb802edf75
[Feature] Add GraphQL Query Tool (#4409)
# Add GraphQL Query Support

This PR introduces a GraphQL API Wrapper tool that allows LLM agents to
query GraphQL databases. The tool utilizes the httpx and gql Python
packages to interact with GraphQL APIs and provides a simple interface
for running queries with LLM agents.

@vowelparrot

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-15 14:06:12 -07:00
Leonid Ganeline
70fd7cda14
docs: Concepts (#4734)
# glossary.md renamed as concepts.md and moved under the Getting Started

small PR.
`Concepts` looks right to the point. It is moved under Getting Started
(typical place). Previously it was lost in the Additional Resources
section.

## Who can review?

 @hwchase17
2023-05-15 11:09:25 -07:00
Harrison Chase
dd95f0892d
Harrison/add top k (#4707)
Co-authored-by: blc16 <benlc@umich.edu>
2023-05-15 09:09:22 -07:00
Eugene Yurtsev
3c490b5ba3
Docugami DataLoader (#4727)
### Adds a document loader for Docugami

Specifically:

1. Adds a data loader that talks to the [Docugami](http://docugami.com)
API to download processed documents as semantic XML
2. Parses the semantic XML into chunks, with additional metadata
capturing chunk semantics
3. Adds a detailed notebook showing how you can use additional metadata
returned by Docugami for techniques like the [self-querying
retriever](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query_retriever.html)
4. Adds an integration test, and related documentation

Here is an example of a result that is not possible without the
capabilities added by Docugami (from the notebook):

<img width="1585" alt="image"
src="https://github.com/hwchase17/langchain/assets/749277/bb6c1ce3-13dc-4349-a53b-de16681fdd5b">

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>
2023-05-15 10:53:00 -04:00
Lester Yang
cd3f9865f3
Feature: pdfplumber PDF loader with BaseBlobParser (#4552)
# Feature: pdfplumber PDF loader with BaseBlobParser

* Adds pdfplumber as a PDF loader
* Adds pdfplumber as a blob parser.
2023-05-15 09:47:02 -04:00
Harrison Chase
b6e3ac17c4
Harrison/sitemap local (#4704)
Co-authored-by: Lukas Bauer <lukas.bauer@mayflower.de>
2023-05-14 22:04:38 -07:00
Harrison Chase
12b4ee1fc7
Harrison/telegram chat loader (#4698)
Co-authored-by: Akinwande Komolafe <47945512+Sensei-akin@users.noreply.github.com>
Co-authored-by: Akinwande Komolafe <akhinoz@gmail.com>
2023-05-14 22:04:27 -07:00
Leonid Ganeline
2b181e5a6c
docs: tutorials are moved on the top-level of docs (#4464)
# Added Tutorials section on the top-level of documentation

**Problem Statement**: the Tutorials section in the documentation is
top-priority. Not every project has resources to make tutorials. We have
such a privilege. Community experts created several tutorials on
YouTube.
But the tutorial links are now hidden on the YouTube page and not easily
discovered by first-time visitors.

**PR**: I've created the `Tutorials` page (from the `Additional
Resources/YouTube` page) and moved it to the top level of documentation
in the `Getting Started` section.

## Who can review?

        @dev2049
 
NOTE:
PR checks are randomly failing

3aefaafcdb

258819eadf

514d81b5b3
2023-05-14 21:22:25 -07:00
Ashish Talati
372a5113ff
Update gallery.rst with chatpdf opensource (#4342) 2023-05-14 19:43:16 -07:00
Samuli Rauatmaa
66828ad231
add the existing OpenWeatherMap tool to the public api (#4292)
[OpenWeatherMapAPIWrapper](f70e18a5b3/docs/modules/agents/tools/examples/openweathermap.ipynb)
works wonderfully, but the _tool_ itself can't be used in master branch.

- added OpenWeatherMap **tool** to the public api, to be loadable with
`load_tools` by using "openweathermap-api" tool name (that name is used
in the existing
[docs](aff33d52c5/docs/modules/agents/tools/getting_started.md),
at the bottom of the page)
- updated OpenWeatherMap tool's **description** to make the input format
match what the API expects (e.g. `London,GB` instead of `'London,GB'`)
- added [ecosystem documentation page for
OpenWeatherMap](f9c41594fe/docs/ecosystem/openweathermap.md)
- added tool usage example to [OpenWeatherMap's
notebook](f9c41594fe/docs/modules/agents/tools/examples/openweathermap.ipynb)

Let me know if there's something I missed or something needs to be
updated! Or feel free to make edits yourself if that makes it easier for
you 🙂
2023-05-14 18:50:45 -07:00
Harrison Chase
6f47ab17a4
Harrison/param notion db (#4689)
Co-authored-by: Edward Park <ed.sh.park@gmail.com>
2023-05-14 18:26:25 -07:00
Harrison Chase
5d63fc65e1
add warning for combined memory (#4688) 2023-05-14 18:26:16 -07:00
Harrison Chase
a48810fb21
dont have openai_api_version by default (#4687)
an alternative to https://github.com/hwchase17/langchain/pull/4234/files
2023-05-14 18:26:08 -07:00
Harrison Chase
c48f1301ee oops remove api key, dont worried i cycled it 2023-05-14 17:40:31 -07:00
Harrison Chase
57b2f3ffe6
add rebuff (#4637) 2023-05-14 17:38:43 -07:00
Zander Chase
d85b04be7f
Add RELLM and JSONFormer experimental LLM decoding (#4185)
[RELLM](https://github.com/r2d4/rellm) is a library that wraps local
HuggingFace pipeline models for structured decoding.

RELLM works by generating tokens one at a time. At each step, it masks
tokens that don't conform to the provided partial regular expression.

[JSONFormer](https://github.com/1rgs/jsonformer) is a bit different, where it sequentially adds the keys then decodes each value directly
2023-05-14 22:40:03 +00:00
Harrison Chase
243886be93
Harrison/virtual time (#4658)
Co-authored-by: ifsheldon <39153080+ifsheldon@users.noreply.github.com>
Co-authored-by: maple.liang <maple.liang@gempoll.com>
2023-05-14 10:29:17 -07:00
Harrison Chase
ef49c659f6
add embedding router (#4644) 2023-05-13 21:47:01 -07:00
Harrison Chase
c09bb00959
Harrison/summary memory history (#4649)
Co-authored-by: engkheng <60956360+outday29@users.noreply.github.com>
2023-05-13 21:46:11 -07:00
Harrison Chase
44ae673388
Harrison/multithreading directory loader (#4650)
Co-authored-by: PawelFaron <42373772+PawelFaron@users.noreply.github.com>
Co-authored-by: Pawel Faron <ext-pawel.faron@vaisala.com>
2023-05-13 21:46:02 -07:00
Harrison Chase
873b0c7eb6
Harrison/structured chat mem (#4652)
Co-authored-by: d 3 n 7 <29033313+d3n7@users.noreply.github.com>
2023-05-13 21:45:42 -07:00
Harrison Chase
279605b4d3
Harrison/metaphor search (#4657)
Co-authored-by: Jeffrey Wang <jeffreyzhiyuanwang@gmail.com>
2023-05-13 21:45:05 -07:00
Harrison Chase
9aa9fe7021
Harrison/spark connect example (#4659)
Co-authored-by: Mike Wang <62768671+skcoirz@users.noreply.github.com>
2023-05-13 21:44:54 -07:00
Leonid Ganeline
3ce78ef6c4
docs: document_loaders classification (#4069)
**Problem statement:** the
[document_loaders](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html#)
section is too long and hard to comprehend.
**Proposal:** group document_loaders by 3 classes: (see `Files changed`
tab)

UPDATE: I've completely reworked the document_loader classification.
Now this PR changes only one file! 

FYI @eyurtsev @hwchase17
2023-05-13 19:17:32 -07:00
Harrison Chase
1e322ffc1c change heading 2023-05-13 09:52:23 -07:00
Davis Chase
9ab7101182
WIP: FLARE-inspired chain (#4612)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-05-13 09:28:28 -07:00
Harrison Chase
daa3e6dedb
Harrison/prompt constructor methods (#4616) 2023-05-13 09:23:51 -07:00
Harrison Chase
6265cbfb11
Harrison/standard llm interface (#4615) 2023-05-13 09:05:31 -07:00
Harrison Chase
7d425cbf38
improve sql prompt (#4611)
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>
2023-05-12 21:55:03 -07:00
Tim Asp
ed0d557ede
docs: fix pdf docs hierarchy and formatting (#4593)
# Fix pdf loader docs page


![image](https://github.com/hwchase17/langchain/assets/707699/4a11f379-00ed-4f7a-9870-71f74e0cadc6)

Using h1's messes with hierarchy, this fixes that, and moves the
PyPDFium2 loader out of the middle of PDFMiner docs
2023-05-12 15:03:01 -04:00
Zander Chase
d96f6a106b
Add Steamship Image Generation Tool (#4580)
Co-authored-by: Enias Cailliau <enias@steamship.com>
2023-05-12 10:35:01 -07:00
Davis Chase
a4a9d1f403
Improve vespa interface (#4546)
![Screenshot 2023-05-11 at 7 50 31
PM](https://github.com/hwchase17/langchain/assets/130488702/bc8ab4bb-8006-44fc-ba07-df54e84ee2c1)
2023-05-12 10:11:26 -07:00
Neil Ruaro
3a2855945b
added documentation on retrieving a PG vectorstore (#4578)
This PR adds in documentation on querying an existing vectorstore in PG 

Fixes 3191 (issue)
2023-05-12 13:04:06 -04:00
Harrison Chase
5ad151ed44
Add constitutional principles from paper (#4554)
Add constitutional principles from https://arxiv.org/pdf/2212.08073.pdf

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-12 07:34:03 -07:00
Sai Vinay G
cf4c1394a2
feat: Added class to support huggingface text generation inference server (#4447)
[Text Generation
Inference](https://github.com/huggingface/text-generation-inference) is
a Rust, Python and gRPC server for generating text using LLMs.

This pull request add support for self hosted Text Generation Inference
servers.

feature: #4280

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-12 07:32:37 -07:00
Leonid Ganeline
e17d0319d5
Add arxiv retriever (#4538) 2023-05-11 22:48:38 -07:00
SimFG
7bcf238a1a
Optimize the initialization method of GPTCache (#4522)
Optimize the initialization method of GPTCache, so that users can use GPTCache more quickly.
2023-05-11 16:15:23 -07:00
kYLe
446b60d803
Fix a typo in langchain/docs/modules/models/llms/integrations/anyscale.ipynb (#4526) 2023-05-11 09:03:04 -07:00
Akshaya Annavajhala
b21d7c138c
Callback Handler for MLflow (#4150)
Rebased Mahmedk's PR with the callback refactor and added the example
requested by hwchase plus a couple minor fixes

---------

Co-authored-by: Ahmed K <77802633+mahmedk@users.noreply.github.com>
Co-authored-by: Ahmed K <mda3k27@gmail.com>
Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>
Co-authored-by: Corey Zumar <39497902+dbczumar@users.noreply.github.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-11 01:10:40 -07:00
kYLe
0d51a1f12b
Add LLMs support for Anyscale Service (#4350)
Add Anyscale service integration under LLM

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-05-11 00:39:59 -07:00