langchain

mirror of https://github.com/hwchase17/langchain synced 2024-10-31 15:20:26 +00:00

Author	SHA1	Message	Date
glaze	f7ad14acfa	Add etherscan document loader (#7943 ) @rlancemartin The modification includes: * etherscanLoader * test_etherscan * document ipynb I have run the test, lint, format, and spell check. I do encounter a linting error on ipynb, I am not sure how to address that. ``` docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:55: error: Name "null" is not defined [name-defined] docs/extras/modules/data_connection/document_loaders/integrations/Etherscan.ipynb:76: error: Name "null" is not defined [name-defined] Found 2 errors in 1 file (checked 1 source file) ``` - Description: The Etherscan loader uses etherscan api to load transaction histories under specific accounts on Ethereum Mainnet. - No dependency is introduced by this PR. - Twitter handle: glazecl --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 17:09:16 -07:00
Bagatur	483f6c2fe3	mv eval docs (#8209 )	2023-07-24 16:31:20 -07:00
Liu Ming	24f889f2bc	Change with_history option to False for ChatGLM by default (#8076 ) The ChatGLM LLM integration by default accumulates conversation history (with_history=True) and sends it to the ChatGLM backend API, which is not expected in most cases. This PR sets with_history=False by default; users should explicitly set llm.with_history=True to turn this feature on. Related PR: #8048 #7774 --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:46:02 -07:00
Anthony Mahanna	76102971c0	ArangoDB/AQL support for Graph QA Chain (#7880 ) Description: Serves as an introduction to LangChain's support for [ArangoDB](https://github.com/arangodb/arangodb), similar to https://github.com/hwchase17/langchain/pull/7165 and https://github.com/hwchase17/langchain/pull/4881 Issue: No issue has been created for this feature Dependencies: `python-arango` has been added as an optional dependency via the `CONTRIBUTING.md` guidelines Twitter handle: [at]arangodb - Integration test has been added - Notebook has been added: [graph_arangodb_qa.ipynb](https://github.com/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) [![Open In Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/amahanna/langchain/blob/master/docs/extras/modules/chains/additional/graph_arangodb_qa.ipynb) ``` docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD= arangodb/arangodb ``` ``` pip install git+https://github.com/amahanna/langchain.git ``` ```python from arango import ArangoClient from langchain.chat_models import ChatOpenAI from langchain.graphs import ArangoGraph from langchain.chains import ArangoGraphQAChain db = ArangoClient(hosts="localhost:8529").db(name="_system", username="root", password="", verify=True) graph = ArangoGraph(db) chain = ArangoGraphQAChain.from_llm(ChatOpenAI(temperature=0), graph=graph) chain.run("Is Ned Stark alive?") ``` --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 15:16:52 -07:00
Adilkhan Sarsen	3e7d2a1b64	SelfQuery support for deeplake (#7888 ) Added SelfQuery support for Deeplake.	2023-07-24 14:22:33 -07:00
Juan José Torres	1cc7d4c9eb	Update SageMaker Endpoint Embeddings docs to be up to date with current requirements (#8103 ) - Description: Simple change of the Class that ContentHandler inherits from. To create an object of type SagemakerEndpointEmbeddings, the property content_handler must be of type EmbeddingsContentHandler not ContentHandlerBase anymore, - Twitter handle: @Juanjo_Torres11 Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 13:35:06 -07:00
Bagatur	1a7d8667c8	Bagatur/gateway chat (#8198 ) Signed-off-by: dbczumar <corey.zumar@databricks.com> Co-authored-by: dbczumar <corey.zumar@databricks.com>	2023-07-24 12:17:00 -07:00
Ettore Di Giacinto	ae28568e2a	Add embeddings for LocalAI (#8134 ) Description: This PR adds embeddings for LocalAI ( https://github.com/go-skynet/LocalAI ), a self-hosted OpenAI drop-in replacement. As LocalAI can re-use OpenAI clients it is mostly following the lines of the OpenAI embeddings, however when embedding documents, it just uses string instead of sending tokens as sending tokens is best-effort depending on the model being used in LocalAI. Sending tokens is also tricky as token id's can mismatch with the model - so it's safer to just send strings in this case. Partly related to: https://github.com/hwchase17/langchain/issues/5256 Dependencies: No new dependencies Twitter: @mudler_it --------- Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 12:16:49 -07:00
Mike Nitsenko	d983046f90	Extend Cube Semantic Loader functionality (#8186 ) PR Description: This pull request introduces several enhancements and new features to the `CubeSemanticLoader`. The changes include the following: 1. Added imports for the `json` and `time` modules. 2. Added new constructor parameters: `load_dimension_values`, `dimension_values_limit`, `dimension_values_max_retries`, and `dimension_values_retry_delay`. 3. Updated the class documentation with descriptions for the new constructor parameters. 4. Added a new private method `_get_dimension_values()` to retrieve dimension values from Cube's REST API. 5. Modified the `load()` method to load dimension values for string dimensions if `load_dimension_values` is set to `True`. 6. Updated the API endpoint in the `load()` method from the base URL to the metadata endpoint. 7. Refactored the code to retrieve metadata from the response JSON. 8. Added the `column_member_type` field to the metadata dictionary to indicate if a column is a measure or a dimension. 9. Added the `column_values` field to the metadata dictionary to store the dimension values retrieved from Cube's API. 10. Modified the `page_content` construction to include the column title and description instead of the table name, column name, data type, title, and description. These changes improve the functionality and flexibility of the `CubeSemanticLoader` class by allowing the loading of dimension values and providing more detailed metadata for each document. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-24 12:11:58 -07:00
Bagatur	4928f7a9f5	undo bump (#8192 )	2023-07-24 11:32:17 -07:00
Bagatur	14aa27b5f4	redirect (#8189 )	2023-07-24 10:45:12 -07:00
Bagatur	e7d64f8b15	Bagatur/vercel test 3 (#8188 )	2023-07-24 10:11:54 -07:00
Bagatur	026269bfa9	redirects (#8183 )	2023-07-24 08:32:49 -07:00
Harrison Chase	3caccf304c	Harrison/hugginggpt (#8162 ) Co-authored-by: Yongliang Shen <withsyl@163.com>	2023-07-24 07:36:24 -07:00
Bagatur	c8c8635dc9	mv module integrations docs (#8101 )	2023-07-23 23:23:16 -07:00
Adarsh Shirawalmath	8ea840432f	Generalize Comment on Streaming Support for LLM Implementations and add examples (#8115 ) The example provided demonstrates the usage of the HuggingFaceTextGenInference implementation with streaming enabled.	2023-07-23 22:59:59 -07:00
Harrison Chase	33fd6184ba	beef up getting started (#8139 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 19:57:43 -07:00
Lawrence Lim	fa8906a9b7	fix typo: Entity Summary Memory documentation (#8145 ) Fixed a small typo I came across in the Memory documentation.	2023-07-23 19:36:50 -07:00
Fielding Johnston	fb62f2be70	nit: small typo in evaluation module docs (#8155 ) Hopefully, this doesn't come across as nitpicky! That isn't the intention. I only noticed it, because I enjoy reading the documentation and when I hit a mental road bump it is usually due to a missing word or something =) @baskaryan	2023-07-23 18:25:14 -07:00
SlapDrone	961a0e200f	Implement AgentExecutorIterator (#6929 ) - Description: Implements a `.iter()` method for the `AgentExecutor` class. This allows hooking into and intercepting intermediate agent steps. - Issue: #6925 - Dependencies: None - Tag maintainer: @vowelparrot @agola11 - Twitter handle: @SlapDron3 @lacicocodes --------- Co-authored-by: Lacico <Lacicocodes@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-23 18:00:22 -07:00
Harrison Chase	e46126eac6	add llamaapi (#8140 )	2023-07-23 09:16:16 -07:00
Harrison Chase	f0eb5db670	Harrison/agent intro (#8138 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-22 22:14:59 -07:00
Harrison Chase	cbf2fc8af8	prompt ergonomics (#7799 )	2023-07-22 14:19:17 -07:00
Samuel Berthe	d81d6e874f	doc(sqldatabasechain): use views when jsonb column description is not available (#8133 ) I think the PR diff is self explaining ;) @baskaryan	2023-07-22 11:30:04 -07:00
Karthik Raja A	8b08687fc4	MultiOn client toolkit (#8110 ) Addition of MultiOn Client Agent Toolkit Dependencies: multion pip package This PR consists of the following: - MultiOn utility,tools and integration with agent - sample jupyter notebook. Request @hwchase17 , @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-22 08:19:01 -07:00
Harrison Chase	aa0e69bc98	Harrison/official pre release (#8106 )	2023-07-21 18:44:32 -07:00
Bagatur	58f65fcf12	use top nav docs (#8090 )	2023-07-21 13:52:03 -07:00
Bagatur	08c658d3f8	fix api ref (#8083 )	2023-07-21 12:37:21 -07:00
Harrison Chase	f35db9f43e	(WIP) set up experimental (#7959 )	2023-07-21 09:20:24 -07:00
Lance Martin	5a084e1b20	Async HTML loader and HTML2Text transformer (#8036 ) New HTML loader that asynchronously loads a list of URLs. New transformer using [HTML2Text](https://github.com/Alir3z4/html2text/) to convert HTML to clean, easy-to-read plain ASCII text (valid Markdown).	2023-07-20 22:30:59 -07:00
Wey Gu	cf60cff1ef	feat: Add with_history option for chatglm (#8048 ) In certain 0-shot scenarios, the existing stateful language model can unintentionally send/accumulate the .history. This commit adds the "with_history" option to chatglm, allowing users to control the behavior of .history and prevent unintended accumulation. Possible reviewers @hwchase17 @baskaryan @mlot Refer to discussion over this thread: https://twitter.com/wey_gu/status/1681996149543276545?s=20	2023-07-20 22:25:37 -07:00
Harrison Chase	1f3b987860	Harrison/GitHub toolkit (#8047 ) Co-authored-by: Trevor Dobbertin <trevordobbertin@gmail.com>	2023-07-20 22:24:55 -07:00
Harrison Chase	f99f497b2c	Harrison/predibase (#8046 ) Co-authored-by: Abhay Malik <32989166+Abhay-765@users.noreply.github.com>	2023-07-20 19:26:50 -07:00
Jacob Lee	56c6ab1715	Fix bad docs sidebar header (#7966 ) Quick fix for: <img width="283" alt="Screenshot 2023-07-19 at 2 49 44 PM" src="https://github.com/hwchase17/langchain/assets/6952323/91e4868c-b75e-413d-9f8f-d34762abf164"> CC @baskaryan	2023-07-20 19:06:57 -07:00
Kacper Łukawski	ed6a5532ac	Implement async support in Qdrant local mode (#8001 ) I've extended the support of async API to local Qdrant mode. It is faked but allows prototyping without spinning a container. The tests are improved to test the in-memory case as well. @baskaryan @rlancemartin @eyurtsev @agola11	2023-07-20 19:04:33 -07:00
Taqi Jaffri	973593c5c7	Added streaming support to Replicate (#8045 ) Streaming support is useful if you are doing long-running completions or need interactivity e.g. for chat... adding it to replicate, using a similar pattern to other LLMs that support streaming. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test but ran into some issues, specifically: 1. The original test was failing for me due to the model argument not being specified... perhaps this test is not regularly run? I fixed it by adding a call to the lightweight hello world model which should not be burdensome for replicate infra. 2. I couldn't get the `make integration_tests` command to pass... a lot of failures in other integration tests due to missing dependencies... however I did make sure the particular test file I updated does pass, by running `poetry run pytest tests/integration_tests/llms/test_replicate.py` Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Tagging model maintainers @hwchase17 @baskaryan Thanks for all the awesome work you folks are doing. --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-07-20 18:59:54 -07:00
Piyush Jain	31b7ddc12c	Neptune graph and openCypher QA Chain (#8035 ) ## Description This PR adds a graph class and an openCypher QA chain to work with the Amazon Neptune database. ## Dependencies `requests` which is included in the LangChain dependencies. ## Maintainers for Review @krlawrence @baskaryan ### Twitter handle pjain7	2023-07-20 18:56:47 -07:00
Emory Petermann	7239d57a53	Update Golden integration documentation (#8030 ) fixes some typos and cleans up onboarding for golden, thank you! @hinthornw	2023-07-20 15:53:44 -07:00
Jonathon Belotti	021bb9be84	Update Modal.com integration docs (#8014 ) Hey, I'm a Modal Labs engineer and I'm making this docs update after getting a user question in [our beta Slack space](https://join.slack.com/t/modalbetatesters/shared_invite/zt-1xl9gbob8-1QDgUY7_PRPg6dQ49hqEeQ) about the Langchain integration docs. 🔗 [Modal beta-testers link to docs discussion thread](https://modalbetatesters.slack.com/archives/C031Z7DBQFL/p1689777700594819?thread_ts=1689775859.855849&cid=C031Z7DBQFL)	2023-07-20 15:53:06 -07:00
Jeffrey Wang	62d0475c29	Add Metaphor new field and reformat docs (#8022 ) This PR reformats our python notebook example and also adds a new field we have. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-07-20 15:50:54 -07:00
vrushankportkey	5f10d2ea1d	Add Portkey LLMOps integration (#7877 ) Integrating Portkey, which adds production features like caching, tracing, tagging, retries, etc. to langchain apps. - Dependencies: None - Twitter handle: https://twitter.com/portkeyai - test_portkey.py added for tests - example notebook added in new utilities folder in modules Also fixed a bug with OpenAIEmbeddings where headers weren't passing. cc @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 09:08:44 -07:00
Dwai Banerjee	d8c40253c3	Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927 ) BedrockEmbeddings does not have endpoint_url so that switching to custom endpoint is not possible. I have access to Bedrock custom endpoint and cannot use BedrockEmbeddings --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 07:25:59 -07:00
Constantin Musca	d593833e4d	Add Golden Query Tool (#7930 ) Description: Golden Query is a wrapper on top of the [Golden Query API](https://docs.golden.com/reference/query-api) which enables programmatic access to query results on entities across Golden's Knowledge Base. For more information about Golden API, please see the [Golden API Getting Started](https://docs.golden.com/reference/getting-started) page. Issue: None Dependencies: requests(already present in project) Tag maintainer: @hinthornw Signed-off-by: Constantin Musca <constantin.musca@gmail.com>	2023-07-20 07:03:20 -07:00
Santiago Delgado	c416dbe8e0	Amadeus Flight and Travel Search Tool (#7890 ) ## Background With the addition of email and calendar tools, LangChain is continuing to complete its functionality to automate business processes. ## Challenge One of the pieces of business functionality that LangChain currently doesn't have is the ability to search for flights and travel in order to book business travel. ## Changes This PR implements an integration with the [Amadeus](https://developers.amadeus.com/) travel search API for LangChain, enabling seamless search for flights with a single authentication process. ## Who can review? @hinthornw ## Appendix @tsolakoua and @minjikarin, I utilized your [amadeus-python](https://github.com/amadeus4dev/amadeus-python) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like amadeus-python and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 06:59:29 -07:00
Yun Kim	54e02e4392	Add datadog-langchain integration doc (#7955 ) ## Description Added a doc about the [Datadog APM integration for LangChain](https://github.com/DataDog/dd-trace-py/pull/6137). Note that the integration is on `ddtrace`'s end and so no code is introduced/required by this integration into the langchain library. For that reason I've refrained from adding an example notebook (although I've added setup instructions for enabling the integration in the doc) as no code is technically required to enable the integration. Tagging @baskaryan as reviewer on this PR, thank you very much! ## Dependencies Datadog APM users will need to have `ddtrace` installed, but the integration is on `ddtrace` end and so does not introduce any external dependencies to the LangChain project. Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 06:44:58 -07:00
Jithin James	493cbc9410	docs: fix a couple of small indentation errors in the strings (#7951 ) Fixed a few indentations I came across in the docs @baskaryan	2023-07-20 06:34:01 -07:00
Bhashithe Abeysinghe	73901ef132	Added windows specific instructions to Llama.cpp documentation. (#8000 ) - Description: Added windows specific instructions on llama.cpp in the notebook file - Issue: #6356 - Dependencies: None - Tag maintainer: @baskaryan	2023-07-20 06:31:25 -07:00
Jeff Huber	5694e7b8cf	Update chroma notebook (#7978 ) Fix up the Chroma notebook - remove `.persist()` -- this is no longer in Chroma as of `0.4.0` - update output to match `0.4.0` - other cleanup work	2023-07-20 06:25:31 -07:00
Harutaka Kawamura	4a5894db47	Fix incorrect field name in MLflow AI Gateway config example (#7983 )	2023-07-20 06:24:59 -07:00
Kacper Łukawski	19e8472521	Add async Qdrant to async_agent.ipynb (#7993 ) I added Qdrant to the async API docs. This is the only vector store that supports full async API. @baskaryan @rlancemartin, @eyurtsev	2023-07-20 06:23:15 -07:00
Bagatur	5d021c0962	nb fix (#7962 )	2023-07-19 15:27:43 -07:00
Julien Salinas	3adab5e5be	Integrate NLP Cloud embeddings endpoint (#7931 ) Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings). --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-19 15:27:34 -07:00
Bagatur	854a2be0ca	Add debugging guide (#7956 )	2023-07-19 14:15:11 -07:00
Brendan Collins	9aef79c2e3	Add Geopandas.GeoDataFrame Document Loader (#3817 ) Work in Progress. WIP Not ready... Adds Document Loader support for [Geopandas.GeoDataFrames](https://geopandas.org/) Example: - [x] stub out `GeoDataFrameLoader` class - [x] stub out integration tests - [ ] Experiment with different geometry text representations - [ ] Verify CRS is successfully added in metadata - [ ] Test effectiveness of searches on geometries - [ ] Test with different geometry types (point, line, polygon with multi-variants). - [ ] Add documentation --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>	2023-07-19 12:14:41 -07:00
Lance Martin	dfc533aa74	Add llama-v2 to local document QA (#7952 )	2023-07-19 11:15:47 -07:00
Bagatur	f97535b33e	fix (#7947 )	2023-07-19 10:23:10 -07:00
Harutaka Kawamura	f6839a8682	Add integration for MLflow AI Gateway (#7113 ) - Adds integration for MLflow AI Gateway (this will be shipped in MLflow 2.5 this week). Manual testing: ```sh # Move to mlflow repo cd /path/to/mlflow # install langchain pip install git+https://github.com/harupy/langchain.git@gateway-integration # launch gateway service mlflow gateway start --config-path examples/gateway/openai/config.yaml # Then, run the examples in this PR ```	2023-07-19 07:40:55 -07:00
William FH	9d7e57f5c0	Docs Nit (#7918 )	2023-07-18 21:47:28 -07:00
Jarek Kazmierczak	f2ef3ff54a	Google Cloud Enterprise Search retriever (#7857 ) Added a retriever that encapsulated Google Cloud Enterprise Search. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 18:24:08 -07:00
Zizhong Zhang	bdf0c2267f	docs(custom_chain) fix typo (#7898 ) Fix typo in the document of custom_chain	2023-07-18 18:03:19 -07:00
Jeff Huber	2139d0197e	upgrade chroma to 0.4.0 (#7749 ) This should land Monday the 17th. Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to build, more durable, faster, smaller, and more extensible. This comes with a few changes: 1. A simplified and improved client setup. Instead of having to remember weird settings, users can just do `EphemeralClient`, `PersistentClient` or `HttpClient` (the underlying direct `Client` implementation is also still accessible) 2. We migrated data stores away from `duckdb` and `clickhouse`. This changes the api for the `PersistentClient` that used to reference `chroma_db_impl="duckdb+parquet"`. Now we simply set `is_persistent=true`. `is_persistent` is set for you to `true` if you use `PersistentClient`. 3. Because we migrated away from `duckdb` and `clickhouse` - this also means that users need to migrate their data into the new layout and schema. Chroma is committed to providing extension notification and tooling around any schema and data migrations (for example - this PR!). After upgrading to `0.4.0` - if users try to access their data that was stored in the previous regime, the system will throw an `Exception` and instruct them how to use the migration assistant to migrate their data. The migration assistant is a pip-installable CLI (`pip install chroma_migrate`) and is runnable by calling `chroma_migrate` -- TODO ADD here is a short video demonstrating how it works. Please reference the readme at [chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate) to see a full write-up of our philosophy on migrations as well as more details about this particular migration. Please direct any users facing issues upgrading to our Discord channel called [#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883). We have also created an [email listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers directly in the future about breaking changes. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 17:20:54 -07:00
Lance Martin	41c841ec85	Add Llama-v2 to Llama.cpp notebook (#7913 )	2023-07-18 15:13:27 -07:00
Bagatur	b9639f6067	fix docs (#7911 )	2023-07-18 14:25:45 -07:00
Jeff Huber	dc8b790214	Improve vector store onboarding exp (#6698 ) This PR - fixes the `similarity_search_by_vector` example, makes the code run and adds the example to mirror `similarity_search` - reverts back to chroma from faiss to remove sharp edges / create a happy path for new developers. (1) real metadata filtering, (2) expected functionality like `update`, `delete`, etc to serve beyond the most trivial use cases @hwchase17 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 13:48:42 -07:00
Lance Martin	862268175e	Add llama-v2 to docs (#7893 )	2023-07-18 12:09:09 -07:00
Filip Michalsky	69b9db2b5e	Notebook update: sales agent with tools (#7753 ) - Description: This is an update to a previously published notebook. Sales Agent now has access to tools, and this notebook shows how to use a Product Knowledge base to reduce hallucinations and act as a better sales person! - Issue: N/A - Dependencies: `chromadb openai tiktoken` - Tag maintainer: @baskaryan @hinthornw - Twitter handle: @FilipMichalsky	2023-07-18 09:53:12 -07:00
Orgil	75d3f1e5e6	remove unused import in voice assistant doc (#7757 ) Description: Removed unused import in voice_assistant doc. Tag maintainer: @baskaryan	2023-07-18 09:51:28 -07:00
maciej-skorupka	c6d1d6d7fc	feat: moving azure OpenAI API version to the latest 2023-05-15 (#7764 ) Moving to the latest non-preview Azure OpenAI API version=2023-05-15. The previous 2023-03-15-preview doesn't have support, SLA etc. For instance, OpenAI SDK has moved to this version https://github.com/openai/openai-python/releases/tag/v0.27.7 @baskaryan	2023-07-18 09:50:15 -07:00
satorioh	259a409998	docs(zilliz): connection_args add token description for serverless cl… (#7810 ) Description: Currently, Zilliz only supports dedicated clusters using a username and password pair for connection. Serverless clusters, however, can be connected to using API keys ([see the official note for details](https://docs.zilliz.com/docs/manage-cluster-credentials)), so I added an API key (token) description to the Zilliz docs to make it more obvious and convenient for this group of users to better utilize Zilliz. No changes were made to code. --------- Co-authored-by: Robin.Wang <3Jg$94sbQ@q1> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 09:31:39 -07:00
maciej-skorupka	5de7815310	docs: added comment from azure llm to azure chat about GPT-4 (#7884 ) Azure GPT-4 models can't be accessed via LLM model. It's easy to miss that and a lot of discussions about that are on the Internet. Therefore I added a comment in Azure LLM docs that mentions that and points to Azure Chat OpenAI docs. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 08:05:41 -07:00
Bill Zhang	dda11d2a05	WeaviateHybridSearchRetriever option to enable scores. (#7861 ) Description: This PR adds the option to retrieve scores and explanations in the WeaviateHybridSearchRetriever. This feature improves the usability of the retriever by allowing users to understand the scoring logic behind the search results and further refine their search queries. Issue: This PR is a solution to the issue #7855 Dependencies: This PR does not introduce any new dependencies. Tag maintainer: @rlancemartin, @eyurtsev I have included a unit test for the added feature, ensuring that it retrieves scores and explanations correctly. I have also included an example notebook demonstrating its use.	2023-07-18 07:57:17 -07:00
Jonathan Pedoeem	c460c29a64	Adding Docs for `PromptLayerCallbackHandler` (#7860 ) Here I am adding documentation for the `PromptLayerCallbackHandler`. When we created the initial PR for the callback handler the docs were causing issues, so we merged without the docs.	2023-07-18 07:51:16 -07:00
German Martin	f1eaa9b626	Lost in the middle: We have been ordering documents the WRONG way. (for long context) (#7520 ) Motivation: it seems that when dealing with a long context and a "big" number of relevant documents, we must avoid using the out-of-the-box score ordering from vector stores. See: https://arxiv.org/pdf/2306.01150.pdf So, I added an additional parameter that allows you to reorder the retrieved documents so we can work around this performance degradation. The reordering respects the original search score but places the least relevant documents in the middle of the context. Extract from the paper (one image speaks 1000 tokens): ![image](https://github.com/hwchase17/langchain/assets/1821407/fafe4843-6e18-4fa6-9416-50cc1d32e811) This seems to be common to all diff architectures. So I think we need a good generic way to implement this reordering and run some tests in our already running retrievers. It could be that my approach is not the best one from the architecture point of view, happy to have a discussion about that. For me this was the best place to introduce the change and start retesting diff implementations. @rlancemartin, @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-18 07:45:15 -07:00
Bagatur	6a32f93669	add ls link (#7847 )	2023-07-18 07:39:26 -07:00
William FH	c6f2d27789	Docs Nits (#7874 ) Add links to reference docs	2023-07-18 01:50:14 -07:00
William FH	3179ee3a56	Evals docs (#7460 ) Still don't have good "how to's", and the guides / examples section could be further pruned and improved, but this PR adds a couple examples for each of the common evaluator interfaces. - [x] Example docs for each implemented evaluator - [x] "how to make a custom evaluator" notebook for each low-level API (comparison, string, agent) - [x] Move docs to modules area - [x] Link to reference docs for more information - [X] Still need to finish the evaluation index page - ~[ ] Don't have good data generation section~ - ~[ ] Don't have good how to section for other common scenarios / FAQs like regression testing, testing over similar inputs to measure sensitivity, etc.~	2023-07-18 01:00:01 -07:00
Nicolas	46330da2e7	docs: Mendable: Fixes pretty sources not working (#7863 ) This new version fixes the"Verified Sources" display that got broken. Instead of displaying the full URL, it shows the title of the page the source is from.	2023-07-17 18:23:46 -07:00
Jasper	5b4d53e8ef	Add text_content kwarg to BrowserlessLoader (#7856 ) Added keyword argument to toggle between getting the text content of a site versus its HTML when using the `BrowserlessLoader`	2023-07-17 17:02:19 -07:00
William FH	2aa3cf4e5f	update notebook (#7852 )	2023-07-17 14:46:42 -07:00
Matt Robinson	3c489be773	feat: optional post-processing for Unstructured loaders (#7850 ) ### Summary Adds a post-processing method for Unstructured loaders that allows users to optionally modify or clean extracted elements. ### Testing ```python from langchain.document_loaders import UnstructuredFileLoader from unstructured.cleaners.core import clean_extra_whitespace loader = UnstructuredFileLoader( "./example_data/layout-parser-paper.pdf", mode="elements", post_processors=[clean_extra_whitespace], ) docs = loader.load() docs[:5] ``` ### Reviewers - @rlancemartin - @eyurtsev - @hwchase17	2023-07-17 12:13:05 -07:00
Bagatur	2a315dbee9	fix nb (#7843 )	2023-07-17 09:39:11 -07:00
Bagatur	98c48f303a	fix (#7838 )	2023-07-17 07:53:11 -07:00
Dayuan Jiang	ee40d37098	add bm25 module (#7779 ) - Description: Add a BM25 Retriever that does not need Elasticsearch - Dependencies: rank_bm25 (if it is not installed, it will be installed using pip, just like TFIDFRetriever does) - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: DayuanJian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-17 07:30:17 -07:00
Liu Ming	fa0a9e502a	Add LLM for ChatGLM(2)-6B API (#7774 ) Description: Add LLM for ChatGLM-6B & ChatGLM2-6B API Related Issue: Will the langchain support ChatGLM? #4766 Add support for selfhost models like ChatGLM or transformer models #1780 Dependencies: No extra library install required. It wraps api call to a ChatGLM(2)-6B server(start with api.py), so api endpoint is required to run. Tag maintainer: @mlot Any comments on this PR would be appreciated. --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-17 07:27:17 -07:00
sseide	25e3d3f283	Support Redis Sentinel database connections (#5196 ) # Support Redis Sentinel database connections This PR adds support for connecting not only to standalone Redis servers but also to High Availability replication sets (https://redis.io/docs/management/sentinel/). Redis replica sets have one master that allows writing data and 2+ replicas with read-only access to the data. The additional Redis Sentinel instances monitor all servers and reconfigure the RW master on the fly if it becomes unavailable. Therefore all connections must be made through the Sentinels, which are queried for the current master to obtain a read-write connection. This PR adds basic support for a Redis connection URL that specifies a Sentinel as the Redis connection. The Redis documentation and the Jupyter notebook with Redis examples are updated to mention how to connect to a Redis replica set with Sentinels. - Remark - I did not find test cases for Redis server connections to which new cases could be added. Therefore I tested the new utility class locally with different kinds of setups to make sure the different connection URLs work as expected, but no test case is included as part of this PR.	2023-07-17 07:18:51 -07:00
Yifei Song	2e47412073	Add Xorbits agent (#7647 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits agent, which allows langchain to interact with Xorbits Pandas dataframe and Xorbits Numpy array. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @hinthornw - Twitter handle: https://twitter.com/Xorbitsio	2023-07-17 07:09:51 -07:00
Ankush Gola	ff3aada0b2	minor langsmith notebook fixes (#7814 )	2023-07-16 21:27:03 -07:00
William FH	c58d35765d	Add examples to docstrings (#7796 ) and: - remove dataset name from autogenerated project name - print out project name to view	2023-07-16 12:05:56 -07:00
Ankush Gola	c4ece52dac	update LangSmith notebook (#7767 )	2023-07-15 21:05:09 -07:00
Lance Martin	1d06eee3b5	Fix ntbk link in docs (#7755 ) Minor fix to running to [docs](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).	2023-07-15 09:11:18 -07:00
Mohammad Mohtashim	b8b8a138df	Simple Import fix in Tools Exception Docs (#7740 ) Issue: #7720 @hinthornw	2023-07-15 10:25:34 -04:00
Nicolas	43f900fd38	docs: Mendable Search Improvements (#7744 ) - New pin-to-side (button). This functionality allows you to search the docs while asking the AI for questions - Fixed the search bar in Firefox that won't detect a mouse click - Fixes and improvements overall in the model's performance	2023-07-15 10:19:21 -04:00
Lance Martin	b015647e31	Add GPT4All embeddings (#7743 ) Support for [GPT4All embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:04:29 -04:00
Kacper Łukawski	1ff5b67025	Implement async API for Qdrant vector store (#7704 ) Inspired by #5550, I implemented full async API support in Qdrant. The docs were extended to mention the existence of asynchronous operations in Langchain. I also used that chance to restructure the tests of Qdrant and provided a suite of tests for the async version. Async API requires the GRPC protocol to be enabled. Thus, it doesn't work on local mode yet, but we're considering including the support to be consistent.	2023-07-15 09:33:26 -04:00
Bearnardd	275b926cf7	add missing import (#7730 ) Just a nit documentation fix @baskaryan	2023-07-14 20:03:23 -04:00
Lorenzo	77e6bbe6f0	fix typo in deeplake.ipynb (#7718 ) - Fixing typos in deeplake documentation - @baskaryan	2023-07-14 13:38:31 -04:00
Samuel Berthe	2be3515a66	SQLDatabase: adding security disclaimer (#7710 ) It might be obvious to most engineers, but I think everybody should be cautious when using such a chain. ![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)	2023-07-14 13:38:16 -04:00
Bagatur	bae93682f6	update docs (#7714 )	2023-07-14 11:49:09 -04:00
Bagatur	b065da6933	Bagatur/docs nit (#7712 )	2023-07-14 11:13:02 -04:00
Bagatur	87d81b6acc	Redirect old text splitter page (#7708 ) related to #7665	2023-07-14 11:12:18 -04:00
Aarav Borthakur	210296a71f	Integrate Rockset as a document loader (#7681 ) Integrate [Rockset](https://rockset.com/docs/) as a document loader. Issue: None Dependencies: Nothing new (rockset's dependency was already added [here](https://github.com/hwchase17/langchain/pull/6216)) Tag maintainer: @rlancemartin I have added a test for the integration and an example notebook showing its use. I ran `make lint` and everything looks good. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 07:58:13 -07:00
Samuel Berthe	7d4843fe84	feat(chains): adding ElasticsearchDatabaseChain for interacting with analytics database (#7686 ) This pull request adds a ElasticsearchDatabaseChain chain for interacting with analytics database, in the manner of the SQLDatabaseChain. Maintainer: @samber Twitter handler: samuelberthe --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-14 10:30:57 -04:00
Daniel	6d88b23ef7	Update pgembedding.ipynb (#7699 ) Update the extension name. It changed from pg_hnsw to pg_embedding. Thank you. I missed this in my previous commit.	2023-07-14 08:39:01 -04:00
Richy Wang	45bb414be2	Add LLM for Alibaba's Damo Academy's Tongyi Qwen API (#7477 ) - Add langchain.llms.Tongyi for text completion, with examples for the Tongyi Text API, - Add system tests. Note: async completion for the Text API is not yet supported and will be included in a future PR. Dependencies: dashscope. It must be installed manually because not everyone needs it. Happy for feedback on any aspect of this PR @hwchase17 @baskaryan.	2023-07-14 01:58:22 -04:00
Lance Martin	6325a3517c	Make recursive loader yield while crawling (#7568 ) Support actual lazy_load since it can take a while to crawl larger directories.	2023-07-13 21:55:20 -07:00
UmerHA	82f3e32d8d	[Small upgrade] Allow document limit in AzureCognitiveSearchRetriever (#7690 ) Multiple people have asked in #5081 for a way to limit the documents returned from an AzureCognitiveSearchRetriever. This PR adds the `top_n` parameter to allow that. Twitter handle: [@UmerHAdil](twitter.com/umerHAdil)	2023-07-13 23:04:40 -04:00
Daniel	854f3fe9b1	Update pgembedding.ipynb (#7682 ) Correct links to the pg_embedding repository and the Neon documentation.	2023-07-13 19:54:07 -04:00
William FH	051fac1e66	Improve walkthrough links for sphinx (#7672 ) Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>	2023-07-13 16:08:31 -07:00
Bagatur	5db4dba526	add integrations hub link to docs (#7675 )	2023-07-13 18:44:10 -04:00
Jasper	fbc97a77ed	add browserless loader (#7562 ) # Browserless Added support for Browserless' `/content` endpoint as a document loader. ### About Browserless Browserless is a cloud service that provides access to headless Chrome browsers via a REST API. It allows developers to automate Chromium in a serverless fashion without having to configure and maintain their own Chrome infrastructure. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-13 13:18:28 -07:00
frangin2003	c7b687e944	Simplify GraphQL Tool Initialization documentation by Removing 'llm' Argument (#7651 ) This PR is aimed at enhancing the clarity of the documentation in the langchain project. Description: In the graphql.ipynb file, I have removed the unnecessary 'llm' argument from the initialization process of the GraphQL tool (of type _EXTRA_OPTIONAL_TOOLS). The 'llm' argument is not required for this process. Its presence could potentially confuse users. This modification simplifies the understanding of tool initialization and minimizes potential confusion. Issue: Not applicable, as this is a documentation improvement. Dependencies: None. I kindly request a review from the following maintainer: @hinthornw, who is responsible for Agents / Tools / Toolkits. No new integration is being added in this PR, hence no need for a test or an example notebook. Please see the changes for more detail and let me know if any further modification is necessary.	2023-07-13 14:52:07 -04:00
William FH	a673a51efa	[Breaking] Update Evaluation Functionality (#7388 ) - Migrate from deprecated langchainplus_sdk to `langsmith` package - Update the `run_on_dataset()` API to use an eval config - Update a number of evaluators, as well as the loading logic - Update docstrings / reference docs - Update tracer to share single HTTP session	2023-07-13 02:13:06 -07:00
Matt Adams	98e1bbfbbd	Add missing dependencies to apify.ipynb (#6331 ) Fixes errors caused by missing dependencies when running the notebook.	2023-07-13 03:02:23 -04:00
Francisco Ingham	488d2d5da9	Entity extraction improvements (#6342 ) Added a fix to avoid irrelevant attributes being returned, plus an example of extracting unrelated entities and an example of using an 'extra_info' attribute to extract unstructured data for an entity. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 02:16:05 -04:00
Bagatur	7f8ff2a317	add tagger nb (#7637 )	2023-07-13 01:48:23 -04:00
Jacob Lee	cdb93ab5ca	Adds OpenAI functions powered document metadata tagger (#7521 ) Adds a new document transformer that automatically extracts metadata for a document based on an input schema. I also moved `document_transformers.py` to `document_transformers/__init__.py` to group it with this new transformer - it didn't seem to cause issues in the notebook, but let me know if I've done something wrong there. Also had a linter issue I couldn't figure out:
```
MacBook-Pro:langchain jacoblee$ make lint
poetry run mypy .
docs/dist/conf.py: error: Duplicate module named "conf" (also at "./docs/api_reference/conf.py")
docs/dist/conf.py: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#mapping-file-paths-to-modules for more info
docs/dist/conf.py: note: Common resolutions include: a) using `--exclude` to avoid checking one of them, b) adding `__init__.py` somewhere, c) using `--explicit-package-bases` or adjusting MYPYPATH
Found 1 error in 1 file (errors prevented further checking)
make: *** [lint] Error 2
```
@rlancemartin @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-13 01:12:41 -04:00
Jason Fan	8effd90be0	Add new types of document transformers (#7379 ) - Description: Add two new document transformers that translate documents into different languages and convert documents into Q&A format to improve vector search results. Uses OpenAI function calling via the [doctran](https://github.com/psychic-api/doctran/tree/main) library. - Issue: N/A - Dependencies: `doctran = "^0.0.5"` - Tag maintainer: @rlancemartin @eyurtsev @hwchase17 - Twitter handle: @psychicapi or @jfan001 Notes - Adheres to the `DocumentTransformer` abstraction set by @dev2049 in #3182 - Refactored `EmbeddingsRedundantFilter` to put it in a file under a new `document_transformers` module - Added basic docs for `DocumentInterrogator`, `DocumentTransformer` as well as the existing `EmbeddingsRedundantFilter` --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 23:53:30 -04:00
Jamie Broomall	0e1d7a27c6	WhyLabsCallbackHandler updates (#7621 ) Updates to the WhyLabsCallbackHandler and example notebook - Update dependency to langkit 0.0.6 which defines new helper methods for callback integrations - Update WhyLabsCallbackHandler to use the new `get_callback_instance` so that the callback is mostly defined in langkit - Remove much of the implementation of the WhyLabsCallbackHandler here in favor of the callback instance This does not change the behavior of the whylabs callback handler implementation but is a reorganization that moves some of the implementation externally to our optional dependency package, and should make future updates easier. @agola11	2023-07-12 23:46:56 -04:00
Gaurang Pawar	53722dcfdc	Fixed a typo in pinecone_hybrid_search.ipynb (#7627 ) Fixed a small typo in documentation	2023-07-12 23:46:41 -04:00
Bagatur	ee70d4a0cd	mv tutorials (#7614 )	2023-07-12 17:33:36 -04:00
Yaohui Wang	d85c33a5c3	Fix the markdown rendering issue with a code block inside a markdown code block (#6625 ) ### Description - Fix the markdown rendering issue with a code block inside a markdown, using a different number of backticks for the delimiters. Current doc site: <https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter#markdown> After fix: <img width="480" alt="image" src="https://github.com/hwchase17/langchain/assets/3115235/d9921d59-64e6-4a34-9c62-79743667f528"> ### Who can review PTAL @dev2049 Co-authored-by: Yaohui Wang <wangyaohui.01@bytedance.com>	2023-07-12 16:29:25 -04:00
Yaroslav Halchenko	0d92a7f357	codespell: workflow, config + some (quite a few) typos fixed (#6785 ) Probably the most boring PR to review ;) Individual commits might be easier to digest --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-07-12 16:20:08 -04:00
Sam	931e68692e	Adds a chain around sympy for symbolic math (#6834 ) - Description: Adds a new chain that acts as a wrapper around Sympy to give LLMs the ability to do some symbolic math. - Dependencies: SymPy --------- Co-authored-by: sreiswig <sreiswig@github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 15:17:32 -04:00
Subsegment	6e1000dc8d	docs : Use more meaningful cnosdb examples (#7587 ) This change makes the ecosystem integrations cnosdb documentation more realistic and easy to understand. - change examples of question and table - modify typo and format	2023-07-12 10:31:55 -04:00
ausboss	50316f6477	Adding LLM wrapper for Kobold AI (#7560 ) - Description: add wrapper that lets you use the KoboldAI API in langchain - Issue: n/a - Dependencies: none extra, just what exists in langchain - Tag maintainer: @baskaryan - Twitter handle: @zanzibased --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-12 03:48:12 -04:00
os1ma	2667ddc686	Fix `make docs_build` and related scripts (#7276 ) Description: Fixed `make docs_build` and related scripts, which caused errors. There are several changes. First, I split the build of the documentation and the API Reference into two separate commands, because building them separately takes less time. The commands for the documentation are `make docs_build`, `make docs_clean`, and `make docs_linkcheck`. The commands for the API Reference are `make api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`. It looked like `docs/.local_build.sh` could be used to build the documentation, so I used that. Since `.local_build.sh` was also building the API Reference internally, I removed that step. I also added some Bash options to `.local_build.sh` so that it stops on errors. Furthermore, I added `cd "${SCRIPT_DIR}"` at the beginning so that the script works no matter which directory it is executed in. `docs/api_reference/api_reference.rst` is removed because it is generated by `docs/api_reference/create_api_rst.py`, and it has been added to .gitignore. Finally, the description in CONTRIBUTING.md was modified. Issue: https://github.com/hwchase17/langchain/issues/6413 Dependencies: `nbdoc` was missing in group docs so it was added. I installed it with the `poetry add --group docs nbdoc` command. I am concerned about whether any modifications are needed to poetry.lock. I would greatly appreciate it if you could pay close attention to this file during the review. Tag maintainer: @baskaryan If this PR needs any additional changes, I'll be happy to make them! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 22:05:14 -04:00
schop-rob	e811c5e8c6	Add OpenAI organization ID to docs (#7398 ) Description: I added an example of how to reference the OpenAI API Organization ID, because I couldn't find it before. In the example, it is mentioned how to achieve this using environment variables as well as parameters for the OpenAI()-class Issue: - Dependencies: - Twitter @schop-rob	2023-07-11 20:51:58 -04:00
Kenny	8741e55e7c	Template formats documentation (#7404 ) Simple addition to the documentation, adding the correct import statement & showcasing using Python FStrings.	2023-07-11 18:24:24 -04:00
OwenElliott	9cb2347453	Fix broken link from Marqo Ecosystem (#7510 ) Small fix to a link from the Marqo page in the ecosystem. The link was not updated correctly when the documentation structure changed to html pages instead of links to notebooks.	2023-07-11 17:15:15 -04:00
Kacper Łukawski	1f83b5f47e	Reuse the existing collection if configured properly in Qdrant.from_texts (#7530 ) This PR changes the behavior of `Qdrant.from_texts` so the collection is reused if not requested to recreate it. Previously, calling `Qdrant.from_texts` or `Qdrant.from_documents` resulted in removing the old data which was confusing for many.	2023-07-11 16:24:35 -04:00
Felix Brockmeier	406a9dc11f	Add notebook example for Lemon AI NLP Workflow Automation (#7556 ) - Description: Added notebook to LangChain docs that explains how to use Lemon AI NLP Workflow Automation tool with Langchain - Issue: not applicable - Dependencies: not applicable - Tag maintainer: @agola11 - Twitter handle: felixbrockm	2023-07-11 15:15:11 -04:00
Lance Martin	9e067b8cc9	Add env setup (#7550 ) Include setup	2023-07-11 09:48:40 -07:00
Bagatur	d2137eea9f	fix cpal docs (#7545 )	2023-07-11 11:07:45 -04:00
Boris	9129318466	CPAL (#6255 ) # Causal program-aided language (CPAL) chain ## Motivation This builds on the recent [PAL](https://arxiv.org/abs/2211.10435) to stop LLM hallucination. The problem with the [PAL](https://arxiv.org/abs/2211.10435) approach is that it hallucinates on a math problem with a nested chain of dependence. The innovation here is that this new CPAL approach includes causal structure to fix hallucination. For example, using the below word problem, PAL answers with 5, and CPAL answers with 13. "Tim buys the same number of pets as Cindy and Boris." "Cindy buys the same number of pets as Bill plus Bob." "Boris buys the same number of pets as Ben plus Beth." "Bill buys the same number of pets as Obama." "Bob buys the same number of pets as Obama." "Ben buys the same number of pets as Obama." "Beth buys the same number of pets as Obama." "If Obama buys one pet, how many pets total does everyone buy?" The CPAL chain represents the causal structure of the above narrative as a causal graph or DAG, which it can also plot, as shown below. ![complex-graph](https://github.com/hwchase17/langchain/assets/367522/d938db15-f941-493d-8605-536ad530f576) . The two major sections below are: 1. Technical overview 2. Future application Also see [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 1. Technical overview ### CPAL versus PAL Like [PAL](https://arxiv.org/abs/2211.10435), CPAL intends to reduce large language model (LLM) hallucination. The CPAL chain is different from the PAL chain for a couple of reasons. * CPAL adds a causal structure (or DAG) to link entity actions (or math expressions). * The CPAL math expressions are modeling a chain of cause and effect relations, which can be intervened upon, whereas for the PAL chain math expressions are projected math identities. PAL's generated python code is wrong. It hallucinates when complexity increases. 
```python
def solution():
    """Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?"""
    obama_pets = 1
    tim_pets = obama_pets
    cindy_pets = obama_pets + obama_pets
    boris_pets = obama_pets + obama_pets
    total_pets = tim_pets + cindy_pets + boris_pets
    result = total_pets
    return result

# math result is 5
```
CPAL's generated python code is correct.
```python
story outcome data
    name                                   code  value      depends_on
0  obama                                   pass    1.0              []
1   bill               bill.value = obama.value    1.0         [obama]
2    bob                bob.value = obama.value    1.0         [obama]
3    ben                ben.value = obama.value    1.0         [obama]
4   beth               beth.value = obama.value    1.0         [obama]
5  cindy   cindy.value = bill.value + bob.value    2.0     [bill, bob]
6  boris   boris.value = ben.value + beth.value    2.0     [ben, beth]
7    tim  tim.value = cindy.value + boris.value    4.0  [cindy, boris]

query data
{
    "question": "how many pets total does everyone buy?",
    "expression": "SELECT SUM(value) FROM df",
    "llm_error_msg": ""
}

# query result is 13
```
Based on the comments below, CPAL's intended location in the library is `experimental/chains/cpal` and PAL's location is `chains/pal`. ### CPAL vs Graph QA Both the CPAL chain and the Graph QA chain extract entity-action-entity relations into a DAG. The CPAL chain is different from the Graph QA chain for a few reasons. * Graph QA does not connect entities to math expressions * Graph QA does not associate actions in a sequence of dependence. * Graph QA does not decompose the narrative into these three parts: 1. Story plot or causal model 2. Hypothetical question 3. 
Hypothetical condition ### Evaluation Preliminary evaluation on simple math word problems shows that this CPAL chain generates less hallucination than the PAL chain on answering questions about a causal narrative. Two examples are in [this jupyter notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb) doc. ## 2. Future application ### "Describe as Narrative, Test as Code" The thesis here is that the Describe as Narrative, Test as Code approach allows you to represent a causal mental model both as code and as a narrative, giving you the best of both worlds. #### Why describe a causal mental model as a narrative? The narrative form is quick. At a consensus building meeting, people use narratives to persuade others of their causal mental model, aka. plan. You can share, version control and index a narrative. #### Why test a causal mental model as code? Code is testable, complex narratives are not. Though fast, narratives are problematic as their complexity increases. The problem is LLMs and humans are prone to hallucination when predicting the outcomes of a narrative. The cost of building a consensus around the validity of a narrative outcome grows as its narrative complexity increases. Code does not require tribal knowledge or social power to validate. Code is composable, complex narratives are not. The answer of one CPAL chain can be the hypothetical conditions of another CPAL Chain. For stochastic simulations, a composable plan can be integrated with the [DoWhy library](https://github.com/py-why/dowhy). Lastly, for the futuristic folk, a composable plan as code allows ordinary community folk to design a plan that can be integrated with a blockchain for funding. An explanation of a dependency planning application is [here.](https://github.com/borisdev/cpal-llm-chain-demo) --- Twitter handle: @boris_dev --------- Co-authored-by: Boris Dev <borisdev@Boriss-MacBook-Air.local>	2023-07-11 10:11:21 -04:00
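The causal structure described in the CPAL commit above can be illustrated without any LLM. The sketch below is not the chain's implementation: it hard-codes the dependency graph from the pet-counting story, evaluates entities in topological order with the standard library's `graphlib`, and sums the values. The names `depends_on` and `evaluate` are hypothetical, and the rule "each value is the sum of its dependencies" is a simplification that happens to match every expression in this particular story.

```python
from graphlib import TopologicalSorter

# Entity -> entities it depends on. In this story every entity's value is the
# sum of its dependencies' values (a simplifying assumption for the sketch).
depends_on = {
    "obama": [],
    "bill": ["obama"],
    "bob": ["obama"],
    "ben": ["obama"],
    "beth": ["obama"],
    "cindy": ["bill", "bob"],
    "boris": ["ben", "beth"],
    "tim": ["cindy", "boris"],
}


def evaluate(depends_on, conditions):
    """Compute every entity's value, visiting dependencies before dependents."""
    values = {}
    for name in TopologicalSorter(depends_on).static_order():
        if name in conditions:
            # Hypothetical condition, e.g. "Obama buys one pet".
            values[name] = conditions[name]
        else:
            values[name] = sum(values[dep] for dep in depends_on[name])
    return values


values = evaluate(depends_on, {"obama": 1})
total = sum(values.values())  # plays the role of SELECT SUM(value) FROM df
```

Changing the condition (for example `{"obama": 2}`) re-evaluates the whole graph, which is the kind of intervention the commit message says the causal model is meant to support.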
Alejandra De Luna	2e4047e5e7	feat: support generate as an early stopping method for `OpenAIFunctionsAgent` (#7229 ) This PR proposes an implementation to support `generate` as an `early_stopping_method` for the new `OpenAIFunctionsAgent` class. The motivation is to let the user set a maximum number of actions the agent can take with `max_iterations` and force a final response with this new agent (as with the `Agent` class). The following changes were made: - The `OpenAIFunctionsAgent.return_stopped_response` method was overridden to support `generate` as an `early_stopping_method` - A boolean `with_functions` parameter was added to the `OpenAIFunctionsAgent.plan` method This way the `OpenAIFunctionsAgent.return_stopped_response` method can call the `OpenAIFunctionsAgent.plan` method with `with_functions=False` when the `early_stopping_method` is set to `generate`, making a call to the LLM with no functions and forcing a final response from the `"assistant"`. - Relevant maintainer: @hinthornw - Twitter handle: @aledelunap --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 09:25:02 -04:00
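The control flow described in the early-stopping commit above can be sketched as a toy loop. This is not LangChain code: `run_agent`, `fake_plan`, the `("action", ...)` / `("finish", ...)` tuples, and the fallback message are all hypothetical stand-ins. Only the idea is taken from the commit, namely that once `max_iterations` is exhausted, the planner is called one more time without functions so it has to produce a final answer.

```python
def run_agent(plan, max_iterations, early_stopping_method="force"):
    """Toy agent loop; `plan` is a hypothetical stand-in for the agent's planner.

    `plan(steps, with_functions)` returns ("action", observation) to keep going
    or ("finish", answer) to stop.
    """
    steps = []
    for _ in range(max_iterations):
        kind, payload = plan(steps, with_functions=True)
        if kind == "finish":
            return payload
        steps.append(payload)

    # Iteration budget exhausted.
    if early_stopping_method == "generate":
        # One last call with no functions available, forcing a final answer.
        _, answer = plan(steps, with_functions=False)
        return answer
    # Placeholder message for the non-generating fallback.
    return "stopped: iteration limit reached"


def fake_plan(steps, with_functions):
    # Hypothetical planner that would keep calling tools forever if allowed to.
    if with_functions:
        return ("action", f"observation {len(steps) + 1}")
    return ("finish", f"final answer after {len(steps)} steps")


generated = run_agent(fake_plan, max_iterations=2, early_stopping_method="generate")
forced = run_agent(fake_plan, max_iterations=2, early_stopping_method="force")
```

The design point is that the same planner serves both modes; the `with_functions` flag alone decides whether it may request another tool call.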
Lance Martin	4a94f56258	Minor edits to QA docs (#7507 ) Small clean-ups	2023-07-10 22:15:05 -07:00
Lance Martin	bd0c6381f5	Minor update to clarify map-reduce custom prompt usage (#7453 ) Update docs for map-reduce custom prompt usage	2023-07-10 16:43:44 -07:00
Lance Martin	28d2b213a4	Update landing page for "question answering over documents" (#7152 ) Improve documentation for a central use-case, qa / chat over documents. This will be merged as an update to `index.mdx` [here](https://python.langchain.com/docs/use_cases/question_answering/). Testing w/ local Docusaurus server:
```
# From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```
--------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 14:15:13 -07:00
Adilkhan Sarsen	5debd5043e	Added deeplake use case examples of the new features (#6528 ) 1. Added use cases of the new features 2. Done some code refactoring --------- Co-authored-by: Ivo Stranic <istranic@gmail.com>	2023-07-10 07:04:29 -07:00
Kazuki Maeda	92b4418c8c	Datadog logs loader (#7356 ) ### Description Created a Loader to get a list of specific logs from Datadog Logs. ### Dependencies `datadog_api_client` is required. ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 04:27:55 -04:00
Yifei Song	7d29bb2c02	Add Xorbits Dataframe as a Document Loader (#7319 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits document loader, which allows langchain to leverage Xorbits to parallelize and distribute the loading of data. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @rlancemartin, @eyurtsev - Twitter handle: https://twitter.com/Xorbitsio Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-10 04:24:47 -04:00
Paul-Emile Brotons	d2cf0d16b3	adding max_marginal_relevance_search method to MongoDBAtlasVectorSearch (#7310 ) Adding a maximal_marginal_relevance method to the MongoDBAtlasVectorSearch vectorstore enhances the user experience by providing more diverse search results Issue: #7304	2023-07-10 04:04:19 -04:00
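To illustrate why maximal marginal relevance (MMR), as added in the commit above, yields more diverse results, the sketch below implements the selection rule in plain Python. It is not the MongoDB vector store's code; the helper names `cosine` and `mmr`, the parameter name `lambda_mult`, and the toy vectors are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr(query, candidates, k=2, lambda_mult=0.5):
    """Return indices of up to `k` candidates, trading relevance for diversity.

    lambda_mult=1.0 ranks purely by similarity to the query; lower values
    penalize candidates that resemble ones already selected.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def mmr_score(i):
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected candidate.
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
# The first two candidates are near-duplicates; the third points elsewhere.
candidates = [[1.0, 0.1], [1.0, 0.11], [0.6, 0.8]]
diverse = mmr(query, candidates, k=2, lambda_mult=0.3)
by_relevance = mmr(query, candidates, k=2, lambda_mult=1.0)
```

With these toy vectors, pure relevance ranking returns the two near-duplicates, while the diversity-weighted call skips the second one in favor of the third candidate.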
Matt Robinson	bcab894f4e	feat: Add `UnstructuredTSVLoader` (#7367 ) ### Summary Adds an `UnstructuredTSVLoader` for TSV files. Also updates the doc strings for `UnstructuredCSV` and `UnstructuredExcel` loaders. ### Testing
```python
from langchain.document_loaders.tsv import UnstructuredTSVLoader

loader = UnstructuredTSVLoader(
    file_path="example_data/mlb_teams_2012.csv", mode="elements"
)
docs = loader.load()
```
	2023-07-10 03:07:10 -04:00
nikkie	dfc3f83b0f	docs(vectorstores/integrations/chroma): Fix loading and saving (#7437 ) - Description: Fix loading and saving code about Chroma - Issue: the issue #7436 - Dependencies: - - Twitter handle: https://twitter.com/ftnext	2023-07-10 02:05:15 -04:00
Daniel Chalef	c7f7788d0b	Add ZepMemory; improve ZepChatMessageHistory handling of metadata; Fix bugs (#7444 ) Hey @hwchase17 - This PR adds a `ZepMemory` class, improves handling of Zep's message metadata, and makes it easier for folks building custom chains to persist metadata alongside their chat history. We've had plenty of confused users unfamiliar with ChatMessageHistory classes and how to wrap the `ZepChatMessageHistory` in a `ConversationBufferMemory`. So we've created the `ZepMemory` class as a light wrapper for `ZepChatMessageHistory`. Details: - add ZepMemory, modify notebook to demo use of ZepMemory - Modify summary to be SystemMessage - add metadata argument to add_message; add Zep metadata to Message.additional_kwargs - support passing in metadata	2023-07-10 01:53:49 -04:00
Saurabh Chaturvedi	8f8e8d701e	Fix info about YouTube (#7447 ) (Unintentionally mean 😅) nit: YouTube wasn't created by Google, this PR fixes the mention in docs.	2023-07-10 01:52:55 -04:00
Jeroen Van Goey	f5bd88757e	Fix typo (#7416 ) `quesitons` -> `questions`.	2023-07-09 00:54:48 -04:00
Nolan	5da9f9abcb	docs(agents/toolkits): Fix error in document_comparison_toolkit.ipynb (#7417 ) - Description: Removes unneeded output warning in documentation at https://python.langchain.com/docs/modules/agents/toolkits/document_comparison_toolkit - Issue: - - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: @finnless	2023-07-08 19:51:08 -04:00
nikkie	2eb4a2ceea	docs(retrievers/get-started): Fix broken state_of_the_union.txt link (#7399 ) Thank you for this awesome library. - Description: Fix broken link in documentation - Issue: - https://python.langchain.com/docs/modules/data_connection/retrievers/#get-started - the URL: https://github.com/hwchase17/langchain/blob/master/docs/modules/state_of_the_union.txt - I think the right one is https://github.com/hwchase17/langchain/blob/master/docs/extras/modules/state_of_the_union.txt - Dependencies: - - Tag maintainer: @baskaryan - Twitter handle: -	2023-07-08 11:11:05 -04:00
Delgermurun	a1603fccfb	integrate JinaChat (#6927 ) Integration with https://chat.jina.ai/api. It is OpenAI compatible API. - Twitter handle: [https://twitter.com/JinaAI_](https://twitter.com/JinaAI_) --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-08 02:17:04 -04:00

1602 Commits