- Description: Adds a new chain that acts as a wrapper around Sympy to
give LLMs the ability to do some symbolic math.
- Dependencies: SymPy
---------
Co-authored-by: sreiswig <sreiswig@github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description: a description of the change**
Fixed `make docs_build` and related scripts which caused errors. There
are several changes.
First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.
It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.
`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.
Finally, the description of CONTRIBUTING.md was modified.
**Issue: the issue # it fixes (if applicable)**
https://github.com/hwchase17/langchain/issues/6413
**Dependencies: any dependencies required for this change**
`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.
**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan
If this PR needs any additional changes, I'll be happy to make them!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
# Causal program-aided language (CPAL) chain
## Motivation
This builds on the recent [PAL](https://arxiv.org/abs/2211.10435) to
stop LLM hallucination. The problem with the
[PAL](https://arxiv.org/abs/2211.10435) approach is that it hallucinates
on a math problem with a nested chain of dependence. The innovation here
is that this new CPAL approach includes causal structure to fix
hallucination.
For example, using the below word problem, PAL answers with 5, and CPAL
answers with 13.
"Tim buys the same number of pets as Cindy and Boris."
"Cindy buys the same number of pets as Bill plus Bob."
"Boris buys the same number of pets as Ben plus Beth."
"Bill buys the same number of pets as Obama."
"Bob buys the same number of pets as Obama."
"Ben buys the same number of pets as Obama."
"Beth buys the same number of pets as Obama."
"If Obama buys one pet, how many pets total does everyone buy?"
The CPAL chain represents the causal structure of the above narrative as
a causal graph or DAG, which it can also plot, as shown below.
![complex-graph](https://github.com/hwchase17/langchain/assets/367522/d938db15-f941-493d-8605-536ad530f576)
.
The two major sections below are:
1. Technical overview
2. Future application
Also see [this jupyter
notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb)
doc.
## 1. Technical overview
### CPAL versus PAL
Like [PAL](https://arxiv.org/abs/2211.10435), CPAL intends to reduce
large language model (LLM) hallucination.
The CPAL chain is different from the PAL chain for a couple of reasons.
* CPAL adds a causal structure (or DAG) to link entity actions (or math
expressions).
* The CPAL math expressions are modeling a chain of cause and effect
relations, which can be intervened upon, whereas for the PAL chain math
expressions are projected math identities.
PAL's generated python code is wrong. It hallucinates when complexity
increases.
```python
def solution():
"""Tim buys the same number of pets as Cindy and Boris.Cindy buys the same number of pets as Bill plus Bob.Boris buys the same number of pets as Ben plus Beth.Bill buys the same number of pets as Obama.Bob buys the same number of pets as Obama.Ben buys the same number of pets as Obama.Beth buys the same number of pets as Obama.If Obama buys one pet, how many pets total does everyone buy?"""
obama_pets = 1
tim_pets = obama_pets
cindy_pets = obama_pets + obama_pets
boris_pets = obama_pets + obama_pets
total_pets = tim_pets + cindy_pets + boris_pets
result = total_pets
return result # math result is 5
```
CPAL's generated python code is correct.
```python
story outcome data
name code value depends_on
0 obama pass 1.0 []
1 bill bill.value = obama.value 1.0 [obama]
2 bob bob.value = obama.value 1.0 [obama]
3 ben ben.value = obama.value 1.0 [obama]
4 beth beth.value = obama.value 1.0 [obama]
5 cindy cindy.value = bill.value + bob.value 2.0 [bill, bob]
6 boris boris.value = ben.value + beth.value 2.0 [ben, beth]
7 tim tim.value = cindy.value + boris.value 4.0 [cindy, boris]
query data
{
"question": "how many pets total does everyone buy?",
"expression": "SELECT SUM(value) FROM df",
"llm_error_msg": ""
}
# query result is 13
```
Based on the comments below, CPAL's intended location in the library is
`experimental/chains/cpal` and PAL's location is`chains/pal`.
### CPAL vs Graph QA
Both the CPAL chain and the Graph QA chain extract entity-action-entity
relations into a DAG.
The CPAL chain is different from the Graph QA chain for a few reasons.
* Graph QA does not connect entities to math expressions
* Graph QA does not associate actions in a sequence of dependence.
* Graph QA does not decompose the narrative into these three parts:
1. Story plot or causal model
4. Hypothetical question
5. Hypothetical condition
### Evaluation
Preliminary evaluation on simple math word problems shows that this CPAL
chain generates less hallucination than the PAL chain on answering
questions about a causal narrative. Two examples are in [this jupyter
notebook](https://github.com/borisdev/langchain/blob/master/docs/extras/modules/chains/additional/cpal.ipynb)
doc.
## 2. Future application
### "Describe as Narrative, Test as Code"
The thesis here is that the Describe as Narrative, Test as Code approach
allows you to represent a causal mental model both as code and as a
narrative, giving you the best of both worlds.
#### Why describe a causal mental mode as a narrative?
The narrative form is quick. At a consensus building meeting, people use
narratives to persuade others of their causal mental model, aka. plan.
You can share, version control and index a narrative.
#### Why test a causal mental model as a code?
Code is testable, complex narratives are not. Though fast, narratives
are problematic as their complexity increases. The problem is LLMs and
humans are prone to hallucination when predicting the outcomes of a
narrative. The cost of building a consensus around the validity of a
narrative outcome grows as its narrative complexity increases. Code does
not require tribal knowledge or social power to validate.
Code is composable, complex narratives are not. The answer of one CPAL
chain can be the hypothetical conditions of another CPAL Chain. For
stochastic simulations, a composable plan can be integrated with the
[DoWhy library](https://github.com/py-why/dowhy). Lastly, for the
futuristic folk, a composable plan as code allows ordinary community
folk to design a plan that can be integrated with a blockchain for
funding.
An explanation of a dependency planning application is
[here.](https://github.com/borisdev/cpal-llm-chain-demo)
---
Twitter handle: @boris_dev
---------
Co-authored-by: Boris Dev <borisdev@Boriss-MacBook-Air.local>
Improve documentation for a central use-case, qa / chat over documents.
This will be merged as an update to `index.mdx`
[here](https://python.langchain.com/docs/use_cases/question_answering/).
Testing w/ local Docusaurus server:
```
From `docs` directory:
mkdir _dist
cp -r {docs_skeleton,snippets} _dist
cp -r extras/* _dist/docs_skeleton/docs
cd _dist/docs_skeleton
yarn install
yarn start
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
# [SPARQL](https://www.w3.org/TR/rdf-sparql-query/) for
[LangChain](https://github.com/hwchase17/langchain)
## Description
LangChain support for knowledge graphs relying on W3C standards using
RDFlib: SPARQL/ RDF(S)/ OWL with special focus on RDF \
* Works with local files, files from the web, and SPARQL endpoints
* Supports both SELECT and UPDATE queries
* Includes both a Jupyter notebook with an example and integration tests
## Contribution compared to related PRs and discussions
* [Wikibase agent](https://github.com/hwchase17/langchain/pull/2690) -
uses SPARQL, but specifically for wikibase querying
* [Cypher qa](https://github.com/hwchase17/langchain/pull/5078) - graph
DB question answering for Neo4J via Cypher
* [PR 6050](https://github.com/hwchase17/langchain/pull/6050) - tries
something similar, but does not cover UPDATE queries and supports only
RDF
* Discussions on [w3c mailing list](mailto:semantic-web@w3.org) related
to the combination of LLMs (specifically ChatGPT) and knowledge graphs
## Dependencies
* [RDFlib](https://github.com/RDFLib/rdflib)
## Tag maintainer
Graph database related to memory -> @hwchase17
[Apache HugeGraph](https://github.com/apache/incubator-hugegraph) is a
convenient, efficient, and adaptable graph database, compatible with the
Apache TinkerPop3 framework and the Gremlin query language.
In this PR, the HugeGraph and HugeGraphQAChain provide the same
functionality as the existing integration with Neo4j and enables query
generation and question answering over HugeGraph database. The
difference is that the graph query language supported by HugeGraph is
not cypher but another very popular graph query language
[Gremlin](https://tinkerpop.apache.org/gremlin.html).
A notebook example and a simple test case have also been added.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Handle the new retriever events in a way that (I think) is entirely
backwards compatible? Needs more testing for some of the chain changes
and all.
This creates an entire new run type, however. We could also just treat
this as an event within a chain run presumably (same with memory)
Adds a subclass initializer that upgrades old retriever implementations
to the new schema, along with tests to ensure they work.
First commit doesn't upgrade any of our retriever implementations (to
show that we can pass the tests along with additional ones testing the
upgrade logic).
Second commit upgrades the known universe of retrievers in langchain.
- [X] Add callback handling methods for retriever start/end/error (open
to renaming to 'retrieval' if you want that)
- [X] Update BaseRetriever schema to support callbacks
- [X] Tests for upgrading old "v1" retrievers for backwards
compatibility
- [X] Update existing retriever implementations to implement the new
interface
- [X] Update calls within chains to .{a]get_relevant_documents to pass
the child callback manager
- [X] Update the notebooks/docs to reflect the new interface
- [X] Test notebooks thoroughly
Not handled:
- Memory pass throughs: retrieval memory doesn't have a parent callback
manager passed through the method
---------
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
This pull request introduces a new feature to the LangChain QA Retrieval
Chains with Structures. The change involves adding a prompt template as
an optional parameter for the RetrievalQA chains that utilize the
recently implemented OpenAI Functions.
The main purpose of this enhancement is to provide users with the
ability to input a more customizable prompt to the chain. By introducing
a prompt template as an optional parameter, users can tailor the prompt
to their specific needs and context, thereby improving the flexibility
and effectiveness of the RetrievalQA chains.
## Changes Made
- Created a new optional parameter, "prompt", for the RetrievalQA with
structure chains.
- Added an example to the RetrievalQA with sources notebook.
My twitter handle is @El_Rey_Zero
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
This PR adds `KuzuGraph` and `KuzuQAChain` for interacting with [Kùzu
database](https://github.com/kuzudb/kuzu). Kùzu is an in-process
property graph database management system (GDBMS) built for query speed
and scalability. The `KuzuGraph` and `KuzuQAChain` provide the same
functionality as the existing integration with NebulaGraph and Neo4j and
enables query generation and question answering over Kùzu database.
A notebook example and a simple test case have also been added.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
- return raw and full output (but keep run shortcut method functional)
- change output parser to take in generations (good for working with
messages)
- add output parser to base class, always run (default to same as
current)
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>