`NotebookLoader.load()` loads the `.ipynb` notebook file into a
`Document` object.
**Parameters**:
* `include_outputs` (bool): whether to include cell outputs in the
resulting document (default is False).
* `max_output_length` (int): the maximum number of characters to include
from each cell output (default is 10).
* `remove_newline` (bool): whether to remove newline characters from the
cell sources and outputs (default is False).
* `traceback` (bool): whether to include full traceback (default is
False).
Link for easier navigation (it's not immediately clear where to find
more info on SimpleSequentialChain (3 clicks away)
---------
Co-authored-by: Larry Fisherman <l4rryfisherman@protonmail.com>
Added a GitBook document loader. It lets you both, (1) fetch text from
any single GitBook page, or (2) fetch all relative paths and return
their respective content in Documents.
I've modified the `scrape` method in the `WebBaseLoader` to accept
custom web paths if given, but happy to remove it and move that logic
into the `GitbookLoader` itself.
### Description
This PR adds a wrapper which adds support for the OpenSearch vector
database. Using opensearch-py client we are ingesting the embeddings of
given text into opensearch cluster using Bulk API. We can perform the
`similarity_search` on the index using the 3 popular searching methods
of OpenSearch k-NN plugin:
- `Approximate k-NN Search` use approximate nearest neighbor (ANN)
algorithms from the [nmslib](https://github.com/nmslib/nmslib),
[faiss](https://github.com/facebookresearch/faiss), and
[Lucene](https://lucene.apache.org/) libraries to power k-NN search.
- `Script Scoring` extends OpenSearch’s script scoring functionality to
execute a brute force, exact k-NN search.
- `Painless Scripting` adds the distance functions as painless
extensions that can be used in more complex combinations. Also, supports
brute force, exact k-NN search like Script Scoring.
### Issues Resolved
https://github.com/hwchase17/langchain/issues/1054
---------
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Follow-up of @hinthornw's PR:
- Migrate the Tool abstraction to a separate file (`BaseTool`).
- `Tool` implementation of `BaseTool` takes in function and coroutine to
more easily maintain backwards compatibility
- Add a Toolkit abstraction that can own the generation of tools around
a shared concept or state
---------
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
Co-authored-by: Dhruv Anand <105786647+dhruv-anand-aintech@users.noreply.github.com>
Co-authored-by: cragwolfe <cragcw@gmail.com>
Co-authored-by: Anton Troynikov <atroyn@users.noreply.github.com>
Co-authored-by: Oliver Klingefjord <oliver@klingefjord.com>
Co-authored-by: William Fu-Hinthorn <whinthorn@Williams-MBP-3.attlocal.net>
Co-authored-by: Bruno Bornsztein <bruno.bornsztein@gmail.com>
This is a work in progress PR to track my progres.
## TODO:
- [x] Get results using the specifed searx host
- [x] Prioritize returning an `answer` or results otherwise
- [ ] expose the field `infobox` when available
- [ ] expose `score` of result to help agent's decision
- [ ] expose the `suggestions` field to agents so they could try new
queries if no results are found with the orignial query ?
- [ ] Dynamic tool description for agents ?
- Searx offers many engines and a search syntax that agents can take
advantage of. It would be nice to generate a dynamic Tool description so
that it can be used many times as a tool but for different purposes.
- [x] Limit number of results
- [ ] Implement paging
- [x] Miror the usage of the Google Search tool
- [x] easy selection of search engines
- [x] Documentation
- [ ] update HowTo guide notebook on Search Tools
- [ ] Handle async
- [ ] Tests
### Add examples / documentation on possible uses with
- [ ] getting factual answers with `!wiki` option and `infoboxes`
- [ ] getting `suggestions`
- [ ] getting `corrections`
---------
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Alternate implementation to PR #960 Again - only FAISS is implemented.
If accepted can add this to other vectorstores or leave as
NotImplemented? Suggestions welcome...
This PR updates `PromptLayerOpenAI` to now support requests using the
[Async
API](https://langchain.readthedocs.io/en/latest/modules/llms/async_llm.html)
It also updates the documentation on Async API to let users know that
PromptLayerOpenAI also supports this.
`PromptLayerOpenAI` now redefines `_agenerate` a similar was to how it
redefines `_generate`
Adds Google Search integration with [Serper](https://serper.dev) a
low-cost alternative to SerpAPI (10x cheaper + generous free tier).
Includes documentation, tests and examples. Hopefully I am not missing
anything.
Developers can sign up for a free account at
[serper.dev](https://serper.dev) and obtain an api key.
## Usage
```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.llms.openai import OpenAI
from langchain.agents import initialize_agent, Tool
import os
os.environ["SERPER_API_KEY"] = ""
os.environ['OPENAI_API_KEY'] = ""
llm = OpenAI(temperature=0)
search = GoogleSerperAPIWrapper()
tools = [
Tool(
name="Intermediate Answer",
func=search.run
)
]
self_ask_with_search = initialize_agent(tools, llm, agent="self-ask-with-search", verbose=True)
self_ask_with_search.run("What is the hometown of the reigning men's U.S. Open champion?")
```
### Output
```
Entering new AgentExecutor chain...
Yes.
Follow up: Who is the reigning men's U.S. Open champion?
Intermediate answer: Current champions Carlos Alcaraz, 2022 men's singles champion.
Follow up: Where is Carlos Alcaraz from?
Intermediate answer: El Palmar, Spain
So the final answer is: El Palmar, Spain
> Finished chain.
'El Palmar, Spain'
```
This PR updates the usage instructions for PromptLayerOpenAI in
Langchain's documentation. The updated instructions provide more detail
and conform better to the style of other LLM integration documentation
pages.
No code changes were made in this PR, only improvements to the
documentation. This update will make it easier for users to understand
how to use `PromptLayerOpenAI`
Currently the chain is getting the column names and types on the one
side and the example rows on the other. It is easier for the llm to read
the table information if the column name and examples are shown together
so that it can easily understand to which columns do the examples refer
to. For an instantiation of this, please refer to the changes in the
`sqlite.ipynb` notebook.
Also changed `eval` for `ast.literal_eval` when interpreting the results
from the sample row query since it is a better practice.
---------
Co-authored-by: Francisco Ingham <>
---------
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
The provided example uses the default `max_length` of `20` tokens, which
leads to the example generation getting cut off. 20 tokens is way too
short to show CoT reasoning, so I boosted it to `64`.
Without knowing HF's API well, it can be hard to figure out just where
those `model_kwargs` come from, and `max_length` is a super critical
one.
Co-authored-by: Andrew White <white.d.andrew@gmail.com>
Co-authored-by: Harrison Chase <harrisonchase@Harrisons-MBP.attlocal.net>
Co-authored-by: Peng Qu <82029664+pengqu123@users.noreply.github.com>
Supporting asyncio in langchain primitives allows for users to run them
concurrently and creates more seamless integration with
asyncio-supported frameworks (FastAPI, etc.)
Summary of changes:
**LLM**
* Add `agenerate` and `_agenerate`
* Implement in OpenAI by leveraging `client.Completions.acreate`
**Chain**
* Add `arun`, `acall`, `_acall`
* Implement them in `LLMChain` and `LLMMathChain` for now
**Agent**
* Refactor and leverage async chain and llm methods
* Add ability for `Tools` to contain async coroutine
* Implement async SerpaPI `arun`
Create demo notebook.
Open questions:
* Should all the async stuff go in separate classes? I've seen both
patterns (keeping the same class and having async and sync methods vs.
having class separation)
PR to fix outdated environment details in the docs, see issue #897
I added code comments as pointers to where users go to get API keys, and
where they can find the relevant environment variable.
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
Signed-off-by: Frank Liu <frank.liu@zilliz.com>
Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com>
Co-authored-by: Frank Liu <frank@frankzliu.com>
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.
Centralizing this logic and using urllib fixes an issue identified by
some windows users highlighted in this video -
https://youtu.be/aJ6IQUh8MLQ?t=537
The agents usually benefit from understanding what the data looks like
to be able to filter effectively. Sending just one row in the table info
allows the agent to understand the data before querying and get better
results.
---------
Co-authored-by: Francisco Ingham <>
---------
Co-authored-by: Francisco Ingham <fpingham@gmail.com>
On the [Getting Started
page](https://langchain.readthedocs.io/en/latest/modules/prompts/getting_started.html)
for prompt templates, I believe the very last example
```python
print(dynamic_prompt.format(adjective=long_string))
```
should actually be
```python
print(dynamic_prompt.format(input=long_string))
```
The existing example produces `KeyError: 'input'` as expected
***
On the [Create a custom prompt
template](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/custom_prompt_template.html#id1)
page, I believe the line
```python
Function Name: {kwargs["function_name"]}
```
should actually be
```python
Function Name: {kwargs["function_name"].__name__}
```
The existing example produces the prompt:
```
Given the function name and source code, generate an English language explanation of the function.
Function Name: <function get_source_code at 0x7f907bc0e0e0>
Source Code:
def get_source_code(function_name):
# Get the source code of the function
return inspect.getsource(function_name)
Explanation:
```
***
On the [Example
Selectors](https://langchain.readthedocs.io/en/latest/modules/prompts/examples/example_selectors.html)
page, the first example does not define `example_prompt`, which is also
subtly different from previous example prompts used. For user
convenience, I suggest including
```python
example_prompt = PromptTemplate(
input_variables=["input", "output"],
template="Input: {input}\nOutput: {output}",
)
```
in the code to be copy-pasted
tl;dr: input -> word, output -> antonym, rename to dynamic_prompt
consistently
The provided code in this example doesn't run, because the keys are
`word` and `antonym`, rather than `input` and `output`.
Also, the `ExampleSelector`-based prompt is named `few_shot_prompt` when
defined and `dynamic_prompt` in the follow-up example. The former name
is less descriptive and collides with an earlier example, so I opted for
the latter.
Thanks for making a really cool library!
For using Azure OpenAI API, we need to set multiple env vars. But as can
be seen in openai package
[here](48b69293a3/openai/__init__.py (L35)),
the env var for setting base url is named `OPENAI_API_BASE` and not
`OPENAI_API_BASE_URL`. This PR fixes that part in the documentation.
add a chain that applies a prompt to all inputs and then returns not
only an answer but scores it
add examples for question answering and question answering with sources
Small quick fix:
Suggest making the order of the menu the same as it is written on the
page (Getting Started -> Key Concepts). Before the menu order was not
the same as it was on the page. Not sure if this is the only place the
menu is affected.
Mismatch is found here:
https://langchain.readthedocs.io/en/latest/modules/llms.html
- Add support for local build and linkchecking of docs
- Add GitHub Action to automatically check links before prior to
publication
- Minor reformat of Contributing readme
- Fix existing broken links
Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
Co-authored-by: Hunter Gerlach <HunterGerlach@users.noreply.github.com>
Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
Big docs refactor! Motivation is to make it easier for people to find
resources they are looking for. To accomplish this, there are now three
main sections:
- Getting Started: steps for getting started, walking through most core
functionality
- Modules: these are different modules of functionality that langchain
provides. Each part here has a "getting started", "how to", "key
concepts" and "reference" section (except in a few select cases where it
didnt easily fit).
- Use Cases: this is to separate use cases (like summarization, question
answering, evaluation, etc) from the modules, and provide a different
entry point to the code base.
There is also a full reference section, as well as extra resources
(glossary, gallery, etc)
Co-authored-by: Shreya Rajpal <ShreyaR@users.noreply.github.com>