# Fix typo + add wikipedia package installation part in
human_input_llm.ipynb
This PR
1. Fixes typo ("the the human input LLM"),
2. Addes wikipedia package installation part (in accordance with
`WikipediaQueryRun`
[documentation](https://python.langchain.com/en/latest/modules/agents/tools/examples/wikipedia.html))
in `human_input_llm.ipynb`
(`docs/modules/models/llms/examples/human_input_llm.ipynb`)
# Add link to Psychic from document loaders documentation page
In my previous PR I forgot to update `document_loaders.rst` to link to
`psychic.ipynb` to make it discoverable from the main documentation.
# Add AzureCognitiveServicesToolkit to call Azure Cognitive Services
API: achieve some multimodal capabilities
This PR adds a toolkit named AzureCognitiveServicesToolkit which bundles
the following tools:
- AzureCogsImageAnalysisTool: calls Azure Cognitive Services image
analysis API to extract caption, objects, tags, and text from images.
- AzureCogsFormRecognizerTool: calls Azure Cognitive Services form
recognizer API to extract text, tables, and key-value pairs from
documents.
- AzureCogsSpeech2TextTool: calls Azure Cognitive Services speech to
text API to transcribe speech to text.
- AzureCogsText2SpeechTool: calls Azure Cognitive Services text to
speech API to synthesize text to speech.
This toolkit can be used to process image, document, and audio inputs.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
OpenLM is a zero-dependency OpenAI-compatible LLM provider that can call
different inference endpoints directly via HTTP. It implements the
OpenAI Completion class so that it can be used as a drop-in replacement
for the OpenAI API. This changeset utilizes BaseOpenAI for minimal added
code.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Add Mastodon toots loader.
Loader works either with public toots, or Mastodon app credentials. Toot
text and user info is loaded.
I've also added integration test for this new loader as it works with
public data, and a notebook with example output run now.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Improve pinecone hybrid search retriever adding metadata support
I simply remove the hardwiring of metadata to the existing
implementation allowing one to pass `metadatas` attribute to the
constructors and in `get_relevant_documents`. I also add one missing pip
install to the accompanying notebook (I am not adding dependencies, they
were pre-existing).
First contribution, just hoping to help, feel free to critique :)
my twitter username is `@andreliebschner`
While looking at hybrid search I noticed #3043 and #1743. I think the
former can be closed as following the example right now (even prior to
my improvements) works just fine, the latter I think can be also closed
safely, maybe pointing out the relevant classes and example. Should I
reply those issues mentioning someone?
@dev2049, @hwchase17
---------
Co-authored-by: Andreas Liebschner <a.liebschner@shopfully.com>
### Submit Multiple Files to the Unstructured API
Enables batching multiple files into a single Unstructured API requests.
Support for requests with multiple files was added to both
`UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. Note that
if you submit multiple files in "single" mode, the result will be
concatenated into a single document. We recommend using this feature in
"elements" mode.
### Testing
The following should load both documents, using two of the example docs
from the integration tests folder.
```python
from langchain.document_loaders import UnstructuredAPIFileLoader
file_paths = ["examples/layout-parser-paper.pdf", "examples/whatsapp_chat.txt"]
loader = UnstructuredAPIFileLoader(
file_paths=file_paths,
api_key="FAKE_API_KEY",
strategy="fast",
mode="elements",
)
docs = loader.load()
```
# Corrected Misspelling in agents.rst Documentation
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get
-->
In the
[documentation](https://python.langchain.com/en/latest/modules/agents.html)
it says "in fact, it is often best to have an Action Agent be in
**change** of the execution for the Plan and Execute agent."
**Suggested Change:** I propose correcting change to charge.
Fix for issue: #5039
# Add documentation for Databricks integration
This is a follow-up of https://github.com/hwchase17/langchain/pull/4702
It documents the details of how to integrate Databricks using langchain.
It also provides examples in a notebook.
## Who can review?
@dev2049 @hwchase17 since you are aware of the context. We will promote
the integration after this doc is ready. Thanks in advance!
# Fixes an annoying typo in docs
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes Annoying typo in docs - "Therefor" -> "Therefore". It's so
annoying to read that I just had to make this PR.
# Streaming only final output of agent (#2483)
As requested in issue #2483, this Callback allows to stream only the
final output of an agent (ie not the intermediate steps).
Fixes#2483
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Add self query translator for weaviate vectorstore
Adds support for the EQ comparator and the AND/OR operators.
Co-authored-by: Dominic Chan <dchan@cppib.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Improve Evernote Document Loader
When exporting from Evernote you may export more than one note.
Currently the Evernote loader concatenates the content of all notes in
the export into a single document and only attaches the name of the
export file as metadata on the document.
This change ensures that each note is loaded as an independent document
and all available metadata on the note e.g. author, title, created,
updated are added as metadata on each document.
It also uses an existing optional dependency of `html2text` instead of
`pypandoc` to remove the need to download the pandoc application via
`download_pandoc()` to be able to use the `pypandoc` python bindings.
Fixes#4493
Co-authored-by: Mike McGarry <mike.mcgarry@finbourne.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Remove autoreload in examples
Remove the `autoreload` in examples since it is not necessary for most
users:
```
%load_ext autoreload,
%autoreload 2
```
# Powerbi API wrapper bug fix + integration tests
- Bug fix by removing `TYPE_CHECKING` in in utilities/powerbi.py
- Added integration test for power bi api in
utilities/test_powerbi_api.py
- Added integration test for power bi agent in
agent/test_powerbi_agent.py
- Edited .env.examples to help set up power bi related environment
variables
- Updated demo notebook with working code in
docs../examples/powerbi.ipynb - AzureOpenAI -> ChatOpenAI
Notes:
Chat models (gpt3.5, gpt4) are much more capable than davinci at writing
DAX queries, so that is important to getting the agent to work properly.
Interestingly, gpt3.5-turbo needed the examples=DEFAULT_FEWSHOT_EXAMPLES
to write consistent DAX queries, so gpt4 seems necessary as the smart
llm.
Fixes#4325
## Before submitting
Azure-core and Azure-identity are necessary dependencies
check integration tests with the following:
`pytest tests/integration_tests/utilities/test_powerbi_api.py`
`pytest tests/integration_tests/agent/test_powerbi_agent.py`
You will need a power bi account with a dataset id + table name in order
to test. See .env.examples for details.
## Who can review?
@hwchase17
@vowelparrot
---------
Co-authored-by: aditya-pethe <adityapethe1@gmail.com>
This PR adds support for Databricks runtime and Databricks SQL by using
[Databricks SQL Connector for
Python](https://docs.databricks.com/dev-tools/python-sql-connector.html).
As a cloud data platform, accessing Databricks requires a URL as follows
`databricks://token:{api_token}@{hostname}?http_path={http_path}&catalog={catalog}&schema={schema}`.
**The URL is **complicated** and it may take users a while to figure it
out**. Since the fields `api_token`/`hostname`/`http_path` fields are
known in the Databricks notebook, I am proposing a new method
`from_databricks` to simplify the connection to Databricks.
## In Databricks Notebook
After changes, Databricks users only need to specify the `catalog` and
`schema` field when using langchain.
<img width="881" alt="image"
src="https://github.com/hwchase17/langchain/assets/1097932/984b4c57-4c2d-489d-b060-5f4918ef2f37">
## In Jupyter Notebook
The method can be used on the local setup as well:
<img width="678" alt="image"
src="https://github.com/hwchase17/langchain/assets/1097932/142e8805-a6ef-4919-b28e-9796ca31ef19">
# Add Spark SQL support
* Add Spark SQL support. It can connect to Spark via building a
local/remote SparkSession.
* Include a notebook example
I tried some complicated queries (window function, table joins), and the
tool works well.
Compared to the [Spark Dataframe
agent](https://python.langchain.com/en/latest/modules/agents/toolkits/examples/spark.html),
this tool is able to generate queries across multiple tables.
---------
# Your PR Title (What it does)
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
---------
Co-authored-by: Gengliang Wang <gengliang@apache.org>
Co-authored-by: Mike W <62768671+skcoirz@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com>
Co-authored-by: 张城铭 <z@hyperf.io>
Co-authored-by: assert <zhangchengming@kkguan.com>
Co-authored-by: blob42 <spike@w530>
Co-authored-by: Yuekai Zhang <zhangyuekai@foxmail.com>
Co-authored-by: Richard He <he.yucheng@outlook.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Alexey Nominas <60900649+Chae4ek@users.noreply.github.com>
Co-authored-by: elBarkey <elbarkey@gmail.com>
Co-authored-by: Davis Chase <130488702+dev2049@users.noreply.github.com>
Co-authored-by: Jeffrey D <1289344+verygoodsoftwarenotvirus@users.noreply.github.com>
Co-authored-by: so2liu <yangliu35@outlook.com>
Co-authored-by: Viswanadh Rayavarapu <44315599+vishwa-rn@users.noreply.github.com>
Co-authored-by: Chakib Ben Ziane <contact@blob42.xyz>
Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
Co-authored-by: Jari Bakken <jari.bakken@gmail.com>
Co-authored-by: escafati <scafatieugenio@gmail.com>
# Zep Retriever - Vector Search Over Chat History with the Zep Long-term
Memory Service
More on Zep: https://github.com/getzep/zep
Note: This PR is related to and relies on
https://github.com/hwchase17/langchain/pull/4834. I did not want to
modify the `pyproject.toml` file to add the `zep-python` dependency a
second time.
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
# docs: updated `Supabase` notebook
- the title of the notebook was inconsistent (included redundant
"Vectorstore"). Removed this "Vectorstore"
- added `Postgress` to the title. It is important. The `Postgres` name
is much more popular than `Supabase`.
- added description for the `Postrgress`
- added more info to the `Supabase` description
# Update GPT4ALL integration
GPT4ALL have completely changed their bindings. They use a bit odd
implementation that doesn't fit well into base.py and it will probably
be changed again, so it's a temporary solution.
Fixes#3839, #4628
# Docs: compound ecosystem and integrations
**Problem statement:** We have a big overlap between the
References/Integrations and Ecosystem/LongChain Ecosystem pages. It
confuses users. It creates a situation when new integration is added
only on one of these pages, which creates even more confusion.
- removed References/Integrations page (but move all its information
into the individual integration pages - in the next PR).
- renamed Ecosystem/LongChain Ecosystem into Integrations/Integrations.
I like the Ecosystem term. It is more generic and semantically richer
than the Integration term. But it mentally overloads users. The
`integration` term is more concrete.
UPDATE: after discussion, the Ecosystem is the term.
Ecosystem/Integrations is the page (in place of Ecosystem/LongChain
Ecosystem).
As a result, a user gets a single place to start with the individual
integration.
# Fix bilibili api import error
bilibili-api package is depracated and there is no sync module.
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes#2673#2724
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@vowelparrot @liaokongVFX
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
# TextLoader auto detect encoding and enhanced exception handling
- Add an option to enable encoding detection on `TextLoader`.
- The detection is done using `chardet`
- The loading is done by trying all detected encodings by order of
confidence or raise an exception otherwise.
### New Dependencies:
- `chardet`
Fixes#4479
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
- @eyurtsev
---------
Co-authored-by: blob42 <spike@w530>
# Load specific file types from Google Drive (issue #4878)
Add the possibility to define what file types you want to load from
Google Drive.
```
loader = GoogleDriveLoader(
folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
file_types=["document", "pdf"]
recursive=False
)
```
Fixes ##4878
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
DataLoaders
- @eyurtsev
Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
RicChilligerDude#7589
---------
Co-authored-by: UmerHA <40663591+UmerHA@users.noreply.github.com>
#docs: text splitters improvements
Changes are only in the Jupyter notebooks.
- added links to the source packages and a short description of these
packages
- removed " Text Splitters" suffixes from the TOC elements (they made
the list of the text splitters messy)
- moved text splitters, based on the length function into a separate
list. They can be mixed with any classes from the "Text Splitters", so
it is a different classification.
## Who can review?
@hwchase17 - project lead
@eyurtsev
@vowelparrot
NOTE: please, check out the results of the `Python code` text splitter
example (text_splitters/examples/python.ipynb). It looks suboptimal.
# Added another helpful way for developers who want to set OpenAI API
Key dynamically
Previous methods like exporting environment variables are good for
project-wide settings.
But many use cases need to assign API keys dynamically, recently.
```python
from langchain.llms import OpenAI
llm = OpenAI(openai_api_key="OPENAI_API_KEY")
```
## Before submitting
```bash
export OPENAI_API_KEY="..."
```
Or,
```python
import os
os.environ["OPENAI_API_KEY"] = "..."
```
<hr>
Thank you.
Cheers,
Bongsang
# Documentation for Azure OpenAI embeddings model
- OPENAI_API_VERSION environment variable is needed for the endpoint
- The constructor does not work with model, it works with deployment.
I fixed it in the notebook.
(This is my first contribution)
## Who can review?
@hwchase17
@agola
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
# Docs: improvements in the `retrievers/examples/` notebooks
Its primary purpose is to make the Jupyter notebook examples
**consistent** and more suitable for first-time viewers.
- add links to the integration source (if applicable) with a short
description of this source;
- removed `_retriever` suffix from the file names (where it existed) for
consistency;
- removed ` retriever` from the notebook title (where it existed) for
consistency;
- added code to install necessary Python package(s);
- added code to set up the necessary API Key.
- very small fixes in notebooks from other folders (for consistency):
- docs/modules/indexes/vectorstores/examples/elasticsearch.ipynb
- docs/modules/indexes/vectorstores/examples/pinecone.ipynb
- docs/modules/models/llms/integrations/cohere.ipynb
- fixed misspelling in langchain/retrievers/time_weighted_retriever.py
comment (sorry, about this change in a .py file )
## Who can review
@dev2049
# Fixed typos (issues #4818 & #4668 & more typos)
- At some places, it said `model = ChatOpenAI(model='gpt-3.5-turbo')`
but should be `model = ChatOpenAI(model_name='gpt-3.5-turbo')`
- Fixes some other typos
Fixes#4818, #4668
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
# Removed usage of deprecated methods
Replaced `SQLDatabaseChain` deprecated direct initialisation with
`from_llm` method
## Who can review?
@hwchase17
@agola11
---------
Co-authored-by: imeckr <chandanroutray2012@gmail.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
- Installation of non-colab packages
- Get API keys
# Added dependencies to make notebook executable on hosted notebooks
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@hwchase17
@vowelparrot
- Installation of non-colab packages
- Get API keys
- Get rid of warnings
# Cleanup and added dependencies to make notebook executable on hosted
notebooks
@hwchase17
@vowelparrot
The current example in
https://python.langchain.com/en/latest/modules/agents/plan_and_execute.html
has inconsistent reasoning step (observing 28 years and thinking it's 26
years):
```
Observation: 28 years
Thought:Based on my search, Gigi Hadid's current age is 26 years old.
Action:
{
"action": "Final Answer",
"action_input": "Gigi Hadid's current age is 26 years old."
}
```
Guessing this is model noise. Rerunning seems to give correct answer of
28 years.
# Fix Telegram API loader + add tests.
I was testing this integration and it was broken with next error:
```python
message_threads = loader._get_message_threads(df)
KeyError: False
```
Also, this particular loader didn't have any tests / related group in
poetry, so I added those as well.
@hwchase17 / @eyurtsev please take a look on this fix PR.
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Cassandra support for chat history
### Description
- Store chat messages in cassandra
### Dependency
- cassandra-driver - Python Module
## Before submitting
- Added Integration Test
## Who can review?
@hwchase17
@agola11
# Your PR Title (What it does)
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
Co-authored-by: Jinto Jose <129657162+jj701@users.noreply.github.com>
# Add GraphQL Query Support
This PR introduces a GraphQL API Wrapper tool that allows LLM agents to
query GraphQL databases. The tool utilizes the httpx and gql Python
packages to interact with GraphQL APIs and provides a simple interface
for running queries with LLM agents.
@vowelparrot
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
### Adds a document loader for Docugami
Specifically:
1. Adds a data loader that talks to the [Docugami](http://docugami.com)
API to download processed documents as semantic XML
2. Parses the semantic XML into chunks, with additional metadata
capturing chunk semantics
3. Adds a detailed notebook showing how you can use additional metadata
returned by Docugami for techniques like the [self-querying
retriever](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query_retriever.html)
4. Adds an integration test, and related documentation
Here is an example of a result that is not possible without the
capabilities added by Docugami (from the notebook):
<img width="1585" alt="image"
src="https://github.com/hwchase17/langchain/assets/749277/bb6c1ce3-13dc-4349-a53b-de16681fdd5b">
---------
Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
Co-authored-by: Taqi Jaffri <tjaffri@gmail.com>
[OpenWeatherMapAPIWrapper](f70e18a5b3/docs/modules/agents/tools/examples/openweathermap.ipynb)
works wonderfully, but the _tool_ itself can't be used in master branch.
- added OpenWeatherMap **tool** to the public api, to be loadable with
`load_tools` by using "openweathermap-api" tool name (that name is used
in the existing
[docs](aff33d52c5/docs/modules/agents/tools/getting_started.md),
at the bottom of the page)
- updated OpenWeatherMap tool's **description** to make the input format
match what the API expects (e.g. `London,GB` instead of `'London,GB'`)
- added [ecosystem documentation page for
OpenWeatherMap](f9c41594fe/docs/ecosystem/openweathermap.md)
- added tool usage example to [OpenWeatherMap's
notebook](f9c41594fe/docs/modules/agents/tools/examples/openweathermap.ipynb)
Let me know if there's something I missed or something needs to be
updated! Or feel free to make edits yourself if that makes it easier for
you 🙂
[RELLM](https://github.com/r2d4/rellm) is a library that wraps local
HuggingFace pipeline models for structured decoding.
RELLM works by generating tokens one at a time. At each step, it masks
tokens that don't conform to the provided partial regular expression.
[JSONFormer](https://github.com/1rgs/jsonformer) is a bit different, where it sequentially adds the keys then decodes each value directly
**Problem statement:** the
[document_loaders](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html#)
section is too long and hard to comprehend.
**Proposal:** group document_loaders by 3 classes: (see `Files changed`
tab)
UPDATE: I've completely reworked the document_loader classification.
Now this PR changes only one file!
FYI @eyurtsev @hwchase17
[Text Generation
Inference](https://github.com/huggingface/text-generation-inference) is
a Rust, Python and gRPC server for generating text using LLMs.
This pull request add support for self hosted Text Generation Inference
servers.
feature: #4280
---------
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
# Add option to `load_huggingface_tool`
Expose a method to load a huggingface Tool from the HF hub
---------
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
Thanks to @anna-charlotte and @jupyterjazz for the contribution! Made
few small changes to get it across the finish line
---------
Signed-off-by: anna-charlotte <charlotte.gerhaher@jina.ai>
Signed-off-by: jupyterjazz <saba.sturua@jina.ai>
Co-authored-by: anna-charlotte <charlotte.gerhaher@jina.ai>
Co-authored-by: jupyterjazz <saba.sturua@jina.ai>
Co-authored-by: Saba Sturua <45267439+jupyterjazz@users.noreply.github.com>
# ODF File Loader
Adds a data loader for handling Open Office ODT files. Requires
`unstructured>=0.6.3`.
### Testing
The following should work using the `fake.odt` example doc from the
[`unstructured` repo](https://github.com/Unstructured-IO/unstructured).
```python
from langchain.document_loaders import UnstructuredODTLoader
loader = UnstructuredODTLoader(file_path="fake.odt", mode="elements")
loader.load()
loader = UnstructuredODTLoader(file_path="fake.odt", mode="single")
loader.load()
```
- added `Wikipedia` retriever. It is effectively a wrapper for
`WikipediaAPIWrapper`. It wrapps load() into get_relevant_documents()
- sorted `__all__` in the `retrievers/__init__`
- added integration tests for the WikipediaRetriever
- added an example (as Jupyter notebook) for the WikipediaRetriever
# Minor Wording Documentation Change
```python
agent_chain.run("When's my friend Eric's surname?")
# Answer with 'Zhu'
```
is change to
```python
agent_chain.run("What's my friend Eric's surname?")
# Answer with 'Zhu'
```
I think when is a residual of the old query that was "When’s my friends
Eric`s birthday?".
# Fix grammar in Text Splitters docs
Just a small fix of grammar in the documentation:
"That means there two different axes" -> "That means there are two
different axes"
Related: #4028, I opened a new PR because (1) I was unable to unstage
mistakenly committed files (I'm not familiar with git enough to resolve
this issue), (2) I felt closing the original PR and opening a new PR
would be more appropriate if I changed the class name.
This PR creates HumanInputLLM(HumanLLM in #4028), a simple LLM wrapper
class that returns user input as the response. I also added a simple
Jupyter notebook regarding how and why to use this LLM wrapper. In the
notebook, I went over how to use this LLM wrapper and showed example of
testing `WikipediaQueryRun` using HumanInputLLM.
I believe this LLM wrapper will be useful especially for debugging,
educational or testing purpose.
- Added the `Wikipedia` document loader. It is based on the existing
`unilities/WikipediaAPIWrapper`
- Added a respective ut-s and example notebook
- Sorted list of classes in __init__
- made notebooks consistent: titles, service/format descriptions.
- corrected short names to full names, for example, `Word` -> `Microsoft
Word`
- added missed descriptions
- renamed notebook files to make ToC correctly sorted
This implements a loader of text passages in JSON format. The `jq`
syntax is used to define a schema for accessing the relevant contents
from the JSON file. This requires dependency on the `jq` package:
https://pypi.org/project/jq/.
---------
Signed-off-by: Aivin V. Solatorio <avsolatorio@gmail.com>
In the example for creating a Python REPL tool under the Agent module,
the ".run" was omitted in the example. I believe this is required when
defining a Tool.
In the section `Get Message Completions from a Chat Model` of the quick
start guide, the HumanMessage doesn't need to include `Translate this
sentence from English to French.` when there is a system message.
Simplify HumanMessages in these examples can further demonstrate the
power of LLM.
* implemented arun, results, and aresults. Reuses aiosession if
available.
* helper tools GoogleSerperRun and GoogleSerperResults
* support for Google Images, Places and News (examples given) and
filtering based on time (e.g. past hour)
* updated docs
Single edit to: models/text_embedding/examples/openai.ipynb - Line 88:
changed from: "embeddings = OpenAIEmbeddings(model_name=\"ada\")" to
"embeddings = OpenAIEmbeddings()" as model_name is no longer part of the
OpenAIEmbeddings class.
Seems the pyllamacpp package is no longer the supported bindings from
gpt4all. Tested that this works locally.
Given that the older models weren't very performant, I think it's better
to migrate now without trying to include a lot of try / except blocks
---------
Co-authored-by: Nissan Pow <npow@users.noreply.github.com>
Co-authored-by: Nissan Pow <pownissa@amazon.com>
Modified Modern Treasury and Strip slightly so credentials don't have to
be passed in explicitly. Thanks @mattgmarcus for adding Modern Treasury!
---------
Co-authored-by: Matt Marcus <matt.g.marcus@gmail.com>
Haven't gotten to all of them, but this:
- Updates some of the tools notebooks to actually instantiate a tool
(many just show a 'utility' rather than a tool. More changes to come in
separate PR)
- Move the `Tool` and decorator definitions to `langchain/tools/base.py`
(but still export from `langchain.agents`)
- Add scene explain to the load_tools() function
- Add unit tests for public apis for the langchain.tools and langchain.agents modules