Commit Graph

1969 Commits

Author SHA1 Message Date
Anurag
19e28d8784
feat: Allow users to pass additional arguments to the WebDriver (#4121)
This commit adds support for passing additional arguments to the
`SeleniumURLLoader` when creating Chrome or Firefox web drivers.
Previously, only a few arguments such as `headless` could be passed in.
With this change, users can pass any additional arguments they need as a
list of strings using the `arguments` parameter.

The `arguments` parameter allows users to configure the driver with any
options that are available for that particular browser. For example,
users can now pass custom `user_agent` strings or `proxy` settings using
this parameter.

This change also includes updated documentation and type hints to
reflect the new `arguments` parameter and its usage.
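
As a rough illustration of the kind of list the `arguments` parameter expects, the sketch below collects Chrome-style command-line switches. The helper `build_driver_arguments` is hypothetical and not part of the loader. The switches shown are Chrome's, and Firefox may need different options.

```python
from typing import List, Optional


def build_driver_arguments(
    user_agent: Optional[str] = None,
    proxy: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: collect Chrome-style switches as a list of strings."""
    arguments: List[str] = []
    if user_agent:
        arguments.append(f"--user-agent={user_agent}")
    if proxy:
        arguments.append(f"--proxy-server={proxy}")
    return arguments


driver_arguments = build_driver_arguments(
    user_agent="example-agent/1.0",
    proxy="http://localhost:8080",
)

# Assumed usage, based on the description above (needs langchain and selenium):
# loader = SeleniumURLLoader(urls=["https://example.com"], arguments=driver_arguments)
```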

fixes #4120
2023-05-05 13:24:42 -07:00
hp0404
2a3c5f8353
Update WhatsAppChatLoader regex to handle multiple date-time formats (#4186)
This PR updates the `message_line_regex` used by `WhatsAppChatLoader` to
support different date-time formats used in WhatsApp chat exports;
resolves #4153.

The new regex handles the following input formats:
```terminal
[05.05.23, 15:48:11] James: Hi here
[11/8/21, 9:41:32 AM] User name: Message 123
1/23/23, 3:19 AM - User 2: Bye!
1/23/23, 3:22_AM - User 1: And let me know if anything changes
```

Tests have been added to verify that the loader works correctly with all
formats.
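
A pattern covering the four input formats listed above might look like the sketch below. It is an illustration written for this note, not the loader's actual `message_line_regex`, and the group names are assumptions.

```python
import re

# Illustrative pattern only; not the loader's actual message_line_regex.
MESSAGE_LINE = re.compile(
    r"""
    \[?                                        # optional opening bracket
    (?P<date>\d{1,2}[./]\d{1,2}[./]\d{2,4})    # 05.05.23 or 11/8/21
    ,\s
    (?P<time>\d{1,2}:\d{2}(?::\d{2})?          # 15:48:11 or 3:19
        (?:[\s_]?[AP]M)?)                      # optional AM/PM marker
    \]?                                        # optional closing bracket
    (?:\s-)?\s                                 # separator before the sender
    (?P<sender>[^:]+?):\s
    (?P<text>.+)
    """,
    re.VERBOSE,
)

samples = [
    "[05.05.23, 15:48:11] James: Hi here",
    "[11/8/21, 9:41:32 AM] User name: Message 123",
    "1/23/23, 3:19 AM - User 2: Bye!",
    "1/23/23, 3:22_AM - User 1: And let me know if anything changes",
]

# Each sample is expected to match from the start of the line.
parsed = [MESSAGE_LINE.match(line).groupdict() for line in samples]
```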
2023-05-05 13:13:05 -07:00
Nicolas
a57259ec83
docs: Mendable Fixes and Improvements (#4184)
Overall fixes and improvements.
2023-05-05 13:04:24 -07:00
Harrison Chase
7dcc698ebf
bump version to 159 (#4183) 2023-05-05 09:31:08 -07:00
Harrison Chase
26534457f5
simplify csv args (#4182) 2023-05-05 09:22:08 -07:00
Eduard van Valkenburg
3095546851
PowerBI fix for table names with spaces (#4170)
small fix to make sure a table name with spaces is passed correctly to
the API for the schema lookup.
2023-05-05 09:15:47 -07:00
obbiondo
b1e2e29222
fix: remove expand parameter from ConfluenceLoader by label (#4181)
expand is not an allowed parameter for the method
confluence.get_all_pages_by_label, since it doesn't return the body of
the text but just metadata of documents

Co-authored-by: Andrea Biondo <a.biondo@reply.it>
2023-05-05 09:15:21 -07:00
Zander Chase
84cfa76e00
Update Cohere Reranker (#4180)
The forward ref annotations don't get updated if we only import with
type checking

---------

Co-authored-by: Abhinav Verma <abhinav_win12@yahoo.co.in>
2023-05-05 09:11:37 -07:00
Davis Chase
d84bb02881
Add Chroma self query (#4149)
Add internal query language -> chroma metadata filter translator
2023-05-05 08:43:08 -07:00
Vinoo Ganesh
905a2114d7
Fix: Typo in Docs (#4179)
Fixing small typo in docs
2023-05-05 08:35:49 -07:00
Ankush Gola
8de1b4c4c2
Revert "fix: #4128 missing run_manager parameter" (#4159)
Reverts hwchase17/langchain#4130
2023-05-05 00:52:16 -07:00
878d0c8155
fix: #4128 missing run_manager parameter (#4130)
`run_manager` was not being passed downstream. Not sure if this was a
deliberate choice but it seems like it broke many agent callbacks like
`agent_action` and `agent_finish`. This fix needs a proper review.

Co-authored-by: blob42 <spike@w530>
2023-05-04 23:59:55 -07:00
Zander Chase
6032a051e9
Add Tenant ID to V2 Tracer (#4135)
Update the V2 tracer to
- use UUIDs instead of ints
- load a tenant ID and use that when saving sessions
2023-05-04 21:35:20 -07:00
Zander Chase
fea639c1fc
Vwp/sqlalchemy (#4145)
Bump threshold to 1.4 from 1.3. Change import to be compatible

Resolves #4142 and #4129

---------

Co-authored-by: ndaugreal <ndaugreal@gmail.com>
Co-authored-by: Jeremy Lopez <lopez86@users.noreply.github.com>
2023-05-04 20:46:38 -07:00
Zander Chase
2f087d63af
Fix Python RePL Tool (#4137)
Filter out kwargs from inferred schema when determining if a tool is
single input.
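
The idea can be sketched with `inspect.signature` as a stand-in for the tool's inferred schema. The function `is_single_input` and the two example tools are hypothetical, and the actual change may work on the schema object instead.

```python
import inspect
from typing import Any, Callable


def is_single_input(func: Callable[..., Any]) -> bool:
    """Return True if func takes exactly one argument, ignoring **kwargs."""
    params = [
        p
        for p in inspect.signature(func).parameters.values()
        if p.kind is not inspect.Parameter.VAR_KEYWORD
    ]
    return len(params) == 1


def run_code(query: str, **kwargs: Any) -> str:
    # One real input plus a catch-all **kwargs.
    return query


def search(query: str, limit: int) -> str:
    # Two real inputs, so not single input.
    return query
```

With this filter, `run_code` counts as single input even though its signature includes `**kwargs`, while `search` does not.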

Add a couple unit tests.

Move tool unit tests to the tools dir
2023-05-04 20:31:16 -07:00
Zander Chase
cc068f1b77
Add Issue Templates (#4021)
Add issue templates for
- bug reports
- feature suggestions
- documentation
and a link to the discord for general discussion.

Open to other suggestions here. Could also add another "Other" template
with just a raw text box if we think this is too restrictive


2023-05-04 16:33:52 -07:00
Zander Chase
ac0a9d02bd
Visual Studio Code/Github Codespaces Dev Containers (#4035) (#4122)
Having dev containers makes it easier, faster and more secure to set up
the dev environment for the repository.

The pull request consists of:

- .devcontainer folder with:
    - **devcontainer.json:** (minimal necessary vscode extensions and
      settings)
    - **docker-compose.yaml:** (could be modified to run necessary
      services as needed, e.g. vectordbs, databases)
    - **Dockerfile:** (non root with dev tools)
- Changes to README:
    - added the Open in Github Codespaces Badge
    - added the Open in dev container Badge

Co-authored-by: Jinto Jose <129657162+jj701@users.noreply.github.com>
2023-05-04 11:37:00 -07:00
Harrison Chase
d86ed15d88
bump version to 158 (#4091) 2023-05-04 09:14:47 -07:00
OlajideOgun
624554a43a
DeepLake: Pass in rest of args to self._search_helper (#4080)
As of right now when trying to use functions like
`max_marginal_relevance_search()` or
`max_marginal_relevance_search_by_vector()` the rest of the kwargs are
not propagated to `self._search_helper()`. For example a user cannot
explicitly state the distance_metric they want to use when calling
`max_marginal_relevance_search`
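
The forwarding pattern is sketched below with a toy class. `SketchStore` is not the DeepLake integration, and the `distance_metric="cos"` value is only an assumed example.

```python
from typing import Any, Dict


class SketchStore:
    """Toy stand-in for a vector store; not the DeepLake integration."""

    def _search_helper(self, query: str, k: int = 4, **kwargs: Any) -> Dict[str, Any]:
        # Record what reached the helper so the forwarding is visible.
        return {"query": query, "k": k, **kwargs}

    def max_marginal_relevance_search(
        self, query: str, k: int = 4, **kwargs: Any
    ) -> Dict[str, Any]:
        # Forward the remaining keyword arguments instead of dropping them.
        return self._search_helper(query, k=k, **kwargs)


result = SketchStore().max_marginal_relevance_search("hello", distance_metric="cos")
```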
2023-05-04 02:14:22 -07:00
Eduard van Valkenburg
6d84541ff9
fix base url (#4095)
Noticed a mistake in the base url and group vs non-group urls
2023-05-04 02:08:21 -07:00
Harrison Chase
a9c2450330
Harrison/toml loader (#4090)
Co-authored-by: Mika Ayenson <Mikaayenson@users.noreply.github.com>
2023-05-03 23:14:39 -07:00
Harrison Chase
d4cf1eb60a
Add firestore memory (#3792) (#3941)
If you have any other suggestions or feedback, please let me know.

---------

Co-authored-by: yakigac <10434946+yakigac@users.noreply.github.com>
2023-05-03 22:55:47 -07:00
Harrison Chase
fba6921b50
Harrison/one drive loader (#4081)
Co-authored-by: José Ferraz Neto <netoferraz@gmail.com>
2023-05-03 22:55:34 -07:00
golergka
bd277b5327
feat: prune summary buffer (#4004)
If the library user has to decrease the `max_token_limit`, they would
probably want to prune the summary buffer even though they haven't added
any new messages.

Personally, I need it because I want to serialise the memory buffer
object and save it to a database, and when I load it, I may have
re-configured my code to have a shorter memory to save on tokens.
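
A minimal sketch of that scenario follows, under two stated assumptions: a whitespace word count stands in for the tokenizer, and joined text stands in for the LLM-generated summary. `SketchSummaryBuffer` and `count_tokens` are hypothetical names, not the library's classes.

```python
from dataclasses import dataclass, field
from typing import List


def count_tokens(text: str) -> int:
    # Stand-in tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class SketchSummaryBuffer:
    max_token_limit: int
    buffer: List[str] = field(default_factory=list)
    summary: str = ""

    def prune(self) -> None:
        """Move the oldest messages into the summary until the buffer fits."""
        pruned: List[str] = []
        while (
            self.buffer
            and sum(count_tokens(m) for m in self.buffer) > self.max_token_limit
        ):
            pruned.append(self.buffer.pop(0))
        if pruned:
            # A real implementation would summarize with an LLM; this joins the text.
            self.summary = " ".join(filter(None, [self.summary, *pruned]))


memory = SketchSummaryBuffer(
    max_token_limit=10,
    buffer=["hi there", "how are you", "fine thanks"],
)
memory.max_token_limit = 5  # limit lowered after loading; no new messages added
memory.prune()
```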
2023-05-03 22:45:48 -07:00
AndreLCanada
bf726f9d8a
Update python_repl docs (#4012)
In the example for creating a Python REPL tool under the Agent module,
the ".run" was omitted in the example. I believe this is required when
defining a Tool.
2023-05-03 22:45:32 -07:00
Mike Wang
67db495fcf
[agent] Add Spark Agent (#4020)
- added support for spark through pyspark library.
- added jupyter notebook as example.
2023-05-03 22:45:23 -07:00
Gengliang Wang
8af25867cb
Simplify HumanMessages in the quick start guide (#4026)
In the section `Get Message Completions from a Chat Model` of the quick
start guide, the HumanMessage doesn't need to include `Translate this
sentence from English to French.` when there is a system message.

Simplifying the HumanMessages in these examples can further demonstrate
the power of LLMs.
2023-05-03 22:45:03 -07:00
Harrison Chase
087a4bd2b8
improve agent documentation (#4062) 2023-05-03 22:44:01 -07:00
rogerserper
b1446bea5f
google-serper: async + full json results + support for Google Images, Places and News (#4078)
* implemented arun, results, and aresults. Reuses aiosession if
available.
* helper tools GoogleSerperRun and GoogleSerperResults
* support for Google Images, Places and News (examples given) and
filtering based on time (e.g. past hour)
* updated docs
2023-05-03 22:35:48 -07:00
mbchang
cdea47491d
refactor: refactor dialogue examples (DialogueAgent, DialogueSimulator) (#4074)
refactor dialogue examples to have same DialogueAgent and
DialogueSimulator definitions
2023-05-03 22:32:26 -07:00
Jan Philipp Harries
657f5f259f
Added option to reduce verbosity of Deeplake integration (#4038)
The deeplake integration was/is very verbose (see e.g. [the
documentation
example](https://python.langchain.com/en/latest/use_cases/code/code-analysis-deeplake.html))
when loading or creating a deeplake dataset, with only limited options
to dial down verbosity.

Additionally, the warning that a "Deep Lake Dataset already exists" was
confusing, as there is as far as I can tell no other way to load a
dataset.

This small PR changes that and introduces an explicit `verbose` argument
which is also passed to the deeplake library.

There should be minimal changes to the default output (the loading line
is printed instead of warned to make it consistent with `ds.summary()`,
which also prints).
2023-05-03 22:16:27 -07:00
Davis Chase
7f8727bbcd
Router chains (#4019)
Unpolished router examples to help flesh out abstractions and use cases 

---------

Co-authored-by: Shreya Rajpal <shreya.rajpal@gmail.com>
2023-05-03 22:02:55 -07:00
Pulkit Mehta
bbbca10704
issue#4082 base_language had wrong code comment that it was using gpt… (#4084)
…3 to tokenize text instead of gpt-2

Co-authored-by: Pulkit <pulkit.mehta@catylex.com>
2023-05-03 21:58:29 -07:00
Leonid Ganeline
6caba8e759
docs: added a link to the Google Scholar articles (#4007)
Google Scholar outputs a nice list of scientific and research articles
that use LangChain.
I added a link to the Google Scholar page to the `gallery` doc page
2023-05-03 21:54:44 -07:00
obbiondo
d18e788ee3
bugfix: return whole document when loading with ConfluenceLoader.load by label (#3980)
Method confluence.get_all_pages_by_label, returns only metadata about
documents with a certain label (such as pageId, titles, ...). To return
all documents with a certain label we need to extract all page ids given
a certain label and get pages content by these ids.
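
The two-step lookup can be sketched against an in-memory stub so that it runs without a Confluence server. `StubConfluence` and `load_pages_by_label` are hypothetical. The method names mirror the ones mentioned above, but the signatures and return shapes here are assumptions.

```python
from typing import Any, Dict, List


class StubConfluence:
    """In-memory stand-in so the sketch runs without a Confluence server."""

    _pages = {
        "1": {"id": "1", "title": "Intro", "body": {"storage": {"value": "<p>Hello</p>"}}},
        "2": {"id": "2", "title": "Setup", "body": {"storage": {"value": "<p>Steps</p>"}}},
    }

    def get_all_pages_by_label(self, label: str) -> List[Dict[str, Any]]:
        # Metadata only, mirroring the behavior described above.
        return [{"id": p["id"], "title": p["title"]} for p in self._pages.values()]

    def get_page_by_id(self, page_id: str, expand: str = "") -> Dict[str, Any]:
        return self._pages[page_id]


def load_pages_by_label(client: Any, label: str) -> List[Dict[str, Any]]:
    # Step 1: collect the page ids for the label.
    page_ids = [page["id"] for page in client.get_all_pages_by_label(label)]
    # Step 2: fetch each page's content by id.
    return [client.get_page_by_id(pid, expand="body.storage") for pid in page_ids]


pages = load_pages_by_label(StubConfluence(), "docs")
```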

---------

Co-authored-by: Andrea Biondo <a.biondo@reply.it>
2023-05-03 21:52:05 -07:00
Harrison Chase
5f30cc8713
Harrison/knn retriever (#4083)
Co-authored-by: Yuichi Tateno (secon) <hotchpotch@users.noreply.github.com>
2023-05-03 21:21:58 -07:00
Zander Chase
65c3b146c9
Accept str or list[str] for shell (#4060)
Relax the requirements
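
One way to accept either form is to normalize the input up front. The helper name `normalize_commands` is hypothetical and this is only a sketch of the idea, not the tool's code.

```python
from typing import List, Union


def normalize_commands(commands: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: accept a single command or a list of commands."""
    if isinstance(commands, str):
        return [commands]
    return list(commands)
```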
2023-05-03 21:11:06 -07:00
Harrison Chase
5a269d3175
Harrison/media wiki xml (#4072)
Co-authored-by: Géraud de Drouas <gdedrouas@users.noreply.github.com>
2023-05-03 20:45:33 -07:00
Zeeland
c186f18aab
fix: incorrect data type when construct_path in chain (#4031)
An incorrect data type error occurred when executing `_construct_path` in
`chain.py` as follows:

```python
Error with message replace() argument 2 must be str, not int
```

The path is always a string, but the result of `args.pop(param, "")` is
not guaranteed to be one.
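
A simplified stand-in for `_construct_path` shows the failure mode and one possible fix, coercing the popped value with `str()`. The function below and its placeholder syntax are assumptions for illustration. The actual fix may differ.

```python
from typing import Any, Dict


def construct_path(path: str, args: Dict[str, Any]) -> str:
    """Fill {param} placeholders in path; a simplified stand-in."""
    for param in [name for name in args if "{" + name + "}" in path]:
        # str.replace() requires a str, so coerce the popped value.
        path = path.replace("{" + param + "}", str(args.pop(param, "")))
    return path


url = construct_path("/pets/{pet_id}", {"pet_id": 42})
```

Without the `str()` call, passing the integer `42` to `str.replace` raises the `TypeError` quoted above.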
2023-05-03 18:49:47 -07:00
engkheng
349ba88aee
Export FileChatMessageHistory (#4042) 2023-05-03 18:14:47 -07:00
Nikolas Garske
1608f5dcae
Remove pip stdout and fix typo (#4050) 2023-05-03 18:06:39 -07:00
Ivo Stranic
3b556eae44
Update deeplake example (#4055) 2023-05-03 18:03:51 -07:00
Steve Kim
9b830f437c
Deleted importing Document from document_loaders.base because Documen… (#4068)
Hi,

- Modification:
https://python.langchain.com/en/latest/modules/indexes/document_loaders/examples/arxiv.html
- Reason: In this example, the first line is unnecessary because the
Document class does not exist in the base.
- Resolves: Issue #4052

--------
P.S.: This is my first pull request, so please let me know if I need to
correct anything or add more explanation.
2023-05-03 17:54:30 -07:00
hp0404
374725a715
Refactor TelegramChatLoader and FacebookChatLoader classes and add tests (#3863)
This PR includes two main changes:

- Refactor the `TelegramChatLoader` and `FacebookChatLoader` classes by
removing the dependency on pandas and simplifying the message filtering
process.

- Add test cases for the `TelegramChatLoader` and `FacebookChatLoader`
classes. This test ensures that the class correctly loads and processes
the example chat data, providing better test coverage for this
functionality.
2023-05-03 15:59:19 -07:00
Jon Saginaw
ea64b1716d
Enhancement: option to Get All Tokens with a single Blockchain Document Loader call (#3797)
The Blockchain Document Loader's default behavior is to return 100
tokens at a time which is the Alchemy API limit. The Document Loader
exposes a startToken that can be used for pagination against the API.

This enhancement includes an optional get_all_tokens param (default:
False) which will:

- Iterate over the Alchemy API until it receives all the tokens, and
return the tokens in a single call to the loader.
- Manage all/most tokenId formats (int, hex16 with no leading zeros, or
hex16 with all the leading zeros). There are no constraints on how smart
contracts can represent this value, but these three are the most common.

Note that a contract with 10,000 tokens will issue 100 calls to the
Alchemy API, and could take about a minute, which is why this param will
default to False. But I've been using the doc loader with these
utilities on the side, so figured it might make sense to build them in
for others to use.
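
The pagination loop can be sketched as follows. The page shape (a list of token ids plus the next start token, or `None` when done) and the fake fetch function are assumptions made so the example runs offline. They do not describe the Alchemy API's actual response format.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[int], Optional[int]]  # (token ids, next start token or None)


def fetch_all_tokens(fetch_page: Callable[[int], Page]) -> List[int]:
    """Call fetch_page repeatedly until it reports that no pages remain."""
    tokens: List[int] = []
    start_token: Optional[int] = 0
    while start_token is not None:
        page, start_token = fetch_page(start_token)
        tokens.extend(page)
    return tokens


calls: List[int] = []


def fake_fetch_page(start_token: int, total: int = 10_000, page_size: int = 100) -> Page:
    # Stand-in for one API call returning at most page_size tokens.
    calls.append(start_token)
    end = min(start_token + page_size, total)
    return list(range(start_token, end)), (end if end < total else None)


all_tokens = fetch_all_tokens(fake_fetch_page)
```

With 10,000 tokens and 100 tokens per page, the fake fetch function is called 100 times, matching the estimate above.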
2023-05-03 15:46:44 -07:00
Akash Sharma
525db1b6cb
Fixed typo leading to broken link (#4034) 2023-05-03 14:45:54 -07:00
Zander Chase
afa9d1292b
Re-Permit Partials in Tool (#4058)
Resolved issue #4053

Now that StructuredTool is a separate class, this constraint is no
longer needed.

Added/updated a unit test
2023-05-03 13:16:41 -07:00
Zander Chase
7e967aa4d5
Update Notebooks (#4051) 2023-05-03 09:31:02 -07:00
Nuno Campos
f3ec6d2449
Replace remaining usage of basellm with baselangmodel (#3981) 2023-05-02 21:52:29 -07:00
mbchang
f291fd7eed
docs: remove stdout from pip install (for gymnasium) (#3993) 2023-05-02 21:51:40 -07:00