Commit Graph

3246 Commits

Author SHA1 Message Date
Cristóbal Carnero Liñán
e494b0a09f
feat (documents): add a source code loader based on AST manipulation (#6486)
#### Summary

A new approach to loading source code is implemented:

Each top-level function and class in the code is loaded into separate
documents. Then, an additional document is created with the top-level
code, but without the already loaded functions and classes.

This could improve the accuracy of QA chains over source code.

For instance, having this script:

```
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")

def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()

if __name__ == '__main__':
    main()
```

The loader will create three documents with this content:

First document:
```
class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")
```

Second document:
```
def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()
```

Third document:
```
# Code for: class MyClass:

# Code for: def main():

if __name__ == '__main__':
    main()
```

A threshold parameter is added to control whether small scripts are
split in this way or not.

At this moment, only Python and JavaScript are supported. The
appropriate parser is determined by examining the file extension.
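
As a rough illustration of the splitting described above, the following sketch uses Python's standard `ast` module. It is not the loader's actual code: the function name `split_top_level` and the `SCRIPT` constant are hypothetical, only Python is handled, and the threshold parameter is left out.

```python
import ast

# The example script from the description above.
SCRIPT = """class MyClass:
    def __init__(self, name):
        self.name = name

    def greet(self):
        print(f"Hello, {self.name}!")

def main():
    name = input("Enter your name: ")
    obj = MyClass(name)
    obj.greet()

if __name__ == '__main__':
    main()
"""


def split_top_level(source: str) -> list[str]:
    """Return one string per top-level function or class, then the simplified remainder."""
    tree = ast.parse(source)
    lines = source.splitlines()
    segments = []
    simplified = list(lines)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno is 1-based; end_lineno is inclusive (Python 3.8+).
            # Decorators above a definition are not handled in this sketch.
            start, end = node.lineno - 1, node.end_lineno
            segments.append("\n".join(lines[start:end]))
            # Replace the definition with a one-line placeholder comment.
            simplified[start] = f"# Code for: {lines[start]}"
            for i in range(start + 1, end):
                simplified[i] = None
    remainder = "\n".join(line for line in simplified if line is not None)
    return segments + [remainder]


documents = split_top_level(SCRIPT)
for doc in documents:
    print(doc)
    print("-----")
```

Run on the example script, the sketch yields the same three documents listed in the description.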

#### Tests

This PR adds:

- Unit tests
- Integration tests

#### Dependencies

Only one dependency was added as optional (needed for the JavaScript
parser).

#### Documentation

A notebook is added showing how the loader can be used.

#### Who can review?

@eyurtsev @hwchase17

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-27 15:58:47 -07:00
Robert Lewis
da462d9dd4
Zapier update oauth support (#6780)
Description: Update documentation to

1) point to the updated documentation links at Zapier.com (we've revamped
our help docs and paths), and
2) clarify how to use the wrapper with an access token for
OAuth support

Demo:

Initializing the Zapier Wrapper with an OAuth Access Token

`ZapierNLAWrapper(zapier_nla_oauth_access_token="<redacted>")`

Using LangChain to resolve the current weather in Vancouver BC,
leveraging Zapier NLA to look up weather by coordinates.

```
> Entering new  chain...
 I need to use a tool to get the current weather.
Action: The Weather: Get Current Weather
Action Input: Get the current weather for Vancouver BC
Observation: {"coord__lon": -123.1207, "coord__lat": 49.2827, "weather": [{"id": 802, "main": "Clouds", "description": "scattered clouds", "icon": "03d", "icon_url": "http://openweathermap.org/img/wn/03d@2x.png"}], "weather[]icon_url": ["http://openweathermap.org/img/wn/03d@2x.png"], "weather[]icon": ["03d"], "weather[]id": [802], "weather[]description": ["scattered clouds"], "weather[]main": ["Clouds"], "base": "stations", "main__temp": 71.69, "main__feels_like": 71.56, "main__temp_min": 67.64, "main__temp_max": 76.39, "main__pressure": 1015, "main__humidity": 64, "visibility": 10000, "wind__speed": 3, "wind__deg": 155, "wind__gust": 11.01, "clouds__all": 41, "dt": 1687806607, "sys__type": 2, "sys__id": 2011597, "sys__country": "CA", "sys__sunrise": 1687781297, "sys__sunset": 1687839730, "timezone": -25200, "id": 6173331, "name": "Vancouver", "cod": 200, "summary": "scattered clouds", "_zap_search_was_found_status": true}
Thought: I now know the current weather in Vancouver BC.
Final Answer: The current weather in Vancouver BC is scattered clouds with a temperature of 71.69 and wind speed of 3
```
2023-06-27 11:46:32 -07:00
Joshua Carroll
24e4ae95ba
Initial Streamlit callback integration doc (md) (#6788)
**Description:** Add a documentation page for the Streamlit Callback
Handler integration (#6315)

Notes:
- Implemented as a markdown file instead of a notebook since example
code runs in a Streamlit app (happy to discuss / consider alternatives
now or later)
- Contains an embedded Streamlit app ->
https://mrkl-minimal.streamlit.app/ (currently this app is hosted out of
a Streamlit repo, but we're working to migrate the code to a
LangChain-owned repo)


![streamlit_docs](https://github.com/hwchase17/langchain/assets/116604821/0b7a6239-361f-470c-8539-f22c40098d1a)

cc @dev2049 @tconkling
2023-06-27 11:43:49 -07:00
Harrison Chase
8392ca602c
bump version to 217 (#6831) 2023-06-27 09:39:56 -07:00
Ismail Pelaseyed
fcb3a64799
Add support for passing headers and search params to openai openapi chain (#6782)
- Description: add support for passing headers and search params to
OpenAI OpenAPI chains.
  - Issue: n/a
  - Dependencies: n/a
  - Tag maintainer: @hwchase17
  - Twitter handle: @pelaseyed

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-27 09:09:03 -07:00
Zander Chase
e1fdb67440
Update description in Evals notebook (#6808) 2023-06-27 00:26:49 -07:00
Zander Chase
ad028bbb80
Permit Constitutional Principles (#6807)
In the criteria evaluator.
2023-06-27 00:23:54 -07:00
Zander Chase
6ca383ecf6
Update to RunOnDataset helper functions to accept evaluator callbacks (#6629)
Also improve docstrings and update the tracing datasets notebook to
focus on "debug, evaluate, monitor"
2023-06-26 23:58:13 -07:00
WaseemH
7ac9b22886
RecusiveUrlLoader to RecursiveUrlLoader (#6787) 2023-06-26 23:12:14 -07:00
Mshoven
4535b0b41e
🎯Bug: format the url and path_params (#6755)
- Description: format the url and path_params correctly, 
  - Issue: #6753,
  - Dependencies: None,
  - Tag maintainer: @vowelparrot,
  - Twitter handle: @0xbluesecurity
2023-06-26 23:03:57 -07:00
Zander Chase
07d802d088
Don't raise error if parent not found (#6538)
Done so that you can pass in a run from the low level api
2023-06-26 22:57:52 -07:00
Leonid Ganeline
49c864fa18
docs: vectorstore upgrades 2 (#6796)
updated vectorstores/ notebooks; added new integrations into
ecosystem/integrations/
@dev2049
@rlancemartin, @eyurtsev
2023-06-26 22:55:04 -07:00
Zander Chase
d7dbf4aefe
Clean up agent trajectory interface (#6799)
- Enable reference
- Enable not specifying tools at the start
- Add methods with keywords
2023-06-26 22:54:04 -07:00
Zander Chase
cc60fed3be
Add a Pairwise Comparison Chain (#6703)
The notebook shows preference scoring between two chains and reports the
Wilson score interval and p-value.

I think I'll add the option to insert ground truth labels, but that doesn't
have to be in this PR
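
For readers unfamiliar with the two statistics mentioned above, here is a minimal sketch of how they can be computed from pairwise preference counts. It is not taken from the notebook: the function names are hypothetical, ties are assumed to be excluded, and the p-value uses an exact two-sided binomial test against "no preference" (p = 0.5), which is one reasonable choice and may differ from what the notebook does.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for the proportion wins / n (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


def binomial_p_value(wins: int, n: int) -> float:
    """Exact two-sided binomial test of H0: both chains are equally preferred."""
    k = max(wins, n - wins)
    # Upper tail probability P(X >= k) under p = 0.5, doubled by symmetry.
    tail = sum(math.comb(n, i) for i in range(k, n + 1)) / 2**n
    return min(1.0, 2 * tail)


# Example: chain A was preferred in 7 of 10 comparisons.
low, high = wilson_interval(7, 10)
print(f"Wilson 95% interval: ({low:.3f}, {high:.3f})")
print(f"p-value: {binomial_p_value(7, 10):.5f}")
```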
2023-06-26 20:47:41 -07:00
Hakan Tekgul
2928b080f6
Update arize_callback.py - bug fix (#6784)
- Description: Bug Fix - Added a step variable to keep track of prompts
- Issue: Bug from internal Arize testing - The prompts and responses
that are ingested were not mapped correctly
  - Dependencies: N/A
2023-06-26 16:49:46 -07:00
Zander Chase
c460b04c64
Update String Evaluator (#6615)
- Add protocol for `evaluate_strings` 
- Move the criteria evaluator out so it's not restricted to being
applied on traced runs
2023-06-26 14:16:14 -07:00
AaaCabbage
b3f8324de9
feat: fix the Chinese characters in the solution content will be conv… (#6734)
Fix an issue where Chinese characters in the solution content were
converted to ASCII escape sequences, resulting in an abnormally large
number of tokens
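
One common way this kind of escaping arises in Python is `json.dumps`, which escapes non-ASCII characters by default. Whether that is the exact call this commit changed is an assumption; the sketch below only illustrates the effect, using a made-up payload.

```python
import json

# Hypothetical payload containing Chinese text.
payload = {"solution": "你好，世界"}

# Default behaviour: every non-ASCII character becomes a \uXXXX escape.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are kept as-is.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(readable)
print(len(escaped), len(readable))
```

The escaped form is several times longer per character, which is consistent with the inflated token counts the commit describes.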


Co-authored-by: qixin <qixin@fintec.ai>
2023-06-26 13:14:48 -07:00
Chris Pappalardo
70f7c2bb2e
align chroma vectorstore get with chromadb to enable where filtering (#6686)
allows for where filtering on collection via get

- Description: aligns langchain chroma vectorstore get with underlying
[chromadb collection
get](https://github.com/chroma-core/chroma/blob/main/chromadb/api/models/Collection.py#L103)
allowing for where filtering, etc.
  - Issue: NA
  - Dependencies: none
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: @pappanaka
2023-06-26 10:51:20 -07:00
Zander Chase
9ca3b4645e
Add support for tags in chain group context manager (#6668)
Lets you specify local and inheritable tags in the group manager.

Also, add more verbose docstrings for our reference docs.
2023-06-26 10:37:33 -07:00
Harrison Chase
d1bcc58beb
bump version to 216 (#6770) 2023-06-26 09:46:19 -07:00
Zander Chase
6d30acffcb
Fix breaking tags (#6765)
Fix tags change that broke old way of initializing agent

Closes #6756
2023-06-26 09:28:11 -07:00
James Croft
ba622764cb
Improve performance when retrieving Notion DB pages (#6710) 2023-06-26 05:46:09 -07:00
Richy Wang
ec8247ec59
Fixed bug in AnalyticDB Vector Store caused by upgrade SQLAlchemy version (#6736) 2023-06-26 05:35:25 -07:00
Santiago Delgado
d84a3bcf7a
Office365 Tool (#6306)
#### Background
With the development of [structured
tools](https://blog.langchain.dev/structured-tools/), the LangChain team
expanded the platform's functionality to meet the needs of new
applications. The GMail tool, empowered by structured tools, now
supports multiple arguments and powerful search capabilities,
demonstrating LangChain's ability to interact with dynamic data sources
like email servers.

#### Challenge
The current GMail tool only supports GMail, while users often utilize
other email services like Outlook in Office365. Additionally, the
proposed calendar tool in PR
https://github.com/hwchase17/langchain/pull/652 only works with Google
Calendar, not Outlook.

#### Changes
This PR implements an Office365 integration for LangChain, enabling
seamless email and calendar functionality with a single authentication
process.

#### Future Work
With the core Office365 integration complete, future work could include
integrating other Office365 tools such as Tasks and Address Book.

#### Who can review?
@hwchase17 or @vowelparrot can review this PR

#### Appendix
@janscas, I utilized your [O365](https://github.com/O365/python-o365)
library extensively. Given the rising popularity of LangChain and
similar AI frameworks, the convergence of libraries like O365 and tools
like this one is likely. So, I wanted to keep you updated on our
progress.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-26 02:59:09 -07:00
Xiaochao Dong
a15afc102c
Relax the action input check for actions that require no input (#6357)
When the tool requires no input, the LLM often gives something like
this:
```json
{
    "action": "just_do_it"
}
```
I have attempted to enhance the prompt, but it doesn't appear to be
functioning effectively. Therefore, I believe we should consider easing
the check a little bit.
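
A minimal sketch of what a relaxed check could look like, assuming the input is carried under an `action_input` key (that key name and the `parse_action` helper are assumptions for illustration, not the code changed in this PR):

```python
import json


def parse_action(blob: str) -> tuple[str, object]:
    """Parse an action blob, tolerating a missing input for no-argument tools."""
    data = json.loads(blob)
    action = data["action"]  # the action name is still required
    # Relaxed check: fall back to an empty input instead of raising KeyError.
    action_input = data.get("action_input", {})
    return action, action_input


print(parse_action('{"action": "just_do_it"}'))
print(parse_action('{"action": "search", "action_input": "weather"}'))
```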



Signed-off-by: Xiaochao Dong (@damnever) <the.xcdong@gmail.com>
2023-06-26 02:30:17 -07:00
Ethan Bowen
cc33bde74f
Confluence added (#6432)
Adds Confluence support to the Jira tool. With this PR, the tool can
create a page in Confluence. If accepted, I will extend the functionality
to Bitbucket and additional Confluence features.



---------

Co-authored-by: Ethan Bowen <ethan.bowen@slalom.com>
2023-06-26 02:28:04 -07:00
Surya Nudurupati
2aeb8e7dbc
Improved Documentation: Eliminating Redundancy in the Introduction.mdx (#6360)
When the documentation was originally written, the phrase "using the"
was accidentally typed twice
2023-06-26 02:27:36 -07:00
rajib
0f6ef048d2
The openai_info.py does not have gpt-35-turbo which is the underlying Azure Open AI model name (#6321)
Because this model name is missing from the MODEL_COST_PER_1K_TOKENS list,
get_openai_callback() does not report the token cost for the GPT-3.5 model
on Azure OpenAI. This PR fixes the issue.


#### Who can review?
 @hwchase17
 @agola11

Co-authored-by: rajib76 <rajib76@yahoo.com>
Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-26 02:16:39 -07:00
ArchimedesFTW
fe941cb54a
Change tags(str) to tags(dict) in mlflow_callback.py docs (#6473)
Fixes #6472

#### Who can review?

@agola11
2023-06-26 02:12:23 -07:00
0xcrusher
9187d2f3a9
Fixed caching bug for Multiple Caching types by correctly checking types (#6746)
- Fixed an issue where some caching types check the wrong types, hence
not allowing caching to work


Maintainer responsibilities:
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
2023-06-26 01:14:32 -07:00
Harrison Chase
e9877ea8b1
Tiktoken override (#6697) 2023-06-26 00:49:32 -07:00
Gabriel Altay
f9771700e4
prevent DuckDuckGoSearchAPIWrapper from consuming top result (#6727)
remove the `next` call that checks for None on the results generator
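
The pitfall can be reproduced with a plain generator. The sketch below is not the wrapper's code; `fake_results`, `top_results_buggy`, and `top_results_fixed` are hypothetical names used only to show why an emptiness check with `next` drops the first item.

```python
from itertools import islice


def fake_results():
    """Stand-in for a search results generator."""
    yield from ["first", "second", "third"]


def top_results_buggy(results, n):
    # Checking for emptiness with next() consumes the top result.
    if next(results, None) is None:
        return []
    return list(islice(results, n))


def top_results_fixed(results, n):
    # islice already returns an empty list for an exhausted generator.
    return list(islice(results, n))


print(top_results_buggy(fake_results(), 2))
print(top_results_fixed(fake_results(), 2))
```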
2023-06-25 19:54:15 -07:00
Pau Ramon Revilla
87802c86d9
Added a MHTML document loader (#6311)
MHTML is a very interesting format since it's used both for emails and
for archived webpages. Some scraping projects want to store pages
on disk to process them later; MHTML is perfect for that use case.

This is heavily inspired by the BeautifulSoup HTML loader, but
extracts the HTML part from the MHTML file.

---------

Co-authored-by: rlm <pexpresss31@gmail.com>
2023-06-25 13:12:08 -07:00
Janos Tolgyesi
05eec99269
beautifulsoup get_text kwargs in WebBaseLoader (#6591)
# beautifulsoup get_text kwargs in WebBaseLoader

- Description: this PR introduces an optional `bs_get_text_kwargs`
parameter to `WebBaseLoader` constructor. It can be used to pass kwargs
to the downstream BeautifulSoup.get_text call. The most common usage
might be to pass a custom text separator, as seen also in
`BSHTMLLoader`.
  - Tag maintainer: @rlancemartin, @eyurtsev
  - Twitter handle: jtolgyesi
2023-06-25 12:42:27 -07:00
Matt Robinson
be68f6f8ce
feat: Add UnstructuredRSTLoader (#6594)
### Summary

Adds an `UnstructuredRSTLoader` for loading
[reStructuredText](https://en.wikipedia.org/wiki/ReStructuredText) file.

### Testing

```python
from langchain.document_loaders import UnstructuredRSTLoader

loader = UnstructuredRSTLoader(
    file_path="example_data/README.rst", mode="elements"
)
docs = loader.load()
print(docs[0])
```

### Reviewers

- @hwchase17 
- @rlancemartin 
- @eyurtsev
2023-06-25 12:41:57 -07:00
Chip Davis
b32cc01c9f
feat: added tqdm progress bar to UnstructuredURLLoader (#6600)
- Description: Adds a simple progress bar with tqdm when using
UnstructuredURLLoader. Exposes a new parameter, `show_progress_bar`. Very
simple PR.
- Issue: N/A
- Dependencies: N/A
- Tag maintainer: @rlancemartin @eyurtsev

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-25 12:41:25 -07:00
Augustine Theodore
afc292e58d
Fix WhatsAppChatLoader : Enable parsing additional formats (#6663)
- Description: Updated the regex to support a new format that was observed
in exported WhatsApp chats.
  - Issue: #6654
  - Dependencies: No new dependencies
  - Tag maintainer: @rlancemartin, @eyurtsev
2023-06-25 12:08:43 -07:00
Sumanth Donthula
3e30a5d967
updated sql_database.py for returning sorted table names. (#6692)
Added code to get the tables info in sorted order in methods
get_usable_table_names and get_table_info.

Linked to Issue: #6640
2023-06-25 12:04:24 -07:00
刘 方瑞
9d1b3bab76
Fix Typo in LangChain MyScale Integration Doc (#6705)

- Description: Fix a typo in the LangChain MyScale integration doc

@hwchase17
2023-06-25 11:54:00 -07:00
sudolong
408c8d0178
fix chroma _similarity_search_with_relevance_scores missing kwargs … (#6708)
Issue: https://github.com/hwchase17/langchain/issues/6707
2023-06-25 11:53:42 -07:00
Zander Chase
d89e10d361
Fix Multi Functions Agent Tracing (#6702)
Confirmed it works now:
https://dev.langchain.plus/public/0dc32ce0-55af-432e-b09e-5a1a220842f5/r
2023-06-25 10:39:04 -07:00
Harrison Chase
1742db0c30
bump version to 215 (#6719) 2023-06-25 08:52:51 -07:00
Ankush Gola
e1b801be36
split up batch llm calls into separate runs (#5804) 2023-06-24 21:03:31 -07:00
Davis Chase
1da99ce013
bump v214 (#6694) 2023-06-24 14:23:11 -07:00
Lance Martin
dd36adc0f4
Make bs4 a local import in recursive_url_loader.py (#6693)
Resolve https://github.com/hwchase17/langchain/issues/6679
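
The general pattern behind a local import looks roughly like the sketch below. It is an illustration only, not the code in recursive_url_loader.py; the helper name `extract_text` is hypothetical.

```python
def extract_text(html: str) -> str:
    """Return the visible text of an HTML string."""
    try:
        # Imported here, not at module level, so that importing this module
        # does not fail when the optional dependency is missing.
        from bs4 import BeautifulSoup
    except ImportError as exc:
        raise ImportError(
            "This function requires beautifulsoup4: pip install beautifulsoup4"
        ) from exc
    return BeautifulSoup(html, "html.parser").get_text()
```

With this layout, users who never call the function are not forced to install the dependency, and those who do get an actionable error message.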
2023-06-24 13:54:10 -07:00
Harrison Chase
ef4c7b54ef
bump to version 213 (#6688) 2023-06-24 11:56:37 -07:00
UmerHA
068142fce2
Add caching to BaseChatModel (issue #1644) (#5089)
#  Add caching to BaseChatModel
Fixes #1644

(Sidenote: While testing, I noticed we have multiple implementations of
Fake LLMs, used for testing. I consolidated them.)

## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
Models
- @hwchase17
- @agola11

Twitter: [@UmerHAdil](https://twitter.com/@UmerHAdil) | Discord:
RicChilligerDude#7589

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-24 11:45:09 -07:00
Harrison Chase
c289cc891a
Harrison/optional ids opensearch (#6684)
Co-authored-by: taekimsmar <66041442+taekimsmar@users.noreply.github.com>
2023-06-24 09:19:57 -07:00
Hrag Balian
2518e6c95b
Session deletion method in motorhead memory (#6609)
Motorhead Memory module didn't support deletion of a session. Added a
method to enable deletion.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-23 21:27:42 -07:00
Baichuan Sun
9fbe346860
Amazon API Gateway hosted LLM (#6673)
This PR adds a new LLM class for the Amazon API Gateway hosted LLM. The
PR also includes example notebooks for using the LLM class in an Agent
chain.

---------

Co-authored-by: Dev 2049 <dev.dev2049@gmail.com>
2023-06-23 21:27:25 -07:00