Commit Graph

5750 Commits (506f81563f07f28e27f05d27cc4cf40411c0c01f)
 

Author SHA1 Message Date
Juan Bustos 67b6f4dc71
Update google_vertex_ai_palm.ipynb (#12715)
Fixed a typo

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** Fixed a typo on the code
  - **Issue:** the issue # it fixes (if applicable),


Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
11 months ago
Eugene Yurtsev b1caae62fd
APIChain add restrictions to domains (CVE-2023-32786) (#12747)
* Restrict the chain to specific domains by default
* This is a breaking change, but it will fail loudly upon object
instantiation -- so there should be no silent errors for users
* Resolves CVE-2023-32786
11 months ago
Erick Friis 4421ba46d7
Demo Server, Fix Timescale (#12746)
- improve demo server
- missing deps
11 months ago
Eugene Yurtsev 0e1aedb9f4
Use jinja2 sandboxing by default (#12733)
* This is an opt-in feature, so users should be aware of risks if using
jinja2.
* Regardless we'll add sandboxing by default to jinja2 templates -- this
  sandboxing is a best effort basis.
* Best strategy is still to make sure that jinja2 templates are only
loaded from trusted sources.
11 months ago
Erick Friis ab5309f6f2
template updates (#12736)
- langchain license
- add timescale vector dep to that template
11 months ago
Lance Martin 6406c53089
Update template index w/ Timescale (#12729) 11 months ago
Erick Friis 14340ee7cd
use http.client instead of urllib3 (#12660)
dep problems with requests

cloudflare debugging not worth it with urllib
11 months ago
Bagatur eee5181b7a
bump 328, exp 37 (#12722) 11 months ago
Erick Friis 3405dbbc64
dash not underscore (#12716)
template names are auto-populating with the wrong convention (with
underscores)
11 months ago
123-fake-st 8bd3ce59cd
PyPDFLoader use url in metadata source if file is a web path (#12092)
**Description:** Update `langchain.document_loaders.pdf.PyPDFLoader` to
store url in metadata (instead of a temporary file path) if user
provides a web path to a pdf

- **Issue:** Related to #7034; the reporter on that issue submitted a PR
updating `PyMuPDFParser` for this behavior, but it has unresolved merge
issues as of 20 Oct 2023 #7077
- In addition to `PyPDFLoader` and `PyMuPDFParser`, these other classes
in `langchain.document_loaders.pdf` exhibit similar behavior and could
benefit from an update: `PyPDFium2Loader`, `PDFMinerLoader`,
`PDFMinerPDFasHTMLLoader`, `PDFPlumberLoader` (I'm happy to contribute
to some/all of that, including assisting with `PyMuPDFParser`, if my
work is agreeable)
- The root cause is that the underlying pdf parser classes, e.g.
`langchain.document_loaders.parsers.pdf.PyPDFParser`, never receive
information about the url; the parsers receive a
`langchain.document_loaders.blob_loaders.blob`, which contains the pdf
contents and local file path, but not the url
- This update passes the web path directly to the parser since it's
minimally invasive and doesn't require further changes to maintain
existing behavior for local files... bigger picture, I'd consider
extending `blob` so that extra information like this can be
communicated, but that has much bigger implications on the codebase
which I think warrants maintainer input

  - **Dependencies:** None

```python
# old behavior
>>> from langchain.document_loaders import PyPDFLoader
>>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': '/var/folders/w2/zx77z1cs01s1thx5dhshkd58h3jtrv/T/tmpfgrorsi5/tmp.pdf', 'page': 0}

# new behavior
>>> from langchain.document_loaders import PyPDFLoader
>>> loader = PyPDFLoader('https://arxiv.org/pdf/1706.03762.pdf')
>>> docs = loader.load()
>>> docs[0].metadata
{'source': 'https://arxiv.org/pdf/1706.03762.pdf', 'page': 0}
```
11 months ago
Dave Kwon b1954aab13
feat: Add page metadata on PDFMinerLoader (#12277)
- **Description:** #12273 's suggestion PR
Like other PDFLoader, loading pdf per each page and giving page
metadata.
  - **Issue:** #12273 
  - **Twitter handle:** @blue0_0hope

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
11 months ago
Duda Nogueira 7148f3e1fe
Weaviate - Fix schema existence check (#12711)
This will allow you create the schema beforehand. The check was failing
and preventing importing into existing classes.

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
11 months ago
Sayandip 8dbbcf0b6c
Adding a template for Solo Performance Prompting Agent (#12627)
**Description:** This template creates an agent that transforms a single
LLM into a cognitive synergist by engaging in multi-turn
self-collaboration with multiple personas.
**Tag maintainer:** @hwchase17

---------

Co-authored-by: Sayandip Sarkar <sayandip.sarkar@skypointcloud.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
11 months ago
Aidos Kanapyanov ae63c186af
Mask API key for Anyscale LLM (#12406)
Description: Add masking of API Key for Anyscale LLM when printed.
Issue: #12165 
Dependencies: None
Tag maintainer: @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
11 months ago
Predrag Gruevski 5ae51a8a85
Fix typo highlighted by `ruff` autoformatter. (#12691)
H/t @MichaReiser for spotting it:
https://github.com/langchain-ai/langchain/pull/12585/files#r1378253045
11 months ago
Predrag Gruevski 724b92231d
Remove `black` caching config from CI lint workflow. (#12594)
To merge after #12585 is merged.
11 months ago
Predrag Gruevski 0ea837404a
Only publish to test PyPI from the `_test_release.yml` workflow. (#12668)
PyPI trusted publishing wants to know which workflow is expected to do
the publish. We always want to publish from the same workflow, so we're
making `_test_release.yml` the only workflow that publishes to Test
PyPI.
11 months ago
Predrag Gruevski 321cd44f13
Use separate jobs for building and publishing test releases. (#12671)
This follows the principle of least privilege. Our `poetry build` step
doesn't need, and shouldn't get, access to our GitHub OIDC capability.

This is the same structure as I used in the already-merged PR for
refactoring the regular PyPI release workflow: #12578.
11 months ago
Erick Friis 44c8b159b9
properly increment version in cli (#12685)
Went from 0.0.9 -> 0.0.11 without releasing. Back to 10, then release.
11 months ago
Erick Friis b825dddf95
fix elastic rag template in playground (#12682)
- a few instructions in the readme (load_documents -> ingest.py)
- added docker run command for local elastic
- adds input type definition to render playground properly
11 months ago
Lance Martin f0eba1ac63
Add RAG input types (#12684)
Co-authored-by: Erick Friis <erick@langchain.dev>
11 months ago
Erick Friis 392cfbee24
link to templates (#12680) 11 months ago
Leonid Ganeline ddcec005bc
fix for `YahooFinanceNewsTool` (#12665)
Added YahooFinanceNewsTool to the __init__.py 
It was missed here.
11 months ago
Predrag Gruevski 09711ad5a1
Both lint and format `templates` with ruff v0.1.3. (#12676)
- Both lint and format code in `templates`.
- Upgrade to ruff v0.1.3.
11 months ago
Predrag Gruevski 01a3c9b94e
Use an in-project virtualenv in the CLI package. (#12678)
Keeping it in sync with how our other packages are configured.
11 months ago
Predrag Gruevski f7f35a9102
Use black to lint notebooks and docs for now. (#12679)
Due to #12677 having lots of errors for the time being.
11 months ago
Jacob Lee bd668fcea1
Adds version CLI command (#12619)
Will be automatically bumped with `poetry version patch`.

@efriis @hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
11 months ago
Frank bf5805bb32
Add quip loader (#12259)
- **Description:** implement [quip](https://quip.com) loader
  - **Issue:** https://github.com/langchain-ai/langchain/issues/10352
  - **Dependencies:** No
  -  pass make format, make lint, make test

---------

Co-authored-by: Hao Fan <h_fan@apple.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
11 months ago
Roman Vasilyev c9a6940d58
PGVector fix (#12592)
latest release broken, this fixes it

---------

Co-authored-by: Roman Vasilyev <rvasilyev@mozilla.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
11 months ago
Lance Martin 9e17d1a225
Update Vertex template (#12644)
Co-authored-by: Erick Friis <erick@langchain.dev>
11 months ago
Predrag Gruevski aa3f4a9bc8
Remove the CLI package's pydantic compatibility tests. (#12675)
They aren't necessary, since the CLI package doesn't have a direct
dependency on pydantic.
11 months ago
Predrag Gruevski e8b99364b3
Use `ruff` for both linting and formatting in `langchain-cli`. (#12672)
Prior to this PR, `ruff` was used only for linting and not for
formatting, despite the names of the commands. This PR makes it be used
for both linting code and autoformatting it.
11 months ago
Harrison Chase 9a10b2b047
fix plate chain (#12673) 11 months ago
Margaret Qian acfc485808
Update MosaicML Embedding Input Key (#12657)
This input key was missed in the last update PR:
https://github.com/langchain-ai/langchain/pull/7391

The input/output formats are intended to be like this:

```
{"inputs": [<prompt>]} 

{"outputs": [<output_text>]}
```
11 months ago
Erika Cardenas d26ac5f999
Update README for Hybrid Search Weaviate (#12661)
- **Description:** Updated the README for Hybrid Search Weaviate
11 months ago
Predrag Gruevski c871cc5055
Remove `print()` statements which seemed leftover from debugging. (#12648)
Added in #12159 presumably during debugging. Right now they cause a bit of visual noise.
11 months ago
Erick Friis 2a7e0a27cb
update lc version (#12655)
also updated py version in `csv-agent` and `rag-codellama-fireworks`
because they have stricter python requirements
11 months ago
Predrag Gruevski 360cff81a3
Overwrite existing distributions when uploading to test PyPI. (#12658) 11 months ago
Lance Martin da94c750c5
Add RAG template for Timescale Vector (#12651)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Matvey Arye <mat@timescale.com>
11 months ago
Noam Gat 14e8c74736
LM Format Enforcer Integration + Sample Notebook (#12625)
## Description

This PR adds support for
[lm-format-enforcer](https://github.com/noamgat/lm-format-enforcer) to
LangChain.

![image](https://raw.githubusercontent.com/noamgat/lm-format-enforcer/main/docs/Intro.webp)

The library is similar to jsonformer / RELLM which are supported in
Langchain, but has several advantages such as
- Batching and Beam search support
- More complete JSON Schema support
- LLM has control over whitespace, improving quality
- Better runtime performance due to only calling the LLM's generate()
function once per generate() call.

The integration is loosely based on the jsonformer integration in terms
of project structure.

## Dependencies

No compile-time dependency was added, but if `lm-format-enforcer` is not
installed, a runtime error will occur if it is trying to be used.

## Tests

Due to the integration modifying the internal parameters of the
underlying huggingface transformer LLM, it is not possible to test
without building a real LM, which requires internet access. So, similar
to the jsonformer and RELLM integrations, the testing is via the
notebook.

## Twitter Handle

[@noamgat](https://twitter.com/noamgat)


Looking forward to hearing feedback!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
11 months ago
Stefano Lottini a4e4b5a86f
Relax python version and remove need for explicit setup step (#12637)
This PR addresses what seems like a unnecessary Python version
restriction in the pyroject.toml specs within both Cassandra (/Astra DB)
templates. With "^3.11" I got some version incompatibilities with the
latest "langchain add [...]" commands, so these are now relaxed in line
with the other templates I could inspect.

Incidentally, in the "entomology" template, the need for an explicit
"setup" step for the user to carry on has been removed, replaced by a
check-and-execute-if-necessary instruction on app startup.

Thank you for your attention!
11 months ago
Predrag Gruevski 5308b836c7
Upgrade to `actions/checkout@v4` in the docs lint job. (#12581) 11 months ago
Predrag Gruevski 94f018f1ba
Support release-testing packages with dashes in their names. (#12654) 11 months ago
Erick Friis 912ace18e9
fix template py verisons (#12650) 11 months ago
Brian McBrayer b74468f399
Fix small typo on Founcational -> Router notebook (#12634)
- **Description:** Fix small typo on Founcational -> Router notebook
11 months ago
Predrag Gruevski 72fa5a463d
Show ruff output inline in GitHub PRs. (#12647) 11 months ago
William FH 17c2e3b87e
Rename Template (#12649)
To chatbot feedback. Update import
11 months ago
Erick Friis 7f6e751a3d
template updates (#12646) 11 months ago
Leonid Kuligin a53cac4508
added template to use Vertex Vector Search for q&a (#12622)
added template to use Vertex Vector Search for q&a
11 months ago
Lance Martin 944cb552bb
Minor updates to READMEs (#12642) 11 months ago