Commit Graph

276 Commits (c24c871d8819b0814da0bb0fde93f38685218752)

Author SHA1 Message Date
Bagatur b10cefb160
lint fix: rm init (#12374) 8 months ago
Kishan Kumar Rai cae6f611d3
Fix Typo in CONTRIBUTING.md (#12320)
I have corrected the typos, grammar, and formatting issues.
8 months ago
Bagatur ab3c124ffb
Add dev guide to docs(#12291)
copy CONTRIBUTING.md to docs
8 months ago
Bagatur 2008a6438c
add experimental test release gha (#12229) 8 months ago
Bagatur 8ba97cb408
separate compile integration tests (#12171)
Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>
8 months ago
Priyanshu Prajapati 283a3ecc9c
Create CODE_OF_CONDUCT.md (#12105)
code of conduct.md file is missing it is generally present in good repos
which have large community

Replace this entire comment with:
- **Description:** Added a `code_of_conduct.md` file to the repository
to establish community standards and guidelines for contributors.
- **Issue:** N/A
- **Dependencies:** N/A
- **Tag maintainer:** N/A

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
8 months ago
Deepanshu ff79a99825
Fix Typo in CONTRIBUTING.md file (#12145)
Fix Type & add suitable pronoun in CONTRIBUTING.md file


Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
8 months ago
Bagatur 963ff93476
bump 321 (#12161) 8 months ago
Predrag Gruevski 95a1b598fe
Update to `actions/checkout@v4`. (#11951)
We don't use any of the new functionality at the moment. Just making
sure we don't fall back on versions and fail to benefit from new
patches. This is an easy upgrade and it's always harder to upgrade
across multiple major versions at once.
8 months ago
Bagatur 85302a9ec1
Add CI check that integration tests compile (#12090) 8 months ago
Leonid Ganeline ea0982eede
update CONTRIBUTING.md (#11872)
Adding description of the `View deployment` button on the PR page. This
nice feature was not documented.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
8 months ago
Bagatur bd74eba152
add azure openai sched tests (#11723) 9 months ago
Bagatur eedfddac2d
Restructure docs (#11620) 9 months ago
Nuno Campos 2c11302598
Update langchain_release.yml (#11444) 9 months ago
Predrag Gruevski 88c5349196
Revert "Rm additional file check for scheduled tests (#11192)" (#11297)
This reverts commit ff90bb59bf.

Requires #11296 to merge first.
9 months ago
Predrag Gruevski 37f2f71156
Trigger Docker release workflow after new langchain release is made. (#11290)
We want to publish a new Docker image after a new langchain Python
package version is published.
9 months ago
Predrag Gruevski d21dd72d64
Upgrade CI workflows to poetry 1.6.1. (#11344) 9 months ago
Eugene Yurtsev 2343302fc6
Remove langserve from langchain repo (#11288)
LangServe has been moved to a separate repo
9 months ago
CG80499 943e4f30d8
Add scoring chain (#11123)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
9 months ago
Bagatur 38d5b63a10
Bedrock scheduled tests (#11194) 9 months ago
Nuno Campos 32a8b311eb
Add base docker image and ci script for building and pushing (#10927) 9 months ago
Kazuki Maeda a363ab5292
rename repo namespace to langchain-ai (#11259)
### Description
renamed several repository links from `hwchase17` to `langchain-ai`.

### Why
I discovered that the README file in the devcontainer contains an old
repository name, so I took the opportunity to rename the old repository
name in all files within the repository, excluding those that do not
require changes.

### Dependencies
none

### Tag maintainer
@baskaryan

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)
9 months ago
Eugene Yurtsev aebdb1ad01
Ignore aadd (#11235) 9 months ago
Bagatur ff90bb59bf
Rm additional file check for scheduled tests (#11192)
cc @obi1kenobi Causing issues with GHA creds
https://github.com/langchain-ai/langchain/actions/runs/6342674950/job/17228926776
9 months ago
Bagatur 3508e582f1
add anthropic scheduled tests and unit tests (#11188) 9 months ago
Eugene Yurtsev 176d71dd85
LangServe: Add release workflow (#11178)
Add release workflow to langserve
9 months ago
Eugene Yurtsev b05bb9e136
LangServe (#11046)
Adds LangServe package

* Integrate Runnables with Fast API creating Server and a RemoteRunnable
client
* Support multiple runnables for a given server
* Support sync/async/batch/abatch/stream/astream/astream_log on the
client side (using async implementations on server)
* Adds validation using annotations (relying on pydantic under the hood)
-- this still has some rough edges -- e.g., open api docs do NOT
generate correctly at the moment
* Uses pydantic v1 namespace

Known issues: type translation code doesn't handle a lot of types (e.g.,
TypedDicts)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
9 months ago
Bagatur 040d436b3f
Add vertex scheduled test (#10958) 9 months ago
C.J. Jameson b4d2663beb
CONTRIBUTING.md Quick Start: focus on langchain core; clarify docs and experimental are separate (#10906)
follow up to https://github.com/langchain-ai/langchain/pull/7959 ,
explaining better to focus just on langchain core

no dependencies

twitter @cjcjameson
9 months ago
Bagatur dccc20b402
add model feat table (#10921) 9 months ago
Nino Risteski d0070040da
Update CONTRIBUTING.md (#10700)
fiixed few typos
9 months ago
Harrison Chase 2c957de2fc
add checks on basic base modules (#10693) 9 months ago
Harrison Chase 5442d2b1fa
Harrison/stop importing from init (#10690) 9 months ago
Leonid Ganeline db3369272a
fixed PR template (#10515)
@hwchase17
10 months ago
Predrag Gruevski ccb9e3ee2d
Install dev, lint, test, typing extra deps for linting steps. (#10249)
`mypy` cannot type-check code that relies on dependencies that aren't
installed.

Eventually we'll probably want to install as many optional dependencies
as possible. However, the full "extended deps" setup for langchain
creates a 3GB cache file and takes a while to unpack and install. We'll
probably want something a bit more targeted.

This is a first step toward something better.
10 months ago
Predrag Gruevski 82d5d4d0ae
Deny creating files as a result of test runs. (#10253)
A test file was accidentally dropping a `results.json` file in the
current working directory as a result of running `make test`.

This is undesirable, since we don't want to risk accidentally adding
stray files into the repo if we run tests locally and then do `git add
.` without inspecting the file list very closely.
10 months ago
Predrag Gruevski 7fe8bf03a0
Final poetry action fix: manually recreate softlinks broken by caching. (#10250)
It seems the caching action was not always correctly recreating
softlinks. At first glance, the softlinks it created seemed fine, but
they didn't always work. Possibly hitting some kind of underlying bug,
but not particularly worth debugging in depth -- we can manually create
the soft links we need.
10 months ago
Predrag Gruevski 619516260d
Re-enable poetry binary caching with fix and more logging. (#10244)
- Revert "Temporarily disable step that seems to be transiently failing.
(#10234)"
- Refresh shell hashtable and show poetry/python location and version.
10 months ago
Predrag Gruevski 803be5b986
Run CI when CI infra itself has changed. (#10239)
Make sure that changes to CI infrastructure get tested on CI before
being merged.

Without this PR, changes to the poetry setup action don't trigger a CI
run and in principle could break `master` when merged.
10 months ago
Predrag Gruevski e34ad6fefd
Temporarily disable step that seems to be transiently failing. (#10234) 10 months ago
Harrison Chase c0518be1f1
fix syntax (#10155) 10 months ago
maks-operlejn-ds a8f804a618
Add data anonymizer (#9863)
### Description

The feature for anonymizing data has been implemented. In order to
protect private data, such as when querying external APIs (OpenAI), it
is worth pseudonymizing sensitive data to maintain full privacy.

Anonynization consists of two steps:

1. **Identification:** Identify all data fields that contain personally
identifiable information (PII).
2. **Replacement**: Replace all PIIs with pseudo values or codes that do
not reveal any personal information about the individual but can be used
for reference. We're not using regular encryption, because the language
model won't be able to understand the meaning or context of the
encrypted data.

We use *Microsoft Presidio* together with *Faker* framework for
anonymization purposes because of the wide range of functionalities they
provide. The full implementation is available in `PresidioAnonymizer`.

### Future works

- **deanonymization** - add the ability to reverse anonymization. For
example, the workflow could look like this: `anonymize -> LLMChain ->
deanonymize`. By doing this, we will retain anonymity in requests to,
for example, OpenAI, and then be able restore the original data.
- **instance anonymization** - at this point, each occurrence of PII is
treated as a separate entity and separately anonymized. Therefore, two
occurrences of the name John Doe in the text will be changed to two
different names. It is therefore worth introducing support for full
instance detection, so that repeated occurrences are treated as a single
object.

### Twitter handle
@deepsense_ai / @MaksOpp

---------

Co-authored-by: MaksOpp <maks.operlejn@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
10 months ago
Predrag Gruevski 9aaa0fdce0 Use unified Python setup steps for release workflow. 10 months ago
XUEYANZ f97d3a76e7
Update CONTRIBUTING.md (#9817)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
 -->

Hi LangChain :) Thank you for such a great project! 
I was going through the CONTRIBUTING.md and found a few minor issues.
10 months ago
Predrag Gruevski c06f34fa35
Use new Python setup approach for scheduled tests. (#9626)
Using the same new unified Python setup as the regular tests and the
lint job, as set up in #9625.
10 months ago
Predrag Gruevski 83986ea98a
Cache poetry install + unify Python/Poetry setup for lint and test jobs. (#9625)
With this PR:
- All lint and test jobs use the exact same Python + Poetry installation
approach, instead of lints doing it one way and tests doing it another
way.
- The Poetry installation itself is cached, which saves ~15s per run.
- We no longer pass shell commands as workflow arguments to a workflow
that just runs them in a shell. This makes our actions more resilient to
shell code injection.

If y'all like this approach, I can modify the scheduled tests workflow
and the release workflow to use this too.
10 months ago
seamusp f3ba9ce7f4
Remove -E all from installation instructions (#9573)
Update installation instructions to only install test dependencies rather than all dependencies.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
10 months ago
Predrag Gruevski 35812d0096
Set up concurrency groups and workflow cancelation in CI. (#9564)
If another push to the same PR or branch happens while its CI is still
running, cancel the earlier run in favor of the next run.

There's no point in testing an outdated version of the code. GitHub only
allows a limited number of job runners to be active at the same time, so
it's better to cancel pointless jobs early so that more useful jobs can
run sooner.
10 months ago
Predrag Gruevski 3c7cc4d440
Test experimental package with `langchain` on `master` branch. (#9621)
It's possible that langchain-experimental works fine with the latest
*published* langchain, but is broken with the langchain on `master`.
Unfortunately, you can see this is currently the case — this is why this
PR also includes a minor fix for the `langchain` package itself.

We want to catch situations like that *before* releasing a new
langchain, hence this test.
10 months ago
Predrag Gruevski acb54d8b9d
Reduce cache timeouts to ensure faster builds on timeout. (#9619)
The current timeouts are too long, and mean that if the GitHub cache
decides to act up, jobs get bogged down for 15min at a time. This has
happened 2-3 times already this week -- a tiny fraction of our total
workflows but really annoying when it happens to you. We can do better.

Installing deps on cache miss takes about ~4min, so it's not worth
waiting more than 4min for the deps cache. The black and mypy caches
save 1 and 2min, respectively, so wait only up to that long to download
them.
10 months ago
Predrag Gruevski a1e89aa8d5
Explicitly add the `contents: write` permission for publishing releases. (#9617) 10 months ago
Predrag Gruevski c75e1aa5ed
Eliminate special-casing from test CI workflows. (#9562)
The previous approach was relying on `_test.yml` taking an input
parameter, and then doing almost completely orthogonal things for each
parameter value. I've separated out each of those test situations as its
own job or workflow file, which eliminated all the special-casing and,
in my opinion, improved maintainability by making it much more obvious
what code runs when.
10 months ago
Predrag Gruevski 6c308aabae
Use the GitHub-suggested safer pattern for shell interpolation. (#9567)
Using `${{ }}` to construct shell commands is risky, since the `${{ }}`
interpolation runs first and ignores shell quoting rules. This means
that shell commands that look safely quoted, like `echo "${{
github.event.issue.title }}"`, are actually vulnerable to shell
injection.

More details here:
https://github.blog/2023-08-09-four-tips-to-keep-your-github-actions-workflows-secure/
10 months ago
Predrag Gruevski 2a3758a98e
Reminder to not report security issues as "bug" type issues. (#9554)
Updated the issue template that pops up when users open a new issue.
10 months ago
Predrag Gruevski 9f08d29bc8
Use PyPI Trusted Publishing to publish langchain packages. (#9467)
Trusted Publishing is the current best practice for publishing Python
packages. Rather than long-lived secret keys, it uses OpenID Connect
(OIDC) to allow our GitHub runner to directly authenticate itself to
PyPI and get a short-lived publishing token. This locks down publishing
quite a bit:
- There's no long-lived publish key to steal anymore.
- Publishing is *only* allowed via the *specifically designated* GitHub
workflow in the designated repo.

It also is operationally easier: no keys means there's nothing that
needs to be periodically rotated, nothing to worry about leaking, and
nobody can accidentally publish a release from their laptop because they
happened to have PyPI keys set up.

After this gets merged, we'll need to configure PyPI to start expecting
trusted publishing. It's only a few clicks and should only take a
minute; instructions are here:
https://docs.pypi.org/trusted-publishers/adding-a-publisher/

More info:
- https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/
- https://github.com/pypa/gh-action-pypi-publish
10 months ago
Predrag Gruevski 249752e8ee
Require manually triggering release workflows. (#9552) 10 months ago
Predrag Gruevski 875ea4b4c6
Fix conditional that erroneously always runs. (#9543)
The input it means to test for is `"libs/langchain"` and not
`"langchain"`.
10 months ago
Predrag Gruevski a7eba8b006
Release on push to `master` instead of on closed PRs targeting it. (#9544)
This is safer than the prior approach, since it's safe by default: the
release workflows never get triggered for non-merged PRs, so there's no
possibility of a buggy conditional accidentally letting a workflow
proceed when it shouldn't have.

The only loss is that publishing no longer requires a `release` label on
the merged PR that bumps the version. We can add a separate CI step that
enforces that part as a condition for merging into `master`, if
desirable.
10 months ago
Predrag Gruevski a03003f5fd
Upgrade CI poetry version to 1.5.1. (#9479)
Poetry v1.5.1 was released on May 29, almost 3 months ago. Probably a
safe upgrade.
10 months ago
Yuki Miyake 85a1c6d0b7
🐛 fix unexpected run of release workflow (#9494)
I have discovered a bug located within `.github/workflows/_release.yml`
which is the primary cause of continuous integration (CI) errors. The
problem can be solved; therefore, I have constructed a PR to address the
issue.

## The Issue

Access the following link to view the exact errors: [Langhain Release
Workflow](https://github.com/langchain-ai/langchain/actions/workflows/langchain_release.yml)

The instances of these errors take place for **each PR** that updates
`pyproject.toml`, excluding those specifically associated with bumping
PRs.

See below for the specific error message:

```
Error: Error 422: Validation Failed: {"resource":"Release","code":"already_exists","field":"tag_name"}
```

An image of the error can be viewed here:

![Image](https://github.com/langchain-ai/langchain/assets/13769670/13125f73-9b53-49b7-a83e-653bb01a1da1)

The `_release.yml` document contains the following if-condition:

```yaml
    if: |
        ${{ github.event.pull_request.merged == true }}
        && ${{ contains(github.event.pull_request.labels.*.name, 'release') }}
```

## The Root Cause

The above job constantly runs as the `if-condition` is always identified
as `true`.

## The Logic

The `if-condition` can be defined as `if: ${{ b1 }} && ${{ b2 }}`, where
`b1` and `b2` are boolean values. However, in terms of condition
evaluation with GitHub Actions, `${{ false }}` is identified as a string
value, thereby rendering it as truthy as per the [official
documentation](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idif).

I have run some tests regarding this behavior within my forked
repository. You can consult my [debug
PR](https://github.com/zawakin/langchain/pull/1) for reference.

Here is the result of the tests:

|If-Condition|Outcome|
|:--:|:--:|
|`if: true && ${{ false }}`|Execution|
|`if: ${{ false }}` |Skipped|
|`if: true && false` |Skipped|
|`if: false`|Skipped|
|`if: ${{ true && false }}` |Skipped|

In view of the first and second results, we can infer that `${{ false
}}` can only be interpreted as `true` for conditions composed of some
expressions.
It is consistent that the condition of `if: ${{ inputs.working-directory
== 'libs/langchain' }}` works.

It is surprised to be skipped for the second case but it seems the spec
of GitHub Actions 😓

Anyway, the PR would fix these errors, I believe 👍 

Could you review this? @hwchase17 or @shoelsch , who is the author of
[PR](https://github.com/langchain-ai/langchain/pull/360).
10 months ago
Predrag Gruevski ade683c589
Rely on `WORKDIR` env var to avoid ugly ternary operators in workflows. (#9456)
Ternary operators in GitHub Actions syntax are pretty ugly and hard to
read: `inputs.working-directory == '' && '.' ||
inputs.working-directory` means "if the condition is true, use `'.'` and
otherwise use the expression after the `||`".

This PR performs the ternary as few times as possible, assigning its
outcome to an env var we can then reuse as needed.
10 months ago
Predrag Gruevski 8976483f3a
Lint only on the min and max supported Python versions. (#9450)
Only lint on the min and max supported Python versions.

It's extremely unlikely that there's a lint issue on any version in
between that doesn't show up on the min or max versions.

GitHub rate-limits how many jobs can be running at any one time.
Starting new jobs is also relatively slow, so linting on fewer versions
makes CI faster.
10 months ago
Predrag Gruevski 463019ac3e
Cache black formatting information across CI runs. (#9413)
Save and persist `black`'s formatted files cache across CI runs.

Around a ~20s win, 21s -> 2s. Most cases should be close to this best
case scenario, since most PRs don't modify most files — and this PR
makes sure we don't re-check files that haven't changed.

Before:

![image](https://github.com/langchain-ai/langchain/assets/2348618/6c5670c5-be70-4a18-aa2a-ece5e4425d1e)

After:

![image](https://github.com/langchain-ai/langchain/assets/2348618/37810d27-c611-4f76-b9bd-e827cefbaa0a)
10 months ago
Predrag Gruevski 0dd2c21089
Do not bust `poetry install` cache when manually installing pydantic v2. (#9407)
Using `poetry add` to install `pydantic@2.1` was also causing poetry to
change its lockfile. This prevented dependency caching from working:
- When attempting to restore a cache, it would hash the lockfile in git
and use it as part of the cache key. Say this is a cache miss.
- Then, it would attempt to save the cache -- but the lockfile will have
changed, so the cache key would be *different* than the key in the
lookup. So the cache save would succeed, but to a key that cannot be
looked up in the next run -- meaning we never get a cache hit.

In addition to busting the cache, the lockfile update itself is also
non-trivially long, over 30s:

![image](https://github.com/langchain-ai/langchain/assets/2348618/d84d3b56-484d-45eb-818d-54126a094a40)

This PR fixes the problems by using `pip` to perform the installation,
avoiding the lockfile change.
10 months ago
Predrag Gruevski 8f2d321dd0
Cache .mypy_cache across lint runs. (#9405)
Preserve the `.mypy_cache` directory across lint runs, to avoid having
to re-parse all dependencies and their type information.

Approximately a 1min perf win for CI.

Before:

![image](https://github.com/langchain-ai/langchain/assets/2348618/6524f2a9-efc0-4588-a94c-69914b98b382)

After:

![image](https://github.com/langchain-ai/langchain/assets/2348618/dd0af954-4dc9-43d3-8544-25846616d41d)
10 months ago
Predrag Gruevski 7e63270e04
Ensure the in-project venv gets cached in CI tests. (#9336)
The previous caching configuration was attempting to cache poetry venvs
created in the default shared virtualenvs directory. However, all
langchain packages use `in-project = true` for their poetry virtualenv
setup, which moves the venv inside the package itself instead. This
meant that poetry venvs were not being cached at all.

This PR ensures that the venv gets cached by adding the in-project venv
directory to the cached directories list.

It also makes sure that the cache key *only* includes the lockfile being
installed, as opposed to *all lockfiles* (unnecessary cache misses) or
just the *top-level lockfile* (cache hits when it shouldn't).
10 months ago
Predrag Gruevski f2560188ec
Cache linting venv on CI. (#9342)
Ensure that we cache the linting virtualenv as well as the pip cache for
the `pip install -e langchain` step.

This is a win of about 60-90s overall.

Before:

![image](https://github.com/langchain-ai/langchain/assets/2348618/f55f8398-2c3a-4112-bad3-2c646d186183)

After:

![image](https://github.com/langchain-ai/langchain/assets/2348618/984a9529-2431-41b4-97e5-7f5dd7742651)
10 months ago
Bagatur 995ef8a7fc
unpin pydantic (#9356) 10 months ago
Bagatur 3eccd72382
pin pydantic (#9274)
don't want default to be v2 yet
11 months ago
Eugene Yurtsev a091b4bf4c
Update testing workflow to test with both pydantic versions (#9206)
* PR updates test.yml to test with both pydantic versions
* Code should be refactored to make it easier to do testing in matrix
format w/ packages
* Added steps to assert that pydantic version in the environment is as
expected
11 months ago
Bagatur 641cb80c9d
update pr temp (#9062) 11 months ago
Bagatur 206f809366
fix sched ci (more) (#9056) 11 months ago
Bagatur e5db8a16c0
Bagatur/fix sched (#9054) 11 months ago
Bagatur e162fd418a
fix sched ci (#9053) 11 months ago
Bagatur 269f85b7b7
scheduled gha fix (#8977) 11 months ago
Bagatur 95cf7de112
scheduled tests GHA (#8879)
Adding scheduled daily GHA that runs marked integration tests. To start
just marking some tests in test_openai
11 months ago
Harrison Chase 2448043b84
bump and fix (#8441) 11 months ago
Harrison Chase cddd8ae83d
update release yml (#8364)
only do the step that tags and adds release notes if its langchain
11 months ago
William FH 01a9b06400
Add api cross ref linking (#8275)
Example of how it would show up in our python docs:


![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15)


Examples added to the reference docs:

https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma


![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)
11 months ago
Harrison Chase 8dcabd9205
bump releases rc0 (#8097) 11 months ago
Harrison Chase 0faba034b1
add experimental release action (#8096) 11 months ago
Harrison Chase 344cbd9c90
update contributor guide (#8088) 11 months ago
Harrison Chase da04760de1
Harrison/move experimental (#8084) 11 months ago
Harrison Chase f35db9f43e
(WIP) set up experimental (#7959) 11 months ago
Bagatur 25a2bdfb70
add pr template instructions (#7904) 11 months ago
Yaroslav Halchenko 0d92a7f357
codespell: workflow, config + some (quite a few) typos fixed (#6785)
Probably the most  boring PR to review ;)

Individual commits might be easier to digest

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
12 months ago
os1ma 2667ddc686
Fix `make docs_build` and related scripts (#7276)
**Description: a description of the change**

Fixed `make docs_build` and related scripts which caused errors. There
are several changes.

First, I made the build of the documentation and the API Reference into
two separate commands. This is because it takes less time to build. The
commands for documents are `make docs_build`, `make docs_clean`, and
`make docs_linkcheck`. The commands for API Reference are `make
api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`.

It looked like `docs/.local_build.sh` could be used to build the
documentation, so I used that. Since `.local_build.sh` was also building
API Rerefence internally, I removed that process. `.local_build.sh` also
added some Bash options to stop in error or so. Futher more added `cd
"${SCRIPT_DIR}"` at the beginning so that the script will work no matter
which directory it is executed in.

`docs/api_reference/api_reference.rst` is removed, because which is
generated by `docs/api_reference/create_api_rst.py`, and added it to
.gitignore.

Finally, the description of CONTRIBUTING.md was modified.

**Issue: the issue # it fixes (if applicable)**

https://github.com/hwchase17/langchain/issues/6413

**Dependencies: any dependencies required for this change**

`nbdoc` was missing in group docs so it was added. I installed it with
the `poetry add --group docs nbdoc` command. I am concerned if any
modifications are needed to poetry.lock. I would greatly appreciate it
if you could pay close attention to this file during the review.

**Tag maintainer**
- General / Misc / if you don't know who to tag: @baskaryan

If this PR needs any additional changes, I'll be happy to make them!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
Kazuki Maeda 5c3fe8b0d1
Enhance Makefile with 'format_diff' Option and Improved Readability (#7394)
### Description:

This PR introduces a new option format_diff to the existing Makefile.
This option allows us to apply the formatting tools (Black and isort)
only to the changed Python and ipynb files since the last commit. This
will make our development process more efficient as we only format the
codes that we modify. Along with this change, comments were added to
make the Makefile more understandable and maintainable.

### Issue:

N/A

### Dependencies:

Add dependency to black.

### Tag maintainer:

@baskaryan

### Twitter handle:

[kzk_maeda](https://twitter.com/kzk_maeda)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
12 months ago
Bagatur fad2c7e5e0
update pr tmpl (#7095) 12 months ago
Davis Chase 00a7403236
update pr tmpl (#6552) 1 year ago
Brigit Murtaugh ccd916babe
Update dev container (#6189)
Fixes https://github.com/hwchase17/langchain/issues/6172

As described in https://github.com/hwchase17/langchain/issues/6172, I'd
love to help update the dev container in this project.

**Summary of changes:**
- Dev container now builds (the current container in this repo won't
build for me)
- Dockerfile updates
- Update image to our [currently-maintained Python
image](https://github.com/devcontainers/images/tree/main/src/python/.devcontainer)
(`mcr.microsoft.com/devcontainers/python`) rather than the deprecated
image from vscode-dev-containers
- Move Dockerfile to root of repo - in order for `COPY` to work
properly, it needs the files (in this case, `pyproject.toml` and
`poetry.toml`) in the same directory
- devcontainer.json updates
- Removed `customizations` and `remoteUser` since they should be covered
by the updated image in the Dockerfile
     - Update comments
- Update docker-compose.yaml to properly point to updated Dockerfile
- Add a .gitattributes to avoid line ending conversions, which can
result in hundreds of pending changes
([info](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files))
- Add a README in the .devcontainer folder and info on the dev container
in the contributing.md

**Outstanding questions:**
- Is it expected for `poetry install` to take some time? It takes about
30 minutes for this dev container to finish building in a Codespace, but
a user should only have to experience this once. Through some online
investigation, this doesn't seem unusual
- Versions of poetry newer than 1.3.2 failed every time - based on some
of the guidance in contributing.md and other online resources, it seemed
changing poetry versions might be a good solution. 1.3.2 is from Jan
2023

---------

Co-authored-by: bamurtaugh <brmurtau@microsoft.com>
Co-authored-by: Samruddhi Khandale <samruddhikhandale@github.com>
1 year ago
Davis Chase 87e502c6bc
Doc refactor (#6300)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Zander Chase 2c9619bc1d
Remove from PR template (#6018) 1 year ago
Zander Chase 6655f43282
Rm Template Title (#5616)
Remove the redundant title from the PR template

#### Before submitting
1 year ago
Jacob Lee f77f27163d
Update PR template with Twitter handle request (#5382)
# Updates PR template to request Twitter handle for shoutouts!

Makes it easier for maintainers to show their appreciation 😄
1 year ago
Eugene Yurtsev a669abf16b
Update CONTRIBUTION guidelines and PR Template (#5140)
# Update contribution guidelines and PR template

This PR updates the contribution guidelines to include more information
on how to handle optional dependencies. 

The PR template is updated to include a link to the contribution guidelines document.
1 year ago
Davis Chase 56cb77a828
Make test gha workflow manually runnable (#4998)
if https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch
is to be believed this should make it possible to manually kick of test
workflow, but i don't know much about these things
1 year ago
Eugene Yurtsev 3ecd7c9641
Add check to verify poetry.toml (#4794)
# Add poetry check to github action

Check poetry toml file during tests for errors
1 year ago
Eugene Yurtsev 14bedf1cc5
Github Action: Fix poetry lock file checking (#4789)
Fix how poetry lock file is checked to avoid skipping caches silently.
1 year ago
Eugene Yurtsev 49ce5ce1ca
Only run linkcheck against docs dir on PR (#4741)
# Only run linkchecker on direct changes to docs

This is a stop-gap that will speed up PRs.

Some broken links can slip through if they're embedded in doc-strings
inside the codebase.

But we'll still be running the linkchecker on master.
1 year ago
Eugene Yurtsev 99cfe71cd0
Check poetry lock file (#4740)
# Check poetry lock file on CI

This PR checks that the lock file is up to date using poetry lock
--check.

As part of this PR, a new lock file was generated.
1 year ago
Eugene Yurtsev 08ed927c32
Turn on extended tests (#4588)
# Turn on strict extended tests

This PR turns on strict testing for extended tests.
1 year ago
Davis Chase cd01de49cf
Update contribution guidelines (#4431)
provide more guidance on pr's
1 year ago
Eugene Yurtsev 146616aa5d
Test workflow, fix minor typos (#4495)
# Fix 2 minor typos in test workflow.

This PR does not result in any functional changes.
1 year ago
Eugene Yurtsev f373883c1a
Refactor test workflow (#4457)
# Refactor the test workflow

This PR refactors the tests to run using a single test workflow. This
makes it easier to relaunch failing tests and see in the UI which test
failed since the jobs are grouped together.

## Before submitting

## Who can review?
1 year ago
Eugene Yurtsev 80558b5b27
Add workflow for testing with all deps (#4410)
# Add action to test with all dependencies installed

PR adds a custom action for setting up poetry that allows specifying a
cache key:
https://github.com/actions/setup-python/issues/505#issuecomment-1273013236

This makes it possible to run 2 types of unit tests: 

(1) unit tests with only core dependencies
(2) unit tests with extended dependencies (e.g., those that rely on an
optional pdf parsing library)


As part of this PR, we're moving some pdf parsing tests into the
unit-tests section and making sure that these unit tests get executed
when running with extended dependencies.
1 year ago
Zander Chase 0870a45a69
Add Pull Request Template (#4247) 1 year ago
Zander Chase cc068f1b77
Add Issue Templates (#4021)
Add issue templates for
- bug reports
- feature suggestions
- documentation
and a link to the discord for general discussion.

Open to other suggestions here. Could also add another "Other" template
with just a raw text box if we think this is too restrictive


<img width="1464" alt="image"
src="https://user-images.githubusercontent.com/130414180/236115358-e603bcbe-282c-40c7-82eb-905eb93ccec0.png">
1 year ago
Ehsan M. Kermani 5d0674fb46
Use a consistent poetry version everywhere (#3250)
Fixes the discrepancy of poetry version in Dockerfile and the GAs
1 year ago
Noah Gundotra 577ec92f16
Include testing instructions for getting setup in CONTRIBUTING.md (#3020)
Running tests is good sanity check for new users to ensure their
development environment is setup correctly.
1 year ago
sergerdn 90973c10b1
fix: tests with Dockerfile (#2382)
Update the Dockerfile to use the `$POETRY_HOME` argument to set the
Poetry home directory instead of adding Poetry to the PATH environment
variable.

Add instructions to the `CONTRIBUTING.md` file on how to run tests with
Docker.

Closes https://github.com/hwchase17/langchain/issues/2324
1 year ago
andrewmelis 7ed8d00bba
Remove extra word in CONTRIBUTING.md (#2370)
"via by a developer" -> "by a developer"

---

Thank you for all your hard work!
1 year ago
Matt Tucker fa2e546b76
Add workaround for debugpy install issue to contrib docs. (#1835)
When following the Quick Start instructions in the contributing docs, I
was getting a "WheelFileValidationError" on installation of debugpy
which was blocking the installation of a number of other deps. Google
turned up this [GitHub
issue](https://github.com/microsoft/debugpy/issues/1246) indicating a
regression in Poetry 1.4.1 and workarounds.

This PR updates the contrib docs noting the issue and the workarounds.
1 year ago
Harrison Chase b053f831cd
Harrison/contributing (#1542)
Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>
1 year ago
Harrison Chase 012a6dfb16
Harrison/makefile (#1033)
Co-authored-by: blob42 <contact@blob42.xyz>
Co-authored-by: blob42 <spike@w530>
1 year ago
Steven Hoelscher a5999351cf
chore: add release workflow (#360)
Adds release workflow that (1) creates a GitHub release and (2)
publishes built artifacts to PyPI

**Release Workflow**
1. Checkout `master` locally and cut a new branch
1. Run `poetry version <rule>` to version bump (e.g., `poetry version
patch`)
1. Commit changes and push to remote branch
1. Ensure all quality check workflows pass
1. Explicitly tag PR with `release` label
1. Merge to mainline

At this point, a release workflow should be triggered because:
* The PR is closed, targeting `master`, and merged
* `pyproject.toml` has been detected as modified
* The PR had a `release` label

The workflow will then proceed to build the artifacts, create a GitHub
release with release notes and uploaded artifacts, and publish to PyPI.

Example Workflow run:
https://github.com/shoelsch/langchain/actions/runs/3711037455/jobs/6291076898
Example Releases: https://github.com/shoelsch/langchain/releases

--

Note, this workflow is looking for the `PYPI_API_TOKEN` secret, so that
will need to be uploaded to the repository secrets. I tested uploading
as far as hitting a permissions issue due to project ownership in Test
PyPI.
1 year ago
Harrison Chase 9753bccc71
Feature: linkcheck-action (#534) (#542)
- Add support for local build and linkchecking of docs
- Add GitHub Action to automatically check links before prior to
publication
- Minor reformat of Contributing readme
- Fix existing broken links

Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>

Co-authored-by: Hunter Gerlach <HunterGerlach@users.noreply.github.com>
Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
1 year ago
Hunter Gerlach 482611f426
unit test / code coverage improvements (#322)
This PR has two contributions:

1. Add test for when stop token is found in middle of text

2. Add code coverage tooling and instructions
- Add pytest-cov via poetry
- Add necessary config files
- Add new make instruction for `coverage`
- Update README with coverage guidance
- Update minor README formatting/spelling

Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>
2 years ago
Christian Clauss 2fbb152386
Add Python 3.11 to the testing (#324) 2 years ago
Christian Clauss d946be2f3d
Add Python 3.11 to the testing (#323) 2 years ago
Harrison Chase 3c1c7ba672
update branch name in gha (#274) 2 years ago
Steven Hoelscher 98fb19b535
chore: use poetry as dependency manager (#242)
* Adopts [Poetry](https://python-poetry.org/) as a dependency manager
* Introduces dependency version requirements
* Deprecates Python 3.7 support

**TODO**
- [x] Update developer guide
- [x] Add back `playwright`, `manifest-ml`, and `jupyter` to dependency
group

**Not Doing => Fast Follow**
- Investigate single source for version, perhaps relying on GitHub tags
and [tackling this
issue](https://github.com/hwchase17/langchain/issues/26)
2 years ago
Predrag Gruevski 1a95252f00
Use `pull_request` not `pull_request_target` in GitHub Actions. (#139)
`pull_request` runs on the merge commit between the opened PR and the
target branch where the PR is to be merged — `master` in this case. This
is desirable because that way the new changes get linted and tested.

The existing `pull_request_target` specifier causes lint and test to run
_on the target branch itself_ (i.e. `master` in this case). That way the
new code in the PR doesn't get linted and tested at all. This can also
lead to security vulnerabilities, as described in the GitHub docs:

![image](https://user-images.githubusercontent.com/2348618/201735153-c5dd0c03-2490-45e9-b7f9-f0d47eb0109f.png)

Screenshot from here:
https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target
Link from the screenshot:
https://securitylab.github.com/research/github-actions-preventing-pwn-requests/
2 years ago
Harrison Chase 6cff2837bb
Harrison/fix lint (#80) 2 years ago
Harrison Chase 9679bdc34c
run workflows on forks (#78)
per
https://stackoverflow.com/questions/58221321/is-github-actions-available-on-forked-repositories
2 years ago
Harrison Chase 18aeb72012 initial commit 2 years ago