langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-04 06:00:26 +00:00

Author	SHA1	Message	Date
Predrag Gruevski	88c5349196	Revert "Rm additional file check for scheduled tests (#11192 )" (#11297 ) This reverts commit `ff90bb59bf`. Requires #11296 to merge first.	2023-10-04 11:35:55 -04:00
Predrag Gruevski	37f2f71156	Trigger Docker release workflow after new langchain release is made. (#11290 ) We want to publish a new Docker image after a new langchain Python package version is published.	2023-10-04 10:27:08 -04:00
Predrag Gruevski	d21dd72d64	Upgrade CI workflows to poetry 1.6.1. (#11344 )	2023-10-03 19:23:54 -04:00
Eugene Yurtsev	2343302fc6	Remove langserve from langchain repo (#11288 ) LangServe has been moved to a separate repo	2023-10-03 10:48:35 -04:00
CG80499	943e4f30d8	Add scoring chain (#11123 )	2023-10-02 15:15:31 -07:00
Bagatur	38d5b63a10	Bedrock scheduled tests (#11194 )	2023-10-02 15:21:54 -04:00
Nuno Campos	32a8b311eb	Add base docker image and ci script for building and pushing (#10927 )	2023-10-02 15:07:57 +01:00
Kazuki Maeda	a363ab5292	rename repo namespace to langchain-ai (#11259 ) ### Description renamed several repository links from `hwchase17` to `langchain-ai`. ### Why I discovered that the README file in the devcontainer contains an old repository name, so I took the opportunity to rename the old repository name in all files within the repository, excluding those that do not require changes. ### Dependencies none ### Tag maintainer @baskaryan ### Twitter handle [kzk_maeda](https://twitter.com/kzk_maeda)	2023-10-01 15:30:58 -04:00
Eugene Yurtsev	aebdb1ad01	Ignore aadd (#11235 )	2023-09-29 21:10:53 +01:00
Bagatur	ff90bb59bf	Rm additional file check for scheduled tests (#11192 ) cc @obi1kenobi Causing issues with GHA creds https://github.com/langchain-ai/langchain/actions/runs/6342674950/job/17228926776	2023-09-28 11:49:26 -07:00
Bagatur	3508e582f1	add anthropic scheduled tests and unit tests (#11188 )	2023-09-28 11:47:29 -07:00
Eugene Yurtsev	176d71dd85	LangServe: Add release workflow (#11178 ) Add release workflow to langserve	2023-09-28 13:47:55 -04:00
Eugene Yurtsev	b05bb9e136	LangServe (#11046 ) Adds LangServe package * Integrate Runnables with Fast API creating Server and a RemoteRunnable client * Support multiple runnables for a given server * Support sync/async/batch/abatch/stream/astream/astream_log on the client side (using async implementations on server) * Adds validation using annotations (relying on pydantic under the hood) -- this still has some rough edges -- e.g., open api docs do NOT generate correctly at the moment * Uses pydantic v1 namespace Known issues: type translation code doesn't handle a lot of types (e.g., TypedDicts) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-09-28 10:52:44 +01:00
Bagatur	040d436b3f	Add vertex scheduled test (#10958 )	2023-09-23 15:51:59 -07:00
C.J. Jameson	b4d2663beb	CONTRIBUTING.md Quick Start: focus on langchain core; clarify docs and experimental are separate (#10906 ) follow up to https://github.com/langchain-ai/langchain/pull/7959 , explaining better to focus just on langchain core no dependencies twitter @cjcjameson	2023-09-22 10:17:08 -07:00
Bagatur	dccc20b402	add model feat table (#10921 )	2023-09-22 01:10:27 -07:00
Nino Risteski	d0070040da	Update CONTRIBUTING.md (#10700 ) fixed a few typos	2023-09-17 16:35:18 -07:00
Harrison Chase	2c957de2fc	add checks on basic base modules (#10693 )	2023-09-16 22:08:11 -07:00
Harrison Chase	5442d2b1fa	Harrison/stop importing from init (#10690 )	2023-09-16 17:22:48 -07:00
Leonid Ganeline	db3369272a	fixed PR template (#10515 ) @hwchase17	2023-09-13 09:35:48 -07:00
Predrag Gruevski	ccb9e3ee2d	Install dev, lint, test, typing extra deps for linting steps. (#10249 ) `mypy` cannot type-check code that relies on dependencies that aren't installed. Eventually we'll probably want to install as many optional dependencies as possible. However, the full "extended deps" setup for langchain creates a 3GB cache file and takes a while to unpack and install. We'll probably want something a bit more targeted. This is a first step toward something better.	2023-09-06 11:15:28 -04:00
Predrag Gruevski	82d5d4d0ae	Deny creating files as a result of test runs. (#10253 ) A test file was accidentally dropping a `results.json` file in the current working directory as a result of running `make test`. This is undesirable, since we don't want to risk accidentally adding stray files into the repo if we run tests locally and then do `git add .` without inspecting the file list very closely.	2023-09-06 11:15:16 -04:00
Predrag Gruevski	7fe8bf03a0	Final poetry action fix: manually recreate softlinks broken by caching. (#10250 ) It seems the caching action was not always correctly recreating softlinks. At first glance, the softlinks it created seemed fine, but they didn't always work. Possibly hitting some kind of underlying bug, but not particularly worth debugging in depth -- we can manually create the soft links we need.	2023-09-05 15:47:58 -04:00
Predrag Gruevski	619516260d	Re-enable poetry binary caching with fix and more logging. (#10244 ) - Revert "Temporarily disable step that seems to be transiently failing. (#10234)" - Refresh shell hashtable and show poetry/python location and version.	2023-09-05 14:03:03 -04:00
Predrag Gruevski	803be5b986	Run CI when CI infra itself has changed. (#10239 ) Make sure that changes to CI infrastructure get tested on CI before being merged. Without this PR, changes to the poetry setup action don't trigger a CI run and in principle could break `master` when merged.	2023-09-05 13:08:19 -04:00
Predrag Gruevski	e34ad6fefd	Temporarily disable step that seems to be transiently failing. (#10234 )	2023-09-05 10:55:47 -04:00
Harrison Chase	c0518be1f1	fix syntax (#10155 )	2023-09-03 16:08:43 -07:00
maks-operlejn-ds	a8f804a618	Add data anonymizer (#9863 ) ### Description The feature for anonymizing data has been implemented. In order to protect private data, such as when querying external APIs (OpenAI), it is worth pseudonymizing sensitive data to maintain full privacy. Anonymization consists of two steps: 1. Identification: Identify all data fields that contain personally identifiable information (PII). 2. Replacement: Replace all PIIs with pseudo values or codes that do not reveal any personal information about the individual but can be used for reference. We're not using regular encryption, because the language model won't be able to understand the meaning or context of the encrypted data. We use Microsoft Presidio together with the Faker framework for anonymization purposes because of the wide range of functionalities they provide. The full implementation is available in `PresidioAnonymizer`. ### Future work - deanonymization - add the ability to reverse anonymization. For example, the workflow could look like this: `anonymize -> LLMChain -> deanonymize`. By doing this, we will retain anonymity in requests to, for example, OpenAI, and then be able to restore the original data. - instance anonymization - at this point, each occurrence of PII is treated as a separate entity and separately anonymized. Therefore, two occurrences of the name John Doe in the text will be changed to two different names. It is therefore worth introducing support for full instance detection, so that repeated occurrences are treated as a single object. ### Twitter handle @deepsense_ai / @MaksOpp --------- Co-authored-by: MaksOpp <maks.operlejn@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-30 10:39:44 -07:00
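The two anonymization steps and the "instance anonymization" limitation described in the commit above can be illustrated with a minimal sketch. This is not `PresidioAnonymizer` and does not use Presidio or Faker; the regex-based e-mail detector, the placeholder format, and all function names below are hypothetical stand-ins chosen only to make the behaviour visible.

```python
import re
from itertools import count

# Hypothetical, simplified identification step: only e-mail addresses count as PII.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize_per_occurrence(text: str) -> str:
    """Give every match a fresh placeholder, even when the value repeats."""
    counter = count(1)
    return EMAIL_RE.sub(lambda m: f"<EMAIL_{next(counter)}>", text)


def anonymize_per_instance(text: str) -> tuple[str, dict[str, str]]:
    """Give repeated values the same placeholder and return the mapping used."""
    mapping: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        value = match.group(0)
        if value not in mapping:
            mapping[value] = f"<EMAIL_{len(mapping) + 1}>"
        return mapping[value]

    return EMAIL_RE.sub(replace, text), mapping


def deanonymize(text: str, mapping: dict[str, str]) -> str:
    """Reverse the replacement using the stored mapping."""
    for value, placeholder in mapping.items():
        text = text.replace(placeholder, value)
    return text


sample = "Mail john@example.com, then cc john@example.com and ann@example.org."
per_occurrence = anonymize_per_occurrence(sample)
per_instance, mapping = anonymize_per_instance(sample)
restored = deanonymize(per_instance, mapping)
```

In this sketch the per-occurrence variant turns the repeated address into two different placeholders, which is the limitation the commit describes, while the per-instance variant keeps a mapping that also makes the `anonymize -> LLMChain -> deanonymize` round trip possible.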
Predrag Gruevski	9aaa0fdce0	Use unified Python setup steps for release workflow.	2023-08-28 14:20:48 +00:00
XUEYANZ	f97d3a76e7	Update CONTRIBUTING.md (#9817 ) Hi LangChain :) Thank you for such a great project! I was going through the CONTRIBUTING.md and found a few minor issues.	2023-08-28 09:38:34 -04:00
Predrag Gruevski	c06f34fa35	Use new Python setup approach for scheduled tests. (#9626 ) Using the same new unified Python setup as the regular tests and the lint job, as set up in #9625.	2023-08-22 16:07:53 -04:00
Predrag Gruevski	83986ea98a	Cache poetry install + unify Python/Poetry setup for lint and test jobs. (#9625 ) With this PR: - All lint and test jobs use the exact same Python + Poetry installation approach, instead of lints doing it one way and tests doing it another way. - The Poetry installation itself is cached, which saves ~15s per run. - We no longer pass shell commands as workflow arguments to a workflow that just runs them in a shell. This makes our actions more resilient to shell code injection. If y'all like this approach, I can modify the scheduled tests workflow and the release workflow to use this too.	2023-08-22 15:59:22 -04:00
seamusp	f3ba9ce7f4	Remove -E all from installation instructions (#9573 ) Update installation instructions to only install test dependencies rather than all dependencies. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-08-22 14:57:58 -04:00
Predrag Gruevski	35812d0096	Set up concurrency groups and workflow cancelation in CI. (#9564 ) If another push to the same PR or branch happens while its CI is still running, cancel the earlier run in favor of the next run. There's no point in testing an outdated version of the code. GitHub only allows a limited number of job runners to be active at the same time, so it's better to cancel pointless jobs early so that more useful jobs can run sooner.	2023-08-22 14:21:26 -04:00
Predrag Gruevski	3c7cc4d440	Test experimental package with `langchain` on `master` branch. (#9621 ) It's possible that langchain-experimental works fine with the latest published langchain, but is broken with the langchain on `master`. Unfortunately, you can see this is currently the case — this is why this PR also includes a minor fix for the `langchain` package itself. We want to catch situations like that before releasing a new langchain, hence this test.	2023-08-22 13:35:21 -04:00
Predrag Gruevski	acb54d8b9d	Reduce cache timeouts to ensure faster builds on timeout. (#9619 ) The current timeouts are too long, and mean that if the GitHub cache decides to act up, jobs get bogged down for 15min at a time. This has happened 2-3 times already this week -- a tiny fraction of our total workflows but really annoying when it happens to you. We can do better. Installing deps on cache miss takes about ~4min, so it's not worth waiting more than 4min for the deps cache. The black and mypy caches save 1 and 2min, respectively, so wait only up to that long to download them.	2023-08-22 12:11:38 -04:00
Predrag Gruevski	a1e89aa8d5	Explicitly add the `contents: write` permission for publishing releases. (#9617 )	2023-08-22 08:38:18 -07:00
Predrag Gruevski	c75e1aa5ed	Eliminate special-casing from test CI workflows. (#9562 ) The previous approach was relying on `_test.yml` taking an input parameter, and then doing almost completely orthogonal things for each parameter value. I've separated out each of those test situations as its own job or workflow file, which eliminated all the special-casing and, in my opinion, improved maintainability by making it much more obvious what code runs when.	2023-08-22 11:36:52 -04:00
Predrag Gruevski	6c308aabae	Use the GitHub-suggested safer pattern for shell interpolation. (#9567 ) Using `${{ }}` to construct shell commands is risky, since the `${{ }}` interpolation runs first and ignores shell quoting rules. This means that shell commands that look safely quoted, like `echo "${{ github.event.issue.title }}"`, are actually vulnerable to shell injection. More details here: https://github.blog/2023-08-09-four-tips-to-keep-your-github-actions-workflows-secure/	2023-08-21 17:59:10 -04:00
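The injection risk described in the commit above can be reproduced outside GitHub Actions. The sketch below is an analogy, not the workflow change itself: Python string formatting stands in for `${{ }}` interpolation, the `TITLE` variable name and the sample title are hypothetical, and a POSIX `sh` is assumed to be available.

```python
import os
import subprocess

# Attacker-controlled text, e.g. an issue title (hypothetical value).
title = 'hello"; echo INJECTED; echo "bye'


def run_shell(script: str, extra_env: dict[str, str]) -> str:
    """Run a script with sh and return its standard output."""
    env = {**os.environ, **extra_env}
    result = subprocess.run(
        ["sh", "-c", script], env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Risky: the value is pasted into the script text before the shell parses it,
# so the quotes inside the value end the string and start new commands.
unsafe_script = 'echo "{}"'.format(title)
unsafe_output = run_shell(unsafe_script, {})

# Safer: the script text is fixed and the value arrives via an environment
# variable, so the shell treats it as data only.
safe_script = 'echo "$TITLE"'
safe_output = run_shell(safe_script, {"TITLE": title})
```

With the pasted-in value the shell runs three separate `echo` commands, including the injected one; with the environment variable it prints the title verbatim as a single line.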
Predrag Gruevski	2a3758a98e	Reminder to not report security issues as "bug" type issues. (#9554 ) Updated the issue template that pops up when users open a new issue.	2023-08-21 15:48:33 -04:00
Predrag Gruevski	9f08d29bc8	Use PyPI Trusted Publishing to publish langchain packages. (#9467 ) Trusted Publishing is the current best practice for publishing Python packages. Rather than long-lived secret keys, it uses OpenID Connect (OIDC) to allow our GitHub runner to directly authenticate itself to PyPI and get a short-lived publishing token. This locks down publishing quite a bit: - There's no long-lived publish key to steal anymore. - Publishing is only allowed via the specifically designated GitHub workflow in the designated repo. It also is operationally easier: no keys means there's nothing that needs to be periodically rotated, nothing to worry about leaking, and nobody can accidentally publish a release from their laptop because they happened to have PyPI keys set up. After this gets merged, we'll need to configure PyPI to start expecting trusted publishing. It's only a few clicks and should only take a minute; instructions are here: https://docs.pypi.org/trusted-publishers/adding-a-publisher/ More info: - https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/ - https://github.com/pypa/gh-action-pypi-publish	2023-08-21 14:44:29 -04:00
Predrag Gruevski	249752e8ee	Require manually triggering release workflows. (#9552 )	2023-08-21 13:54:44 -04:00
Predrag Gruevski	875ea4b4c6	Fix conditional that erroneously always runs. (#9543 ) The input it means to test for is `"libs/langchain"` and not `"langchain"`.	2023-08-21 13:24:33 -04:00
Predrag Gruevski	a7eba8b006	Release on push to `master` instead of on closed PRs targeting it. (#9544 ) This is safer than the prior approach, since it's safe by default: the release workflows never get triggered for non-merged PRs, so there's no possibility of a buggy conditional accidentally letting a workflow proceed when it shouldn't have. The only loss is that publishing no longer requires a `release` label on the merged PR that bumps the version. We can add a separate CI step that enforces that part as a condition for merging into `master`, if desirable.	2023-08-21 12:57:40 -04:00
Predrag Gruevski	a03003f5fd	Upgrade CI poetry version to 1.5.1. (#9479 ) Poetry v1.5.1 was released on May 29, almost 3 months ago. Probably a safe upgrade.	2023-08-21 10:35:56 -04:00
Yuki Miyake	85a1c6d0b7	🐛 fix unexpected run of release workflow (#9494 ) I have discovered a bug located within `.github/workflows/_release.yml` which is the primary cause of continuous integration (CI) errors. The problem can be solved; therefore, I have constructed a PR to address the issue. ## The Issue Access the following link to view the exact errors: [LangChain Release Workflow](https://github.com/langchain-ai/langchain/actions/workflows/langchain_release.yml) These errors occur for each PR that updates `pyproject.toml`, excluding the version-bumping PRs themselves. See below for the specific error message: ``` Error: Error 422: Validation Failed: {"resource":"Release","code":"already_exists","field":"tag_name"} ``` An image of the error can be viewed here: ![Image](https://github.com/langchain-ai/langchain/assets/13769670/13125f73-9b53-49b7-a83e-653bb01a1da1) The `_release.yml` document contains the following if-condition: ```yaml if: \| ${{ github.event.pull_request.merged == true }} && ${{ contains(github.event.pull_request.labels.*.name, 'release') }} ``` ## The Root Cause The above job always runs because the `if-condition` is always evaluated as `true`. ## The Logic The `if-condition` can be written as `if: ${{ b1 }} && ${{ b2 }}`, where `b1` and `b2` are boolean values. However, when GitHub Actions evaluates the condition, `${{ false }}` is treated as a string value, which makes it truthy as per the [official documentation](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idif). I have run some tests regarding this behavior within my forked repository. You can consult my [debug PR](https://github.com/zawakin/langchain/pull/1) for reference. 
Here is the result of the tests: \|If-Condition\|Outcome\| \|:--:\|:--:\| \|`if: true && ${{ false }}`\|Execution\| \|`if: ${{ false }}` \|Skipped\| \|`if: true && false` \|Skipped\| \|`if: false`\|Skipped\| \|`if: ${{ true && false }}` \|Skipped\| In view of the first and second results, we can infer that `${{ false }}` is interpreted as `true` only in conditions that combine it with other expressions. This is consistent with the fact that the condition `if: ${{ inputs.working-directory == 'libs/langchain' }}` works. It is surprising that the second case is skipped, but that seems to be how GitHub Actions is specified 😓 In any case, I believe this PR fixes these errors 👍 Could you review this? @hwchase17 or @shoelsch , who is the author of [PR](https://github.com/langchain-ai/langchain/pull/360).	2023-08-21 10:34:03 -04:00
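The test table in the commit above can be summarised as a toy model. The sketch below is an assumption-laden reading of those five reported outcomes, not GitHub's implementation: `job_runs`, `eval_expr`, and the tiny `true`/`false`/`&&` expression language are all hypothetical.

```python
import re

# Matches one `${{ ... }}` marker; braces are not allowed inside the marker.
EXPR_RE = re.compile(r"\$\{\{([^{}]*)\}\}")


def eval_expr(expr: str) -> bool:
    """Evaluate a tiny expression language: `true`, `false`, and `&&` only."""
    return all(term.strip() == "true" for term in expr.split("&&"))


def job_runs(condition: str) -> bool:
    """Toy model of whether a job with this `if:` value runs."""
    condition = condition.strip()
    whole = EXPR_RE.fullmatch(condition)
    if whole:
        # The whole value is a single expression: evaluate it directly.
        return eval_expr(whole.group(1))
    if "${{" not in condition:
        # No expression markers: treated as if the value were wrapped in one.
        return eval_expr(condition)
    # Mixed text: each marker is substituted, leaving a non-empty string,
    # and a non-empty string counts as truthy.
    rendered = EXPR_RE.sub(lambda m: str(eval_expr(m.group(1))).lower(), condition)
    return rendered != ""


outcomes = {
    cond: job_runs(cond)
    for cond in [
        "true && ${{ false }}",
        "${{ false }}",
        "true && false",
        "false",
        "${{ true && false }}",
        "${{ false }} && ${{ false }}",
    ]
}
```

Under this model only the mixed forms run regardless of their boolean content, which matches the table and the reported always-true behaviour of the `${{ b1 }} && ${{ b2 }}` shape.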
Predrag Gruevski	ade683c589	Rely on `WORKDIR` env var to avoid ugly ternary operators in workflows. (#9456 ) Ternary operators in GitHub Actions syntax are pretty ugly and hard to read: `inputs.working-directory == '' && '.' \|\| inputs.working-directory` means "if the condition is true, use `'.'` and otherwise use the expression after the `\|\|`". This PR performs the ternary as few times as possible, assigning its outcome to an env var we can then reuse as needed.	2023-08-18 12:55:33 -04:00
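The `cond && a || b` idiom from the commit above has a direct Python analogue, `cond and a or b`. The sketch below is only an illustration of the idea of computing the ternary once and reusing the result; the `workdir` function name and the `env` dictionary are hypothetical.

```python
def workdir(working_directory: str) -> str:
    """Mirror `inputs.working-directory == '' && '.' || inputs.working-directory`."""
    return (working_directory == "" and ".") or working_directory


# Compute the value once, then reuse it wherever a directory is needed.
env = {"WORKDIR": workdir("")}
commands = [f"cd {env['WORKDIR']}", f"ls {env['WORKDIR']}"]
```

In Python this idiom gives the wrong branch when the "true" value is itself falsy (for example an empty string); it works here because `'.'` is non-empty.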
Predrag Gruevski	8976483f3a	Lint only on the min and max supported Python versions. (#9450 ) Only lint on the min and max supported Python versions. It's extremely unlikely that there's a lint issue on any version in between that doesn't show up on the min or max versions. GitHub rate-limits how many jobs can be running at any one time. Starting new jobs is also relatively slow, so linting on fewer versions makes CI faster.	2023-08-18 10:26:38 -04:00
Predrag Gruevski	463019ac3e	Cache black formatting information across CI runs. (#9413 ) Save and persist `black`'s formatted files cache across CI runs. Around a ~20s win, 21s -> 2s. Most cases should be close to this best case scenario, since most PRs don't modify most files — and this PR makes sure we don't re-check files that haven't changed. Before: ![image](https://github.com/langchain-ai/langchain/assets/2348618/6c5670c5-be70-4a18-aa2a-ece5e4425d1e) After: ![image](https://github.com/langchain-ai/langchain/assets/2348618/37810d27-c611-4f76-b9bd-e827cefbaa0a)	2023-08-18 09:49:50 -04:00
Predrag Gruevski	0dd2c21089	Do not bust `poetry install` cache when manually installing pydantic v2. (#9407 ) Using `poetry add` to install `pydantic@2.1` was also causing poetry to change its lockfile. This prevented dependency caching from working: - When attempting to restore a cache, it would hash the lockfile in git and use it as part of the cache key. Say this is a cache miss. - Then, it would attempt to save the cache -- but the lockfile will have changed, so the cache key would be different than the key in the lookup. So the cache save would succeed, but to a key that cannot be looked up in the next run -- meaning we never get a cache hit. In addition to busting the cache, the lockfile update itself is also non-trivially long, over 30s: ![image](https://github.com/langchain-ai/langchain/assets/2348618/d84d3b56-484d-45eb-818d-54126a094a40) This PR fixes the problems by using `pip` to perform the installation, avoiding the lockfile change.	2023-08-17 18:23:00 -04:00
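The cache-busting mechanism described in the commit above can be simulated in a few lines. This is a simplified model under stated assumptions: the key format, the in-memory `cache` dictionary, and the `ci_run` function are hypothetical, and the real workflow uses GitHub's cache action rather than anything shown here.

```python
import hashlib


def cache_key(lockfile_text: str) -> str:
    """Hypothetical cache key: a prefix plus a hash of the lockfile contents."""
    digest = hashlib.sha256(lockfile_text.encode("utf-8")).hexdigest()
    return f"poetry-deps-{digest[:16]}"


cache: dict[str, str] = {}


def ci_run(lockfile_text: str, install_mutates_lockfile: bool) -> bool:
    """Simulate one CI run; return True on a cache hit."""
    restore_key = cache_key(lockfile_text)
    if restore_key in cache:
        return True
    # Cache miss: install dependencies. In this model `poetry add` rewrites
    # the lockfile, while a plain `pip install` leaves it untouched.
    if install_mutates_lockfile:
        lockfile_text += "\n# entry added by poetry add"
    # The save key is computed from the lockfile as it is now.
    cache[cache_key(lockfile_text)] = "venv contents"
    return False


committed = "lockfile contents as committed"
hits_with_poetry_add = [ci_run(committed, True) for _ in range(3)]
cache.clear()
hits_with_pip = [ci_run(committed, False) for _ in range(3)]
```

When the install step changes the lockfile, every run saves under a key that the next run never looks up, so all three simulated runs miss; when the lockfile is left alone, the second and third runs hit.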
Predrag Gruevski	8f2d321dd0	Cache .mypy_cache across lint runs. (#9405 ) Preserve the `.mypy_cache` directory across lint runs, to avoid having to re-parse all dependencies and their type information. Approximately a 1min perf win for CI. Before: ![image](https://github.com/langchain-ai/langchain/assets/2348618/6524f2a9-efc0-4588-a94c-69914b98b382) After: ![image](https://github.com/langchain-ai/langchain/assets/2348618/dd0af954-4dc9-43d3-8544-25846616d41d)	2023-08-17 13:53:59 -04:00
Predrag Gruevski	7e63270e04	Ensure the in-project venv gets cached in CI tests. (#9336 ) The previous caching configuration was attempting to cache poetry venvs created in the default shared virtualenvs directory. However, all langchain packages use `in-project = true` for their poetry virtualenv setup, which moves the venv inside the package itself instead. This meant that poetry venvs were not being cached at all. This PR ensures that the venv gets cached by adding the in-project venv directory to the cached directories list. It also makes sure that the cache key only includes the lockfile being installed, as opposed to all lockfiles (unnecessary cache misses) or just the top-level lockfile (cache hits when it shouldn't).	2023-08-17 11:47:22 -04:00
Predrag Gruevski	f2560188ec	Cache linting venv on CI. (#9342 ) Ensure that we cache the linting virtualenv as well as the pip cache for the `pip install -e langchain` step. This is a win of about 60-90s overall. Before: ![image](https://github.com/langchain-ai/langchain/assets/2348618/f55f8398-2c3a-4112-bad3-2c646d186183) After: ![image](https://github.com/langchain-ai/langchain/assets/2348618/984a9529-2431-41b4-97e5-7f5dd7742651)	2023-08-17 11:46:58 -04:00
Bagatur	995ef8a7fc	unpin pydantic (#9356 )	2023-08-17 01:55:46 -07:00
Bagatur	3eccd72382	pin pydantic (#9274 ) don't want default to be v2 yet	2023-08-15 15:02:28 -07:00
Eugene Yurtsev	a091b4bf4c	Update testing workflow to test with both pydantic versions (#9206 ) * PR updates test.yml to test with both pydantic versions * Code should be refactored to make it easier to do testing in matrix format w/ packages * Added steps to assert that pydantic version in the environment is as expected	2023-08-15 13:21:11 -04:00
Bagatur	641cb80c9d	update pr temp (#9062 )	2023-08-10 11:10:06 -07:00
Bagatur	206f809366	fix sched ci (more) (#9056 )	2023-08-10 10:39:29 -07:00
Bagatur	e5db8a16c0	Bagatur/fix sched (#9054 )	2023-08-10 09:34:44 -07:00
Bagatur	e162fd418a	fix sched ci (#9053 )	2023-08-10 09:29:46 -07:00
Bagatur	269f85b7b7	scheduled gha fix (#8977 )	2023-08-09 09:44:25 -07:00
Bagatur	95cf7de112	scheduled tests GHA (#8879 ) Adding scheduled daily GHA that runs marked integration tests. To start just marking some tests in test_openai	2023-08-08 14:55:25 -07:00
Harrison Chase	2448043b84	bump and fix (#8441 )	2023-07-28 17:16:51 -07:00
Harrison Chase	cddd8ae83d	update release yml (#8364 ) only do the step that tags and adds release notes if its langchain	2023-07-27 16:49:04 -07:00
William FH	01a9b06400	Add api cross ref linking (#8275 ) Example of how it would show up in our python docs: ![image](https://github.com/langchain-ai/langchain/assets/13333726/0f0a88cc-ba4a-4778-bc47-118c66807f15) Examples added to the reference docs: https://api.python.langchain.com/en/wfh-api_crosslink/vectorstores/langchain.vectorstores.chroma.Chroma.html#langchain.vectorstores.chroma.Chroma ![image](https://github.com/langchain-ai/langchain/assets/13333726/dcd150de-cb56-4d42-b49a-a76a002a5a52)	2023-07-26 12:38:58 -07:00
Harrison Chase	8dcabd9205	bump releases rc0 (#8097 )	2023-07-21 13:54:57 -07:00
Harrison Chase	0faba034b1	add experimental release action (#8096 )	2023-07-21 13:38:35 -07:00
Harrison Chase	344cbd9c90	update contributor guide (#8088 )	2023-07-21 12:01:05 -07:00
Harrison Chase	da04760de1	Harrison/move experimental (#8084 )	2023-07-21 10:36:28 -07:00
Harrison Chase	f35db9f43e	(WIP) set up experimental (#7959 )	2023-07-21 09:20:24 -07:00
Bagatur	25a2bdfb70	add pr template instructions (#7904 )	2023-07-18 13:22:28 -07:00
Yaroslav Halchenko	0d92a7f357	codespell: workflow, config + some (quite a few) typos fixed (#6785 ) Probably the most boring PR to review ;) Individual commits might be easier to digest --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-07-12 16:20:08 -04:00
os1ma	2667ddc686	Fix `make docs_build` and related scripts (#7276 ) Description: Fixed `make docs_build` and related scripts which caused errors. There are several changes. First, I split the build of the documentation and the API Reference into two separate commands, so that each takes less time to build. The commands for documents are `make docs_build`, `make docs_clean`, and `make docs_linkcheck`. The commands for API Reference are `make api_docs_build`, `api_docs_clean`, and `api_docs_linkcheck`. It looked like `docs/.local_build.sh` could be used to build the documentation, so I used that. Since `.local_build.sh` was also building the API Reference internally, I removed that process. `.local_build.sh` also gained some Bash options so that it stops on error. Furthermore, I added `cd "${SCRIPT_DIR}"` at the beginning so that the script works no matter which directory it is executed in. `docs/api_reference/api_reference.rst` is removed, because it is generated by `docs/api_reference/create_api_rst.py`, and it was added to .gitignore. Finally, the description in CONTRIBUTING.md was modified. Issue: https://github.com/hwchase17/langchain/issues/6413 Dependencies: `nbdoc` was missing in group docs so it was added. I installed it with the `poetry add --group docs nbdoc` command. I am concerned if any modifications are needed to poetry.lock. I would greatly appreciate it if you could pay close attention to this file during the review. Tag maintainer: @baskaryan If this PR needs any additional changes, I'll be happy to make them! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 22:05:14 -04:00
Kazuki Maeda	5c3fe8b0d1	Enhance Makefile with 'format_diff' Option and Improved Readability (#7394 ) ### Description: This PR introduces a new option format_diff to the existing Makefile. This option allows us to apply the formatting tools (Black and isort) only to the changed Python and ipynb files since the last commit. This will make our development process more efficient as we only format the codes that we modify. Along with this change, comments were added to make the Makefile more understandable and maintainable. ### Issue: N/A ### Dependencies: Add dependency to black. ### Tag maintainer: @baskaryan ### Twitter handle: [kzk_maeda](https://twitter.com/kzk_maeda) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-11 21:03:17 -04:00
Bagatur	fad2c7e5e0	update pr tmpl (#7095 )	2023-07-03 13:34:03 -06:00
Davis Chase	00a7403236	update pr tmpl (#6552 )	2023-06-21 10:03:52 -07:00
Brigit Murtaugh	ccd916babe	Update dev container (#6189 ) Fixes https://github.com/hwchase17/langchain/issues/6172 As described in https://github.com/hwchase17/langchain/issues/6172, I'd love to help update the dev container in this project. Summary of changes: - Dev container now builds (the current container in this repo won't build for me) - Dockerfile updates - Update image to our [currently-maintained Python image](https://github.com/devcontainers/images/tree/main/src/python/.devcontainer) (`mcr.microsoft.com/devcontainers/python`) rather than the deprecated image from vscode-dev-containers - Move Dockerfile to root of repo - in order for `COPY` to work properly, it needs the files (in this case, `pyproject.toml` and `poetry.toml`) in the same directory - devcontainer.json updates - Removed `customizations` and `remoteUser` since they should be covered by the updated image in the Dockerfile - Update comments - Update docker-compose.yaml to properly point to updated Dockerfile - Add a .gitattributes to avoid line ending conversions, which can result in hundreds of pending changes ([info](https://code.visualstudio.com/docs/devcontainers/tips-and-tricks#_resolving-git-line-ending-issues-in-containers-resulting-in-many-modified-files)) - Add a README in the .devcontainer folder and info on the dev container in the contributing.md Outstanding questions: - Is it expected for `poetry install` to take some time? It takes about 30 minutes for this dev container to finish building in a Codespace, but a user should only have to experience this once. Through some online investigation, this doesn't seem unusual - Versions of poetry newer than 1.3.2 failed every time - based on some of the guidance in contributing.md and other online resources, it seemed changing poetry versions might be a good solution. 
1.3.2 is from Jan 2023 --------- Co-authored-by: bamurtaugh <brmurtau@microsoft.com> Co-authored-by: Samruddhi Khandale <samruddhikhandale@github.com>	2023-06-16 15:42:14 -07:00
Davis Chase	87e502c6bc	Doc refactor (#6300 ) Co-authored-by: jacoblee93 <jacoblee93@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-16 11:52:56 -07:00
Zander Chase	2c9619bc1d	Remove from PR template (#6018 )	2023-06-11 19:34:26 -07:00
Zander Chase	6655f43282	Rm Template Title (#5616 ) Remove the redundant title from the PR template	2023-06-02 06:54:55 -07:00
Jacob Lee	f77f27163d	Update PR template with Twitter handle request (#5382 ) # Updates PR template to request Twitter handle for shoutouts! Makes it easier for maintainers to show their appreciation 😄	2023-05-29 06:23:17 -07:00
Eugene Yurtsev	a669abf16b	Update CONTRIBUTION guidelines and PR Template (#5140 ) # Update contribution guidelines and PR template This PR updates the contribution guidelines to include more information on how to handle optional dependencies. The PR template is updated to include a link to the contribution guidelines document.	2023-05-26 10:18:11 -04:00
Davis Chase	56cb77a828	Make test gha workflow manually runnable (#4998 ) if https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#workflow_dispatch is to be believed this should make it possible to manually kick off the test workflow, but I don't know much about these things	2023-05-19 13:46:33 -07:00
Eugene Yurtsev	3ecd7c9641	Add check to verify poetry.toml (#4794 ) # Add poetry check to github action Check poetry toml file during tests for errors	2023-05-16 11:53:06 -04:00
Eugene Yurtsev	14bedf1cc5	Github Action: Fix poetry lock file checking (#4789 ) Fix how poetry lock file is checked to avoid skipping caches silently.	2023-05-16 11:40:28 -04:00
Eugene Yurtsev	49ce5ce1ca	Only run linkcheck against docs dir on PR (#4741 ) # Only run linkchecker on direct changes to docs This is a stop-gap that will speed up PRs. Some broken links can slip through if they're embedded in doc-strings inside the codebase. But we'll still be running the linkchecker on master.	2023-05-15 14:40:43 -04:00
Eugene Yurtsev	99cfe71cd0	Check poetry lock file (#4740 ) # Check poetry lock file on CI This PR checks that the lock file is up to date using poetry lock --check. As part of this PR, a new lock file was generated.	2023-05-15 14:38:01 -04:00
Eugene Yurtsev	08ed927c32	Turn on extended tests (#4588 ) # Turn on strict extended tests This PR turns on strict testing for extended tests.	2023-05-12 14:50:08 -04:00
Davis Chase	cd01de49cf	Update contribution guidelines (#4431 ) provide more guidance on pr's	2023-05-11 00:05:25 -07:00
Eugene Yurtsev	146616aa5d	Test workflow, fix minor typos (#4495 ) # Fix 2 minor typos in test workflow. This PR does not result in any functional changes.	2023-05-10 22:36:50 -04:00
Eugene Yurtsev	f373883c1a	Refactor test workflow (#4457 ) # Refactor the test workflow This PR refactors the tests to run using a single test workflow. This makes it easier to relaunch failing tests and see in the UI which test failed since the jobs are grouped together.	2023-05-10 21:57:39 -04:00
Eugene Yurtsev	80558b5b27	Add workflow for testing with all deps (#4410 ) # Add action to test with all dependencies installed PR adds a custom action for setting up poetry that allows specifying a cache key: https://github.com/actions/setup-python/issues/505#issuecomment-1273013236 This makes it possible to run 2 types of unit tests: (1) unit tests with only core dependencies (2) unit tests with extended dependencies (e.g., those that rely on an optional pdf parsing library) As part of this PR, we're moving some pdf parsing tests into the unit-tests section and making sure that these unit tests get executed when running with extended dependencies.	2023-05-10 09:35:07 -04:00
Zander Chase	0870a45a69	Add Pull Request Template (#4247 )	2023-05-08 08:34:37 -07:00
Zander Chase	cc068f1b77	Add Issue Templates (#4021 ) Add issue templates for - bug reports - feature suggestions - documentation and a link to the discord for general discussion. Open to other suggestions here. Could also add another "Other" template with just a raw text box if we think this is too restrictive	2023-05-04 16:33:52 -07:00
Ehsan M. Kermani	5d0674fb46	Use a consistent poetry version everywhere (#3250 ) Fixes the discrepancy of poetry version in Dockerfile and the GAs	2023-04-24 18:19:51 -07:00
Noah Gundotra	577ec92f16	Include testing instructions for getting setup in CONTRIBUTING.md (#3020 ) Running tests is a good sanity check for new users to ensure their development environment is set up correctly.	2023-04-17 08:34:07 -07:00
sergerdn	90973c10b1	fix: tests with Dockerfile (#2382 ) Update the Dockerfile to use the `$POETRY_HOME` argument to set the Poetry home directory instead of adding Poetry to the PATH environment variable. Add instructions to the `CONTRIBUTING.md` file on how to run tests with Docker. Closes https://github.com/hwchase17/langchain/issues/2324	2023-04-04 06:47:19 -07:00
andrewmelis	7ed8d00bba	Remove extra word in CONTRIBUTING.md (#2370 ) "via by a developer" -> "by a developer" --- Thank you for all your hard work!	2023-04-03 21:48:58 -07:00
Matt Tucker	fa2e546b76	Add workaround for debugpy install issue to contrib docs. (#1835 ) When following the Quick Start instructions in the contributing docs, I was getting a "WheelFileValidationError" on installation of debugpy which was blocking the installation of a number of other deps. Google turned up this [GitHub issue](https://github.com/microsoft/debugpy/issues/1246) indicating a regression in Poetry 1.4.1 and workarounds. This PR updates the contrib docs noting the issue and the workarounds.	2023-03-20 22:03:19 -07:00
Harrison Chase	b053f831cd	Harrison/contributing (#1542 ) Co-authored-by: Saurav Maheshkar <sauravvmaheshkar@gmail.com>	2023-03-08 20:53:16 -08:00
Harrison Chase	012a6dfb16	Harrison/makefile (#1033 ) Co-authored-by: blob42 <contact@blob42.xyz> Co-authored-by: blob42 <spike@w530>	2023-02-13 21:08:47 -08:00
Steven Hoelscher	a5999351cf	chore: add release workflow (#360 ) Adds release workflow that (1) creates a GitHub release and (2) publishes built artifacts to PyPI Release Workflow 1. Checkout `master` locally and cut a new branch 1. Run `poetry version <rule>` to version bump (e.g., `poetry version patch`) 1. Commit changes and push to remote branch 1. Ensure all quality check workflows pass 1. Explicitly tag PR with `release` label 1. Merge to mainline At this point, a release workflow should be triggered because: * The PR is closed, targeting `master`, and merged * `pyproject.toml` has been detected as modified * The PR had a `release` label The workflow will then proceed to build the artifacts, create a GitHub release with release notes and uploaded artifacts, and publish to PyPI. Example Workflow run: https://github.com/shoelsch/langchain/actions/runs/3711037455/jobs/6291076898 Example Releases: https://github.com/shoelsch/langchain/releases -- Note, this workflow is looking for the `PYPI_API_TOKEN` secret, so that will need to be uploaded to the repository secrets. I tested uploading as far as hitting a permissions issue due to project ownership in Test PyPI.	2023-01-15 18:35:21 -08:00
Harrison Chase	9753bccc71	Feature: linkcheck-action (#534 ) (#542 ) - Add support for local build and linkchecking of docs - Add GitHub Action to automatically check links prior to publication - Minor reformat of Contributing readme - Fix existing broken links Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com> Co-authored-by: Hunter Gerlach <HunterGerlach@users.noreply.github.com>	2023-01-04 21:39:50 -08:00
Hunter Gerlach	482611f426	unit test / code coverage improvements (#322 ) This PR has two contributions: 1. Add test for when stop token is found in middle of text 2. Add code coverage tooling and instructions - Add pytest-cov via poetry - Add necessary config files - Add new make instruction for `coverage` - Update README with coverage guidance - Update minor README formatting/spelling Co-authored-by: Hunter Gerlach <hunter@huntergerlach.com>	2022-12-13 05:48:53 -08:00
Christian Clauss	2fbb152386	Add Python 3.11 to the testing (#324 )	2022-12-12 07:19:52 -08:00
Christian Clauss	d946be2f3d	Add Python 3.11 to the testing (#323 )	2022-12-12 06:09:08 -08:00
Harrison Chase	3c1c7ba672	update branch name in gha (#274 )	2022-12-06 22:28:50 -08:00
Steven Hoelscher	98fb19b535	chore: use poetry as dependency manager (#242 ) * Adopts [Poetry](https://python-poetry.org/) as a dependency manager * Introduces dependency version requirements * Deprecates Python 3.7 support TODO - [x] Update developer guide - [x] Add back `playwright`, `manifest-ml`, and `jupyter` to dependency group Not Doing => Fast Follow - Investigate single source for version, perhaps relying on GitHub tags and [tackling this issue](https://github.com/hwchase17/langchain/issues/26)	2022-12-03 16:42:59 -08:00
Predrag Gruevski	1a95252f00	Use `pull_request` not `pull_request_target` in GitHub Actions. (#139 ) `pull_request` runs on the merge commit between the opened PR and the target branch where the PR is to be merged — `master` in this case. This is desirable because that way the new changes get linted and tested. The existing `pull_request_target` specifier causes lint and test to run _on the target branch itself_ (i.e. `master` in this case). That way the new code in the PR doesn't get linted and tested at all. This can also lead to security vulnerabilities, as described in the GitHub docs: ![image](https://user-images.githubusercontent.com/2348618/201735153-c5dd0c03-2490-45e9-b7f9-f0d47eb0109f.png) Screenshot from here: https://docs.github.com/en/actions/using-workflows/events-that-trigger-workflows#pull_request_target Link from the screenshot: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/	2022-11-14 11:34:08 -08:00
Harrison Chase	6cff2837bb	Harrison/fix lint (#80 )	2022-11-07 15:22:37 -08:00
Harrison Chase	9679bdc34c	run workflows on forks (#78 ) per https://stackoverflow.com/questions/58221321/is-github-actions-available-on-forked-repositories	2022-11-07 05:53:17 -08:00
Harrison Chase	18aeb72012	initial commit	2022-10-24 14:51:15 -07:00

212 Commits