langchain

mirror of https://github.com/hwchase17/langchain synced 2024-11-08 07:10:35 +00:00

Author	SHA1	Message	Date
Erick Friis	e5878c467a	infra: scheduled testing env (#16239 )	2024-01-18 14:28:01 -08:00
Erick Friis	2f348c695a	infra: add nvidia api secret to integration testing (#15972 )	2024-01-18 14:20:02 -08:00
Erick Friis	50959abf0c	infra: google cse id integration test (#16238 )	2024-01-18 14:12:00 -08:00
Erick Friis	92bc80483a	infra: google search api key (#16237 )	2024-01-18 14:06:38 -08:00
purificant	3606c5d5e9	infra: update poetry 1.6.1 -> 1.7.1 (#15027 )	2024-01-17 08:51:20 -08:00
Erick Friis	6a2889a4ec	infra: retry release if not found on test pypi (#15913 )	2024-01-12 09:36:52 -08:00
Erick Friis	98be1e5ed0	infra: title release action runs (#15612 ) https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#run-name	2024-01-05 15:24:57 -08:00
Erick Friis	1a42ad353a	infra: vertex integration test creds (#15609 )	2024-01-05 15:03:39 -08:00
Erick Friis	ebc75c5ca7	openai[minor]: implement langchain-openai package (#15503 ) Todo - [x] copy over integration tests - [x] update docs with new instructions in #15513 - [x] add linear ticket to bump core -> community, community->langchain, and core->openai deps - [ ] (optional): add `pip install langchain-openai` command to each notebook using it - [x] Update docstrings to not need `openai` install - [x] Add serialization - [x] deprecate old models Contributor steps: - [x] Add secret names to manual integrations workflow in .github/workflows/_integration_test.yml - [x] Add secrets to release workflow (for pre-release testing) in .github/workflows/_release.yml Maintainer steps (Contributors should not do these): - [x] set up pypi and test pypi projects - [x] add credential secrets to Github Actions - [ ] add package to conda-forge Functional changes to existing classes: - now relies on openai client v1 (1.6.1) via concrete dep in langchain-openai package Codebase organization - some function calling stuff moved to `langchain_core.utils.function_calling` in order to be used in both community and langchain-openai	2024-01-05 15:03:28 -08:00
Erick Friis	1437872df9	infra: fail check_diffs if too many files changed (#15423 ) Jobs like https://github.com/langchain-ai/langchain/actions/runs/7389187843/job/20101494206 only receive the first 300 changed files. Because of the opportunity to miss packages, better to auto-fail and manually run. Checking that it does what I expect in #15424	2024-01-03 13:30:16 -08:00
Bagatur	63e0cae2b1	infra: fix min deps test (#15486 )	2024-01-03 11:34:46 -05:00
Bagatur	54b58c03db	infra: add minimum deps pre release check (#15485 )	2024-01-03 11:28:35 -05:00
Erick Friis	a8f6f33cd9	infra: remove path filter on check_diffs (#15418 ) CI should run on https://github.com/langchain-ai/langchain/pull/15412 But github only checks first 300 files: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#git-diff-comparisons > Diffs are limited to 300 files. If there are files changed that aren't matched in the first 300 files returned by the filter, the workflow will not run. You may need to create more specific filters so that the workflow will run automatically.	2024-01-02 13:10:48 -05:00
purificant	619cd3ce54	ci: upgrade actions (#15114 ) This PR upgrades CI actions [actions/setup-python](https://github.com/actions/setup-python/releases/tag/v5.0.0) and [google-github-actions/auth](https://github.com/google-github-actions/auth/releases/tag/v2.0.0)	2024-01-01 14:02:43 -08:00
Erick Friis	8a3360edf6	anthropic: beta messages integration (#14928 )	2023-12-19 18:55:19 -08:00
Erick Friis	795cf2ddda	together: package and embedding model (#14936 )	2023-12-19 18:48:32 -08:00
Bagatur	a5be9f9475	mistralai: Add langchain-mistralai partner package (#14783 ) Co-authored-by: Chad Phillips <chad@apartmentlines.com>	2023-12-19 10:34:19 -05:00
Erick Friis	1acc7ffa3f	infra: cut down on integration steps (#14785 ) Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-17 12:55:59 -08:00
William FH	c5296fd42c	[Documentation] Updates to NVIDIA Playground/Foundation Model naming.… (#14770 ) … (#14723) - Description: Minor updates per marketing requests. Namely, name decisions (AI Foundation Models / AI Playground) - Tag maintainer: @hinthornw Do want to pass around the PR for a bit and ask a few more marketing questions before merge, but just want to make sure I'm not working in a vacuum. No major changes to code functionality intended; the PR should be for documentation and only minor tweaks. Note: QA model is a bit borked across staging/prod right now. Relevant teams have been informed and are looking into it, and I'm placeholdered the response to that of a working version in the notebook. Co-authored-by: Vadim Kudlay <32310964+VKudlay@users.noreply.github.com>	2023-12-15 12:21:59 -08:00
Bagatur	c7b5dbe8ec	infra: fix pre-release integration test and add unit test (#14742 )	2023-12-14 16:57:41 -08:00
Bagatur	b9975fac89	infra: add action checkout to pre-release-checks (#14732 )	2023-12-14 13:28:13 -08:00
Bagatur	ba897fc04c	infra: Pre-release integration tests for partner pkgs (#14687 )	2023-12-14 13:11:19 -08:00
Bagatur	74211aa02e	infra: add integration test workflow (#14688 )	2023-12-14 12:46:45 -08:00
William FH	bc3ec78a38	[Workflows] Add nvidia-aiplay to _release.yml (#14722 ) As the title says. In the future will want to have a script to automate this	2023-12-14 09:16:40 -08:00
William FH	405d111da6	[Partner] Add langchain-google-genai package (gemini) (#14621 ) Add a new ChatGoogleGenerativeAI class in a `langchain-google-genai` package. Still todo: add a deprecation warning in PALM --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-12-13 11:57:59 -08:00
Erick Friis	231891706b	infra: skip extended testing for partner packages (#14630 ) Tested by merging into #14627	2023-12-13 09:58:48 -08:00
Bagatur	48b7a0584d	infra: Turn release branch check back on (#14563 )	2023-12-11 14:40:24 -08:00
Bagatur	ed58eeb9c5	community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463 ) Moved the following modules to new package langchain-community in a backwards compatible fashion: ``` mv langchain/langchain/adapters community/langchain_community mv langchain/langchain/callbacks community/langchain_community/callbacks mv langchain/langchain/chat_loaders community/langchain_community mv langchain/langchain/chat_models community/langchain_community mv langchain/langchain/document_loaders community/langchain_community mv langchain/langchain/docstore community/langchain_community mv langchain/langchain/document_transformers community/langchain_community mv langchain/langchain/embeddings community/langchain_community mv langchain/langchain/graphs community/langchain_community mv langchain/langchain/llms community/langchain_community mv langchain/langchain/memory/chat_message_histories community/langchain_community mv langchain/langchain/retrievers community/langchain_community mv langchain/langchain/storage community/langchain_community mv langchain/langchain/tools community/langchain_community mv langchain/langchain/utilities community/langchain_community mv langchain/langchain/vectorstores community/langchain_community mv langchain/langchain/agents/agent_toolkits community/langchain_community mv langchain/langchain/cache.py community/langchain_community ``` Moved the following to core ``` mv langchain/langchain/utils/json_schema.py core/langchain_core/utils mv langchain/langchain/utils/html.py core/langchain_core/utils mv langchain/langchain/utils/strings.py core/langchain_core/utils cat langchain/langchain/utils/env.py >> core/langchain_core/utils/env.py rm langchain/langchain/utils/env.py ``` See .scripts/community_split/script_integrations.sh for all changes	2023-12-11 13:53:30 -08:00
Bagatur	300305e5e5	infra: add langchain-community release workflow (#14469 )	2023-12-08 13:31:15 -08:00
Erick Friis	b3f226e8f8	core[patch], langchain[patch], experimental[patch]: import CI (#14414 )	2023-12-08 11:28:55 -08:00
Erick Friis	477b274a62	langchain[patch]: fix scheduled testing ci dep install (#14460 )	2023-12-08 10:37:44 -08:00
Erick Friis	ff0d5514c1	langchain[patch]: fix scheduled testing ci variables (#14459 )	2023-12-08 10:27:21 -08:00
Erick Friis	1d725327eb	langchain[patch]: Fix scheduled testing (#14428 ) - integration tests in pyproject - integration test fixes	2023-12-08 10:23:02 -08:00
Bagatur	b2280fd874	core[patch], langchain[patch]: fix required deps (#14373 )	2023-12-07 14:24:58 -08:00
Bagatur	ce4d81f88b	infra: ci matrix (#14306 )	2023-12-06 11:43:03 -08:00
Bagatur	48fbc5513d	infra[patch], langchain[patch]: fix test deps and upper bound langchain dep on core(#13984 )	2023-11-28 13:26:15 -08:00
Nuno Campos	e0bcc98436	infra[patch]: Use langchain core in-tree as a dev dependency (#13957 ) Using the published version means master is broken for contributors whenever we make changes in one lib that depend on the other.	2023-11-28 09:23:43 -08:00
Bagatur	bcf83988ec	Revert "INFRA: temp rm master condition (#13753 )" (#13759 )	2023-11-22 17:22:07 -08:00
Bagatur	df471b0c0b	INFRA: temp rm master condition (#13753 )	2023-11-22 16:59:50 -08:00
Bagatur	b6b7654f7f	INFRA: run LC ci after core changes (#13742 )	2023-11-22 13:38:48 -08:00
Bagatur	c61e30632e	BUG: more core fixes (#13665 ) Fix some circular deps: - move PromptValue into top level module bc both PromptTemplates and OutputParsers import - move tracer context vars to `tracers.context` and import them in functions in `callbacks.manager` - add core import tests	2023-11-21 15:15:48 -08:00
Harrison Chase	d82cbf5e76	Separate out langchain_core package (#13577 ) Co-authored-by: Nuno Campos <nuno@boringbits.io> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2023-11-20 13:09:30 -08:00
Erick Friis	9545f0666d	fix cli release (#13373 ) My thought is that the ==version would prevent pip from finding the package on regular [pypi.org](http://pypi.org/), so it would look at [test.pypi.org](http://test.pypi.org/) for that. Otherwise it'll pull package from [pypi.org](http://pypi.org/) (e.g. sub deps) Right now, the cli release is failing because it's going to test.pypi.org by default, so it finds this incorrect FASTAPI package instead of the real one: https://test.pypi.org/project/FASTAPI/	2023-11-14 15:08:35 -08:00
Predrag Gruevski	2ebd167dba	Lint Python notebooks with ruff. (#12677 ) The new ruff version fixed the blocking bugs, and I was able to fairly easily get us to a passing state: ruff fixed some issues on its own, I fixed a handful by hand, and I added a list of narrowly-targeted exclusions for files that are currently failing ruff rules that we probably should look into eventually. I went pretty lenient on the docs / cookbooks rules, allowing dead code and such things. Perhaps in the future we may want to tighten the rules further, but this is already a good set of checks that found real issues and will prevent them going forward.	2023-11-14 15:58:22 -05:00
Predrag Gruevski	724b92231d	Remove `black` caching config from CI lint workflow. (#12594 ) To merge after #12585 is merged.	2023-10-31 21:39:05 -04:00
Predrag Gruevski	0ea837404a	Only publish to test PyPI from the `_test_release.yml` workflow. (#12668 ) PyPI trusted publishing wants to know which workflow is expected to do the publish. We always want to publish from the same workflow, so we're making `_test_release.yml` the only workflow that publishes to Test PyPI.	2023-10-31 21:36:38 -04:00
Predrag Gruevski	321cd44f13	Use separate jobs for building and publishing test releases. (#12671 ) This follows the principle of least privilege. Our `poetry build` step doesn't need, and shouldn't get, access to our GitHub OIDC capability. This is the same structure as I used in the already-merged PR for refactoring the regular PyPI release workflow: #12578.	2023-10-31 21:36:26 -04:00
Predrag Gruevski	aa3f4a9bc8	Remove the CLI package's pydantic compatibility tests. (#12675 ) They aren't necessary, since the CLI package doesn't have a direct dependency on pydantic.	2023-10-31 16:57:38 -04:00
Predrag Gruevski	360cff81a3	Overwrite existing distributions when uploading to test PyPI. (#12658 )	2023-10-31 10:02:50 -07:00
Predrag Gruevski	5308b836c7	Upgrade to `actions/checkout@v4` in the docs lint job. (#12581 )	2023-10-31 12:41:18 -04:00
Predrag Gruevski	94f018f1ba	Support release-testing packages with dashes in their names. (#12654 )	2023-10-31 12:40:34 -04:00
Predrag Gruevski	72fa5a463d	Show ruff output inline in GitHub PRs. (#12647 )	2023-10-31 12:16:01 -04:00
Erick Friis	e933212a3d	run poetry build in working dir (#12610 ) Was failing because was trying to build from root: https://github.com/langchain-ai/langchain/actions/runs/6700033981/job/18205251365	2023-10-30 16:58:34 -07:00
Predrag Gruevski	3c5c384f1a	Test-publish to test PyPI and separate jobs to limit permissions. (#12578 ) Before making a new `langchain` release, we want to test that everything works as expected. This PR lets us publish `langchain` to test PyPI, then install it from there and run checks to ensure everything works normally before publishing it "for real". It also takes the opportunity to refactor the build process, splitting up the build, release-creation, and PyPI upload steps into separate jobs that do not share their elevated permissions with each other.	2023-10-30 17:10:14 -04:00
Bagatur	0b4b9e61fc	Bagatur/fix doc ci (#12529 )	2023-10-29 16:15:18 -07:00
Bagatur	2424fff3f1	notebook fmt (#12498 )	2023-10-29 15:50:09 -07:00
Harrison Chase	0660c06cf1	add gha for cli (#12492 )	2023-10-28 21:49:28 -07:00
Erick Friis	afcc12d99e	Templates CI (#12313 ) Adds a `langchain-location` param to lint, so we can properly locate it. Regular langchain and experimental lint steps are passing, so default value seems to be working.	2023-10-26 20:29:36 -07:00
Erick Friis	4db8d82c55	CLI CI 2 (#12387 ) Will run all CI because of _test change, but future PRs against CLI will only trigger the new CLI one Has a bunch of file changes related to formatting/linting. No mypy yet - coming soon	2023-10-26 17:01:31 -07:00
Bagatur	7cadf00570	better lint triggering (#12376 )	2023-10-26 15:31:20 -07:00
Bagatur	76230d2c08	fireworks scheduled integration tests (#12373 )	2023-10-26 14:24:42 -07:00
Bagatur	b10cefb160	lint fix: rm init (#12374 )	2023-10-26 14:16:25 -07:00
Bagatur	2008a6438c	add experimental test release gha (#12229 )	2023-10-24 13:49:16 -07:00
Bagatur	8ba97cb408	separate compile integration tests (#12171 ) Co-authored-by: Predrag Gruevski <2348618+obi1kenobi@users.noreply.github.com>	2023-10-24 08:55:19 -07:00
Bagatur	963ff93476	bump 321 (#12161 )	2023-10-23 12:49:38 -04:00
Predrag Gruevski	95a1b598fe	Update to `actions/checkout@v4`. (#11951 ) We don't use any of the new functionality at the moment. Just making sure we don't fall back on versions and fail to benefit from new patches. This is an easy upgrade and it's always harder to upgrade across multiple major versions at once.	2023-10-23 10:01:33 -04:00
Bagatur	85302a9ec1	Add CI check that integration tests compile (#12090 )	2023-10-21 10:52:18 -04:00
Bagatur	bd74eba152	add azure openai sched tests (#11723 )	2023-10-12 10:48:45 -07:00
Bagatur	eedfddac2d	Restructure docs (#11620 )	2023-10-10 12:55:19 -07:00
Nuno Campos	2c11302598	Update langchain_release.yml (#11444 )	2023-10-05 14:23:27 -04:00
Predrag Gruevski	88c5349196	Revert "Rm additional file check for scheduled tests (#11192 )" (#11297 ) This reverts commit `ff90bb59bf`. Requires #11296 to merge first.	2023-10-04 11:35:55 -04:00
Predrag Gruevski	37f2f71156	Trigger Docker release workflow after new langchain release is made. (#11290 ) We want to publish a new Docker image after a new langchain Python package version is published.	2023-10-04 10:27:08 -04:00
Predrag Gruevski	d21dd72d64	Upgrade CI workflows to poetry 1.6.1. (#11344 )	2023-10-03 19:23:54 -04:00
Eugene Yurtsev	2343302fc6	Remove langserve from langchain repo (#11288 ) LangServe has been moved to a separate repo	2023-10-03 10:48:35 -04:00
CG80499	943e4f30d8	Add scoring chain (#11123 )	2023-10-02 15:15:31 -07:00
Bagatur	38d5b63a10	Bedrock scheduled tests (#11194 )	2023-10-02 15:21:54 -04:00
Nuno Campos	32a8b311eb	Add base docker image and ci script for building and pushing (#10927 )	2023-10-02 15:07:57 +01:00
Eugene Yurtsev	aebdb1ad01	Ignore aadd (#11235 )	2023-09-29 21:10:53 +01:00
Bagatur	ff90bb59bf	Rm additional file check for scheduled tests (#11192 ) cc @obi1kenobi Causing issues with GHA creds https://github.com/langchain-ai/langchain/actions/runs/6342674950/job/17228926776	2023-09-28 11:49:26 -07:00
Bagatur	3508e582f1	add anthropic scheduled tests and unit tests (#11188 )	2023-09-28 11:47:29 -07:00
Eugene Yurtsev	176d71dd85	LangServe: Add release workflow (#11178 ) Add release workflow to langserve	2023-09-28 13:47:55 -04:00
Eugene Yurtsev	b05bb9e136	LangServe (#11046 ) Adds LangServe package * Integrate Runnables with Fast API creating Server and a RemoteRunnable client * Support multiple runnables for a given server * Support sync/async/batch/abatch/stream/astream/astream_log on the client side (using async implementations on server) * Adds validation using annotations (relying on pydantic under the hood) -- this still has some rough edges -- e.g., open api docs do NOT generate correctly at the moment * Uses pydantic v1 namespace Known issues: type translation code doesn't handle a lot of types (e.g., TypedDicts) --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2023-09-28 10:52:44 +01:00
Bagatur	040d436b3f	Add vertex scheduled test (#10958 )	2023-09-23 15:51:59 -07:00
Bagatur	dccc20b402	add model feat table (#10921 )	2023-09-22 01:10:27 -07:00
Harrison Chase	2c957de2fc	add checks on basic base modules (#10693 )	2023-09-16 22:08:11 -07:00
Harrison Chase	5442d2b1fa	Harrison/stop importing from init (#10690 )	2023-09-16 17:22:48 -07:00
Predrag Gruevski	ccb9e3ee2d	Install dev, lint, test, typing extra deps for linting steps. (#10249 ) `mypy` cannot type-check code that relies on dependencies that aren't installed. Eventually we'll probably want to install as many optional dependencies as possible. However, the full "extended deps" setup for langchain creates a 3GB cache file and takes a while to unpack and install. We'll probably want something a bit more targeted. This is a first step toward something better.	2023-09-06 11:15:28 -04:00
Predrag Gruevski	82d5d4d0ae	Deny creating files as a result of test runs. (#10253 ) A test file was accidentally dropping a `results.json` file in the current working directory as a result of running `make test`. This is undesirable, since we don't want to risk accidentally adding stray files into the repo if we run tests locally and then do `git add .` without inspecting the file list very closely.	2023-09-06 11:15:16 -04:00
Predrag Gruevski	803be5b986	Run CI when CI infra itself has changed. (#10239 ) Make sure that changes to CI infrastructure get tested on CI before being merged. Without this PR, changes to the poetry setup action don't trigger a CI run and in principle could break `master` when merged.	2023-09-05 13:08:19 -04:00
maks-operlejn-ds	a8f804a618	Add data anonymizer (#9863 ) ### Description The feature for anonymizing data has been implemented. In order to protect private data, such as when querying external APIs (OpenAI), it is worth pseudonymizing sensitive data to maintain full privacy. Anonymization consists of two steps: 1. Identification: Identify all data fields that contain personally identifiable information (PII). 2. Replacement: Replace all PIIs with pseudo values or codes that do not reveal any personal information about the individual but can be used for reference. We're not using regular encryption, because the language model won't be able to understand the meaning or context of the encrypted data. We use Microsoft Presidio together with Faker framework for anonymization purposes because of the wide range of functionalities they provide. The full implementation is available in `PresidioAnonymizer`. ### Future works - deanonymization - add the ability to reverse anonymization. For example, the workflow could look like this: `anonymize -> LLMChain -> deanonymize`. By doing this, we will retain anonymity in requests to, for example, OpenAI, and then be able to restore the original data. - instance anonymization - at this point, each occurrence of PII is treated as a separate entity and separately anonymized. Therefore, two occurrences of the name John Doe in the text will be changed to two different names. It is therefore worth introducing support for full instance detection, so that repeated occurrences are treated as a single object. ### Twitter handle @deepsense_ai / @MaksOpp --------- Co-authored-by: MaksOpp <maks.operlejn@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-08-30 10:39:44 -07:00
Predrag Gruevski	9aaa0fdce0	Use unified Python setup steps for release workflow.	2023-08-28 14:20:48 +00:00
Predrag Gruevski	c06f34fa35	Use new Python setup approach for scheduled tests. (#9626 ) Using the same new unified Python setup as the regular tests and the lint job, as set up in #9625.	2023-08-22 16:07:53 -04:00
Predrag Gruevski	83986ea98a	Cache poetry install + unify Python/Poetry setup for lint and test jobs. (#9625 ) With this PR: - All lint and test jobs use the exact same Python + Poetry installation approach, instead of lints doing it one way and tests doing it another way. - The Poetry installation itself is cached, which saves ~15s per run. - We no longer pass shell commands as workflow arguments to a workflow that just runs them in a shell. This makes our actions more resilient to shell code injection. If y'all like this approach, I can modify the scheduled tests workflow and the release workflow to use this too.	2023-08-22 15:59:22 -04:00
Predrag Gruevski	35812d0096	Set up concurrency groups and workflow cancelation in CI. (#9564 ) If another push to the same PR or branch happens while its CI is still running, cancel the earlier run in favor of the next run. There's no point in testing an outdated version of the code. GitHub only allows a limited number of job runners to be active at the same time, so it's better to cancel pointless jobs early so that more useful jobs can run sooner.	2023-08-22 14:21:26 -04:00
Predrag Gruevski	3c7cc4d440	Test experimental package with `langchain` on `master` branch. (#9621 ) It's possible that langchain-experimental works fine with the latest published langchain, but is broken with the langchain on `master`. Unfortunately, you can see this is currently the case — this is why this PR also includes a minor fix for the `langchain` package itself. We want to catch situations like that before releasing a new langchain, hence this test.	2023-08-22 13:35:21 -04:00
Predrag Gruevski	acb54d8b9d	Reduce cache timeouts to ensure faster builds on timeout. (#9619 ) The current timeouts are too long, and mean that if the GitHub cache decides to act up, jobs get bogged down for 15min at a time. This has happened 2-3 times already this week -- a tiny fraction of our total workflows but really annoying when it happens to you. We can do better. Installing deps on cache miss takes about ~4min, so it's not worth waiting more than 4min for the deps cache. The black and mypy caches save 1 and 2min, respectively, so wait only up to that long to download them.	2023-08-22 12:11:38 -04:00
Predrag Gruevski	a1e89aa8d5	Explicitly add the `contents: write` permission for publishing releases. (#9617 )	2023-08-22 08:38:18 -07:00
Predrag Gruevski	c75e1aa5ed	Eliminate special-casing from test CI workflows. (#9562 ) The previous approach was relying on `_test.yml` taking an input parameter, and then doing almost completely orthogonal things for each parameter value. I've separated out each of those test situations as its own job or workflow file, which eliminated all the special-casing and, in my opinion, improved maintainability by making it much more obvious what code runs when.	2023-08-22 11:36:52 -04:00
Predrag Gruevski	9f08d29bc8	Use PyPI Trusted Publishing to publish langchain packages. (#9467 ) Trusted Publishing is the current best practice for publishing Python packages. Rather than long-lived secret keys, it uses OpenID Connect (OIDC) to allow our GitHub runner to directly authenticate itself to PyPI and get a short-lived publishing token. This locks down publishing quite a bit: - There's no long-lived publish key to steal anymore. - Publishing is only allowed via the specifically designated GitHub workflow in the designated repo. It also is operationally easier: no keys means there's nothing that needs to be periodically rotated, nothing to worry about leaking, and nobody can accidentally publish a release from their laptop because they happened to have PyPI keys set up. After this gets merged, we'll need to configure PyPI to start expecting trusted publishing. It's only a few clicks and should only take a minute; instructions are here: https://docs.pypi.org/trusted-publishers/adding-a-publisher/ More info: - https://blog.pypi.org/posts/2023-04-20-introducing-trusted-publishers/ - https://github.com/pypa/gh-action-pypi-publish	2023-08-21 14:44:29 -04:00
Predrag Gruevski	249752e8ee	Require manually triggering release workflows. (#9552 )	2023-08-21 13:54:44 -04:00

198 Commits