mirror of https://github.com/hwchase17/langchain synced 2024-11-11 19:11:02 +00:00

matt haigh a4896da2a0 Experimental: Add other threshold types to SemanticChunker (#16807) Description Adding different threshold types to the semantic chunker. I’ve had much better and more predictable performance when using standard deviations instead of percentiles. ![image](https://github.com/langchain-ai/langchain/assets/44395485/066e84a8-460e-4da5-9fa1-4ff79a1941c5) For all the documents I’ve tried, the distribution of distances looks similar to the above: a positively skewed normal distribution. All skews I’ve seen are less than 1, which explains why standard deviations perform well, but I’ve included IQR if anyone wants something more robust. Also, by using the percentile method backwards, you can declare the number of clusters and use semantic chunking to get an ‘optimal’ splitting. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>		2024-02-26 13:50:48 -08:00
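
The threshold types named in the commit description above (percentile, standard deviation, IQR) can be illustrated with a small standalone sketch. This is not the package's implementation: the function names `breakpoint_threshold` and `split_indices`, the string labels, and the default multipliers are all hypothetical, and the IQR branch uses a conventional Tukey-style fence that may differ from what `SemanticChunker` actually computes. It only shows how each rule could turn a list of sentence-to-sentence distances into a cutoff.

```python
import statistics


def breakpoint_threshold(distances, threshold_type="percentile", amount=None):
    """Return the distance above which a gap counts as a chunk boundary.

    All names and defaults here are illustrative assumptions, not the
    langchain_experimental API.
    """
    if threshold_type == "percentile":
        # amount is an integer percentile between 1 and 99 (assumed default: 95)
        pct = 95 if amount is None else int(amount)
        cuts = statistics.quantiles(distances, n=100, method="inclusive")
        return cuts[pct - 1]
    if threshold_type == "standard_deviation":
        # mean plus k population standard deviations (assumed default: k = 3)
        k = 3 if amount is None else amount
        return statistics.mean(distances) + k * statistics.pstdev(distances)
    if threshold_type == "interquartile":
        # Tukey-style upper fence: Q3 + k * IQR (assumed default: k = 1.5)
        k = 1.5 if amount is None else amount
        q1, _, q3 = statistics.quantiles(distances, n=4, method="inclusive")
        return q3 + k * (q3 - q1)
    raise ValueError(f"unknown threshold_type: {threshold_type!r}")


def split_indices(distances, threshold):
    """Indices of the gaps whose distance exceeds the threshold."""
    return [i for i, d in enumerate(distances) if d > threshold]


# Toy distances between consecutive sentences; one gap is clearly larger.
distances = [0.10, 0.12, 0.11, 0.90, 0.10, 0.13, 0.12, 0.11]
for kind, amount in [("percentile", 95), ("standard_deviation", 1), ("interquartile", 1.5)]:
    cutoff = breakpoint_threshold(distances, kind, amount)
    print(kind, split_indices(distances, cutoff))
```

On a short, strongly skewed toy list like this one, a multiplier of 3 standard deviations would place the cutoff above every distance, which is why the example passes 1 instead. Real documents yield many more distances, so suitable multipliers will differ.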
..
langchain_experimental	Experimental: Add other threshold types to SemanticChunker (#16807)	2024-02-26 13:50:48 -08:00
scripts	infra: add print rule to ruff (#16221)	2024-02-09 16:13:30 -08:00
tests	docs, templates: update schema imports to core (#17885)	2024-02-22 15:58:44 -08:00
LICENSE	Library Licenses (#13300)	2023-11-28 17:34:27 -08:00
Makefile	create mypy cache dir if it doesn't exist (#14579)	2023-12-12 15:34:50 -08:00
poetry.lock	experimental[patch]: Release 0.0.52 (#17763)	2024-02-19 13:12:22 -08:00
poetry.toml
pyproject.toml	experimental[patch]: Release 0.0.52 (#17763)	2024-02-19 13:12:22 -08:00
README.md

README.md

🦜️🧪 LangChain Experimental

This package holds experimental LangChain code, intended for research and experimental uses.

Warning

Portions of the code in this package may be dangerous if not properly deployed in a sandboxed environment. Please be wary of deploying experimental code to production unless you've taken appropriate precautions and have already discussed it with your security team.

Some of the code here may be marked with security notices. However, given the exploratory and experimental nature of the code in this package, the lack of a security notice on a piece of code does not mean that the code in question does not require additional security considerations in order to be safe to use.
