langchain/libs/experimental/langchain_experimental
Raviraj 858ce264ef
SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895)
`SemanticChunker` currently provides three methods for splitting text semantically:
- percentile
- standard_deviation
- interquartile

I propose a new method, `gradient`. It uses the gradient of the distance, together with the percentile method, to decide where to split chunks. This method is useful when chunks are highly correlated with each other or specific to a domain, e.g. legal or medical. The idea is to apply anomaly detection to the gradient array so that the distribution becomes wider and boundaries are easier to identify in highly semantic data.
I have tested this change on a set of 10 domain-specific documents (mostly legal).
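
The idea above can be sketched in plain Python. This is a minimal illustration, not the code from `text_splitter.py`: the helper names (`gradient`, `percentile`, `gradient_breakpoints`), the 95th-percentile default, and the sample distances are all assumptions made for the example.

```python
from typing import List


def gradient(values: List[float]) -> List[float]:
    """Finite-difference gradient with unit spacing.

    Central differences in the interior, one-sided differences at the ends.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two distances")
    grad = [values[1] - values[0]]
    for i in range(1, n - 1):
        grad.append((values[i + 1] - values[i - 1]) / 2.0)
    grad.append(values[-1] - values[-2])
    return grad


def percentile(values: List[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def gradient_breakpoints(distances: List[float], q: float = 95.0) -> List[int]:
    """Indices whose distance gradient exceeds the q-th percentile of all gradients."""
    grad = gradient(distances)
    threshold = percentile(grad, q)
    return [i for i, g in enumerate(grad) if g > threshold]


# Hypothetical distances between consecutive sentences (assumed values).
distances = [0.10, 0.12, 0.11, 0.13, 0.45, 0.12, 0.10, 0.11]
breaks = gradient_breakpoints(distances)
print(breaks)
```

Thresholding the gradient rather than the raw distances means a split is triggered by a sharp rise in distance relative to its neighbours, which is the property the description relies on when the raw distances are all close together.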

Details:
    - **Issue:** Improvement
    - **Dependencies:** NA
    - **Twitter handle:** [x.com/prajapat_ravi](https://x.com/prajapat_ravi)


@hwchase17

---------

Co-authored-by: Raviraj Prajapat <raviraj.prajapat@sirionlabs.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
3 months ago
agents experimental[patch]/docs[patch]: Update links to security docs (#22864) 3 months ago
autonomous_agents experimental[patch]: return from HuggingGPT task executor task.run() exception (#20219) 5 months ago
chat_models infra: rm unused # noqa violations (#22049) 4 months ago
comprehend_moderation langchain: `callbacks` imports fix (#20348) 5 months ago
cpal Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846) 3 months ago
data_anonymizer experimental[patch]: update module doc strings (#19539) 6 months ago
fallacy_removal experimental[patch]: `prompts` import fix (#20534) 5 months ago
generative_agents patch: deprecate (a)get_relevant_documents (#20477) 5 months ago
graph_transformers Improve llm graph transformer docstring (#22939) 3 months ago
llm_bash infra: rm unused # noqa violations (#22049) 4 months ago
llm_symbolic_math experimental[patch]: `prompts` import fix (#20534) 5 months ago
llms [experimental][llms][OllamaFunctions] tool calling related fixes (#22339) 3 months ago
open_clip experimental[patch]: update module doc strings (#19539) 6 months ago
openai_assistant Move OAI assistants to langchain and add callbacks (#13236) 10 months ago
pal_chain community[major], experimental[patch]: Remove Python REPL from community (#22904) 3 months ago
plan_and_execute experimental[patch]: `prompts` import fix (#20534) 5 months ago
prompt_injection_identifier experimental[minor]: upgrade the prompt injection model (#20783) 5 months ago
prompts experimental[patch]: `prompts` import fix (#20534) 5 months ago
pydantic_v1 `poetry lock` the experimental package. (#9478) 1 year ago
recommenders infra: rm unused # noqa violations (#22049) 4 months ago
retrievers langchain: `callbacks` imports fix (#20348) 5 months ago
rl_chain infra: rm unused # noqa violations (#22049) 4 months ago
smart_llm experimental[patch]: `prompts` import fix (#20534) 5 months ago
sql experimental[patch], docs: refine notebook for MyScale `SelfQueryRetriever` (#22016) 4 months ago
synthetic_data experimental[patch]: `prompts` import fix (#20534) 5 months ago
tabular_synthetic_data experimental[patch]: `prompts` import fix (#20534) 5 months ago
tools langchain: `callbacks` imports fix (#20348) 5 months ago
tot experimental[patch]: `prompts` import fix (#20534) 5 months ago
utilities experimental: clean python repl input(experimental:Added code for PythonREPL) (#20930) 5 months ago
video_captioning langchain: `callbacks` imports fix (#20348) 5 months ago
__init__.py Add version to langchain_experimental (#11613) 12 months ago
py.typed Add `py.typed` file to `langchain-experimental`. (#9557) 1 year ago
text_splitter.py SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895) 3 months ago