Adds a `langchain-location` param to lint, so we can properly locate it.
Regular langchain and experimental lint steps are passing, so default
value seems to be working.
**Description:**
Revise `libs/langchain/langchain/document_loaders/async_html.py` to
store the HTML Title and Page Language in the `metadata` of
`AsyncHtmlLoader`.
Compare predicted json to reference. First canonicalize (sort keys, rm
whitespace separators), then return normalized string edit distance.
Not a silver bullet but maybe an easy way to capture structure
differences in a less flakey way
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Will run all CI because of _test change, but future PRs against CLI will
only trigger the new CLI one
Has a bunch of file changes related to formatting/linting.
No mypy yet - coming soon
**Description**
This small change will make chunk_size a configurable parameter for
loading documents into a Supabase database.
**Issue**
https://github.com/langchain-ai/langchain/issues/11422
**Dependencies**
No chanages
**Twitter**
@ j1philli
**Reminder**
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
---------
Co-authored-by: Greg Richardson <greg.nmr@gmail.com>
Description
* Add _generate and _agenerate to support Fireworks batching.
* Add stop words test cases
* Opt out retry mechanism
Issue - Not applicable
Dependencies - None
Tag maintainer - @baskaryan
- **Description:** refactors the redis vector field schema to properly
handle default values, includes a new unit test suite.
- **Issue:** N/A
- **Dependencies:** nothing new.
- **Tag maintainer:** @baskaryan @Spartee
- **Twitter handle:** this is a tiny fix/improvement :)
This issue was causing some clients/cuatomers issues when building a
vector index on Redis on smaller db instances (due to fault default
values in index configuration). It would raise an error like:
```redis.exceptions.ResponseError: Vector index initial capacity 20000 exceeded server limit (852 with the given parameters)```
This PR will address this moving forward.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
This PR replaces the previous `Intent` check with the new `Prompt
Safety` check. The logic and steps to enable chain moderation via the
Amazon Comprehend service, allowing you to detect and redact PII, Toxic,
and Prompt Safety information in the LLM prompt or answer remains
unchanged.
This implementation updates the code and configuration types with
respect to `Prompt Safety`.
### Usage sample
```python
from langchain_experimental.comprehend_moderation import (BaseModerationConfig,
ModerationPromptSafetyConfig,
ModerationPiiConfig,
ModerationToxicityConfig
)
pii_config = ModerationPiiConfig(
labels=["SSN"],
redact=True,
mask_character="X"
)
toxicity_config = ModerationToxicityConfig(
threshold=0.5
)
prompt_safety_config = ModerationPromptSafetyConfig(
threshold=0.5
)
moderation_config = BaseModerationConfig(
filters=[pii_config, toxicity_config, prompt_safety_config]
)
comp_moderation_with_config = AmazonComprehendModerationChain(
moderation_config=moderation_config, #specify the configuration
client=comprehend_client, #optionally pass the Boto3 Client
verbose=True
)
template = """Question: {question}
Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])
responses = [
"Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like 323-22-9980. John Doe's phone number is (999)253-9876.",
"Final Answer: This is a really shitty way of constructing a birdhouse. This is fucking insane to think that any birds would actually create their motherfucking nests here."
]
llm = FakeListLLM(responses=responses)
llm_chain = LLMChain(prompt=prompt, llm=llm)
chain = (
prompt
| comp_moderation_with_config
| {llm_chain.input_keys[0]: lambda x: x['output'] }
| llm_chain
| { "input": lambda x: x['text'] }
| comp_moderation_with_config
)
try:
response = chain.invoke({"question": "A sample SSN number looks like this 123-456-7890. Can you give me some more samples?"})
except Exception as e:
print(str(e))
else:
print(response['output'])
```
### Output
```python
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...
> Finished chain.
> Entering new AmazonComprehendModerationChain chain...
Running AmazonComprehendModerationChain...
Running pii Validation...
Running toxicity Validation...
Running prompt safety Validation...
> Finished chain.
Final Answer: A credit card number looks like 1289-2321-1123-2387. A fake SSN number looks like XXXXXXXXXXXX John Doe's phone number is (999)253-9876.
```
---------
Co-authored-by: Jha <nikjha@amazon.com>
Co-authored-by: Anjan Biswas <anjanavb@amazon.com>
Co-authored-by: Anjan Biswas <84933469+anjanvb@users.noreply.github.com>
**Description:**
This PR adds support for the [Pro version of Titan Takeoff
Server](https://docs.titanml.co/docs/category/pro-features). Users of
the Pro version will have to import the TitanTakeoffPro model, which is
different from TitanTakeoff.
**Issue:**
Also minor fixes to docs for Titan Takeoff (Community version)
**Dependencies:**
No additional dependencies
**Twitter handle:** @becoming_blake
@baskaryan @hwchase17
- **Description:** Super simple fix for colab link on
code_understanding.ipynb,
- **Issue:** not applicable
- **Dependencies:** none,
- **Tag maintainer:** ,
- **Twitter handle:** @kengoodridge
- **Description:**
This PR adds `allowd_operators` property to `QdrantTranslator` to fix
the `TypeError: can only join an iterable` bug. This property is
required in `get_query_constructor_prompt` in
`query_constructor\base.py`:
```
allowed_operators=" | ".join(allowed_operators),
```
- **Issue:**
#12061
---------
Co-authored-by: XIE Qihui <qihui.xie@bopufund.com>
Problem statement:
In the `integrations/llms` and `integrations/chat` pages, we have a
sidebar with ToC, and we also have a ToC at the end of the page.
The ToC at the end of the page is not necessary, and it is confusing
when we mix the index page styles; moreover, it requires manual work.
So, I removed ToC at the end of the page (it was discussed with and
approved by @baskaryan)