Migrate pydantic extra to literals
Switch to a string literal for specifying `extra`, which is the
recommended approach in pydantic 2.
This also works correctly in pydantic v1.
```python
from pydantic.v1 import BaseModel

class Foo(BaseModel, extra="forbid"):
    x: int

Foo(x=5, y=1)  # raises ValidationError: `y` is an extra field
```
And
```python
from pydantic.v1 import BaseModel

class Foo(BaseModel):
    x: int

    class Config:
        extra = "forbid"

Foo(x=5, y=1)  # raises ValidationError: `y` is an extra field
```
## Enum -> literal using grit pattern:
```
engine marzano(0.1)
language python
or {
    `extra=Extra.allow` => `extra="allow"`,
    `extra=Extra.forbid` => `extra="forbid"`,
    `extra=Extra.ignore` => `extra="ignore"`
}
```
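For readers without Grit at hand, roughly the same rewrite can be sketched with a regular expression. This is only an approximation: the helper name `enum_to_literal` is hypothetical, the regex is not syntax-aware (it would also rewrite matches inside strings or comments), and it leaves any now-unused `Extra` import in place.

```python
import re

# Matches `extra=Extra.<member>` with optional whitespace around `=`.
# Group 1 keeps the original `extra =` spelling, group 2 is the member name.
_EXTRA_ENUM = re.compile(r"(\bextra\s*=\s*)Extra\.(allow|forbid|ignore)\b")


def enum_to_literal(source: str) -> str:
    """Replace pydantic Extra enum members with the equivalent string literals."""
    return _EXTRA_ENUM.sub(r'\1"\2"', source)


before = "class Config:\n    extra = Extra.forbid\n"
after = enum_to_literal(before)
print(after)
```

The Grit pattern remains the safer choice for a real migration because it matches on syntax rather than on raw text.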
Re-sorted the attributes in `Config` and removed the docstring in case we
need to go back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)
## Sort attributes in Config:
```
engine marzano(0.1)
language python
function sort($values) js {
    return $values.text.split(',').sort().join("\n");
}

class_definition($name, $body) as $C where {
    $name <: `Config`,
    $body <: block($statements),
    $values = [],
    $statements <: some bubble($values) assignment() as $A where {
        $values += $A
    },
    $body => sort($values),
}
```
Replaced `from langchain.prompts` with `from langchain_core.prompts`
where appropriate.
Most of the changes are in `langchain_experimental`.
Similar to #20348.
Replaced all `from langchain.callbacks` imports with `from
langchain_core.callbacks`.
The changes are in `langchain` and `langchain_experimental`.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:**
When using `SQLDatabaseChain` with the Llama2-70b LLM and a SQLite
database, I was getting `Warning: You can only execute one statement at
a time.`
```
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain
sql_database_path = '/dccstor/mmdataretrieval/mm_dataset/swimming_record/rag_data/swimmingdataset.db'
sql_db = get_database(sql_database_path)
db_chain = SQLDatabaseChain.from_llm(mistral, sql_db, verbose=True, callbacks = [callback_obj])
db_chain.invoke({
    "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
})
```
Error:
```
Warning Traceback (most recent call last)
Cell In[31], line 3
1 import langchain
2 langchain.debug=False
----> 3 db_chain.invoke({
4 "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
5 })
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:162, in Chain.invoke(self, input, config, **kwargs)
160 except BaseException as e:
161 run_manager.on_chain_error(e)
--> 162 raise e
163 run_manager.on_chain_end(outputs)
164 final_outputs: Dict[str, Any] = self.prep_outputs(
165 inputs, outputs, return_only_outputs
166 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:156, in Chain.invoke(self, input, config, **kwargs)
149 run_manager = callback_manager.on_chain_start(
150 dumpd(self),
151 inputs,
152 name=run_name,
153 )
154 try:
155 outputs = (
--> 156 self._call(inputs, run_manager=run_manager)
157 if new_arg_supported
158 else self._call(inputs)
159 )
160 except BaseException as e:
161 run_manager.on_chain_error(e)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:198, in SQLDatabaseChain._call(self, inputs, run_manager)
194 except Exception as exc:
195 # Append intermediate steps to exception, to aid in logging and later
196 # improvement of few shot prompt seeds
197 exc.intermediate_steps = intermediate_steps # type: ignore
--> 198 raise exc
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:143, in SQLDatabaseChain._call(self, inputs, run_manager)
139 intermediate_steps.append(
140 sql_cmd
141 ) # output: sql generation (no checker)
142 intermediate_steps.append({"sql_cmd": sql_cmd}) # input: sql exec
--> 143 result = self.database.run(sql_cmd)
144 intermediate_steps.append(str(result)) # output: sql exec
145 else:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:436, in SQLDatabase.run(self, command, fetch, include_columns)
425 def run(
426 self,
427 command: str,
428 fetch: Literal["all", "one"] = "all",
429 include_columns: bool = False,
430 ) -> str:
431 """Execute a SQL command and return a string representing the results.
432
433 If the statement returns rows, a string of the results is returned.
434 If the statement returns no rows, an empty string is returned.
435 """
--> 436 result = self._execute(command, fetch)
438 res = [
439 {
440 column: truncate_word(value, length=self._max_string_length)
(...)
443 for r in result
444 ]
446 if not include_columns:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:413, in SQLDatabase._execute(self, command, fetch)
410 elif self.dialect == "postgresql": # postgresql
411 connection.exec_driver_sql("SET search_path TO %s", (self._schema,))
--> 413 cursor = connection.execute(text(command))
414 if cursor.returns_rows:
415 if fetch == "all":
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1416, in Connection.execute(self, statement, parameters, execution_options)
1414 raise exc.ObjectNotExecutableError(statement) from err
1415 else:
-> 1416 return meth(
1417 self,
1418 distilled_parameters,
1419 execution_options or NO_OPTIONS,
1420 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/sql/elements.py:516, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
514 if TYPE_CHECKING:
515 assert isinstance(self, Executable)
--> 516 return connection._execute_clauseelement(
517 self, distilled_params, execution_options
518 )
519 else:
520 raise exc.ObjectNotExecutableError(self)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1639, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
1627 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
1628 "compiled_cache", self.engine._compiled_cache
1629 )
1631 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
1632 dialect=dialect,
1633 compiled_cache=compiled_cache,
(...)
1637 linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
1638 )
-> 1639 ret = self._execute_context(
1640 dialect,
1641 dialect.execution_ctx_cls._init_compiled,
1642 compiled_sql,
1643 distilled_parameters,
1644 execution_options,
1645 compiled_sql,
1646 distilled_parameters,
1647 elem,
1648 extracted_params,
1649 cache_hit=cache_hit,
1650 )
1651 if has_events:
1652 self.dispatch.after_execute(
1653 self,
1654 elem,
(...)
1658 ret,
1659 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1848, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
1843 return self._exec_insertmany_context(
1844 dialect,
1845 context,
1846 )
1847 else:
-> 1848 return self._exec_single_context(
1849 dialect, context, statement, parameters
1850 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1988, in Connection._exec_single_context(self, dialect, context, statement, parameters)
1985 result = context._setup_result_proxy()
1987 except BaseException as e:
-> 1988 self._handle_dbapi_exception(
1989 e, str_statement, effective_parameters, cursor, context
1990 )
1992 return result
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:2346, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
2344 else:
2345 assert exc_info[1] is not None
-> 2346 raise exc_info[1].with_traceback(exc_info[2])
2347 finally:
2348 del self._reentrant_error
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1969, in Connection._exec_single_context(self, dialect, context, statement, parameters)
1967 break
1968 if not evt_handled:
-> 1969 self.dialect.do_execute(
1970 cursor, str_statement, effective_parameters, context
1971 )
1973 if self._has_events or self.engine._has_events:
1974 self.dispatch.after_cursor_execute(
1975 self,
1976 cursor,
(...)
1980 context.executemany,
1981 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/default.py:922, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
921 def do_execute(self, cursor, statement, parameters, context=None):
--> 922 cursor.execute(statement, parameters)
Warning: You can only execute one statement at a time.
```
**Issue:**
The error occurs because, when generating the SQL query, `llm_inputs`
includes the stop sequence `"\nSQLResult:"`. For this user query
the LLM response is **SELECT Time FROM men_butterfly_100m
WHERE Swimmer = 'Lance Larson';\nSQLResult:**, so the `SQLResult` suffix
must be removed from the LLM response before it is executed against the
database.
```
llm_inputs = {
    "input": input_text,
    "top_k": str(self.top_k),
    "dialect": self.database.dialect,
    "table_info": table_info,
    "stop": ["\nSQLResult:"],
}
sql_cmd = self.llm_chain.predict(
    callbacks=_run_manager.get_child(),
    **llm_inputs,
).strip()
if SQL_RESULT in sql_cmd:
    sql_cmd = sql_cmd.split(SQL_RESULT)[0].strip()
result = self.database.run(sql_cmd)
```
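The effect of the suffix removal can be checked in isolation with a small standalone sketch. The helper name `strip_sql_result` is hypothetical, and the value of `SQL_RESULT` below is an assumption chosen to mirror the stop sequence shown above; the constant in the package may be defined differently.

```python
SQL_RESULT = "SQLResult:"  # assumed marker text, mirroring the stop sequence


def strip_sql_result(llm_output: str) -> str:
    """Return only the SQL that precedes the SQLResult marker, if present."""
    sql_cmd = llm_output.strip()
    if SQL_RESULT in sql_cmd:
        # Keep everything before the first marker and drop trailing whitespace.
        sql_cmd = sql_cmd.split(SQL_RESULT)[0].strip()
    return sql_cmd


raw = "SELECT Time FROM men_butterfly_100m WHERE Swimmer = 'Lance Larson';\nSQLResult:"
cleaned = strip_sql_result(raw)
print(cleaned)
```

Output that does not contain the marker passes through unchanged apart from whitespace trimming.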
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
…tch]: import models from community
Ran:
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
Use `.copy()` to fix the bug that the first `llm_inputs` element is
overwritten by the second `llm_inputs` element in `intermediate_steps`.
***Problem description:***
In [line 127](
c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L127C17-L127C17)),
the `llm_inputs` of the sql generation step is appended as the first
element of `intermediate_steps`:
```
intermediate_steps.append(llm_inputs) # input: sql generation
```
However, `llm_inputs` is a mutable dict, and it is updated in [line
179](https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py#L179)
for the final answer step:
```
llm_inputs["input"] = input_text
```
Then, the updated `llm_inputs` is appended as another element of
`intermediate_steps` in [line
180](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L180)):
```
intermediate_steps.append(llm_inputs) # input: final answer
```
As a result, the final `intermediate_steps` returned in [line
189](c732d8fffd/libs/experimental/langchain_experimental/sql/base.py (L189C43-L189C43))
actually contains two identical `llm_inputs` elements: the `llm_inputs`
for the sql generation step is overwritten by the one for the final answer
step by mistake. Users are not able to get the actual `llm_inputs` for the
sql generation step from `intermediate_steps`.
Simply calling `.copy()` when appending `llm_inputs` to
`intermediate_steps` solves this problem.
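The aliasing behaviour described above can be reproduced with plain Python dicts and lists. The following is a minimal sketch with made-up input values, not the chain code itself:

```python
intermediate_steps_buggy = []
intermediate_steps_fixed = []

# Illustrative values only; the real dict carries more keys.
llm_inputs = {"input": "question for sql generation", "top_k": "5"}

# Without a copy, the list stores a reference to the same dict object.
intermediate_steps_buggy.append(llm_inputs)
# With a shallow copy, the list stores a snapshot of the current values.
intermediate_steps_fixed.append(llm_inputs.copy())

# Later mutation for the final answer step.
llm_inputs["input"] = "question plus SQL result for final answer"
intermediate_steps_buggy.append(llm_inputs)
intermediate_steps_fixed.append(llm_inputs.copy())

print(intermediate_steps_buggy[0]["input"])  # reflects the later mutation
print(intermediate_steps_fixed[0]["input"])  # still the sql generation input
```

Note that `.copy()` is shallow: nested mutable values (for example a list stored under one of the keys) are still shared between the copies.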
continuation of PR #8550
@hwchase17 please see and merge. And also close the PR #8550.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** Renamed the argument `database` in
`SQLDatabaseSequentialChain.from_llm()` to `db`.
I realize it's tiny and a bit of a nitpick, but for consistency with
`SQLDatabaseChain` (and all the others, actually) I thought it should be
renamed. It also tripped me up while I was using it today.
Squashed from #7454 with updated features
We have separated the `SQLDatabaseChain` from `VectorSQLDatabaseChain` and
put everything into `experimental/`.
Below is the original PR message from #7454.
-------
We have been working on features to fill the gap between SQL, vector
search and LLM applications. Some inspiring work, like self-query
retrievers for VectorStores (for example
[Weaviate](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/weaviate_self_query.html)
and
[others](https://python.langchain.com/en/latest/modules/indexes/retrievers/examples/self_query.html))
really turn those vector search databases into a powerful knowledge
base! 🚀🚀
We are wondering whether we can merge it all into one, combining SQL,
vector search and LLMChains, and making this SQL vector database memory
the only source of your data. Here are some benefits we can think of for
now, maybe you have more 👀:
- With ALL data you have: since you store all your data in the database,
you don't need to worry about the foreign keys or links between names
from other data sources.
- Flexible data structure: Even if you have changed your schema, for
example added a table, the LLM will know how to JOIN those tables and
use those as filters.
- SQL compatibility: We found that vector databases that support SQL in
the marketplace have similar interfaces, which means you can change your
backend painlessly: just change the name of the distance function in
your DB solution and you are ready to go!
### Issue resolved:
- [Feature Proposal: VectorSearch enabled
SQLChain?](https://github.com/hwchase17/langchain/issues/5122)
### Change made in this PR:
- An improved schema handling that ignores `types.NullType` columns
- A SQL output Parser interface in `SQLDatabaseChain` to enable Vector
SQL capability and more
- A Retriever based on `SQLDatabaseChain` to retrieve data from the
database for RetrievalQAChains and many others
- Allow `SQLDatabaseChain` to retrieve data in python native format
- Includes PR #6737
- Vector SQL Output Parser for `SQLDatabaseChain` and
`SQLDatabaseChainRetriever`
- Prompts that can implement text to VectorSQL
- Corresponding unit-tests and notebook
### Twitter handle:
- @MyScaleDB
### Tag Maintainer:
Prompts / General: @hwchase17, @baskaryan
DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
### Dependencies:
No dependency added
Add SQLDatabaseSequentialChain Class to __init__.py so it can be
accessed and used
- Description: `SQLDatabaseSequentialChain` is not found when importing
the `langchain_experimental` package. This change imports it in the
`__init__.py` of `langchain_experimental.sql` and adds it to the
`__all__` list.
- Issue: `SQLDatabaseSequentialChain` is not found in the
`langchain_experimental` package
- Dependencies: None
- Tag maintainer: None
- Twitter handle: None
The most reliable way to not have a chain run an undesirable SQL command
is to not give it database permissions to run that command. That way the
database itself performs the rule enforcement, so it's much easier to
configure and use properly than anything we could add in ourselves.