Commit Graph

697 Commits (c3b407fcf54b8bf601501a660383ffe1e9f7dc39)
 

Author SHA1 Message Date
Samantha Whitmore 32b11101d3
Get elements of ActionInput on newlines (#889)
The re.DOTALL flag in Python's re (regular expression) module makes the
. (dot) metacharacter match newline characters as well as any other
character.

Without re.DOTALL, the . metacharacter only matches any character except
for a newline character. With re.DOTALL, the . metacharacter matches any
character, including newline characters.
1 year ago
Harrison Chase 1614c5f5fd
fix flaky tests (#892) 1 year ago
Harrison Chase a2b699dcd2
prompt template from string (#884) 1 year ago
Alex 7cc44b3bdb
Add to gallery (#882) 1 year ago
Harrison Chase 0b9f086d36
Harrison/docs splitter (#879) 1 year ago
Harrison Chase bcfbc7a818
version 0077 (#878) 1 year ago
Ryan Walker 1dd0733515
Fix small typo in getting started docs (#876)
Just noticed this little typo while reading the docs, thought I'd open a
PR!
1 year ago
Zach Schillaci 4c79100b15
Correct prompt typo + update example for SQLDatabaseChain (#868)
See https://github.com/hwchase17/langchain/issues/821
1 year ago
Harrison Chase 777aaff841
fix routing to tiktoken encoder (#866) 1 year ago
Harrison Chase e9ef08862d
validate template (#865) 1 year ago
Harrison Chase 364b771743
sql return direct (#864) 1 year ago
Harrison Chase 483441d305
pass kwargs through to loading (#863) 1 year ago
Harrison Chase 8df6b68093
fix length based example selector (#862) 1 year ago
Harrison Chase 3f48eed5bd
Harrison/milvus (#856)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
Signed-off-by: Frank Liu <frank.liu@zilliz.com>
Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com>
Co-authored-by: Frank Liu <frank@frankzliu.com>
1 year ago
Ankush Gola 933441cc52
Add retry to OpenAI llm (#849)
add ability to retry when certain exceptions are raised by
`openai.Completions.create`

Test plan: ran all OpenAI integration tests.
1 year ago
kahkeng 4a8f5cdf4b
Add alternative token-based text splitter (#816)
This does not involve a separator, and will naively chunk input text at
the appropriate boundaries in token space.

This is helpful if we have strict token length limits that we need to
strictly follow the specified chunk size, and we can't use aggressive
separators like spaces to guarantee the absence of long strings.

CharacterTextSplitter will let these strings through without splitting
them, which could cause overflow errors downstream.

Splitting at arbitrary token boundaries is not ideal but is hopefully
mitigated by having a decent overlap quantity. Also this results in
chunks which has exact number of tokens desired, instead of sometimes
overcounting if we concatenate shorter strings.

Potentially also helps with #528.
1 year ago
Harrison Chase 523ad2e6bd
vercel deployments (#850) 1 year ago
Harrison Chase fc0cfd7d1f
docs (#848) 1 year ago
Harrison Chase 4d32441b86
bump version to 0076 (#847) 1 year ago
Harrison Chase 23d5f64bda
Harrison/ngram example (#846)
Co-authored-by: Sean Spriggens <ssprigge@syr.edu>
1 year ago
Harrison Chase 0de55048b7
return code for pal (#844) 1 year ago
Harrison Chase d564308e0f
rfc: instruct embeddings (#811)
Co-authored-by: seanaedmiston <seane999@gmail.com>
1 year ago
Nick Furlotte 576609e665
Update PAL to allow passing local and global context to PythonREPL (#774)
Passing additional variables to the python environment can be useful for
example if you want to generate code to analyze a dataset.

I also added a tracker for the executed code - `code_history`.
1 year ago
Harrison Chase 3f952eb597
add from string method (#820) 1 year ago
Ikko Eltociear Ashimine ba26a879e0
Fix typo in crawler.py (#842)
seperator -> separator
1 year ago
Eli Mernit bfabd1d5c0
Added new deployment template (#835)
This PR introduces a new template for deploying LangChain apps as web
endpoints. It includes template code, and links to a detailed
code-walkthrough.
1 year ago
Jonas Ehrenstein f3508228df
Minor fix for google search util: it's uncertain if "snippet" in results exists (#830)
The results from Google search may not always contain a "snippet". 

Example:
`{'kind': 'customsearch#result', 'title': 'FEMA Flood Map', 'htmlTitle':
'FEMA Flood Map', 'link': 'https://msc.fema.gov/portal/home',
'displayLink': 'msc.fema.gov', 'formattedUrl':
'https://msc.fema.gov/portal/home', 'htmlFormattedUrl':
'https://<b>msc</b>.fema.gov/portal/home'}`

This will cause a KeyError at line 99
`snippets.append(result["snippet"])`.
1 year ago
Zach Schillaci b4eb043b81
Minor fix to SQLDatabaseChain doc (#826) 1 year ago
Istora Mandiri 06438794e1
Fix typo in textsplitter docs (#825) 1 year ago
Raza Habib 9f8e05ffd4
Update __init__.py (#827)
Remove duplicate APIChain
1 year ago
Harrison Chase b0d560be56
add to gallery (#824) 1 year ago
Johanna Appel ebea40ce86
Add 'truncate' parameter for CohereEmbeddings (#798)
Currently, the 'truncate' parameter of the cohere API is not supported.

This means that by default, if trying to generate and embedding that is
too big, the call will just fail with an error (which is frustrating if
using this embedding source e.g. with GPT-Index, because it's hard to
handle it properly when generating a lot of embeddings).
With the parameter, one can decide to either truncate the START or END
of the text to fit the max token length and still generate an embedding
without throwing the error.

In this PR, I added this parameter to the class.

_Arguably, there should be a better way to handle this error, e.g. by
optionally calling a function or so that gets triggered when the token
limit is reached and can split the document or some such. Especially in
the use case with GPT-Index, its often hard to estimate the token counts
for each document and I'd rather sort out the troublemakers or simply
split them than interrupting the whole execution.
Thoughts?_

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
1 year ago
Harrison Chase b9045f7e0d
bump version to 0075 (#819) 1 year ago
Harrison Chase 7b4882a2f4
Harrison/tf embeddings (#817)
Co-authored-by: Ryohei Kuroki <10434946+yakigac@users.noreply.github.com>
1 year ago
Harrison Chase 5d4b6e4d4e
conversational agent fix (#818) 1 year ago
Harrison Chase 94ae126747
return sql intermediate steps (#792) 1 year ago
bair82 ae5695ad32
Update cohere.py (#795)
When stop tokens are set in Cohere LLM constructor, they are currently
not stripped from the response, and they should be stripped
1 year ago
Johanna Appel cacf4091c0
Fix documentation for 'model' parameter in CohereEmbeddings (#797)
Currently, the class parameter 'model_name' of the CohereEmbeddings
class is not supported, but 'model' is. The class documentation is
inconsistent with this, though, so I propose to either fix the
documentation (this PR right now) or fix the parameter.

It will create the following error:
```
ValidationError: 1 validation error for CohereEmbeddings
model_name
  extra fields not permitted (type=value_error.extra)
```
1 year ago
Jason Liu 54f9e4287f
Pass kwargs from initialize_agent into agent classmethod (#799)
# Problem
I noticed that in order to change the prefix of the prompt in the
`zero-shot-react-description` agent
we had to dig around to subset strings deep into the agent's attributes.
It requires the user to inspect a long chain of attributes and classes.

`initialize_agent -> AgentExecutor -> Agent -> LLMChain -> Prompt from
Agent.create_prompt`

``` python
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description"
)
prompt_str = agent.agent.llm_chain.prompt.template
new_prompt_str = change_prefix(prompt_str)
agent.agent.llm_chain.prompt.template = new_prompt_str
```

# Implemented Solution

`initialize_agent` accepts `**kwargs` but passes it to `AgentExecutor`
but not `ZeroShotAgent`, by simply giving the kwargs to the agent class
methods we can support changing the prefix and suffix for one agent
while allowing future agents to take advantage of `initialize_agent`.


```
agent = initialize_agent(
    tools=tools,
    llm=fake_llm,
    agent="zero-shot-react-description",
    agent_kwargs={"prefix": prefix, "suffix": suffix}
)
```

To be fair, this was before finding docs around custom agents here:
https://langchain.readthedocs.io/en/latest/modules/agents/examples/custom_agent.html?highlight=custom%20#custom-llmchain
but i find that my use case just needed to change the prefix a little.


# Changes

* Pass kwargs to Agent class method
* Added a test to check suffix and prefix

---------

Co-authored-by: Jason Liu <jason@jxnl.coA>
1 year ago
Roger Zurawicki c331009440
docs: Update langchain link to PyPI (#800)
Simple one-line fix

CONTRIBUTING used a link that pointed to the `ruff` project.
1 year ago
Roy Williams 6086292252
Centralize logic for loading from LangChainHub, add ability to pin dependencies (#805)
It's generally considered to be a good practice to pin dependencies to
prevent surprise breakages when a new version of a dependency is
released. This commit adds the ability to pin dependencies when loading
from LangChainHub.

Centralizing this logic and using urllib fixes an issue identified by
some windows users highlighted in this video -
https://youtu.be/aJ6IQUh8MLQ?t=537
1 year ago
Harrison Chase b3916f74a7
enable mmr search (#807) 1 year ago
Harrison Chase f46f1d28af
expose memory key name (#808) 1 year ago
Harrison Chase 7728a848d0
Harrison/tracing docs (#806)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
1 year ago
Harrison Chase f3da4dc6ba
Harrison/tracing docs (#804)
Co-authored-by: Ankush Gola <9536492+agola11@users.noreply.github.com>
1 year ago
Harrison Chase ae1b589f60
Harrison/add link for support (#794) 1 year ago
Harrison Chase 6a20f07f0d
add link for support (#793) 1 year ago
Harrison Chase fb2d7afe71
bump version to 0074 (#791) 1 year ago
Harrison Chase 1ad7973cc6
Harrison/tool decorator (#790)
Co-authored-by: Jason Liu <jxnl@users.noreply.github.com>
Co-authored-by: Jason Liu <jason@jxnl.coA>
1 year ago
Harrison Chase 5f73d06502
Harrison/fix caching bug (#788)
Co-authored-by: thepok <richterthepok@yahoo.de>
1 year ago