I have added a reddit document loader which fetches the text from the
Posts of Subreddits or Reddit users, using the `praw` Python package. I
have also added an example notebook reddit.ipynb in order to guide users
to use this dataloader.
This code was made in format similar to twiiter document loader. I have
run code formating, linting and also checked the code myself for
different scenarios.
This is my first contribution to an open source project and I am really
excited about this. If you want to suggest some improvements in my code,
I will be happy to do it. :)
Co-authored-by: Taaha Bajwa <taaha.s.bajwa@gmail.com>
This PR includes some minor alignment updates, including:
- metadata object extended to support contractAddress, blockchainType,
and tokenId
- notebook doc better aligned to standard langchain format
- startToken changed from int to str to support multiple hex value types
on the Alchemy API
The updated metadata will look like the below. It's possible for a
single contractAddress to exist across multiple blockchains (e.g.
Ethereum, Polygon, etc.) so it's important to include the
blockchainType.
```
metadata = {"source": self.contract_address,
"blockchain": self.blockchainType,
"tokenId": tokenId}
```
- Added links to the vectorstore providers
- Added installation code (it is not clear that we have to go to the
`LangChan Ecosystem` page to get installation instructions.)
Add other File Utilities, include
- List Directory
- Search for file
- Move
- Copy
- Remove file
Bundle as toolkit
Add a notebook that connects to the Chat Agent, which somewhat supports
multi-arg input tools
Update original read/write files to return the original dir paths and
better handle unsupported file paths.
Add unit tests
Adds a PlayWright web browser toolkit with the following tools:
- NavigateTool (navigate_browser) - navigate to a URL
- NavigateBackTool (previous_page) - wait for an element to appear
- ClickTool (click_element) - click on an element (specified by
selector)
- ExtractTextTool (extract_text) - use beautiful soup to extract text
from the current web page
- ExtractHyperlinksTool (extract_hyperlinks) - use beautiful soup to
extract hyperlinks from the current web page
- GetElementsTool (get_elements) - select elements by CSS selector
- CurrentPageTool (current_page) - get the current page URL
Alternate implementation of #3452 that relies on a generic query
constructor chain and language and then has vector store-specific
translation layer. Still refactoring and updating examples but general
structure is there and seems to work s well as #3452 on exampels
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
This PR
* Adds `clear` method for `BaseCache` and implements it for various
caches
* Adds the default `init_func=None` and fixes gptcache integtest
* Since right now integtest is not running in CI, I've verified the
changes by running `docs/modules/models/llms/examples/llm_caching.ipynb`
(until proper e2e integtest is done in CI)
It makes sense to use `arxiv` as another source of the documents for
downloading.
- Added the `arxiv` document_loader, based on the
`utilities/arxiv.py:ArxivAPIWrapper`
- added tests
- added an example notebook
- sorted `__all__` in `__init__.py` (otherwise it is hard to find a
class in the very long list)
Tools for Bing, DDG and Google weren't consistent even though the
underlying implementations were.
All three services now have the same tools and implementations to easily
switch and experiment when building chains.
One of our users noticed a bug when calling streaming models. This is
because those models return an iterator. So, I've updated the Replicate
`_call` code to join together the output. The other advantage of this
fix is that if you requested multiple outputs you would get them all –
previously I was just returning output[0].
I also adjusted the demo docs to use dolly, because we're featuring that
model right now and it's always hot, so people won't have to wait for
the model to boot up.
The error that this fixes:
```
> llm = Replicate(model=“replicate/flan-t5-xl:eec2f71c986dfa3b7a5d842d22e1130550f015720966bec48beaae059b19ef4c”)
> llm(“hello”)
> Traceback (most recent call last):
File "/Users/charlieholtz/workspace/dev/python/main.py", line 15, in <module>
print(llm(prompt))
File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 246, in __call__
return self.generate([prompt], stop=stop).generations[0][0].text
File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 140, in generate
raise e
File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 137, in generate
output = self._generate(prompts, stop=stop)
File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/base.py", line 324, in _generate
text = self._call(prompt, stop=stop)
File "/opt/homebrew/lib/python3.10/site-packages/langchain/llms/replicate.py", line 108, in _call
return outputs[0]
TypeError: 'generator' object is not subscriptable
```
The sentence transformers was a dup of the HF one.
This is a breaking change (model_name vs. model) for anyone using
`SentenceTransformerEmbeddings(model="some/nondefault/model")`, but
since it was landed only this week it seems better to do this now rather
than doing a wrapper.
Update Alchemy Key URL in Blockchain Document Loader. I want to say
thank you for the incredible work the LangChain library creators have
done.
I am amazed at how seamlessly the Loader integrates with Ethereum
Mainnet, Ethereum Testnet, Polygon Mainnet, and Polygon Testnet, and I
am excited to see how this technology can be extended in the future.
@hwchase17 - Please let me know if I can improve or if I have missed any
community guidelines in making the edit? Thank you again for your hard
work and dedication to the open source community.
Improved `arxiv/tool.py` by adding more specific information to the
`description`. It would help with selecting `arxiv` tool between other
tools.
Improved `arxiv.ipynb` with more useful descriptions.
My attempt at improving the `Chain`'s `Getting Started` docs and
`LLMChain` docs. Might need some proof-reading as English is not my
first language.
In LLM examples, I replaced the example use case when a simpler one
(shorter LLM output) to reduce cognitive load.