# Added Tutorials section on the top-level of documentation
**Problem Statement**: the Tutorials section in the documentation is
top-priority. Not every project has resources to make tutorials. We have
such a privilege. Community experts created several tutorials on
YouTube.
But the tutorial links are now hidden on the YouTube page and not easily
discovered by first-time visitors.
**PR**: I've created the `Tutorials` page (from the `Additional
Resources/YouTube` page) and moved it to the top level of documentation
in the `Getting Started` section.
## Who can review?
@dev2049
NOTE:
PR checks are randomly failing
3aefaafcdb258819eadf514d81b5b3
# Respect User-Specified User-Agent in WebBaseLoader
This pull request modifies the `WebBaseLoader` class initializer from
the `langchain.document_loaders.web_base` module to preserve any
User-Agent specified by the user in the `header_template` parameter.
Previously, even if a User-Agent was specified in `header_template`, it
would always be overridden by a random User-Agent generated by the
`fake_useragent` library.
With this change, if a User-Agent is specified in `header_template`, it
will be used. Only in the case where no User-Agent is specified will a
random User-Agent be generated and used. This provides additional
flexibility when using the `WebBaseLoader` class, allowing users to
specify their own User-Agent if they have a specific need or preference,
while still providing a reasonable default for cases where no User-Agent
is specified.
This change has no impact on existing users who do not specify a
User-Agent, as the behavior in this case remains the same. However, for
users who do specify a User-Agent, their choice will now be respected
and used for all subsequent requests made using the `WebBaseLoader`
class.
Fixes#4167
## Before submitting
============================= test session starts
==============================
collecting ... collected 1 item
test_web_base.py::TestWebBaseLoader::test_respect_user_specified_user_agent
============================== 1 passed in 3.64s
===============================
PASSED [100%]
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested: @eyurtsev
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
[OpenWeatherMapAPIWrapper](f70e18a5b3/docs/modules/agents/tools/examples/openweathermap.ipynb)
works wonderfully, but the _tool_ itself can't be used in master branch.
- added OpenWeatherMap **tool** to the public api, to be loadable with
`load_tools` by using "openweathermap-api" tool name (that name is used
in the existing
[docs](aff33d52c5/docs/modules/agents/tools/getting_started.md),
at the bottom of the page)
- updated OpenWeatherMap tool's **description** to make the input format
match what the API expects (e.g. `London,GB` instead of `'London,GB'`)
- added [ecosystem documentation page for
OpenWeatherMap](f9c41594fe/docs/ecosystem/openweathermap.md)
- added tool usage example to [OpenWeatherMap's
notebook](f9c41594fe/docs/modules/agents/tools/examples/openweathermap.ipynb)
Let me know if there's something I missed or something needs to be
updated! Or feel free to make edits yourself if that makes it easier for
you 🙂
[RELLM](https://github.com/r2d4/rellm) is a library that wraps local
HuggingFace pipeline models for structured decoding.
RELLM works by generating tokens one at a time. At each step, it masks
tokens that don't conform to the provided partial regular expression.
[JSONFormer](https://github.com/1rgs/jsonformer) is a bit different, where it sequentially adds the keys then decodes each value directly
Currently, all Zapier tools are built using the pre-written base Zapier
prompt. These small changes (that retain default behavior) will allow a
user to create a Zapier tool using the ZapierNLARunTool while providing
their own base prompt.
Their prompt must contain input fields for zapier_description and
params, checked and enforced in the tool's root validator.
An example of when this may be useful: user has several, say 10, Zapier
tools enabled. Currently, the long generic default Zapier base prompt is
attached to every single tool, using an extreme number of tokens for no
real added benefit (repeated). User prompts LLM on how to use Zapier
tools once, then overrides the base prompt.
Or: user has a few specific Zapier tools and wants to maximize their
success rate. So, user writes prompts/descriptions for those tools
specific to their use case, and provides those to the ZapierNLARunTool.
A consideration - this is the simplest way to implement this I could
think of... though ideally custom prompting would be possible at the
Toolkit level as well. For now, this should be sufficient in solving the
concerns outlined above.
The error in #4087 was happening because of the use of csv.Dialect.*
which is just an empty base class. we need to make a choice on what is
our base dialect. I usually use excel so I put it as excel, if
maintainers have other preferences do let me know.
Open Questions:
1. What should be the default dialect?
2. Should we rework all tests to mock the open function rather than the
csv.DictReader?
3. Should we make a separate input for `dialect` like we have for
`encoding`?
---------
Co-authored-by: = <=>
**Problem statement:** the
[document_loaders](https://python.langchain.com/en/latest/modules/indexes/document_loaders.html#)
section is too long and hard to comprehend.
**Proposal:** group document_loaders by 3 classes: (see `Files changed`
tab)
UPDATE: I've completely reworked the document_loader classification.
Now this PR changes only one file!
FYI @eyurtsev @hwchase17
### Refactor the BaseTracer
- Remove the 'session' abstraction from the BaseTracer
- Rename 'RunV2' object(s) to be called 'Run' objects (Rename previous
Run objects to be RunV1 objects)
- Ditto for sessions: TracerSession*V2 -> TracerSession*
- Remove now deprecated conversion from v1 run objects to v2 run objects
in LangChainTracerV2
- Add conversion from v2 run objects to v1 run objects in V1 tracer
fixes a syntax error mentioned in
#2027 and #3305
another PR to remedy is in #3385, but I believe that is not tacking the
core problem.
Also #2027 mentions a solution that works:
add to the prompt:
'The SQL query should be outputted plainly, do not surround it in quotes
or anything else.'
To me it seems strange to first ask for:
SQLQuery: "SQL Query to run"
and then to tell the LLM not to put the quotes around it. Other
templates (than the sql one) do not use quotes in their steps.
This PR changes that to:
SQLQuery: SQL Query to run
## Change Chain argument in client to accept a chain factory
The `run_over_dataset` functionality seeks to treat each iteration of an
example as an independent trial.
Chains have memory, so it's easier to permit this type of behavior if we
accept a factory method rather than the chain object directly.
There's still corner cases / UX pains people will likely run into, like:
- Caching may cause issues
- if memory is persisted to a shared object (e.g., same redis queue) ,
this could impact what is retrieved
- If we're running the async methods with concurrency using local
models, if someone naively instantiates the chain and loads each time,
it could lead to tons of disk I/O or OOM
# Provide get current date function dialect for other DBs
<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.
Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.
After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
## Before submitting
<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->
## Who can review?
Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@eyurtsev
<!-- For a quicker response, figure out the right person to tag with @
@hwchase17 - project lead
Tracing / Callbacks
- @agola11
Async
- @agola11
DataLoaders
- @eyurtsev
Models
- @hwchase17
- @agola11
Agents / Tools / Toolkits
- @vowelparrot
VectorStores / Retrievers / Memory
- @dev2049
-->
# Cosmetic in errors formatting
Added appropriate spacing to the `ImportError` message in a bunch of
document loaders to enhance trace readability (including Google Drive,
Youtube, Confluence and others). This change ensures that the error
messages are not displayed as a single line block, and that the `pip
install xyz` commands can be copied to clipboard from terminal easily.
## Who can review?
@eyurtsev