langchain/libs
Pasha 4e189cd89a
community[patch]: youtube loader transcript format (#16625)
- **Description**: YoutubeLoader currently returns a single document that
contains the entire transcript. I think it would be useful to add an
option to return multiple documents, where each document contains one
line of the transcript, with its start time and duration in the metadata.
For example,
[AssemblyAIAudioTranscriptLoader](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/document_loaders/assemblyai.py)
is implemented in a similar way: it lets you choose which transcript
format the document loader returns.
2024-01-26 15:26:09 -08:00
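
The option described in the commit message above can be illustrated with a minimal, self-contained sketch. It does not use the actual `YoutubeLoader` code: the `Document` dataclass, the `TranscriptFormat` enum, and the `build_documents` helper are hypothetical names chosen for illustration, and the input is assumed to be a list of dicts with `text`, `start`, and `duration` keys.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class TranscriptFormat(Enum):
    """Hypothetical switch between the two output shapes."""
    TEXT = "text"    # one document holding the whole transcript
    LINES = "lines"  # one document per transcript line


@dataclass
class Document:
    """Stand-in for a loader document (assumption, not the real class)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_documents(
    pieces: List[Dict[str, Any]],
    video_metadata: Dict[str, Any],
    transcript_format: TranscriptFormat = TranscriptFormat.TEXT,
) -> List[Document]:
    """Turn transcript pieces into documents in the requested format."""
    if transcript_format is TranscriptFormat.TEXT:
        # Join every line into a single document, as the loader did before.
        text = " ".join(p["text"].strip() for p in pieces)
        return [Document(page_content=text, metadata=dict(video_metadata))]
    if transcript_format is TranscriptFormat.LINES:
        # One document per line; timing goes into that document's metadata.
        return [
            Document(
                page_content=p["text"].strip(),
                metadata={
                    **video_metadata,
                    "start": p["start"],
                    "duration": p["duration"],
                },
            )
            for p in pieces
        ]
    raise ValueError(f"Unknown transcript format: {transcript_format}")


# Made-up sample data, assumed shape only.
pieces = [
    {"text": "hello world", "start": 0.0, "duration": 1.5},
    {"text": "second line", "start": 1.5, "duration": 2.0},
]
line_docs = build_documents(pieces, {"source": "abc"}, TranscriptFormat.LINES)
```

Keeping the single-document shape as the default in this sketch means existing callers would see no change, while callers that need timestamps opt in explicitly.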
| Name | Last commit | Date |
| --- | --- | --- |
| cli | cli[patch]: add integration tests to default makefile (#16479) | 2024-01-23 16:09:16 -07:00 |
| community | community[patch]: youtube loader transcript format (#16625) | 2024-01-26 15:26:09 -08:00 |
| core | core: expand docstring for RunnableParallel (#16600) | 2024-01-26 10:03:32 -05:00 |
| experimental | core[patch]: simple prompt pretty printing (#15968) | 2024-01-12 21:08:51 -05:00 |
| langchain | langchain[patch]: inconsistent results with RecursiveCharacterTextSplitter's add_start_index=True (#16583) | 2024-01-25 15:50:06 -08:00 |
| partners | google-vertexai[patch]: streaming bug (#16603) | 2024-01-26 09:45:34 -08:00 |