langchain/docs/modules
Lance Martin 4092fd21dc
YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772)
This introduces the `YoutubeAudioLoader`, which loads audio blobs from a
YouTube URL and writes them to a local directory. Blobs are then parsed by
`OpenAIWhisperParser()`, as shown in this
[PR](https://github.com/hwchase17/langchain/pull/5580), but we extend
the parser to split audio such that each chunk meets the 25MB OpenAI
size limit. As shown in the notebook, this enables a very simple UX:

```
# Transcribe the video to text
loader = GenericLoader(YoutubeAudioLoader([url], save_dir), OpenAIWhisperParser())
docs = loader.load()
``` 
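
The chunking idea can be sketched as follows. This is only an illustration, not the parser's actual code: it assumes the audio is cut into fixed-duration windows short enough that each exported chunk stays under the 25MB limit. The helper name `chunk_bounds` and the 20-minute window are hypothetical choices for the example.

```python
from typing import Iterator, Tuple


def chunk_bounds(total_ms: int, chunk_ms: int) -> Iterator[Tuple[int, int]]:
    """Yield (start_ms, end_ms) windows covering [0, total_ms) without overlap."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    for start in range(0, total_ms, chunk_ms):
        # The last window is clipped so it never runs past the end of the audio.
        yield start, min(start + chunk_ms, total_ms)


# Hypothetical numbers: a 45-minute recording cut into 20-minute windows.
TWENTY_MIN_MS = 20 * 60 * 1000
bounds = list(chunk_bounds(45 * 60 * 1000, TWENTY_MIN_MS))
for start_ms, end_ms in bounds:
    print(f"chunk from {start_ms} ms to {end_ms} ms")
```

Each window could then be sliced out of the decoded audio, exported, and sent to the transcription endpoint separately, with the resulting texts joined in order. Whether a given window length stays under 25MB depends on the export format and bitrate, so the duration would need to be chosen with that in mind.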

Tested on the full set of Karpathy lecture videos:

```
# Karpathy lecture videos
urls = ["https://youtu.be/VMj-3S1tku0",
        "https://youtu.be/PaCmpygFfXo",
        "https://youtu.be/TCH_1BHY58I",
        "https://youtu.be/P6sfmUTpUmc",
        "https://youtu.be/q8SA3rM6ckI",
        "https://youtu.be/t3YJ5hKiMQ0",
        "https://youtu.be/kCc8FmEb1nY"]

# Directory to save audio files 
save_dir = "~/Downloads/YouTube"
 
# Transcribe the videos to text
loader = GenericLoader(YoutubeAudioLoader(urls, save_dir), OpenAIWhisperParser())
docs = loader.load()
```
1 year ago
agents Harrison/pubmed integration (#5664) 1 year ago
callbacks FileCallbackHandler (#5589) 1 year ago
chains Fixed multi input prompt for MapReduceChain (#4979) 1 year ago
indexes YoutubeAudioLoader and updates to OpenAIWhisperParser (#5772) 1 year ago
memory fix: correct momento chat history notebook typo and title (#5646) 1 year ago
models Revise DATABRICKS_API_TOKEN as DATABRICKS_TOKEN (#5796) 1 year ago
prompts Harrison/pipeline prompt (#5540) 1 year ago
utils/examples Pass parsed inputs through to tool _run (#4309) 1 year ago
agents.rst docs: `modules` pages simplified (#5116) 1 year ago
chains.rst docs: `modules` pages simplified (#5116) 1 year ago
indexes.rst docs: `modules` pages simplified (#5116) 1 year ago
memory.rst docs: `modules` pages simplified (#5116) 1 year ago
models.rst docs: `modules` pages simplified (#5116) 1 year ago
paul_graham_essay.txt Fix notebook example (#3142) 1 year ago
prompts.rst docs: `modules` pages simplified (#5116) 1 year ago
state_of_the_union.txt Docs refactor (#480) 2 years ago