[Community]: Fix - Open AI Whisper client.audio.transcriptions returning Text Object which raises error (#25271)

- **Description:** The following [line](fd546196ef/libs/community/langchain_community/document_loaders/parsers/audio.py (L117)) in `OpenAIWhisperParser` returns a plain text (`str`) object, even though the official documentation says it should return a `Transcript` instance with a `text` attribute. With the example given in the issue, and when I ran it myself, I got the text directly. This small PR accounts for that.
- **Issue:** #25218

I was able to replicate the error even without the `GenericLoader`, as shown below; the issue was in `OpenAIWhisperParser`:

```python
from langchain_community.document_loaders.blob_loaders import Blob
from langchain_community.document_loaders.parsers.audio import OpenAIWhisperParser

# A non-JSON response format such as "srt" triggers the error on transcript.text
parser = OpenAIWhisperParser(api_key="sk-fxxxxxxxxx", response_format="srt", temperature=0)
list(parser.lazy_parse(Blob.from_path('path_to_file.m4a')))
```
2024-11-10 01:10:59 +00:00 · 2024-08-19 18:36:42 +05:00
commit 75c3c81b8c
parent 0f7b8adddf
1 changed file with 3 additions and 1 deletion
--- a/libs/community/langchain_community/document_loaders/parsers/audio.py
+++ b/libs/community/langchain_community/document_loaders/parsers/audio.py
@@ -129,7 +129,9 @@ class OpenAIWhisperParser(BaseBlobParser):
                continue

            yield Document(
-                page_content=transcript.text,
+                page_content=transcript.text
+                if not isinstance(transcript, str)
+                else transcript,
                metadata={"source": blob.source, "chunk": split_number},
            )
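
The check in this patch can be illustrated in isolation. The following is a minimal sketch, not the parser's actual code: `FakeTranscript` and `transcript_to_text` are hypothetical names introduced only for illustration, and the sketch assumes the client returns either a plain `str` (as reported in the issue for `response_format="srt"`) or an object exposing a `text` attribute.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeTranscript:
    """Hypothetical stand-in for a response object that exposes `.text`."""

    text: str


def transcript_to_text(transcript: Union[str, FakeTranscript]) -> str:
    """Return the transcription text for either response shape.

    Mirrors the isinstance check from the patch: a plain string is passed
    through unchanged, anything else is assumed to carry a `.text` attribute.
    """
    if isinstance(transcript, str):
        return transcript
    return transcript.text


# Assumed example payloads, not real API output.
srt_payload = "1\n00:00:00,000 --> 00:00:02,000\nhello world\n"
as_string = transcript_to_text(srt_payload)
as_object = transcript_to_text(FakeTranscript(text="hello world"))
```

Checking for `str` first, rather than probing with `hasattr(transcript, "text")`, keeps the behavior identical to the patched expression; either approach would avoid the `AttributeError` described in the issue.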