community[patch]: Minor Improvement of extract hyperlinks tool output (#25728)

**Description:** Make each hyperlink appear only once in the
extract_hyperlinks tool output. (For some websites, the output contains
meaningless '#' hyperlinks multiple times, which increases the token count
of the context window without any benefit.)
**Issue:** None
**Dependencies:** None
zysoong 2024-08-28 10:02:40 +02:00 committed by GitHub
parent ff0df5ea15
commit 25a6790e1a

@@ -63,8 +63,9 @@ class ExtractHyperlinksTool(BaseBrowserTool):
             links = [urljoin(base_url, anchor.get("href", "")) for anchor in anchors]
         else:
             links = [anchor.get("href", "") for anchor in anchors]
-        # Return the list of links as a JSON string
-        return json.dumps(links)
+        # Return the list of links as a JSON string. Duplicated link
+        # only appears once in the list
+        return json.dumps(list(set(links)))
 
     def _run(
         self,
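
Review note: `list(set(links))` removes duplicates, but a `set` does not preserve order, so the links may come back in a different order than they appear on the page. If page order matters to the caller, an order-preserving variant is possible. The sketch below is only an illustration; `dedupe_links` is a hypothetical helper name and is not part of this commit.

```python
import json


def dedupe_links(links):
    """Drop duplicate links while keeping first-seen order.

    Hypothetical helper for illustration; not part of the commit.
    """
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so the first occurrence of each link keeps its position.
    return list(dict.fromkeys(links))


if __name__ == "__main__":
    sample = ["#", "/about", "#", "/contact", "/about"]
    print(json.dumps(dedupe_links(sample)))  # ["#", "/about", "/contact"]
```

With this variant the return statement would read `json.dumps(dedupe_links(links))`, and the tool output would stay stable across runs for the same page.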