Commit Graph

114 Commits

Author SHA1 Message Date
arkiver
15a0a1a6f5 Version 20230607.06. Ignore discovered /r/FIFA URL if coming from a /r/EASportFC parent URL. 2023-06-07 23:13:42 +02:00
arkiver
fe17191306 Version 20230607.05. Better checking for video. Abort item if no post is found (during blackout for example). 2023-06-07 23:05:44 +02:00
arkiver
7bb5c39419 Version 20230607.04. Abort on video for now. 2023-06-07 22:53:41 +02:00
arkiver
f63c8ab696 Version 20230607.03. Prevent getting URL ending with /". Ignore /message/compose URLs. 2023-06-07 22:39:57 +02:00
arkiver
393407520b Version 20230607.02. Very simple content checks to check if response is complete. Properly prevent writing to WARC in cases and do not abort all items when finding a problematic URL. 2023-06-07 22:35:47 +02:00
arkiver
37ba172c61 Version 20230607.01. Use GNU Wget 1.21.3-at.20230605.01 and arguments around DNS. 2023-06-07 15:46:23 +02:00
arkiver
da85457aae Version 20230531.01. Use --secure-protocol PFS. 2023-05-31 10:16:48 +02:00
arkiver
48b24323c6 Version 20230530.01. Queue discovered outlinks to urls-stash-reddit. 2023-05-30 19:42:55 +02:00
arkiver
a3b5bcecc1 Version 20230529.01. Correctly extract more comment pages from comment pages in the new design. Print debug information for comment pages on old design. 2023-05-29 17:56:36 +02:00
arkiver
1a14af2095 Version 20230509.02. Support new Wget-AT. 2023-05-09 05:48:05 +02:00
arkiver
b2654e9317 Version 20230509.01. Support for new design. 2023-05-09 05:43:21 +02:00
arkiver
7f4db17348 Version 20221021.01. Ignore /tailwind-build.css URL from comment in HTML. 2022-10-21 01:11:46 +02:00
arkiver
8a27002fd3 Version 20221005.01. Max tries for backfeed to 10. 2022-10-05 16:20:17 +02:00
arkiver
35e31af37f Queue redditstatic.com URLs as outlinks. 2022-10-05 16:19:53 +02:00
arkiver
bab4b4dcd2 Version 20220729.05. Fix aborting item on bad status code on url: item. Keep old retry code otherwise. 2022-07-29 04:52:08 +02:00
arkiver
8c45a263aa Version 20220729.04. Queue extra found URLs on media URLs to backfeed. 2022-07-28 18:31:23 +02:00
arkiver
e8fe03fbd0 Version 20220729.03. Add url: prefix to url item. 2022-07-28 18:20:59 +02:00
arkiver
2d8fa4034b Version 20220729.02. Support older Wget versions. 2022-07-28 18:15:54 +02:00
arkiver
f81b2ce97e Version 20220729.01. Queue media URLs back to reddit project and download individually. 2022-07-28 18:09:04 +02:00
arkiver
edacb2065a Fix README. 2022-05-07 04:49:30 +02:00
arkiver
cc83009a94 Version 20220605.01. Support GNU Wget 1.21.3-at.20220503.02. Fix killing crawl when items cannot be queued. 2022-05-06 18:31:38 +02:00
arkiver
7c4cf4548e Version 20220415.02. 2022-04-15 21:39:33 +02:00
arkiver
754fd256cb Merge pull request #13 from NGTmeaty/patch-1: Add support for latest change in _options 2022-04-15 21:38:46 +02:00
arkiver
0ce1c59ca4 Version 20220415.01. Do not queue /r/undefined/ URLs. 2022-04-15 20:38:36 +02:00
Jake L
a858c33e29 Add support for latest change in _options 2022-03-31 20:46:28 -04:00
arkiver
da28d3c902 Version 20220323.03. Fix items to maxtries variable name. Fix backfeed key name. 2022-03-23 21:59:52 +01:00
arkiver
8944cf1fc6 Version 20220323.02. Fix items to maxtries variable name. 2022-03-23 16:36:23 +01:00
arkiver
10eaa7c50c Version 20220323.01. Fix backfeed. Fix maxtries use. 2022-03-23 16:16:58 +01:00
arkiver
28f132a052 Version 20220312.01. Fix backfeed. 2022-03-12 23:53:48 +01:00
arkiver
4f50a0d699 Version 20220311.01. Use new backfeed endpoint for queuing. 2022-03-11 03:52:49 +01:00
arkiver
383c101aef Version 20220109.02. Cut off URL at space when found between brackets without href= in front. 2022-01-09 17:19:29 +01:00
arkiver
df35317e0c Version 20220109.01. Add codepoint to utf8 support. Percent encode outlinks correctly. 2022-01-09 17:15:10 +01:00
arkiver
8a3f8cd1de Version 20211004.02. Fix incomplete facebook.com fix. 2021-10-04 21:09:21 +02:00
arkiver
d0070db67a Version 20211004.01. Do not check facebook.com while down at the moment. 2021-10-04 21:04:03 +02:00
arkiver
0c5e8cd3bd Version 20211001.01. Use GNU Wget 1.20.3-at.20211001.01. 2021-10-01 02:44:01 +02:00
arkiver
ed80cb5a9d Version 20210707.01. Do not get media for cross posts. 2021-07-07 00:12:56 +02:00
arkiver
4b976e2ea7 Version 20210521.01. Use TLS 1.2. 2021-05-21 22:37:19 +02:00
Katie Holly
f4619bb17f use onbuild-based image 2021-05-16 21:05:18 +00:00
km09
e6b876e9e6 New day.. new wget-at 1.20.3-at.20210504.01 2021-05-06 00:07:16 +01:00
Thomas Glass
1f9e995b4e 20210410.01 - New day, new wget-at 2021-04-10 15:20:27 +01:00
arkiver
6e15841550 Version 20210407.01. Improve video archiving. Detect if video is still being processed by reddit. 2021-04-07 00:38:20 +02:00
arkiver
1b3690d994 Version 20210330.04. Only decode unicode characters in URLs on v.redd.it URLs. 2021-03-30 22:20:43 +02:00
arkiver
ce7fff480d Version 20210330.03. Unescape unicode characters. Do not HLS for video. 2021-03-30 20:57:31 +02:00
arkiver
ad04f45d4f Fix typo. 2021-03-30 16:11:12 +02:00
arkiver
adc7f9c6fb Version 20210330.02. Skip images that are only in JSON and not on web page. 2021-03-30 02:21:55 +02:00
arkiver
07ed16c44b Version 20210330.01. Handle 403 on v.redd.it on deleted post. 2021-03-30 01:49:48 +02:00
arkiver
8849165130 Version 20210321.01. Do not get all video sizes. 2021-03-21 02:21:41 +01:00
arkiver
d3b6659419 Version 20210312.01. Get URLs with utm_* and context params. 2021-03-12 21:36:32 +01:00
arkiver
a5c798945c Version 20210306.01. Remove some AppleWebKit user-agents for getting 403s. 2021-03-06 00:27:31 +01:00
Katie Holly
eaad7cd7e7 add 1.20.3-at.20210212.02 as supported wget-at version 2021-02-25 03:01:34 +01:00