11 Commits

Author | SHA1 | Message | Date
arkiver | 40063adcaf | Use wget-at with ZSTD. | 2020-06-30 19:11:06 -04:00
Arkiver2 | 831f79f0d9 | Do not import warcio. Update version to 20200102.03. | 2020-01-02 17:43:53 +01:00
Arkiver2 | cf3f6c7af9 | Skip URL on status code 204. Update version to 20200102.02. | 2020-01-02 17:41:38 +01:00
Arkiver2 | ac65b0a818 | Update version to 20200102.01. | 2020-01-02 17:39:20 +01:00
Arkiver2 | 4cf7bd18f0 | Version 20190729.01; do not get page requisites from outlinks; do not pip install warcio. | 2019-07-29 22:58:09 +02:00
Arkiver2 | 8902255c76 | Version 20190405.01; support www.reddit.com; support videos; support outlinks | 2019-04-05 04:52:06 +02:00
Arkiver2 | 9d1ea0c688 | rewrite | 2019-02-22 01:15:18 +01:00
Arkiver2 | 11aef69a32 | pipeline.py: cookies! | 2015-07-06 18:46:00 +02:00
Arkiver2 | 9f531c900f | pipeline.py: use redd.it to discover comments | 2015-07-05 19:27:32 +02:00
Arkiver2 | 61dd537f15 | Update pipeline.py | 2015-07-05 17:48:53 +02:00
Arkiver2 | ff1bb532c6 | pipeline.py | 2015-07-05 12:03:02 +02:00