Bookie Weekly Update: April 22nd 2012
<div class="chronodata">22Apr12</div>
<div class="itemtext">
<p>Another week, another few lines of code, and yay for two weeks in a row!</p>
<div class="section" id="bookie">
<h1><a class="reference external" href="">Bookie</a></h1>
<p>Not a ton here, just some CSS updates and updating the backup script for pulling the INI correctly.</p>
<div class="section" id="bookie-parser">
<h1><a class="reference external" href="">Bookie Parser</a></h1>
<p>I spent some time cleaning up the CSS. I did some research on the most readable fonts for screens and, surprisingly, it seems that sans serif wins on digital displays. So I updated the CSS and, combined with some work on the main Bookie CSS files, made the readable pages a bit nicer. I&#8217;ve still got more cleanup to do, but it reads better now.</p>
<p>I also fixed the generated HTML so it no longer has the empty body tag. That was caused by the readable parsing library handing me a full HTML document rather than just the content. See the readability_lxml section below for the bigger changes.</p>
<p>Finally, I added a form on the main page so you can try it out on a url just by entering it. So if you&#8217;re just curious what it does, <a class="reference external" href="">go try it out</a>!</p>
<div class="section" id="bookie-api">
<h1><a class="reference external" href="">Bookie API</a></h1>
<p>Just added a <cite>ping</cite> command. It should help make sure that the configuration is correct for new users. It&#8217;s also a nice start to a non-admin specific api command. A little bit of cleanup aside from that, but nothing major.</p>
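<p>To give a feel for what a configuration check like this can look like, here is a minimal sketch. It is not Bookie&#8217;s actual code: the <cite>ping</cite> function signature, the <cite>settings</cite> dict, the <cite>api_key</cite> field, and the response shape are all assumptions made up for illustration.</p>

```python
import json


def ping(settings, api_key=None):
    """Hypothetical ping handler: tell the caller whether their setup works.

    `settings` stands in for the server-side config and `api_key` for the
    credential the client sent. Both names are assumptions, not Bookie's API.
    """
    expected = settings.get("api_key")
    if not api_key:
        return {"success": False, "message": "Missing api key."}
    if api_key != expected:
        return {"success": False, "message": "Api key does not match."}
    # Everything the client needs is in place.
    return {"success": True, "message": "pong"}


if __name__ == "__main__":
    print(json.dumps(ping({"api_key": "abc123"}, api_key="abc123")))
```

<p>The point of returning a specific message on failure is that a new user can tell which part of the setup is wrong instead of just seeing a generic error.</p>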
<div class="section" id="readability-lxml">
<h1><a class="reference external" href="">readability_lxml</a></h1>
<p>Currently, Bookie uses a library called <a class="reference external" href="">decruft</a> for parsing html pages for the actual important article content. The <a class="reference external">bookie_parser</a> project is using a different fork of that called <a class="reference external" href="">readability_lxml</a>. The author is a bit open to merging changes in and actually says she&#8217;s in &#8216;maintenance mode&#8217;. Since this is an important feature and I want a really decent library for it, I started hacking on it, and that is where my week of hacking went.</p>
<p>First I updated it to allow me to get back only a partial html document vs an entire <cite>&lt;html&gt;</cite> doc. I then fixed some bugs and started cleaning up the code (adding tests, making the command line client all nice and argparse&#8217;y, etc). In the process I noticed that there&#8217;s a big branch on GitHub that adds a ton of things like multiple page document support and such. I&#8217;ve started to try to pull his branch into my work and the original author&#8217;s code. It&#8217;s a LOT of <cite>git cherry-pick</cite> and really a pain since I want to clean up the code as I go. Unfortunately, this just means that Git gets confused on future merges since the code&#8217;s changed between commits. Ugh!</p>
<p>I&#8217;m about halfway done though and I hope this will leave us with one solid library to do this parsing. I&#8217;m hoping to kind of take over stewardship of the library as I complete this work. It should hopefully make <a class="reference external" href="">Bookie</a> and <a class="reference external" href="">bookie_parser</a> all the more awesome.</p>
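<p>As a rough illustration of the partial-document idea and an argparse-style client, here is a small standard-library sketch. It does not use readability_lxml and is not how that library does it: the <cite>to_fragment</cite> helper, the <cite>--fragment</cite> flag, and the regex approach are all hypothetical stand-ins chosen to keep the example self-contained.</p>

```python
import argparse
import re

# Grab whatever sits between <body ...> and </body>, case-insensitively.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def to_fragment(document):
    """Return only the body contents, or the input unchanged if no body tag."""
    match = BODY_RE.search(document)
    return match.group(1).strip() if match else document


def build_parser():
    """Hypothetical command line interface; flag names are assumptions."""
    parser = argparse.ArgumentParser(
        description="Print an HTML file, optionally as a partial document.")
    parser.add_argument("path", help="HTML file to read")
    parser.add_argument(
        "--fragment", action="store_true",
        help="output only the body contents instead of the full document")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    with open(args.path, encoding="utf-8") as handle:
        document = handle.read()
    print(to_fragment(document) if args.fragment else document)


if __name__ == "__main__":
    main()
```

<p>A regex is good enough for a toy like this; a real parser would work on the parsed tree, which is the sort of thing lxml is for.</p>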
<div class="section" id="the-coming-week">
<h1>The coming week</h1>
<p>I&#8217;m giving a talk on the <a class="reference external" href="">YUI JavaScript library</a> at <a class="reference external" href="">Penguicon</a>. This means my hacking time will be a bit less since I&#8217;ve got a presentation to prepare for. Next week&#8217;s status report might be a bit light and boring, but hey, maybe I&#8217;ll scrounge up some more beta users of Bookie while at the conference.</p>
Filed under: Bookie
Tags: api, bookie, github, parser, penguicon, readability_lxml