Recently, I wanted to create epubs of some reading material, specifically some WordPress-hosted blogs. Rather than writing a separate script for every blog I wanted an epub from, I wrote a tool to figure it all out for me: trivialpub.
Usage is simple: provide a YAML config file, such as:
[x.get('href') for x in soup.select('div.entry-content a') if 'class' not in x.attrs]
And trivialpub will do the rest:
$ ./trivialpub.py -c practicalguide.yaml
Processing chapter https://practicalguidetoevil.wordpress.com/2015/03/25/prologue/
Processing chapter https://practicalguidetoevil.wordpress.com/2015/04/01/chapter-1-knife/
Processing chapter https://practicalguidetoevil.wordpress.com/2015/04/08/chapter-2-invitation/
Processing chapter https://practicalguidetoevil.wordpress.com/2015/04/16/chapter-3-squire/
$ ls -l build/
-rw-r--r-- 1 acompton admin 2004507 Nov 26 15:05 practicalguide.epub
I’ve included a number of examples in the repo, but basically, you need to feed it either a list of URLs to transform or a Python expression which, when eval’ed in context, will generate a list of URLs. (Yes, yes, eval() is the devil, I know.)
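To make the "list or expression" choice concrete, here is a minimal sketch of how such a lookup could work. It is not trivialpub's actual code: the function name resolve_chapter_urls and the config keys urls and url_expression are hypothetical, and the context dict stands in for whatever names (such as soup) the tool exposes to the expression.

```python
def resolve_chapter_urls(config, context):
    """Return chapter URLs from an explicit list or from an eval'ed expression.

    `config` is the parsed config as a dict; the keys "urls" and
    "url_expression" are hypothetical names used only for this sketch.
    `context` maps names (e.g. "soup") to the objects the expression may use.
    """
    if "urls" in config:
        # Explicit list: nothing to evaluate.
        return list(config["urls"])

    expression = config["url_expression"]
    # eval() runs arbitrary Python, so only use config files you trust.
    # The context is passed as globals so that names stay visible inside
    # list comprehensions on every Python 3 version.
    return list(eval(expression, dict(context)))


# Usage with a stand-in context instead of a parsed page:
links = ["https://example.com/1", "https://example.com/2"]
urls = resolve_chapter_urls(
    {"url_expression": "[u for u in links if u.endswith('2')]"},
    {"links": links},
)
print(urls)  # ['https://example.com/2']
```

Copying the context into a fresh dict keeps eval() from leaving its own bookkeeping in the caller's dictionary.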
PRs and suggestions welcome.