Monthly Archives: August 2010

Can your own web pages at home!

pickles, originally uploaded by valkyrieh116.

Want to bundle a web page into a single file, without a _files directory and without relying on the .mht (IE, Opera) or .webarchive (Safari) formats, which are not supported everywhere? Use pagecan! I developed pagecan so that doc.mar.cx can return converted documents as a single file.

pagecan takes the URL of an HTML document, fetches every resource referenced by a “src” attribute, and bundles the page and the encoded resources into a single file using the data URI scheme. pagecan is written in Ruby and uses the Nokogiri parser (install the gem with gem install nokogiri, or the Debian package with sudo apt-get install libnokogiri-ruby).
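
For readers unfamiliar with the data URI scheme, here is a minimal sketch of the encoding step in Ruby. It is not taken from pagecan's source: the helper name data_uri is hypothetical, and the sketch assumes the resource bytes and MIME type are already known. It only shows how a fetched resource can be turned into a string that can replace a “src” value.

```ruby
# Hypothetical helper (an illustration, not pagecan's actual code):
# build a base64 data URI from raw bytes and a MIME type.
def data_uri(bytes, mime_type)
  # pack("m0") base64-encodes without inserting line breaks
  encoded = [bytes].pack("m0")
  "data:#{mime_type};base64,#{encoded}"
end

# Example: a two-byte text resource
puts data_uri("hi", "text/plain")  # prints data:text/plain;base64,aGk=
```

An image fetched from a “src” URL would be handled the same way, for example data_uri(image_bytes, "image/png"), with the result written back into the attribute.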

Usage: pagecan url [file | -]

If ‘-’ or no file is given, output is sent to stdout. pagecan has been tested only with HTTP URLs, but because it uses Ruby's open-uri, other URIs and local files may work.
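
The output rule above can be sketched in a few lines of Ruby. This is an illustration under assumptions, not pagecan's actual code: write_output and its stdout: keyword are hypothetical names chosen for this example.

```ruby
# Hypothetical helper (not from pagecan's source): write the bundled page
# to the given file, or to stdout when the file argument is "-" or missing.
def write_output(html, path = nil, stdout: $stdout)
  if path.nil? || path == "-"
    stdout.write(html)
  else
    File.write(path, html)
  end
end
```

In a command-line script this could be called as write_output(bundled_html, ARGV[1]), where bundled_html is assumed to hold the finished page.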

pagecan on GitHub