Wednesday, 23 November 2011

Scripting Example: My Google+ Stream

Today I tried the Google+ stream download feature and got a ZIP file containing a large number of HTML files, one per post.
To get a more readable view, I wrote a little Ruby script that merges everything into one large HTML file, with the posts sorted by publication time.

Have fun.


#!/usr/bin/env ruby

require 'rubygems'
require 'nokogiri'
require 'time'           # provides Time.parse, used for sorting below

# the directory containing the stream files (first command-line argument)
STREAMPATH = ARGV[0] or abort("usage: #{$0} STREAM_DIRECTORY")

def concat(dir)

  # parse every HTML file in the directory
  htmls = Dir[File.join(dir,'*.html')].collect { |filename|
    File.open(filename) { |f| Nokogiri::HTML(f) }}

  # create new (empty) html
  dest = Nokogiri::HTML(
    '<html><head><title>My Stream</title></head><body/></html>')

  # we copy the necessary CSS
  (dest/'html/head').first << (htmls.first/'html/head/style').first

  destbody = (dest/'html/body').first 

  # collect the interesting part of each body and sort by publication time
  bodys = htmls.collect { |html| (html/'body/div').first }.sort_by { |div|
    Time.parse((div/'div/abbr.published').first['title']) }

  # and append them to the new html (separated by <hr />)
  bodys.each do |body|
    destbody << body
    destbody << Nokogiri::XML::Node.new('hr', dest)
  end

  dest     # return the created html
end

# and call it
puts concat(STREAMPATH)
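
The sorting step is the part that is easiest to get wrong, because Time.parse only exists after require 'time'. Here is a minimal sketch of just that step without Nokogiri. The sample posts and the sort_by_published helper are made up for illustration, and I am assuming the title attribute of abbr.published holds an ISO 8601 timestamp:

```ruby
require 'time'   # without this, Time.parse is undefined

# Hypothetical sample data standing in for the `title` attribute of each
# post's <abbr class="published"> element (ISO 8601 format is an assumption).
posts = [
  { name: 'third',  published: '2011-11-23T18:30:00Z' },
  { name: 'first',  published: '2011-07-02T09:15:00Z' },
  { name: 'second', published: '2011-09-14T12:00:00Z' }
]

# Sort the posts chronologically by their parsed timestamp
# (hypothetical helper, not part of the script above).
def sort_by_published(posts)
  posts.sort_by { |post| Time.parse(post[:published]) }
end

sorted = sort_by_published(posts)
puts sorted.map { |post| post[:name] }.join(', ')   # prints "first, second, third"
```

Sorting on parsed Time objects rather than on the raw strings keeps the order correct even if the timestamps carry different time zone offsets.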