splitbrain.org

electronic brain surgery since 2001

Backup Your Identi.ca Statuses With XMLStarlet & 8 Lines Of Bash

Michael Klier recently decided to shut down his blog. Luckily, he provides a tarball of all his posts and uses a liberal license for his content. With his permission, I will repost a few of his old posts that I think should remain online for their valuable information.

This post was originally published on July 24th, 2010 at chimeric.de and is licensed under the Creative Commons BY-NC-SA License.

The title pretty much says it all. I needed a simple way to get hold of all my identi.ca statuses, so I threw together a short bash script that uses xmlstarlet. If you don't know xmlstarlet yet, I highly recommend checking it out: it's the Swiss Army knife of command-line XML processing.

Here's the script.

identup.sh
#!/bin/bash
num=$(xmlstarlet sel --net -t -m "//user" -v "statuses_count" "http://identi.ca/api/users/show/${1}.xml")
pages=$(( (num + 199) / 200 ))  # round up so an exact multiple of 200 does not trigger an extra empty request
for ((page = 1; page <= pages; page++))
do
    xmlstarlet sel --net -t -m "//status" -v "created_at" -o " " -v "text" -n "http://identi.ca/api/statuses/user_timeline/${1}.xml?count=200&page=${page}"
    sleep 5
done

To use this script, make it executable and pass your identi.ca username as the first argument. It outputs each dent preceded by its timestamp. To save the dents, pipe the output into a file or whatever else fits your purposes.

$> ./identup.sh chimeric | tee -a identica.bak.txt
Mon Jun 14 16:19:58 +0000 2010 uh oh: Farewell Microblogging http://bit.ly/9L5ZUZ
Mon Jun 14 15:42:10 +0000 2010 @splitbrain @tante achso das, das hab ich erst morgen :-P. Aber danke trotzdem!!!!!
Mon Jun 14 15:36:41 +0000 2010 okay gleich isses soweit
...
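
Since the API is queried in pages of 200 statuses with a 5 second pause between requests, it can help to estimate up front how many requests a backup will need. The following is only a minimal sketch of that arithmetic; `pages_needed` is a hypothetical helper name and not part of identup.sh.

```bash
#!/bin/bash
# pages_needed is a hypothetical helper, not part of identup.sh.
# Usage: pages_needed STATUS_COUNT [PER_PAGE]
pages_needed() {
    local count=$1 per_page=${2:-200}
    # Integer ceiling division: a partly filled last page still costs one request.
    echo $(( (count + per_page - 1) / per_page ))
}

pages_needed 1234   # prints 7
```

With 1234 statuses that comes to 7 requests, so roughly half a minute of waiting with the 5 second sleep.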

If you need more data, look up the XML output of identi.ca and alter the second xmlstarlet call to also output the values you need.
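
As a rough illustration of such a change, the sketch below adds one more `-v` column in front of the timestamp. It runs offline against a made-up sample file instead of the live API. The `<id>` element is an assumption about what the status XML contains, and `extract_statuses` is a hypothetical name, so check the real XML output before relying on either.

```bash
#!/bin/bash
# Hypothetical offline demo: the sample file is made up for illustration, and
# the <id> element is assumed to exist in identi.ca's status XML.
extract_statuses() {
    # Same template as the second xmlstarlet call in identup.sh,
    # with one extra -v "id" column in front. $1 is a local XML file.
    xmlstarlet sel -t -m "//status" -v "id" -o " " -v "created_at" -o " " -v "text" -n "$1"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
<statuses>
  <status>
    <id>1</id>
    <created_at>Mon Jun 14 16:19:58 +0000 2010</created_at>
    <text>uh oh: Farewell Microblogging</text>
  </status>
</statuses>
EOF

extract_statuses "$sample"
rm -f "$sample"
```

Every additional value follows the same pattern: an `-o " "` separator followed by a `-v` with the element name.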

In theory, the script should also work for Twitter if you replace the API URLs in the script.

Tags:
guestpost, chimeric.de, identi.ca, xmlstarlet, api, bash