splitbrain.org

electronic brain surgery since 2001

Search Delicious with Google CSE

Delicious 1) is a great service for bookmarking those sites that might come in handy one day.

Unfortunately, your bookmarks are only as good as your tags, since tags are the only way to find them again later. Today I thought about ways to search not only through the tags but through the contents of the bookmarked sites as well. And of course, whenever you think about search, you think about Google…

Google lets you create your own specialized search engines (re)using their index. The feature is called CSE – Custom Search Engines.

Usually you provide a list of sites to include in your search engine manually. But there is also a way to let Google automatically download that data from a special XML file at a URL you provide.

So I whipped up a script that uses the delicious API to create a Google CSE “Annotation” file:

delicious_cse.php
<?php
// configure
$USER  = 'fixme'; // delicious user
$PASS  = 'fixme'; // delicious password
$CSEID = 'fixme'; // Google CSE inclusion label
 
// no changes below
$xml = new SimpleXMLElement(
               file_get_contents(
                   'https://'.rawurlencode($USER).':'.
                              rawurlencode($PASS).
                             '@api.del.icio.us/v1/posts/all?'
               )
       );
 
header('Content-Type: text/xml; charset=utf-8');
define('NL',"\n");
 
echo '<GoogleCustomizations>'.NL;
echo '<CustomSearchEngine>
  <Title>My Bookmarks</Title>
  <Description>Index over my del.icio.us bookmarks</Description>
  <Context>
    <BackgroundLabels>
      <Label name="'.$CSEID.'" mode="FILTER" />
    </BackgroundLabels>
  </Context>
</CustomSearchEngine>'.NL;
 
echo '<Annotations>'.NL;
foreach($xml->post as $post){
    $post['href'] = trim($post['href']);
    if(!preg_match('#^https?://#i',$post['href'])) continue;
    if(strpos(parse_url($post['href'],PHP_URL_HOST),'.') === false) continue;
 
    echo '  <Annotation about="'.htmlspecialchars($post['href']).'">'.NL;
    echo '    <Label name="'.$CSEID.'" />'.NL;
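    // delicious tags are space separated; emit one label per tag
    $labels = explode(' ',$post['tag']);
    foreach($labels as $label){
        $label = trim($label," \t,");
        if($label === '') continue; // skip empty tags (untagged posts, double spaces)
        echo '    <Label name="'.htmlspecialchars($label).'" />'.NL;
    }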
    echo '  <Comment>'.htmlspecialchars($post['description'].' - '.$post['extended']).'</Comment>'.NL;
    echo '  </Annotation>'.NL;
}
 
echo '</Annotations>'.NL;
echo '</GoogleCustomizations>'.NL;
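
To make the result more concrete, here is roughly what the script prints for a single bookmark. Every value is made up for illustration: the label `_cse_abc123`, the URL, the tags "php api" and the description texts are all assumptions, and your own output will contain whatever delicious returns for your account.

```xml
<GoogleCustomizations>
<CustomSearchEngine>
  <Title>My Bookmarks</Title>
  <Description>Index over my del.icio.us bookmarks</Description>
  <Context>
    <BackgroundLabels>
      <Label name="_cse_abc123" mode="FILTER" />
    </BackgroundLabels>
  </Context>
</CustomSearchEngine>
<Annotations>
  <Annotation about="http://www.example.com/">
    <Label name="_cse_abc123" />
    <Label name="php" />
    <Label name="api" />
  <Comment>Example Site - A demo bookmark</Comment>
  </Annotation>
</Annotations>
</GoogleCustomizations>
```

Each bookmark gets the CSE inclusion label plus one label per delicious tag, and the comment combines the bookmark's description and extended text.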

First download the script above, then create a new Custom Search Engine. When asked for a site to be indexed, just give http://example.com and delete it again once your engine has been created. We don't want to specify any sites ourselves – that's what the script will do for us.

Next go to the settings of your CSE and switch to the “Advanced” page. There you'll find a label that is used to include new sites in your engine. Put that label in the $CSEID variable in the script. Also put your delicious user name and password in the $USER and $PASS variables, respectively.

Then upload the whole file to your PHP-enabled web server. Enter the URL of the script as the Annotation feed on the “Advanced” page of the CSE settings mentioned above.
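
Before handing the URL to Google, it may be worth checking that the script really returns well-formed XML with some annotations in it. The following is only a minimal sketch: `countAnnotations` is a hypothetical helper of mine, and the sample document and its `_cse_abc123` label are made up. In practice you might feed it the result of `file_get_contents()` on your own script URL instead.

```php
<?php
// Hypothetical helper: count the <Annotation> entries in a feed document.
function countAnnotations(string $feedXml): int
{
    // The constructor throws an Exception if the XML is not well-formed.
    $doc = new SimpleXMLElement($feedXml);
    $found = $doc->xpath('/GoogleCustomizations/Annotations/Annotation');
    return is_array($found) ? count($found) : 0;
}

// Made-up sample in the shape the script above emits.
$sample = <<<XML
<GoogleCustomizations>
<Annotations>
  <Annotation about="http://www.example.com/">
    <Label name="_cse_abc123" />
  </Annotation>
  <Annotation about="http://www.example.org/">
    <Label name="_cse_abc123" />
  </Annotation>
</Annotations>
</GoogleCustomizations>
XML;

echo countAnnotations($sample), " annotations\n"; // prints "2 annotations"
```

If the count is zero or the parser throws, the usual suspects are wrong delicious credentials or an error message printed before the XML.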

That's it. Your Custom Search Engine should now automatically search the contents of all sites you bookmark at delicious.

Tags:
del.icio.us, google, cse, php, api
1) or del.icio.us, as I still like to write it