splitbrain.org - electronic brain surgery since 2001

Search Delicious with Google CSE

Delicious1) is a great service to bookmark those sites that might come in handy one day.

Unfortunately, your bookmarks are only as good as your tags, since tags are the only way to find them again later. Today I thought about ways to search not only through the tags but through the contents of the bookmarked sites as well. And of course, whenever you think about search, you think about Google…

Google lets you create your own specialized search engines (re)using their index. The feature is called CSE, short for Custom Search Engine.

Usually you manually provide a list of sites to include in your search engine. But there is also a way to let Google automatically download that list from a special XML file at a URL you provide.

So I whipped up a script that uses the Delicious API to create a Google CSE “Annotation” file:

delicious_cse.php
<?php
// configure
$USER  = 'fixme'; // delicious user
$PASS  = 'fixme'; // delicious password
$CSEID = 'fixme'; // Google CSE inclusion label
 
// no changes below
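// fetch all bookmarks as XML, using HTTP basic auth in the URL
$data = file_get_contents(
            'https://'.rawurlencode($USER).':'.
                       rawurlencode($PASS).
                      '@api.del.icio.us/v1/posts/all?'
        );
// stop with a readable message instead of an XML parse exception
if($data === false) die('Could not load bookmarks from delicious');
$xml = new SimpleXMLElement($data);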
 
header('Content-Type: text/xml; charset=utf-8');
define('NL',"\n");
 
echo '<GoogleCustomizations>'.NL;
echo '<CustomSearchEngine>
  <Title>My Bookmarks</Title>
  <Description>Index over my del.icio.us bookmarks</Description>
  <Context>
    <BackgroundLabels>
      <Label name="'.$CSEID.'" mode="FILTER" />
    </BackgroundLabels>
  </Context>
</CustomSearchEngine>'.NL;
 
echo '<Annotations>'.NL;
foreach($xml->post as $post){
    // work on a plain string copy of the bookmark URL
    $href = trim((string) $post['href']);
    // only keep http(s) URLs whose host name contains a dot
    if(!preg_match('#^https?://#i',$href)) continue;
    if(strpos((string) parse_url($href,PHP_URL_HOST),'.') === false) continue;
 
    echo '  <Annotation about="'.htmlspecialchars($href).'">'.NL;
    echo '    <Label name="'.$CSEID.'" />'.NL;
    // delicious tags are space separated; each one becomes a label
    $labels = explode(' ',(string) $post['tag']);
    foreach($labels as $label){
        $label = trim($label," \t,");
        if($label === '') continue; // skip empty tags
        echo '    <Label name="'.htmlspecialchars($label).'" />'.NL;
    }
    echo '    <Comment>'.htmlspecialchars($post['description'].' - '.$post['extended']).'</Comment>'.NL;
    echo '  </Annotation>'.NL;
}
 
echo '</Annotations>'.NL;
echo '</GoogleCustomizations>'.NL;
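
If you want to try the per-bookmark logic without Delicious credentials, the body of the loop can be pulled into a small function. This is only a sketch: the function name build_annotation, the label my_label and the sample bookmark are made up for illustration and are not part of the script above. It also escapes the CSE label, which the script prints as-is.

```php
<?php
// Hypothetical helper (not part of delicious_cse.php): turns one bookmark
// into a CSE <Annotation> block. Returns an empty string for URLs that the
// script above would skip.
function build_annotation($href, $tags, $comment, $cseid){
    $href = trim($href);
    if(!preg_match('#^https?://#i',$href)) return '';
    if(strpos((string) parse_url($href,PHP_URL_HOST),'.') === false) return '';

    $out  = '  <Annotation about="'.htmlspecialchars($href).'">'."\n";
    $out .= '    <Label name="'.htmlspecialchars($cseid).'" />'."\n";
    foreach(explode(' ',$tags) as $label){
        $label = trim($label," \t,");
        if($label === '') continue; // skip empty tags
        $out .= '    <Label name="'.htmlspecialchars($label).'" />'."\n";
    }
    $out .= '    <Comment>'.htmlspecialchars($comment).'</Comment>'."\n";
    $out .= '  </Annotation>'."\n";
    return $out;
}

// made-up sample bookmark with a doubled space and a trailing comma in the tags
echo build_annotation('http://www.example.com/?a=1&b=2','php  api,','Example - a test','my_label');
```

The sample shows that the ampersand in the URL gets escaped and that the empty tag between the two spaces produces no label.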

First download the script above, then create a new Custom Search Engine. When asked for a site to be indexed, just enter http://example.com and delete it again once your engine has been created. We don't want to specify any sites ourselves; that's what the script will do for us.

Next go to the settings of your CSE and switch to the “Advanced” page. There you'll find the label that is used to include new sites in your engine. Put that label in the $CSEID variable in the script. Also put your Delicious user name and password in the $USER and $PASS variables respectively.

Then upload the whole file to your PHP-enabled web server. The URL of the script needs to be entered as the annotation feed on the above-mentioned “Advanced” page of the CSE settings.
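
Before entering the URL, it can be worth checking that the script's output is well-formed XML and actually contains annotations. Here is a minimal sketch of such a check. The function name count_annotations and the sample string are hypothetical; in practice you would pass in whatever your uploaded script returns.

```php
<?php
// Hypothetical check: parse an annotation file and count its <Annotation>
// entries. Returns false if the string is not well-formed XML.
function count_annotations($xmlstring){
    libxml_use_internal_errors(true); // collect parse errors instead of printing warnings
    $xml = simplexml_load_string($xmlstring);
    if($xml === false) return false;
    return count($xml->xpath('/GoogleCustomizations/Annotations/Annotation'));
}

// made-up sample in the same layout the script produces
$sample = '<GoogleCustomizations><Annotations>'.
          '<Annotation about="http://www.example.com/"><Label name="my_label" /></Annotation>'.
          '</Annotations></GoogleCustomizations>';
echo count_annotations($sample)."\n"; // prints 1
```

A result of false or 0 suggests that the credentials or the feed loading need another look before Google is pointed at the script.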

That's it. Your Custom Search Engine should now automatically search the contents of all sites you bookmark at delicious.
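
If your PHP does not allow opening URLs with file_get_contents(), the feed loading can be swapped for cURL. The following is a rough sketch under a few assumptions: the curl extension is installed, the API accepts the same basic auth credentials, and the function name fetch_delicious_posts is made up for this example.

```php
<?php
// Hypothetical replacement for the file_get_contents() call in delicious_cse.php.
// Returns the raw XML string, or false if cURL is missing or the request fails.
function fetch_delicious_posts($user, $pass, $url = 'https://api.del.icio.us/v1/posts/all'){
    if(!function_exists('curl_init')) return false; // curl extension not loaded

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FAILONERROR, true);        // treat HTTP errors (e.g. 401) as failure
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC); // send user and password as basic auth
    curl_setopt($ch, CURLOPT_USERPWD, $user.':'.$pass);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);        // seconds to wait for the connection
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);              // seconds allowed for the whole request
    return curl_exec($ch);                              // string on success, false on failure
}
```

In the script, the result of fetch_delicious_posts($USER, $PASS) would then be passed to new SimpleXMLElement() after checking that it is not false.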

Tags: del.icio.us, google, cse, php, api
1) or del.icio.us as I still like to write it
Posted on Wednesday, November the 11th 2009 (3 months ago).

Comments?

1
Andreas, as always AWESOME. I don't use delicious bookmarks per se, but this is something that I am filing under too-cool-to-lose and hope that I can find a nifty way to use it!!

and btw... I was telling some friends about your experience with Delta... only I couldn't remember which airline it was... but no one had any doubt that Delta was the perp.
2009-11-11 21:20:44
2
Hi there,

I'm interested in using this. But when I checked the PHP file uploaded to my server, I received the following error:

Warning: file_get_contents(https://...@api.del.icio.us/v1/posts/all?) [function.file-get-contents]: failed to open stream: No such file or directory in /home/cary/www/delicious_cse.php on line 13

Fatal error: Uncaught exception 'Exception' with message 'String could not be parsed as XML' in /home/cary/www/delicious_cse.php:14 Stack trace: #0 /home/cary/www/delicious_cse.php(14): SimpleXMLElement->__construct('') #1 {main} thrown in /home/cary/www/delicious_cse.php on line 14

What's wrong?

Thx,
Cary
2010-01-19 17:03:54
Cary Crusiau
3
Cary, your PHP probably doesn't allow URLs in fopen calls. Reconfigure it or replace the feed loading with a call to cURL or some other HTTP library.
2010-01-20 09:08:40
4
Implementation was very easy. Worked like charm. Thanks a lot!
2010-02-06 05:40:38
Christian