Hack of the Whenever I Get Around to It

August 16, 2007

YaCy crawl of this site

Filed under: Uncategorized — Chris Merck @ 2:39 am

At the suggestion of some IRCers, I tried YaCy, an open-source, peer-to-peer web search and crawling tool. I have to say I am impressed with both its features and the speed of its web crawl.

Here is the visual result of a web crawl of hotwigati.blogspot.com:

And an action shot of the crawl:

The exciting new feature in YaCy is the ability to distribute web crawling among many peers, resulting in fast and extensive crawls.
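To make the idea of splitting a crawl among peers concrete, here is a minimal sketch in Python. It is not YaCy's actual scheduling code, which I have not read; the function names `assign_peer` and `partition_crawl`, the peer names, and the choice to hash on the host name are all my own assumptions for illustration.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def assign_peer(url, peers):
    """Pick a peer for a URL by hashing its host name.

    Hashing the host (rather than the full URL) keeps every page of one
    site on the same peer. This is an illustrative choice, not YaCy's.
    """
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then map it
    # onto the list of peers.
    index = int.from_bytes(digest[:8], "big") % len(peers)
    return peers[index]


def partition_crawl(urls, peers):
    """Group URLs into one work list per peer."""
    work = defaultdict(list)
    for url in urls:
        work[assign_peer(url, peers)].append(url)
    return dict(work)


if __name__ == "__main__":
    # Hypothetical peer names and start URLs.
    peers = ["peer-a", "peer-b", "peer-c"]
    urls = [
        "http://hotwigati.blogspot.com/",
        "http://hotwigati.blogspot.com/2007/08/",
        "http://example.org/",
        "http://example.com/about",
    ]
    for peer, batch in partition_crawl(urls, peers).items():
        print(peer, batch)
```

Each peer then crawls only its own batch, so the work proceeds in parallel. A real system would also have to cope with peers joining and leaving, which this sketch ignores.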

Of course, there are privacy and security concerns. The software is Java-based, which reduces the risk of memory-corruption exploits such as buffer overflows, but there still remains the ‘risk’ of having your IP address in the logs of sites crawled through your machine on behalf of anonymous peers.

