Goodbye Weblog Bookwatch
On April 14th, 2002, I launched the Weblog Bookwatch, a look at the most frequently mentioned books across the blogosphere. (original post) Since then, the Bookwatch has dutifully scanned the blogosphere day in and day out, noting book ISBNs (and Amazon CD ASINs) and the blogs where they were spotted.
In 2002 there was a fairly manageable number of blogs to scan. Between April and December 2002 there were 36,790 unique citations across 5,207 unique weblogs. Just to give you a sense of the size today, the Bookwatch scanned 47,512 weblogs this morning between midnight and 6am. I have a database with over two million citations in it, and it's growing exponentially.
I got an email from my ISP today informing me that I was over my bandwidth limit. I thought that was odd. Did I get Slashdotted and not know it? My logs didn't indicate any spikes. Nope, the problem was outbound traffic from my own machine. In other words, scanning close to 50,000 weblogs every six hours tends to use some bandwidth. That got me thinking about whether I can afford to keep the Bookwatch running.
But what about all that sweet Amazon cash? It's true that the book links on the Weblog Bookwatch are associate links to Amazon, and I get a cut when people buy books through them. But it has never been a big money maker. In Q2 of 2005 I made $118.67, which isn't even close to covering a month of hosting with my current setup.
I've enjoyed clicking through the sites to read what people are saying about books that show up on the page. But the Bookwatch can't keep up with the entire blogosphere anymore, and there are a couple of great services with more resources that track book mentions across weblogs (and much more!): All Consuming and Technorati Books.
I learned a lot about weblogs and gathering data while running and tuning the Bookwatch, and now it's teaching me about when an experiment should end. So as of today, the obidos-bot has crawled its last site. It's been a fun app, but it's time to say goodbye to Weblog Bookwatch. Thanks (again) to weblogs.com, Blogger, and Amazon for publishing data in an easy-to-use format.