Add Camera Images to Flickr

When I'm browsing photos on Flickr, I use the More Properties link quite a bit. That's the link that takes you to the Exif data associated with a photo if it's available. Embedded Exif data is how Flickr knows what type of camera took a particular photo, what the shutter speed was, aperture setting, and a bunch of other technical details about the state of the camera at the time the photo was taken. The More Properties link is to the right of a photo on Flickr, and looks like this when it's there:

More properties link

The first thing I look at on the More Properties page is the camera model. But unless I know a particular camera model number already, it doesn't tell me much. "Ahh yes, the EX-Z750," I tell myself. Of course I have no idea what that model number means. So if I really want to know what type of camera the photographer used, I have to copy the model number, go to Amazon or Google, paste it in, and sort through the results. I knew there had to be a better way.

So I wrote a (relatively) quick Greasemonkey script that does the work of looking up the camera model for me. It even inserts a picture of that particular model on the Flickr "More properties" page. Here's what it looks like in action.

More properties page before:

More properties before

More properties page after:

More properties with camera image

And you can click the camera image to view more info about the camera at Amazon. Bonus for me: if you buy the camera through that link, I'll get a little kickback through Amazon's Associates Program.

Here's how it works. The script grabs the camera model from the Flickr page, contacts the Amazon API looking for that model in the Camera & Photo category, then grabs the image of the first result. Then the script inserts the image and a link to the product page into the page at Flickr.
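In rough terms, the lookup step looks something like this. This is an illustrative JavaScript sketch, not the script's actual code: the endpoint and parameter names approximate Amazon's ItemSearch web service of the era, and the `buildAmazonSearchUrl` helper name is my own.

```javascript
// Sketch of building the Amazon lookup URL for a camera model.
// The endpoint and parameters below are approximations of the
// Amazon Web Services ItemSearch call, not the script's exact code.
function buildAmazonSearchUrl(model, accessKey, associateTag) {
  var params = [
    'Service=AWSECommerceService',
    'Operation=ItemSearch',
    'SearchIndex=Photo',          // the Camera & Photo category
    'ResponseGroup=Images,Small', // product image plus basic details
    'Keywords=' + encodeURIComponent(model),
    'AWSAccessKeyId=' + accessKey,
    'AssociateTag=' + associateTag
  ];
  return 'http://webservices.amazon.com/onca/xml?' + params.join('&');
}

// In the Greasemonkey script, the model string comes from the More
// Properties page, and GM_xmlhttpRequest would fetch this URL; the
// first result's image and product link get inserted into the page.
```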

It's not perfect. Sometimes Amazon doesn't carry that particular camera but has accessories that include a description with the model number. So you'll see a flash or remote shutter release instead of a camera. And sometimes the first result from Amazon isn't the correct model number—especially with older cameras. I'll keep tinkering with it to see if I can get more accurate results from Amazon.

If there's no match at all on Amazon, the script makes the model number a link to Google search results for that phrase.
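The fallback itself is about a one-liner. Here's a sketch of the idea (the helper name is mine, purely for illustration):

```javascript
// If Amazon returns no match, link the model number to a Google
// search for the exact phrase instead.
function googleFallbackUrl(model) {
  return 'http://www.google.com/search?q=' + encodeURIComponent('"' + model + '"');
}
```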

The script just gives me a quick look at the type of camera that took the photo. I've been surprised to see cameras that look like video cameras taking nice still photos. Anyway, it was fun to put together and I learned a bit more about JavaScript.

If you already have Firefox with Greasemonkey installed, you can install this script for yourself here: Flickr Camera Images

Thanks to the author of Monkey Match for a solid Amazon E4X parsing example, and of course Dive Into Greasemonkey. For more fun hacking around with these applications, check out Flickr Hacks and Amazon Hacks. (Disclaimer: as you probably know, I worked on both of these books.)

Add a batch of dates to Google Calendar

I've always used several calendars to plan out my life. Until recently, I used a paper desk calendar to track work-related events like project milestones. I used an insanely hacked-up version of PHP Calendar to track daily appointments and travel plans. And I used a paper calendar hanging in the kitchen to track family events like birthdays and anniversaries. And to be honest, with all of the calendars I still wasn't very organized. The distinctions between types of events and calendars weren't as clear-cut as I'm describing them, and I'd often have a work project milestone on my kitchen calendar, or a birthday in PHP Calendar, rather than in their "proper" locations.

What I like about Google Calendar is the ability to lay several calendars on top of each other. So I can keep the family birthdays separate from the project milestones, but I can still show them all on one calendar if I need to. And with a click, I can remove the dates that aren't relevant for what I'm working on at the moment. The calendar list looks like this:

calendar controls

I decided to make Google Calendar my One Calendar To Rule Them All, and the switch has been very easy. The Ajaxy interface makes adding events insanely intuitive—click a day to add an event on that day. And I love the ability to click and drag several days to add weeklong events like conferences. The other big advantage to going digital is the ability to share calendars with other people. I can't easily send all of the data on my paper calendars to friends and family without Xerox and Fedex involved.

The one issue I ran into during the conversion was with family events. I had over 50 birthdays and anniversaries I wanted to add to a calendar, and the thought of clicking Create Event and adding data for each one, or worse—hunting and pecking to find a particular day to click—wasn't appealing. So I thought I'd share my method for dumping a bunch of dates into Google Calendar. You just need a little time to get your dates together, some Perl, and a Google Calendar account.

Import/Export

Google Calendar doesn't have an API (yet), but it does have a hacker's little friend called import/export. Google accepts two types of calendar formats for import: iCalendar and Outlook's Comma Separated Values (CSV) export. So if you already have calendar data in Outlook or iCal, you can simply import/export at will. (Yahoo! Calendar also exports to the Outlook CSV format, so switching is fairly painless.) But I didn't know the first thing about either of these formats; I simply had a list of dates I wanted to dump.

Gathering Dates

I had a head start because I already had a list of family birthdays and anniversaries in a text file. I massaged the list a little to get it into a data-friendly format, and ended up with a file full of dates that looked like this:
4/18/1942,Uncle Bob's Birthday
4/28/1944,Aunt Sally's Birthday
7/23/1978,Lindsay and Tobias' Anniversary
8/10/1989,Cousin Maeby's Birthday
...
(obviously not real data.)

If you're building a list of dates from scratch you can use Excel. Just put dates in the first column in mm/dd/yyyy format, descriptions in the second. When you're done, save the file in CSV format, ignoring all the warnings about compatibility.

I called the file family_dates.csv. Yes, this is a comma-separated value list too, but not the format Google Calendar is expecting. Plus you don't want to add an event on April 18th, 1942. You want to add a full-day event for April 18th each year going forward. This is where I turned to Perl to massage the data.

The Code

This short Perl script: calendar_csv.pl transforms the simple CSV list of dates and titles into the Outlook CSV format that Google likes to see. When you run the script, it converts the year of each event to the current year and adds an event for each of the next several years.
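For the curious, here's the same transformation sketched in JavaScript. This is not the actual Perl in calendar_csv.pl, and the CSV header is my approximation of the Outlook format Google accepts:

```javascript
// Expand each "mm/dd/yyyy,description" line into one all-day event
// per year, starting with the given year. The header row is an
// approximation of the Outlook CSV format Google Calendar imports.
function expandDates(csvText, yearsAhead, startYear) {
  var rows = ['Subject,Start Date,All Day Event'];
  var lines = csvText.split('\n');
  for (var i = 0; i < lines.length; i++) {
    var line = lines[i].trim();
    if (!line) continue;
    var comma = line.indexOf(',');
    var date = line.slice(0, comma).split('/'); // [month, day, year]
    var subject = line.slice(comma + 1);
    for (var y = 0; y < yearsAhead; y++) {
      // Swap date[0] and date[1] here for dd/mm/yyyy input
      var start = date[0] + '/' + date[1] + '/' + (startYear + y);
      rows.push('"' + subject + '",' + start + ',True');
    }
  }
  return rows.join('\n');
}
```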

You'll need to customize the script a bit before you run it. Change $datefile to the name of your simple CSV file, in my case family_dates.csv. You can change $importfile to your preferred name for the output file; the default is import.csv. And you can set the number of years into the future that you'd like each date to appear by adjusting the value of $yearsahead; the default is 5. (If your events should only be added in the current year, set this to 1.)

Keep in mind that the larger the amount of data in your calendar, the longer it will take Google to load that calendar when you fire up Google Calendar. I originally set the $yearsahead value to 10, but with over 500 events, the calendar was noticeably slowing Google Calendar's startup.

In addition to Perl, you'll need the standard Date::Calc module.

And if you're not in the US and would prefer dd/mm/yyyy format, simply change this bit: my ($month, $day) = to this: my ($day, $month) =. Instant internationalization!

Once everything is set, run the script from a command prompt, like this:

perl calendar_csv.pl

A new file called import.csv will magically appear with your dates formatted as Outlook CSV events. With the file in hand you can head over to Google Calendar.

Importing Data

Over at Google Calendar, click Manage Calendars under your calendar listing on the left side. Choose Create new calendar, and give your calendar a name and any other details. Click Create Calendar, and you'll see the new calendar in your list. Now click Settings in the upper right corner of the page, and choose the Import Calendar tab. Click Browse..., choose import.csv from your local files, set the calendar to your new calendar, and click Import.

That's all there is to it. You'll get a quick report about the number of events Google was able to import. Go back to your main view, and you should see your imported dates on the calendar, in the color of your newly created calendar. With one import, my view of April went from this:

calendar pre import

To this view, with family birthdays in the rust color:

calendar post import

(The details have been removed to protect the innocent.)

And once you have your calendar in Google, you can invite others to view and even help maintain the dates. Where I think this batch importing will be useful is for very large data sets. Imagine a teacher who wants to track the birthdays of students. It wouldn't be too hard to add the dates by hand. But a principal who wants to track the birthdays of everyone in a school will have an easier time putting together a spreadsheet than entering the days by hand. And even for my 50+ dates, writing a Perl script was preferable to entering the dates by hand.

So far I'm enjoying Google Calendar, and I haven't found any major problems beyond the limited importing ability. But now I really don't have an excuse for not sending out birthday cards.

Update (4/20): Google just released their Google Calendar API. I'll bet there are scores of hackers rushing to build bulk-import tools. Using the Calendar API would be a more stable way to import dates quickly. And wow! Hello, lifehackers!

Bloglines Update

Great news: Bloglines addressed the "onfocus/nofocus" problem, and the Greasemonkey script I wrote isn't needed anymore. I got an email from Paul at Bloglines letting me know that, "Our anti-XSS code was being too aggressive and attempting to filter attribute values, in addition to attribute keys." Thanks, Paul! I'm very happy they took time out to address the problem because I think it's a great service and I didn't want to move to another reader. If you installed the Greasemonkey script, you can get rid of it. I deleted it from my server.

Flickr Hacks Code

There's a nice review of Flickr Hacks over at MyMac.com: Hack Your Way Into Flickr, and the reviewer mentioned that the code for all of the hacks wasn't available online. O'Reilly has remedied the situation, and you can grab all of the code from the book in one zip file: Flickr Hacks Code. Carpal tunnels everywhere are rejoicing. (And don't forget about the color figures gallery at Flickr—another way to view parts of the book.)

Bloglines Greasemonkey Script

In January I posted about a peculiar problem between this site and Bloglines: Bloglines filtering. Basically, Bloglines filters out the word "onfocus" from links to avoid cross-site scripting (XSS) attacks. The filter isn't smart enough to realize that "onfocus.com" is perfectly ok, and not a threat. This means that anytime someone links to my site, or I link to images on my site, the Bloglines filter changes the domain from onfocus.com to nofocus.com. When people click on a link to my site within Bloglines, they get a 404 error page at nofocus.com. (System administrators over at nofocus.com must wonder why they get some strange 404 errors showing up in their logs.)

Anyway, I've emailed Bloglines about the problem several times and now I'm getting silence. I don't blame them, this is an obscure issue that only affects one of the millions of sites that flow through their system. But it still bugs me, so I wrote a quick Greasemonkey script to solve the problem. If you use Bloglines and Firefox and Greasemonkey, I encourage you to install this script: fix-bloglines-onfocus.user.js. (Of course, if you're reading this from within Bloglines, you'll need to visit onfocus.com directly to get the script.) The script changes any instance of "nofocus.com" to "onfocus.com". This script is as blunt as Bloglines' XSS filter, but it's my attempt to fix the issue from this end.
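The substitution at the heart of the script is about as simple as fixes get. Here's a sketch of the idea, with a helper name I've made up for illustration:

```javascript
// Blunt fix for Bloglines' blunt filter: rewrite any "nofocus.com"
// in a link back to "onfocus.com".
function fixHref(href) {
  return href.replace(/nofocus\.com/g, 'onfocus.com');
}

// In a Greasemonkey context you'd apply it to every link on the page:
//   var links = document.getElementsByTagName('a');
//   for (var i = 0; i < links.length; i++) {
//     links[i].href = fixHref(links[i].href);
//   }
```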

Many thanks to Mark Pilgrim for his Greasemonkey Patterns—it's a great resource for building scripts.

Update: Bloglines fixed their XSS filter.

Music Personality Score

Since talking with Gabriel at MusicStrands the other day, I've been thinking more about how we share our musical tastes with others. I was making the point to him that there should be a way to quickly relate the type of music you're interested in without forcing people to wade through months of listening data like the current social music services require. For example, you can see that my top two artists at Last.fm are Bob Marley and Mozart based on frequency of plays, but that doesn't mean that my top two genres are Reggae and Classical. (I wouldn't place those as my top two if someone asked me.) You have to wade through the entire list to see that I also like classic rock, indie rock, electronic music, and lots of other genres.

What I was trying to say to Gabriel, but couldn't quite articulate, is that there should be a Myers-Briggs style scoring system for musical taste. When I see that someone is an ENFP, I have one instant measure of their personality. If you could do the same for music, you'd have a way to instantly relate your musical interests. I'm not sure what the criteria would be—maybe I'm an ISAE (indie structured ambient electronic), or MECR (mainstream eclectic classic rock). And this would go hand in hand with a service like MusicStrands because they can analyze the last 1,000 songs I actually listened to. With the score in hand, I could paste it into the dozen or so social network sites I belong to, giving people a more nuanced look at my preferences than my top 5 bands or something.
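Just to make the idea concrete, here's a toy JavaScript sketch of how a service might boil recent plays down to a short code. The genre-to-letter scheme is invented purely for illustration:

```javascript
// Toy "music personality" score: tally genres across recent plays,
// then take the first letter of the top four genres. The scoring
// scheme here is invented for illustration only.
function musicScore(plays) {
  var counts = {};
  for (var i = 0; i < plays.length; i++) {
    var g = plays[i].genre;
    counts[g] = (counts[g] || 0) + 1;
  }
  var genres = Object.keys(counts).sort(function (a, b) {
    return counts[b] - counts[a]; // most-played genres first
  });
  return genres.slice(0, 4).map(function (g) {
    return g.charAt(0).toUpperCase();
  }).join('');
}
```

A real service like MusicStrands or Last.fm could run something like this over the last 1,000 tracks and hand back a compact code to paste into a profile.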

The iTunes Signature Maker is one stab at this concept. This application wades through your iTunes collection and creates a short audio signature based on the music it finds. When listening to others' signatures I guess you could listen for electronica vs. distorted guitars, but it doesn't really give you a sense of music preference. This is more of a fun hack than a useful way to share your musical identity. It'd be much more accurate to analyze what you're actually listening to, and then do a bit of categorization based on meta info about those tracks.

I [heart] NY

sk and I just got back from a week in New York—here are some snapshots. We ended up spending quite a bit of time just walking around New York City. Our first trek was through Central Park.

central park

Meg and Jason had a beautiful wedding, the reason for our trip.

meg and jason

On Sunday we took a bus to New Paltz, NY, about an hour and a half north of New York City, to visit sk's aunt and uncle.

on the bus

We went on some great hikes in the area. Here's a picture sk snapped of me, apparently happy to be hiking.

pb hiking

The hiking highlight was a rock scramble up the side of a cliff, with great views from the top.

crag view

I'm hoping to make it back to New York City in the not too distant future. There's so much to do there and I feel like we barely scratched the surface.

no standing

You can see more photos from the trip (mostly from my cell phone) at Flickr, tagged with nyc and new paltz.

Musicstrands

Today I chatted with someone from Musicstrands and found out a bit about the company. They're based here in Corvallis, Oregon and employ somewhere around 30 people locally. It's fun to learn that a little piece of Web 2.0 is being built right here in my backyard. I use their competitor Last.fm (my profile), but I don't feel too bad because I've been sending my listening habits there since Audioscrobbler appeared several years ago. Sharing music seems so natural that I bet iTunes or YME will ship with more social features (like those MusicStrands provides) in the future.

If you want to see what Musicstrands is cooking up, check out MusicStrands Labs. They even have a tool for people like me that gives music recommendations for Last.fm users. (Thanks in part to the Audioscrobbler API, I assume.) Also fun: MusicStrands patents.

eJournal USA mentions onfocus

The US Department of State mentioned this site in their monthly eJournal, an issue called Media Emerging. It was in an article about online photo journals, and you can see the article here: Online Albums. Click Enter Album to see all of the photoblogs mentioned. They also have an article about blogs: Bloggers Breaking Ground in Communication. It's great to be mentioned as a photoblogger even though I don't necessarily think of myself in that category anymore. But it's a good reminder that I should keep posting photos. They contacted me about the article a week or two ago and it was strange to see an email in my inbox with the subject, request from U.S. Dept of State.

Mechanical Turk

ETech has been over for a week, and one presentation is still nagging at me on a regular basis. Amazon has a Web Service called Mechanical Turk (named after this Mechanical Turk), and Felipe Cabrera from Amazon spent 15 minutes or so talking about MTurk during one of the ETech morning talks.

The talk focused on the idea that artificial intelligence hasn't materialized, and there are still some tasks that are easy for humans but impossible for computers. For example, a human can look at a picture of a chair and answer the question: Is this a picture of a chair or a table? A computer would have a tough time with that.

MTurk farms out these sorts of questions to real live humans and wraps their decisions (the tasks are HITs, for Human Intelligence Tasks, in MTurk parlance) into a Web Services API so they can be used in computer programs. Cabrera called this process of tapping humans to make decisions for machines Intelligence Augmentation (IA), as opposed to Artificial Intelligence (AI). The talk was good, and MTurk is definitely a clever hack, but the idea has been bothering me.

I can imagine a world where my computer can organize my time in front of the screen better than I can. In fact, I bet MTurk will eventually gather data about how many HITs someone can perform at peak accuracy in a 10 hour period. Once my HIT-level is known, the computer could divide all of my work into a series of decisions. Instead of lunging about from task to task, getting distracted by blogs, following paths that end up leading nowhere, the computer could have everything planned out for me. (It could even throw in a distraction or two if that actually increased my HIT performance.) If I could be more efficient and get more accomplished by turning decisions about how I work over to my computer, I'd be foolish not to.

I guess this idea of people being managed and controlled by machines is nothing new, and it was the bread and butter of science fiction books I read as a kid. But MTurk puts this dystopia in a new, immediate context. Machines are smarter than ever, and control of human decision-making could be highly organized.

MTurk is only a few months old, and there's nothing inherently wrong with it. But I can't stop projecting the ideas behind the system ahead a few years, and that's what's bothering me. I can't even fully articulate why it's bothering me. I don't have any conclusions, or even concrete hypotheticals of MTurk gone awry—so I'm just using my blog as therapy. Obviously my computer didn't ask me to write this.

slashdot topic feeds

Matt was looking over my shoulder while I was reading feeds at the airport yesterday, and he noticed that I have a feed for Google-related posts at Slashdot. I told him I was scraping it together because Slashdot doesn't offer topic feeds (and I don't want to see everything at Slashdot), and Matt thought I should share the rss-generating love with the world. I agreed, and here we are.

Here's the script I'm using to scrape Slashdot. It's in Perl, and you'll need a couple modules: LWP::Simple and XML::RSS::SimpleGen. Once installed, grab the code: slashfeed.pl.

You'll also need the numeric topic ID for any Slashdot topic you want to track. They're easy to find. Those big icons in any Slashdot post link to a topic page. Click on one of those, and look for a number in the URL. For example, the Slashdot Google Topic Page is here:

http://slashdot.org/search.pl?tid=217

Note the tid=217 in the URL. That's your Slashdot topic ID for posts about Google. You can browse the directory of all available Slashdot topics at the top of the Slashdot Search page.
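If you wanted to pull the ID out programmatically, it's a one-line regular expression. Here's an illustrative JavaScript helper (not part of slashfeed.pl):

```javascript
// Pull the numeric topic ID out of a Slashdot topic-page URL.
function getTopicId(url) {
  var match = url.match(/[?&]tid=(\d+)/);
  return match ? match[1] : null;
}
```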

To generate an RSS feed full of Slashdot Google goodness, run the script from a command prompt, passing in a topic ID like this:

% perl slashfeed.pl 217

The script will spit out a file called slashfeed_217.xml that contains the latest Google-related posts, RSS style. Just make sure the script saves this file to a publicly addressable web folder (you might need to tweak the output file path on line 55). The final URL should look something like:

http://example.com/feeds/slashfeed_217.xml

Throw your new URL in your feed reader, and run the script on a regular basis with cron or Windows Task Scheduler. That's all there is to building a topic-specific Slashdot feed.

Scraping is notoriously brittle, so if Slashdot changes their HTML this script will break. If that happens, view source on the Slashdot topic page and rewrite the regular expressions on line 39 or so of the script. That's the only labor-intensive bit in this script.
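To give a sense of what that scraping step involves, here's a JavaScript sketch of extracting stories with a regular expression. The pattern matches the simplified sample markup below, not Slashdot's real HTML, which you'd need to inspect yourself with View Source:

```javascript
// Pull story links and titles out of a page with a regex. The
// pattern here fits the simplified sample markup below; a real
// scraper's pattern has to match Slashdot's actual HTML.
function extractStories(html) {
  var stories = [];
  var re = /<a href="([^"]+)" class="story">([^<]+)<\/a>/g;
  var m;
  while ((m = re.exec(html)) !== null) {
    stories.push({ link: m[1], title: m[2] });
  }
  return stories;
}

var sample = '<a href="http://slashdot.org/article.pl?sid=1" class="story">Google Does Something</a>';
var stories = extractStories(sample);
// stories[0].title is "Google Does Something"
```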

ETech 2006 thoughts

I'm back from ETech. The theme this year was The Attention Economy, and I have to agree with Matt's Thoughts on etech that I didn't walk away with much new information about attention. But ETech is always about more than the theme, and a second emerging theme from the conference was ubiquitous computing. In fact, Bruce Sterling's opening talk was called The Internet of Things, where he discussed his concept of Spime—a virtual object that manifests itself physically for a time while retaining the trackability of a virtual object. (As I understood it.) For example, shoes could be digitally designed, fabricated, and made location-aware. That way you could simply Google them if you can't find them in the morning. (His extended thoughts on Spimes are in Shaping Things.) Many sessions touched on ubiquitous computing and controlling the physical world in a more fluid, digital way.

Another emerging topic was Yahoo!, with three or four sessions devoted entirely to Yahoo! products. Of course I'm very interested in Yahoo! after working on Yahoo! Hacks, but their presence felt heavy-handed. (Granted, many members of the ETech selection committee were acquired by Yahoo! over the past year.) But the sessions I saw were straight product-pitches with little or no bearing on the conference theme of Attention Economy. I don't mind seeing demos or product pitches if they're within the context of larger ideas. Yahoo! wasn't the only offender there. Just to compare: Google was absent from the conference, and I only saw one pitch from Microsoft.

My favorite sessions were about big ideas: Maribeth Back's reading rooms, danah boyd's G/localization, Derek's distributed communities, and Clay Shirky's patterns for social software. I think what I'm personally looking for is a more academic, less commercial conference devoted entirely to social interaction mediated by technology. That's a convoluted way of saying Social Software Conference, but I'd also like to hear about trends in ubiquitous computing and networked devices.

Once again, I came away from ETech with notes full of ideas to digest and play with. And even though I might not have a better handle on attention, it's often the unexpected threads that emerge from the conference that turn out to be the most valuable.