The $24 Poor Man’s Social Media Expression Pattern Database (PoMaSoMeExpPaDa)
Expression pattern images are some of the most information-rich data housed at model organism databases. They are time-consuming to generate, and time-consuming to collect and annotate.
Moreover, copyright restrictions mean that many images remain captive at publishers' websites, unable to be placed within the rich intellectual framework that exists at sites like WormBase and FlyBase. How many near-identical images are stashed away in darkened confocal rooms? How many possibly informative rejects are tossed out due to the puny limitations of publication? Gabijillions?
I wanted to build an easy-to-use expression pattern image resource that got around these limitations. The system would allow people to add their own photos for display within a broader intellectual context, comment on photos, add tags, search by a variety of criteria, etc. The problem? Developer cycles. This is a lower-than-low-priority project and there aren’t enough hands to go around as it is.
I started wondering if I could leverage a site like Flickr to create a Poor Man’s Expression Pattern Database. Flickr is a key exemplar of Web 2.0 community-style features: tags, contacts, comments, an API.
I took approximately 6000 public, highly curated expression pattern images from WormBase. We display these on Expression Pattern Summary pages.
I wrote a script exploiting Flickr’s REST-like API to programmatically upload images.
For each image, the script added a text description of the expression pattern with hyperlinks back to WormBase genes, anatomy ontology terms, gene ontology terms, strains, transgenes, etc. Images were posted to a dedicated user named, ahem, wormbase.
Tags were added to each image corresponding to the unique gene ID, public gene names, and anatomy ontology terms.
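The upload script itself isn't shown here (and was presumably written in Perl), so what follows is only a rough Python sketch of how the per-image description and tags described above might be assembled before upload. The function names, the `WORMBASE_URL` base, and the example IDs are all hypothetical. The one Flickr-specific detail it relies on is that the upload API takes tags as a space-separated string, so multi-word anatomy terms need to be wrapped in double quotes.

```python
from html import escape

# Hypothetical base URL for link-backs; the real WormBase URL scheme may differ.
WORMBASE_URL = "https://example.org/wormbase"


def format_tags(tags):
    """Join tags into a space-separated string, quoting multi-word tags."""
    formatted = []
    for tag in tags:
        tag = tag.strip()
        if not tag:
            continue  # skip empty entries rather than emit stray spaces
        formatted.append(f'"{tag}"' if " " in tag else tag)
    return " ".join(formatted)


def build_upload_fields(pattern_id, gene_id, gene_names, anatomy_terms, summary):
    """Assemble title, description, and tags for one expression pattern image."""
    # Link the public gene name back to its (hypothetical) gene page.
    gene_link = (
        f'<a href="{WORMBASE_URL}/gene/{escape(gene_id)}">'
        f"{escape(gene_names[0])}</a>"
    )
    description = f"{escape(summary)} Gene: {gene_link}"
    # Tags: expression pattern ID, unique gene ID, public names, anatomy terms.
    tags = [pattern_id, gene_id, *gene_names, *anatomy_terms]
    return {
        "title": f"{gene_names[0]} expression pattern {pattern_id}",
        "description": description,
        "tags": format_tags(tags),
    }
```

The returned dictionary only holds the text fields; actually sending the image requires an authenticated, signed request to Flickr's upload endpoint, which is left out of this sketch.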
Here’s an example image on Flickr.
I wasn’t happy with the current Perl interfaces to the Flickr API so I wrote my own (Flickr::API::Simple; note that I haven’t released this to CPAN yet and probably never will).
To pull the correct images, tags, and comments from Flickr, individual expression pattern pages issue a query to Flickr for images from the wormbase user with tags corresponding to either the expression pattern ID or the current gene being displayed. Information is displayed inline on the page but served from Flickr.
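As an illustration of that query, here is a minimal Python sketch that builds a `flickr.photos.search` request restricted to one user and matching either of two tags. It is not the Flickr::API::Simple code; the function name, the API key, the user ID, and the tag values are placeholders. The request parameters (`user_id`, `tags`, `tag_mode`, `format`, `nojsoncallback`) are standard ones from the Flickr REST API.

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_user_search_url(api_key, user_id, tags):
    """Build a search URL for one user's photos carrying any of the given tags."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "user_id": user_id,        # Flickr NSID of the account to search
        "tags": ",".join(tags),    # the search method takes comma-delimited tags
        "tag_mode": "any",         # match the pattern ID OR the gene ID
        "format": "json",
        "nojsoncallback": "1",     # plain JSON instead of a JSONP wrapper
    }
    return f"{FLICKR_REST_ENDPOINT}?{urlencode(params)}"


# Placeholder values, for illustration only.
url = build_user_search_url(
    "YOUR_API_KEY", "12345678@N00", ["Expr0000", "WBGene00000000"]
)
```

Fetching `url` with any HTTP client should return a JSON document listing the matching photos, which the page can then render inline.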
Since the goal of this project is to allow *anyone* with images to post them for display at WormBase, we needed an open account or group to do that. But since people are nuts, I wanted to constrain it a bit to people I know. Attribution is critical. To do this, I created an invite-only group on Flickr called, erm, WormBase.
If a user has an image that they would like to share on the WormBase site proper, all they need to do is:
* Upload the image to their account
* Post the image to the WormBase group on Flickr
* Tag the image with the unique gene ID
These images will automatically be displayed on WormBase Expression Pattern pages using the same mechanism as above: Expression Pattern pages search Flickr for images belonging to the WormBase group (instead of the user), tagged with the current gene.
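The group variant of the search might look like the following Python sketch. It swaps the user filter for the `group_id` parameter of `flickr.photos.search` and turns the JSON response into entries that keep the contributor's name and a link back to the photo's Flickr page, since attribution matters here. The function names and all IDs are hypothetical, and the real pages may handle the response quite differently.

```python
import json
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_group_search_url(api_key, group_id, gene_tag):
    """Build a search URL for photos in a group's pool tagged with a gene ID."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "group_id": group_id,      # restrict the search to the group's pool
        "tags": gene_tag,
        "extras": "owner_name",    # ask Flickr to include the contributor's name
        "format": "json",
        "nojsoncallback": "1",
    }
    return f"{FLICKR_REST_ENDPOINT}?{urlencode(params)}"


def photos_from_response(body):
    """Convert a photos.search JSON body into display-ready entries."""
    data = json.loads(body)
    if data.get("stat") != "ok":
        raise ValueError(data.get("message", "Flickr API error"))
    results = []
    for photo in data["photos"]["photo"]:
        results.append({
            "title": photo["title"],
            # Fall back to the owner's ID if the name was not returned.
            "contributor": photo.get("ownername", photo["owner"]),
            "image_url": (
                f"https://live.staticflickr.com/{photo['server']}/"
                f"{photo['id']}_{photo['secret']}.jpg"
            ),
            # Link back to the photo page on Flickr for attribution.
            "page_url": f"https://www.flickr.com/photos/{photo['owner']}/{photo['id']}",
        })
    return results
```

Keeping `page_url` alongside each image makes it straightforward to link every displayed image back to its Flickr page and its contributor.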
That’s it! A Poor Man’s Expression Pattern database with integration and cross links to a public genomics repository.
We get tagging, searching (clustered tag analysis), and social features like commenting and blog integration for (nearly) free. We don’t have to spend six months in development.
Cost: $24 a year for a Flickr Pro account. This gives 24 GB of storage. Ridiculous. No electricity costs. No sysadmin. No maintenance. $24, or 6 pints.
Time: about 2 hours of programming time to figure out the Flickr REST-like API. About 2 days of running time to upload images (I’m on a slow link).
-
Wonderful! Of course, you know what I'm going to ask for next, don't you...?
-
Brilliant!
-
Thanks all! This was a fun little diversion. I'm curious to see if anyone actually *adds* any images...
-
This is great. Makes you wonder if we could build quite complex data management systems using APIs to existing web services (Flickr, various Google tools, FriendFeed...) and aggregating to a portal.
-
This looks so cool and simple that I can't really grasp why nobody has done this before. Not sure what Flickr's TOS says about this but it's quite brilliant.
-
Cameron: happy to contribute! Ricardo: I think as long as images are linked back to Flickr, this is within the TOS. We'll see the next time someone decides to LWP::Simple their way through our expression patterns!
-
Neil: I've been working along the exact same lines, building resources for less well-characterized (and less well-funded) organisms. You need a foundation upon which to layer third party services. I'm using a genome browser with rudimentary annotations: gene models with stable identifiers. It's a great leg up if you have limited resources and presents some intriguing mashup possibilities for larger operations, too.
-
The question I was going to ask was about the license - could it be CC-BY by default? But yes, using Flickr for research image management is really a no-brainer - it would be great to wire up something more complex and automated. To a certain extent this is what Jean-Claude already does but without automated aggregation.
-
Ah, I see. I got my 2.0 communications lines crossed. I tend to think the licensing should be left up to the contributor. I'm not familiar with Jean-Claude's work. Linky?
-
@Todd: http://usefulchem.wikispaces.com/
-
Great idea

