Community annotation — by any name — still isn’t a part of the research process. It should be.

Bioinformatics people like to orate about "community annotation". I've never heard a biologist use the phrase. Therein lies the problem.

In order for community annotation efforts to succeed, they need to become part of the established research process: mine annotations, generate hypotheses, do experiments, write manuscripts, submit annotations. Rinse and repeat.

A few weeks ago, I posted the following tweet:

Bioinformatics people like to orate about "community annotation". I've never heard a biologist use the phrase. Therein lies the problem.

A few retweeters responded that in their particular realm of bioinformatics, community annotation was called “community curation” or a “jamboree” and they’ve had various degrees of success. Points taken and effort applauded.

The real essence of my tweet was that community annotation — regardless of what it is called — largely fails or is undertaken on a very small scale because it simply isn’t a priority for biologists.

For those working at the bench, community annotation doesn’t even make the long list of things to do: conducting experiments, writing manuscripts and grants, mentoring, sitting on committees, teaching. Contributing to community annotation efforts simply does not make the cut.

How might we fix this?

1. Top-down emphasis on the importance of community annotation.

Community annotation isn’t required by publishers or funding agencies except to the most minimal degree (e.g., submission of sequences). This needs to change. By making community annotation part of the process of doing research, the research itself will become more reproducible, more accessible to a broader audience, and more stable over time. It should be complementary to writing a manuscript.

Publishers benefit because extracted entities become markup targets to enhance their online product. Funding agencies benefit because having primary authors and domain experts submit annotations suits the mission of transparency and reproducibility and is presumably more efficient than third-party curation.

2. Better tools.

The tools for community annotation are embryonic and do not match the user experience people have come to expect in the Facebook / Pinterest / Instagram / Google Docs era. Bioinformatics teams need to begin employing user interface, user experience, and graphic design professionals to build friendlier, more efficient, and more beautiful tools to encourage participation.

3. Recognition.

Again, in an effort to encourage participation, we need to recognize the efforts of people who do contribute. This system must have professional currency to it, akin to writing a review paper, and should be citable for two reasons. First, it adds legitimacy to the contribution. It’s now part of the scientific record that can be extended by other researchers. Second, the primary contributor can now list the contribution on CVs and in tenure or job performance reviews.

Nanopublications and microattribution represent the most promising avenues for providing suitable recognition with scientific legitimacy that maps to the current academic and professional status quo.

Migrating to RDS: Converting MyISAM to InnoDB

If you want to leverage the RDS service on AWS, you’ll receive maximum benefit by converting MyISAM tables to InnoDB. Here’s a distillation of a useful approach outlined at Another woblag on the Interweb.


# Create a backup of your database
mysqldump -u USER -p MYSQLDB | gzip -c > /mnt/backups/mysqldb.sql.gz


# Log in to your mysql instance and write a .sql file of ALTER statements to convert tables in batch
mysql> select concat('ALTER TABLE `',table_schema,'`.`',table_name,'` ENGINE=InnoDB;') from information_schema.tables where table_schema='mydb' and ENGINE='MyISAM' into outfile '/tmp/InnoBatchConvert.sql';
mysql> quit
shell> mysql -u root -p < /tmp/InnoBatchConvert.sql


# Confirm tables have been converted to InnoDB
mysql> select table_name, engine from information_schema.tables where table_schema = 'mydb';
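
On some servers INTO OUTFILE is not an option, for example when secure_file_priv restricts where files can be written or your account lacks the FILE privilege. In that case the same ALTER statements can be built on the client side instead. What follows is only a rough sketch: gen_innodb_alters is a made-up helper name, and it assumes tab-separated "schema, table" lines on stdin, which is what the mysql client emits in batch mode.

```shell
# Hypothetical helper (not part of MySQL): reads tab-separated
# "schema<TAB>table" lines on stdin and prints one ALTER TABLE
# statement per line. Lines without exactly two fields are skipped.
gen_innodb_alters() {
  awk -F '\t' 'NF == 2 { printf "ALTER TABLE `%s`.`%s` ENGINE=InnoDB;\n", $1, $2 }'
}

# Possible usage, assuming a reachable server and a database named mydb:
#   mysql -u root -p -N -B -e "select table_schema, table_name \
#     from information_schema.tables \
#     where table_schema='mydb' and engine='MyISAM'" \
#     | gen_innodb_alters > InnoBatchConvert.sql
#   mysql -u root -p < InnoBatchConvert.sql
```

Writing the statements to a file first, rather than piping them straight back into mysql, leaves a chance to read through them before anything is altered.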

Central Serous Retinopathy: the new carpal tunnel for information workers

Is Central Serous Retinopathy (CSR) the new carpal tunnel for a generation of over-stressed and over-loaded information workers who spend far too many hours per day staring at screens of varying dimensions?

Central serous retinopathy (or choroidopathy) is essentially a delamination of the retina when cellular layers that normally serve as a fluid barrier between the choroid and the retina begin to leak. This introduces a bubble or blister of fluid underneath the retina. This results in blurred and dimmed vision.

Although CSR is idiopathic, it has been linked to chronic stress, defined biochemically as elevated serum cortisol levels. This finding is corroborated by an increased incidence of CSR in those with Cushing’s Syndrome (chronic overexposure to elevated levels of cortisol). Men are affected more often than women, with an age of onset between 20 and 50, averaging around 45.

I’ve been having progressively worse vision problems since December that I had attributed to floaters or sleep deprivation. Given the sad state of my own personal health care coverage as a self-employed worker and the prevalence of holidays and work deadlines around the turnover of a new year, I didn’t get around to checking this out until this week. After a standard eye exam, I was tentatively diagnosed with Central Serous Retinopathy (CSR), confirmed a few days later by fluorescein angiography.

My symptoms currently include a large purplish gray blotch almost dead-center in my field of vision; completely distorted visual acuity that’s not just blurry but makes straight lines look broken and covered with Adobe’s marching ants from using the lasso tool; micropsia (things appear smaller than they do with the unaffected eye); loss of several aspects of color perception; and, surprise, everything looks dim and desaturated.

I’m certainly not a high stress individual. I’m not Type-A; I don’t go around yelling at people. I am, however, a perfectionist, although I’ve softened in my old age. Now I’m satisfied if things are done as best as they possibly can be with the time and team available.

I do work hard and I work long hours and have been doing so for many years.

Here’s a brief outline of a typical day for me.

Wake up early, anytime between 3 and 4:30 am. Roll over and check the time on my phone. Check my email. Read about things I need to deal with and decide to just get up. Espresso. Since I’m a teleworker, lunch was almost always a working lunch at my desk. And without any seminars or Bits ‘n’ Nibbles to attend in the afternoon, I’d work straight through until 6, 7 or 8, with a full work day of 15, 16, or 17 hours. Multiply that by seven and I was typically logging close to 100 hours a week, each week, weekends and holidays inclusive.

So what am I doing to change? First off, I’m no longer tethered to my phone. If I’m not working, I’m not answering work emails. I’m waiting until I’m actually at my desk to start working. And I’m making every effort to reclaim my weekends and holidays by not working at all on them. And I’m keeping my fingers crossed that I retain my vision.

End of an era: The C. elegans genetic map is now frozen.

Nearly 50 years after Sydney Brenner’s letter to Max Perutz set the wheels in motion for the use of Caenorhabditis elegans as a potent genetic model system, leading eventually to six Nobel prizes and a global research community numbering in the thousands, a new threshold has been crossed.

Starting with the latest release of the C. elegans genome (WS232 in worm-speak), the genetic map is now FROZEN. Recombinational distances have changed very little over the last three years, a testament both to the fine granularity of the genetic map and, perhaps, to shifting tides in experimental approaches.

New mutations, deficiencies and rearrangements will still be placed on the map but simply assigned an interpolated genetic position.