Tuesday, July 20, 2010

Disordered ramblings

Of late I've become interested in so-called intrinsically disordered proteins (IDPs).* These are proteins that contain one or more "significant" regions of sequence that are unstructured. "Significant" can range from perhaps ten or so residues up to the entire protein. There are experimental data suggesting that the disordered regions in some of these are vital for function. It is generally thought that disordered regions important for function might undergo a folding reaction when bound by another protein, a nucleic acid, or even a small molecule.

I'm interested in IDPs** because two of my favorite proteins have disordered domains that are essential for function and that do undergo the kind of folding upon binding mentioned above. This makes these proteins more interesting to me intellectually (not that they would be boring without disorder), but it can also make them significantly more difficult to study than your garden-variety, well-folded, globular protein.

The IDP field is populated by large numbers of bioinformaticists (spawning my last post). There are also experimentalists and computational biologists (of the molecular simulation kind), but much of the initial driving force in creating this as a field appears to have come from the bioinformaticists. A small group of them.

Who are seriously overhyping the field.***

The hype is based largely on predictions of disorder. Predictions. Not much data. A prediction is just a pointer to something that might (or might not) be interesting. It's pretty much meaningless without experimental verification.

This is a problem. Yes, we all need to sell ourselves and our research. We all need to convince others that what we do is important and should receive gobs of funding $$$'s. But what you're selling has to have some connection to reality. A track record. Data.

Right now the IDP field has all the appearances of an infomercial for some kitchen gadget that is promised to mix, knead, puree, blend, chop, slice, dice, julienne, fry, roast, bake, boil, steam, load the dishwasher, sweep the floors, put the children to bed, and polish your shoes. Only believable in the wee hours of the morning after a long evening consuming copious quantities of the alcoholic beverage of your choice.

For now I'm keeping my credit card in my pocket.

* There are many, many recent reviews on the subject. This one is okay (and free).

** I seriously dislike the name "intrinsically disordered protein." For a start, the majority of the IDPs that have been identified are mostly well-structured and have only a fraction of their sequences disordered. I saw someone use the term "intrinsically disordered region." That's an improvement.

*** Case in point: the many, many reviews. Many, many of which were authored/co-authored by this guy. Dude, enough already. Go spend some time in the lab generating new data.

Monday, July 19, 2010

Data mining talks

As a molecular biophysicist I often hear talks (and see posters) given by bioinformaticists.* I am struck by how these are almost uniformly abysmal. I'm not necessarily referring to the data, but rather the presentation as a whole. This has reached the point where I don't think I can bring myself to sit through another bioinformatics talk (or poster presentation) for at least the next three months.

Why has the quality of the now dozens of such talks I've suffered through been so low?

In the majority of cases, I posit it's a combination, in varying degrees, of a lack of imagination and a disconnection from the underlying biology. Too many of these presenters regale their audiences with interminable laundry lists of how property X is over-represented in sequences of class A and under-represented in sequences of class B. Ummmm... So what? Why should I care? Often such presenters either don't know the relevant biology or are too lazy to spend the time connecting their data with it. As an example, I recently sat through a talk where the speaker made a big deal about the prevalence of glutamine-rich sequences in proteins involved in transcription. Not once did he refer to the fairly substantial body of experimental data on these very same sequences. In fact, when asked, he couldn't offer up any explanation for this observation.** Major fail.

I can't explain why this happens. Obviously it shouldn't. Perhaps it's a function of the relatively immature nature of bioinformatics as a field. It's still at a stage where method development trumps method application. Application of the intelligent kind.

I remember when macromolecular crystallography talks suffered from similar issues. They would often be long, detailed descriptions of the structure(s) just solved by the crystallographer. No connection to the biology, just the details of the structure. Listen, I don't give a rat's arse that there's a type VIIb turn between helices 7 and 8. What I want to know is what the structure tells us about the biology. Nowadays most crystallographers do make the connections. One can't get a grant for simply solving structures anymore.***

I've heard through the grapevine that getting a grant to do bioinformatics has become increasingly difficult. More so than would be expected from the downturn in science funding. Perhaps we'll see the field forced to mature more rapidly and presentations improve.

* By "bioinformatics" I mean the data-mining thing. A colleague once defined it thusly: "Bioinformatics is the mining of biological databases for profit (not necessarily of the monetary kind)." This is distinct from computational biology which, at least at the molecular level, tends to employ an energy function of sorts.

** Glutamine-rich regions can be involved in DNA binding: the glutamine side chain is quite good at making hydrogen bonds with nucleic acids.

*** Not when I'm reviewing the grant. :-)

Friday, July 09, 2010


Thanks to the grant-making powers that be it looks like I might be getting a new toy. Makes me feel all warm and fuzzy inside.

Now I need to start planning how to get one of these... Or an equivalent.

I like fluorescence. Can you tell?