Munging for Mundanes

“Data Munging for Non-Programming Biologists”, an article by Amir Karger and Eitan Rubin at perl.com, describes another attempt at finding a general solution to those thousand cuts of doom that keep stinging any hacker [*] down the hall from a non-programmer, whether that non-programmer is a scientist or a marketing guy. There’s all this data (DNA sequences, patent numbers, customer e-mails you need to spam…), and it just needs to be filtered a bit, sorted, re-arranged, combined with this other data in this file here… Sure, you can do it by hand, and plenty of non-programmers do. Or they try to use Excel (which isn’t bad, really), or Access. Or they ask the programmer for help, and he hesitates between writing some cunning one-liner (in Perl or a chain of clever Unix commands), writing a script he’ll only use once (OK, twice, because this is great, but, says the non-programmer, I just need one more change…), or wishing the non-programmer would, well, learn programming.

Karger built what he calls the scriptome, a sort of wizard or cookbook for assembling data-transforming “protocols” out of smaller operations (“atoms”), which the non-programmer is supposed to cut and paste into a command window. The comments on the article mention a tool called Sprog, a GUI program that tries to do something similar.
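
To make the atoms-and-protocols idea a bit more concrete, here is a rough sketch in Python. It is only my own illustration of the general shape, not how the scriptome actually works (that one is built from Perl snippets pasted into a command window). Every name in it (keep_if, sort_by, pick_columns, protocol) is made up for this example, and so is the sample data.

```python
from functools import reduce

# Each "atom" is a small function that takes a list of rows and returns a
# new list of rows. All names here are hypothetical, invented for this sketch.

def keep_if(predicate):
    """Atom: keep only the rows for which predicate(row) is true."""
    return lambda rows: [row for row in rows if predicate(row)]

def sort_by(column):
    """Atom: sort the rows by the value in the given column index."""
    return lambda rows: sorted(rows, key=lambda row: row[column])

def pick_columns(*columns):
    """Atom: keep only the listed column indexes, in the given order."""
    return lambda rows: [[row[c] for c in columns] for row in rows]

def protocol(*atoms):
    """Chain atoms left to right into a single transformation."""
    return lambda rows: reduce(lambda data, atom: atom(data), atoms, rows)

# Made-up sample data: sequence name, organism, length.
rows = [
    ["seq3", "mouse", 1200],
    ["seq1", "human", 450],
    ["seq2", "human", 980],
]

# A "protocol": human sequences only, shortest first, name and length only.
human_by_length = protocol(
    keep_if(lambda row: row[1] == "human"),
    sort_by(2),
    pick_columns(0, 2),
)

result = human_by_length(rows)
print(result)
```

The catch, of course, is that the non-programmer still has to know which atoms exist and in what order to chain them, which is pretty much the part a wizard or cookbook is trying to paper over.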

Somehow, I can’t help but feel there’s some conceptual barrier here that is holding programmers back from finding a really clever general solution for this problem. Or maybe it’s just this mirage that keeps slipping through our fingers, the idea of making what’s simple to a programmer so trivial it can be simple to a non-programmer too.


* – OK, when I say “any hacker”, I mean “me”. Because I’m a sucker for helping with stupid tasks that let me procrastinate on my actual work, and because my absurd faith in reason makes me believe that spending a silly amount of time automating something is superior to solving it with a few (OK, 132) copy-paste actions.