Categories
Software and Programming

Dictionary of Algorithms and Data Structures

The Dictionary of Algorithms and Data Structures not only explains all sorts of theoretical subjects in programming, it also lists implementations.
Found through a couple of skips after looking up Schwartzian Transform on Wikipedia, which Gaal mentioned.
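
For the curious, the Schwartzian Transform is the decorate-sort-undecorate trick: compute each sort key once, sort on the keys, then throw the keys away. Here is a minimal Python sketch of the idea (the name `schwartzian_sort` is made up for illustration; the original idiom is a Perl one-liner):

```python
def schwartzian_sort(items, key):
    """Sort items by key(item), computing each key only once."""
    # Decorate: pair every item with its key and its original position.
    decorated = [(key(item), index, item) for index, item in enumerate(items)]
    # Sort: tuples compare by key first; the index breaks ties, so the
    # items themselves are never compared and equal keys keep their order.
    decorated.sort()
    # Undecorate: drop the keys and indices again.
    return [item for _, _, item in decorated]


words = ["banana", "fig", "cherry", "kiwi"]
print(schwartzian_sort(words, len))  # ['fig', 'kiwi', 'banana', 'cherry']
```

In everyday Python you would just write `sorted(words, key=len)`, which does the same bookkeeping behind the scenes.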

Categories
Software and Programming

Munging for Mundanes

Data Munging for Non-Programming Biologists, an article by Amir Karger and Eitan Rubin at perl.com, describes another attempt at finding a general solution to those thousand cuts of doom that keep stinging any hacker [*] down the hall from a non-programmer, whether he’s a scientist or a marketing guy. There’s all this data (DNA sequences, patent numbers, customer e-mails you need to spam…), and it just needs to be filtered a bit, sorted, re-arranged, combined with this other data in this file here… sure, you can do it by hand, and plenty of non-programmers do. Or they try to use Excel (which isn’t bad, really), or Access, or they ask the programmer for help and he hesitates between doing some cunning one-liner (in Perl or a chain of clever Unix commands), writing a script he’ll only use once (OK, twice, because this is great, but, says the non-programmer, I just need one more change…), or wishing the non-programmer would, well, learn programming.

Karger built what he calls the scriptome, a sort of wizard or cookbook for assembling “protocols” for transforming data out of smaller operations (“atoms”), which the non-programmer is supposed to cut and paste into a command window. The comments mention a tool called Sprog, a GUI program that tries to do something similar.

Somehow, I can’t help but feel there’s some conceptual barrier here that is keeping programmers back from finding a really clever general solution for this problem. Or maybe it’s just this mirage that keeps slipping through our fingers, the idea of making what’s simple to a programmer so trivial it can be simple to a non-programmer too.


* – ok, when I say “any hacker”, I mean “me”. Because I’m a sucker for helping with stupid tasks which let me procrastinate on my actual work, and because my absurd faith in reason makes me believe that spending a silly amount of time automating something is superior to solving it with a few (OK, 132) copy-paste actions.

Categories
Software and Programming

addEvent (and other joys of Javascript)

QuirksMode is a great resource documenting the various, umm, quirks of how JavaScript is implemented in different browsers. A while ago, the site ran a contest inviting the public to write a better version of Scott Andrew’s classic addEvent script. The version they chose as a winner is short and sneaky, but the real gold is in the ensuing comments, where several clever people vie to improve on it and on each other’s versions.

In other Javascript news, here’s a story about someone who put a sneaky Javascript “virus” in his user profile on a social networking site called MySpace, which got thousands of users to unwittingly add him to their “friends” list and add the line “Samy is my Hero” to their profile. It’s worth reading how he did it if you’re ever involved in making a site that lets users add arbitrary HTML to their user pages.

Categories
Software and Programming

Uh… The F**k? (- 8)

Being an Israeli, you learn a whole lot more about how text is represented on computers than you really want to know. Or think about.

UTF-8 is a text format compatible with ASCII (what people who insist that computers can only speak English call “plain text”) that can represent text in many different languages, including Hebrew, Russian and European (by which I mean, it includes Hebrew and Cyrillic letters, as well as those funny accented characters that Germans and Scandinavians use).

However, for silly historical reasons, it is customary to insert a small piece of gibberish called a BOM (byte order mark) at the beginning of a UTF-8 file. You might possibly imagine some reason someone thought at the time that sticking it there might be a good idea, but generally, it’s a damn stupid one, because it messes up the whole “compatible with plain text” thing. Certain stupid programs, like, say, PHP (written in part by a couple of Israelis, so naturally its international text support is teh suck), choke on it.
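
To make the gibberish concrete, here is a small Python sketch of what the BOM looks like on disk and how one might strip it. The helper name `strip_bom` is hypothetical; `utf-8-sig` and `codecs.BOM_UTF8` are part of Python's standard library.

```python
import codecs

text = "shalom"

plain = text.encode("utf-8")         # no BOM: byte-for-byte identical to ASCII
with_bom = text.encode("utf-8-sig")  # same text, with the BOM glued on front

print(plain)     # b'shalom'
print(with_bom)  # b'\xef\xbb\xbfshalom'


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


print(strip_bom(with_bom) == plain)  # True
```

Those three extra bytes are exactly what a BOM-unaware program ends up treating as the first characters of the file.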

Now, my text editor of choice knows how to handle UTF-8; in fact, it offers 2 ways of encoding it, called “UTF-8” and “UTF-8 with cookie”.

So, quick quiz: If you wanted to save a UTF-8 file without a BOM, which would you choose? With cookie or without?

I thought that “cookie” might be some technical cute way of referring to this BOM thing. However, turns out that by “cookie” my editor means that it will try to guess if the file is UTF-8 automatically by reading the first line and seeing if it uses the words “coding” and “utf-8” together in some way (this is an XML convention).
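
As a rough illustration of that kind of guessing, here is a Python sketch. It is an assumption about the general idea, not the editor's actual code: the function name `has_utf8_cookie` and the exact pattern are invented, and it only accepts a first line where “coding” is followed by `:` or `=` and then “utf-8”.

```python
import re

# Hypothetical pattern: "coding", then ":" or "=", optional whitespace,
# an optional quote, then "utf-8" (or "utf8"), ignoring case.
COOKIE = re.compile(r"""coding[:=]\s*["']?utf-?8""", re.IGNORECASE)


def has_utf8_cookie(text: str) -> bool:
    """Guess whether the first line of text declares itself as UTF-8."""
    first_line = text.split("\n", 1)[0]
    return COOKIE.search(first_line) is not None


print(has_utf8_cookie('<?xml version="1.0" encoding="UTF-8"?>\n<doc/>'))  # True
print(has_utf8_cookie("# -*- coding: utf-8 -*-\nprint('hi')"))            # True
print(has_utf8_cookie("Just some plain text"))                            # False
```

Note that none of this has anything to do with the BOM, which is rather the point of the confusion.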

Apparently, lots of other people don’t know from BOMs either: they just know that, if they use UTF-8, they can write French (or, with a little more difficulty because of directionality, Hebrew) text in their “plain text” files. This guy uses BBEdit, which actually offers “UTF-8” and “UTF-8, without BOM”, and he still got confused.

Ugh.

Categories
short Software and Programming

A free flash gallery viewer

SimpleViewer is a free Flash-based image gallery viewer that looks both pretty and configurable. Now all I need is a gallery to try it out on.