Sterling on Assange

Bruce Sterling’s analysis of Wikileaks is long, engaging, and depressing.

The cables that Assange leaked have, to date, generally revealed rather eloquent, linguistically gifted American functionaries with a keen sensitivity to the feelings of aliens. So it’s no wonder they were of dwindling relevance and their political masters paid no attention to their counsels. You don’t have to be a citizen of this wracked and threadbare superpower — (you might, for instance, be from New Zealand) — in order to sense the pervasive melancholy of an empire in decline. There’s a House of Usher feeling there. Too many prematurely buried bodies…. This knotty situation is not gonna “blow over,” because it’s been building since 1993 and maybe even 1947. “Transparency” and “discretion” are virtues, but they are virtues that clash. The international order and the global Internet are not best pals. They never were, and now that’s obvious.

Read the whole piece and ponder how we’ve been falling into decline and denial simultaneously for so many years. Wikileaks is like a stiff wind against a house of cards. Let’s hope for a better deal next shuffle.

Iraq 2006: a bag of words

How to make sense of Wikileaks data? One way is visual analysis, as we see here, via Jonathan Stray of Associated Press:


Stray and Julian Burgess created a visualization using data from the December 2006 Iraq Significant Action (SIGACT) reports from Wikileaks. That was the bloodiest month of the war, and the central (blue) point on the visualization represents homicides, i.e., clusters of reports that are “criminal events” and include the word “corpse.” These merge into green “enemy action” reports, and at the interface we have “civ, killed, shot,” civilians killed in battle. Stray tells how this was done, with some interesting notes, e.g.

…by turning each document into a list of numbers, the order of the words is lost. Once we crunch the text in this way, “the insurgents fired on the civilians” and “the civilians fired on the insurgents” are indistinguishable. Both will appear in the same cluster. This is why a vector of TF-IDF numbers is called a “bag of words” model; it’s as if we cut out all the individual words and put them in a bag, losing their relationships before further processing.

As a result, he warns that “any visualization based on a bag-of-words model cannot show distinctions that depend on word order.” (Much more explanation and detail in Stray’s original post; if you’re interested in data visualization and its relevance to the future of journalism, be sure to read it.)
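To make the word-order point concrete, here is a minimal sketch in Python of a bag-of-words TF-IDF vectorizer. It is not Stray's code: the function names, the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration, and the third sample sentence is made up. It only shows that the two sentences from the quote end up with identical vectors.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, for illustration only.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return (vocab, vectors), one TF-IDF vector per document.

    Uses one common TF-IDF variant (relative term frequency times a
    smoothed inverse document frequency); other weightings exist.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(w for toks in tokenized for w in set(toks))

    vectors = []
    for toks in tokenized:
        counts = Counter(toks)  # word order is discarded at this step
        vec = []
        for w in vocab:
            tf = counts[w] / len(toks)
            idf = math.log((1 + n_docs) / (1 + doc_freq[w])) + 1
            vec.append(tf * idf)
        vectors.append(vec)
    return vocab, vectors


if __name__ == "__main__":
    docs = [
        "the insurgents fired on the civilians",
        "the civilians fired on the insurgents",
        "corpse found near checkpoint",  # made-up third document
    ]
    vocab, vecs = tf_idf_vectors(docs)
    print(vecs[0] == vecs[1])  # same words, different order
    print(vecs[0] == vecs[2])  # different words
```

Because each vector is built from word counts alone, any clustering or layout computed from these vectors has no way to tell the first two sentences apart.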

Thanks to Charles Knickerbocker for pointing out the Stray post.

Advocating for the Open Internet

“Net neutrality” and “freedom to connect” might be loaded or vague terms; the label “Open Internet” is clearer, more effective, and in no way misleading. A group of Internet experts and pioneers submitted a paper to the FCC that defines the Open Internet and explains how it differs from networks that are dedicated to specialized services, and why that distinction is important. The Open Internet is a general purpose network for all, and it can’t be appreciated (or properly regulated) unless this point and its implications are well understood. I signed on (late) to the paper, which is freely available at Scribd, and which is worth reading and disseminating even among people who don’t completely get it. I think the meaning and relevance of the distinction will sink in, even with those who don’t have deep knowledge of the Internet and, more generally, computer networking. The key point is that “the Internet should be delineated from specialized services specifically based on whether network providers treat the transmission of packets in special ways according to the applications those packets support. Transmitting packets without regard for application, in a best efforts manner, is at the very core of how the Internet provides a general purpose platform that is open and conducive to innovation by all end users.”