
PD.com: Taoism in a clown costume.


ATTN 000: information theory nerdspaggery

Started by Rococo Modem Basilisk, March 26, 2010, 08:18:54 PM


Rococo Modem Basilisk

I wrote a short Python script that builds a first-order Markov model of a document (tokenizing on whitespace). After each token it uses the equation i = log2((a_n * b_(n-1)) / (a_(n-1) * b_n)) to measure the change in information from one token to the next, across all the token pairs in the model.

I ran it on the first 4539 words of the Phrack archives and used Google Docs to graph it:


Edit: Whoops! I forgot to mention: I model the Markov chain in terms of ratios a:b, where a is the frequency of a given pair and b is the frequency of all pairs with the same first token. a_n is the frequency of the pair after the current occurrence has been added, and a_(n-1) is its frequency before it was added (likewise b_n and b_(n-1)).
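For anyone who wants to play along at home, here is a minimal sketch of the kind of script described above: whitespace tokenization, running pair and first-token counts, and the i = log2((a_n * b_(n-1)) / (a_(n-1) * b_n)) delta after each pair. All the names here are mine, not the original script's, and I punt on the first sighting of a pair (where a_(n-1) = 0 makes the ratio undefined):

```python
import math
from collections import Counter

def information_deltas(text):
    """For each consecutive token pair, report the change in information
    i = log2((a_n * b_(n-1)) / (a_(n-1) * b_n)) caused by adding that pair,
    where a counts the pair and b counts all pairs sharing its first token."""
    tokens = text.split()          # tokenize by whitespace
    pair_counts = Counter()        # a: frequency of each (prev, cur) pair
    prefix_counts = Counter()      # b: frequency of all pairs starting with prev
    deltas = []
    for prev, cur in zip(tokens, tokens[1:]):
        a_before = pair_counts[(prev, cur)]
        b_before = prefix_counts[prev]
        pair_counts[(prev, cur)] += 1   # add the pair to the model
        prefix_counts[prev] += 1
        a_after = pair_counts[(prev, cur)]
        b_after = prefix_counts[prev]
        if a_before == 0:
            deltas.append(None)    # first sighting: ratio is undefined
        else:
            deltas.append(math.log2((a_after * b_before) / (a_before * b_after)))
    return deltas
```

On a fully repetitive input like "x y x y x y", every pair after its first sighting contributes zero, since a and b grow in lockstep; the interesting wiggles on real text come from pairs whose share of their prefix shifts as the model fills in.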

I can post the code up in a bit.
Edit: here it is


I am not "full of hate" as if I were some passive container. I am a generator of hate, and my rage is a renewable resource, like sunshine.

Triple Zero

I saw you post that image on Twitter and wondered what it was about.

I have alcohol in my head right now; I will check this out later.
Ex-Soviet Bloc Sexual Attack Swede of Tomorrow™
e-prime disclaimer: let it seem fairly unclear I understand the apparent subjectivity of the above statements. maybe.

INFORMATION SO POWERFUL, YOU ACTUALLY NEED LESS.