R, lists

I hate this language

example <- list(list_element = list(c("random string")))

Possible ways of setting list_element:

example$list_element <- "whatever"
example['list_element'] <- "whatever"
example[[1]] <- "whatever"

Fine, right? Except these do different things. Options 1 and 3 really don't care what list_element…


Twitter, paper.li, grumpasaurus rex

Has anyone actually used paper.li?

In all of history. Not to write something, but to read something. Anyone? I keep getting infuriatingly spammy Twitter mentions from some goofball who has decided to write an [R/Wikipedia/data analysis/people-who-cuss] "newsletter" on that site. I've never actually seen anyone use it, though, and I've never had…


R, Star Wars, EARL2015

News and notes

    It looks like I'll be talking here in Boston at EARL 2015. The best bit about this writeup on Ballghazi is the sentence opening "Based on the assumption that fumbles per play follow a normal distribution". I don't know about anyone else, but I've never seen an actual, real-world phenomenon…


    R, urltools, packages

    Announcing urltools 1.0.0

    Stepping aside from driving useRs crazy with grumpiness, for a second: Back in December I released urltools, a library for handling URLs within R. While R theoretically has a URL encoder and decoder, these are in practise neither vectorised nor particularly efficient, and are actively buggy. urltools improved on this…



    The R statistics package is a free software tool designed by sadistic psychology professors to find the exact point at which PhD students’ minds collapse under stress. It is so difficult to master that the authors of the software had to invent a super-intelligent robot, Hadley Wickham, to understand exactly…


    Don't use the mailing lists

    I love the R community deeply. That's the first thing I should say. Well, really, the first thing I should say is that people I cite, quote and praise in this essay do not necessarily endorse it - and in many cases don't know that it's being written. It's slightly…


    statistics, gender gap


    Heading off to SF shortly, so scrubbing my machines:

    My coworker Ellery, who works with our fundraising team, has some interesting thoughts on A/B testing.

    Max and Magnus's work on the gender gap.

    Randy Zwitch has an interesting demonstrator of using SQL windowing to sessionise events within the database…


    sessions, Human-Computer Interaction, academia

    WWW 2015

    A paper Aaron and I wrote with a set of GroupLens collaborators has just been accepted to WWW 2015. Looking forward to seeing some of you in Florence in May! Abstract: Session identification is a common strategy used to develop metrics for web analytics and behavioral analyses of user-facing systems…


    The torture of learning new things

    This quarter my focus at work has been on improving my technical chops. Whether it's C++ (woo!) or Java (woo?) or Maven (BOO), we have a lot of toolkits and, with 125k requests a second, a definite need for speed, so I've been trying to branch out and understand these…


    devlog (29-12-14 - 04-01-15)

      I finally got that webrequest class in. Every time I have to use Maven, I die a little. More work on mwutils, including a basic template parser. Java is making me like C++ less, amusingly. Some work on a session analysis MapReduce pile. Patched Puppet to stop us from creating…
