So I saw this blog post, in which Pretty Famous ranked every one of Brad Pitt’s movies (I have no idea how I came across it, I’m not particularly a movie buff or a fan of Señor Pitt, but anyway). Then I wondered how easy/hard it would be to do something like that in R. Pretty Famous used a few sources, but here I’m going to stick to Rotten Tomatoes, since it’s a pretty well-known movie ratings site, maybe the most well-known.
Here are a couple of little tips and tricks that I’ve picked up for use with RMarkdown html documents (including presentations and notebooks). This post is aimed at the R user who doesn’t know much, if anything, about html and css.

Background images

Sometimes it’s useful (or just nice) to have a background image of some sort in a presentation or notebook. This could be the logo of your university or company, for example.
I loved this R script from hdugan when I first saw it a while ago. The script makes a 2-page pdf of all the colors available in R, using R. Nice. The other day, I thought about making a tidyverse version of it, using dplyr to get the data ready and ggplot2 to visualize it. I won’t for a second pretend that this code is as short and tidy as the original, and in fact it may be a good example of when base R can be really useful, but anyway here it is.
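As a rough idea of what a tidyverse take on this can look like, here is a minimal sketch (not hdugan's script, and not necessarily the code from the full post). It assumes a single-page grid rather than the 2-page layout of the original; the grid width `n_cols` and the column names `idx`, `col` and `row` are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Assumed grid width, chosen only for illustration
n_cols <- 12

# One row per named colour, each assigned to a cell in the grid
colour_grid <- tibble(name = colors()) %>%
  mutate(
    idx = row_number() - 1,   # zero-based position in colors()
    col = idx %% n_cols,      # column of the grid cell
    row = idx %/% n_cols      # row of the grid cell
  )

# Draw each cell filled with the colour it names
p <- ggplot(colour_grid, aes(x = col, y = row, fill = name)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = name), size = 1.5) +
  scale_fill_identity() +     # use the colour names as literal fills
  scale_y_reverse() +         # first colours at the top of the page
  theme_void()
```

Something like `ggsave("r_colours.pdf", p, width = 11, height = 16)` would then write the grid to a pdf. The labels are all drawn in black here, so they are hard to read on the darker tiles.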
The Irish radio station Newstalk published this video the other day, in which director and actor Terry McMahon spoke out against the austerity programme running in Ireland since the aftermath of the financial crisis in 2008. Leaving aside his conflation of any type of business activity with immorality, McMahon claimed that “austerity is murder” and detailed some alarming facts about suicide numbers in Ireland, clearly linking the two (i.e., austerity = more suicide).
Since it’s European Statistics Day, I thought I would make a quick post showing how to utilise some of the data that we have on the European Union in R. In particular, I will use European Parliament voting data from Simon Hix’s website. The data is freely available, so by copying and pasting the code below, you will be able to recreate the analysis I’ve done here. We’re going to be using Stan to make theme-specific ideal points for members of the European Parliament.
For those interested in Brazilian politics, there’s a great new package called electionsBR (those who understand Portuguese can find a post on it here). This package takes data from the Tribunal Superior Eleitoral and makes it available in a tidy format for users of R. Given my recent obsession with map-making, I think it’s only natural I’d want to make maps of Brazil with this package. So, what can we do with it?
5/2/2017 Update: it seems something is broken in the scripts to run this analysis. I’ll fix it asap.

The Economist is well known for its graphs and images, and I personally like them a lot. I was doing some work on Brexit when I spied the image above, and thought how much I would like to make something similar. Since my go-to environment is R, and its go-to plotting package is ggplot2, I thought I’d try to recreate the image using these tools.
R is actually great for working with spatial data (for example, see here and here for fantastic graphs and maps made with R). However, you often need data that is actually spatial to get started! What do you do if you have an image (a map, let’s say) that is not geo-referenced in any way? The regular answer to this problem is to use software such as QGIS to manually enter GPS coordinates, with the help of Google Maps or something similar.
In an earlier post, I described some ways in which you can interact with a web browser using R and RSelenium. This is ideal when you need to access data through drop-down menus and search bars. However, working with RSelenium can be tricky. There are, of course, easier ways to get information from the internet using R. Perhaps the most straightforward way is to use rvest, in tandem with other packages of the Hadleyverse, such as dplyr and tidyr for data preparation and cleaning after the webscrape.
The code below on Stan is also available as an RPub webpage, if you’d rather work through the examples than read all of the post. One of the first areas where Bayesian modelling gained an entry point into the social sciences (and in particular political science) was in the area of legislator ideal points, with the use of the Item-Response Theory (IRT) models from the educational testing literature in psychology. This topic proved to be the perfect subject for the comparison of Bayesian and frequentist methods, since ideal point creation usually depends on nominal voting data, which may contain a lot of missing data (legislators who miss votes or abstain) and a huge number of parameters (hundreds of roll-calls by hundreds of legislators).