Robert McDonnell

This Guardian article on the prospective French President Macron got me thinking, especially this passage:

France’s economic performance in recent years has been underwhelming, especially when compared to that of Germany. Fifteen years ago, the eurozone’s two biggest countries enjoyed comparable living standards. Today, Germany’s are almost a fifth higher than those in France. Likewise, at the time when euro notes and coins were introduced in 2002, French and German unemployment rates were both around 8%. Today, Germany’s unemployment rate has dropped below 4% while French unemployment is close to 10%.

Youth unemployment is a particular problem. Almost one in four of those aged under 25 are out of work, a much higher rate than in Germany. Hollande has had some success increasing the number of people in training and he forced through legislation last year making hiring and firing easier. But more than 85% of employment growth last year was for temporary jobs, and the vast majority of those hired were on contracts of less than a month.

I was interested in visualizing this quickly in R, so I grabbed the data from Eurostat. After a little cleaning, we can plot the figures using ggplot2.

library(tidyverse)
fg <- read_csv("ilc_di03/ilc_di03_1_Data.csv")
une <- read_csv("une_rt_a/une_rt_a_1_Data.csv")

Let’s take a look at these:

head(fg)
## # A tibble: 6 × 7
##    TIME                                              GEO   AGE   SEX
##   <int>                                            <chr> <chr> <chr>
## 1  1995 Germany (until 1990 former territory of the FRG) Total Total
## 2  1996 Germany (until 1990 former territory of the FRG) Total Total
## 3  1997 Germany (until 1990 former territory of the FRG) Total Total
## 4  1998 Germany (until 1990 former territory of the FRG) Total Total
## 5  1999 Germany (until 1990 former territory of the FRG) Total Total
## 6  2000 Germany (until 1990 former territory of the FRG) Total Total
## # ... with 3 more variables: INDIC_IL <chr>, UNIT <chr>, Value <chr>
head(une)
## # A tibble: 6 × 6
##    TIME                                              GEO   AGE
##   <int>                                            <chr> <chr>
## 1  1995 Germany (until 1990 former territory of the FRG) Total
## 2  1996 Germany (until 1990 former territory of the FRG) Total
## 3  1997 Germany (until 1990 former territory of the FRG) Total
## 4  1998 Germany (until 1990 former territory of the FRG) Total
## 5  1999 Germany (until 1990 former territory of the FRG) Total
## 6  2000 Germany (until 1990 former territory of the FRG) Total
## # ... with 3 more variables: UNIT <chr>, SEX <chr>, Value <dbl>

So let’s quickly tidy the data up and keep only what we need. There are no Value observations for 2016, so we remove that year. The value for 2003 is missing for France, as are 2003 & 2004 for Germany, so here we can take an average of the values the year before and the year after and impute these. After that, we just tidy up the way ‘Germany’ is entered, change Value to numeric and TIME to a date-time, and then select only what we want, renaming in the process. (I do something similar for the unemployment data.)

fg <- fg %>% 
  filter(TIME != 2016) %>% 
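  # Impute each missing value as the mean of the neighbouring years
  mutate(Value = ifelse(
    TIME == 2003 & GEO == "France", (14889 + 15242)/2, Value), 
         Value =  ifelse(
           # Both missing German years (2003 and 2004) get the same midpoint
           TIME %in% c(2003, 2004) & GEO == "Germany (until 1990 former territory of the FRG)", (15758 + 16393)/2, 
           Value)) %>% 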
  mutate(GEO = gsub("\\(until 1990 former territory of the FRG\\)", "", GEO),
         Value = gsub(",", "", Value),
         Value = as.numeric(Value),
         TIME = paste0(TIME, "-01-01"),
         TIME = lubridate::parse_date_time(TIME, "Ymd")) %>% 
  select(date = TIME, country = GEO, `Median Income`= Value)
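
The midpoints above are typed in by hand. As a hedged alternative, here is a minimal sketch that fills gaps by linear interpolation with base R's `approx()`. The data frame `income` and its numbers are made up for illustration. For a single missing year this gives the same result as averaging the neighbouring years; for a two-year gap it gives two distinct values rather than one shared midpoint, so it is not identical to what I did for Germany.

```r
# Toy yearly series with gaps; the numbers are made up for illustration
income <- data.frame(
  year  = 2001:2007,
  value = c(100, 110, NA, 130, NA, NA, 190)
)

# Interpolate linearly between the observed points only
known <- !is.na(income$value)
income$filled <- approx(
  x    = income$year[known],
  y    = income$value[known],
  xout = income$year
)$y

income
```

The observed values are left untouched; only the `NA` positions receive interpolated values.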

Now these data are easier to work with:

head(fg)
## # A tibble: 6 × 3
##         date  country `Median Income`
##       <dttm>    <chr>           <dbl>
## 1 1995-01-01 Germany            13439
## 2 1996-01-01 Germany            14524
## 3 1997-01-01 Germany            14769
## 4 1998-01-01 Germany            14393
## 5 1999-01-01 Germany            14603
## 6 2000-01-01 Germany            15340
head(une)
## # A tibble: 6 × 3
##         date  country `Unempl. Rate`
##       <dttm>    <chr>          <dbl>
## 1 1995-01-01 Germany             8.2
## 2 1996-01-01 Germany             8.9
## 3 1997-01-01 Germany             9.6
## 4 1998-01-01 Germany             9.4
## 5 1999-01-01 Germany             8.6
## 6 2000-01-01 Germany             7.9
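
The cleaning code for the unemployment data isn't shown above. A minimal sketch of what a similar step might look like follows, run on a small stand-in table rather than the real file. The name `une_raw`, the `UNIT` label and the two France values are assumptions invented for illustration; only the column names and the two German values come from the output above.

```r
library(dplyr)

# Stand-in for the raw Eurostat table (column names as in head(une) above;
# the UNIT label and the France values are invented for illustration)
une_raw <- tibble(
  TIME  = c(1995L, 1996L, 1995L, 1996L),
  GEO   = c(rep("Germany (until 1990 former territory of the FRG)", 2),
            rep("France", 2)),
  AGE   = "Total",
  UNIT  = "Percentage of active population",
  SEX   = "Total",
  Value = c(8.2, 8.9, 10.2, 10.5)
)

une_clean <- une_raw %>%
  mutate(
    # Drop the parenthetical, then any whitespace left around the name
    GEO  = trimws(gsub("\\(until 1990 former territory of the FRG\\)", "", GEO)),
    # Represent each year as 1 January of that year
    TIME = as.POSIXct(paste0(TIME, "-01-01"), tz = "UTC")
  ) %>%
  select(date = TIME, country = GEO, `Unempl. Rate` = Value)

une_clean
```

Note that this sketch trims the space left behind by `gsub()`, so the country comes out as "Germany" with no trailing space.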

Let’s get our Euro-France-Germany-themed plots going. France has the colours blue (#013896), white (#ffffff), and red (#cf142b). Germany has black (#000000), yellow (#ffce00), and red (#dd0000).

ggplot(fg, aes(x = date, y = `Median Income`, colour = country)) +
  geom_line() +
  theme_classic() + 
  scale_colour_manual(values = c("#013896", "black")) +
  labs(title = "Median Income, France & Germany", colour = "",
       subtitle = "1995-2015", caption = "Data: Eurostat") +
  theme(title = element_text(colour = "#cf142b"), 
        axis.line = element_line(colour = "#ffce00"))

ggplot(une, aes(x = date, y = `Unempl. Rate`, colour = country)) +
  geom_line() +
  theme_classic() +
  scale_y_continuous(limits = c(0, 13)) +
  scale_colour_manual(values = c("#cf142b", "#000000")) +
  labs(title = "Unemployment Rate, France & Germany", colour = "",
       subtitle = "1995-2015", caption = "Data: Eurostat") +
  theme(title = element_text(colour = "#013896"), 
        axis.line = element_line(colour = "#ffce00"))
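
Both plots pass an unnamed colour vector, so the colours are matched to countries by the order of the levels (alphabetical here). A minimal sketch of a more explicit option is a named vector, where each name must match a value in the `country` column exactly, including any stray whitespace. The data frame `toy` and its values are invented; only the structure matters.

```r
library(ggplot2)

# Toy data; the values are invented and only the structure matters
toy <- data.frame(
  date    = as.Date(c("2014-01-01", "2015-01-01", "2014-01-01", "2015-01-01")),
  country = c("France", "France", "Germany", "Germany"),
  rate    = c(10.3, 10.4, 5.0, 4.6)
)

# Names tie each colour to a country, whatever order they are listed in
flag_colours <- c(Germany = "#000000", France = "#013896")

p <- ggplot(toy, aes(x = date, y = rate, colour = country)) +
  geom_line() +
  theme_classic() +
  scale_colour_manual(values = flag_colours)

p
```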

Ok, these colours aren’t amazing together. But at least the unemployment figures show the differences between the two countries that the Guardian article mentioned. Still, median income actually favours France, although it’s not exactly clear what the Guardian means by “living standards”; I’m sure we could explore this with many more indicators.
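
To put a number on a claim like "almost a fifth higher", one option is to compute the percentage gap between the two countries for each year. The sketch below does this in base R on invented figures; `income_toy` and its values are assumptions for illustration, not Eurostat data.

```r
# Toy median incomes in euros; invented numbers, not Eurostat values
income_toy <- data.frame(
  year    = rep(2013:2015, times = 2),
  country = rep(c("France", "Germany"), each = 3),
  median  = c(20000, 20500, 21000, 19000, 19800, 20600)
)

# One row per year, one column per country
wide <- reshape(income_toy, idvar = "year", timevar = "country",
                direction = "wide")

# Positive values mean Germany is ahead; negative values mean France is
wide$gap_pct <- 100 * (wide$median.Germany / wide$median.France - 1)

wide[, c("year", "gap_pct")]
```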

I wonder how Facebook’s prophet package would predict the near future for the unemployment rate? [1]

library(prophet)

france <- une %>% 
  filter(country == "France") %>% 
  select(ds = date, y = `Unempl. Rate`)

germany <- une %>% 
  filter(trimws(country) == "Germany") %>%  # gsub() left a trailing space
  select(ds = date, y = `Unempl. Rate`)

fr <- prophet(france)
## Initial log joint probability = -2.17682
## Optimization terminated normally: 
##   Convergence detected: absolute parameter change was below tolerance
future_france <- make_future_dataframe(fr, periods = 1, freq = "year")
forecast_france <- predict(fr, future_france)

ge <- prophet(germany)
## Initial log joint probability = -2.40272
## Optimization terminated normally: 
##   Convergence detected: absolute parameter change was below tolerance
future_germany <- make_future_dataframe(ge, periods = 1, freq = "year")
forecast_germany <- predict(ge, future_germany)
plot(fr, forecast_france)

Ok, France’s is quite unpredictable using these sparse data. Germany’s, however, shows a clear downward trend:

plot(ge, forecast_germany)
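
As a rough cross-check on a trend like this, a plain straight-line fit can be extrapolated one year ahead. This is a deliberately crude stand-in, not what prophet does internally. The sketch uses only the six German values printed by `head(une)` earlier, so the result says nothing about the full 1995-2015 series; the name `germany_toy` is made up for illustration.

```r
# Germany's unemployment rate for the six years printed by head(une) above
germany_toy <- data.frame(
  year = 1995:2000,
  rate = c(8.2, 8.9, 9.6, 9.4, 8.6, 7.9)
)

# Straight-line trend; a much cruder model than prophet's
fit <- lm(rate ~ year, data = germany_toy)

# Extrapolate one year past the end of the toy series,
# with a prediction interval around the point estimate
next_year <- data.frame(year = 2001)
pred <- predict(fit, newdata = next_year, interval = "prediction")

pred
```

With so few points the prediction interval is wide, which is the same sparsity problem that makes the French forecast hard to read.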

I wonder how much Macron will change all this? Given the ingrained statism of France, I doubt it will be a lot.


  1. Super unsophisticated analysis, I know. The data here are yearly; quarterly or monthly data would be far better.
