This is a data package with an excerpt from the Gapminder data. The main
object in this package is the `gapminder` data frame or “tibble”. There
are other goodies, such as the data in tab-delimited form, a larger
unfiltered dataset, premade color schemes for the countries and
continents, and ISO 3166-1 country codes.
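As a minimal sketch of how the tab-delimited copy might be read back in: the snippet below assumes the file is installed as `extdata/gapminder.tsv` inside the package, which is an assumption about the installed layout rather than something stated here. `system.file()` returns an empty string when the file is not found, so the read is guarded.

```r
library(gapminder)

# Assumption: the tab-delimited copy is installed as extdata/gapminder.tsv.
# system.file() returns "" if no such file exists in the installed package.
tsv_path <- system.file("extdata", "gapminder.tsv", package = "gapminder")

if (nzchar(tsv_path)) {
  # read.delim() reads tab-separated text with a header row by default.
  gap_tsv <- read.delim(tsv_path, stringsAsFactors = FALSE)
  str(gap_tsv)
} else {
  message("Tab-delimited file not found at the assumed location.")
}
```

Reading the file this way gives a plain data frame with character columns for the country and continent, which can be handy when demonstrating data import.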
The `gapminder` and `gapminder_unfiltered` data frames include six
variables (see the Gapminder.org documentation page):
variable | meaning |
---|---|
country | |
continent | |
year | |
lifeExp | life expectancy at birth |
pop | total population |
gdpPercap | per-capita GDP |
Per-capita GDP (Gross domestic product) is given in units of international dollars, “a hypothetical unit of currency that has the same purchasing power parity that the U.S. dollar had in the United States at a given point in time” – 2005, in this case.
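Because `gdpPercap` is a per-person figure, a country's total GDP can be recovered by multiplying it by `pop`. The sketch below does this for 2007 using base R; the `gdp` column and the `gap_2007` name are introduced here purely for illustration and are not part of the package.

```r
library(gapminder)

# Keep only the 2007 observations.
gap_2007 <- subset(gapminder, year == 2007)

# Total GDP in international dollars = population * per-capita GDP.
# `gdp` is a hypothetical column added for this example.
gap_2007$gdp <- gap_2007$pop * gap_2007$gdpPercap

# Show the three largest economies by this measure.
head(gap_2007[order(-gap_2007$gdp), c("country", "gdp")], 3)
```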
The package contains two main data frames or tibbles:
- `gapminder`: 12 rows for each country (1952, 1957, …, 2007). It’s a subset of …
- `gapminder_unfiltered`: more lightly filtered and therefore about twice as many rows.

Note: this package exists for the purpose of teaching and making code examples. It is an excerpt of data found in specific spreadsheets on Gapminder.org circa 2010. It is not a definitive source of socioeconomic data and I don’t update it. Use other data sources if it’s important to have the current best estimate of these statistics.
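One quick way to see how the two data frames differ is to compare their sizes and count the observations per country. This is only a sketch using base R; `rows_per_country` is a name chosen for this example.

```r
library(gapminder)

# Overall size of each data frame.
nrow(gapminder)
nrow(gapminder_unfiltered)

# Number of observations per country in the filtered data.
rows_per_country <- table(gapminder$country)

# Tabulating the counts shows whether every country has the same number of rows.
table(rows_per_country)
```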
Install `gapminder` from CRAN:
install.packages("gapminder")
Load it and test drive with some data aggregation and plotting:
library(gapminder)
library(dplyr)
library(ggplot2)
aggregate(lifeExp ~ continent, gapminder, median)
#> continent lifeExp
#> 1 Africa 47.7920
#> 2 Americas 67.0480
#> 3 Asia 61.7915
#> 4 Europe 72.2410
#> 5 Oceania 73.6650
gapminder %>%
  filter(year == 2007) %>%
  group_by(continent) %>%
  summarise(lifeExp = median(lifeExp))
#> # A tibble: 5 × 2
#> continent lifeExp
#> <fct> <dbl>
#> 1 Africa 52.9
#> 2 Americas 72.9
#> 3 Asia 72.4
#> 4 Europe 78.6
#> 5 Oceania 80.7
ggplot(gapminder, aes(x = continent, y = lifeExp)) +
  geom_boxplot(outlier.colour = "hotpink") +
  geom_jitter(position = position_jitter(width = 0.1, height = 0), alpha = 1 / 4)
`country_colors` and `continent_colors` are provided as character vectors
where elements are hex colors and the names are countries or continents.
head(country_colors, 4)
#> Nigeria Egypt Ethiopia Congo, Dem. Rep.
#> "#7F3B08" "#833D07" "#873F07" "#8B4107"
head(continent_colors)
#> Africa Americas Asia Europe Oceania
#> "#7F3B08" "#A50026" "#40004B" "#276419" "#313695"
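Because these are named character vectors, they can be passed directly as the `values` of a manual color scale in ggplot2. The following is a minimal sketch, assuming you want one line per Asian country; the choice of continent and the object name `p` are illustrative only.

```r
library(gapminder)
library(ggplot2)

# Life expectancy over time for Asian countries, one line per country.
# scale_colour_manual() matches the names of country_colors to the
# levels of the `country` variable mapped to colour.
p <- ggplot(subset(gapminder, continent == "Asia"),
            aes(x = year, y = lifeExp, group = country, colour = country)) +
  geom_line(show.legend = FALSE) +
  scale_colour_manual(values = country_colors)

p
```

The legend is suppressed because there are too many countries for it to be readable; `continent_colors` can be used the same way when `continent` is mapped to colour.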