| Type: | Package |
| Title: | Download Data from the World Inequality Database |
| Version: | 0.0.1 |
| Author: | Thomas Blanchet [aut], Ignacio Flores [cre] |
| Maintainer: | Ignacio Flores <stats@wid.world> |
| Description: | Tools to download data from the online World Inequality Database directly into R. The World Inequality Database is an extensive source on the historical evolution of the distribution of income and wealth both within and between countries. It relies on the combined effort of an international network of over a hundred researchers covering more than seventy countries from all continents. |
| License: | MIT + file LICENSE |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.1.2 |
| Depends: | R (≥ 2.10) |
| Imports: | httr (≥ 1.2.1), base64enc (≥ 0.1), plyr (≥ 1.8.4), jsonlite (≥ 1.6.1) |
| Suggests: | testthat (≥ 1.0.2), knitr (≥ 1.16), rmarkdown (≥ 1.6), dplyr (≥ 1.0.0), ggplot2 (≥ 2.2.1), scales (≥ 0.4.1), tidyverse (≥ 1.1.1) |
| NeedsCompilation: | no |
| Packaged: | 2026-02-18 14:07:13 UTC; iflores |
| Repository: | CRAN |
| Date/Publication: | 2026-02-20 11:20:02 UTC |
Check list of age codes
Description
Check that the list of age codes submitted by the user is valid.
Usage
check_ages(ages)
Arguments
ages |
List of age codes |
Author(s)
Thomas Blanchet
Check list of area codes
Description
Check that the list of area codes submitted by the user is valid.
Usage
check_areas(areas)
Arguments
areas |
List of area codes |
Author(s)
Thomas Blanchet
Check list of indicator codes
Description
Check that the list of indicator codes submitted by the user is valid.
Usage
check_indicators(indicators)
Arguments
indicators |
List of indicators. |
Author(s)
Thomas Blanchet
Check list of percentiles
Description
Check that the list of percentiles submitted by the user is valid.
Usage
check_perc(perc)
Arguments
perc |
List of percentiles |
Author(s)
Thomas Blanchet
Check list of population codes
Description
Check that the list of population codes submitted by the user is valid.
Usage
check_pop(pop)
Arguments
pop |
List of population codes |
Author(s)
Thomas Blanchet
Check list of years
Description
Check that the list of years submitted by the user is valid.
Usage
check_years(years)
Arguments
years |
List of years |
Author(s)
Thomas Blanchet
Download data from WID.world
Description
Downloads data from the World Inequality Database
(https://wid.world) into a data.frame.
Type vignette("wid-demo") for a detailed presentation.
Usage
download_wid(
indicators = "all",
areas = "all",
years = "all",
perc = "all",
ages = "all",
pop = "all",
metadata = FALSE,
include_extrapolations = TRUE,
verbose = FALSE
)
Arguments
indicators |
List of six-letter strings, or "all". |
areas |
List of strings, or "all". |
years |
Numerical vector, or "all". |
perc |
List of strings, or "all". |
ages |
Numerical vector, or "all". |
pop |
List of characters, or "all". |
metadata |
Should the function fetch metadata too (i.e. variable
descriptions, sources, methodological notes, etc.)? Default is FALSE. |
include_extrapolations |
Should the function return estimates that are
the results of extrapolations and interpolations based on limited data?
Default is TRUE. |
verbose |
Should the function indicate the progress of the request?
Default is FALSE. |
Details
Although all arguments default to "all", you cannot download the
entire database by typing download_wid(). The command requires you
to specify either some indicators or some areas. To download the entire
database, please visit https://wid.world/data/ and choose "download
full dataset".
If there is no data matching your selection on WID.world (maybe because
you specified an indicator or an area that doesn't exist), the command
will return NULL with a warning.
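The two rules above (a request needs some indicators or some areas, and an empty result comes back as NULL) can be handled defensively before and after a call. The sketch below is only an illustration: is_valid_request and or_empty are hypothetical helpers, not functions of the package.

```r
# Hypothetical helper (not part of the package): a request is acceptable
# only if `indicators` or `areas` is something other than "all".
is_valid_request <- function(indicators = "all", areas = "all") {
  !(identical(indicators, "all") && identical(areas, "all"))
}

# Hypothetical helper: replace a NULL result (no matching data) with an
# empty data.frame so that downstream code can always expect a data.frame.
or_empty <- function(result) {
  if (is.null(result)) data.frame() else result
}

is_valid_request()                       # FALSE: both arguments left at "all"
is_valid_request(indicators = "sptinc")  # TRUE
nrow(or_empty(NULL))                     # 0
```

Wrapping the result this way trades the NULL signal for a zero-row data.frame, so the warning issued by the command remains the only notice that nothing matched.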
All monetary amounts for countries and country subregions are in constant
local currency of the reference year (i.e. the previous year, the database
being updated every year around July). Monetary amounts for world regions
are in EUR PPP of the reference year. You can access the price index using
the indicator inyixx, the PPP exchange rates using xlcusp
(USD), xlceup (EUR), xlcyup (CNY), and the market exchange
rates using xlcusx (USD), xlceux (EUR), xlcyux
(CNY). To check the current reference year, you can look at when the price
index is equal to 1.
Shares and wealth/income ratios are given as a fraction of 1. That is, a top 1% share of 20% is given as 0.2. A wealth/income ratio of 300% is given as 3.
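As a small worked example of this convention, the values quoted above can be turned back into percentages for display. The helper name to_percent is an assumption made for illustration only.

```r
# Values from the text above: a top 1% share of 20% and a
# wealth/income ratio of 300%, stored as fractions of 1.
top1_share <- 0.2
wealth_income_ratio <- 3

# Hypothetical helper: multiply by 100 to express a fraction as a percentage.
to_percent <- function(x) 100 * x

sprintf("%.0f%%", to_percent(top1_share))           # "20%"
sprintf("%.0f%%", to_percent(wealth_income_ratio))  # "300%"
```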
The arguments of the command follow a nomenclature specific to WID.world. We provide more details with a few examples below. For the complete up-to-date documentation of the structure of the database, please visit https://wid.world/codes-dictionary/.
Indicators
The argument indicators is a vector of 6-letter codes, each of which corresponds to a
given series type for a given income or wealth concept. The first letter
corresponds to the type of series. Some of the most common possibilities include:
| one-letter code | description |
| a | average (local currency unit, last year’s prices) |
| b | inverted Pareto-Lorenz coefficient |
| f | female population (fraction between 0 and 1) |
| g | Gini coefficient (between 0 and 1) |
| i | index |
| n | population |
| s | share (fraction between 0 and 1) |
| t | threshold (local currency unit, last year’s prices) |
| m | total (local currency unit, last year’s prices) |
| p | proportion of women (fraction between 0 and 1) |
| w | wealth-to-income ratio or labor/capital share (fraction of national income) |
| r | Top 10/Bottom 50 ratio |
| x | exchange rate (market or PPP) |
| e | Total emissions (tons of CO2 equivalent emissions) |
| k | Per capita emissions (tons of CO2 equivalent emissions) |
| l | Average per capita group emissions (tons of CO2 equivalent per capita emissions) |
The next five letters correspond to a concept (usually of income and wealth). Some of the most common possibilities include:
| five-letter code | description |
| ptinc | pre-tax national income |
| pllin | pre-tax labor income |
| pkkin | pre-tax capital income |
| fiinc | fiscal income |
| hweal | net personal wealth |
For example, sfiinc corresponds to the share of fiscal income,
ahweal corresponds to average personal wealth. If you don't specify
any indicator, it defaults to "all" and downloads all available indicators.
Check https://wid.world/codes-dictionary/ for a full list of codes.
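The way a six-letter indicator is assembled from the two tables above can be sketched as follows. make_indicator is a hypothetical helper written for this illustration; it only concatenates strings and does not check that the resulting code exists on WID.world.

```r
# Hypothetical helper: paste a one-letter series type onto a
# five-letter concept code to form a six-letter indicator.
make_indicator <- function(type, concept) {
  stopifnot(nchar(type) == 1, nchar(concept) == 5)
  paste0(type, concept)
}

make_indicator("s", "fiinc")          # "sfiinc": share of fiscal income
make_indicator("a", "hweal")          # "ahweal": average personal wealth
make_indicator(c("s", "a"), "ptinc")  # "sptinc" "aptinc"
```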
Area codes
All data in WID.world is associated with a given area, which can be a country,
a region within a country, an aggregation of countries (e.g. a continent), or
even the whole world. The argument areas is a vector of codes that specify
the areas for which to retrieve data. Countries and world regions are coded
using 2-letter ISO codes. Country subregions are coded as XX-YY
where XX is the country 2-letter code. If you don't specify any area,
it defaults to "all" and downloads data for all available areas.
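A rough format check for area codes, based only on the description above, might look like the sketch below. The helpers looks_like_area and parent_country are hypothetical, the accepted shape of the YY part (one or more uppercase letters or digits) is an assumption, and "US-CA" is used purely as a format example. Passing this check does not mean the area exists in the database.

```r
# Hypothetical helper: TRUE for strings shaped like "XX" or "XX-YY".
# Assumption: the subregion suffix is made of uppercase letters or digits.
looks_like_area <- function(x) {
  grepl("^[A-Z]{2}(-[A-Z0-9]+)?$", x)
}

# Hypothetical helper: the two-letter code that prefixes any area code.
parent_country <- function(x) {
  substr(x, 1, 2)
}

looks_like_area(c("FR", "US-CA", "france"))  # TRUE TRUE FALSE
parent_country("US-CA")                      # "US"
```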
Years
All data in WID.world corresponds to a year. Some series go as far back as
the 1800s. The argument years is a vector of integers that specifies
those years. If you don't specify any year, it defaults to "all"
and downloads data for all available years.
Percentiles
The key feature of WID.world is that it provides data on the whole
distribution, not just totals and averages. The argument perc
is a vector of strings that indicate for which part of the distribution
the data should be retrieved. For share and average variables,
percentiles correspond to percentile ranges and take the form pXXpYY.
For example, the top 1% share corresponds to p99p100. The top 10% share
excluding the top 1% is p90p99. Thresholds associated with the
percentile group pXXpYY correspond to the minimal income or wealth
level that gets you into the group. For example, the threshold of the
percentile group p90p100 or p90p91 corresponds to the 90%
quantile. Variables with no distributional meaning use the percentile p0p100.
If you don't specify any percentile, it defaults to "all" and
downloads data for all available parts of the distribution.
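The pXXpYY naming scheme described above can be generated programmatically. The sketch below uses a hypothetical helper, make_perc, which only formats the strings; whether a given range is available for a given series is not checked.

```r
# Hypothetical helper: build percentile-range codes of the form pXXpYY
# from lower and upper bounds expressed in percent.
make_perc <- function(lower, upper = 100) {
  stopifnot(all(lower >= 0), all(upper <= 100), all(lower < upper))
  paste0("p", lower, "p", upper)
}

make_perc(99)      # "p99p100": the top 1%
make_perc(90, 99)  # "p90p99": the top 10% excluding the top 1%
make_perc(0)       # "p0p100": the whole distribution

# Ten consecutive decile groups, from "p0p10" to "p90p100".
deciles <- make_perc(seq(0, 90, by = 10), seq(10, 100, by = 10))
```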
Age groups
Data may only concern the population in a certain age group.
The argument ages is a vector of age codes that specify which
age categories to retrieve. Ages are coded using 3-digit codes.
Some of the most common possibilities include:
| three-digit code | description |
| 999 | all ages |
| 014 | ages 0 to 14 |
| 156 | ages 15 to 64 |
| 997 | ages 65 and older |
| 991 | ages below 20 |
| 992 | ages 20 and older |
If you don't specify any age, it defaults to "all" and downloads
data for all available age groups. Visit https://wid.world/codes-dictionary/
for a comprehensive list of options.
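Because age codes are three-digit codes, a code such as 014 loses its leading zero when stored as a number. The sketch below, which assumes nothing about how the package itself handles this, shows one way to pad codes back to three digits and look up the labels from the table above. The names age_labels and format_age are hypothetical.

```r
# Lookup table copied from the list of common age codes above.
age_labels <- c(
  "999" = "all ages",
  "014" = "ages 0 to 14",
  "156" = "ages 15 to 64",
  "997" = "ages 65 and older",
  "991" = "ages below 20",
  "992" = "ages 20 and older"
)

# Hypothetical helper: pad a numeric age code to three digits.
format_age <- function(code) {
  sprintf("%03d", as.integer(code))
}

format_age(14)                              # "014"
unname(age_labels[format_age(c(992, 14))])  # "ages 20 and older" "ages 0 to 14"
```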
Population types
The data in WID.world can refer to different types of population
(i.e. different statistical units). The argument pop is a vector of
population codes. They are coded using one-letter codes. Some of the
most common possibilities include:
| one-letter code | description |
| i | individuals |
| j | equal-split adults (i.e., income or wealth divided equally among spouses) |
| m | male |
| f | female |
| t | tax unit |
| e | employed |
If you don't specify any code, it defaults to "all"
and downloads data for all available populations.
Extrapolations/interpolations
Some of the data on WID.world is the result of interpolations (when data is only available for a few years) or extrapolations (when data is not available for the most recent years) that are based on much more limited information than other data points. We include these interpolations/extrapolations by default as a convenience, and also because these values are used to perform regional aggregations. Yet we stress that these estimates, especially at the level of individual countries, can be fragile.
For many purposes, it can be preferable to exclude these data points.
For that, use the option include_extrapolations = FALSE.
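A request that excludes these data points could be prepared as in the sketch below. The argument names come from the Usage section; the particular selection (top 1% share of pre-tax national income for equal-split adults aged 20 and older in two countries) is only an example, and the download is attempted only if download_wid is available in the session, since it needs network access.

```r
# Arguments for an illustrative request, using codes from the tables above.
request <- list(
  indicators = "sptinc",          # share of pre-tax national income
  areas = c("FR", "US"),
  perc = "p99p100",               # top 1%
  ages = 992,                     # ages 20 and older
  pop = "j",                      # equal-split adults
  include_extrapolations = FALSE  # drop interpolated/extrapolated points
)

# Run the download only when the package function is available.
# The result is NULL (with a warning) if nothing matches the selection.
if (exists("download_wid", mode = "function")) {
  top1 <- do.call(download_wid, request)
}
```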
Value
A data.frame with the following columns:
country: The country or area code.
variable: The variable name, which combines the indicator, the age code and the population code.
percentile: The part of the distribution the value relates to.
year: The year the value relates to.
value: The value of the indicator.
If you specify metadata = TRUE, the data.frame also has the
following columns:
countryname: The full name of the country/region.
shortname: A short version of the variable full name in plain English.
shortdes: A description of the type of series.
pop: The population type, in plain English.
age: The age group, in plain English.
source: The source for the data.
method: Methodological notes, if any.
imputation: Type of estimate (when applicable). The imputation field is a short qualitative description of the type of estimate provided, which is strongly related to data quality. For technical details, see the method field and papers cited in source.
quality: Data quality (when applicable). The quality field is a score from 0 to 5 indicating the quality of the data.
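Since the variable column combines the indicator, the age code and the population code, it can be useful to split it back into its parts. The sketch below assumes the three parts are simply concatenated in that order (six lowercase letters, three digits, one lowercase letter, as in "sptinc992j"); that layout is an assumption, and split_variable is a hypothetical helper rather than a function of the package.

```r
# Hypothetical helper: split variable names into indicator, age and pop.
# Assumption: the name is indicator (6 letters) + age (3 digits) + pop (1 letter).
split_variable <- function(variable) {
  pattern <- "^([a-z]{6})([0-9]{3})([a-z])$"
  ok <- grepl(pattern, variable)
  data.frame(
    variable = variable,
    indicator = ifelse(ok, sub(pattern, "\\1", variable), NA_character_),
    age = ifelse(ok, sub(pattern, "\\2", variable), NA_character_),
    pop = ifelse(ok, sub(pattern, "\\3", variable), NA_character_),
    stringsAsFactors = FALSE
  )
}

parts <- split_variable(c("sptinc992j", "ahweal999i"))
parts$indicator  # "sptinc" "ahweal"
parts$age        # "992" "999"
parts$pop        # "j" "i"
```

Names that do not fit the assumed layout yield NA in the three derived columns instead of raising an error.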
Author(s)
Thomas Blanchet, with updates by Ignacio Flores
Get variables associated to a list of area codes
Description
Package API environment selector.
Usage
environment
Format
An object of class character of length 1.
Author(s)
Thomas Blanchet
Get data associated to a list of variables
Description
Perform GET request to the server to retrieve data associated to a list of variables.
Usage
get_data_variables(areas, variables, no_extrapolation = FALSE)
Arguments
areas |
List of area codes. |
variables |
List of variables, of the form: |
no_extrapolation |
Logical: should interpolated/extrapolated years be excluded? Default is FALSE. |
Author(s)
Thomas Blanchet
Get metadata associated to a list of variables
Description
Perform GET request to the server to retrieve metadata associated to a list of variables.
Usage
get_metadata_variables(
areas,
variables,
report_missing = TRUE,
collected_metadata = NULL
)
Arguments
areas |
List of area codes. |
variables |
List of variables, of the form: |
report_missing |
Logical: report any missing metadata when set to TRUE. |
collected_metadata |
List used to accumulate missing metadata across calls. |
Author(s)
Thomas Blanchet