Cross-National Data on the Web
Contents
- Widely-Used Compendia of Development Data
- Health and Health Care Data
- Data for Latin America and the Caribbean
- Infant and Child Mortality
- Family Planning
- Infant Immunization
- Undernourishment
- GDP per capita
- Educational Attainment
- Income Poverty and Income Inequality
- Water and Sanitation
- Geographical Variables
- Democracy, Civil and Political Rights, Women in Parliament
- State Efficacy
- Free-Market Orientation
As of July 28, 2011, all of these links worked.
1. Widely-Used Compendia of Development Data
The World Bank World Development Indicators (“WDI”) are probably the most widely-used data pertaining to economic and social development. To access these data, go to the website and click on “World Development Indicators & Global Development Finance.” That gets you to the statistical compiler. To use it, select a country or countries and click “next”; select the series you want and click “next”; then select the year(s) you want. The earliest year available is 1960 and the latest is 2009, but data are rarely available for all intervening years. The World Development Indicators have data on, among other things, mortality rates (adult, infant, maternal), life expectancy at birth, age at first marriage, fertility, population in different age groups, urban population, population density, contraceptive prevalence, birth attendance, doctors per capita, nurses per capita, hospital beds per capita, access to sanitation and safe water, immunization rates, adult illiteracy rates, GDP per capita, and HIV prevalence. Some indicators are disaggregated by gender, urban or rural residence, and so on.
The Quality of Government datasets housed at the University of Gothenburg, Sweden (but published in English), bring together an extremely wide range of indicators on national quality of governance (as measured by such indicators as corruption, bureaucratic quality, democracy), some of its hypothesized causes (e.g., colonial origin, religion, and ethnolinguistic fractionalization), and some of its hypothesized consequences (e.g., GDP per capita, educational attainment, infant mortality, gender bias, environmental sustainability, life satisfaction, and trust). Most of the indicators were compiled elsewhere and then aggregated by the researchers (Jan Teorell, Nicholas Charron, Marcus Samanni, Sören Holmberg, and Bo Rothstein). Coverage in the time-series cross-sectional dataset is for 207 countries, including Taiwan (which is technically not a country) and thirteen countries that no longer exist (e.g., USSR, Yugoslavia). Coverage encompasses, at a maximum, the years from 1946 to 2009. One enormous advantage of this dataset is that you can download the whole thing as a single file in .csv, Stata 9, or SPSS format. The codebook is also user-friendly and downloads as a single .pdf file. In addition to the “Standard Dataset,” a “Social Policy Dataset” and an “Expert Survey Dataset” are available.
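Because the whole dataset arrives as one country-year file, pulling out a single series takes only a few lines. The following is a minimal sketch in Python, not the Quality of Government team's own tooling. The column names (`cname`, `year`, `infant_mort`) and the sample values are placeholders invented for illustration, so check the codebook for the real variable names.

```python
import csv
import io

# Stand-in for the downloaded .csv file. The column names and values are
# made up for illustration; consult the codebook for the real ones.
SAMPLE_CSV = """cname,year,infant_mort
Chile,2000,9.0
Chile,2001,
Chile,2002,8.0
Peru,2000,30.0
"""

def country_series(csv_file, country, indicator,
                   country_col="cname", year_col="year"):
    """Return {year: value} for one country, skipping blank (missing) cells."""
    series = {}
    for row in csv.DictReader(csv_file):
        if row[country_col] != country:
            continue
        cell = row[indicator].strip()
        if cell:  # a blank cell means the indicator is missing for that year
            series[int(row[year_col])] = float(cell)
    return series

chile = country_series(io.StringIO(SAMPLE_CSV), "Chile", "infant_mort")
print(chile)
```

With a real download, you would pass an open file handle (for example `open("qog.csv", newline="")`, where the filename is hypothetical) in place of the `io.StringIO` stand-in.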
In 2005 the United Nations Statistics Division launched an initiative called “Statistics as a Public Good” that had among its goals to provide free access to cross-national statistics. The result is UNdata, which includes 33 regularly updated databases with more than 60 million separate pieces of information on such topics as “Agriculture, Crime, Education, Employment, Energy, Environment, Health, HIV/AIDS, Human Development, Industry, Information and Communication Technology, National Accounts, Population, Refugees, Tourism, and Trade.”
Each of the Human Development Reports published annually by the United Nations Development Programme (UNDP) from 1990 to 2010 includes a wide range of statistical data related to people’s ability to live a long and healthy life, acquire knowledge, and achieve a decent standard of living. Besides the Human Development Report Office’s 21 annual global Reports covering all countries, associated agencies in 140 countries published 687 regional, national, and provincial Human Development Reports between 1992 and 2010. The Human Development Report 2010, the 20th Anniversary Edition, reaffirms the benefits of the whole 20-year project and tweaks significantly, with careful justification, the algorithms for calculating the major indices, including the Human Development Index. The Human Development Report database can be tapped through an easy-to-use statistical compiler that allows you to select by country, indicator, and year (maximally 1960, 1970, 1980, 1990, 2000, and each year from 2005 to 2010 inclusive).
UNICEF has published State of the World’s Children annually since 1980. Each edition includes statistical data on indicators related to the well-being of children. UNICEF’s ChildInfo website has data on children and women, including updated infant and under-5 mortality statistics and access to recent Multiple Indicator Cluster Survey reports and data.
The United Nations Population Division has a wide range of useful demographic data, some of which are accessible through this user-friendly data query system.
The Norwegian Social Science Data Services keeps current a useful MacroData Guide (in English) with links to state-of-the-art sources of cross-national data on demography, economics, education, health, labor and employment, crime, corruption, natural resources, politics, conflict, human rights, inequality, gender, religion, and other topics.
2. Health and Health Care Data
The World Health Organization’s Global Health Observatory (GHO) has a large cross-national data repository, especially for years since 1990, on mortality, disease incidence, child nutrition, child health, maternal and reproductive health, immunization, HIV/AIDS, tuberculosis, malaria, water and sanitation, noncommunicable diseases and risk factors, health systems, environmental health, injuries, and violence.
UNICEF provides recent data on the proportion of expectant mothers with access to antenatal care, the share of births attended by trained personnel, fertility and family planning, and maternal mortality. Go to http://www.childinfo.org/
Since the mid-1980s, some 200 Demographic and Health Surveys (DHS) have been conducted in 75 developing countries. The surveys have very good data on maternal and child health service delivery and related indicators. The DHS website includes a useful stat compiler.
Expert ratings of maternal and child health “program effort” in 47-50 developing countries in 1999, 2002, and 2005 are available from the Policy Project. The site has both individual country briefs in .pdf format and a useful interactive data query system.
Background on the procedures for the 1999 and 2002 surveys is available in Bulatao, R. A., & Ross, J. A. (2000) “Rating Maternal and Neonatal Health Services in Developing Countries.” MEASURE Evaluation Working Paper WP-00-26 (August), Carolina Population Center, University of North Carolina at Chapel Hill. The working paper reports and analyzes the results of a survey conducted in 1999 and 2000 by the Futures Group International. The survey team asked 10 to 25 public health experts in each of 47 countries to answer 81 questions about the quality and accessibility of each country’s maternal and child health care services. A total of 1,037 experts participated in the survey. Respondents were asked to rate, on a scale of 1 to 5, whether expectant mothers were typically given iron folate tablets for anemia, whether deliveries were typically assisted by a trained attendant, whether umbilical cords were typically cut with a clean blade, and 78 other such items. The responses were aggregated into 13 broad categories measuring, on a scale of 1 to 100, the capacity of health centers and district hospitals; the quality of care for expectant mothers and newborns; the quality of government policies toward safe pregnancy; the quality of government monitoring and evaluation of maternal and child health services; provision of family planning and maternal and child health resources, information, and education; the quality of training of health professionals; and the share of the population with access to maternal and child health services. The appendix of this paper provides the overall country rating, as well as the country rating on some 30 sub-dimensions of the overall rating, for each nation surveyed.
An update with data for 2005 is in Ross, John A., Demi Adelaja, and Lori Bollinger, “Effort Levels of National Maternal and Neonatal Health Programs: 2005 Measures and Six Year Trends.” Maternal and Child Health Journal 12 (2008), 586-598. If you or your institution subscribe to this journal, the article is available here.
3. Data for Latin America and the Caribbean
The Socio-Economic Database for Latin America and the Caribbean (SEDLAC), which is based at the Universidad Nacional de La Plata in Argentina, provides the most transparent data ever assembled for 25 countries of the region on per capita income (in local currency units), income inequality, income poverty, household size, educational attainment, housing quality, durable goods ownership, access to electricity, safe water, and adequate sanitation, employment, and eligibility for disability and retirement pensions. Many of the variables are disaggregated by gender. More categories of indicators are slated to be added soon. The data pertain mostly to the period from 1990 to 2010, although some countries have data from earlier years. The distinctive feature of the database is that all of its statistics are compiled from the microdata collected in periodic household surveys. Country experts update the tables whenever the microdata from a new survey become available. The database is available in both English and Spanish, and the data can be accessed either as Excel tables or through a user-friendly data query system.
A superb source for all sorts of health data for all countries in the Western Hemisphere, from Canada to St. Lucia to Argentina, is the health data section of the website of the Pan American Health Organization (PAHO), an agency of the World Health Organization. The site includes a Table Generator System for extracting data on 48 countries and territories in the Western Hemisphere. You can download the data as an Excel file.
For the period from 1990 to 2002, Millennium Development Goals: A Latin American and Caribbean Perspective (United Nations Economic Commission for Latin America and the Caribbean, 2005) has a range of useful development data for the region.
4. Infant and Child Mortality
The infant mortality figures published in the World Bank’s World Development Indicators are based on estimates compiled by the Inter-agency Group for Child Mortality Estimation, which includes specialists from the World Bank, World Health Organization, UNICEF, and the United Nations Population Division. The census, survey, and vital registration data underlying these estimates are available at the group’s website.
The method used to produce the Inter-agency Group estimates was described initially in Kenneth Hill et al., Trends in Child Mortality in the Developing World: 1960-1996. New York: UNICEF, 1999. “Levels and Trends of Child Mortality in 2006: Estimates Developed By the Inter-agency Group for Child Mortality Estimation” provides a more comprehensive account of the methodology employed, along with infant and under-5 mortality estimates for most of the world’s countries through 2005.
Since the mid-1980s, some 200 Demographic and Health Surveys (DHS) have been conducted in 75 developing countries. The surveys have good data on infant and child mortality and related indicators.
Multiple Indicator Cluster Surveys (MICS) provide information in the areas of health, education, child protection and HIV/AIDS, with a special focus on women and children. A first round was administered in 1995 in 60 countries; a second in 2000 in 65 countries; a third in 2005-06 in 50 countries; and a fourth in 2009-2011 in 40 countries. Altogether, some 200 MICS surveys have been carried out in about 100 different countries. Designed and administered by UNICEF, other international organizations, and local government agencies, the MICS surveys are tailored to suit the particular informational needs of the host country.
You can query the United Nations Population Division’s World Population Prospects: The 2010 Revision for infant and under-5 mortality estimates by country by five-year period (e.g., for 2005-2010). Data are available from 1950 to (worrisomely) 2100.
5. Family Planning
The Demographic and Health Surveys (DHS) have data on various indicators of family planning.
Since 1972, researchers associated with the Futures Group have used an expert rating system to measure family planning program effort in about 100 developing countries around the world. The researchers send questionnaires to country experts, aggregate the responses into component measures of different aspects of family planning effort, and assign each country an overall score equal to its achieved percentage of the maximum attainable score on the combined components. The results of surveys from 1972, 1982, 1989, 1994, and 1999 are available in Ross, John, and John Stover (2000). “Effort Indices for National Family Planning Programs, 1999 Cycle.” MEASURE Evaluation Working Paper WP-00-20 (May), Carolina Population Center, University of North Carolina at Chapel Hill. Data from the 1999, 2004, and 2009 rounds of the survey are available in a subsequent paper.
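The scoring rule described above (a country's achieved score as a percentage of the maximum attainable score) amounts to simple arithmetic. As a hedged illustration, it can be sketched in Python as follows. The component names, the scores, and the per-component maximum are hypothetical and are not taken from the Ross and Stover paper.

```python
def effort_score(component_scores, max_per_component):
    """Overall effort as a percentage of the maximum attainable score.

    component_scores: dict mapping component name -> achieved score
    max_per_component: dict mapping component name -> maximum attainable score
    """
    achieved = sum(component_scores.values())
    attainable = sum(max_per_component[name] for name in component_scores)
    if attainable == 0:
        raise ValueError("maximum attainable score must be positive")
    return 100.0 * achieved / attainable

# Hypothetical country with three components, each scored out of 4.
scores = {"policy": 3.0, "services": 2.0, "evaluation": 1.0}
maxima = {"policy": 4.0, "services": 4.0, "evaluation": 4.0}
print(effort_score(scores, maxima))  # 50.0
```

Expressing each country's total as a share of the attainable maximum is what lets scores be compared across survey rounds even if the number of questionnaire items changes.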
6. Infant Immunization
In June 2000, researchers at UNICEF and the World Health Organization began a concerted effort to evaluate and reconcile data on immunization coverage around the world. Their goal was to produce, for as many countries as possible and for each year from 1980 onward, a “consensus estimate” of the share of a target population (usually children surviving to age 1) that had been immunized with a specific antigen. To produce these estimates, they reviewed and evaluated all available immunization coverage information for as many countries as possible for as many years as possible from 1980 onward. The first estimates were released in 2001; updated series are available here.
7. Undernourishment
The World Health Organization (WHO) has data on infant and child undernourishment. Some countries have data spanning the early 1980s to the early 2000s. The sources of the data for each country are described with admirable thoroughness.
The Food and Agriculture Organization (FAO) has compiled data for about 190 countries on the proportion of the total, adult, and child populations that suffer from undernourishment, as well as on many other food and nutrition-related indicators (food needs; food, protein, and micronutrient availability; food trade; food aid), in 1990-92, 1995-97, 2000-02, and 2005-07. In previous years the FAO has also produced estimates for 1979-81, 2001-03, and 2002-04.
8. GDP per capita
The state-of-the-art source for data on GDP per capita (at PPP, in constant 2005 international dollars) is Penn World Table 7.1. The database contains information for 189 countries for some or all of the years from 1950 to 2010. In my own work, when I want a country’s GDP per capita in a certain year, I use the variable entitled “rgdpch” (which is in column “Y”). This is real GDP per capita at purchasing power parity according to a chain index, expressed in 2005 international dollars so that the figures are adjusted for inflation. “About the Variables in PWT 7.1” tells you what each variable in the spreadsheet means. Unfortunately, the Excel spreadsheet contains a three-letter “isocode” rather than a full country name. To find out which country each isocode corresponds to, scroll down to the bottom of the page and click on “Appendix” under “Old Documentation.”
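For readers who prefer a script to scrolling through the spreadsheet, here is a rough Python sketch of the lookup just described: attach country names to the three-letter isocodes and read off `rgdpch` for a given year. It assumes the data have been saved as a .csv file with columns named `isocode`, `year`, and `rgdpch`. The sample figures and the small code-to-name table are placeholders, not actual PWT values.

```python
import csv
import io

# Placeholder rows standing in for the PWT file saved as .csv.
# The rgdpch numbers are invented for illustration only.
PWT_CSV = """isocode,year,rgdpch
ARG,2000,1000.0
ARG,2005,1100.0
BRA,2005,900.0
"""

# Tiny hand-made lookup; the full list is in the PWT "Appendix".
ISO_TO_NAME = {"ARG": "Argentina", "BRA": "Brazil"}

def gdp_per_capita(csv_file, year):
    """Return {country name: rgdpch} for one year."""
    result = {}
    for row in csv.DictReader(csv_file):
        if int(row["year"]) == year and row["rgdpch"].strip():
            # Fall back to the raw code if the name is not in the lookup.
            name = ISO_TO_NAME.get(row["isocode"], row["isocode"])
            result[name] = float(row["rgdpch"])
    return result

print(gdp_per_capita(io.StringIO(PWT_CSV), 2005))
```

Keeping the code-to-name table separate from the data means the same lookup can be reused when merging PWT figures with sources that identify countries by full name.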
The economist Angus Maddison, who died in April 2010, collected comparable data on population, total GDP, and GDP per capita (at PPP, in constant 1990 international Geary-Khamis dollars) for many countries and regional aggregates (e.g., Latin America as a whole) over the very long run, as well as the short run. GDP per capita estimates are available for 32 countries for 1 C.E., and for larger and larger numbers of countries for subsequent years: 33 for 1500, 56 for 1870, 142 for 1950-1989, and 166 for 1990-2000. The data are available here (scroll down to “Historical Statistics”). If you click on the “vertical” file, you get the countries across the top and the years (1 C.E. to 2006) down the side, which is probably the most useful format for most purposes.
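The “vertical” layout (years down the side, one column per country) is convenient to read but awkward for merging with country-year data from other sources. A minimal Python sketch of reshaping it into one record per country and year follows. It assumes the sheet has been exported to .csv with a first column named `year`, and the numbers shown are placeholders rather than Maddison's estimates.

```python
import csv
import io

# Placeholder export of a "vertical" sheet: years down, countries across.
VERTICAL_CSV = """year,Argentina,Brazil
1950,10.0,5.0
1951,,6.0
"""

def wide_to_long(csv_file, year_col="year"):
    """Yield (country, year, value) tuples, skipping empty cells."""
    for row in csv.DictReader(csv_file):
        year = int(row[year_col])
        for country, cell in row.items():
            if country == year_col or not (cell or "").strip():
                continue
            yield (country, year, float(cell))

records = list(wide_to_long(io.StringIO(VERTICAL_CSV)))
for record in records:
    print(record)
```

Empty cells are skipped rather than turned into zeros, since a blank in a historical series means "no estimate" and not "zero income".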
9. Educational Attainment
Data on average years of schooling and other measures of educational attainment, as well as the same indicators for females only, are available for 146 countries at http://www.barrolee.com/data/dataexp.htm. I usually use the data for the 15+ age group; data for the 25+ age group are also available.
For illiteracy, see the World Bank World Development Indicators and the UNDP Human Development Reports listed at the top of this page under “Widely-Used Compendia of Development Data.”
The UNESCO Institute for Statistics has recent data on enrollment ratios, repetition rates, and other educational indicators for most of the world’s countries.
10. Income Poverty and Income Inequality
This webpage is the gateway to the World Bank’s data on income poverty and income inequality.
The Socio-Economic Database for Latin America and the Caribbean (SEDLAC), which is based at the Universidad Nacional de La Plata in Argentina, provides the most transparent data ever assembled for 25 countries of the region on income inequality, income poverty, and many other indicators.
Data on income inequality are available from the World Institute for Development Economics Research (WIDER) World Income Inequality Database V 2.0c May 2008. You’ll need the .pdf User Guide as well as the Excel spreadsheet file. These data incorporate, refine, and update the well-known dataset assembled by Klaus Deininger and Lyn Squire at the World Bank in the mid-1990s.
Data on income poverty and income inequality in several Latin American countries from 1970 to 1995 are available in Juan Luis Londoño and Miguel Székely (1997). “Persistent Poverty and Excess Inequality: Latin America, 1970-1995.” Working Paper 357, Office of the Chief Economist, Inter-American Development Bank. October. Available as of January 24, 2008, at http://www.iadb.org/res/publications/pubfiles/pubWP-357.pdf
11. Water and Sanitation
The WHO/UNICEF Joint Monitoring Programme for Water Supply and Sanitation provides carefully collected data on the proportion of the population with access to safe water and adequate sanitation. The most detailed information is to be found in the country files, some of which contain estimates from as far back as 1980. In most cases, however, UNICEF considers data collected before 1990 as significantly lower in quality than data from 1990 forward.
12. Geographical Variables
Data on land area, proportion of the population near the coast, latitude, population, and other such variables have been assembled by John Gallup, Andrew Mellinger, and Jeffrey Sachs. To obtain them, go to a web page at the Harvard Center for International Development and:
1. Scroll down to Geography Data Sets and click on General Measures of Geography. You’ll see 1) Physical geography and population (Revised data 9/04/01).
2. On a Mac (sorry, don’t know what to do on a PC), hold down “option” and click on “ASCII file (comma delimited),” which downloads a .csv comma-delimited file to your desktop.
3. Open up a newish version of Excel, and from within Excel, open the comma-delimited .csv file (to do this, you may need to switch within the “open” dialog box in Excel from “all readable documents” to “all documents”). Then save the resulting document as an Excel workbook.
4. The variable names won’t make much sense without the “Description of Data,” which is available as a Word file just above the line that says “ASCII file (comma delimited).”
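Steps 3 and 4 can also be done without Excel, on any operating system, by reading the comma-delimited file with a short script. The sketch below is one way to do it in Python, offered as an illustration only. The column names `country`, `lat`, and `landarea` and the sample values are hypothetical stand-ins; the real names are explained in the “Description of Data” file.

```python
import csv
import io

# Hypothetical excerpt of the downloaded file; real column names differ.
GEO_CSV = """country,lat,landarea
Bolivia,-17.0,1000
Chad,15.0,
"""

def to_number(cell):
    """Convert a CSV cell to float, or None if it is blank or non-numeric."""
    try:
        return float(cell)
    except (TypeError, ValueError):
        return None

def load_geography(csv_file, id_col="country"):
    """Return {country: {variable: number or None}}."""
    table = {}
    for row in csv.DictReader(csv_file):
        table[row[id_col]] = {
            name: to_number(cell)
            for name, cell in row.items() if name != id_col
        }
    return table

geo = load_geography(io.StringIO(GEO_CSV))
print(geo["Chad"])
```

Blank cells become `None` so that missing values stay visibly missing instead of being mistaken for zeros.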
13. Democracy, Civil and Political Rights, Women in Parliament
The most comprehensive dataset of quantitative democracy indicators is probably Polity IV, compiled by Ted Robert Gurr, Keith Jaggers, Monty Marshall, and their collaborators. This dataset stands out for its long empirical time frame (data go all the way back to a country’s date of independence, with the cut-off at 1800), its transparent and detailed coding rules, and its use of multiple coders and of tests of inter-coder reliability. To create the database, coders drawing on secondary literature assigned each of the world’s independent nations, in each year from 1800 to 2008, scores on “democracy” and “autocracy.” The scores are based on three sets of criteria: (1) “openness and competitiveness of the recruitment of the chief executive”; (2) “constraints on the authority of the chief executive”; and (3) “political participation and opposition.” Each criterion has subcomponents. For example, political participation and opposition includes “regulation of participation” (how much factionalism and personalism there is in politics) and “competitiveness of participation” (how much incumbents restrict political opposition). The subcomponents and components are scored, weighted, and combined to form a democracy score ranging from 0 to 10 and an autocracy score ranging from 0 to 10. The autocracy score is then subtracted from the democracy score to form a “Polity” score ranging from 10 (most democratic) to -10 (most autocratic). The coding is done transparently and systematically and is checked for inter-coder consistency.
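As a rough illustration of how the combined score relates to its two components, here is a sketch in Python. It assumes that the data file stores both components as non-negative values from 0 to 10 (so that the autocracy score enters with a minus sign) and that the special codes -66, -77, and -88 mark years of interruption, interregnum, and transition. Check both assumptions against the users' manual. The function name is hypothetical.

```python
# Special codes assumed to mark years without a regular score.
SPECIAL_CODES = {-66, -77, -88}

def polity_score(democ, autoc):
    """Combine component scores into a -10..10 score, or None if unavailable."""
    if democ in SPECIAL_CODES or autoc in SPECIAL_CODES:
        return None
    if not (0 <= democ <= 10 and 0 <= autoc <= 10):
        raise ValueError("component scores must lie between 0 and 10")
    return democ - autoc

print(polity_score(10, 0))     # 10  (most democratic)
print(polity_score(0, 10))     # -10 (most autocratic)
print(polity_score(-88, -88))  # None (special code, no regular score)
```

Treating the special codes as missing, rather than as very low scores, matters in practice: averaging a -88 into a country's time series would badly distort it.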
The data and a guide are at the Polity IV Project gateway page. Go there, scroll down to Polity IV Data Series version 2010, click on the link, then scroll down to “Polity IV: Regime Authority Characteristics and Transitions Datasets,” and click on “Polity IV Users’ Manual pdf file” for the codebook (on the left side of the web page) and on “Excel times[sic]-series data” for the data (on the right side of the web page).
Freedom House rates countries annually on “political rights” and “civil liberties.” Go here and scroll down to “Country ratings and status, FIW 1973-2011”; an Excel spreadsheet with the time series for some 200 countries downloads to your desktop. The methodology by which the scores were produced is described here.
A useful description and critique of quantitative democracy indicators, including the Polity and Freedom House indicators, is Gerardo Munck and Jay Verkuilen, “Conceptualizing and Measuring Democracy: Evaluating Alternative Indices.” Comparative Political Studies 35 No. 1 (February 2002), 5-34. The article and commentary on it are available here if you or your institution subscribe to this journal. Another useful paper by Gerry Munck on this topic is, for the time being (April 2012), here.
The International Institute for Democracy and Electoral Assistance (IDEA) has useful databases on voter turnout, electoral systems, and gender quotas for national legislative seats. Click here and try the links under “Databases and Networks.”
14. State Efficacy
A World Bank webpage provides access to aggregate governance indicators for 212 countries for 1996-2007, covering six dimensions of governance: voice and accountability, political stability and absence of violence, government effectiveness, regulatory quality, rule of law, and control of corruption.
15. Free-Market Orientation
The Fraser Institute in Vancouver, BC, rates most countries of the world for 1970, 1975, 1980, 1985, 1990, 1995, and each year from 2000 to 2009 according to how closely each conforms to what the Institute defines as a free-market system. To make sense of the data you’ll need to consult the Report. The 2010 Report is downloadable in .pdf format, and the data in Excel format, at the Fraser Institute website.
