There
has been a spectacular increase in the availability
and quality of data from developing countries in recent years.
Many of these datasets are
either in the public domain or can be obtained at modest cost from the
data collection agency. This page is intended as a resource to
help locate those data.
The
International Household Survey Network provides links to
documentation for many of these and other datasets.
Some of the data are available at
ICPSR, the
World Bank Microdata Catalog and the
Harvard Dataverse Network, which houses data produced by
J-PAL.
We provide links to some of the
data that are on-line and explanations of how to obtain others.
There has also been a huge investment in methods that isolate
exogenous variation in programs, policies, opportunities and
constraints. Randomized controlled trials have played a central
role in this work. The AEA maintains a
Registry of RCTs which is an extremely valuable resource
for investigators to register their RCTs and to learn about
on-going or completed RCTs.
The Indonesia Family Life Survey (IFLS) is an on-going longitudinal
survey of individuals, households, families, communities and
facilities. The first wave was conducted in 1993/4 and included
interviews with 7,224 households in 321 communities in 13 provinces in
Indonesia. The survey is representative of about 83% of the
Indonesian population. The second wave, in 1997/8, was followed
by a survey of 25% of the sample enumeration areas in 1998.
IFLS3 was conducted in 2000, IFLS4 was conducted in 2007/08,
and IFLS5 was conducted in 2014/15.
The surveys
contain retrospective histories about, for example,
employment, marriage,
fertility and migration over the life course of each
respondent. The surveys also include household consumption,
assets, self-reported health status and a battery of health measures
(including anthropometrics, hemoglobin, blood pressure, lung capacity
and time to stand from a sitting position). In 2007, cholesterol
and CRP assays using dried blood spots were added. In 2014, dried blood spots were
collected to measure CRP and HbA1c.
Public domain data and documentation are available on
the web.
The Malaysian Family Life Surveys (MFLS)
conducted in 1976/7 and 1988 also contain extensive
histories on employment, marriage, fertility and migration.
Respondents in the first wave were followed in the subsequent
waves; in the second wave, a refreshment sample was added.
MFLS1 (1976/77) and MFLS2 (1988)
are in the
public domain.
The Matlab Health and Socioeconomic Survey (MHSS)
was conducted in 1996 and covers the same area as the Matlab
Demographic Surveillance System. The data are in the
public domain.
A resurvey is underway.
The Mexican Family Life Survey (MxFLS)
is an on-going nationally representative longitudinal
survey of individuals, households, families and
communities.
The first wave was conducted in 2002. The first follow-up was
completed in 2005. The second follow-up was conducted in 2009/10.
In addition to
consumption, income, wealth,
employment, marriage and fertility,
the survey contains a module on crime and victimization
as well as migration histories. Respondents are followed if they
move and interviewed in their new location. This includes
people who move to the U.S. and those who return to Mexico.
Biomarker data are also collected, including
a battery of health measures
and dried blood spots, along with assets and self-reported health status.
Data from the first two waves collected in Mexico are in the
public domain.
The Guatemalan Survey of Family Health (EGSF)
is a single cross-section survey which
was conducted in rural communities in 4 of Guatemala's 22
departments. The survey was fielded in 1995. The data are
publicly available.
Conducted by a team of researchers from the United States
and the Philippines,
the Cebu Longitudinal Health and Nutrition Survey is an ongoing
study of a cohort of Filipino women who gave birth between May 1,
1983 and April 30, 1984 and have been re-interviewed
periodically since then.
The China Health and Nutrition Survey
baseline was conducted in 1989, and the survey has been fielded almost every
other year since then.
The survey
provides a wealth of detailed information on health and
nutrition of adults and children including physical examinations
and extensive biomarkers.
The Nang Rong projects represent
a major data collection effort that was started in 1984
with a census of households in 51 villages. The villages
were resurveyed in 1988 and again in 1994/95.
New entrants were interviewed and a subsample of
out-migrants was followed.
The Russia Longitudinal Monitoring Survey (RLMS)
is an on-going panel survey of households in Russia that
began in 1992.
Living Standards Measurement Studies (LSMS)
Since 1980, the
World Bank has been collecting multi-purpose household
survey data in several countries under the
Living Standards Measurement Study
umbrella.
This site contains information about the LSMS project,
lists the countries included in the project and
describes how data may be accessed.
The Vietnam Life History Survey is a collaboration between
the University of Washington,
the Institute of Sociology and the Institute of Social Sciences
in Vietnam. The survey collects
data from about 100 households in two urban and two rural
areas in Vietnam.
The data are available at
CSDE at UW.
The Vietnam Longitudinal Survey is a collaboration between
Professor Charles Hirschman, University of Washington, and
the Institute of Sociology in Vietnam. The survey collects
detailed demographic information from all adult respondents
in over 1,800 households in one area of Vietnam.
The data are available at
CSDE at UW.
China Health and Retirement Longitudinal Study (CHARLS)
The
China Health and Retirement Longitudinal Study (CHARLS)
is patterned after the Health and Retirement Study (HRS) in the US.
Pilot data were collected in 2008 in two provinces: Zhejiang and Gansu
(the richest and poorest provinces). One person aged 45 and over was
randomly chosen in each household with an age-eligible person, and that person
and their spouse were interviewed. The sample is representative of
people 45 and over in these two provinces in China. This sample
contains data on 1,570 households and just under 2,700 individuals.
Data are available
here.
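The within-household selection rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (one randomly chosen person aged 45 and over per household, plus their spouse); the record layout, the field names `id`, `age` and `spouse_id`, and the function `select_respondents` are hypothetical and are not taken from the CHARLS documentation.

```python
import random

def select_respondents(household, rng, min_age=45):
    """Pick one age-eligible member at random, plus their spouse if listed.

    `household` is a list of dicts with hypothetical keys
    'id', 'age' and 'spouse_id' (None if no spouse in the household).
    """
    eligible = [p for p in household if p["age"] >= min_age]
    if not eligible:
        # Households with no age-eligible person are not sampled.
        return []
    main = rng.choice(eligible)
    selected = [main]
    spouse_id = main.get("spouse_id")
    if spouse_id is not None:
        # The spouse is interviewed regardless of his or her own age.
        selected += [p for p in household if p["id"] == spouse_id]
    return selected

# Illustrative household: only person 1 is aged 45 or over.
household = [
    {"id": 1, "age": 52, "spouse_id": 2},
    {"id": 2, "age": 43, "spouse_id": 1},
    {"id": 3, "age": 20, "spouse_id": None},
]
rng = random.Random(0)  # fixed seed so the draw is reproducible
picked = select_respondents(household, rng)
print([p["id"] for p in picked])
```

In this made-up household only one member is age-eligible, so the draw always returns that person and the spouse. With several eligible members the seed determines who is chosen.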
The first nationally representative wave of CHARLS
was fielded in 2011 and includes 17,500 individuals living in 10,000 households
in 150 counties/districts and 450 villages/resident committees.
The first follow-up was in 2013 and a Life History Survey was conducted in 2014.
The Townsend Thai Project
The
Townsend Thai Data
comprises annual and monthly panels. The baseline survey was conducted in 1997
in villages in four provinces and has been expanded to add urban areas and other
provinces.
Upon beginning the Townsend Thai project, the Principal Investigator, Dr.
Robert M. Townsend, sought to understand what risks households in a
typical village face. In particular, he wanted to examine if adverse
events, or shocks, that households experience were aggregate or
idiosyncratic. This led to the creation of what was originally intended to
be one large survey of rural Thailand in 1997, but which has continued to
the present.
The initial plan was to evaluate villages in similar environments with and
without formal institutions by using a combination of annual household,
institutional, and key informant surveys. In August 1998, the project
incorporated an extremely detailed monthly survey to gather data, in
combination with other environmental data. This ongoing data collection
project, the Townsend Thai data, has been generously supported over the
years by the National Institute of Child Health and Human Development
(NICHD), the National Science Foundation (NSF), the Ford Foundation, the
John Templeton Foundation, the Bill & Melinda Gates Foundation, the
University of the Thai Chamber of Commerce (UTCC), the Thai Ministry of
Finance, the Bank for Agriculture and Agricultural Co-operatives (BAAC),
the National Opinion Research Center (NORC), and the National Bureau of
Economic Research (NBER).
India Human Development Survey
The
India Human Development Survey is a nationally representative multi-topic
longitudinal survey of over 41,000 households in India. The baseline was conducted
in 2004-5. Data are available through
ICPSR.
DataFirst,
a research unit
at the University of Cape Town, operates a web portal for South African census
and survey data as well as metadata and all research output based on these
data.
The catalogue of downloadable datasets is
here.
INDEPTH Network
The
INDEPTH Network is a global network of health and demographic
surveillance system (HDSS) field sites in Africa, Asia and Oceania.
The network describes itself as being the only organization in the world
capable of producing reliable longitudinal data not only about the
lives of people in low- and middle-income countries but about the impact
on those lives of development policies and programs.
The
INDEPTH Data Repository shares data from the
INDEPTH sites.
State-level data from India compiled by the Economic Organisation and Public
Policy Programme at the LSE are available
here. Topics covered include
land reform
media and political agency
labor regulation
quality of life
economic reforms
India Agriculture and Climate Data Set
The database provides district level data on
agriculture and climate in India from 1957/58 through 1986/87.
The dataset includes information on
Area planted, production and farm harvest prices for five major and
fifteen minor crops.
Areas under irrigated and high-yielding varieties (HYV)
for major crops.
Data on agricultural inputs, such as fertilizers,
bullocks and tractors, in both quantity and price terms.
Agricultural labor, cultivators, wages and factory
earnings, rural population and literacy proportion.
Meteorological station level climate data (average climate over
a 30-year period)
Soil data
The dataset was compiled by
Apurva Sanghi, K.S. Kavi Kumar, and James W. McKinsey,
of the World Bank and draws on work by James McKinsey
and Robert Evenson of Yale University.
For more information, click
here.
The data and documentation are available
here.
A note on converting the files to STATA written by Gareth Nellis is
here.
Indian National Sample Survey Organisation
The National Sample Survey Organisation (NSSO)
of India has a long tradition of conducting
high quality surveys.
NSSO carries out socio-economic surveys,
undertakes field work for the Annual Survey of Industries
and follow-up surveys of the Economic Census,
conducts sample checks on area enumeration
and crop estimation surveys,
prepares the urban frames used in drawing urban samples,
and collects price data from rural and urban sectors.
The data are
available for purchase
on CD.
Mexican Health and Aging Study (MHAS)
The
Mexican Health and Aging Study
is a prospective longitudinal survey of older
adults (born before 1951) and their spouses.
The first wave was conducted in 2001 and
interviewed almost 10,000 adults and 5,000
spouses. The first follow-up was completed
in 2003. The project is a collaboration
of researchers at the Universities of Pennsylvania,
Maryland and Wisconsin with INEGI in Mexico. It is
directed by Beth Soldo.
SABE (Salud, Bienestar y Envejecimiento en America Latina y el Caribe)
SABE (Salud, Bienestar y Envejecimiento en America Latina y el Caribe)
is a series of comparable cross-national surveys
on health and aging organized as
a cooperative venture among researchers in Argentina, Barbados,
Brazil, Chile, Cuba, Mexico and Uruguay.
The goal of the project is to describe health, cognitive
achievement and access to health care among people age 60
and older with a special focus on people over 80 years old.
Professor Alberto Palloni is the PI of the project, which has been
funded by PAHO and the NIA.
Learning and Education Achievement in Punjab Schools
The Learning and Education Achievement in Punjab Schools (LEAPS) Project
is a multi-year project initiated by researchers at Harvard University,
Pomona College, and the World Bank that attempts to capture and track
changes in the educational universe at the primary level (up to grade 5) in
112 villages in Pakistan. The main component of the project is a set of
extensive surveys designed & conducted by the LEAPS team, with care being
taken to be representative of the various actors in the educational
market.
The data consist of questionnaires administered to all 823 primary
schools (public, private, NGO) in the 112 villages, to over 800 teachers
(with basic information on 5,000 teachers), 1,800 households, 6,000 school
children, and achievement tests of 12,000 class 3 children in Mathematics,
English, and Urdu. All children, households, schools and teachers are
matched and then followed over three additional (annual) rounds of
surveys, for a complete 4-year panel.
The first round of data from these surveys & related documentation is now
publicly available for researchers at:
www.leapsproject.org.
The website
also provides related information (questionnaires for all rounds,
preliminary papers, and a LEAPS report that highlights findings from the
first round).
The RIGA project, a collaborative effort of FAO, the World Bank and
American University in Washington, DC, aims to promote the
understanding of the roles, relationships and synergies of on-farm
and off-farm income generating activities for rural households.
Building on existing household living standards surveys, the project
has developed methodologically consistent, internationally comparable
income data that are now available free of charge from the project's
website.
The database contains cross-country comparable indicators of
household-level income for 26 surveys representing 16 countries
across Africa, Asia, Eastern Europe and Latin America, making it a
valuable resource for researchers and analysts in the development
field. The surveys are both cross-sectional and panel, and currently
run from 1992 through 2005; more surveys will be added to the
database as they become available. While the RIGA project focuses
mainly on the analysis of rural issues, the dataset contains
information on both urban and rural income sources.
Descriptions of evaluations conducted by the Abdul Latif Jameel Poverty Action Lab
are available from the
J-PAL evaluations page.
Data underlying these evaluations are available from the same site.
IFPRI has conducted several very innovative surveys in African and Asian
countries. Many of these surveys are available for research
purposes. See their
home page and click on datasets.
Agricultural Innovation and Resource Management in Ghana
South African National Income Dynamics Study (NIDS)
NIDS is a nationally representative
panel study that examines income,
consumption and expenditure of households over time in South
Africa.
The baseline survey was conducted in 2008 and follow-ups have been
conducted every 2 years since then.
The data
throw light on matters such as coping strategies deployed in response
to shocks and unexpected events, whether
negative or positive, such as a death in the
family or an unemployed relative obtaining a job.
In addition to income and expenditure dynamics,
study themes include the determinants of changes in poverty
and well-being;
household composition and structure;
fertility and mortality;
migrancy and migrant strategies;
labour market participation and economic activity;
human capital formation,
health and education; vulnerability and social
capital.
See the NIDS
web page for details.
The Langeberge integrated household survey was conducted by a consortium of
South African and American universities along with government and
non-government agencies in South Africa. Data may be requested
by sending an email. See their
web page for details.
A survey conducted
in five Asian countries collected detailed information
on the status of women and their husbands in conjunction
with fertility choices.
Data collected in
Malaysia, Pakistan, the Philippines and Thailand in 1993/1994
are available for downloading
here.
Professor Doug Massey and collaborators have collected several
waves of surveys on migration from central Mexico with special
sub-samples of Mexicans living in Chicago. The data can
be obtained from the
MMP
web site or by contacting Kristin Espinosa at the University
of Pennsylvania. Her e-mail address is
espinosa@pop.upenn.edu.
The Latin American Migration Project (LAMP)
is an extension of the
Mexican Migration Project (MMP).
The project is directed by Professor Doug Massey
who, with his collaborators, has collected data
in Puerto Rico, the Dominican Republic, Nicaragua,
Costa Rica and Peru. Data are available
here.
This collection brings together fertility and health surveys carried out in
Central America. Data from Belize, Guatemala, El
Salvador, Honduras, Nicaragua, Costa Rica and Panama
are included in the collection.
TAPS is an annual panel data set covering the period 2002 through 2006
that follows a native Amazonian horticultural and
foraging society experiencing rapid integration with the rest of the world.
The study has been tracking about 1,500 native Amazonians
in about 250 households of 13 villages
along the Maniqui River, Department of Beni, Bolivia,
and has introduced agricultural development projects.
TAPS surveys take place every year during June-August. The first five years
of data, 2002-2006, are now available to the public in STATA.
To request access to the 2002-2006 panel data set and its
documentation go to the following web site:
http://people.brandeis.edu/~rgodoy/research/pgs/panel.html
or contact Ricardo Godoy (781) 736-2784, rgodoy@brandeis.edu.
The World Fertility Surveys (WFS)
were conducted in 41 countries
during the 1970s and early 1980s.
The data are all in the public domain and
available at the
Office of Population Research at
Princeton University.
This is a very good site to find out about data on
fertility including the
Chinese In-Depth Fertility Surveys.
Countries for which World Fertility Surveys are available include:
More recent fertility, mortality and health data are available from
Demographic and Health Surveys (DHS).
DHS has been collecting national sample surveys of population and
maternal and child health in many developing
countries since the 1980s. Data are currently collected under
the umbrella of the Measure project which is administered by
Macro International.
Data have been collected in four waves:
The Centers for Disease Control (CDC) assists countries throughout
the world in the development, implementation and analysis of national
reproductive health surveys.
Firm level data collected by The World Bank
in collaboration with the Centre for the Study of African
Economies, Oxford University, and several Government Statistical
Agencies may be downloaded from
this site.
CSAE faculty have collected firm level data in several
African countries.
Data from
Ghana, Ethiopia and Tanzania, and from a
comparative study in Cameroon, Ghana, Kenya and Zimbabwe,
are available from the
CSAE web-site.
Some of these data are also available on the
World Bank web site.
This site is maintained by The World Bank and
contains country level data on economic growth.
Data related to several articles published on
models of growth are available.
These include, for example:
Barro, Robert J., and Jong-Wha Lee. 1993.
"International Comparisons of Educational Attainment."
Journal of Monetary Economics
32 (3): 363-94.
De Long, J. Bradford, and Lawrence Summers. 1993.
"How Strongly Do Developing Economies Benefit from Equipment Investment?"
Journal of Monetary Economics
32 (3): 395-415.
Fischer, Stanley. 1993.
"The Role of Macroeconomic Factors in Growth."
Journal of Monetary Economics
32 (3): 485-512.
King, Robert G., and Ross Levine. 1993.
"Finance, Entrepreneurship, and Growth: Theory and Evidence."
Journal of Monetary Economics 32 (3): 513-42.
Levine, Ross, and David Renelt. 1992.
"A Sensitivity Analysis of Cross-Country Growth Regressions."
American Economic Review 82 (4): 942-63.
Many national statistical offices provide a wealth of information.
Census data are available from these offices;
some household- and firm-level surveys are
public use and may either
be downloaded or ordered from their office.
Geohive provides a listing of statistical offices
across the globe.
This page is maintained by Duke University.
Please send comments and suggestions
about this page including links to data sources to
Duncan Thomas.
Please address all questions about data availability,
access and quality to the institution providing
the data.
If no
institution is listed, we regret the data are not supported.