National Baseline Household Survey, NBHS 2009

Sudan, 2009

Reference ID

SDN_NBHS_2009_HD_V2.0

Producer(s)

Economic Research Forum, Central Bureau of Statistics (CBS)

Collections

Income and Expenditure Surveys ERF Harmonized Datasets

Created on

May 08, 2014

Last modified

Oct 30, 2014

Identification

Survey ID

SDN_NBHS_2009_HD_V2.0

Title

National Baseline Household Survey, NBHS 2009

Country

Name	Country code
Sudan	SDN

Study type

Income/Expenditure/Household Survey [hh/ies]

Series information

The National Baseline Household Survey 2009 (NBHS) is the second national sample survey designed to provide information for all of Sudan after the 2005 peace agreement.

Abstract

THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED FOR THE 15 NORTHERN STATES BY THE CENTRAL BUREAU OF STATISTICS (CBS) IN THE REPUBLIC OF SUDAN.

The central focus of the National Baseline Household Survey for the year 2009 (NBHS 2009) is to provide indicators and reference statistics on the living conditions of all Sudanese. The NBHS 2009 was conducted uniformly in all 25 States of Sudan: CBS collected and processed the data for the 15 Northern States, and the Southern Sudan Centre for Census, Statistics and Evaluation (SSCCSE) had the same responsibility for the 10 Southern States.

The NBHS involves random sampling, interviewing 528 households in every state. The respondents were asked questions about their education, economic activity, consumption, housing and other topics addressing the issue of poverty.

Accurate, up-to-date, and relevant data from household surveys are essential for governments to make sound economic and social policy decisions. Governments need the data from the NBHS to measure and monitor poverty, employment and unemployment, school enrollment, nutritional status, housing conditions, and other dimensions of living standards. They need the data to determine whether food, drinking water, schools, agriculture, services, roads, electric power, and other basic services are reaching the poor and other disadvantaged.

Additionally, analysts need household survey data to model economic behavior and thus provide answers to important policy questions.

The raw survey data provided by the Central Bureau of Statistics for the 15 Northern States were cleaned and harmonized by the Economic Research Forum in the context of a major project that started in 2009, during which extensive efforts have been made to acquire, clean, harmonize, preserve and disseminate micro data of existing household surveys in several Arab countries.

Kind of data

Sample survey data [ssd]

Unit of analysis

1- Household/family.
2- Individual/person.

Version

V1.0: The raw version of the survey data used for cleaning and harmonization purposes only. This version is not publicly disseminated.

V2.0: A cleaned and harmonized version of the survey dataset, produced by the Economic Research Forum for dissemination. It includes all harmonized variables that could be generated from the raw survey data, as well as detailed composite coded versions of the variables considered essential at the household and individual levels.

Version date

2014-04

Version notes

All documentation available for the original survey provided by the Statistical Agency, and for the harmonized datasets produced by the Economic Research Forum, has been published, along with a copy of all international classifications of expenditures, occupations and economic activities used during the harmonization process.
However, as far as the datasets are concerned, the Economic Research Forum produces and releases only the harmonized versions in both SPSS and STATA formats.

Scope

Notes

Household: Includes geographic, social, and economic characteristics of households, namely, household composition, dwelling characteristics, ownership of assets indicators, heads' and spouses' characteristics, annual household expenditure and income.

Individual: Includes demographic, migration, education, labor and health characteristics, as well as annual income for household members identified as earners. Moreover, fathers' and mothers' characteristics are generated for household members if possible.

Topics

Topic	Vocabulary
Poverty	ERF
Expenditure	ERF
Income	ERF
Infrastructure	ERF
Education	ERF
Labor	ERF
Health	ERF

Coverage

Geographic coverage

This survey is representative of the 15 states of Northern Sudan, including urban and rural areas.

Universe

The survey covered a national sample of private households in Northern states of Sudan and all individuals permanently residing in surveyed households.
It did not cover special types of households such as institutional households (hostels, hospitals, etc.), refugee camps, IDP camps, cattle camps, and homeless people.

Producers and sponsors

Primary investigators

Name	Affiliation
Economic Research Forum
Central Bureau of Statistics (CBS)	Sudan

Producers

Name	Role
Statistics Norway	Technical assistance
United Nations Development Fund	Technical assistance
United Nations Food and Agriculture Organization	Technical assistance

Funding Agency/Sponsor

Name	Role
African Development Bank	Financial support
The Norwegian Ministry of Foreign Affairs	Financial support
Federal Ministry of Finance and National Economy	Financial support

Sampling

Sampling procedure

The sample selected for the NBHS 2009 was based on a stratified two-stage sampling procedure. The preliminary count of households per Enumeration Area (EA) and the cartographic work from the 2008 National Population and Housing Census were used as the sampling frame. The census EAs constituted the primary sampling units and, for the NBHS 2009, were stratified by urban and rural areas within each State. Some sample EAs could not be covered because of security or other problems; in such cases they were replaced by EAs within the same geographical areas. In addition, the sample did not include the nomadic population, due to the lack of a proper sampling frame and problems of accessibility. Institutional households, camps, etc., as well as the homeless population, were also excluded from the sample.

A second sampling stage was conducted by listing all households within the selected EAs in the sample and thereafter selecting a fixed random sample of 12 households to be interviewed. A total sample size of 528 households in each State was distributed into 44 primary sampling units (PSUs).
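
The arithmetic behind these figures, and the second-stage draw, can be sketched as follows. This is only an illustration: it assumes the 12 households are drawn by simple random sampling without replacement from the EA listing (the text says only "a fixed random sample"), and the function name `select_households` and the example listing are hypothetical.

```python
import random

# Figures taken from the sampling description above.
HOUSEHOLDS_PER_EA = 12   # fixed number of households interviewed per EA
EAS_PER_STATE = 44       # primary sampling units (EAs) per State
NORTHERN_STATES = 15


def select_households(listing, k=HOUSEHOLDS_PER_EA, seed=None):
    """Draw k households without replacement from an EA household listing.

    Assumption: simple random sampling; the survey's actual selection
    routine is not described in the source.
    """
    rng = random.Random(seed)
    return rng.sample(listing, k)


# Designed sample sizes implied by the text.
state_sample_size = EAS_PER_STATE * HOUSEHOLDS_PER_EA        # 44 * 12
northern_design_size = state_sample_size * NORTHERN_STATES   # 528 * 15

# Hypothetical listing of 150 households in one EA.
ea_listing = [f"HH-{i:03d}" for i in range(1, 151)]
chosen = select_households(ea_listing, seed=2009)

print(state_sample_size, northern_design_size, len(chosen))
```

The product 44 x 12 reproduces the 528 households per State quoted above; the 15-State total is the designed size, not necessarily the number of completed interviews.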

The sample size was designed to obtain reliable estimates for key survey variables at the State level and for urban and rural domains at the national level, for the 15 Northern States as a group, and for the 10 Southern States as a group.

Response rates

The response rate for the NBHS 2009 conducted in the 15 Northern States, including replacements, is 99.9 per cent.

Weighting

The total sample for Northern Sudan is not self-weighting since a fixed sample of 528 households was drawn from each state, irrespective of the population size. Therefore, to derive estimates for Northern Sudan, it was necessary to assign a weight to each state level sample.
The variables "weight" or "hhweight" (renamed HHWEIGHT and HHWEIGHT2, respectively, in the harmonized data file) should be used as the weighting coefficient. The two variables weight the sample identically.
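
To illustrate why the weight matters when pooling states, here is a minimal sketch of a weighted mean. Only the variable name HHWEIGHT comes from the text; the records, the `expenditure` field and the weight values are invented for illustration and are not taken from the dataset.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical households: equal sample sizes per state would hide the fact
# that each sampled household in a large state stands for more households.
households = [
    {"state": "A", "expenditure": 100.0, "HHWEIGHT": 50.0},
    {"state": "A", "expenditure": 200.0, "HHWEIGHT": 50.0},
    {"state": "B", "expenditure": 400.0, "HHWEIGHT": 300.0},
]

values = [h["expenditure"] for h in households]
weights = [h["HHWEIGHT"] for h in households]

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)

print(round(unweighted, 2), weighted)
```

In this toy example the unweighted mean understates the pooled figure because the household from the larger state carries a larger weight.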

Survey instrument

Questionnaires

The questionnaire was designed in English and translated into Arabic with the same wording and modules. It was distributed to the respondents in Arabic only.

In addition, a comprehensive field manual (in English) was prepared to assist the fieldworkers in filling out each section of the questionnaire. A summary manual was translated into Arabic and used for training in the 15 Northern States.

The questionnaire was designed for Optical Character Recognition (OCR) using commonly available software. It was printed on standard 80-gram A4 paper and stapled into a booklet.

A technical working group was established in July 2008. This group oversaw the development of the questionnaire, with inputs from different stakeholders and technical consultants.

Data collection

Dates of Data Collection

Start	End	Cycle
2009-05-17	2009-06-30	-

Data Collectors

Name
Central Bureau of Statistics

Supervision

The trained interviewers and supervisors in each state were divided into three teams. Each field team was composed of one supervisor, responsible for supervision, guidance, selection of the ultimate sampling units and daily revision of completed questionnaires; four interviewers, responsible for household listing and data collection; and a driver. The state NBHS director provided overall supervision and follow-up of the field work on a daily basis. The field data collection was carried out from 17th May to 30th June 2009.

Teams of principal trainers visited the states during the field work period. They met with the interviewers and supervisors in the field and discussed the problems faced and the progress of data collection.

Notes on data collection

The field data collection was carried out in only 6 weeks during the period of 17th May 2009 - 30th June 2009.

Data processing

Cleaning operations

Raw Data
=======
The questionnaires for the 15 Northern states were scanned centrally at CBS premises in Khartoum. A high-capacity scanner and optical character recognition (OCR) software were used. Approximately 96-97% of all characters filled in were automatically interpreted and entered into the software's internal database. The scanning procedure included manual on-screen verification of the remaining data that could not be automatically interpreted. Finally, the scanned data were exported as ASCII files with corresponding digital images of each questionnaire. The data files were converted, further processed and edited, and tabulated using SPSS/PASW.

The NBHS 2009 data were edited through a combination of post-scanning automated edits and manual back-checks on the electronic images (TIF files) stored for each questionnaire. The latter were mainly used to verify outliers due to possible scanning or fieldworker errors.

The automated edits were pre-programmed to identify and correct consistency errors within each thematic section of the questionnaire. For age-related variables in particular (marital status, education and work), checks across sections were also applied.

Outliers were defined as values outside the range MEAN +/- 3 x STDV of the variable within its stratum. Outliers were listed and, unless a subject matter specialist intervened manually, automatically imputed to the MEDIAN value of the stratum.
However, for the very thorough edits of questionnaire section M (purchase and consumption), additional information on local market prices was used to correct the raw data.
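
The general outlier rule can be sketched as below. This is an illustration under stated assumptions, not the CBS edit program: it assumes the sample standard deviation and a median computed over all values in the stratum (the source specifies neither), and the function name `impute_outliers` is hypothetical. The manual specialist override and the price-based corrections for section M are not modelled.

```python
import statistics


def impute_outliers(values, k=3.0):
    """Replace values outside mean +/- k * stdev with the stratum median.

    `values` holds one variable for one stratum. Returns the edited values
    and a parallel list of flags marking which entries were imputed.
    Assumptions: sample standard deviation; median taken over all values.
    """
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    median = statistics.median(values)
    lower, upper = mean - k * stdev, mean + k * stdev

    edited, flags = [], []
    for value in values:
        is_outlier = value < lower or value > upper
        edited.append(median if is_outlier else value)
        flags.append(is_outlier)
    return edited, flags


# Hypothetical stratum: 19 plausible values and one extreme entry.
stratum = [10.0] * 19 + [1000.0]
edited, flags = impute_outliers(stratum)
print(edited[-1], flags.count(True))
```

Note that with very small strata a single extreme value cannot exceed three sample standard deviations, so a rule of this kind only bites when the stratum has enough observations.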

If skip was missing or inconsistent with responses given in the related detailed question, the detailed question response overruled the skip and the skip was adjusted.
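
A minimal sketch of this skip rule follows. It assumes a filter question coded "yes"/"no" (or `None` when missing) that controls whether a detailed question is asked; the coding and the name `reconcile_skip` are hypothetical and do not come from the questionnaire.

```python
def reconcile_skip(filter_answer, detail_answer):
    """Let a recorded detailed answer overrule the filter (skip) question.

    Returns (adjusted_filter, edited). If a detailed answer exists, the
    filter is set to "yes"; otherwise the filter is left unchanged.
    """
    if detail_answer is not None:
        adjusted = "yes"
    else:
        adjusted = filter_answer
    edited = adjusted != filter_answer
    return adjusted, edited


# Filter says "no", but a detailed answer was recorded: the detail wins.
print(reconcile_skip("no", "casual labour"))
# Filter missing and a detailed answer recorded: the filter is filled in.
print(reconcile_skip(None, "casual labour"))
# No detailed answer: nothing to overrule, filter kept as reported.
print(reconcile_skip("no", None))
```

The second return value plays the role of an edit flag, so adjusted records remain traceable.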

The difficulty of achieving consistency between age and the level of school currently attended was addressed by introducing a predefined acceptable age range, with upper and lower cut-offs, for each school level from Primary 1 to University. A person deemed too old for the reported school level was corrected to "not currently attending", and the initially reported school level was imputed in the "highest ever school level" variable.
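
The age edit can be sketched as follows. The cut-off values in `AGE_RANGES` are invented placeholders, since the actual ranges are not given in the source, and the field names are hypothetical. Only the "too old" case described above is handled; the treatment of people below the lower cut-off is not stated, so such records are left unchanged here.

```python
# Hypothetical acceptable age ranges (lower, upper) per school level.
AGE_RANGES = {
    "primary_1": (5, 12),
    "university": (16, 40),
}


def edit_school_attendance(record, age_ranges=AGE_RANGES):
    """Apply the 'too old for reported level' edit to one person record."""
    level = record["current_level"]
    if level is None:
        return dict(record, edited=False)

    _lower, upper = age_ranges[level]
    if record["age"] > upper:
        # Too old: set to not attending, move the level to 'highest ever'.
        return dict(
            record,
            attending=False,
            current_level=None,
            highest_level=level,
            edited=True,
        )
    return dict(record, edited=False)


person = {"age": 35, "attending": True,
          "current_level": "primary_1", "highest_level": None}
result = edit_school_attendance(person)
print(result)
```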

To keep track of the amount and type of edits done, all variables with automated or manual intervention were flagged.

Two cleaned master data files were produced from the NBHS 2009: one file at the individual level (sections B-D) and one at the household level (sections E-O). In addition, special files were produced for commodities (section M), used for poverty and food security calculations, and for agriculture (section N), covering crop production and structures.

There were some challenges encountered in the implementation of the survey:
· The change from the Quick Baseline Poverty Survey (QBPS) to the NBHS concept added modules that inflated the questionnaire, which involved much more work and required additional funds to conduct the survey.
· A delay of almost one month in transferring the field work budget to the CBS statistical offices in the states pushed the start of data collection from April to May 2009.
· Due to insecurity in some parts of the Darfur region, six clusters in South Darfur, three in North Darfur and one in West Darfur were replaced within the same geographical areas. In addition, because respondents refused to cooperate with the field work teams in two EAs (clusters), one in each of Blue Nile and Nahr Elnil states, these EAs were replaced and the field work was completed.
· The collection of consumption information for some items was made especially hard by the lack of standardized units of measurement in North Sudan. Because these items are obtained in non-standardized units (such as heaps, cups, bundles, rubu, etc.), it is hard to calculate consumption in standardized comparable units (such as kilograms and litres). Accordingly, the questionnaire allowed respondents to report consumption in non-standardized units. A market survey, conducted at state level, provided specific conversion factors for the non-standardized measurement units. While this was the only feasible solution, it may still be prone to non-trivial measurement errors.
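
The unit conversion described in the last point can be sketched as a lookup of state-level conversion factors. The factors, state labels and item names below are invented for illustration; the real values come from the market survey and are not reproduced in this document.

```python
# Hypothetical conversion factors: kilograms per one non-standard unit,
# keyed by (state, item, unit).
FACTORS_KG = {
    ("StateA", "sorghum", "heap"): 1.5,
    ("StateA", "sorghum", "rubu"): 3.0,
    ("StateB", "sorghum", "heap"): 2.0,
}


def to_kilograms(state, item, unit, quantity, factors=FACTORS_KG):
    """Convert a reported quantity to kilograms using state-level factors."""
    if unit == "kg":
        return quantity
    try:
        return quantity * factors[(state, item, unit)]
    except KeyError:
        raise KeyError(f"no conversion factor for {(state, item, unit)}")


print(to_kilograms("StateA", "sorghum", "heap", 4))
print(to_kilograms("StateB", "sorghum", "heap", 4))
```

The same reported quantity converts differently across states, which is one way the measurement error mentioned above can enter the consumption totals.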

Harmonized Data
============
- The Statistical Package for Social Science (SPSS) is used to clean and harmonize the datasets.
- The harmonization process starts with cleaning all raw data files received from the Statistical Agency.
- Cleaned data files are then all merged to produce one data file on the individual level containing all variables subject to harmonization.
- A country-specific program is generated for each dataset to generate/ compute/ recode/ rename/ format/ label harmonized variables.
- A post-harmonization cleaning process is then conducted on the data.
- Harmonized data are saved at the household and individual levels in SPSS format and converted to STATA format.
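
The merge step in the list above can be sketched as a join of household records onto individual records. ERF performs this in SPSS; the Python below is only an illustrative stand-in, and the key name `hhid` and the sample fields are hypothetical.

```python
def merge_to_individual_level(individuals, households, key="hhid"):
    """Attach household-level variables to every individual record."""
    by_id = {hh[key]: hh for hh in households}
    merged = []
    for person in individuals:
        household = by_id[person[key]]  # KeyError if the household is missing
        # Individual values take precedence if a field name collides.
        merged.append({**household, **person})
    return merged


households = [
    {"hhid": 1, "urban": True, "HHWEIGHT": 120.0},
    {"hhid": 2, "urban": False, "HHWEIGHT": 310.0},
]
individuals = [
    {"hhid": 1, "pid": 1, "age": 40},
    {"hhid": 1, "pid": 2, "age": 12},
    {"hhid": 2, "pid": 1, "age": 55},
]

individual_file = merge_to_individual_level(individuals, households)
print(len(individual_file), individual_file[0])
```

The result has one row per individual, which matches the description of a single individual-level file holding all variables subject to harmonization.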

Data Access

Access authority

Name	Affiliation	URL	Email
Economic Research Forum	(ERF)	www.erf.org.eg	erfdataportal@erf.org.eg

Confidentiality

Is signing of a confidentiality declaration required?	Confidentiality declaration text
yes	To access the micro data, researchers are required to register on the ERF website and comply with the data access agreement. The data will be used only for scholarly research, or educational purposes. Users are prohibited from using data acquired from the Economic Research Forum in the pursuit of any commercial or private ventures.

Access conditions

Licensed datasets, accessible under conditions.

Citation requirement

The users should cite the Economic Research Forum and the Central Bureau of Statistics in the Republic of Sudan as follows:

OAMDI, 2014. Harmonized Household Income and Expenditure Surveys (HHIES), http://www.erf.org.eg/cms.php?id=erfdataportal. Version 2.0 of Licensed Data Files; NBHS 2009 - Central Bureau of Statistics (CBS), Sudan. Egypt: Economic Research Forum (ERF).

Disclaimer and copyrights

Disclaimer

The Economic Research Forum and the Central Bureau of Statistics in the Republic of Sudan have granted the researcher access to relevant data following exhaustive efforts to protect the confidentiality of individual data. The researcher is solely responsible for any analysis or conclusions drawn from available data.

Contacts

Name	Email	URL
Economic Research Forum (ERF) - 21 Al-Sad Al-Aaly St., Dokki, Giza, Egypt	erfdataportal@erf.org.eg	www.erf.org.eg

Metadata production

Document ID

SDN_NBHS_2009_HD_V2.0

Producers

Name	Role
Economic Research Forum	Cleaning and harmonizing raw data received from the Statistical Agency

Date of metadata production

2014-04

Metadata version

Version

Version 2.0
