
Add-on project: Flood risk and livelihood in the Central Highlands of Vietnam

Project report

This project, jointly implemented by the University of Bristol and the University of Economics, Ho-Chi-Minh City, is titled “An Interdisciplinary Approach to Understanding Past, Present and Future Flood Risk in Viet Nam” (NERC/NAFOSTED, NE/S003061/1).

The project team included Dr. Pham Khanh Nam (University of Economics, Ho-Chi-Minh City, UEH) and Dr. Jeffrey Neal (University of Bristol, UoB) as PIs, and Dr. Felix Agyemang (UoB), Dr. Sean Fox (UoB), Dr. Andre Groeger (Universitat Autònoma de Barcelona, UAB Barcelona), Dr. Laurence Hawker (UoB), Dr. Truong Dang Thuy (UEH) and Dr. Yanos Zylberberg (UoB) as Co-Is and collaborators. The team was supported by two TVSEP staff members: Mr. Niels Wendt, who programmed the questionnaire in Survey Solutions and assisted the project team with the online supervision of the interviews, and Mr. Manh Hung Do, who assisted the project team with the field implementation of the survey.

This project on flood risk and livelihoods in Đắk Lắk builds upon the activities of the Thailand Vietnam Socio Economic Panel (TVSEP) in Vietnam. A survey was conducted during September and October 2019 in four villages of the province of Đắk Lắk, Vietnam, that belong to the TVSEP panel. While TVSEP has regularly interviewed 10 panel households per village since 2007, this project aimed at a village census and interviewed about 90% of all households in these four villages, for a total of 947 households. The project also introduced a few novel features: 1) network linkages between households are explicitly recorded; these include, for example, labour exchange, previous credit linkages, informal insurance networks, information exchange, friendship etc.; 2) the land plots and the assets (houses) of each household are precisely geo-referenced using satellite images augmented by cadastral boundaries; 3) buildings were photographed and geo-referenced, and the module on housing conditions was extended to capture possible resilience to hydrological hazards. The second feature (2) will be further developed in subsequent panel waves of TVSEP.

The following sections give an overview of the project, covering survey design and sampling, the questionnaire, training and survey organization, quality assurance, and some selected initial results.

Research objectives

Rural households in developing economies adapt to risk and to the failure of formal institutions by diversifying their activities and engaging in small, informal institutional arrangements sustaining economic exchanges. The diversification of agricultural activities possibly underlies a puzzle highlighted in recent research: the very low and heterogeneous agricultural productivity in rural economies. The research builds upon a unique household survey in four rural villages of the Central Highlands of Vietnam and novel data with hydro-physical modelling of flood risk in order to:

  • shed light on the direct relationship between risk and the fragmented allocation of production inputs across various activities (e.g., agricultural households holding numerous, geographically dispersed land parcels);
  • explain the relation between such diversification and productivity (its average level, and its dispersion within villages);
  • understand the role of informal insurance networks in mitigating the risk.

The design of the survey thus responds to two needs: to capture flood risk precisely at the land parcel level (and at the building level), and to characterize local insurance networks fully. We describe the survey design next.

Survey design and sampling

Households were sampled by selecting four villages of Đắk Lắk covered by TVSEP between 2007 and 2017 (TVSEP regularly interviews 10 households per village as part of the panel). The sample is representative neither of the Central Highlands nor of Đắk Lắk province. This lack of representativeness at the district or province level is a drawback of the need to conduct a full census within villages: a representative sample would require a large number of villages spanning the range of settlements in Đắk Lắk along the main characteristics of interest (e.g., cropping patterns, non-agricultural sector, geography and flood risk, ethnic composition etc.). For practical and financial reasons, we had to restrict ourselves to a limited number of villages, which led to a more narrative approach to the sampling design.

The final selection of villages was based on the following principles:

  • TVSEP: the villages need to be covered by TVSEP between 2007 and 2017.
  • Flooding: the villages should be exposed to floods and cover different topographies. In practice, most rural villages in Đắk Lắk are in relatively low-land areas surrounded by rugged terrain. The proximity to or the location along the river may however vary. Within our selection, one village is located in the North of Đắk Lắk while the three others are in the South, South-East and far East.
  • Agricultural diversification: the villages should be mostly rural and should cover the different types of agricultural production observed in Đắk Lắk (i.e., rice, coffee and other cash crops, such as rubber or pepper).
  • Size: the villages should be composed of between 150 and 350 households.
  • Outliers: the villages should not report issues with drug trafficking and should not host a major factory employing a significant share of the villagers.

Sampling of households within villages: within each village, the purpose of the survey was to interview all households. To this end, we collected administrative data and visited all registered households. We ended up interviewing 947 households, compared with about 1,025 registered households.
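The census coverage implied by these two figures can be checked with a short calculation. The snippet below is only an illustrative sketch: it uses the rounded count of registered households quoted above, so the resulting rate is approximate.

```python
# Coverage of the village census, using the figures quoted in the text.
interviewed_households = 947
registered_households = 1025  # approximate figure ("about 1,025")

coverage = interviewed_households / registered_households
print(f"Approximate census coverage: {coverage:.1%}")
```

The result, a little over 92%, is in line with the roughly 90% coverage reported for the four villages.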



The questionnaire was designed using the World Bank’s Survey Solutions software and was administered on tablets, some of them borrowed from TVSEP. The software allows for:

  • Real time feedback to the enumerators while conducting the interview through: 
    • Connection of sections for consistency and logic;
    • Identification of implausible values;
  • Paradata provides opportunities to monitor the enumerators’ activity (e.g., length of the survey, time between questions etc.) and identify underlying issues in the interview process;
  • Display of warnings and errors to assist the enumerator;
  • Option for the enumerator to comment on implausible values and explain; 
  • Checking and rejection of questionnaires after upload to further enhance data quality. 
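The kind of checks listed above can be illustrated with a minimal sketch. It is written in Python for readability and is not the syntax used inside Survey Solutions; the field names (`household_size`, `land_area_ha`, `duration_minutes`) and all thresholds are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Interview:
    """One completed questionnaire (hypothetical fields)."""
    interview_id: str
    household_size: int
    land_area_ha: float
    duration_minutes: float  # would come from paradata


def check_interview(iv: Interview,
                    min_duration: float = 30.0,
                    max_land_ha: float = 50.0) -> List[str]:
    """Return warnings for implausible values; thresholds are illustrative."""
    warnings = []
    if iv.household_size < 1:
        warnings.append("household size must be at least 1")
    if not 0 <= iv.land_area_ha <= max_land_ha:
        warnings.append("implausible land area")
    if iv.duration_minutes < min_duration:
        warnings.append("interview unusually short")
    return warnings


ok = Interview("A-001", household_size=4, land_area_ha=1.5, duration_minutes=75.0)
bad = Interview("A-002", household_size=0, land_area_ha=120.0, duration_minutes=10.0)
print(check_interview(ok))
print(check_interview(bad))
```

A questionnaire that returns a non-empty list would be a candidate for an enumerator comment or for rejection after upload.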

The survey questionnaire is available online together with the present document. It builds heavily upon the 2019 TVSEP questionnaire, except for the following unique features: the land plots and assets of each household are precisely geo-located using a novel procedure based on the recognition of land parcels on satellite images augmented by cadastral boundaries; production data are matched at the parcel and crop levels (including a decomposition of labour input between hired labour, family labour and labour exchange between households); all network linkages between households are recorded (e.g., previous credit linkages, informal insurance network, information exchange).

We provide below a description of the land parcel recognition procedure and the identification of the household network.

Parcel module: The geo-localization of parcels is part of the land module and proceeds as follows:

  • A satellite map is prepared and augmented by the addition of points of interest (e.g., gas stations, supermarkets, schools etc.). The map covers a radius of 8 kilometers around the village centroid.
  • The software automatically centers the map around the current location (the “House”); the interviewer then helps the respondent navigate by showing her/him the main points of interest, the main roads, the waterways. In practice, the most efficient way of finding a land parcel is to ask the respondent to follow the usual route on the map, starting from the house to the land parcel.
  • Once the location of the land parcel is identified by the respondent on the map, the interviewer draws a polygon under the instructions of the respondent.
  • Additional questions help capture possible issues with the geo-localization, e.g., how sure the respondent may be, how much help was needed etc.
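Once a polygon has been drawn, a measured parcel area can be derived from its vertices. The sketch below is one possible way to do this and is not taken from the survey's own processing: it assumes vertices are stored as (latitude, longitude) pairs in degrees, projects them onto a local plane (a reasonable approximation for small parcels), and applies the shoelace formula. The function name and the example coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def polygon_area_m2(vertices):
    """Approximate area in square metres of a small polygon.

    vertices: list of (lat, lon) pairs in degrees, in drawing order.
    Uses a local equirectangular projection and the shoelace formula.
    """
    if len(vertices) < 3:
        return 0.0
    lat_ref, lon_ref = vertices[0]
    mean_lat = math.radians(sum(lat for lat, _ in vertices) / len(vertices))
    # Project to metres relative to the first vertex.
    points = [
        (EARTH_RADIUS_M * math.radians(lon - lon_ref) * math.cos(mean_lat),
         EARTH_RADIUS_M * math.radians(lat - lat_ref))
        for lat, lon in vertices
    ]
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical rectangular parcel, for illustration only.
parcel = [(12.7000, 108.0000), (12.7000, 108.0010),
          (12.7010, 108.0010), (12.7010, 108.0000)]
print(f"{polygon_area_m2(parcel) / 10_000:.2f} ha")
```

For larger polygons or higher accuracy, a proper projected coordinate system would be preferable to this local approximation.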

We provide below a comparison between the measured land characteristics (area) and these characteristics as reported by the respondent.

Figure 1: Validation of the geo-location data. Notes: This figure reports the relationship between the measured land area (x-axis) and the area as reported by the respondent (y-axis). We create bins of observations along the x-axis variable and the dots represent the average of the y-axis variable within each bin. The lines are locally weighted regressions with their associated 95% confidence interval.
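The binning step described in the notes to Figure 1 can be sketched as follows. This is an illustration under assumptions: the text does not say how the bins are defined, so the sketch uses bins with (nearly) equal numbers of observations, and the function name and default number of bins are hypothetical. The locally weighted regression lines are not reproduced here.

```python
def binned_means(x, y, n_bins=20):
    """Sort observations by x, split them into n_bins groups of (nearly)
    equal size, and return the (mean x, mean y) of each group."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    n_bins = min(n_bins, n)  # never more bins than observations
    result = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        mean_x = sum(p[0] for p in chunk) / len(chunk)
        mean_y = sum(p[1] for p in chunk) / len(chunk)
        result.append((mean_x, mean_y))
    return result


# Toy data: measured area (x) and reported area (y), for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
reported = [10.0, 20.0, 30.0, 40.0]
print(binned_means(measured, reported, n_bins=2))
```

Each returned pair corresponds to one dot in a binned scatter plot of this kind.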

Network module: The identification of the household network relies on: (i) a list of contacts with their name, age, gender, phone number (last 6 digits), and a description of their relationship with the different household members; (ii) references to these contacts when relevant along the questionnaire.

Enumerators were encouraged to establish a preliminary list and to update the list as the interview went along if new contacts were mentioned by the household.

The matching algorithm proceeds in steps: (i) matching is performed on gender, age (within a window of 5 years), and the last 6 digits of the phone number; among unmatched entries, (ii) matching is then based on gender, age, and exact name matching; (iii) unmatched entries are finally matched through a fuzzy matching on names, accounting for specificities of the Vietnamese language (and frequent misspelling). The outcome of this matching procedure is about 2,900 linkages from about 4,000 reported contacts (a match rate of about 71%).
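The three steps can be sketched as a simple cascade. This is a minimal illustration, not the project's actual code: the `Person` fields, the use of Python's `difflib.SequenceMatcher` for fuzzy matching, the 0.85 similarity threshold, and the choice to keep the gender and age conditions in the fuzzy step are all assumptions. Diacritics are stripped before fuzzy comparison as one simple way of handling Vietnamese spelling variants.

```python
import unicodedata
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Person:
    name: str
    gender: str
    age: int
    phone6: Optional[str] = None  # last 6 digits of the phone number


def exact_key(name):
    """Case-insensitive, whitespace-normalized name (diacritics kept)."""
    return unicodedata.normalize("NFC", " ".join(name.split())).casefold()


def fuzzy_key(name):
    """Lowercase name with Vietnamese diacritics removed."""
    text = unicodedata.normalize("NFD", name.lower().replace("đ", "d"))
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return " ".join(text.split())


def match_contact(contact, villagers, age_window=5, fuzzy_threshold=0.85):
    """Return (villager, method), or (None, None) if no match is found."""
    candidates = [v for v in villagers
                  if v.gender == contact.gender
                  and abs(v.age - contact.age) <= age_window]
    # Step (i): gender, age window, and last 6 digits of the phone number.
    if contact.phone6:
        for v in candidates:
            if v.phone6 == contact.phone6:
                return v, "phone"
    # Step (ii): gender, age window, and exact name.
    for v in candidates:
        if exact_key(v.name) == exact_key(contact.name):
            return v, "exact"
    # Step (iii): fuzzy name matching (assumed similarity threshold).
    best, best_score = None, fuzzy_threshold
    for v in candidates:
        score = SequenceMatcher(None, fuzzy_key(contact.name),
                                fuzzy_key(v.name)).ratio()
        if score >= best_score:
            best, best_score = v, score
    return (best, "fuzzy") if best is not None else (None, None)


villagers = [Person("Nguyễn Văn An", "M", 40, "123456"),
             Person("Trần Thị Bình", "F", 35, "654321")]
print(match_contact(Person("Tran Thi Binh", "F", 36), villagers))
```

In this toy example the contact's name is written without diacritics, so it fails the exact step and is recovered by the fuzzy step.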

We provide below descriptive statistics about the match quality: differences between reported age by the villager and the actual age of the matched friend; and a decomposition of friends by matching method (exact name matching, fuzzy matching, matching using the phone number).

Figure 2: Match quality (age difference). Note: This figure reports the distribution of age differences (reported versus actual) within the sample of matched contacts.
Figure 3: Match quality (matching method). Notes: This figure reports the number of matches per matching method (phone: match based on gender, phone number, and age; exact: match based on exact string matching between names; fuzzy: match based on fuzzy string matching between names).


The training of enumerators and supervisors followed three main steps. In a first step, enumerators were encouraged to read the training manual (available online together with this document). In a second step, enumerators and supervisors went through a “classroom” training of about 5-6 days, combining practice with theoretical discussions (e.g., how to collect income and expenditure data, how to report wages and hours worked, how to convert quantities for harvested crops etc.). In a third step, enumerators and supervisors conducted household interviews in a fifth village near Buon Ma Thuot over 1-2 days to gain experience “in the field”.

Most of the enumerators and supervisors were either former TVSEP enumerators or students in Economics from the local university (Tay Nguyen University).


Survey organization

The survey was supervised by our main team at the University of Economics, Ho-Chi-Minh City, and one researcher was always present in the field. The villages were surveyed sequentially; the whole team of supervisors and 25 enumerators was thus present in the same village at any point in time. Each of the three supervisors was responsible for a sub-team of about 8 enumerators. Finally, each of the five data checking assistants (DCAs) was allocated five enumerators.

The survey was completed in about one month from mid-September 2019 to mid-October 2019, and no incidents of significance occurred. A barbecue with all the organizing team and the enumerators was organized in the last village upon survey completion.

Quality assurance

Quality assurance relied on: (i) live monitoring of each questionnaire by DCAs, with a large number of questionnaires rejected and resubmitted after minor revisions; (ii) monitoring of each enumerator’s activity through the Survey Solutions software; (iii) preliminary warnings built into the questionnaire.


All in all, the survey can be regarded as a success, with all goals achieved. The implementation using computer-assisted personal interviews (CAPI) and the use of both the TVSEP villages and questionnaire as a basis for sampling and survey proved invaluable and significantly increased the quality of the collected data. Further assistance from TVSEP greatly facilitated the project.

The collected data will provide the foundation for a wide variety of research, and the new approaches explored in this survey will in turn help to advance future TVSEP questionnaires.