Ernst Strüngmann Forum

Digital Ethology

Human Behavior in Geospatial Context

Edited by Tomás Paus and Hye-Chung Kum

An edited collection that looks deeply at how humans transform their environments and how these environments, in turn, shape humans.

Countless permutations of physical, built, and social environments surround us in space and time, influencing the air we breathe, how hot or cold we are, how many steps we take, and with whom we interact as we go about our daily lives. Assessing the dynamic processes that play out between humans and the environment is challenging. Digital Ethology, edited by Tomás Paus and Hye-Chung Kum, explores how aggregate area-level data, produced at multiple locations and points in time, can reveal bidirectional—and iterative—relationships between human behavior and the environment through their digital footprints.

Experts from geospatial and data science, behavioral and brain science, epidemiology and public health, ethics, law, and urban planning consider how humans transform their environments and how environments shape human behavior.

Contributors

José Balsa-Barreiro, Kim A. Bard, Steven Bedrick, Michael Brauer, Thomas Brinkhoff, Nitesh V. Chawla, Tamas Dávid-Barrett, Megan Doerr, Guillaume Dumas, Peter Ejbye-Ernst, Sophia Frangou, Camilla Bank Friis, Jason Gilliland, Kimmo Kaski, Heidi Keller, Fabio Kon, Hye-Chung Kum, Lasse Suonperä Liebst, Marie Rosenkrantz Lindegaard, Gina S. Lovasi, Daniel P. Lupp, Claudia Bauzer Medeiros, Maria Melchior, Mónica Menendez, Virginia Pallante, Tomás Paus, Beate Ritz, Sven Sandin, Abeed Sarker, Cason D. Schmit, Lindsey Smith, Kimberly M. Thompson, Henning Tiemeier, Michele C. Weigle

Digital Ethology: From Individuals to Communities and Back

July 24–29, 2022

Frankfurt am Main, Germany

Tomas Paus and Hye-Chung Kum, Chairpersons

Program Advisory Committee

Kimmo Kaski, Hye-Chung Kum, Julia Lupp, Maria Melchior, and Tomas Paus

Background

Over the past decade, large-scale genomic studies have identified common genetic variations associated with complex traits in health (e.g., educational attainment) and disease (e.g., psychiatric disorders). These genomic studies were made possible through technological and conceptual advancements sparked by the Human Genome Project and by pooling together numerous datasets to increase statistical power (e.g., ~1 million individuals in a genetic study of educational attainment). Discoveries from these studies have yielded valuable insights into the molecular pathways that underlie complex traits, yet they can explain only a small fraction of inter-individual variability for a given trait. Although some of the “missing” heritability might be carried by rare genetic variants, it is generally accepted that environmental influences contribute a much larger portion of variance.

The environment, however, is difficult to measure on a large scale. Still, the ubiquitous presence of information technology in our lives has created a vast body of digital information and provides a detailed record of many human activities. Harvesting this digital footprint for research purposes lags behind its partisan and for-profit use. Key barriers include a much lower tolerance for error in research, higher ethical and legal standards for data use, and demanding requirements for cyberinfrastructure and technical skills. The potential exists for this information to be extracted from multiple sources, so that a rich picture of the human environment can be obtained and related to various phenomena (e.g., brain maturation [Parker et al. 2017], well-being [Kardan et al. 2015], obesity [Maharana and Nsoesie 2018], health [Abnousi et al. 2018], social relationships [David-Barrett et al. 2016]).

How can this potential be realized, especially in light of differences in conceptualization and methodology across disciplines as well as national differences, governance issues, and ethical considerations? To promote greater understanding within and between disciplines, and to stimulate future research, the Ernst Strüngmann Forum is convening this transdisciplinary dialogue.

This Forum will explore how digital ethology, the study of human behavior as captured by its digital footprint, can be used to quantify the human environment and facilitate understanding of its impact on health and well-being. The behavior that we seek to understand can be observed directly (e.g., tweets) or indirectly, as inferred from its effect on the physical environment (e.g., broken windows visible on Google Street View). Key concepts will be examined, as will the methods needed to quantify the human environment from existing data sources at the aggregate (e.g., neighborhood) level. Requirements for using the resultant information, in conjunction with individual-level data derived from administrative databases, will be explored, as will privacy issues and ethical, legal, and societal implications. Ways of linking aggregate- and individual-level data through geospatial coding will be examined at different levels of spatial granularity. In summary, this Forum aims to:

Examine ways through which digital data can broaden research into human behavior and support future comparative behavioral studies across species
Construct a conceptual and methodological framework for integrating various data sources
Expand understanding of how the environment shapes human development across the life span

This Forum is supported by the Deutsche Forschungsgemeinschaft

The German Research Foundation

Group 1: How concepts of ethology can be applied to large-scale digital data

The ethological approach is used to study naturally occurring behavior. In the modern world, such behavior is connected to, and recorded by, a wide array of digital services (e.g., communication and social networking, online shopping, information search). How can ethological concepts be applied to help us characterize the modern environment in which humans live? What aspects of the ethological approach can guide us to obtain measures captured directly from digital data generated by our social activities? What kinds of models do we need to understand how human behavior can be inferred from the physical and built environment? The bidirectional nature of these relationships will be explored; namely, how individuals create their environment, and how the environment shapes the individual.

Group 2: Quantifying and geocoding the physical and built environment

Human activities (behaviors) influence the physical environment (e.g., air, green space) and generate the built environment (e.g., roads, sidewalks, stores, service and digital infrastructures); in turn, the physical and built environments influence human behavior (e.g., by imposing barriers). Which sources of information (e.g., aerial and satellite images, Google Street View) are relevant to the study of human development in these two domains? This group will explore ways to extract meaningful signals from these sources and to map these signals at different levels of spatial and temporal granularity. It will also suggest models (e.g., predictive, statistical, physical/mathematical) and platforms for sharing tools to facilitate their use.

Group 3: Quantifying and geocoding the social environment

Individuals both create and respond to their social environment through their behavior. Which types of data from heterogeneous digital streams (e.g., Twitter, Facebook, Google search, call detail records, smartphone locations) are relevant to the study of the human environment and, in turn, human development and health across the life span? This group will propose ways to extract meaningful signals from these sources (e.g., natural language processing) and to map these signals at different levels of spatial and temporal granularity. It will also explore ways to analyze these measures, using modeling approaches (e.g., predictive, physical/mathematical, statistical) and platforms for sharing tools to facilitate their use.

Group 4: Integrating knowledge from individual- and aggregate-level data

National and local (e.g., municipal) administrative data (e.g., health, education, income, housing, civil, social services) systematically and continuously capture information relevant for health and well-being, thus providing an ideal source for research in these domains. Large population databases built from these sources efficiently capture individual-level data and have grown over the last decade, albeit unevenly across countries. This group will discuss how individual-level data derived from (existing) heterogeneous databases can be brought together with (newly derived) aggregate-level data about the physical, built, and social environments. Strategies for generating knowledge from these linked data will be explored (e.g., data fidelity, efficient workflow, statistical modeling and validation, high dimensionality, interpretation). Issues related to data governance and barriers to data access will be considered, as will the ethical, legal, and societal implications of this line of research.

Parker, N., A. P. Wong, G. Leonard, et al. 2017. Income Inequality, Gene Expression, and Brain Maturation during Adolescence. Sci. Rep. 7:7397

Kardan, O., P. Gozdyra, B. Misic, et al. 2015. Neighborhood Greenspace and Health in a Large Urban Center. Sci. Rep. 5:11610

Maharana, A., and E. O. Nsoesie. 2018. Use of Deep Learning to Examine the Association of the Built Environment with Prevalence of Neighborhood Adult Obesity. JAMA Network Open 1:e181535

Abnousi, F., J. S. Rumsfeld, and H. M. Krumholz. 2018. Social Determinants of Health in the Digital Age: Determining the Source Code for Nurture. JAMA doi:10.1001/jama.2018.19763

David-Barrett, T., et al. 2016. Communication with Family and Friends across the Life Course. PLoS One 11:e0165687