Introduction to Record Linkage (2026)
13.04.2026 09:00 - 13:00
Record linkage refers to the process of identifying and linking records from multiple data sources that correspond to the same underlying entity, for example by matching individuals across distinct administrative databases. By enabling the integration of separate datasets on the same persons or organisations, record linkage substantially expands the analytical potential of available data. Applications include the construction of historical panels, the validation of survey responses using process-generated data, and the estimation of the size of rare populations.
This in-person workshop provides a comprehensive introduction to record linkage for practitioners across diverse scientific disciplines, including medicine, the social sciences, and economics. It addresses the fundamental methodological challenges of record linkage, particularly data quality, classification error, and privacy protection. Participants will gain a practical understanding of these challenges and the conceptual foundations required to address them effectively.
Special emphasis will be placed on privacy-preserving record linkage (PPRL), which aims to enable accurate linkage while safeguarding sensitive personal information. The workshop concludes with a hands-on computational demonstration in R that illustrates selected approaches. These include probabilistic record linkage based on the Fellegi-Sunter model and privacy-preserving techniques employing Bloom filters.
Learning Objectives
- Record linkage fundamentals: Participants will understand the essential concepts of record linkage, including its purpose of identifying records from various data sources that refer to the same entity.
- Explore diverse applications: Attendees will discover how record linkage is applied across multiple scientific disciplines and gain a wider understanding of the analytical uses of linked data.
- Identify data quality challenges: Participants will learn to recognise common data quality issues in record linkage and understand how these challenges can influence statistical outcomes, including differential linkage bias.
- Understand privacy considerations: The workshop will equip attendees with knowledge of the privacy problems arising from record linkage, with a focus on the technical solutions that avoid them.
- Record linkage methods: Participants will develop elementary skills in deterministic and probabilistic record linkage, as well as privacy-preserving record linkage (PPRL) using Bloom filters.
- Conducting record linkage: Participants will be able to conduct simple linkages using these methods in R.
Application
The deadline to apply for a seat in this workshop is March 23, 2026 (CET).
As the number of participants is limited, and to ensure the best fit between the workshop’s content and its participants, we will ask you a few questions in the application form about your background and interest in the course. We will review all applications after the deadline and notify you of the outcome.
The workshop is free of charge. In return, we kindly ask all participants to contribute actively to our evaluation process, which helps us to continuously improve the course.
Any questions? Please contact us at berd-academy@stat.uni-muenchen.de.