'Big data' is a widely-used term without a commonly-accepted definition. The HMA-EMA Big Data Task Force defined big data as ‘extremely large datasets which may be complex, multi-dimensional, unstructured and heterogeneous, which are accumulating rapidly and which may be analysed computationally to reveal patterns, trends, and associations. In general, big data sets require advanced or specialised methods to provide an answer within reliable constraints’.

A single dataset may not strictly meet the definition of big data but, when pooled or linked with other datasets, it may become sufficiently large or complex to assume the characteristics of big data. Sources include real-world data (such as electronic health records, insurance claims data and data from patient registries), genomics, clinical trials, spontaneous adverse drug reaction reports, social media and wearable devices.

Medicines regulators will increasingly use insights derived from big data to assess the benefit-risk of medicines across their lifecycle.

HMA-EMA Big Data Steering Group

The joint HMA-EMA Big Data Steering Group advises the EMA Management Board and HMA on prioritisation and planning of actions to implement the ten recommendations in the Big Data Task Force final report.

The Steering Group began its work in May 2020. It is co-chaired by Jeppe Larsen, Director of the Medical Devices Division at the Danish Medicines Agency, and Peter Arlett, Head of Data Analytics and Methods at EMA.

The Steering Group reviews the workplan annually to cover any new emerging topics. It last updated the workplan in July 2023.

The workplan aims to increase the utility of big data in regulation, from data quality through study methods to assessment and decision-making. It is patient-focused and guided by advances in science and technology. Implementation of the workplan will be flexible and certain actions may be re-scheduled.

 

Big data workplan 2022-2025

 

For more information, see:

Big Data Steering Group meeting minutes

Big data training curriculum

The Big data training curriculum is a collection of training modules designed to help European medicines regulatory network staff develop expertise in integrating big data analysis into decision-making processes related to medicines regulation.

The first two modules, on real-world evidence and real-world data, are available for eligible users via the EU Network Training Centre platform. The Big data training curriculum also covers biostatistics, data science and other training topics which will be added over time.

EMA's Methodology Working Party oversees its development.

The Big data curriculum is part of the Big data workplan for 2023-2025 and the European medicines agencies network strategy to 2025. For more information, see: 

HMA-EMA catalogues of real-world data sources and studies

Two online catalogues are available from HMA and EMA, one for real-world data sources and one for real-world data studies:

Real-world data are observational data stored for instance in electronic health records and disease registries.

Making greater use of these data can improve the evidence base for benefit-risk decisions and help bring medicines to patients.

The catalogues serve to:

  • help regulators, researchers and pharmaceutical companies identify the most suitable data sources to address specific research questions;
  • support the assessment of study protocols and results;
  • promote transparency;
  • encourage the use of good practices;
  • build trust in research based on real-world data. 

They enhance and replace two databases previously maintained by EMA:

Catalogue | Discontinued database
Real-world data sources | European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) catalogue
Real-world data studies | European Union electronic register of post-authorisation studies (EU PAS Register)

The catalogues use an agreed list of metadata to describe and connect data sources to studies, using ‘FAIR’ (Findable, Accessible, Interoperable and Reusable) data principles. 
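As an illustration only, the idea of describing data sources and studies with shared metadata, and connecting one to the other, can be sketched in Python. The field names (`source_id`, `data_type`, `countries`, `source_ids`), the helper functions and the example records are all hypothetical. They are not taken from the agreed HMA-EMA metadata list.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataSource:
    # Hypothetical metadata fields, not the agreed HMA-EMA list.
    source_id: str
    name: str
    data_type: str
    countries: tuple[str, ...]


@dataclass
class Study:
    study_id: str
    title: str
    # Identifiers of the data sources the study uses.
    source_ids: list[str] = field(default_factory=list)


def studies_using(source_id: str, studies: list[Study]) -> list[str]:
    """Return the ids of all studies linked to a given data source."""
    return [s.study_id for s in studies if source_id in s.source_ids]


def find_sources(sources: list[DataSource], data_type: str, country: str) -> list[str]:
    """Return the ids of data sources matching a data type and a country."""
    return [
        s.source_id
        for s in sources
        if s.data_type == data_type and country in s.countries
    ]


# Invented example records.
sources = [
    DataSource("DS1", "Example registry", "disease registry", ("DK", "SE")),
    DataSource("DS2", "Example EHR database", "electronic health records", ("ES",)),
]
studies = [
    Study("ST1", "Example utilisation study", ["DS1", "DS2"]),
    Study("ST2", "Example safety study", ["DS2"]),
]
```

With records like these, `find_sources(sources, "disease registry", "DK")` answers the question of which sources might suit a research question, and `studies_using("DS2", studies)` shows which studies have already used a given source.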

The HMA-EMA Big Data Steering Group reviews the list of metadata annually, based on feedback from catalogue users. 

A draft good practice guide is also available to help users of the catalogues.

By the end of June 2024, EMA intends to publish a revised version addressing feedback from a public consultation. Separate sections will cover using the catalogues to identify and assess the suitability of data sources, and definitions and descriptions of the metadata list.

All European data holders, marketing authorisation holders, networks, researchers and institutions interested in having their data used for medicines regulation, or mandated by policy on non-interventional post-authorisation safety studies (PASS), are encouraged to use these catalogues.

Improving data discoverability is a priority of the HMA-EMA joint Big Data Task Force final report (phase two), reflected in the European medicines agencies network strategy to 2025 and implemented through the joint HMA-EMA Big Data Steering Group workplan. 

Data discoverability refers to the identification and use of real-world data sources and studies.

For more information: 

Data quality framework for EU medicine regulation

The final version of the data quality framework for EU medicines regulation is available below, following a public consultation.

This guideline sets out the criteria for a more consistent and standardised approach to the quality of data used in medicine regulation to support benefit-risk decisions.

It is meant to:

  • help identify, define and further develop data quality assessment procedures and recommendations for current and novel data types;
  • support pharmaceutical companies and other stakeholders in selecting data sources for their studies;
  • ensure the trust of patients and healthcare professionals in data-driven regulatory decision-making.

The data quality framework was co-produced by EMA, the Heads of Medicines Agencies (HMA) and the Joint Action Towards the European Health Data Space (TEHDAS).

Establishing this framework is a key element in the HMA/EMA joint Big Data Steering Group workplan. EMA will work with stakeholders to use the framework’s concepts and develop practical guidelines for assessing the quality level of data. These will initially focus on the domains of real-world data and adverse drug reactions.

Data standardisation strategy

The European medicines regulatory network's data standardisation strategy sets out principles to guide the definition, adoption and implementation of international data standards by the network.

It aims to:

  • enable quicker uptake of international data standards across the EU;
  • improve data quality;
  • enable data linkage and data analysis to support medicine regulation.

The strategy is a key deliverable of the Big Data Steering Group workplan.

EMA and HMA published the strategy in December 2021 and will maintain it over time to reflect any changing priorities or new requirements.

Darwin EU

EMA is establishing a coordination centre to provide timely and reliable evidence on the use, safety and effectiveness of medicines for human use, including vaccines, from real-world healthcare databases across the EU.

This capability is called the Data Analysis and Real World Interrogation Network (DARWIN EU®). For more information, see:

Pilot on using raw data in medicine evaluation

Through a proof-of-concept pilot, selected applicants can submit 'raw data' to EMA as part of their initial and post-authorisation marketing authorisation applications. 

Raw data refers to individual patient data from clinical trials. These include:

  • clinical laboratory results;
  • imaging data;
  • patient medical charts.

Currently, applicants are submitting data in an aggregated format as clinical summaries or as individual patient data in PDF listings. This can hinder data analysis and slow down the evaluation process.

In contrast, raw data are stored in electronic structured format. This enables regulators to more easily visualise and analyse the data if needed.
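To illustrate why structured data are easier to analyse than PDF listings, the sketch below reads a few individual patient records in CSV form and recomputes a summary per treatment arm. The column names (`patient_id`, `arm`, `alt_u_per_l`) and all values are invented for illustration. They do not reflect any format required by the pilot.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented example records; not a format prescribed by the pilot.
RAW_CSV = """patient_id,arm,alt_u_per_l
P001,treatment,32
P002,treatment,40
P003,placebo,30
P004,placebo,34
"""


def mean_by_arm(csv_text: str, value_column: str) -> dict[str, float]:
    """Compute the mean of a numeric column for each treatment arm."""
    values = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        values[row["arm"]].append(float(row[value_column]))
    return {arm: mean(vals) for arm, vals in values.items()}


summary = mean_by_arm(RAW_CSV, "alt_u_per_l")
```

Because each record is already machine-readable, the same data can be re-summarised or plotted in a different way without any manual transcription, which is not possible with aggregated summaries alone.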

The pilot aims to assess whether using raw data can help speed up and improve the medicine-evaluation process. The goal is to give patients faster and better-informed access to innovative medicines.

EMA launched the pilot in July 2022.

The pilot currently accepts expressions of interest for procedures with a submission date up to Q1 2024.

It will run for up to two years and include approximately ten regulatory procedures submitted to EMA from September 2022.

For any queries and to apply to take part in the pilot, write to rawdatapilot@ema.europa.eu

The pilot is a key activity under the priority recommendations of the HMA-EMA Big Data Task Force. It relates to the priority of building network capability to analyse data.

For more information, see: 

Support for pharmaceutical companies

Further information is available to support pharmaceutical companies with their participation in EMA's raw data pilot. 

The documents include:

  • a questions and answers document on the raw data pilot;
  • a participation letter to confirm pilot participation for a specific regulatory procedure;
  • a cover letter for pilot participants to attach to their data packages.

Data protection

For information on data protection in the raw data proof-of-concept pilot, see:

Big Data Highlights

Big Data Highlights provides a quarterly update on the implementation of the HMA-EMA Big Data Steering Group workplan.

EMA publishes Big Data Highlights every three months. 

Issues from June 2023 (Issue 6) onwards are available at the link below: 

Previous issues from 2022 are available on EMA's website in PDF format: 

Use the link below to receive Big Data Highlights by email: 

Work of the former HMA-EMA Big Data Task Force

The HMA-EMA Big Data Task Force operated from 2017 until December 2019 to report on the challenges and opportunities posed by big data in medicines regulation. It carried out its work in two phases. 

In phase one, the task force:

  • reviewed the landscape of big data from a regulatory perspective and identified opportunities for improvements in the operation of medicines regulation;
  • performed online surveys of national regulatory agencies and the pharmaceutical industry on perspectives, expertise and challenges. This helped develop an understanding of the challenges and the current state of expertise in the regulatory network.

In phase two, the task force made practical recommendations to inform strategic decision-making and planning by the HMA and EMA and to contribute to the European medicines regulatory network's work on developing a five-year EU Network Strategy to 2025.

The HMA/EMA Task Force on Big Data published its final report in January 2020. It contained practical recommendations on how the European medicines regulatory network could make best use of big data by evolving its approach to data use and evidence generation in support of innovation and public health.

It identified ten priority recommendations and practical steps to implement them:

    The task force published an interim report in February 2019. It provided a comprehensive summary of various data sources and set out recommendations for understanding the acceptability of evidence derived from big data in support of the evaluation and supervision of medicines by regulators:

     

    This took into account the analyses of the task force's subgroups:

    The task force was composed of experienced medicines regulators and data experts appointed by the national competent authorities, EMA and the European Commission (EC). For more information, see HMA-EMA Big Data Task Force

    Meetings and workshops

    For information on related meetings and workshops, see:

    Veterinary big data

    EMA and HMA established the veterinary big data initiative to explore the use of new digital technologies in key veterinary regulatory activities.

    It takes account of the increasing amount of data generated via new digital systems put in place to implement the Veterinary Medicinal Products Regulation.

    A European veterinary big data strategy sets out how the European medicines regulatory network intends to implement this initiative:

    For more information:

    Data protection

    EMA is preparing dedicated guidance on the impact of EU data protection legislation on the secondary use of health data in support of the development, evaluation and supervision of medicines.

    The aim is to help medicine developers, data providers and research bodies comply with EU data protection rules, and to help patients and consumers understand their rights and the existing safeguards to protect personal data.

    Secondary use of data refers to the use of data for a different purpose than the one for which it was originally collected. It typically involves the use of electronic health records, health insurance claims data, registry data or drug consumption data for medicines research and public health purposes.

    The guidance will cover various operational scenarios, including the development of medicines, the evaluation of marketing authorisation applications and post-authorisation safety monitoring.

    By July 2020 EMA had gathered input from patients and consumers as data contributors as well as from medicines developers, research-performing and research-supporting infrastructures and other data providers (e.g. payers of healthcare).

    In September 2020, stakeholders discussed with EMA the key questions concerning the application of the General Data Protection Regulation (GDPR) in the health sector and the secondary use of health data for medicines and public health purposes:

    EMA aims to finalise the guidance in consultation with the European Commission and the European Data Protection Supervisor (EDPS) in the last quarter of 2021. It will take into account stakeholder input and guidance from the EDPS on the processing of health data for research.

    Ensuring that personal data are managed and analysed within a secure and ethical governance framework in compliance with EU data protection legislation is one of the recommended priorities of the HMA/EMA Big Data Task Force.

    EU data protection legislation includes:

    • Regulation (EU) 2016/679, known as the General Data Protection Regulation (GDPR), which applies to private and public entities in the Member States;
    • Regulation (EU) 2018/1725, known as the EU Data Protection Regulation (EUDPR), which applies to all EU institutions and bodies.

    Artificial Intelligence (AI)

    An EMA draft reflection paper on the use of artificial intelligence (AI) in the medicinal product lifecycle is available to support the regulation of human and veterinary medicines:

    This is in a context where pharmaceutical companies increasingly use AI-powered tools in the research, development and monitoring of medicines.

    The reflection paper outlines the scientific principles relevant for regulatory evaluation of medicines developed or monitored with the help of AI.

    An AI workplan for the European medicines regulatory network is also available, to guide the use of AI in medicines regulation in Europe to 2028.

    It identifies actions for regulators in four key areas:

    • Guidance, policy and product support - delivering guidance on the use of AI throughout the medicine lifecycle
    • AI tools and technology - providing frameworks for the use of AI tools
    • Collaboration and training - developing capacity for the use of AI technology
    • Experimentation - ensuring a structured and coordinated approach

    The Big Data Steering Group developed the workplan, which HMA endorsed in November 2023 and EMA's Management Board endorsed in December 2023. The Steering Group maintains it over time.

    AI-enabled knowledge mining tool

    In March 2024, EMA introduced an AI-enabled knowledge mining tool called Scientific Explorer for EU regulators.

    The tool enables easy, focused and precise search of regulatory scientific information from the network's sources to support decision-making and simplify processes.

    The first release focuses on scientific advice procedures for human medicines.

    For more information, see the document below with answers to frequently asked questions about the tool:

    Use of real-world evidence

    A reflection paper on non-interventional studies that use real-world data in order to generate real-world evidence for regulatory purposes is available for public consultation.

    It is aimed at all stakeholders involved in the planning, conduct and analysis of this type of non-interventional study, including marketing authorisation holders and applicants.

    A non-interventional study is a clinical study that does not meet any of the conditions defining a clinical trial in Article 2.2(2) of Regulation (EU) No 536/2014 on clinical trials on medicinal products for human use.

    The public consultation runs from 3 May to 31 August 2024.

    You can contribute by completing the survey below:

    For more information on this draft reflection paper, see: 

    Guidance on real-world evidence

    Guidance is available on how EMA can help generate real-world evidence by analysing real-world data, with the aim of supporting regulatory decision-making.

    It is meant for EU regulators and decision-makers, including EMA's scientific committees, working parties and groups, as well as national competent authorities, healthcare technology assessment bodies and payers.

    It covers:

    • how these stakeholders can request real-world data studies from EMA;
    • what types of studies can be performed;
    • how EMA can help identify resources to address research questions.

    EMA cannot consider requests from other bodies and institutions, including academia, pharmaceutical companies and contract research organisations.

    For queries, you can use our webform:

    The guidance builds on EMA's experience in using real-world evidence to support regulatory decision-making, described in the report below.

    Findings include:

    • Real-world evidence can support decision-making in various regulatory contexts
    • Regulators need access to additional data sources such as secondary care databases, claims databases and registries from across Europe
    • Research needs should be identified as early as possible so that studies can be conducted in time for regulatory decisions
    • More information on data source characteristics is needed to facilitate the interpretation of real-world evidence derived from these sources
    • There is a need to build more capability and capacity for real-world evidence generation

    This report informs the development of the Data Analysis and Real World Interrogation Network (DARWIN EU).

    For more information, see the report and the list of research topics and use cases below:

    Related information materials

    International collaboration on real-world evidence

    Within the International Coalition of Medicines Regulatory Authorities (ICMRA), EMA works to help integrate real-world evidence into regulatory decision-making across the world. 

    ICMRA held a workshop in June 2022 for regulators to share experience in obtaining and using real-world evidence for the assessment of medicines, and issued a pledge in July 2022 to foster global efforts in this area. 

    For more information:

    Reflection paper

    A reflection paper was available for public consultation. It aims to harmonise real-world evidence terminology and enable the convergence of general principles for planning and reporting studies that use real-world data to support regulatory decision-making.

    The document builds on the 2022 statement from the International Coalition of Medicines Regulatory Authorities (ICMRA), offering a strategic approach for future ICH guidelines on the assessment of real-world data and real-world evidence.  

    EMA encouraged medicine regulators, academic researchers, the pharmaceutical industry and other stakeholders to comment. 

    The public consultation was open until 30 September 2023.

    The reflection paper was released by the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) and co-sponsored by EMA, the United States Food and Drug Administration (FDA) and Health Canada.

    More information:
