InnoHealth DataLake 2021 Symposium - Health Data Utilization in the Light of COVID

In the InnoHealth DataLake project, different solutions were developed based on a wide variety of health data sources. The project organizations presented the results and vision of the InnoHealth DataLake project to attendees on the 25th of May. The videos below provide a brief overview of the event.

InnoHealth DataLake project summary

The project was launched in 2018 with the cooperation of E-Group and the University of Pécs. Considerable progress has been achieved as a result of the hard work of internationally acclaimed doctors, biologists, physicists, pharmacologists, and health experts, as well as IT professionals, mathematicians, engineers, and data scientists.

The aim of the InnoHealth DataLake project is to develop a state-of-the-art R&D&I ecosystem based on unified processes supporting health care, prevention, data collection, data representation, data storage, data utilization, data management, data processing, and data modeling. To define an industry standard, the consortium members clarified the legal, technological and configurational frameworks. In the InnoHealth DataLake project, an innovative IT ecosystem, a so-called data lake, was created to facilitate the utilization of health data for improving the quality of health care services (e.g., analyzing patient pathways, cost optimization). Deeper analysis and better use of the data’s potential open up new areas for medical care and prevention. They highlight new correlations in diagnostics and therapy from a health economics perspective, enabling new opportunities in health, pharmaceutical research and innovation.

The InnoHealth DataLake project is a unique initiative in Hungary, which can serve as a model and prototype for national or international healthcare systems in the medium term.

The resulting complex technological product has a wide range of applications for further healthcare innovation and for data-driven decision support in the government, financial and energy sectors. The solution is able to collect data cost-effectively and reliably from siloed, independent data sources. It extracts and loads different data formats in a standardized way, enabling the use of advanced artificial intelligence models. Depending on the analytical objectives, the system dynamically links the information, thus revealing new correlations (e.g., from health data, we can deduce the probability of occurrence of hidden diseases and their course).

The data lake considerably simplifies the work of data scientists and opens up new perspectives for decision makers to establish a data-intensive business operation. This supports world-class research, provides universities and hospitals with the opportunity to participate in international pharmaceutical research, and plays a key role in the international ranking of universities.

Symposium summary

To implement a project as ambitious and complex as InnoHealth DataLake, we divided the work into specific streams and created sub-projects. The sub-projects not only demonstrate the diversity and significance of data technology, human technology, medical modeling and legal aspects, as well as the legitimacy of InnoHealth DataLake, but also improve the quality of health care and achieve research, diagnostic or therapeutic results.

The Symposium summary has two parts. Part one summarizes the technological and legal presentations, and part two summarizes the medical sub-projects.
Title
Opening, Summary of the IHDL’s objectives
Presenter
Prof. Dr. Attila János Miseta, Rector of the University of Pécs
Summary
Immense amounts of data have accumulated in healthcare. Utilizing these data with modern methods contributes greatly to making the diagnosis and treatment of patients more effective. There are two main directions. One is sorting and cleaning the data, resulting in reliable, high-quality data that can support international research and development. The other is to perform experiments that apply modern digital techniques to continuously monitor the condition of patients, using body sensors and appropriate communication tools. The data obtained in this way are combined with the patient’s historical data. This provides extremely important and up-to-date information to the physician and, indirectly, to the patient. The COVID-19 pandemic has drawn attention to the importance of creating uniform databases that can be utilized in the areas of healthcare, and of making these databases accessible to the right people and the right organizations.
Title
Model change
Presenter
Prof. Dr. József Bódis was born in Csurgó in 1953, graduated from the University of Pécs Medical School in 1977 as a general practitioner and qualified in 1981 as an obstetrician-gynecologist. His research interests include gynaecological endoscopy, reproductive endocrinology and urogynecology. He is head of the research group “Human Reproduction”, supported by the Hungarian Academy of Sciences. From 2010 to 2018, he was Rector of the University of Pécs, and then Secretary of State for the Ministry of Human Capacities. Since 1 September 2019, he has been State Secretary for Higher Education, Innovation and Vocational Training at the Ministry of Education and Science.
Summary
Until the end of the last century, the practice of mass education in higher education determined the way it operated and was financed. Today’s globalized world is having an increasing influence on the development of the higher education system. The challenge for institutions is to balance tradition and progress. It is in the symbiosis of tradition and innovation that the promising future of the university system must be found, serving both local and national interests and adapting to global changes. All this shows that the modernisation of Hungarian higher education is a challenge in which increasing competitiveness is a top priority. This is why the document “A change of gear in higher education, a medium-term policy strategy” has been produced. The aim outlined in this document is to develop and effectively operate a higher education system capable of responding to challenges in the international education and research arena. This new education system should guarantee Hungary’s social and economic competitiveness. Higher education is challenged not only by continuous social and demographic changes, but also by the rapid development of technology. This has led to the process of changing the model of the higher education system. The aim is to create a higher education structure that is more open to the needs of the economy and technology, and that works more closely and effectively with businesses to deliver quality and performance. The model change supports the implementation of a more performance-oriented research, development and innovation strategy in the institutions. This strategy will focus on practical research, a market approach and the protection of patents and intellectual property. The ultimate aim is to create a stable and innovative institutional environment for young Hungarian students.
Title
Data-driven innovation
Presenter
Tibor Gulyás, Deputy State Secretary, Ministry of Innovation and Technology
Summary
Data is the cornerstone of innovation. Today, data and data-driven innovation have taken on a new meaning. This has been driven by an increase in the quality and quantity of data and changes in the methodology of data processing. Over the last ten years, the development of industries where data is particularly important has followed an exponential growth curve. This development is based on technological advances which have opened up a new dimension for innovation. Many industries have started to develop thanks to the modern use of data. The health industry stands out among these, as the data services that inform our lives and health are of central importance to us. By creating its own developments and background, the government is contributing to the development of the domestic health industry. The development of the Electronic Health Service Space (EESZT) and the idea of the InnoHealth DataLake project illustrate this data-driven healthtech progress. The launch of the Data Driven National Laboratory is imminent. The importance of the innovation represented by the IHDL project is illustrated by the fact that the competences that have been essential for its successful implementation (IT and technological background, medical and health sciences experience, and legal knowledge) are also essential for the Data Driven National Laboratory.
Title
Data and health: worldwide overview and the domestic Covid situation
Presenter
Dr. Miklós Szócska, Dean, Semmelweis University, Faculty of Public Health
Summary
The use of Hungary’s rich health data is an important step towards a data-driven Hungarian healthcare system. The National Laboratory for Data-Driven Health aims to provide a toolkit for physicians and researchers. This toolkit ranges from diagnostic decision support (imaging, laboratory, pathology) to the development of prognostic and predictive artificial intelligence models, and even the tracking of digital twin patient pathways. It can support research, innovation, methodological development and international reference applications. Database building, data fusion and data warehouse construction are also important objectives. It is essential to highlight how data scientists can contribute to the successful work of clinicians. Data scientists need to be made aware of the criteria for using databases for healing, saving lives and generating new knowledge. On the regulatory side, we need to keep abreast of the legal changes (telereferral and teleconsultation) and prepare for further data protection challenges, enabling population-based health management. In conclusion, we are not simply talking about digitalization, but about a data-driven paradigm shift. An important part of this is the redefinition of the use of Hungarian health data. This global shift is beneficial for the data-rich Hungarian healthcare sector, which has all the capabilities to become a leading player in data-driven international healthcare and healthtech.
Title
The goals of the InnoHealth DataLake project and its digital/technological backbone and future vision
Presenter
An electrical engineer and computer scientist, he founded E-Group in 1993 as a fresh engineering graduate with his university friend András Nagy. In 1997, under his leadership, E-Group was the first of the Hungarian SMB IT companies to found an independent research laboratory. In 2011 he built the E-Group CUP Gateway to service China UnionPay. His main areas of interest are Cognitive Computing/AI technologies, related innovation opportunities and the creation of Hungarian potential.
Summary
The aim of the IHDL project is to make all the high-resolution and diverse data of PTE available in a structured format in a single data lake. The data lake approach means that data is pumped from the operational systems into the data lake. It is important to see the five main dimensions of so-called Big Data: volume, variety, velocity/temporality, quality and confidentiality. In the case of health data, quality and confidentiality are particularly relevant, as personal data or even data critical to national security is being handled. The IHDL project is implemented in a modular framework. The medical module, the legal module and the IT module cover the whole project. The creation of data and analytical systems is seen as the key to innovation in a research university. This is the task of the IT module. Not only an IT infrastructure but also a data infrastructure is needed. The design process of the data platform considers the strategic data protection aspects. Sensitive data can be handled in-house, while other data can be stored off-site for capacity efficiency reasons. The digital architecture of the IHDL is implemented using Microsoft SQL Server 2019 Big Data Cluster. This technology is composed of layers, each layer providing independent functionality and security. The layers rely on each other and form the whole of the dataset. The system created by E-Group, the Smart Data Platform, guarantees efficient work with large volumes of varied data. The hyper-converged infrastructure allows data to be used without performance bottlenecks when using state-of-the-art data science tools. This is the technical basis the data lake needs in order to properly support research, development and innovation. The data lake will be able to increase efficiency through data science, generate new quality knowledge and multiply research potential for a research university.
However, it is not enough to create a data warehouse and an analytics platform to achieve success. Human capital is also essential. The work of clinicians and researchers can be made more efficient by data engineers and data scientists. The capabilities of the data lake are not exclusively for research universities, but also offer a major development opportunity for a government or a large corporate entity.
Title
Data-conscious Research University: digital challenges for data
Presenter
Dr. Péter Kristóf, Director of Informatics, PTE, Directorate of Informatics and Innovation
Summary
The amount of data produced by humanity is increasing dramatically. According to Gordon Moore’s law, the power of computers doubles every year and a half. However, this trend seems to have the opposite effect in other areas. The time to produce a new drug molecule has also increased dramatically in recent years. It used to be possible to produce thirty new drugs for $1 billion, whereas nowadays the same sum yields only one. Fortunately, this trend seems to be reversing. Thanks to modern data science, informatics can help health research, development and innovation. Realising the concept of a data-driven university is a major challenge for any modern research university. A practical prerequisite is the development of appropriate technology and data asset management. Data management should be a core service for university researchers and faculty, framed around the capture, exploitation and use of data. IT support (development and operation) plays a key role in this. This is how to get from data to information to knowledge. This is how we can turn data into scientific, economic and social value.
Title
Data law within the data lake
Presenters
Dr. Gergely László Szőke, PhD, Head of the Department of Administrative Law at the Faculty of Law and Political Sciences of the University of Pécs, Adjunct Professor of the Information and Communication Law Group within the Faculty. His research interests include in particular data protection and freedom of information, the reuse of public data, and the legal issues raised by Big Data, algorithm-based decision-making and AI. He has participated in several research projects in these areas and he is the author of several publications in Hungarian and English.
Dr. Katalin Kuthy, attorney-at-law, participated in the IHDL project as a legal advisor to E-Group. In 1999 as an associate of Squire Patton & Boggs Law Firm she started to work in the field of IT law and its subfields. In 2010, she founded Dr. Kuthy Law Firm, which has gained significant theoretical knowledge and practical experience in the field of IT law (IT contract law, software law), copyright law and data protection law, also in an international context.
Summary
InnoHealth DataLake’s legal group has developed a Data Protection Impact Assessment related to the operation of the project, which assesses the data protection risks and obligations and makes proposals for their resolution. The Legal Working Group continuously monitors changes in EU legislation and Member States’ regulations, and analyses and evaluates best practices. The legal group has identified a number of possible legislative changes that could be needed in the service of health research. The presentation covered the most relevant legal issues raised by the Data Protection Impact Assessment, best practices in the Member States and new EU and Hungarian legislation expected in the near future, including the Data Governance Regulation, the Digital Markets Regulation, the Digital Services Regulation, the Copyright Act amendment, among others.
Title
Data science and analytics to support clinical research, Data analytics toolkit of the IHDL platform
Presenter
Ákos Tényi, PhD, Head of E-Group’s Data Science team and SmartData business unit. He studied at the Budapest University of Technology and Economics (BME), where he earned BSc and MSc degrees in Computer Engineering and Health Engineering. He holds a Master’s degree in BioHealth Computing (Université Joseph Fourier, Universitat de Barcelona) and a PhD degree in Medical and Translational Research from the Universitat de Barcelona, Faculty of Medicine.
Summary
Today, health data has become a key player in almost every aspect of healthcare, from prevention to patient care and services to innovation. Given the extremely long and expensive clinical cycles in the biomedical sector, i.e. the need to translate research findings into practice, extracting health data assets and putting them into a technological environment is a fast track approach to deliver results, where scientific evidence supports clinical care, while data collected from daily clinical practice facilitates new scientific discoveries and optimises healthcare delivery. In order to facilitate the exploitation of data assets and to help achieve high quality research results, a core requirement of the IHDL platform is the creation of an analytical infrastructure that supports the identification and prediction of intrinsic correlations, patterns, symptom-specific traits by analysing health data. The presentation will briefly review the four main pillars of the project’s data analytics deliverables: i) data harmonisation, standardisation and integration, ii) AI-based data curve expansion from existing data sources, iii) inclusion of unstructured data curves and iv) molecular and pharmaceutical data analysis.
Title
Drug repositioning with real-world-based evaluation
Subtitle
Possibilities and Perspectives of Data-Based Drug Repositioning in the Light of COVID-19 and Oncology Drug Therapy
Subproject leader
Péter Mátyus PhD, Doctor of the Hungarian Academy of Sciences. M.Sc. in Chemical Engineering, Pharmaceutical Chemist (BME). Fellow of the Institute for Pharmaceutical Research (1975-1997); Director of the Department of Organic Chemistry at Semmelweis University for 19 years, Professor at the Faculty of Public Health; E-Group Project Manager. Visiting professor at foreign universities. Honorary Doctor (University of Cagliari); Honorary Professor (University of Pécs). Mr. Mátyus specializes in organic and pharmaceutical chemistry; drug innovation. His professional work has been recognized with several professional awards, including the Gábor Dénes Award.
Summary
Drug repositioning is the use of existing drugs for new therapeutic purposes. The cost and risk are therefore significantly lower than with the traditional strategy. For example, it has become an attractive approach for orphan drug innovation, where market potential is relatively modest. In our repositioning project, we follow two pathways or their combination. One is to utilize a wide variety of relevant biomedical and chemical data through IT analytical tools, while the other takes an intuitive approach. Our goal is to combine the benefits and values of the two routes for fast and cost-effective drug repositioning. Our achievements so far are as follows: a phase II human clinical trial with a new drug candidate in the field of oncology has been launched, and the elaboration of an IT platform supporting data-based drug repositioning is well in progress.
Title
RWD Module
Subtitle
How can raw clinical data be refined for research analysis? Experiences based on the study of cancer
Subproject leader
Antal Zemplényi (MSc, PhD) is an economist and a certified health policy expert. In 2016, he obtained a PhD in health economics. Between 2007 and 2018, he managed the Clinical Center of the University of Pécs. Since 2018, he has been leading the Health Technology Assessment Center of the University of Pécs. His major fields of work are health economics analysis and the use of health data for research purposes.
Summary
One of the most detailed data sources in healthcare is the data routinely collected during patient care. Processing it, however, poses many challenges due to unstructured textual documentation, data bias, and incompleteness. In the RWD (Real-World Data) subproject, we have developed specific methods to collect and analyze data relevant to clinical research from medical records. In the reference analysis of prostate cancer, information was generated about patients’ clinical characteristics and disease outcomes that could not be obtained from an insurance database. This can help implement real-world clinical trials, make better and more cost-effective therapeutic decisions, and ultimately result in more effective patient recovery.
Title
Assessing the effectiveness of modern radiotherapy and oncology care
Subtitle
Lessons from oncology data collection and our future plans
Subproject leader
Prof. Dr. László Csaba Mangel holds a general medical degree from Semmelweis University, specialist qualifications in psychiatry, radiotherapy and clinical oncology, and health manager and palliative medicine licence qualifications. He is Director of the Department of Oncotherapy at the University of Pécs. His main areas of interest are modern radiotherapy, combined treatment modalities and oncology quality assurance. He is a holder of the Knight’s Cross of the Hungarian Order of Merit and the Krompecher Award. Married, father of 3 children.
Summary
Due to the rapid development of oncology diagnostics and therapy, as well as the diversity of cancer, the collection of real-world health data is becoming increasingly important in oncology, as we gain valuable information about the behavior of individual oncology diseases and the effectiveness of anti-cancer treatments. In addition, the efficiency of the health care system can be monitored by reviewing health data. In order to improve the care of cancer patients, we consider it particularly important that oncology data collection become part of the research of health data assets at our disposal, while complying with data protection laws and regulations.
Title
Development of a seizure prediction system by monitoring sensor data
Subtitle
Development of a seizure prediction system by monitoring sensor data
Subproject leader
Tamás Péter Dóczi, full member of the Hungarian Academy of Sciences. Surgeon and neurosurgeon specialist. Privatdozent / Oberarzt, Universitätsspital, Zurich, 1990-1992. Director of the Department of Neurosurgery, University of Pécs, 1992-2014. Managing Director, Pécs Diagnostic Center / NeuroCT Kft. (since 2015). MTA/ELKH-PTE Clinical Neuroscience MR Research Group, 2012-2019. Leader of the clinical pillar of the National Brain Research Program 1.0 and 2.0 and of other applications. Areas of expertise: clinical neuroscience, neurosurgery.
Summary
The hypothesis of the project is that long-term multi-modal monitoring of the autonomic nervous system may provide novel data in patients with epilepsy or migraine that can serve as useful biomarker(s) for seizure/migraine-headache prediction. Based on our previous research, the most obvious way to test this hypothesis is to collect measurement data with a mobile device and perform heart rate variability (HRV) analysis on ECG recordings. Enrollment of study subjects is ongoing: 34 migraine subjects have been involved so far, and we are working with the recordings of 26 monitored patients with epilepsy. Specific software solutions are being developed to process large amounts of sensor data and to interview subjects more efficiently.
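As an illustration of the kind of heart rate variability analysis mentioned above, the following is a minimal sketch of standard time-domain HRV metrics (SDNN, RMSSD, pNN50). It is not the subproject's software: the function name `hrv_time_domain` is hypothetical, and the sketch assumes that R-peaks have already been detected in the ECG and converted to a list of RR intervals in milliseconds.

```python
import math


def hrv_time_domain(rr_ms):
    """Basic time-domain HRV metrics from RR intervals given in milliseconds.

    Assumption: rr_ms is a clean sequence of successive RR intervals
    (artifact removal and R-peak detection happen upstream).
    """
    if len(rr_ms) < 3:
        raise ValueError("need at least three RR intervals")

    n = len(rr_ms)
    mean_rr = sum(rr_ms) / n

    # SDNN: sample standard deviation of all RR intervals
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_ms) / (n - 1))

    # Differences between successive RR intervals
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]

    # RMSSD: root mean square of successive differences
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # pNN50: percentage of successive differences larger than 50 ms
    pnn50 = 100.0 * sum(1 for d in diffs if abs(d) > 50) / len(diffs)

    return {
        "mean_hr_bpm": 60000.0 / mean_rr,
        "sdnn_ms": sdnn,
        "rmssd_ms": rmssd,
        "pnn50_pct": pnn50,
    }


if __name__ == "__main__":
    # Illustrative values only, not patient data
    print(hrv_time_domain([800, 810, 790, 800]))
```

In a monitoring setting, such metrics would typically be computed over sliding windows of the recording so that changes over time can be compared with seizure or headache onset; the window length is a design choice this sketch leaves open.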
Title
Digital care support system for the prevention and treatment of brain injuries
Subtitle
Digital care support system for the prevention and treatment of brain injuries (Department of Neurosurgery)
Subproject leader
Prof. Dr. András Büki, neurosurgeon, clinical oncologist, Doctor of the Hungarian Academy of Sciences, Director of the Department of Neurosurgery, University of Pécs. He is the principal investigator of several international and national applications. He is the author of more than 200 scientific papers with over 4,000 independent citations. He performs hundreds of neurosurgical operations each year, specializing in cranial endoscopic surgery, neurooncology, and neurotraumatology. A former President of the World Neurotrauma Society, he has won the Aesculap Grand Prix and the NIH Fogarty Scholarship.
Summary
Skull/brain injuries are the leading cause of death in the first four decades of human life. The biggest challenge regarding brain injuries is the early detection and effective treatment of secondary brain damage, which is a major determinant of the outcome. The most important tool for this is modern neuromonitoring: the continuous monitoring of brain pressure, circulation, circulatory regulation, oxygenation, temperature, electrical activity, and metabolic processes. Building on previous research programs at the Department of Neurosurgery of the University of Pécs, a uniquely coordinated system for multiparametric patient monitoring was developed. In addition, the foundations of a state-of-the-art online decision support system were established by synchronizing data from a world-class neuromonitoring system, channeling these data into a data lake and synchronizing them with patient care data. We are convinced that as a result of the IHDL project, we can solve one of the most important health problems of the 20th century: the utilization of the unmanageable amount of data from modern monitors.
Title
Pre-hospital assessment of the severity of strokes through video application
Subtitle
Significance of large vessel occlusion and possibilities of its early detection in pre-hospital care of acute ischemic stroke
Subproject leader
He graduated summa cum laude from the Medical University of Pécs in 1993 and obtained his PhD degree in 2004. In 2004 he was assigned to manage the stroke profile of the Department of Neurology. In 2011, he was habilitated. In 2012, he was appointed Associate Professor at the university. He is a member of the board of the Hungarian Society of Stroke and Neurology. He is currently the Head of the Stroke Department at the Janus Pannonius Clinical Block. Since 2016, he has been the President of the Hungarian Stroke Society. He is a member of the Clinical Neurosciences Committee of the Hungarian Academy of Sciences.
Summary
Approximately 30,000 acute cerebral vascular occlusions occur in Hungary every year, the successful treatment of which depends on vascular opening treatment. Catheter-based clot removal, one form of this treatment, is available only in a limited number of centres in Hungary. In the pre-hospital phase of a stroke, it is crucial not only to recognize the stroke but also to assess its severity correctly, as this can determine the type of stroke center ideal for the patient. The essence of the stroke subproject is to develop a video application that can be used during on-site patient care. The recorded video is sent to a stroke call center, where experts can assess the severity of the stroke based on the recording and screen for cases of suspected cerebral vascular occlusion, which are therefore deemed optimal for catheterization. This facilitates pre-hospital decision-making and helps determine optimal patient care.
Title
Clinical and Policy Decision Support
Subtitle
Data-based decision support
Subproject leader
Prof. Dr. Imre Boncz, Deputy Dean for Relations, Director of the Institute, University of Pécs, Institute of Health Insurance
Summary
The aim of the clinical and professional policy decision support subproject was to utilize the data assets accumulated in the Clinical Center of the University of Pécs. It aims to provide direct, real-time support for institutional management and for local and sectoral policy decision-making. With the help of the system, problems such as optimizing the utilization of hospital beds can be solved.