Extreme-scale Data and Computing

…tackles the new fundamental challenges for computer science and engineering research, posed by the unique combination of Modelling and Simulation with Health Data Science and Health Informatics. The computational and data processing needs of Sano research will push the boundaries of current state-of-the-art infrastructures for AI, HPC, big data and cloud computing. This includes alignment of traditional HPC systems with big data analytics and ML/AI workloads, using CPU, GPU, many-core, hybrid, virtualized and containerized environments with the computational needs of systems required to deliver patient-specific care at timescales appropriate for clinical use. Moreover, health data science will benefit from novel approaches in distributed computing and security research, such as Federated Learning, Blockchain, Differential Privacy or Encrypted Computation, which can be applied to medical data in a secure and privacy-conscious manner.

New developments in computer hardware, programming models, cloud computing and emerging services will influence the development, deployment and execution of computational and AI models at extreme scale, requiring constant evaluation of new technologies and platforms, experimentation with novel approaches, and prototyping of new solutions. Development of systems operating in clinical, research, HPC and Big Data/AI environments will require novel, transparent techniques, and delivering state-of-the-art health data science and in-silico techniques will require exascale computing resources. Sano will become a driver for developments within EU-level HPC and cloud initiatives, including PRACE and EuroHPC.

Maciej Malawski

PhD, Team Leader, Extreme Computing

Over 20 years of experience in research in parallel and distributed computing, high performance computing (HPC), grid and cloud technologies, serverless and container-based infrastructures. Interested in innovative applications of these technologies to scientific applications, with a special focus on biomedical research. 

Big data analytics, with experience in large-scale processing of scientific data in cloud infrastructures; contributed to the use of Apache Spark and serverless data processing in high energy physics in collaboration with CERN.

Scientific workflows with focus on usage of novel and emerging large-scale computing infrastructures, performance evaluation, resource management, scheduling and cost optimization. 

Co-author of over 50 international publications including journal and conference papers, and book chapters. Member of technical program committees of premier conferences on scientific, parallel and distributed computing (SC, ICCS, IPDPS, CCGrid, UCC). Leadership positions in major conferences in the field: general co-chair of Euro-Par 2020 and member of Steering Committee, Area Co-chair IEEE Cluster 2021, BoF Vice-Chair at SC18. Member of editorial board of Future Generation Computer Systems Journal. 

Prizes and distinctions:

  • 2020: The paper "Performance evaluation of heterogeneous cloud functions", published in Concurrency and Computation: Practice and Experience, was among the most downloaded papers in 2018-2019
  • 2018 AGH Rector’s award for organizational work 
  • 2018 Publons Peer Review Award, for placing in the top 1% of reviewers in Computer Science 
  • 2011 Executable Paper Grand Challenge – 1st prize

Professional experience:

  • 2019 – ongoing Associate Professor, Institute of Computer Science, AGH University of Science and Technology, Kraków, Poland
    Senior Researcher, Sano Centre for Computational Medicine, Kraków, Poland
  • 2001 – ongoing Researcher, employed in research projects at the Academic Computer Centre CYFRONET AGH
  • 2009 – 2019 Assistant Professor, Department of Computer Science AGH
  • 2015 Adjunct Research Assistant Professor, University of Notre Dame, Center for Research Computing, Notre Dame, USA
  • 2013 Adjunct Research Assistant Professor, University of Notre Dame, Department of Computer Science and Engineering, Notre Dame, USA
  • 2011 – 2012 Postdoctoral Research Associate, University of Notre Dame, Center for Research Computing, Notre Dame, USA
  • 2001 – 2009 Teaching and Research Assistant, Department of Computer Science AGH

Education:

  • 2009 Ph.D., Computer Science, AGH University of Science and Technology, Kraków, Poland
  • 2004 MSc, Physics, Jagiellonian University, Kraków, Poland 
  • 2001 MSc, Computer Science, AGH University of Science and Technology, Kraków, Poland 

Sano Centre for Computational Medicine 

Czarnowiejska 36 building C5, 30-054, Cracow, Poland 

Team Members

Jan Meizner

PhD Student, Senior Scientific Programmer

Graduated in 2009, majoring in federated IT security systems. Since then he has been working at ACC Cyfronet AGH on many EU and national projects covering a wide range of subjects, including computational medicine. His work focuses on IT security, cloud and HPC infrastructures, and building software for such infrastructures. At Sano he was involved, as Technical Manager, in the operation of IT systems and in a range of IT security tasks, including identity management and data security. Later he shifted his focus back to science by becoming a PhD Student in the Extreme-scale Data and Computing Team and a Senior Scientific Programmer in the ISW Project.

Michał Daniłowski

Master Student in Extreme-scale Data and Computing

Michał is pursuing an MSc degree in Data Science at AGH UST. He is interested in machine learning and cloud technologies and is an open-source enthusiast. He is currently researching applications of on-device federated learning in medicine as his master's thesis project. In his free time, he enjoys traveling and listening to classical music.

Karol Zając

Junior Scientific Programmer

Karol is pursuing engineering studies in Computer Science at the Faculty of Computer Science, Electronics and Telecommunications at AGH University of Science and Technology in Cracow. He is currently participating in the collaborative European project In Silico World (ISW) and is especially interested in modelling, simulation and artificial intelligence. He usually spends his leisure time swimming, travelling with friends and doing DIY to develop his mental and manual skills.

Bartosz Balis

Senior Postdoc

Bartosz Balis is an associate professor at the Institute of Computer Science of the AGH University of Science and Technology, and a Senior Postdoctoral Researcher at the Sano Centre for Computational Medicine. He is also a member of the CERN ALICE experiment. A graduate of AGH and Jagiellonian University, he obtained his PhD and DSc (habilitation) in Computer Science from the AGH University. He is a co-author of over 60 international peer-reviewed scientific publications, including papers in high-ranked journals. His research interests include scientific workflows, data science, e-Science, cloud computing, and distributed computing. Dr Balis has been a member of conference program and organizing committees, including Euro-Par 2020 workshops (General Co-chair), HPCS 2018-19 (Tutorials Co-Chair), IEEE/ACM SC18 Birds of a Feather Planning Committee and IEEE/ACM SC16 Workshops Planning Committee. He has participated in national and EU-FP5/FP6/FP7/H2020 research projects CrossGrid, CoreGRID, K-Wf Grid, ViroLab, Gredia, UrbanFlood, PaaSage and WATERLINE.

Krzysztof Gądek

Junior Scientific Programmer

Currently studying computer science at AGH UST (WIEiT). Programming, which he has done since he was 12 years old, is one of his biggest passions. At Sano he started as an intern on the HPHOB project and now works on MEE development and, occasionally, other programming tasks. His other passions are science (physics, mathematics, especially differential equations, and astronomy), history and sports such as the gym, swimming and skiing. A great fan of Balkan culture, he has visited most countries in the region, except for Albania. A lover of foreign languages, he is currently studying English and Russian and plans to learn German, Italian and Serbo-Croatian in the future.

Piotr Kica

Junior Scientific Programmer

Currently pursuing a BSc in Computer Science at AGH UST in Cracow. Particularly interested in big data problems and cloud computing solutions. His Sano career started with an internship on data analysis projects, and he now works as part of the In Silico World project. A follower of an active lifestyle, both physical (working out, playing team sports) and mental (playing chess). In his free time he likes to study Norwegian, listen to audiobooks and go to orchestral concerts.

Magdalena Otta

PhD Student

Her scientific journey started in Scotland, at the University of Edinburgh, from which she graduated in 2021 with a Master of Physics degree. The subject of her thesis was modelling the response of a single cell to fractionated radiotherapy in order to induce the abscopal effect in cancer treatment. She joined the Sano Modelling and Simulation Team as a PhD student and is currently working on modelling venous flow in patients presenting with deep vein thrombosis of the lower limb.

Paulina Adamczyk

Master Student in Extreme-scale Data and Computing

Paulina is a fourth-year student of computer science at the AGH University of Science and Technology in Krakow, where she belongs to the BEST AGH Krakow student organization. In her free time, she travels a lot, discovering new places and cultures, reads and plays squash. From time to time, she goes surfing which is her greatest hobby.

Jan Przybyszewski

Junior Scientific Programmer

Jan has always been interested in healthcare, having spent a short time at Jagiellonian University Medical College. In the end, he decided to pursue a career in engineering and obtained his MSc degree in Computer Science at AGH UST in Cracow. His thesis focused on using graph neural networks in miRNA-mRNA target prediction. Apart from academic experience, he also developed his software engineering skills by working on commercial projects at companies such as Nomagic and Mercedes-Benz AG. At Sano, he joins the Extreme-scale Data and Computing team to conduct research on the use of Federated Learning in healthcare. In his free time, Jan enjoys reading, playing video games, and doing sports.

Dominik Strama

Master Student

Dominik is pursuing an MSc degree in Computer Science at the Warsaw University of Technology. He graduated in 2021 from Applied Computer Science at AGH with a BSc thesis on real-time object detection using a phone camera. He is interested in writing software for the Apple ecosystem, machine learning and deep learning, and is currently doing research on smart watches as his master's thesis project. In his day job he works as a software engineer developing mobile applications for iOS. He enjoys spending his free time actively, most often running, hiking and working out.

Dominika Ciupek

PhD Student

Dominika earned a Master's degree in Biomedical Engineering at the AGH University of Science and Technology in Kraków. She is currently pursuing her PhD at Sano as part of the Extreme-scale Data and Computing team on the topic of Application of Federated Learning to Medical Data at Large Scale. Dominika specializes in medical imaging of the brain, especially in diffusion-weighted magnetic resonance imaging. So far, her main fields of research have been biophysical multi-compartment models and tractographic algorithms. Privately, she is a fan of psychological thrillers and documentaries about serial killers. She also spends her free time sketching and digital drawing.

Sylwia Marek

Master Student in Extreme-scale Data and Computing

Bachelor's student of Computer Science at AGH UST. Together with three other students and under the supervision of Sano, she is currently completing an engineering thesis on the development of an application to support patient rehabilitation. She took an active part in a small case study and experiment on the topic "Can we reduce the burden of self-reporting through gamification of health surveys?" After hours, she explores the Tatra Mountains and travels; she is interested in mathematics and sports of all kinds, and is a former volleyball player.

Current Projects

The In Silico World project aims at accelerating the uptake of modelling and simulation technologies used for the development and regulatory assessment of medicines and medical devices, by lowering seven identified barriers: development, validation, accreditation, optimisation, exploitation, information, and training.

Computer models informed by experimental data enable us to test hypotheses and make predictions, significantly streamlining the research and development cycle relative to trial and error. When it comes to medicine, experimentation relies on biological samples ranging from cultured cells to whole animals, so increased reliance on modelling has additional benefits. Harnessing Big Data and tremendous advances in computing power could pave the way to minimising and eventually eliminating the need for anything other than in silico ‘experimentation’ in medical research and development.

The consortium will use an advanced simulation environment developed by Sano to address the needs for scalability and efficiency of the solutions developed in the project. This environment provides access to computation and storage resources in local and major European e-infrastructures, as well as in commercial cloud services. Moreover, Sano works on the performance, scalability and cost efficiency of advanced simulation models running at extreme scale.

More about the project: https://insilico.world/

  • MSc Project 
  • Although federated learning is a promising technique for analysis of medical images, as it may solve some security and privacy issues related to distributed data access, there is still a need to evaluate this technique in large-scale experiments in a distributed environment such as cloud infrastructure. 

The goal of the thesis will be to run large-scale experiments with federated learning on medical image classification tasks. We plan to use public datasets such as chest X-ray images, coming from multiple sources (countries, hospitals), and existing distributed machine learning frameworks. A public cloud and the PL-Grid infrastructure will be used as the computing infrastructure. Various metrics related to the distribution of data, its granularity and partitioning will be investigated, to understand their impact on both the efficiency of the learning process and the performance of the infrastructure. It will also be possible to extend the study to assess the impact of possible attacks and their mitigation strategies.
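As an illustration of what "distribution of data and partitioning" can mean in such experiments, the following is a minimal sketch in plain Python, not tied to any particular framework or dataset. It compares an evenly mixed (IID) split with a label-skewed split, where each client holds few classes, loosely mimicking hospitals with different patient populations. The function names, the three synthetic classes and the client count are hypothetical choices made for this example only.

```python
import random
from collections import Counter


def partition_iid(labels, num_clients, seed=0):
    """Shuffle sample indices and deal them out evenly,
    so every client sees a similar mix of labels."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    return [indices[i::num_clients] for i in range(num_clients)]


def partition_by_label(labels, num_clients):
    """Sort indices by label and cut them into contiguous shards,
    so each client sees only a few classes (label skew)."""
    indices = sorted(range(len(labels)), key=lambda i: labels[i])
    shard = len(indices) // num_clients
    parts = [indices[k * shard:(k + 1) * shard] for k in range(num_clients)]
    # Any leftover samples go to the last client.
    parts[-1].extend(indices[num_clients * shard:])
    return parts


def label_histograms(labels, parts):
    """Count how many samples of each label every client holds."""
    return [Counter(labels[i] for i in part) for part in parts]


# Synthetic stand-in for a labelled image dataset: 3 classes, 100 samples each.
labels = [i % 3 for i in range(300)]

iid_parts = partition_iid(labels, num_clients=3)
skewed_parts = partition_by_label(labels, num_clients=3)

print("IID split:   ", label_histograms(labels, iid_parts))
print("Skewed split:", label_histograms(labels, skewed_parts))
```

Comparing the two histograms makes the degree of label skew visible before any training is run, which is one simple way to characterise a partitioning scheme.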

  • MSc Project 
  • Federated learning is a technique which allows training machine learning models in a distributed way without transferring the data from its source. It thus has potential applications in medical image analysis, where privacy and security issues are of great importance. Although there are examples of using federated approaches to the analysis of medical images, there is still a need for research in this area and for experiments in a distributed environment.

The goal of the thesis will be to apply federated learning techniques to the problem of medical image segmentation. We plan to use public datasets, such as echocardiography data, coming from multiple sources. The analysis will be performed using distributed computing frameworks such as Flower or FedML, on distributed computing infrastructures such as PL-Grid or a public cloud service. In addition to evaluating the learning process, the goal will also be to evaluate the performance of the distributed computing environment. Further study will also cover possible attacks and the security of the developed solution. Other types of data and machine learning tasks can be considered for comparison as well.
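The idea of training without moving the data can be sketched with federated averaging (FedAvg), shown here in plain Python on a toy linear model rather than on images. This is an illustrative sketch under stated assumptions, not the project's implementation, and it does not use the Flower or FedML APIs. The synthetic data, the three clients, the learning rate and all function names are hypothetical. Each client trains locally on its own samples, and only model weights are sent back and averaged, weighted by client dataset size.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """One client's local training: gradient descent on y = w*x + b
    using only that client's own samples."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(n, x_low, x_high, rng):
    """Synthetic samples of y = 2x + 1 with small noise (assumed ground truth)."""
    xs = (rng.uniform(x_low, x_high) for _ in range(n))
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]


def run_federated_training(clients, rounds=50):
    """Repeat: broadcast global weights, train locally, average the results."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


rng = random.Random(0)
# Three hypothetical sites with different sizes and input ranges.
clients = [
    make_client_data(40, 0.0, 1.0, rng),
    make_client_data(20, 0.5, 1.5, rng),
    make_client_data(60, -1.0, 0.5, rng),
]
w, b = run_federated_training(clients)
print(f"learned w={w:.3f}, b={b:.3f}")
```

The raw samples never leave `local_update`; the server only ever sees weight pairs. Real frameworks add communication, client sampling and security mechanisms on top of this basic loop.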

Publications

Dajda, Jacek; Idzik, Michał; Sroka, Jakub; Pawłowski, Mikołaj Sikora Wiktor; Smołka, Maciej; Jabłecki, Przemysław; Ślazyk, Filip; Malawski, Maciej; Majerz, Emilia; Pasternak, Aleksandra; Dzwinel, Witold

Current Trends in Software Engineering Bachelor Theses Journal Article

In: Computing and Informatics, 2021.


Jabłecki, P.; Ślazyk, F.; Malawski, M.

Federated Learning in the Cloud for Analysis of Medical Images - Experience with Open Source Frameworks Conference

2021.


Bubak, M; Czechowicz, K; Gubała, T; Hose, D R; Kasztelnik, M; Malawski, M; Meizner, J; Nowakowski, P; Wood, S

The EurValve model execution environment Journal Article

In: Interface Focus, vol. 11, no. 1, pp. 20200006, 2021, ISSN: 2042-8898.


Malawski, Maciej; Gajek, Adam; Zima, Adam; Balis, Bartosz; Figiela, Kamil

Serverless execution of scientific workflows: Experiments with HyperFlow, AWS Lambda and Google Cloud Functions Journal Article

In: Future Generation Computer Systems, vol. 110, pp. 502–514, 2020, ISSN: 0167739X.


Malawski, Maciej; Rzadca, Krzysztof (Ed.)

Euro-Par 2020: Parallel Processing Book

Springer International Publishing, Cham, 2020, ISBN: 978-3-030-57674-5.


Tomasiewicz, Dawid; Pawlik, Maciej; Malawski, Maciej; Rycerz, Katarzyna

Foundations for Workflow Application Scheduling on D-Wave System Proceedings Article

In: Krzhizhanovskaya, Valeria V; Závodszky, Gábor; Lees, Michael H; Dongarra, Jack J; Sloot, Peter M A; Brissos, Sérgio; Teixeira, Joao (Ed.): Computational Science -- ICCS 2020, pp. 516–530, Springer International Publishing, Cham, 2020, ISBN: 978-3-030-50433-5.


Avati, Valentina; Blaszkiewicz, Milosz; Bocchi, Enrico; Canali, Luca; Castro, Diogo; Cervantes, Javier; Grzanka, Leszek; Guiraud, Enrico; Kaspar, Jan; Kothuri, Prasanth; Lamanna, Massimo; Malawski, Maciej; Mnich, Aleksandra; Moscicki, Jakub; Murali, Shravan; Piparo, Danilo; Tejedor, Enric

Declarative Big Data Analysis for High-Energy Physics: TOTEM Use Case Proceedings Article

In: Yahyapour, Ramin (Ed.): Euro-Par 2019: Parallel Processing, pp. 241–255, Springer International Publishing, Cham, 2019, ISBN: 978-3-030-29400-7.


Nowakowski, Piotr; Bubak, Marian; Bartyński, Tomasz; Gubała, Tomasz; Harężlak, Daniel; Kasztelnik, Marek; Malawski, Maciej; Meizner, Jan

Cloud computing infrastructure for the VPH community Journal Article

In: Journal of Computational Science, vol. 24, pp. 169–179, 2018, ISSN: 18777503.


Malawski, Maciej; Juve, Gideon; Deelman, Ewa; Nabrzyski, Jarek

Algorithms for cost- and deadline-constrained provisioning for scientific workflow ensembles in IaaS clouds Journal Article

In: Future Generation Computer Systems, vol. 48, pp. 1–18, 2015, ISSN: 0167739X.


Malawski, Maciej; Figiela, Kamil; Nabrzyski, Jarek

Cost minimization for computational applications on hybrid cloud infrastructures Journal Article

In: Future Generation Computer Systems, vol. 29, no. 7, pp. 1786–1794, 2013.


Malawski, Maciej; Bartyński, Tomasz; Bubak, Marian

Invocation of operations from script-based Grid applications Journal Article

In: Future Generation Computer Systems, vol. 26, no. 1, pp. 138–146, 2010, ISSN: 0167-739X.


Ślazyk, F.; Jabłecki, P.; Malawski, M.; Płotka, P.

CXR-FL: Deep Learning-based Chest X-ray Image Analysis Using Federated Learning Conference

22nd International Conference on Computational Science.


Open Positions