Center for Molecular Medicine Cologne

Bozek, Katarzyna - assoc. JRG 04

Data Science of Bioimages


We are interested in solving large-scale data challenges in biology and medicine. Using deep learning, we aim to develop new data-driven approaches to the study of image and video data in biology. Our theoretical interests lie in using machine learning to find data representations suited to specific scientific questions. We develop and apply computational and analytical solutions to questions in cancer and ageing research.

Located in the interdisciplinary environment of the CMMC, our group is involved in multiple close collaborations with partners from experimental and medical backgrounds. With a strong applied focus on biomedical research, we search for ways in which computational methods, and machine learning in particular, can help answer specific medical and biological questions. We currently pursue two major projects.

Systems medicine of triple negative breast cancer

Highly differentiated diseases require large-volume, high-resolution datasets and novel analytical approaches to find recurring patterns in a large spectrum of genetic and molecular variation. In collaboration with experimental groups at the Charité University Hospital Berlin, we address the question of high molecular heterogeneity in triple negative breast cancer (TNBC). This most aggressive breast cancer subtype, with high rates of recurrence and mortality, currently offers no therapy options.

In this project we expand on existing knowledge of the genetic heterogeneity between and within TNBC tumors by investigating differences in their transcriptome, proteome and morphology, as well as in their extracellular matrix components and immune cell environment. Our multi-level analysis is based on patient samples from prospective clinical trials.

We strive to establish a novel type of biomarker that, instead of relying on the levels of individual molecules, integrates multiple types of information and is quantified using machine learning and mathematical models.

C. elegans phenotyping

C. elegans, a tiny nematode worm, is used to study a broad range of questions in biology, from disease to neural function. This apparently simple organism shows a broad repertoire of behaviors that are incomprehensible to a human observer. These behaviors might be representative of its health and disease phenotype.

We employ deep learning methods to quantify and search for distinct motion patterns representative of the worm's molecular phenotype. The grand challenge of this project is finding ways to represent worm posture and dynamics. Inspired by methods from language processing, we aim to find analogous, meaningful representations of the "words" and "sentences" in the language of worm behavior.


Recent progress in both imaging and image analysis techniques is opening ways to elevate image data to a new scale in biological research. Whether in cancer diagnostics or the phenotyping of model organisms, we aim to establish methods and approaches that allow for broad, data-scientific use of visual data in biology.

Lab Website

For further information, please visit the webpage of the Bozek Laboratory for Data Science of Images.

  • K Bozek, L Hebert, AS Mikheyev, GJ Stephens. "Pixel personality for dense object tracking in a 2D honeybee hive."
  • K Bozek, L Hebert, AS Mikheyev, GJ Stephens. "Towards dense object tracking in a 2D honeybee hive." Computer Vision and Pattern Recognition 2018
  • EE Khrameeva, I Kurochkin, K Bozek, P Giavalisco, P Khaitovich. "Lipidome Evolution in Mammalian Tissues." Mol Biol Evol 2018
  • K Bozek, EE Khrameeva, J Reznick, D Omerbašić, NC Bennett, GR Lewin, J Azpurua, V Gorbunova, A Seluanov, P Regnard, F Wanert, J Marchal, F Pifferi, F Aujard, Z Liu, P Shi, S Pääbo, F Schroeder, L Willmitzer, P Giavalisco, P Khaitovich. "Lipidome determinants of maximal lifespan in mammals." Sci Rep. 2017
  • Q Li*, K Bozek*, C Xu, Y Guo, J Sun, S Pääbo, CC Sherwood, PR Hof, JJ Ely, Y Li, L Willmitzer, P Giavalisco, P Khaitovich (*equal contribution). "Changes in lipidome composition during brain development in humans, chimpanzees and macaque monkeys." Mol Biol Evol 2017
  • K Bozek, Y Wei, Z Yan, X Liu, J Xiong, M Sugimoto, M Tomita, S Pääbo, CC Sherwood, PR Hof, JJ Ely, Y Li, D Steinhauser, L Willmitzer, P Giavalisco, P Khaitovich. "Organization and evolution of brain lipidome revealed by large-scale analysis of human, chimpanzee, macaque and mouse tissues." Neuron 2015
  • K Bozek, Y Wei, Z Yan, X Liu, J Xiong, M Sugimoto, M Tomita, S Pääbo, R Pieszek, CC Sherwood, PR Hof, JJ Ely, D Steinhauser, L Willmitzer, J Bangsbo, O Hansson, J Call, P Giavalisco, P Khaitovich. "Exceptional evolutionary divergence of human muscle and brain metabolomes parallels human cognitive and physical uniqueness." PLoS Biol 2014
  • EE Khrameeva, K Bozek, L He, Z Yan, X Jiang, Y Wei, K Tang, MS Gelfand, K Prufer, J Kelso, S Pääbo, P Giavalisco, M Lachmann, P Khaitovich. "Neanderthal ancestry drives evolution of lipid catabolism in contemporary Europeans." Nat Commun 2014
  • K Bozek, T Lengauer, S Sierra, R Kaiser, FS Domingues. "Analysis of Physicochemical and Structural Properties Determining HIV-1 Coreceptor Usage." PLoS Comp Bio. 2013
  • K Bozek, M Eckhardt, S Sierra, M Anders, R Kaiser, HG Kräusslich, B Müller, T Lengauer. "An expanded model of HIV cell entry phenotype based on multi-parameter single-cell data." Retrovirology. 2012
  • K Bozek, EE Nakayama, K Kono, T Shioda. "Electrostatic potential of human immunodeficiency virus type 2 and rhesus macaque simian immunodeficiency virus capsid proteins." Front Microbiol. 2012
  • K Bozek, T Lengauer. "Positive selection of HIV host factors and the evolution of lentivirus genes." BMC Evol Biol. 2010
  • K Bozek, AL Rosahl, S Gaub, S Lorenzen, H Herzel. "Circadian transcription in liver." Biosystems. 2010
  • A Kuroishi, K Bozek, T Shioda, EE Nakayama. "A single amino acid substitution of the human immunodeficiency virus type 1 capsid protein affects viral sensitivity to TRIM5alpha." Retrovirology. 2009
  • K Bozek, A Thielen, S Sierra, R Kaiser, T Lengauer. "V3 loop sequence space analysis suggests different evolutionary patterns of CCR5- and CXCR4-tropic HIV." PLoS One. 2009
  • K Kono, K Bozek, FS Domingues, T Shioda, EE Nakayama. "Impact of a single amino acid in the variable region 2 of the Old World monkey TRIM5alpha SPRY (B30.2) domain on anti-human immunodeficiency virus type 2 activity." Virology. 2009
  • K Bozek, A Relogio, SM Kielbasa, M Heine, C Dame, A Kramer, H Herzel. "Regulation of clock-controlled genes in mammals." PLoS One. 2009
  • K Bozek, SM Kielbasa, A Kramer, H Herzel. "Promoter analysis of mammalian clock-controlled genes." Genome Inform. 2007
  • K Bozek. "OCEAN GenRap for DB2 9." Software Developer's Journal. 2007
  • K Bozek, A Gambin, B Wilczynski, J Tiuryn. "Automated modeling of genetic control in Arabidopsis Thaliana." Journal of Fruit and Ornamental Plant Research 2006
Dr. Katarzyna Bozek

Center for Molecular Medicine Cologne | Lab. of Data Science of Bioimages | CMMC Research Building

CMMC - PI - assoc. JRG 04

+49 221 478 89529

Robert-Koch-Str. 21

50931 Cologne

CMMC Profile Page

Curriculum Vitae (CV)

Publications on PubMed
