The Computational Biology of Aging group applies state-of-the-art computational approaches to challenges in aging, stress (including disease), and regulatory acceptance. Central to our approach is the integration of multiple levels of data to better understand the biological system under study.
We draw on available complex datasets, including OMICs, chemistry, and other metadata, to develop, for example, novel multivariate biomarkers or new adverse outcome pathways.
Advances in OMICs technologies have brought in a new era of understanding of human and environmental health. Increasingly large projects are being designed to establish the molecular basis of a multitude of societal challenges. From the fight against cancer, the establishment of early detection systems for (preventable) diseases, and the development of new drugs and repurposing of existing ones for novel therapies, to understanding the impact of human action on the environment, OMICs are being used as key tools to establish this knowledge.
Data generated by OMICs technologies are rarely simple and can contain any number of features that represent the molecular building blocks. In transcriptomics, for example, working at the transcript level can yield up to 300,000 molecular features, while in mass spectrometry-based proteomics up to 6,000 proteins can be measured simultaneously. Analysing these datasets therefore requires a very specific skillset: an understanding of the peculiarities of the technology, knowledge of the correct statistical and computational methodology, and the means to interpret and understand the results.
Even more complex and exciting is the integration of multiple datasets, both vertically (multiple datasets from a single individual) and horizontally (multiple datasets from several independent individuals). This can allow for the development of more accurate multivariate biomarkers, the identification of novel pathways linked to the challenge, and an improved general understanding of the biological system being studied. Here we present a few active projects that utilise OMICs and metadata to highlight the effectiveness of applied computational biology.
An important aspect of the aging process is accrued DNA damage, particularly damage that has accumulated over the lifetime of an organism. Human actions, including the release of chemicals, climate change, or habitat loss, can contribute to an increase in DNA damage across all organisms exposed to the environment. In addition, early developmental stages are particularly sensitive to external signals, and exposure at these stages can lead to numerous adverse outcomes including early death, increased disease risk, or reduced fecundity.
To better understand the impact of external influences on developing organisms, we exposed Danio rerio (zebrafish) embryos to a suite of industrial, pharmaceutical, and other chemical agents known to be in the environment. For a total of 158 compounds in up to 5 concentrations we generated transcriptomic and phenotypic readouts including death, deformities, behaviour, and cardiac output. Changes in heart rate were observed in three different states: upregulation (Figure 1B), downregulation (Figure 1C and D), and no change (Figure 1A).
Using the transcriptomics data, we then built a model to predict these changes in heart rate and achieved up to 90% cross-validated accuracy. In parallel, predicting heart rate from chemical structure alone achieved only up to 60% accuracy, suggesting that chemical structure by itself is not sufficient for predicting heart rate. To further explore the chemical structural relationships, we performed a principal component analysis (PCA) with associated clustering to identify the number of chemical structure groups in our dataset (Figure 2A).
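As an illustration of what a cross-validated accuracy means, the following minimal sketch runs a 5-fold cross-validation with a simple nearest-centroid classifier. The classifier, the synthetic data, and all names in the code are assumptions made for this example only; the model and data used in the project are not described here and may differ substantially.

```python
import random
from statistics import mean

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [mean(col) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))

def cross_validated_accuracy(X, y, k=5, seed=0):
    """Mean accuracy over k folds; each sample is held out exactly once."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training part only, so held-out samples never leak in.
        centroids = {
            label: centroid([X[i] for i in train if y[i] == label])
            for label in set(y[i] for i in train)
        }
        hits = sum(predict(centroids, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return mean(scores)

# Synthetic stand-in data (not real measurements): three heart-rate
# classes, each with four made-up expression features.
rng = random.Random(1)
shift = {"up": 2.0, "down": -2.0, "no_change": 0.0}
X, y = [], []
for label, mu in shift.items():
    for _ in range(30):
        X.append([rng.gauss(mu, 1.0) for _ in range(4)])
        y.append(label)

accuracy = cross_validated_accuracy(X, y, k=5)
print(f"5-fold cross-validated accuracy: {accuracy:.2f}")
```

The key design point is that the class centroids are recomputed inside every fold from the training samples alone, so the reported accuracy reflects performance on compounds the model has not seen.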
Similarly, we clustered the molecular responses, which showed a very different clustering pattern (Figure 2B). A closer examination of the cluster memberships found that only 1 of the 4 groups in each approach had the same membership, suggesting that the molecular response mirrors chemical structure only for very specific structures. We are continuing to develop further predictive models describing these data.
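One simple way to compare the memberships of two clusterings of the same compounds is to match each cluster in one approach to its most similar cluster in the other using the Jaccard index. The sketch below does this on toy assignments for eight hypothetical compounds (c1 to c8); the labels and memberships are invented for illustration and are not the study's data, and the Jaccard index is our choice for this example rather than a description of the analysis performed.

```python
from collections import defaultdict

def groups(assignment):
    """Invert {compound: cluster_label} into {cluster_label: set of compounds}."""
    out = defaultdict(set)
    for compound, label in assignment.items():
        out[label].add(compound)
    return dict(out)

def jaccard(a, b):
    """Shared members divided by all members of either set."""
    return len(a & b) / len(a | b)

def best_matches(first, second):
    """For each cluster in `first`, the most similar cluster in `second`."""
    g1, g2 = groups(first), groups(second)
    result = {}
    for label, members in g1.items():
        partner = max(g2, key=lambda other: jaccard(members, g2[other]))
        result[label] = (partner, jaccard(members, g2[partner]))
    return result

# Toy cluster assignments (hypothetical): S* from chemical structure,
# R* from molecular response.
structure_clusters = {"c1": "S1", "c2": "S1", "c3": "S2", "c4": "S2",
                      "c5": "S3", "c6": "S3", "c7": "S4", "c8": "S4"}
response_clusters = {"c1": "R1", "c2": "R1", "c3": "R2", "c5": "R2",
                     "c4": "R3", "c7": "R3", "c6": "R4", "c8": "R4"}

matches = best_matches(structure_clusters, response_clusters)
# A Jaccard index of 1.0 means the two clusters contain exactly the same compounds.
identical = [s for s, (_, score) in matches.items() if score == 1.0]
print("Structure clusters with identical response membership:", identical)
```

In this toy example only one structure cluster has an identical counterpart among the response clusters, while the others are split across several response clusters.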
For thousands of years humans have used bee products in numerous applications, such as honey as a sweetener, beeswax as a waterproof coating, or even as medicine. The medical applications are particularly interesting, and possibly every bee product has a medicinal benefit. Several bee products, including honey, propolis, and venom, have shown antimicrobial, antioxidative, antiaging, anti-inflammatory, and anticarcinogenic activity. Moreover, wound healing and gastroprotective effects have been observed for propolis, honey, and royal jelly. Honey, in particular, has been shown to contain many compounds relevant to anti-cancer activity.
One of these compounds, Apigenin, was first described by Birt et al., who observed anti-mutagenic activity and anti-promotion activity against carcinogenesis in Salmonella typhimurium and mouse skin epidermis. This led to Apigenin being tested on several cancer cell lines, including colorectal, breast, lung, prostate, cervical, ovarian, glioblastoma, leukaemia, melanoma, pancreatic, and osteosarcoma lines, to establish its potency in anti-cancer and cancer-preventing functions. Apigenin was found to trigger diverse anticancer mechanisms across these cell lines, including induction of apoptosis, cell cycle arrest, metastasis inhibition, and anti-angiogenesis. It also demonstrated low toxicity against non-cancerous cells in comparison with cancerous cells, which is one of the requirements for becoming a clinical candidate.
The mechanism by which Apigenin functions, however, has remained elusive. Because the effect of Apigenin on cell cycle arrest suggests an impact on protein phosphorylation in and around the cell cycle, we employed a phosphoproteomics approach. For 3 concentrations (IC10, IC20, and IC30) and 3 time points (30, 60, and 90 min) we extracted and analysed the impact of Apigenin on the phosphoproteome. A modified geneset enrichment analysis approach that we developed for phosphoproteomics datasets identified a number of pathways associated with concentration and with the interaction between concentration and time. Interestingly, epigenetic regulation of gene expression was identified as one of the most significant functions impacted by Apigenin exposure over time and concentration. To further understand the potential mechanisms, we performed a NetworKIN analysis of the data, which assigns potential kinases to the observed phosphorylation changes, and identified CDK1 as a key kinase linked to the observed changes. In addition, 60% of the directly connected proteins in this analysis are related to epigenetic function, further strengthening our observations (Figure 3).
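To give a feel for what a pathway enrichment test asks, the sketch below implements a basic over-representation test based on the hypergeometric distribution. This is a deliberately simpler stand-in and not the modified geneset enrichment analysis developed for the phosphoproteomics data, whose details are not given here. The protein identifiers, set sizes, and function names are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when drawing n items without replacement from N items,
    of which K belong to the set of interest."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

def enrichment(changed, pathway, background):
    """Overlap size and one-sided p-value for over-representation of
    `pathway` members among the `changed` proteins."""
    changed = changed & background
    pathway = pathway & background
    overlap = len(changed & pathway)
    p_value = hypergeom_upper_tail(overlap, len(pathway), len(changed), len(background))
    return overlap, p_value

# Hypothetical identifiers: 1000 measured proteins, a 50-member pathway,
# and 40 proteins with changed phosphorylation, 10 of them in the pathway.
background = {f"P{i}" for i in range(1000)}
pathway = {f"P{i}" for i in range(50)}
changed = {f"P{i}" for i in range(40, 80)}

overlap, p_value = enrichment(changed, pathway, background)
print(f"overlap = {overlap}, p = {p_value:.2e}")
```

By chance, about 2 of the 40 changed proteins would be expected to fall in a pathway covering 5% of the background, so an overlap of 10 yields a small p-value. In a real analysis many pathways are tested, and the resulting p-values would need correction for multiple testing.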
A literature search of the identified functions and kinases showed that CDK1 is known to be upregulated in ovarian cancers compared to normal cells. Disruption of CDK1 activity and expression resulted in apoptosis and cell cycle arrest at the G2/M phase. Apigenin was reported to inhibit the activation and expression of cyclin B and cdc25c, both of which act as regulators of CDK1. This could explain why Apigenin exhibits more antiproliferative activity in CDK1-overexpressing cancer cells than in normal cells. Moreover, Apigenin has also been implicated in the inhibition of DNA topoisomerase I and II, resulting in increased DNA stress and fragmentation, which relates directly to the epigenetic function identified. Lastly, the strongest interaction in this NetworKIN analysis was identified between GSK3A and PDHK1. GSK3A is involved in a wide range of cellular processes. Deactivation of GSK3A caused inhibition of its downstream pathways, which are involved in multiple cellular mechanisms such as cell progression, proliferation, and RNA translation, again providing a direct link to the epigenetic impact observed. In summary, it is likely that Apigenin targets multiple cellular activities to exert its antiproliferative activity (Figure 4). Further experiments into the impact of Apigenin on cells continue.
Aging, and longevity in particular, has always been a central theme of scientific research. With the advance of molecular technologies, it is now possible to study aging at multiple levels of the biological hierarchy and, even more excitingly, in an integrated manner to establish the molecular interactome leading to longer life.
One may envision comparing older persons with kidney failure, as rapidly aging persons (and tissues) such as those in cohorts collected at CECAD, with individuals aging at an average rate and individuals with an exceptionally healthy aging trajectory (collected at LUMC).
This project will focus on establishing a collaborative exchange with world-leading researchers in the aging field in and around the Cologne campus across the CECAD, University Hospital Cologne, MPI-AGE, CMMC, and LUMC.
LUMC currently holds the Leiden Longevity Study (LLS) on long-lived individuals and their siblings, their middle-aged offspring, and the partners thereof, including clinical, molecular, and demographic data. Relevant for this project are the participants of IOP2 and 3 and an intervention study, GOTO. In parallel, Dept. 2 of Internal Medicine (UHC/CECAD) holds several cohorts of patients suffering from chronic kidney disease (CKD), including intervention studies with similar data depth (Figure 5).
CKD is one of the key morbidities leading to premature aging and increased risk of aging-associated diseases in humans. The dietary and physical exercise intervention substudies of the LLS and the CKD cohorts will complement the knowledge of beneficial and adverse molecular profiles in the circulation related to kidney aging, with a specific focus on proteomic responses. Dietary interventions are among the most powerful tools to increase lifespan and organismal fitness, and their effects are conserved in evolution.
Use of such approaches in elderly individuals and patients suffering from CKD is expected to reverse aging-associated changes. Inclusion of data before and after these interventions allows for a dynamic, longitudinal view. Taken together, the combination of these large datasets is a unique asset and allows for a deep molecular phenotyping of aging combined with clinical characteristics in several conditions of human aging and longevity. One approach to utilising this large collection of data is to develop a network representation of the interactions between every possible combination of features (Figure 6).
In its simplest form this can be represented by a correlation network. However, correlation does not always perform well, for example when data are zero-inflated, a hallmark of count-based datasets. For this reason, we will use modelling techniques matched to the properties of the data to link features with each other. For zero-inflated data this could be a hurdle or negative binomial model, for demographics a binomial model, and for normally distributed data a Gaussian model. Such an approach then allows us to develop a better understanding of the relationships between molecular responses and the aging processes.
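The two ideas in this paragraph can be sketched as follows: a correlation network as the simplest starting point, and a rule that suggests a model family from basic properties of each feature. The feature names, values, correlation threshold, and zero-fraction cutoff are all hypothetical, and the family-selection heuristic is an assumption made for illustration; fitting the suggested models would require a statistics library and is not shown.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def suggest_family(values, zero_fraction_cutoff=0.3):
    """Heuristic choice of model family; the cutoff is an assumed value."""
    if set(values) <= {0, 1}:
        return "binomial"
    if all(float(v).is_integer() and v >= 0 for v in values):
        zero_fraction = sum(v == 0 for v in values) / len(values)
        return "hurdle" if zero_fraction >= zero_fraction_cutoff else "negative binomial"
    return "gaussian"

def correlation_network(features, threshold=0.8):
    """Edges between feature pairs whose absolute correlation reaches the threshold."""
    edges = []
    for a, b in combinations(features, 2):
        r = pearson(features[a], features[b])
        if abs(r) >= threshold:
            edges.append((a, b, round(r, 2)))
    return edges

# Made-up measurements for six individuals.
features = {
    "protein_A": [1.2, 2.1, 2.9, 4.2, 5.1, 5.8],
    "protein_B": [0.9, 2.0, 3.2, 3.9, 5.0, 6.1],
    "transcript_count": [0, 0, 0, 12, 0, 30],
    "smoker": [0, 1, 0, 0, 1, 1],
}

edges = correlation_network(features)
families = {name: suggest_family(values) for name, values in features.items()}
print(edges)
print(families)
```

In this toy data the two continuous protein features are linked by the correlation network, whereas the zero-heavy count feature is not linked to either protein at the chosen threshold, even though its non-zero values occur only at high protein levels. This is the kind of relationship for which a model suited to zero-inflated counts may be more appropriate than a plain correlation.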
In addition, more accurate biomarkers predicting age and disease could highlight key biological processes that need to be further studied in the context of longevity and general health or detect confounding factors associated with disease.
The approaches underlying computational biology can be applied to numerous challenges in biomedical and environmental fields. OMICs analyses benefit from the unbiased nature of these technologies and can yield novel insight into the underlying biological mechanisms in a fraction of the time required for more classical molecular biology approaches. Data integration approaches further help in developing more robust biomarkers and associated knowledge, leading to better patient stratification and more personalised medicine.
For further information please check the Antczak Lab - Computational Biology of Aging webpage.