Project 1
Prediction of complex human traits from genomic data using machine learning
methods and informative priors

Leader: Athina Spiliopoulou
Tutor: Reka Nagy
Participants: Konstantin Sharafutdinov, Elena Carnero, Dunja Vucenovic, Tatiana Shashkova

Project Description
In this project we sought to improve the prediction of complex human health traits from genomic data by evaluating different ways of incorporating domain knowledge into parametric and non-parametric prediction methods.

Project 2

Experimental design in Omics - simulation studies

Leaders: Jeanine Houwing-Duistermaat, Mar Rodriguez Girondo

Tutors: Lucija Klaric and Frano Vuckovic

Participants: Sara Koska, Ena Melvan, Manshu Song, Viktoria Szeifert

Project Description

In this project we evaluated how the different steps of an experimental design can introduce errors, biases, and loss of efficiency into the data analysis. To do so, we analysed two case-control studies. In both, the final goal was to measure glycans in order to assess their relationship with a disease (lupus and multiple sclerosis, respectively). This requires measuring the glycans efficiently and without introducing bias. Topics discussed were the distinction between biological and technical variation, experimental design, randomization, replication, and missing data. We performed simulations under various scenarios and discussed adequate strategies for efficient designs and data analysis plans for “-omics” case-control studies.
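As an illustration of the kind of scenario such simulations can cover, the following minimal Python sketch compares a design in which case status is fully confounded with the measurement plate against one in which samples are randomized across plates. It is not the code used in the project: the function names, the two-plate layout, the sample size and the size of the plate shift are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_effect_estimate(randomized, n_per_group=48, true_effect=0.0,
                             plate_shift=1.0, noise_sd=1.0, rng=None):
    """Return the estimated case-control difference for one simulated study.

    Samples are measured on two plates, and plate 2 adds a technical shift.
    All parameter values are hypothetical, not taken from the project.
    """
    rng = rng or random.Random()
    status = [1] * n_per_group + [0] * n_per_group  # 1 = case, 0 = control
    if randomized:
        rng.shuffle(status)  # random allocation of samples to plate positions
    # The first half of the positions is plate 1, the second half is plate 2.
    # Without shuffling, all cases sit on plate 1 and all controls on plate 2.
    half = len(status) // 2
    cases, controls = [], []
    for position, is_case in enumerate(status):
        plate_effect = plate_shift if position >= half else 0.0
        value = true_effect * is_case + plate_effect + rng.gauss(0.0, noise_sd)
        (cases if is_case else controls).append(value)
    return statistics.mean(cases) - statistics.mean(controls)


def mean_estimate(randomized, n_sim=500, seed=1):
    """Average the estimated difference over many simulated studies."""
    rng = random.Random(seed)
    return statistics.mean(
        simulate_effect_estimate(randomized, rng=rng) for _ in range(n_sim)
    )


if __name__ == "__main__":
    print("randomized design:", round(mean_estimate(True), 3))
    print("confounded design:", round(mean_estimate(False), 3))
```

With no true disease effect, the randomized design gives an average estimate close to zero, whereas the confounded design attributes the whole plate shift to the disease.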

Mini-conference presentation

Project 3
Discovering the difference in genetic control of plasma and IgG glycosylation

Leaders and tutors: Sodbo Sharapov, Yakov Tsepilov, Olga Zaitseva

Participants: Alyce Russel, Ashley van der Spek, FeiFei Zhao

Project Description
Many plasma proteins are modified by covalently bound glycans, oligosaccharides that are biologically important for normal biochemical processes such as protein folding and cellular signaling. The biosynthesis of such glycans occurs in the endoplasmic reticulum in tandem with protein biosynthesis. Yet, unlike proteins, glycans do not follow a genetic template. Instead, a complex network of hundreds of genes is involved in the biosynthetic pathway, and many genes controlling glycosylation are unknown. Immunoglobulin G (IgG), transferrin and fibrinogen are the most prevalent glycoproteins in the plasma proteome. IgG is heavily studied, and its glycan moieties are known to have a downstream influence on effector functions, that is, on whether it is anti- or pro-inflammatory. Genome-wide association studies (GWAS) are a powerful tool for discovering new associations between common genetic variants and phenotypes of interest. GWAS have already been used successfully on the plasma and IgG glycomes separately, identifying important glyco-genes; however, there have been fewer studies on the glycomes of other plasma proteins. In this project we were interested in using GWAS to analyse the plasma glycome with the IgG fraction of glycans excluded, with the aim of identifying new loci associated with the glycosylation of other important plasma glycoproteins.

Project 4

Longitudinal and Survival data analysis

Leader: Ivo Ugrina

Tutor: Frano Vuckovic

Participants: Zlata Cherpakova, Nikolina Sostaric

Project Description

During the project, students learnt the basics of longitudinal and survival data analysis. Emphasis was given to the Friedman test and the Cox proportional hazards model. Students learnt how to work with the appropriate R functions for testing, visualizing and describing data. As an additional (unconventional) approach, students learnt about basic tensor algebra and tensor decompositions such as the PARAFAC and Tucker decompositions. These methods were applied to longitudinal data to test the quality and interpretability of decompositions of data in which time is one mode. Results: longitudinal and survival data analysis was conducted on two real-life data sets. Additionally, PARAFAC tensor decomposition was applied to longitudinal data, yielding an interesting representation of the data and some insight into problematic structure in the data. The results of the project seem interesting and will be pursued in future scientific research.
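To make the Friedman test mentioned above concrete, here is a minimal Python sketch of its test statistic for repeated measurements on the same subjects. The project itself used R, where the built-in `friedman.test` function provides the full test; the function name `friedman_statistic` and the toy data below are assumptions for illustration only, and the sketch omits the correction factor for ties and the p-value.

```python
def friedman_statistic(rows):
    """Friedman chi-square statistic for repeated measures.

    rows: one list per subject, each holding k measurements
    (one per time point). Tied values receive average ranks;
    the tie correction factor is omitted for brevity.
    """
    n = len(rows)
    k = len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        # Indices of the time points, ordered by this subject's values.
        order = sorted(range(k), key=lambda j: row[j])
        i = 0
        while i < k:
            # Find the run of tied values starting at sorted position i.
            j = i
            while j + 1 < k and row[order[j + 1]] == row[order[i]]:
                j += 1
            avg_rank = (i + j) / 2 + 1  # ranks are 1-based
            for pos in range(i, j + 1):
                rank_sums[order[pos]] += avg_rank
            i = j + 1
    sum_sq = sum(r * r for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Hypothetical data: 4 subjects, 3 time points, values always increasing.
    increasing = [[1.0, 2.0, 3.0], [0.5, 0.9, 1.4], [2.0, 2.5, 4.0], [1.1, 1.2, 1.3]]
    print(friedman_statistic(increasing))
```

When every subject shows the same ordering across time points, the statistic reaches its maximum of n(k - 1); when the rank sums are equal across time points, it is zero.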

Mini-conference presentation