The Histone Code case:
A Semantic Web approach to data integration
In the context of the Virtual Laboratory for e-Science project, we investigate a new integrative approach based on Semantic Web technology to elucidate the complex relationship between the 'histone code', DNA sequence, and gene expression.
Biological background: the Histone Code case
Histones are proteins that pack DNA into higher-order structures and influence processes such as transcription, repair and replication of DNA. They have been implicated in diseases such as cancer (e.g. Santos-Rosa and Caldas, 2005) and Huntington's Disease (e.g. Steffan et al., 2001). Histones can undergo several distinct post-translational chemical modifications, including acetylation, methylation, phosphorylation and ubiquitylation. These modifications can be passed on to subsequent generations and are hypothesized to govern the transcriptional state of the genome. Because series of histone modifications form patterns that can be recognized or acted on by other proteins, it is believed that these patterns form a 'histone code' on top of the DNA code (Peterson and Laniel, 2004; Strahl and Allis, 2000). An intricate relationship between the histone code, transcriptional activity, and DNA sequence is suspected but poorly understood. Combining knowledge and various sources of data related to these facets may provide the tools to reach a higher level of understanding.
A Semantic Web approach to data integration
From the computer science point of view, we investigate the application of Semantic Web technology to the integration of heterogeneous data sources in an environment for computational experimentation, an 'e-(bio)science' laboratory. We have defined a strategy for annotating biological data with domain-specific ontologies; this strategy entails the integration of both data and knowledge, and the formation of a histone-code knowledge model. We have built our own ontology for histones, 'HistOn', that covers the necessary concepts and levels of granularity. The flexibility of Semantic Web formats such as the Web Ontology Language (OWL) can be used to merge and extend ontologies as required by a computational experiment. To test our strategy, we ask whether it can help answer questions such as: Is there a relationship between histone modifications and transcription factor binding? To that end, we have used our approach to integrate two datasets from the UCSC genome browser: Transcription Factor Binding Sites and the binding density of the modified histone H3K4me3.
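The idea of integrating two annotation sets through a shared vocabulary can be illustrated with a minimal sketch. It is not the project's implementation: it uses plain Python tuples as subject-predicate-object triples instead of an RDF/OWL toolkit, and the namespace, class names (`TranscriptionFactorBindingSite`, `H3K4me3Region`), property names, coordinates and density value are all hypothetical placeholders rather than terms taken from HistOn or records from the UCSC genome browser.

```python
# Minimal sketch: two annotation sets expressed as subject-predicate-object
# triples, merged into one graph, then queried for overlapping regions.
# All identifiers, coordinates and scores below are made up for illustration.

HISTON = "http://example.org/histon#"  # hypothetical namespace, not the real HistOn URI


def region_triples(region_id, rdf_type, chrom, start, end, extra=None):
    """Describe one genomic region as a set of triples."""
    subject = HISTON + region_id
    triples = {
        (subject, "rdf:type", HISTON + rdf_type),
        (subject, HISTON + "chromosome", chrom),
        (subject, HISTON + "start", start),
        (subject, HISTON + "end", end),
    }
    for predicate, value in (extra or {}).items():
        triples.add((subject, HISTON + predicate, value))
    return triples


def describe(graph, subject):
    """Collect the predicate -> object pairs of one subject."""
    return {p: o for s, p, o in graph if s == subject}


def instances_of(graph, rdf_type):
    """Return the subjects typed as rdf_type, sorted for stable output."""
    return sorted(
        s for s, p, o in graph if p == "rdf:type" and o == HISTON + rdf_type
    )


def overlapping(graph, type_a, type_b):
    """Pair every type_a region with each type_b region it overlaps."""
    pairs = []
    for a in instances_of(graph, type_a):
        da = describe(graph, a)
        for b in instances_of(graph, type_b):
            db = describe(graph, b)
            same_chrom = da[HISTON + "chromosome"] == db[HISTON + "chromosome"]
            # Half-open intervals overlap when each starts before the other ends.
            starts_before_end = da[HISTON + "start"] < db[HISTON + "end"]
            ends_after_start = db[HISTON + "start"] < da[HISTON + "end"]
            if same_chrom and starts_before_end and ends_after_start:
                pairs.append((a, b))
    return pairs


# Dataset 1: transcription factor binding sites (toy values).
tfbs = (
    region_triples("tfbs1", "TranscriptionFactorBindingSite", "chr1", 1000, 1020)
    | region_triples("tfbs2", "TranscriptionFactorBindingSite", "chr1", 9000, 9015)
)

# Dataset 2: a region enriched for H3K4me3 (toy values).
h3k4me3 = region_triples(
    "peak1", "H3K4me3Region", "chr1", 900, 1500, extra={"density": 0.8}
)

# Because both datasets use one vocabulary, integration is a plain set union.
graph = tfbs | h3k4me3

for site, peak in overlapping(
    graph, "TranscriptionFactorBindingSite", "H3K4me3Region"
):
    print(site, "overlaps", peak)
```

In this toy graph only `tfbs1` falls inside the H3K4me3 region. The point of the design is that once both sources are annotated with the same ontology terms, merging them needs no source-specific glue code, and a question such as "which binding sites coincide with modified histones?" becomes a query over the combined graph.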
Detailed explanation with data
People
Lennart Post is a PhD student at IBU and the Nuclear Organisation Group headed by Prof. Dr. Roel van Driel. A biochemist by education, his role is to apply and co-develop e-science technology for the histone code case study.
M. Scott Marshall is a postdoctoral researcher at IBU. A computer scientist by education, his role is to develop a 'semantic framework' that underlies a virtual laboratory for integrative bioinformatics. Scott is active for IBU in W3C's Semantic Web Health Care and Life Sciences Interest Group (HCLSIG).
Marco Roos is a postdoctoral researcher at IBU. He is a molecular cytologist by training, with additional training in computer science. With a background in chromatin research and data integration, his role is to bridge the gap between e-science developments and case studies related to DNA function.


