Computational approaches to study transcriptional regulation in the human genome

Doctoral thesis by Juan Manuel Vaquerizas Erdocia

Abstract

It is essential for an organism's viability to ensure that the correct sets of genes are expressed in the right place and at the right time. There are several mechanisms by which cells regulate the amount of protein produced from genes under different conditions. One of the most basic is transcriptional regulation. By controlling the recruitment of RNA polymerase and associated factors to gene promoters, and the assembly of the transcription initiation complex, transcription factors regulate the transcriptional process, and therefore the expression of particular genes. A large number of human diseases are caused by malfunctions in transcriptional regulation, highlighting the importance of this system. Here I present a computational study of transcriptional regulation in the human genome.

First, I identify and analyse the properties of 1,369 sequence-specific DNA-binding transcription factors in the human genome. I show that: i) 80% of transcription factors belong to just three protein families, with the C2H2 zinc finger family being the most common; ii) 40% of factors are spatially clustered in specific chromosomal regions, and as a result may function in a co-ordinated manner; iii) transcription factors either function specifically in one or two tissues or ubiquitously across the whole body, giving rise to a two-tier organisation of global and local regulators; and iv) groups of transcription factors have arisen in the human lineage at key events during evolution (such as the appearance of mammalian organisms).

Secondly, I examine how sequence variation in the human genome, and in particular single nucleotide polymorphisms (SNPs), disrupts the normal function of the transcriptional regulatory system. I predict functional nucleotide sequence motifs (such as transcription factor binding sites and exonic splicing enhancers) inside or in the proximity of genes, and identify SNPs that overlap with them.
Despite the simplicity of the approach, many of the predicted disruptive SNPs have been validated experimentally and have been associated with diseases.

Finally, none of the above results could have been obtained without the development of methods and tools required to perform a robust analysis of the data. In the past ten years, the tandem development of high-throughput technology and the sequencing of numerous genomes has produced a flood of data describing biological systems from a global perspective. These new data types often require special statistical or mathematical treatment in order to interpret them. I have devoted a large part of this dissertation to creating methods and web tools to analyse genomic data. These include approaches for: i) cDNA microarray normalisation and quality control; ii) identifying differentially expressed genes; iii) finding sets of genes with class prediction properties; iv) performing transcription factor annotation of microarray experiments; v) assessing the sensitivity and specificity of gene-level measurements for Affymetrix GeneChips; vi) detecting tissue-specific expression from microarray data; and vii) detecting binding signal in ChIP-chip tiling array experiments.

 

Academic details of the doctoral thesis "Computational approaches to study transcriptional regulation in the human genome"

  • Thesis title:  Computational approaches to study transcriptional regulation in the human genome
  • Author:  Juan Manuel Vaquerizas Erdocia
  • University:  Autónoma de Madrid
  • Thesis defence date:  22/02/2008

 

Supervision and examination committee

  • Thesis supervisor
    • Nicholas M Luscombe
  • Examination committee
    • Committee chair: Alfonso Valencia Herrera
    • Boris Lenhard (committee member)
    • Juan Fernando Poyatos Adeva (committee member)
    • Janet Thornton (committee member)

 
