NIH – Centers of Excellence in Genomic Science (CEGS) (RM1) – Department of Biochemistry and Molecular Biology

Participating Organization:

National Institutes of Health (NIH)

Funding Opportunity Title:

Centers of Excellence in Genomic Science (CEGS) (RM1)

Funding Opportunity Announcement (FOA):

PAR-14-195

Website:

http://www.genome.gov/10001771

Anticipated Number of Awards:

Approximately 10 at one time; no more than 2 awarded in a given year

Award Budget:

Up to $2M direct in any year ($10M direct over 5 years); an additional $500K for specialized equipment over the life of the award

Award Project Period:

Up to 5 years

Maximum Funding Period:

10 years

Letters of Intent Receipt Dates: 30 days before application receipt dates

Application Receipt Dates: May 20, 2015; May 20, 2016

Scientific Merit Review Dates: Nov 2015; Nov 2016

Advisory Council Dates: Jan 2016; Jan 2017

Earliest Start Dates: Apr 2016; Apr 2017

Funding Opportunity Purpose:

The Centers of Excellence in Genomic Science (CEGS) program establishes academic Centers for advanced genome research. Each CEGS grant supports a multi-investigator, interdisciplinary team to develop innovative genomic approaches to address a particular biomedical problem. A CEGS project will address a critical issue in genomic science or genomic medicine, proposing a solution that would be a very substantial advance. Thus, the research conducted at these Centers will entail substantial risk, balanced by outstanding scientific and management plans and very high potential payoff. A CEGS will focus on the development of novel technological or computational methods for the production or analysis of comprehensive data sets, or on a particular genome-scale biomedical problem, or on other ways to develop and use genomic approaches for understanding biological systems and/or significantly furthering the application of genomic knowledge, data and methods towards clinical applications. Exploiting its outstanding scientific plan and team, each CEGS will nurture genomic science at its institution by facilitating the interaction of investigators from different disciplines, and, by providing training to new and experienced investigators, it will expand the pool of highly qualified professional genomics scientists and engineers.

Overview:

The Human Genome Project (HGP) has produced a wealth of genomic data. The next challenge is to discover and analyze the vast amount of biological information contained within it. The Centers of Excellence in Genomic Science (CEGS) program supports the formation of multi-investigator, interdisciplinary research teams to develop novel and innovative genomic research projects, using the data sets and technologies developed by the HGP.

Each CEGS will conduct highly innovative research designed to develop new concepts, methods, technologies, or ways to analyze data that will substantially advance the state of the art in genomic approaches to the study of a biological problem. Thus, CEGS research will ultimately foster the wider application of comprehensive, high-throughput genomics methods to the study of human biology and disease.

A CEGS will require visionary leadership and strong management. While use of the Multi-PI option is permitted, the requirement for the contact PI to devote significant effort is intended to provide the focus needed to drive the project toward its goals.

Each CEGS is also required to have a training component that leverages the strengths of CEGS and its investigators to train the next generation of interdisciplinary scientists, who will bring creative approaches to studying biological problems through a genomic approach. This component of the program includes a specific focus on engaging the talents of individuals from underrepresented minority groups.

Essence of a Center of Excellence in Genomic Science:

A successful CEGS must include ALL of the following:

A CEGS is highly innovative: it is designed to develop new concepts, methods, technologies, or ways to produce or analyze comprehensive data sets, or it focuses on a particular genome-scale biomedical problem, or on other ways to develop and use genomic approaches for understanding biological systems and/or significantly furthering the application of genomic knowledge, data and methods towards clinical applications.
A CEGS proposes a very substantial advance in addressing a critical issue in genomic science or genomic medicine. Achieving a substantial advance entails risk; this is balanced by the potential for very high payoff and requires an outstanding scientific plan and effective management strategy.
A CEGS is a tightly focused project implemented by a multi-investigator, interdisciplinary team working in a highly integrated fashion. Components of the program must be interdependent, not simply related. Synergy and integration are key.
A CEGS will lay out a specific and substantive “product” that can be identified as having been the outcome of CEGS funding.
A CEGS will take on challenging aspects of a problem, including ones that have slowed progress in the chosen area of research.
A CEGS will increase the pool of professional scientists and engineers able to work in or use genomics, by offering innovative, substantive education and outreach opportunities across appropriate disciplines. It will integrate the training of new investigators and broaden the training of established ones. Graduate students and postdoctoral fellows, at a minimum, should participate in the research; however, that participation alone is insufficient as an education and outreach effort.

Additional characteristics of a CEGS:

A CEGS project may include an ELSI component if it is integrated with and closely related to the main focus or theme of the project.
Establishing a CEGS at an institution must add value beyond ongoing activities in genomics at that institution.
A CEGS project may propose very substantial improvement in current technology, to increase throughput and decrease cost.
A CEGS may choose a cell, organism, tissue, pathway, or disease as a model system in which to develop the concepts or methods, but those concepts or methods must be broadly applicable well beyond the chosen example.

A CEGS is NOT:

…an additional implementation of ideas already being pursued by the team or by others;

…the obvious next step in a project or field, which could be accomplished by assembling state-of-the-art components and innovating at the level of a typical R01;

…a program project;

…infrastructure for an existing program or department;

…primarily for the collection of a dataset in the absence of a novel concept or methodological approach;

…“only” outstanding science that fails to meet the criteria required of a CEGS.

The Application:

The Research Strategy must consist of the following subsections with the indicated page limits:

CEGS Research Project: 30 pages
Research Management, Education and Outreach: 6 pages

Active Centers of Excellence in Genomic Science Awards:

Center for Personal Dynamic Regulomes

Causal Transcriptional Consequences of Human Genetic Variation

Center for the Epigenetics of Common Human Disease

Neuropsychiatric Genome-Scale and RDOC Individualized Domains (N-GRID)

Wisconsin Center of Excellence in Genomics Science

Center for Cell Circuits

Genomic Analysis of the Genotype-Phenotype Map

Genomic Analysis of Network Perturbations in Human Disease

Center for Personal Dynamic Regulomes

P50 HG007735

Howard Chang

Stanford University

Despite the rapidly increasing capacity to sequence human genomes, our incomplete ability to read and interpret the information content in genomes and epigenomes remains a central challenge. A comprehensive set of regulatory events across a genome – the regulome – is needed to make full use of genomic information, but is currently out of reach for practically all clinical applications and many biological systems. The proposed Center will develop technologies that greatly increase the sensitivity, speed, and comprehensiveness of understanding genome regulation. We will develop new technologies to interrogate the transactions between the genome and regulatory factors, such as proteins and noncoding RNAs, and integrate variations in DNA sequences and chromatin states over time and across individuals. Novel molecular engineering and biosensor strategies are deployed to encapsulate the desired complex DNA transformations into the probe system, such that the probe system can be directly used on very small human clinical samples and capture genome-wide information in one or two steps. These technologies will be applied to clinical samples and workflows in real time to exercise their robustness and reveal for the first time epigenomic dynamics of human diseases during progression and treatment. These technologies will be broadly applicable to many biomedical investigations, and the Center will disseminate the technologies via training and diverse means.

Center Web Site: Center for Personal Dynamic Regulomes

Causal Transcriptional Consequences of Human Genetic Variation

P50 HG005550

George M. Church

Harvard University, Cambridge, Mass.

The Center for Transcriptional Consequences of Human Genetic Variation (CTCHGV) will develop innovative and powerful genetic engineering methods and use them to identify genetic variations that causally control gene transcription levels. Genome Wide Association Studies (GWAS) find many variations associated with disease and other phenotypes, but the variations that may actually cause these conditions are hard to identify because nearby variations in the same haplotype blocks consistently co-occur with them in human populations, so that specifically causative ones cannot be distinguished. About 95% of GWAS variations are not in gene coding regions, and many of these presumably associate with altered gene expression levels. CTCHGV will identify the variations that directly control gene expression by engineering precise combinations of changes to gene regulatory regions that break down the haplotype blocks, allowing each variation’s effect on gene expression to be discerned independently of the others. To perform this analysis, CTCHGV will extract ~100 kbp gene regulatory regions from human cell samples, create precise variations in them in E. coli, and re-introduce the altered regions into human cells, using zinc finger nucleases (ZFNs) to efficiently induce recombination. CTCHGV will target 1000 genes for this analysis (Aim 1), and will use human induced Pluripotent Stem cells (iPS) to study the effects of variations in diverse human cell types (Aim 2). To explore the effects of variations in complex human tissues, CTCHGV will develop methods of measuring gene expression at transcriptome-wide levels in many single cells, including in situ in structured tissues (Aim 3). 
Finally, CTCHGV will develop novel advanced technologies that integrate DNA sequencing and synthesis to construct thousands of large DNA constructs from oligonucleotides, that enable very precise targeting and highly efficient performance of ZFNs, and that enable cells to be sorted on the basis of morphology as well as fluorescence and labeling (Aim 4). CTCHGV will also develop direct oligo-mediated engineering of human cells, and create “marked allele” iPS that will enable easy ascertainment of complete exon distributions for many pairs of gene alleles in many cell types.

Center Web Site: Center for Causal Transcriptional Consequences of Human Genetic Variation (CTCHGV)

Center for the Epigenetics of Common Human Disease

P50 HG003233

Andrew P. Feinberg

Johns Hopkins University, Baltimore

(co-funded by National Institute of Mental Health)

Epigenetics, the study of non-DNA sequence-related heredity, is at the epicenter of modern medicine because it can help to explain the relationship between an individual’s genetic background, the environment, aging, and disease. The Center for the Epigenetics of Common Human Disease was created in 2004 to begin to develop the interface between epigenetics and epidemiologic-based phenotype studies, recognizing that epigenetics requires new ways of thinking about disease. We created a highly interdisciplinary group of faculty and trainees, including molecular biologists, biostatisticians, epidemiologists, and clinical investigators. We developed novel approaches to genome-wide DNA methylation (DNAm) analysis, allele-specific expression, and new statistical epigenetic tools. Using these tools, we discovered that most variable DNAm is in neither CpG islands nor promoters, but in what we term “CpG island shores,” regions of lower CpG density up to several kb from islands, and we have found altered DNAm in these regions in cancer, depression and autism. In the renewal period, we will develop the novel field of epigenetic epidemiology, the relationship between epigenetic variation, genetic variation, environment and phenotype. We will continue to pioneer genome-wide epigenetic technology that is cost-effective for large-scale analysis of population-based samples, applying our knowledge from the current period to second-generation sequencing for epigenetic measurement, including DNAm and allele-specific methylation. We will continue to pioneer new statistical approaches for quantitative and binary DNAm assessment in populations, including an Epigenetic Barcode. We will develop Foundational Epigenetic Epidemiology by examining the time-dependence, heritability and environmental relationship of epigenetic marks, examining heritability in MZ and DZ twins, and developing an epigenetic transmission disequilibrium test. 
We will then pioneer Etiologic Epigenetic Epidemiology, by integrating novel genome-wide methylation scans (GWMs) with existing Genome-Wide Association Study (GWAS) and epidemiologic phenotype data, a design we term Genome-Wide Integrated Susceptibility (GWIS), focusing on bipolar disorder, aging, and autism as paradigms for epigenetic studies of family-based samples, longitudinal analyses, and parent-of-origin effects, respectively. This work will be critical to realizing the full value of previous genetic and phenotypic studies, by developing and applying molecular and statistical tools necessary to integrate DNA sequence with epigenetic and environmental causes of disease.

Center Web Site: Center of Excellence in Genomic Science at Johns Hopkins

Neuropsychiatric Genome-Scale and RDOC Individualized Domains (N-GRID)

P50 MH106933

Isaac S. Kohane

Harvard Medical School

As a result of the accelerated pace of development of technologies for characterizing the human genome, the rate-limiting step for large-scale genomic investigation in clinical populations is now phenotyping. This is particularly the case for neuropsychiatric (NP) illness, where phenotypes are complex, biomarkers are lacking, and the primary cell types of interest are difficult to access directly. It has become apparent that both rare and common genetic variation contribute to disease risk and that this risk crosses traditional diagnostic boundaries in psychiatry. Taking advantage of a large, already-established NP biobank could dramatically accelerate progress toward understanding the cross-disorder mechanism of action of disease liability genes. This study proposes novel applications of emerging technologies in informatics and cellular neurobiology to eliminate this phenotyping bottleneck. In doing so, it will accelerate investigation of clinical and cellular phenotypes for understanding single and multilocus/polygenic associations. Aim 1: Adapt and expand one of the largest NP cellular biobanks by parsing electronic health records with gold-standard assessment of cognition and other RDoC phenotypes. Aim 2: Define the genome-wide multidimensional functional genomics (MFG) landscape in NP disease into which the transcriptomic signature (RNA-seq) of each induced neuron (IN) representing a clinically characterized individual is projected. The projection provides the mapping from molecular to phenotypic characterization and a directionality towards healthful/neurotypical states used in Aim 3. Aim 3: Develop a probabilistic model of gene expression dependencies that will predict which small molecular perturbations are likely to shift the IN transcriptomic signature in a healthful direction in the MFG, and then update the model based on measured perturbations in the MFG. 
Aim 4: Select patient samples to study in greater detail for epigenetic (DNA methylation, histone marks and RNA editing) and transcriptional control, particularly with regard to activity-dependent changes that have been implicated in many NP diseases. Aim 5: Assess how well the clinical phenotypes are informed by the genome-wide characterizations and assess which is more robust.

Center Web Site: Neuropsychiatric Genome-Scale and RDOC Individualized Domains (N-GRID)

Wisconsin Center of Excellence in Genomics Science

P50 HG004952

Michael Olivier

Medical College of Wisconsin, Milwaukee

The successful completion of the human genome and model organism sequences has ushered in a new era in biological research, with attention now focused on understanding the way in which genome sequence information is expressed and controlled. The focus of this proposed Wisconsin Center of Excellence in Genomics Science is to facilitate understanding of the complex and integrated regulatory mechanisms affecting gene transcription by developing novel technology for the comprehensive characterization and quantitative analysis of proteins interacting with DNA. This new technology will help provide for a genome-wide functional interpretation of the underlying mechanisms by which gene transcriptional regulation is altered during biological processes, development, disease, and in response to physiological, pharmacological, or environmental stressors. The development of chromatin immunoprecipitation approaches has allowed identification of the specific DNA sequences bound by proteins of interest. We propose to reverse this strategy and develop an entirely novel technology that will use oligonucleotide capture to pull down DNA sequences of interest, and mass spectrometry to identify and characterize the proteins and protein complexes bound to and associated with particular DNA regions. This new approach will create an invaluable tool for deciphering the critical control processes regulating an essential biological function. The proposed interdisciplinary and multi-institutional Center of Excellence in Genomics Science combines specific expertise at the Medical College of Wisconsin, the University of Wisconsin-Madison, and Marquette University. 
Technological developments in four specific areas will be pursued to develop this new approach: (1) cross-linking of proteins to DNA and fragmentation of chromatin; (2) capture of the protein-DNA complexes in a DNA sequence-specific manner; (3) mass spectrometry analysis to identify and quantify bound proteins; and (4) informatics to develop tools enabling the global analysis of the relationship between changes in protein-DNA interactions and gene expression. The Center will use carefully selected biological systems to develop and test the technology in an integrated genome-wide analysis platform that includes efficient data management and analysis tools. As part of the Center mission, we will combine our technology development efforts with an interdisciplinary training program for students and fellows designed to train qualified scientists experienced in cutting-edge genomics technology. Data, technology, and software will be widely disseminated by multiple mechanisms including licensing and commercialization activities.

Collaborating Institutions: University of Wisconsin-Madison, Marquette University

Center Web Site: Wisconsin Center of Excellence in Genomic Science

Center for Cell Circuits

P50 HG006193

Aviv Regev

The Broad Institute, Inc., Cambridge, Mass.

Systematic reconstruction of genetic and molecular circuits in mammalian cells remains a significant, large-scale and unsolved challenge in genomics. The urgency to address it is underscored by the sizeable number of GWAS-derived disease genes whose functions remain largely obscure, limiting our progress towards biological understanding and therapeutic intervention. Recent advances in probing and manipulating cellular circuits on a genomic scale open the way for the development of a systematic method for circuit reconstruction. Here, we propose a Center for Cell Circuits to develop the reagents, technologies, algorithms, protocols and strategies needed to reconstruct molecular circuits. Our preliminary studies chart an initial path towards a universal strategy, which we will fully implement by developing a broad and integrated experimental and computational toolkit. We will develop methods for comprehensive profiling, genetic perturbations and mesoscale monitoring of diverse circuit layers (Aim 1). In parallel, we will develop a computational framework to analyze profiles, derive provisional models, use them to determine targets for perturbation and monitoring, and evaluate, refine and validate circuits based on those experiments (Aim 2). We will develop, test and refine this strategy in the context of two distinct and complementary mammalian circuits. First, we will produce an integrated, multi-layer circuit of the transcriptional response to pathogens in dendritic cells (Aim 3) as an example of an acute environmental response. Second, we will reconstruct the circuit of chromatin factors and non-coding RNAs that control chromatin organization and gene expression in mouse embryonic stem cells (Aim 4) as an example of the circuitry underlying stable cell states. 
These detailed datasets and models will reveal general principles of circuit organization, provide a resource for scientists in these two important fields, and allow computational biologists to test and develop algorithms. We will broadly disseminate our tools and methods to the community, enabling researchers to dissect any cell circuit of interest at unprecedented detail. Our work will open the way for reconstructing cellular circuits in human disease and individuals, to improve the accuracy of both diagnosis and treatment.

Center Web Site: Center for Cell Circuits

Genomic Analysis of the Genotype-Phenotype Map

P50 HG002790

Simon Tavaré

University of Southern California, Los Angeles

Our Center, which started in 2003, focused on implications of haplotype structure in the human genome. Since that time, there have been extraordinary advances in genomics: Genome-wide association studies using single nucleotide polymorphisms and copy number variants are now commonplace, and we are rapidly moving towards whole-genome sequence data for large samples of individuals. Our Center has undergone similar dramatic changes. While the underlying theme remains the same — making sense of genetic variation — our focus is now explicitly on how we can use the heterogeneous data produced by modern genomics technologies to achieve such an understanding. The overall goal of our proposal is to develop an intellectual framework, together with computational and statistical analysis tools, for illuminating the path from genotype to phenotype, and for predicting the latter from the former. We will address three broad questions related to this problem: 1) How do we infer mechanisms by which genetic variation leads to changes in phenotype? 2) How do we improve the design, understanding and interpretation of association studies by exploiting prior information? 3) How do we identify general principles about the genotype-phenotype map? We will approach these questions through a series of interrelated projects that combine computational and experimental methods, explored in Arabidopsis, Drosophila and human, and involve a wide range of researchers including molecular biologists, population geneticists, genetic epidemiologists, statisticians, computer scientists, and mathematicians.

Collaborating Institutions: University of Utah

Center Web Site: The USC Center of Excellence in Genomic Science

Genomic Analysis of Network Perturbations in Human Disease

P50 HG004233

Marc Vidal

Dana-Farber Cancer Institute, Boston

Genetic differences between individuals can greatly influence their susceptibility to disease. The information originating from the Human Genome Project (HGP), including the genome sequence and its annotation, together with projects such as the HapMap and the Human Cancer Genome Project (HCGP), has greatly accelerated our ability to find genetic variants and associate genes with a wide range of human diseases. Despite these advances, linking individual genes and their variations to disease remains a daunting challenge. Even where a causal variant has been identified, the biological insight that must precede a strategy for therapeutic intervention has generally been slow in coming. The primary reason for this is that the phenotypic effects of functional sequence variants are mediated by a dynamic network of gene products and metabolites, which exhibit emergent properties that cannot be understood one gene at a time. Our central hypothesis is that both human genetic variations and pathogens such as viruses influence local and global properties of networks to induce “disease states.” Therefore, we propose a general approach to understanding cellular networks based on environmental and genetic perturbations of network structure and readout of the effects using interactome mapping, proteomic analysis, and transcriptional profiling. We have chosen a defined model system with a variety of disease outcomes: viral infection. We will explore the concept that one must understand changes in complex cellular networks to fully understand the link between genotype, environment, and phenotype. 
We will integrate observations from network-level perturbations caused by particular viruses together with genome-wide human variation datasets for related human diseases with the goal of developing general principles for data integration and network prediction, instantiation of these in open-source software tools, and development of testable hypotheses that can be used to assess the value of our methods. Our plans to achieve these goals are summarized in the following specific aims: 1. Profile all viral-host protein-protein interactions for a group of viruses with related biological properties. 2. Profile the perturbations that viral proteins induce on the transcriptome of their host cells. 3. Combine the resulting interaction and perturbation data to derive cellular network-based models. 4. Use the developed models to interpret genome-wide genetic variations observed in human disease. 5. Integrate the bioinformatics resources developed by the various CCSG members within a Bioinformatics Core for data management and dissemination. 6. Building on existing education and outreach programs, develop a genomic- and network-centered educational program, with particular emphasis on providing access for underrepresented minorities to internships, workshops and scientific meetings.

Center Web Site: Center for Cancer Systems Biology (CCSB) Center of Excellence in Genomic Science
