LSU-Health Genomics Data HUB (GEDHUB)

Genomic medicine promises to revolutionize biomedical research, clinical care, precision prevention and the monitoring of global disease threats. By deciphering the human genome in the context of molecular networks, biological pathways and drug interactions, and by modeling gene-by-environment interactions, bioinformaticians, genomic scientists and clinicians can now identify individuals at risk of disease, provide early diagnoses based on biomarkers, develop novel therapeutics and recommend effective treatments.

However, the fields of genomics and genomic medicine have been caught in a flood of data as huge amounts of information are generated by next-generation sequencers and rapidly evolving analytical platforms such as high-performance computing clusters. These data must be quickly stored, analyzed, shared, and archived, but many genome, cancer and medical research institutions and pharmaceutical companies are now generating so much data that it can no longer be processed in a timely manner, properly stored or even transmitted over regular communication lines. In addition to scale and speed, it is also important for all the genomics information to be well annotated and linked through data models and taxonomies. To address these critical needs and to enhance the research mission of LSUHSC, the Bioinformatics and Genomics (BIG) Program has developed the GEDHUB. This resource is designed to enhance BIG’s research activities and to foster collaboration in biomedical research. The goal is to enable data-driven medicine for the 21st century. The GEDHUB is managed by staff of the BIG Program.


GEDHUB functions: The key functions of the project are:

1.  High-performance management of data input/output. This function addresses the need for scalable I/O and high-bandwidth capacity for moving large data files.

2.  Policy-driven data management. This function addresses the need for managing the lifecycle of data from creation to deletion or preservation.

3.  Efficient data sharing. This function addresses the need for data sharing within and across logical domains of storage infrastructure.

4.  Large-scale metadata management. This function involves managing metadata for raw data files, such as FASTQ files.
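
As an illustration of the kind of record that metadata management for raw files might keep, the following minimal Python sketch builds a catalogue entry for one FASTQ file. It is not the GEDHUB implementation: the FastqMetadata class, its fields and the describe_fastq function are hypothetical names chosen for this example, and the sketch assumes the common four-line FASTQ record layout (header, sequence, separator, qualities) with no line-wrapped sequences.

```python
import gzip
import hashlib
from dataclasses import dataclass


@dataclass
class FastqMetadata:
    """Hypothetical catalogue record for one raw FASTQ file."""
    name: str
    size_bytes: int   # size of the file as stored (compressed or not)
    sha256: str       # checksum of the stored bytes, for integrity checks
    read_count: int   # number of sequencing reads in the file


def describe_fastq(name: str, data: bytes) -> FastqMetadata:
    """Build a metadata record from the raw bytes of a FASTQ file."""
    # Handle gzip-compressed input, recognised by its magic bytes 0x1f 0x8b.
    text = gzip.decompress(data) if data[:2] == b"\x1f\x8b" else data
    lines = text.splitlines()

    # Assumption: every read occupies exactly four lines.
    if len(lines) % 4 != 0:
        raise ValueError(f"{name}: line count is not a multiple of four")
    for i in range(0, len(lines), 4):
        if not lines[i].startswith(b"@") or not lines[i + 2].startswith(b"+"):
            raise ValueError(f"{name}: malformed record at line {i + 1}")

    return FastqMetadata(
        name=name,
        size_bytes=len(data),
        sha256=hashlib.sha256(data).hexdigest(),
        read_count=len(lines) // 4,
    )


if __name__ == "__main__":
    sample = b"@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n"
    print(describe_fastq("sample.fastq", sample))
```

Storing a checksum and read count alongside each file lets a catalogue detect corrupted or truncated transfers without re-reading the sequence data itself. For real files of many gigabytes, the same checks would need to be streamed rather than run on an in-memory byte string.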


Catalogue of curated datasets currently in the GEDHUB:

Breast Cancer Project: This project focuses on population genomics and epigenomics in breast cancer and related diseases such as obesity. Its goal is to accelerate scientific discovery through collaborative population, basic and translational research, and to seek solutions to critical problems in improving human health and eliminating health disparities.

Prostate Cancer Project: This project is focused on comprehensive analysis and characterization of prostate cancer genomes in different ethnic populations for the discovery of genomic aberrations driving the disease phenotypes. We are particularly focused on identifying genomic changes that initiate cancer development and discovering rare genetic alterations that drive cancers in different populations. Our approach relies on harnessing the vast amounts of omics data using efficient bioinformatics tools.

Childhood Cancer Project: This project focuses on mapping the genomic landscape and discovering clinically actionable biomarkers for childhood cancers. We apply a comprehensive genomic approach to determine the molecular changes that drive childhood cancers. We work collaboratively with clinicians to facilitate the discovery of molecular targets and translate those findings into the clinic. We also work with NCI’s Office of Cancer Genomics and Cancer Therapy Evaluation Program and the Children’s Oncology Group, utilizing data from the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) project and the Pediatric Cancer Genome Project at Washington University and St. Jude Children’s Research Hospital.



Access to the data sets can be obtained by submitting a data request here.


User and Access Policy:

This information was developed to facilitate collaborative activities and data sharing in biomedical research. Users of these data must still comply with federal, state, local and institutional policies, including obtaining IRB approval or an IRB waiver. This document does not grant you any license to the data sets. The BIG staff reserve the right to reject your data request if you do not satisfy the conditions or are not in compliance with the regulations.