Dr Hai Fang

Postdoctoral Research Scientist

Statistical functional genomics within ULTRA-DD

ULTRA-DD is a Europe-wide consortium of academic and industry partners that aims to validate new protein targets for drug discovery by generating chemical probes (and antibodies) in clinically based models of human inflammatory disease. As part of the Target Prioritisation Network (WP1), I have been developing genetics-led approaches for drug target discovery, creating an atlas of target predictions in more than 30 immune traits. The datasets used come in different forms, including 1) genomic summary data produced from genome-wide association studies (GWAS), expression quantitative trait locus (eQTL) mapping, and promoter capture Hi-C studies, 2) gene-level annotations using biomedical ontologies, and 3) knowledge of gene network connectivity.

I have a longstanding interest in analysing and using omics data, both in an integrative manner and in the light of evolution. I obtained my PhD in Genetics/Bioinformatics from the Shanghai Institutes of Biological Sciences at the Chinese Academy of Sciences, where I received bioinformatics training to make sense of transcriptome data underlying differentiation/apoptosis-induced synergistic therapy in leukemia. As part of my first postdoctoral work at the Shanghai Institute of Hematology, I integrated transcriptome and interactome data to identify a network controlling early human organogenesis. In 2010 I came to the Department of Computer Science at the University of Bristol to work on the SUPERFAMILY project, using structural phylogenomics to build a tree of life and create domain-centric ontologies.

I am the creator and maintainer of several software packages, including (1) Pi for drug target prediction through networking genetic evidence and machine learning; (2) XGR for enhanced interpretation of genomic summary data; (3) dcGOR for analysing ontologies and annotations; (4) dnet for integrative analysis of omics data in terms of network, evolution and ontology; (5) TPSC for topology-preserving selection and clustering; and (6) supraHex for tabular omics data analysis using a supra-hexagonal map.