generative genomics


The recent development of Massively Parallel Reporter Assays (MPRAs) has produced an abundance of high-quality gene expression data for developing a deeper understanding of transcription factor (TF) binding. Here we show that convolutional neural networks are capable of learning the motifs that underlie TF binding and of using these motifs to predict expression at various amino acid concentrations [AA]. Using this result, we develop a generative adversarial network that builds segments of regulatory sequence designed to produce a specified level of gene expression at varying [AA].
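
To make the idea of a convolutional filter acting as a motif detector concrete, here is a minimal sketch in plain NumPy. It is not the model used in this project: the helper names `one_hot` and `scan`, the toy motif `GATA`, and the example sequence are all assumptions chosen for illustration. A trained network would learn its filter weights from MPRA data rather than have them set by hand, and would also need [AA] as an input, which this sketch omits.

```python
import numpy as np

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a (len(seq), 4) array; unknown bases stay all-zero."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = ALPHABET.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out


def scan(seq_onehot, filt):
    """Slide a (k, 4) filter along the sequence and return one score per window.

    This is the 'valid' cross-correlation that a 1D convolutional layer
    computes for a single filter, without bias or nonlinearity.
    """
    k = filt.shape[0]
    n_windows = seq_onehot.shape[0] - k + 1
    return np.array(
        [np.sum(seq_onehot[i:i + k] * filt) for i in range(n_windows)]
    )


# Hypothetical hand-set filter: the one-hot encoding of a toy motif.
motif_filter = one_hot("GATA")

# Hypothetical regulatory sequence containing the toy motif at offset 2.
sequence = one_hot("CCGATACC")

scores = scan(sequence, motif_filter)
best_offset = int(np.argmax(scores))  # window where the filter matches best
pooled = float(scores.max())          # max pooling over positions
```

Each score counts how many bases in a window agree with the filter, so the window that contains the motif exactly gets the highest score. Max pooling over positions then gives a single "motif present" feature per filter, which is the kind of signal a downstream layer could combine with [AA] to predict expression.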