Identification of transcriptional promoter motifs in Drosophila melanogaster
Kuruvilla, Denison John
A critical step in understanding the mechanisms that regulate gene expression is the ability to identify and study regulatory elements. These elements, which serve as binding sites for transcription factors, are often difficult to identify because knowledge of transcription factors and their mechanisms of control is limited. Computational motif discovery approaches offer a solution to this problem by searching for short sequences within regulatory regions without making many prior assumptions about the transcription factor or its binding mechanism. Many of these tools have been used to identify potential transcription factor binding sites in promoter sequences. In the fly genome, only ~35% of the promoters contain any of the known promoter motifs. This suggests that many core promoter motifs important to the regulation of gene expression remain unknown. One reason for the difficulty in identifying new promoter motifs could be that most motif discovery methods have treated promoters as a single set. By studying promoters as subsets, we may be able to achieve a better signal-to-noise ratio for computational motif discovery. We grouped promoters into three separate subsets based on their location with respect to the transcription start site: the unique promoters (UPs), the first alternative promoters (FAPs) and the downstream alternative promoters (DAPs). Multiple motif discovery tools were used to identify potential promoter motifs in these sets. These motifs were then clustered based on similarity, and the most similar motifs were merged. A total of 104 potential promoter motifs were identified in this study. Of these, 59 motifs (56.7%) were found in the DAPs, 24 motifs (23.1%) in the FAPs and 21 motifs (20.2%) in the UPs. This indicated that by including the DAPs and the FAPs in this study, we were able to identify many new promoter motifs in these subsets.
The characteristics of each motif (position bias, strand bias, promoter bias, information content and promoter class bias) were studied individually, and the motifs were ranked based on the presence of these characteristics.
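Of the characteristics listed above, information content is the most directly computable. The following is a minimal sketch of one common way to calculate it for a motif represented as a position probability matrix. It assumes a uniform background of 0.25 per nucleotide, and the function names and the example matrix are hypothetical illustrations; the abstract does not state which formula or background model the study used.

```python
import math

def column_information(freqs, background=0.25):
    """Information content (bits) of one motif column.

    Assumes a uniform background probability for A, C, G and T;
    this is an illustrative choice, not taken from the study.
    """
    return sum(p * math.log2(p / background) for p in freqs if p > 0)

def motif_information(matrix):
    """Total information content of a motif, summed over its columns."""
    return sum(column_information(column) for column in matrix)

# Hypothetical 3-position motif; each row holds probabilities for A, C, G, T.
example_matrix = [
    [1.0, 0.0, 0.0, 0.0],      # fully conserved position
    [0.5, 0.5, 0.0, 0.0],      # two equally likely bases
    [0.25, 0.25, 0.25, 0.25],  # uninformative position
]

total_bits = motif_information(example_matrix)
```

Under this definition a fully conserved position contributes 2 bits and a position matching the background contributes 0 bits, so higher totals indicate more specific motifs.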