J4 ›› 2014, Vol. 41 ›› Issue (6): 95-99. doi: 10.3969/j.issn.1001-2400.2014.06.016

• Original Articles •

Novel cluster refinement algorithm for DNA motif discovery

ZHANG Yipu   

  1.  (School of Computer Science and Technology, Xidian Univ., Xi'an  710071, China)
  • Received: 2013-12-17 Online: 2014-12-20 Published: 2015-01-19
  • Contact: ZHANG Yipu E-mail:zephyr26026@gmail.com

Abstract:

Motif discovery is an important part of analyzing gene transcriptional regulatory relationships. This paper describes a novel entropy-based cluster refinement algorithm (ECRmotif) for motif discovery in DNA sequences. ECRmotif employs a flexible probabilistic model to distinguish motifs from the background sequences. It first uses an entropy-based clustering process to divide the dataset into several subsets, then reduces the instance search space for each candidate subset and refines the motif from the candidate subsets. Experiments on both synthetic and real datasets demonstrate that the algorithm improves running speed and efficiency and finds motifs accurately.
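As a rough illustration of the kind of entropy measure an entropy-based clustering step could rely on, the sketch below computes the per-column Shannon entropy of a set of aligned, equal-length DNA instances. It is not ECRmotif's actual procedure, which the abstract does not specify; the function names `column_entropies` and `mean_entropy` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def column_entropies(instances):
    """Return the Shannon entropy (in bits) of each column of aligned k-mers.

    Assumption: `instances` is a non-empty list of equal-length DNA strings.
    """
    if not instances:
        raise ValueError("need at least one instance")
    width = len(instances[0])
    if any(len(s) != width for s in instances):
        raise ValueError("all instances must have the same length")

    entropies = []
    for col in range(width):
        counts = Counter(s[col] for s in instances)
        total = sum(counts.values())
        # H = sum over observed bases of p * log2(1 / p)
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


def mean_entropy(instances):
    """Average column entropy; lower values suggest a more conserved subset."""
    per_column = column_entropies(instances)
    return sum(per_column) / len(per_column)
```

Under this measure, a subset of identical instances scores 0 bits per column, while a column with all four bases equally frequent scores 2 bits, so a candidate subset with low mean entropy would be a plausible starting point for refinement.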

Key words: motif discovery, cluster, refinement