Compression within a Genetic Sequence using Unique Pattern Indexing, Mining Frequent Pattern and Mapping Frequency.

Siddiki, Md Noor E Alam (CSE 05206578); Mahmud Hasan, Md (CSE 05206579)

URI: http://182.160.110.28:8080/xmlui/handle/123456789/122

Date: 2017-07-02

Abstract:

Searching for frequent patterns within a specific genetic sequence has become a much-needed task in bioinformatics. Most recent work is based on techniques such as the Apriori algorithm, GSP, and MacroVspan. However, frequent pattern mining can be made more efficient. In this paper we propose three algorithms based on Unique Pattern Indexing and Searching Frequent Pattern. The first replaces each DNA subsequence with a unique index. The second compresses the sequence using Huffman coding. The third replaces the unique index with a compressed hexadecimal number. Because of its highly frugal nature, the proposed approach reduces typical memory usage by at least 42%.
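The abstract does not give the algorithms in detail, so the following is only a rough sketch of the first two ideas (unique pattern indexing followed by Huffman coding), not the authors' implementation. It assumes the sequence is cut into non-overlapping chunks of a fixed length k; the function names, the chunking rule, and the value k=3 are all hypothetical choices made for illustration. The hexadecimal mapping of the third algorithm is not attempted here.

```python
import heapq
from collections import Counter


def index_patterns(sequence, k=3):
    """Cut the sequence into non-overlapping chunks of length k (assumed rule)
    and give each distinct chunk an index in order of first appearance."""
    chunks = [sequence[i:i + k] for i in range(0, len(sequence), k)]
    index = {}
    for chunk in chunks:
        index.setdefault(chunk, len(index))
    return [index[c] for c in chunks], index


def huffman_codes(symbols):
    """Build a prefix code in which more frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: partial code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


def compress(sequence, k=3):
    """Index the chunks, then Huffman-encode the stream of indices."""
    indexed, index = index_patterns(sequence, k)
    codes = huffman_codes(indexed)
    bits = "".join(codes[i] for i in indexed)
    return bits, codes, index


def decompress(bits, codes, index):
    """Invert compress() using the code table and the pattern index."""
    code_to_symbol = {c: s for s, c in codes.items()}
    index_to_pattern = {i: p for p, i in index.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in code_to_symbol:
            out.append(index_to_pattern[code_to_symbol[current]])
            current = ""
    return "".join(out)
```

For example, `compress("ATGATGATGCCGATGTTA")` indexes the chunks ATG, CCG, and TTA, gives the frequent chunk ATG the shortest code, and `decompress` recovers the original string from the returned bit string and tables. The size of the code table and pattern index is not counted in this sketch, so it says nothing about the 42% figure reported above.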
