SEGID: Conserved Segment Identification

SEGID is a sequence analysis tool that identifies conserved segments in a (multiple) sequence alignment. Conserved segments are high-scoring substrings of a long alignment that are likely to be biologically meaningful. SEGID accepts an alignment, converts it into a sequence of numbers (one per column), identifies the conserved segments of that sequence, and generates graphical output. It can also accept a sequence of numbers directly as input.
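To make the "one number per column" step concrete, here is a minimal Python sketch. The scoring rule is an assumption chosen for illustration (the fraction of rows that carry the column's most common residue, with gaps written as "-"); this page does not state which scoring scheme SEGID itself uses, and the function name column_scores is hypothetical.

```python
from collections import Counter


def column_scores(alignment):
    """Turn an alignment (list of equal-length strings) into one score per column.

    Assumed scoring rule (illustrative only, not necessarily SEGID's):
    score = (count of the most common non-gap residue) / (number of rows).
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):  # iterate over columns, not rows
        residues = [c for c in column if c != "-"]  # ignore gap characters
        if not residues:
            scores.append(0.0)  # a column of gaps carries no conservation signal
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "AC-T", "ATGT"]
    print(column_scores(toy_alignment))  # [1.0, 0.75, 0.75, 0.75]
```

The resulting list of numbers is the kind of input that a segment-finding step can then scan for high-scoring runs.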

How to use SEGID: An example

SEGID provides three algorithms to identify 'interesting' segments:
  1. Longest segment with a lower bound on the average value;
  2. All maximal-length segments with a lower bound on the average value and a lower bound on the length;
  3. N maximum-score segments with an upper bound on the length.

(Click here to see formal definitions and algorithms.)
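As an illustration of the first problem in the list above, the following Python sketch finds the longest segment whose average is at least a given bound, in linear time. It is one standard approach (prefix sums plus a monotonic stack) and is not taken from SEGID's own code; the function name and the half-open (start, end) return convention are assumptions made for this example.

```python
def longest_segment_with_min_average(values, min_avg):
    """Return (start, end), half-open, of the longest segment of `values`
    whose mean is >= min_avg, or None if no such segment exists.

    Idea: subtract min_avg from every value; a segment then has mean
    >= min_avg exactly when its shifted sum is >= 0, i.e. when
    prefix[end] >= prefix[start].
    """
    # prefix[k] = sum of (values[i] - min_avg) for i < k
    prefix = [0.0]
    for v in values:
        prefix.append(prefix[-1] + (v - min_avg))

    # Candidate start positions: indices where the prefix sum reaches a
    # new strict minimum. Any other start is dominated by an earlier one.
    stack = []
    for i, p in enumerate(prefix):
        if not stack or p < prefix[stack[-1]]:
            stack.append(i)

    # Scan end positions from the right; each candidate start is popped
    # the first time a valid end is found, which is its farthest end.
    best = None
    for j in range(len(prefix) - 1, -1, -1):
        while stack and prefix[j] >= prefix[stack[-1]]:
            i = stack.pop()
            if j > i and (best is None or j - i > best[1] - best[0]):
                best = (i, j)
    return best


if __name__ == "__main__":
    scores = [1, 5, 5, 5, 1, 5]
    print(longest_segment_with_min_average(scores, 4))  # (1, 6)
```

Each index is pushed and popped at most once, so the scan is linear in the length of the input. Floating-point rounding in the prefix sums can affect borderline segments, so a production implementation might prefer exact or scaled integer arithmetic.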

SEGID has the following advantages:
  # Focus on segments rather than individual columns;
  # Multiple adjustable parameters;
  # Fast calculation. All three algorithms run in linear time;
  # Compatible with multiple platforms;
  # User-friendly interface.