NEW (March 22, 2010): Sigma 2.0-BETA available, for the brave. Go here.

NEW (March 16, 2009): Sigma 1.1.3 released

This version contains an improved local-alignment algorithm (linear in space, though quadratic in time), due to Gayathri Jayaraman. The "pre-fragmentation" option ("-l") in versions 1.1 through 1.1.2 has been removed. There is also a bugfix (an off-by-one error). See the changelog in the source file for details.
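As a rough illustration of what "linear in space, though quadratic in time" means for local alignment, here is a minimal Python sketch of the textbook Smith-Waterman score recurrence that keeps only two rows of the dynamic-programming table. This is not Sigma's algorithm or scoring scheme: the function name and the match, mismatch and gap values are assumptions chosen purely for illustration, and the sketch returns only the best score, not the alignment itself.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of strings a and b.

    Textbook Smith-Waterman recurrence with a linear gap cost.
    The scoring values are illustrative assumptions, not Sigma's.
    Only the previous and current rows are stored, so memory is
    O(len(b)) while time stays O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i of the DP table
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + pair,  # extend along the diagonal
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr  # discard the older row
    return best
```

Recovering the aligned positions as well, while staying in linear space, needs extra bookkeeping (for example, a divide-and-conquer pass in the style of Hirschberg's method), which this sketch leaves out.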

Sigma 1.0 was written in OCaml (why OCaml?). Sigma 1.1 (April 3, 2007) was rewritten in C (why not OCaml?), is more correct (some bugs and shortcomings have been fixed), and can handle much larger datasets. It is also significantly faster: generally many times faster than version 1.0 and than other programs such as Dialign or ClustalW. For details, see the README file in the source distribution. Sigma 1.1.1 and 1.1.2 were minor updates, while 1.1.3 includes significant improvements and a bugfix.

Sigma

Most tools for multiple-sequence alignment are focussed on aligning protein sequence or protein-coding DNA sequence. Sigma ("Simple greedy multiple alignment") is an alignment program with a new algorithm and scoring scheme designed specifically for non-coding DNA sequence. Aligning non-coding sequence is growing in importance as the number of fully sequenced species increases. In particular, studies of gene regulation seek to take advantage of comparative genomics, and recent algorithms (such as PhyloGibbs) for finding regulatory sites in phylogenetically related intergenic sequence require alignment as a preprocessing step.

Tests on synthetic data generated to mimic real data show excellent performance, with Sigma showing much greater "sensitivity" (more bases aligned) and fewer "incorrect" alignments. Results on real data are harder to quantify, but PhyloGibbs performs well on Sigma-generated alignments.

References:
Rahul Siddharthan, "Sigma: multiple alignment of weakly-conserved non-coding DNA sequences", BMC Bioinformatics 7:143 (2006)

The code (Version 1.1.3: March 16, 2009): (Upgrade from previous versions recommended!)

You may need to make the binary executable: type "chmod +x sigma". Binaries for other platforms may be made available later, or you are welcome to contribute them.

You may redistribute these binaries and source under the terms of the GNU General Public License, version 2. (Short inexact summary: you may use them privately as you like, modify them, and distribute them; if you distribute modified binaries, you must distribute the corresponding source on request, also under the GNU GPL.)

If you're using Sigma for actual research, please let me know so that I can alert you to bugfixes or new releases.

Help with the program

The program has only a few command-line options, and running it with the "-help" option (or with no option, or with an invalid option) produces a help summary. There is also a manpage. Output goes to standard output, which may be redirected to a file or piped through some other command. Here are some key points:

Limitations in version 1.0 (addressed in 1.1, see README in source distribution):

If you are interested in version 1.0 anyway, the archived source code is here.

For any further information, contact me.

Rahul Siddharthan
The Institute of Mathematical Sciences, Chennai