Single-cell RNA-Seq analysis

Sanity (SAmpling-Noise-corrected Inference of Transcription activitY): A rigorous Bayesian method for inferring gene expression states of single cells from raw scRNA-seq data. Sanity estimates expression values and associated error bars directly from raw unique molecular identifier (UMI) counts without any tunable parameters.
Clustering scRNA-seq data at the highest possible resolution. Cellstates infers clusters of cells in scRNA-seq data whose gene expression states are statistically indistinguishable, taking raw UMI count data as input.

Phylogenetic analysis

The REference sequence ALignment based PHYlogeny builder (REALPHY) is a free computational pipeline that builds core genome alignments and infers phylogenetic trees from whole-genome sequence data. REALPHY takes a set of FASTQ files containing raw sequencing reads from a set of related strains, together with one or more reference genomes, and automatically builds a core genome and infers the phylogeny of the strains.
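To illustrate what a core genome alignment means in general, the sketch below keeps only the alignment columns in which every strain has a called base. It is a toy example, not REALPHY's actual procedure: the function name `core_columns`, the input format (a dict mapping strain names to equal-length aligned strings), and the treatment of `-` and `N` as missing data are all assumptions made for this example.

```python
def core_columns(alignment, missing="-N"):
    """Return the alignment restricted to columns present in every strain.

    `alignment` is a hypothetical input format: a dict mapping strain names
    to aligned sequences of equal length. Characters in `missing` are treated
    as absent data (assumed here to be gaps and ambiguous bases).
    """
    names = list(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()

    # A column belongs to the core if no strain has missing data there.
    keep = [
        i for i in range(length)
        if all(alignment[name][i] not in missing for name in names)
    ]
    return {name: "".join(alignment[name][i] for i in keep) for name in names}


toy = {"s1": "ACG-T", "s2": "A-GTT", "s3": "ACGTN"}
core = core_columns(toy)
```

A phylogeny would then be inferred from the retained columns only, so that every strain contributes data at every position.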

Regulatory motif finding and prediction of transcription factor binding sites

An integrated suite of Bayesian probabilistic methods for the prediction of TFBSs and inference of regulatory motifs from multiple alignments of phylogenetically related DNA sequences.
An algorithm for de novo identification of regulatory motifs from a collection of DNA sequences, including multiple alignments of orthologous sequences from related organisms.
The DWT-toolbox is a collection of software tools for performing motif finding and transcription factor binding site (TFBS) predictions with Dinucleotide Weight Tensors (DWTs), which generalize position-specific weight matrices (PWMs) by allowing arbitrary dependencies between the PWM columns.
PROCSE (PRObabilistic Clustering of SEquences) uses Monte Carlo sampling to cluster thousands of short DNA sequences under the assumption that the sequences in each cluster derive from a common (but unknown) position-specific weight matrix.
A computational method that uses hidden Markov models and an expectation-maximization algorithm to detect cis-regulatory modules in metazoan genomes, given the weight matrices of a set of transcription factors known to work together.
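Several of the tools above build on position-specific weight matrices. As a minimal sketch of that shared idea, the code below scores candidate sites by their log-odds under a weight matrix versus a uniform background and reports the best-scoring window. It is not the algorithm of any tool listed here: the matrix `PWM`, the background, and the function names are hypothetical, and the sketch ignores phylogeny, dependencies between columns, and the reverse strand.

```python
import math

# Hypothetical 3-column weight matrix: base probabilities at each position.
PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# Assumed uniform background model.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds(site, pwm=PWM, background=BACKGROUND):
    """Log2 ratio of P(site | weight matrix) to P(site | background)."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal the matrix width")
    # Columns are independent in a PWM, so the log-odds is a sum over positions.
    return sum(
        math.log2(column[base] / background[base])
        for column, base in zip(pwm, site)
    )


def best_site(sequence, pwm=PWM, background=BACKGROUND):
    """Return (start, score) of the highest-scoring window in `sequence`."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    scored = [
        (log_odds(sequence[i:i + width], pwm, background), i)
        for i in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, score


start, score = best_site("TTAGCTT")
```

Because each column contributes an independent term to the sum, a plain weight matrix cannot express the dependencies between positions that Dinucleotide Weight Tensors are designed to capture.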

Spliced alignment

SPA is a computer program for aligning cDNA sequences to a genome. It uses a probabilistic Bayesian model to find the optimal alignment of exons and introns, including splice sites.
