Sanity (SAmpling-Noise-corrected Inference of Transcription activitY):
A rigorous Bayesian method for inferring gene
expression states of single cells from raw
scRNA-seq data. Sanity estimates expression values
and associated error bars directly from raw unique
molecular identifier (UMI) counts without any
tunable parameters.
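
To illustrate why estimates from raw UMI counts need error bars, here is a toy sketch. It is not Sanity's model or code. It assumes plain Poisson sampling of one gene in one cell with a flat prior on the Poisson rate, so that the rate has a Gamma(n + 1, 1) posterior for an observed count n. The function name log_fraction_estimate and its arguments are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def log_fraction_estimate(umi_count, total_count):
    """Posterior mean and s.d. of log(transcript fraction) for one gene.

    Toy assumption: Poisson sampling with a flat prior on the rate, which
    gives a Gamma(umi_count + 1, 1) posterior for the rate. The total count
    of the cell is treated as a fixed constant.
    """
    if umi_count < 0 or total_count <= 0:
        raise ValueError("counts must be non-negative, total must be positive")
    harmonic = sum(1.0 / k for k in range(1, umi_count + 1))
    squares = sum(1.0 / k ** 2 for k in range(1, umi_count + 1))
    # For integer n: digamma(n + 1) = -gamma + H_n
    digamma = -EULER_GAMMA + harmonic
    # For integer n: trigamma(n + 1) = pi^2 / 6 - sum_{k<=n} 1 / k^2
    trigamma = math.pi ** 2 / 6 - squares
    mean = digamma - math.log(total_count)
    return mean, math.sqrt(trigamma)


if __name__ == "__main__":
    for n in (0, 1, 10, 100):
        mean, sd = log_fraction_estimate(n, 10_000)
        print(f"count={n:>3}  log-fraction={mean:.3f} +/- {sd:.3f}")
```

Under these assumptions the error bar on the log scale shrinks as the count grows, which is why zero and low counts carry far less information than high counts.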
Clustering scRNA-seq data at the highest possible
resolution. Cellstates infers clusters of cells in
scRNA-seq data whose gene expression states are
statistically indistinguishable, taking raw UMI
count data as input.
The REference sequence ALignment based PHYlogeny
builder (REALPHY) is a free computational pipeline that
builds core genome alignments and infers
phylogenetic trees from whole-genome sequence
data. REALPHY takes a set of FASTQ files with raw
sequencing reads of the genomes of a set of
related strains together with one or more
reference genomes and automatically builds a core
genome and infers the phylogeny of the strains.
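
The notion of a core genome alignment can be sketched as follows. This is a simplified illustration, not REALPHY's implementation: it assumes the strains are already aligned to equal length and that "-" and "N" mark missing or uncalled positions, whereas REALPHY starts from raw reads mapped to reference genomes. The function names and the example strains are hypothetical.

```python
def core_columns(alignment, missing="-N"):
    """Return indices of alignment columns with a called base in every strain."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    return [
        i
        for i in range(n_columns)
        if all(seq[i] not in missing for seq in alignment.values())
    ]


def core_alignment(alignment, missing="-N"):
    """Restrict every sequence to the columns shared by all strains."""
    columns = core_columns(alignment, missing)
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


if __name__ == "__main__":
    toy = {
        "strainA": "ACG-TA",
        "strainB": "ACGTTA",
        "strainC": "A-GTNA",
    }
    print(core_columns(toy))
    print(core_alignment(toy))
```

Only columns present in all strains are kept, so a tree inferred from the result is based on positions that can be compared across every strain.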
Regulatory motif finding and prediction of transcription factor binding sites
Motevo
An integrated suite of Bayesian probabilistic methods
for the prediction of TFBSs and inference of
regulatory motifs from multiple alignments of
phylogenetically related DNA sequences.
An algorithm for de novo identification of
regulatory motifs from a collection of DNA
sequences, including multiple alignments of
orthologous sequences from related organisms.
The DWT-toolbox is a collection of software tools
for performing motif finding and transcription
factor binding site (TFBS) predictions with
Dinucleotide Weight Tensors (DWTs), which
generalize position-specific weight matrices (PWMs) by
allowing arbitrary dependencies between the
PWM columns.
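
The following toy sketch shows what the column-independence assumption of a weight matrix can miss. It does not use the DWT-toolbox and does not reproduce the DWT scoring formula. It only compares per-column frequencies with the joint frequency of one pair of positions, estimated from a small made-up set of sites. The function names and the example sites are hypothetical.

```python
from collections import Counter


def pwm_probability(sites, query):
    """Probability of query when every column is treated independently."""
    n_sites = len(sites)
    prob = 1.0
    for i, base in enumerate(query):
        column = Counter(site[i] for site in sites)
        prob *= column[base] / n_sites
    return prob


def pair_probability(sites, query, i, j):
    """Joint frequency of the bases of query at positions i and j."""
    pairs = Counter((site[i], site[j]) for site in sites)
    return pairs[(query[i], query[j])] / len(sites)


if __name__ == "__main__":
    # Made-up sites in which the two positions are perfectly anti-correlated.
    sites = ["AT", "TA", "AT", "TA"]
    for query in ("AT", "AA"):
        independent = pwm_probability(sites, query)
        joint = pair_probability(sites, query, 0, 1)
        print(query, independent, joint)
```

In this example the independent-column model cannot distinguish "AA" from "AT", while the pairwise frequencies show that "AA" never occurs among the sites. Dependencies of this kind are what a model with dependencies between columns can capture.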
PROCSE (PRObabilistic Clustering of SEquences)
uses Monte Carlo sampling to cluster thousands of
short DNA sequences under the assumption that the
sequences in each cluster derive from a common
(but unknown) position-specific weight matrix.
A computational method that uses hidden Markov models
and an expectation-maximization algorithm to detect
cis-regulatory modules in metazoan genomes, given the
weight matrices of a set of transcription factors
known to work together.
SPA is a computer program for aligning cDNA
sequences to a genome. It uses a probabilistic
Bayesian model to find the optimal alignment of
exons and introns, including splice sites.