DattaLab Software Page


All software/computer codes listed here are for non-commercial use only.

Furthermore, they are distributed “as is” without any implied or express warranty of any kind. In other words, use them at your own risk.




Bioinformatics Software


·       R Code for computing association scores with NGS-data        




Pesonen, M., Nevalainen, J., Potter, S. S., Datta, S. and Datta, S. A combined PLS and negative binomial regression model for inferring association networks from next-generation sequencing count data. IEEE/ACM Transactions on Computational Biology and Bioinformatics, to appear. doi: 10.1109/TCBB.2017.2665495, PMID: 28186904


·       R Code for Differential network Analysis with Palate Development NGS data (preliminary)         




“Differential Network Analysis for Palate Development.” Poster presentation by Tyler Grimes, FaceBase Meeting, Boston, May 2017.


·       R Code for Master Regulator Identification         




Sikdar, S. and Datta, S. A novel statistical approach for identification of the master regulator transcription factor, preprint (2016).



·       R Package for SVAPLS




Chakraborty, S., Datta, S. and Datta, S. svapls: An R package to correct for residual expression heterogeneity in gene expression data. BMC Bioinformatics, 14, 236 (2013).


Chakraborty, S., Datta, S. and Datta, S. Surrogate variable analysis using partial least squares (SVA-PLS) in gene expression studies. Bioinformatics, 28, 799-806 (2012).



·       R Package for Differential Network Analysis




Gill, R., Datta, S. and Datta, S. A statistical framework for differential network analysis from microarray data using partial least squares, BMC Bioinformatics, 11, 95 (2010).


·       R Code for Ensemble Classifier         




Datta, S., Pihur, V. and Datta, S. An adaptive optimal ensemble classifier via bagging and rank aggregation with applications to high dimensional data, BMC Bioinformatics, 11:427 (2010).






Pihur, V., Datta, S. and Datta, S. RankAggreg, an R package for weighted rank aggregation. BMC Bioinformatics, 10, 62 (2009).


Pihur, V., Datta, S. and Datta, S. Weighted rank aggregation of cluster validation measures: A Monte Carlo cross-entropy approach.  Bioinformatics, 23, 1607-1615 (2007).






Datta, S. and Datta, S. Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes, BMC Bioinformatics, 7:397 (2006).


Datta, S. and Datta, S. Comparisons and validation of statistical clustering techniques for microarray gene expression data. Bioinformatics, 19, 459-466 (2003).





Datta, S. and Datta, S. Empirical Bayes screening (EBS) of many p-values with applications to microarray studies, Bioinformatics, 21, 1987-1994 (2005).




Proteomics Software






Satten, G. A., Datta, S., Moura, H., Woolfitt, A., Carvalho, G., De, B. K., Pavlopoulos, A., Carlone, G. M., and Barr, J. Standardization and denoising algorithms for mass spectra to classify whole-organism bacterial specimens, Bioinformatics, 20, 3128-3136 (2004).


·       Peak Detection and Classification using MALDI-TOF Mass Spectra




Ndukum, J., Atlas, M. and Datta, S. pkDACLASS: open source software for analyzing MALDI-TOF data, Bioinformation, 6, 45-47 (2011). PMC3064853




Statistics Software






Datta, S. and Satten, G. A. Rank-sum tests for clustered data, Journal of the American Statistical Association, 100, 908-915 (2005).





Datta, S. and Satten, G. A. A signed-rank test for clustered data, Biometrics, 64, 501-507 (2008).





Lorenz, D. J., Datta, S. and Harkema, S. J. Marginal association measures for clustered data, Statistics in  Medicine, 30, 3181-3191 (2011).



·       R Package for Multistate Models




Ferguson, A. N., Datta, S., Brock, G. msSurv, an R package for nonparametric estimation of multistate models. Journal of Statistical Software (2012).