
Microarray Gene Expression Analysis Tools (for academic use only)

Many free and commercial software packages exist for analyzing microarray data. The FGC also provides PathwayArchitect (Stratagene), a commercial package for pathway analysis of your expression data, on a remotely accessible workstation. See our general information page for instructions on how to remote into this workstation.
Statistical issues. Because microarray experiments involve simultaneous comparisons for tens of thousands of variables, the statistical issues are very complicated. There is no consensus about the best way to approach this problem. Various programs can compute the statistics for you, but you might want to consult a biostatistician about which statistical tests to use.

There is so much software available that we cannot keep up with all of it. If you know of other good programs, please send us your comments and we will continue to update this page.

Commercial

In general, an advantage of commercial programs is that you get customer support when you pay for a program. Unfortunately, the FGC does not have the resources to purchase all of the good software that is available. Commercial packages that the FGC does not support include ArrayAssist (Stratagene; a free "lite" version is currently available through the Affymetrix web site), GeneSpring GX (Agilent), and GeneSight (BioDiscovery). Ingenuity is an alternative commercial web-based program for pathway analysis.

Free

GeneChip Operating Software (GCOS). For Affymetrix GeneChips, GCOS will generate spreadsheets with expression scores, display array images, and do a few other basic things. However, the gene expression scores generated by GCOS probably are not as accurate as those generated by other algorithms. We generally recommend GCRMA as the best current algorithm for deriving gene expression scores from Affymetrix arrays. GCRMA, along with RMA and MBEI, is available through GeneSifter. Thus, you probably will want to use GeneSifter or one of the programs listed below to analyze your microarray data. For those wishing to access GCOS, it has been installed on our public workstation. See our general information page for instructions on how to remote into this workstation.

Bioconductor. This is an open-source, open-development project for the analysis and comprehension of genomic data.
It is comprehensive, but plan to spend a lot of time learning how to use it. It is a command-line package based on the R programming language. http://www.bioconductor.org/

ArrayAssist Lite. This is a Windows desktop application for computing expression scores from Affymetrix CEL files, with basic analysis and visualization of the data. Obtain it through the Affymetrix web site (free registration required). http://www.affymetrix.com/products/software/specific/arrayassist_lite.affx

dCHIP. Also known as Model Based Expression Index (MBEI). It is available as a free desktop application; download at http://biosun1.harvard.edu/complab/dchip/

Harshlight. Finds blemishes on microarrays and masks them. Originally implemented on Linux; it was being tested on Windows XP at the time of publication. Description at http://www.biomedcentral.com/1471-2105/6/294

TMEV (TIGR MultiExperiment Viewer). Desktop Java application that allows you to analyze your microarray data in several different ways, including various clustering methods, t-tests, ANOVA, and EASE (see below for more on EASE). http://www.tigr.org/software/microarray.shtml

RMAExpress. This is a standalone program for Windows (and Linux) that computes gene expression summary values for Affymetrix GeneChip data using the Robust Multichip Average (RMA) expression summary. We generally recommend GCRMA over RMA because RMA does not adequately account for non-specific hybridization and therefore tends to underestimate the magnitude of differences between arrays, especially for genes expressed at low levels. However, if you just want to screen for statistical significance and do not need to estimate fold-change accurately, RMA is fine. http://rmaexpress.bmbolstad.com/

Microsoft Excel (not really free, but already there on most Windows computers). If you are good with Excel, you can do quite a bit with it.
For example, if all you want to do is compute average fold-changes and nominal P values from t-tests for GCOS-generated expression scores, you can do this easily by pasting the appropriate formulas into columns. It is easy to sort and filter rows and columns of data. If you get only a few hits, you can manually look up the current annotations for the probes. However, more complex analyses and plots cannot be done with Excel. SAM (see below) runs as a plug-in within Excel.

SAM (Significance Analysis of Microarrays). This program does not compute expression scores from the microarrays, but once you have derived these scores and put them into an Excel spreadsheet, SAM will help you find which genes are differentially expressed. The program was developed at Stanford University and is available at: http://www-stat.stanford.edu/~tibs/SAM/

PAM (Prediction Analysis of Microarrays). This program finds the genes from your expression data that best classify samples into different groups. Available at: http://www-stat.stanford.edu/~tibs/PAM/

DAVID (Database for Annotation, Visualization and Integrated Discovery). Web tools to visually summarize annotation from large gene lists. Groups genes based on functional similarity and pathway maps. http://david.abcc.ncifcrf.gov/

EASE. Returns biological themes from a gene list. Described by Hosack DA et al., Genome Biology 2003, 4:R70. Available at the DAVID website. You can download the software or submit input online. http://david.abcc.ncifcrf.gov/ease/update/EASE_Files.html

PANTHER. The PANTHER (Protein Analysis Through Evolutionary Relationships) Classification System classifies genes by their functions, using published scientific experimental evidence and evolutionary relationships to predict function. Proteins are classified into families and subfamilies of shared function, which are then categorized by molecular function and biological process ontology terms.
For many proteins, biochemical interactions in canonical pathways are captured and can be viewed interactively. http://www.pantherdb.org/

oPOSSUM. Web tool to identify over-represented transcription factor binding sites among co-expressed genes. Described by Ho Sui SJ et al., Nucleic Acids Research 2005, 33:3154-64. http://www.cisreg.ca/cgi-bin/oPOSSUM/opossum

CREME. CREME is a web server for identifying and visualizing cis-regulatory modules in the promoter regions of a given set of potentially co-regulated human genes. CREME relies on a database of putative transcription factor binding sites that have been annotated across the human genome using evolutionary conservation with the mouse and rat genomes. A search algorithm is applied to this data set to identify combinations of transcription factors whose binding sites tend to co-occur in close proximity within the promoter regions of the input gene set. These combinations are statistically evaluated, and significant combinations are reported and visualized. http://creme.dcode.org/

CisMols Analyzer. This program identifies compositionally predicted cis-clusters that occur in groups of co-regulated genes within each of their ortholog-pair evolutionarily conserved cis-regulatory regions. CisMols Analyzer is based on the hypothesis that the presence of a cluster of ortholog-conserved known cis-acting elements in all or many of the co-expressed or functionally related genes predicts that the genes are co-regulated by these elements. Designed to search for regulatory clusters not just in the upstream region of co-expressed genes but also in the non-coding intronic and 5' and 3' flanking genomic regions, CisMols Analyzer could lead to the discovery of probes for genome-wide identification of regulatory regions. http://info.cchmc.org/help/cismols/index.html

L2L (List-to-List). Compares your hit-list of genes to more than 350 other lists from published microarray papers.
Output highlights common patterns of gene expression. http://depts.washington.edu/l2l

Y.F. Leung web page. Dr. Leung has compiled extensive lists of software for various aspects of microarray data analysis. http://ihome.cuhk.edu.hk/~b400559/arraysoft.html
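The fold-change and t-test calculation mentioned under Microsoft Excel above can also be scripted. The following is a minimal Python sketch, not a method used by the FGC: it assumes linear-scale expression scores for one probe set in two groups, uses an equal-variance (pooled) two-sample t-test, and obtains the two-sided P value by numerically integrating the Student's t density. The function names (`t_pdf`, `two_sided_p`, `compare_groups`) and the example scores are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided P value: 1 minus the area under the density on [-|t|, |t|].

    The area is computed with Simpson's rule (steps must be even).
    """
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # area on [0, |t|]
    return max(0.0, 1.0 - 2.0 * half_area)


def compare_groups(control, treated):
    """Return (fold_change, t_statistic, nominal_p) for one probe set.

    Assumes at least two scores per group and non-identical values,
    so the pooled standard error is greater than zero.
    """
    n1, n2 = len(control), len(treated)
    m1, m2 = mean(control), mean(treated)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(control)
                  + (n2 - 1) * variance(treated)) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (m2 - m1) / std_err
    return m2 / m1, t, two_sided_p(t, df)


if __name__ == "__main__":
    # Made-up expression scores for one probe set, three arrays per group.
    control = [1.0, 2.0, 3.0]
    treated = [2.0, 3.0, 4.0]
    fold, t_stat, p = compare_groups(control, treated)
    print(f"fold-change={fold:.2f}  t={t_stat:.3f}  nominal P={p:.4f}")
```

The P values from such a calculation are nominal: they are not corrected for the tens of thousands of simultaneous comparisons discussed under Statistical issues, so they are best treated as a screening aid.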