MAPPFinder

Overview

MAPPFinder is an accessory program that works with GenMAPP and the annotations from the Gene Ontology (GO) Consortium to identify global biological trends in gene expression data. MAPPFinder relates microarray data meeting a user-defined criterion for a "significant" change to each term in the Gene Ontology hierarchy, calculating the percentage of genes changed for each GO biological process, cellular component, and molecular function term. MAPPFinder then calculates a cumulative total of genes changed for a parent GO term and all of its children and a statistical score, giving a complete picture of the gene expression changes associated with a particular GO term. Furthermore, this MAPPFinder analysis can be performed on a set of MAPPs local to a user's computer, calculating the percentage of genes meeting the criterion for each MAPP.
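As a rough illustration of the cumulative counting described above, the sketch below totals the genes for a parent term and all of its children and reports the percentage changed. The toy hierarchy, gene IDs, and function name are hypothetical, and counting a gene shared by several child terms only once is an assumption of this sketch rather than a documented detail of MAPPFinder.

```python
def nested_counts(term, children, genes_in_term, changed):
    """Return (changed, measured) gene counts for a term plus all descendants."""
    seen_terms = set()
    genes = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen_terms:
            continue  # a term reachable by two routes is visited only once
        seen_terms.add(current)
        genes |= genes_in_term.get(current, set())
        stack.extend(children.get(current, []))
    # Assumption: a gene annotated to several terms is counted once.
    return len(genes & changed), len(genes)


# Hypothetical toy hierarchy: one parent term with two child terms.
children = {"parent": ["child_a", "child_b"]}
genes_in_term = {
    "parent": {"g1"},
    "child_a": {"g2", "g3"},
    "child_b": {"g3", "g4"},
}
changed = {"g2", "g3"}  # genes meeting the user-defined criterion

n_changed, n_measured = nested_counts("parent", children, genes_in_term, changed)
percent_changed = 100.0 * n_changed / n_measured
```

Here the parent term and its children cover four distinct genes, two of which meet the criterion.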

The results of MAPPFinder analysis are both exported as a text file and displayed in a searchable MAPPFinder Browser, allowing the user to quickly identify those areas of biology that show correlated gene expression changes. Clicking on a GO term in the MAPPFinder Browser window opens a MAPP in GenMAPP that lists all the genes associated with that GO term so that the user can view the individual genes. MAPPFinder creates a gene expression profile across all areas of biology represented in the GO, allowing the user to view at a glance where the largest correlated gene expression changes are occurring in the data.

If you use MAPPFinder for analysis in published work, please cite: Doniger, S.W., Salomonis, N., Dahlquist, K.D., Vranizan, K., Lawlor, S.C., and Conklin, B.R. 2003. MAPPFinder: using Gene Ontology and GenMAPP to create a global gene-expression profile from microarray data. Genome Biology 4:R7.

To launch the MAPPFinder application, double-click on the MAPPFinder shortcut installed on your Desktop, choose GenMAPP 2 > MAPPFinder 2 from the Start > Programs menu, or from within the GenMAPP Drafting Board, choose Tools > MAPPFinder.

Main Menu

When you open the MAPPFinder program, the MAPPFinder Main Menu window will appear. This window allows you to choose which MAPPFinder function you wish to perform: Load Local MAPPs, Calculate New Results, or Load Existing Results.

Clicking on the close box in this window closes the program. MAPPFinder will automatically load your most recently used Gene Database. If you would like to use a different database (e.g., you want to calculate results for a different species, or you have updated your Gene Database), you can change it from the Main Menu: go to File > Choose Gene Database and select the appropriate Gene Database.

Loading Local MAPPs into MAPPFinder

In addition to performing MAPPFinder analysis on genes associated with GO terms, you have the option of performing the analysis on MAPPs created with GenMAPP that are local to your computer. For example, you may want to use the entire "Contributed" MAPP Archive from the Downloader, or you may have created your own MAPPs that you would like to use for analysis.

Note: Do NOT include the "GO_Samples" MAPP Archive when loading local MAPPs into MAPPFinder. Since the MAPPFinder application automatically calculates results based on the full GO hierarchy if the "Gene Ontology" analysis option is chosen, including the GO MAPP Archives is unnecessary and will result in delays and potential problems when running MAPPFinder.

To run a MAPPFinder analysis on local MAPPs, you must first load them into the program. To load local MAPPs, click on the Load Local MAPPs button in the MAPPFinder Main Menu window, or choose File > Load Local MAPPs from the Calculate New Results or MAPPFinder Browser windows.

The Choose Local MAPPs Folder window will open. The species of the current Gene Database will be displayed. If this is not the species for which you have data, you will need to choose a new Gene Database by going to File > Choose Gene Database. Next, you must select the folder containing the local MAPPs you would like included in the MAPPFinder analysis. The analysis will be run on MAPPs in this folder and in all of its subfolders. Click on the Choose Folder button to select the folder. Another window will open where you must select the drive and folder for your local MAPPs. The folder that is shown as open is the folder you are selecting. You must double-click on the folder of your choice; you cannot just select the drive. Click OK when you have made your selection.

Once you have chosen a local MAPPs folder, click on the Load MAPPs button. Loading may take a few minutes depending on the number of MAPPs that you are loading. When MAPPFinder has finished loading the MAPPs, you will be returned to the Main Menu window.

Note: The upper limit for the number of local MAPPs you can load into MAPPFinder is 5000. The more local MAPPs you load, the longer it will take to load them and the longer it will take to run the MAPPFinder analysis.
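Before loading a large folder, it may be useful to check how many MAPPs it contains. The following is a minimal sketch, not part of MAPPFinder, that counts .mapp files in a folder and all of its subfolders. The function names are hypothetical, and treating the 5000 limit as inclusive is an assumption.

```python
from pathlib import Path

MAX_LOCAL_MAPPS = 5000  # upper limit stated in the note above


def count_local_mapps(folder):
    """Count .mapp files in a folder and all of its subfolders."""
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() == ".mapp"
    )


def within_limit(folder):
    """Return True if the folder holds no more MAPPs than the stated limit.

    Assumption: the limit is inclusive (exactly 5000 MAPPs is allowed).
    """
    return count_local_mapps(folder) <= MAX_LOCAL_MAPPS
```

For example, `within_limit(r"C:\GenMAPP 2 Data\MAPPs\Mm")` would check the example MAPP folder mentioned later in this topic.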

MAPPFinder allows for one set of local MAPPs per species. If you change any of the MAPPs, or add MAPPs to the folder you loaded into MAPPFinder, you will need to reload the local MAPPs.

Preparing Data for MAPPFinder

MAPPFinder accepts gene expression data in the GenMAPP Expression Dataset (.gex) file format. For help with creating an Expression Dataset file, see the Expression Dataset Manager. An interactive tutorial describing the data format is available here.

Once you have imported your gene expression data into an Expression Dataset, create the Color Sets and Criteria you would like to use to filter the data. The criteria will determine the "changed" genes when calculating the MAPPFinder percentages, Z scores, and p values. MAPPFinder requires that at least one Color Set be included in the Expression Dataset; without it, MAPPFinder will not be able to perform any calculations.

Calculating New Results

Once you have created your Expression Dataset with GenMAPP and loaded local MAPPs into MAPPFinder, you are ready to calculate new results. To calculate new results, click on the "Calculate New Results" button in the MAPPFinder Main Menu window, or choose File > Calculate New Results from the MAPPFinder Browser window. You will need to go through a series of steps to set up the analysis:

  1. At the top of the window, select the Color Set you would like to use for analysis by clicking on its name.
  2. Select the Criterion you would like to use by clicking on its name. To select multiple criteria, left-click on each one you want to use. Each criterion will be calculated independently; selecting several is not an "and" or "or" combination. The last criterion selected is the one that is displayed when all of the results are calculated. To access results for any remaining criteria, use the Load Existing Results feature.
  3. The species that corresponds to the currently loaded Gene Database is displayed. If this is not the correct species for the Expression Dataset you are loading, you will have to choose a new Gene Database. Go to File>Choose Gene Database to do this.
  4. Select the type of analysis you would like to run by clicking on the check boxes for Gene Ontology and/or Local MAPPs. You can calculate results for just the Gene Ontology hierarchy, just local MAPPs, or both at the same time.
  5. If you would like to calculate p values for your MAPPFinder results, click the Calculate P values box. Calculating the p values will take an additional 5-10 minutes per criterion.
  6. Choose a filename for the results by clicking on the Browse button. MAPPFinder will create a separate tab-delimited text file (.txt) for the Gene Ontology and Local MAPPs results with the criterion number and either -GO or -Local appended to the file name. MAPPFinder will warn you if you are going to overwrite existing files. Do not remove the -GO or -Local from the filenames; if you do, MAPPFinder will not be able to recognize the files as MAPPFinder results files. MAPPFinder also creates another file with the same name as the Expression Dataset file, but with the extension .gdb (e.g., MyExperiment.gdb). This file is necessary for viewing the results in the MAPPFinder Browser.
  7. Once you have filled out the entire form, clicking on the Run MAPPFinder button at the bottom of the window will begin the analysis. Alternatively, clicking on the Main Menu button will return you to the Main Menu window; clicking on the New Dataset button will allow you to choose a new Expression Dataset from which to calculate results.
  8. When the results have been calculated, your data will be displayed in the MAPPFinder Browser.
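Because MAPPFinder relies on the -GO and -Local endings described in step 6 to recognize its results files, a small check like the one below can help when organizing many results files. It is only a sketch: the function name and the example file names are hypothetical, and the exact form of the criterion number in real file names is not assumed.

```python
from pathlib import Path


def results_kind(filename):
    """Return 'GO', 'Local', or None based on the ending of the file name."""
    path = Path(filename)
    if path.suffix.lower() != ".txt":
        return None  # results files are tab-delimited .txt files
    if path.stem.endswith("-GO"):
        return "GO"
    if path.stem.endswith("-Local"):
        return "Local"
    return None  # ending was removed or this is not a results file


# Hypothetical file names, for illustration only.
kinds = [
    results_kind("MyResults-GO.txt"),
    results_kind("MyResults-Local.txt"),
    results_kind("MyResults.txt"),
]
```

A file whose name no longer ends in -GO or -Local returns None here, which mirrors the warning in step 6 that such files will not be recognized.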

Loading Existing Results

Once you have calculated results for an Expression Dataset with MAPPFinder, you may view these results at any time in the MAPPFinder Browser.

To load existing results, click on the Load Existing Results button in the MAPPFinder Main Menu window, or choose File > Load Existing Results from the Calculate New Results window or MAPPFinder Browser. A new window will open and will ask you to select your results files for GO and local results.

To select a Gene Ontology results file, click on the corresponding button to browse and select your file. To select a Local MAPP Results file, click on the corresponding button to browse and select your file. You may choose just Gene Ontology results or just Local MAPP Results or both for viewing in the MAPPFinder Browser window. The species that corresponds to the currently loaded Gene Database is displayed. If this is not the correct species for the results files you are loading, you will be prompted to choose a new Gene Database. Go to File > Choose Gene Database to do this. After you have selected your file(s) and verified that the appropriate species is shown, click the Load Files button to view the results in the MAPPFinder Browser window. MAPPFinder requires the file MyExperiment.gdb to properly load results. If it cannot find this file, it will ask you to locate it.

Note: MAPPFinder results are specific to a particular build of the GenMAPP Gene Database and to the Local MAPPs loaded. This means that if you update your Gene Database, or change the local MAPPs loaded, you must recalculate the results. This is necessary because the GO is evolving and, with each new database build, the GO structure will be slightly different.

Viewing Results Using the MAPPFinder Browser

Browsing the results

Once MAPPFinder has finished calculating new results or if you have loaded existing results, the MAPPFinder Browser window will open. The MAPPFinder Browser is similar in appearance to the AmiGO Browser and can be used very similarly to navigate the GO hierarchy and local MAPP folder structure.

There are several ways to navigate the tree structure. The scroll bars to either side allow you to move within the tree window. In addition, the entire MAPPFinder Browser can be enlarged or reduced to fit your screen by dragging the lower right-hand corner of the window. A + box indicates that the node contains children. Clicking on the + will open the node and display its children. Clicking on the - will collapse a node and hide its children. Just above the tree window are three text boxes that allow you to filter the tree to highlight only those nodes meeting specific criteria. You can filter by the percentage of genes changed, the number of genes changed, and the Z score (or the p value, if calculated). Once you create your filter, clicking Expand Tree will highlight all of the appropriate nodes in yellow. Clicking Collapse Tree will collapse all of the nodes so that only the root node is shown. The browser defaults to highlighting all terms with at least 3 genes changed and an absolute Z score > 2 or a p value < 0.05.
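The default highlighting rule can be written out as a small predicate, which may be handy when reproducing the same filter on an exported results file. This is a sketch under one reading of the text: the p value test is assumed to replace the Z score test when p values were calculated. The function and parameter names are hypothetical.

```python
def is_highlighted(genes_changed, z_score, p_value=None,
                   min_changed=3, z_cutoff=2.0, p_cutoff=0.05):
    """Apply the browser's default filter to one GO term or MAPP."""
    if genes_changed < min_changed:
        return False  # fewer than 3 genes changed is never highlighted
    if p_value is not None:
        # Assumption: when a p value exists it is used instead of the Z score.
        return p_value < p_cutoff
    return abs(z_score) > z_cutoff


examples = [
    is_highlighted(5, 2.5),                 # enough genes, Z above cutoff
    is_highlighted(5, -2.5),                # absolute Z score is used
    is_highlighted(2, 5.0),                 # too few genes changed
    is_highlighted(5, 1.0, p_value=0.01),   # p value below cutoff
]
```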

Searching the results

It is also possible to search for a specific GO term, MAPP, or a keyword, using the Word Search option. Enter your keyword or term and click the appropriate radio button for keyword search or exact match, then click the Word Search button. If a MAPP or GO term is found, that node will be colored blue. If the node was already colored yellow as a result of the filter, the node will be colored green.

Similarly, you can search for a specific gene within all of the local MAPPs and Gene Ontology using the Gene ID Search. Enter your gene ID (alternate gene names are not accepted at this time) and select the appropriate gene ID system from the pull-down menu, then click Gene ID Search. If a MAPP or GO term is found, that node will be colored blue. If the node was already colored yellow as a result of a filter, the node will be colored green.

Viewing GO terms, MAPPs and genes

There are two ways to examine the genes associated with a particular GO term or MAPP. Both are activated by a mouse click on the local MAPP or GO term of interest in the tree window. The first option is to open a MAPP in GenMAPP. This can be either a pre-existing local MAPP or a list of the genes in a particular GO term that MAPPFinder will build automatically. The MAPP will be opened in GenMAPP and colored with the same Expression Dataset used to calculate the results.

Note: For the GO terms, MAPPFinder recreates the MAPP showing the list of genes associated with that GO term. It is our hope at GenMAPP.org that our users will find these lists useful as templates for creating fully delineated pathway MAPPs and that these pathways will be contributed to the MAPP Archive at GenMAPP.org. However, if you make any modifications, you will need to save the modified MAPP with a new name, or it will be overwritten if you click on that GO term again.

The second option is to export the list of genes associated with a particular MAPP or GO term. You can set how you would like MAPPFinder to react to a mouse click by clicking on the Options menu item. This will open the MAPPFinder options window where you can select the MAPPFinder Browser's node click response.

With the second option, a mouse click on a MAPP or GO term will open a new window containing a list of the genes changed and the genes measured for that node.

From this window you can export the lists as a text file.

Calculation Summary

Clicking on the Calculation Summary menu item opens a window that displays information about the overall MAPPFinder results:

For clarity the following terms need to be defined for their use in the context of MAPPFinder

Ranked List

Clicking on the menu item Show Ranked List opens a window that displays the MAPPFinder results ranked by their corresponding Z score (or p value, if calculated). There are separate lists for Local MAPPs and Gene Ontology terms. Clicking on a term in this window will take you to the same MAPP or term in the MAPPFinder Browser.

The GO date in the top right corner of the MAPPFinder browser window indicates the date that the Gene Ontology used in the current analysis was downloaded from the Gene Ontology web site. The GO is a dynamic entity that is being updated on a weekly basis, so if you are working with multiple versions of the Gene Database or other genomics tools that use the Gene Ontology, you will inevitably find discrepancies between different versions of the GO.

Viewing Results as a Spreadsheet

The MAPPFinder results files are saved as tab-delimited text files (.txt) that can be viewed in a spreadsheet program (e.g., Microsoft Excel). This will allow you to do additional filtering and sorting of the results to help you identify those MAPPs that are of particular interest to you and to generate tables for papers and presentations.

Note: The -GO and -Local result files are used by MAPPFinder to display existing results in the MAPPFinder Browser window. Any changes made to these files will prevent MAPPFinder from reading the file and cause errors in the program. If you are going to be working with the files in a spreadsheet program you should work with copies of the results files.
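One way to follow this advice when working with scripts rather than a spreadsheet program is to copy the results file first and read only the copy. The sketch below does that with the Python standard library. The function name and the "-copy" naming are hypothetical, and no particular layout of header or summary lines is assumed: every line is simply split on tabs.

```python
import csv
import shutil
from pathlib import Path


def copy_and_read(results_file, copy_suffix="-copy"):
    """Copy a results file, then read the copy as tab-delimited rows."""
    source = Path(results_file)
    # Hypothetical naming: MyResults-GO.txt -> MyResults-GO-copy.txt
    copy_path = source.with_name(source.stem + copy_suffix + source.suffix)
    shutil.copy2(source, copy_path)  # the original file is left untouched
    with open(copy_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        rows = list(reader)
    return copy_path, rows
```

The returned rows include the general information and calculation summary lines at the top of the file, so any filtering of the data columns would need to skip those first.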

At the top of the file, general information and the calculation summary for the results are given. The first 8 lines contain information needed by MAPPFinder when loading these results. Below that is the calculation summary of your results file. Below the general information and calculation summary, you will see columns representing the data calculated by MAPPFinder. Below is an example of what a MAPPFinder results file might look like in a simple text processor.

Working with Multiple Versions of the GenMAPP Gene Database

GenMAPP.org will be releasing new versions of Gene Databases on a regular basis. If you update your Gene Database (which is strongly recommended so that you have the most up-to-date gene annotation) then there are several things you must be aware of in MAPPFinder.

Exporting Multiple MAPPs

You can export GO terms and Local MAPPs in .mapp format, based on either a numerical filter or a word/Gene ID search. From the MAPPFinder Browser, under the File menu, you have the option of Export Terms Matching Numerical Filter or Export Terms Matching Word/Gene ID Search. Both options will open a window where you can enter the name of the folder you would like to create to store these MAPP files. This folder will be placed in your default MAPP folder for MAPPFinder, for example, C:\GenMAPP 2 Data\MAPPs\Mm.

This functionality was specifically created for use with the MAPP Sets function of GenMAPP. Once you have calculated your MAPPFinder results, you can create a filter to highlight the "interesting" GO terms and Local MAPPs and export these terms as a folder of MAPP files. Within the GenMAPP Drafting Board menu, choose File > Export > MAPP Sets. In the MAPP Sets window, choose the folder with the MAPPs exported from MAPPFinder. The MAPP Sets function will then create a set of HTML-formatted MAPPs for display on your website or to share with colleagues.

Export as text

You can also export the highlighted GO terms and MAPPs as text. This option is available in the File menu, under Export highlighted Portion of Tree as Text.

MAPPFinder Calculations

The calculations made by MAPPFinder are intended to give you an idea of the relative number of genes meeting your criterion that are present in each GO term or Local MAPP. The figure below shows a sample line from the MAPPFinder Browser with the different numbers that are calculated. Each number displayed in the MAPPFinder results and the Calculation Summary is defined below.

where N is the total number of genes measured, R is the total number of genes meeting the criterion, n is the total number of genes in this specific MAPP, and r is the number of genes meeting the criterion in this specific MAPP. A positive Z score indicates that there are more genes meeting the criterion in a GO term/MAPP than would be expected by random chance. A negative Z score indicates that there are fewer genes meeting the criterion than would be expected by random chance. If the MAPPFinder data truly obeyed the assumptions of the hypergeometric distribution, then a Z score of 1.96 or -1.96 would correspond to a p value of 0.05.
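As a worked illustration of these symbols, the sketch below computes a Z score from N, R, n, and r. It assumes the formula published in the MAPPFinder paper cited in the Overview (Doniger et al. 2003): the observed count r minus the expected count n(R/N), divided by the standard deviation of the hypergeometric distribution. The function name and the input checks are additions for this example only.

```python
import math


def mappfinder_z_score(N, R, n, r):
    """Z score for one GO term or MAPP.

    Assumed formula (hypergeometric mean and standard deviation):
        z = (r - n*R/N) / sqrt(n * (R/N) * (1 - R/N) * (1 - (n - 1)/(N - 1)))
    """
    if not (0 < R < N and 0 < n < N):
        raise ValueError("need 0 < R < N and 0 < n < N for a defined Z score")
    fraction = R / N                  # overall fraction of genes meeting the criterion
    expected = n * fraction           # genes expected to meet it in this MAPP
    variance = n * fraction * (1 - fraction) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# Illustrative numbers only: 1000 genes measured, 100 meet the criterion,
# and a MAPP with 50 measured genes contains 10 of them (5 expected).
z = mappfinder_z_score(N=1000, R=100, n=50, r=10)
```

With these illustrative numbers the Z score is about 2.4, so the MAPP contains more genes meeting the criterion than expected by chance.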