LM: Literature Mining Analysis
Parameter Information
Location of Support File(s)
This option lets the user select the folder that contains all the support files needed to run BN.
Network Priors Sources
The checkboxes let the user select the source of the Bayesian prior probabilities used to construct a seeded network.
Currently, Literature Mining and KEGG priors are available. Protein-Protein Interaction as a source of priors is still under development.
At present, the application downloads the KEGG support files automatically from the TM4 website. The user is prompted for species information
if annotation is not available. All other prior sources must be supplied by the user.
Discretize Expression Values
The data mining algorithm requires that the data be discretized into bins before it can be evaluated for network structure learning.
It is strongly recommended that the user keep the default value of 3, which means the data can exist in 3 states:
- Under expressed
- Over expressed
- Unchanged
The algorithm functions and reports meaningfully when the 3-state rule is followed.
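As an illustration of what a 3-state discretization looks like, the following Python sketch maps each expression value to one of the three states. The function name `discretize` and the thresholds `low` and `high` are assumptions made for this example only; they are not taken from MeV, which may choose its bin boundaries differently.

```python
def discretize(values, low=-1.0, high=1.0):
    """Map each expression value to one of three states.

    `low` and `high` are illustrative thresholds (assumed, not MeV's):
    values below `low` are under expressed, values above `high` are
    over expressed, and everything in between is unchanged.
    """
    states = []
    for value in values:
        if value < low:
            states.append("under")
        elif value > high:
            states.append("over")
        else:
            states.append("unchanged")
    return states
```

Calling `discretize([-2.0, 0.0, 1.5])` places the first value in the under expressed state, the second in the unchanged state and the third in the over expressed state.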
How to direct Edges for graph
The algorithm uses DFS (Depth First Search) to connect nodes in the initial seeded network. For large networks with many nodes, this can take a while
to complete. The GO Term option for directing edges is not yet fully developed.
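To give a feel for how a depth-first search can be used to direct edges, here is a minimal Python sketch. It is not MeV's implementation: the function names `dfs_order` and `direct_edges`, and the rule of pointing each edge from the node discovered earlier to the node discovered later, are assumptions chosen for illustration. That rule always produces a graph without directed cycles, which is what a Bayesian network requires.

```python
def dfs_order(adjacency, start):
    """Return the nodes reachable from `start` in depth-first discovery order."""
    order, seen, stack = [], set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbours in reverse so they are explored in listed order.
        for neighbour in reversed(adjacency.get(node, [])):
            if neighbour not in seen:
                stack.append(neighbour)
    return order


def direct_edges(undirected_edges):
    """Orient each undirected edge from the earlier to the later discovered node."""
    adjacency = {}
    for a, b in undirected_edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    rank = {}  # node -> position in the overall discovery order
    for node in adjacency:  # restart the search for every unconnected component
        if node not in rank:
            for found in dfs_order(adjacency, node):
                rank[found] = len(rank)

    return [(a, b) if rank[a] < rank[b] else (b, a) for a, b in undirected_edges]
```

Every node and every edge is visited, so the running time grows with the size of the network, which is consistent with large networks taking longer to process.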
Using Support Files created for standard arrays
We have pre-created the support files needed to run BN or LM analysis for some popular microarray platforms such as Affymetrix and Agilent. Currently we provide support files for 3 species: Human, Mouse and Rat. MeV comes preloaded with the files for 2 array types in the ~/data/BN_files folder:
- Affymetrix Human U133 Plus 2 Array
- Affymetrix Mouse 430 2 Array
- Support file FTP Location: Human, Mouse & Rat only
- ftp://occams.dfci.harvard.edu/pub/bio/tgi/data/Resourcerer/Human
- ftp://occams.dfci.harvard.edu/pub/bio/tgi/data/Resourcerer/Mouse
- ftp://occams.dfci.harvard.edu/pub/bio/tgi/data/Resourcerer/Rat
- File Naming Conventions:
- All BN/LM related file names end with _BN.zip. E.g.: affy_HG-U133_Plus_2_BN.zip
- All file names start with the array/chip vendor name, e.g. affy for Affymetrix. E.g.: affy_HG-U133_Plus_2_BN.zip
- The vendor name is followed by the chip/array name. E.g.: affy_HG-U133_Plus_2_BN.zip
- Contents of zip files: All zip files contain 6 files
- affyID_accession.txt
- res.txt
- symArtsGeneDb.txt
- symArtsPubmed.txt
- all_ppi.txt
- gbGO.txt
- Steps to use the pre-created support files. The example array chosen for illustration is Affymetrix Human U133 Plus 2. To use any array, follow the steps below:
- Download the file specific to your species and array from the FTP location mentioned above. E.g. affy_HG-U133_Plus_2_BN.zip
- Extract the contents of the zip file under the following MeV directory: ~/data/BN_files/
- Once extracted, a folder named after the array is created. If the example file was downloaded, the following location will now exist: ~/data/BN_files/affy_HG-U133_Plus_2_BN
- Verify that all 6 files exist.
- From MeV, launch the LM or BN module.
- In the start-up dialogue, make sure the ‘File(s) Location’ box points to the folder that holds the support files for the species and array concerned. If the example array was chosen, the text box should point to the ./data/BN_files/affy_HG-U133_Plus_2_BN folder.
- Now you are ready to start the algorithm.
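The naming convention and the verification step above can be checked with a short script. The following Python sketch is a convenience only and is not part of MeV; the helper names `support_zip_name` and `missing_support_files` are hypothetical, and the list of expected files is copied from the zip contents listed above.

```python
from pathlib import Path

# The 6 files every support zip is expected to contain (from the list above).
EXPECTED_FILES = [
    "affyID_accession.txt",
    "res.txt",
    "symArtsGeneDb.txt",
    "symArtsPubmed.txt",
    "all_ppi.txt",
    "gbGO.txt",
]


def support_zip_name(vendor, chip):
    """Build a support file name: vendor name, then chip name, then _BN.zip."""
    return f"{vendor}_{chip}_BN.zip"


def missing_support_files(folder):
    """Return the expected support files that are absent from `folder`."""
    folder = Path(folder).expanduser()
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]
```

For the example array, `support_zip_name("affy", "HG-U133_Plus_2")` gives the name of the file to download, and `missing_support_files("~/data/BN_files/affy_HG-U133_Plus_2_BN")` returns an empty list when all 6 files are in place.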