IsoDE README

This README file includes the following sections:
    A. Installation
    B. How does it work?
    C. IsoBoot documentation
    D. IsoDECalls documentation
    E. Steps for using IsoDE

A. Installation:
----------------
1. Create an IsoDE directory and download the compressed IsoDE-1.0.0.zip from
   http://dna.engr.uconn.edu/software/IsoDE/IsoDE-1.0.0.zip
2. Uncompress IsoDE-1.0.0.zip into the directory.
3. Run the Unix script setup, provided in the compressed file.
______________________________________________________________________________________________________________________

B. How does it work?
--------------------
IsoDE performs differential gene expression (DE) analysis for RNA-Seq data without replicates,
based on bootstrapping. DE analysis is done in two steps.

The first step is to generate the bootstrap samples, which is done by IsoBoot. IsoBoot starts
with aligned RNA-Seq reads (SAM files). It samples from the input SAM files and runs IsoEM
(a gene expression estimation tool) on each bootstrap sample. The input SAM files should be
sorted by read name.

The second step computes the DE results from the generated bootstrap output. The tool that
performs the second step is called IsoDECalls.

The executables and source code for IsoBoot, IsoEM, and IsoDECalls are included under the bin
and src directories, respectively.
______________________________________________________________________________________________________________________

C. IsoBoot documentation
------------------------
IsoBoot: Bootstrapping of SAM files and computation of gene/isoform estimates using IsoEM.

isoboot -n -c -m -d

Mandatory parameters
--------------------
A gtf file
List of SAM files USING ABSOLUTE PATH
-m
-d
-a

Optional parameters
-n
-c

Note:
-----
Either -a or both -m and -d must be present.

Output:
-------
For each SAM file given as input:
a _DIR directory at the same location as the SAM file, with the following 3 subdirectories inside:
    - Genes: holds all gene estimates on the different bootstrap samples; one output file per sample
    - Isoforms: holds all isoform estimates on the different bootstrap samples; one output file per sample
    - One_run: holds the output of one IsoEM run on the original SAM file (gene and isoform estimates).
      Estimates for this one run are saved in the One_run/Gene and One_run/Isoform subdirectories.

Example
-------
isoboot hg19Ensembl64.gtf /data1/Control_Rep7_hg19GenEns64inGenCoord.sam -n 10 -c hg19Ensembl64TranscriptToGene.txt -m 260 -d 30
or
isoboot hg19Ensembl64.gtf ./Control_Rep7_hg19GenEns64inGenCoord.sam -n 20 -c hg19Ensembl64TranscriptToGene.txt -a
______________________________________________________________________________________________________________________

D. IsoDECalls documentation
---------------------------
IsoDECalls: Bootstrapping-based gene differential expression analysis

isode -c1 -c2 -b -dfc -out

Mandatory parameters
--------------------
-c1    List of bootstrapping paths for condition 1, using ABSOLUTE PATH
-c2    List of bootstrapping paths for condition 2, using ABSOLUTE PATH
-b     Bootstrap support
-dfc   Desired fold change
-out   The output file; it will be saved inside each directory given as input.

Notes
-----
1. All directory names should end with "/".
2. The FPKM of each gene for condition 1 is paired with each FPKM for the same gene in
   condition 2, resulting in N^2 FC values if there are N bootstrap runs per condition.
3. To help compute the support threshold for your analysis, for a given significance level
   and number of bootstrap samples, a tool is available at
   http://dna.engr.uconn.edu/~software/cgi-bin/calc/calc.cgi

Output
------
Bootstrap_Merge1_DIR - A directory that includes the IsoEM output for bootstrap samples, for all
                       replicates of condition 1 combined. It has the same structure of
                       subdirectories as the output of isoboot, described above.
Bootstrap_Merge2_DIR - A directory that includes the IsoEM output for bootstrap samples, for all
                       replicates of condition 2 combined. It has the same structure of
                       subdirectories as the output of isoboot, described above.
output.txt           - The final output file with DE results. This file will be located in
                       Bootstrap_Merge1_DIR and Bootstrap_Merge2_DIR.

Description of IsoDE output file:
---------------------------------
P:  List of ratios. Each ratio is computed by pairing each bootstrap sample of the first condition
    with every bootstrap sample of the second condition, resulting in N^2 FC values for N
    bootstrap runs per condition.
B:  Bootstrap support
FC: Fold change

Using the parameters defined above, IsoDE produces a CSV-formatted file containing the following
7 columns (the name of each column is in parentheses):

1- Gene(ID)
2- log_2(FC) lower/upper bound for user-specified bootstrap support >B (B=xx%) (log2(FC) for x% support)
       sort P = {set of logs of ratios of pairs of bootstraps} in descending order
       compute the B'th log-value P_B
       if P_B > 0 then
           report P_B
       else
           sort P = {set of logs of ratios of pairs of bootstraps} in ascending order
           compute the B'th log-value P_B
           if P_B < 0 then
               report P_B
           else
               report NDE
3- Support for user-given FC (FC = xx, in %) [support for FC >= x]
       sort P = {set of logs of ratios of pairs of bootstraps} in ascending order
       report (|bootstrap ratios larger than FC| / |P|) * 100%
4- Support for user-given 1/FC (FC = 1/xx, in %) (support for FC <= 1/x)
       sort P = {set of logs of ratios of pairs of bootstraps} in ascending order
       report (|bootstrap ratios smaller than 1/FC| / |P|) * 100%
5- log_2 fc = log_2(2ndFPKM/1stFPKM) (Exp FC) [ Exp log2(FC) ]
6- 1stFPKM = original IsoEM (FPKM_1)
7- 2ndFPKM = original IsoEM (FPKM_2)

Example
-------
isode -c1 /data1/BRAIN_UHR_Test/BRAIN_Genome_DIR/ /DataSet1/Test1_DIR/ -c2 /data1/BRAIN_UHR_Test/UHR_Genome_DIR/ /DataSet1/Test2_DIR/ -b 50 -dfc 2 -out "output1.txt"
isode -c1 ./BRAIN_Genome_DIR/ ./Test1_DIR/ -c2 ./UHR_Genome_DIR/ ./DataSet1/Test2_DIR/ -b 50 -dfc 2 -out "output2.txt"
______________________________________________________________________________________________________________________

E. Steps for using IsoDE
------------------------
The steps below are needed to run IsoDE on RNA-Seq data.
1) Map the RNA-Seq reads using a mapper suitable for your data.
2) Sort the resulting SAM files by read name. If not sure, run this command to sort each of the
   SAM files:
       sort -k 1,1 aligned_reads.sam > aligned_reads_sorted.sam
3) Run IsoBoot on the resulting SAM files. The executable is bin/isoboot.
4) Run IsoDECalls on the bootstrap output directories produced by IsoBoot. The executable is
   bin/isodecalls.
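
Note: the computation behind columns 2 to 4 of the IsoDE output file (section D) can be
illustrated with a short Python sketch. This is not the IsoDECalls source code; it is a minimal
illustration under stated assumptions. The function names and the FPKM values are made up, ratios
are taken as condition 2 over condition 1 (matching column 5), all FPKM values are assumed to be
strictly positive, and the "B'th value" is assumed to be the ceil(B% * |P|)-th element of the
sorted list. How IsoDECalls itself handles zero FPKM values or ties is not covered here.

```python
import math
from itertools import product


def pairwise_log2_fc(fpkm_1, fpkm_2):
    """Return the N^2 log2 fold changes for one gene (condition 2 over condition 1).

    Assumes every FPKM value is strictly positive (illustrative simplification).
    """
    return [math.log2(b / a) for a, b in product(fpkm_1, fpkm_2)]


def bound_for_support(log_fcs, support_pct):
    """Column 2 sketch: log2(FC) bound held by support_pct% of the pairs, or "NDE"."""
    # Assumed indexing: the B'th value is the ceil(B% * |P|)-th element (1-based).
    k = max(1, math.ceil(support_pct / 100 * len(log_fcs)))
    p_b = sorted(log_fcs, reverse=True)[k - 1]
    if p_b > 0:
        return p_b          # at least B% of the pairs show an increase of this size
    p_b = sorted(log_fcs)[k - 1]
    if p_b < 0:
        return p_b          # at least B% of the pairs show a decrease of this size
    return "NDE"


def support_for_fc(log_fcs, fc):
    """Columns 3 and 4 sketch: percent of pairs with FC >= fc and with FC <= 1/fc."""
    threshold = math.log2(fc)
    n = len(log_fcs)
    up = 100 * sum(1 for v in log_fcs if v >= threshold) / n
    down = 100 * sum(1 for v in log_fcs if v <= -threshold) / n
    return up, down


# Made-up bootstrap FPKM estimates for one gene, N = 3 runs per condition.
fpkm_condition_1 = [10.0, 12.0, 8.0]
fpkm_condition_2 = [40.0, 48.0, 32.0]

logs = pairwise_log2_fc(fpkm_condition_1, fpkm_condition_2)
print(len(logs), bound_for_support(logs, 50), support_for_fc(logs, 2))
```

With N = 3 runs per condition the sketch builds 9 log ratios, which mirrors the N^2 pairing
described in note 2 of section D. The values passed for the bootstrap support and the fold change
play the role of the -b and -dfc parameters.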