Data analysis with peakfinder
ChIPSeq Peak Finder
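Program download link
Overall, the program is just a pile of Python scripts written in a very scattered way, so it does not feel easy to use. I am starting to test it now.
After unpacking Peak finder I counted 17 *.py files in total, with nothing merged together, which is why I could not get it running for several days.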
I. A basic reading of the program documentation
1.
You will want to first convert Solexa output for the chip
and the control sample into bed files using one of the
following scripts:
maketrackfromeland.py
maketrackfromrealign.py
Note: use one of these two scripts to convert the Solexa output for the ChIP sample and the control sample into bed files.
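To make the target of this conversion concrete, here is a minimal sketch of writing six-column BED lines. It does not parse ELAND output and is not what maketrackfromeland.py does internally; `AlignedRead` and `read_to_bed` are hypothetical names, and the reads are assumed to be already parsed with 1-based start positions.

```python
from typing import NamedTuple


class AlignedRead(NamedTuple):
    """Hypothetical, already-parsed read (not the ELAND file format)."""
    chrom: str
    start: int    # 1-based leftmost position of the alignment (assumption)
    length: int   # read length in bases
    strand: str   # "+" or "-"
    name: str


def read_to_bed(read: AlignedRead, score: int = 0) -> str:
    """Format one read as a six-column BED line (0-based, half-open)."""
    bed_start = read.start - 1          # BED starts are 0-based
    bed_end = bed_start + read.length   # BED ends are exclusive
    fields = (read.chrom, bed_start, bed_end, read.name, score, read.strand)
    return "\t".join(str(f) for f in fields)


reads = [
    AlignedRead("chr1", 1001, 25, "+", "read1"),
    AlignedRead("chr1", 1500, 25, "-", "read2"),
]
for r in reads:
    print(read_to_bed(r))
```

The ChIP sample and the control sample would each go through the same conversion, giving two bed files.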
2.
The following scripts are used to read the output from the
0.3 version of ELAND run with the --multi option:
maketrackfromeland2.py
maketrackmulti.py
Note: these scripts read the output of ELAND version 0.3 when it is run with the --multi option.
3.
You can also create a bed-formatted WIG file, for display
on the UCSC browser:
makewiggle.py
Note: makewiggle.py creates a bed-formatted WIG file that can be displayed in the UCSC browser.
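As a rough illustration of what a bed-formatted WIG file contains, the sketch below turns read intervals into runs of constant read depth and writes them as four-column lines under a `track type=wiggle_0` header. This is only an assumption about the general shape of such a file, not the logic of makewiggle.py; `coverage_runs`, `wiggle_lines` and the track name are hypothetical.

```python
from collections import Counter
from itertools import groupby


def coverage_runs(intervals):
    """Yield (start, end, depth) runs of constant, non-zero read depth.

    `intervals` holds 0-based, half-open (start, end) pairs on one chromosome.
    """
    depth = Counter()
    for start, end in intervals:
        for pos in range(start, end):
            depth[pos] += 1
    positions = sorted(depth)
    # Within a run of consecutive positions, (position - index) is constant,
    # so grouping on that value plus the depth splits the data into runs.
    keyed = ((p - i, depth[p], p) for i, p in enumerate(positions))
    for (_, d), group in groupby(keyed, key=lambda t: (t[0], t[1])):
        run = [p for _, _, p in group]
        yield run[0], run[-1] + 1, d


def wiggle_lines(chrom, intervals, track_name="chip_coverage"):
    """Build a track header plus one 'chrom start end value' line per run."""
    lines = [f'track type=wiggle_0 name="{track_name}"']
    for start, end, d in coverage_runs(intervals):
        lines.append(f"{chrom}\t{start}\t{end}\t{d}")
    return lines


print("\n".join(wiggle_lines("chr1", [(0, 4), (2, 6)])))
```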
6.
The main script actually implements the peak finder:
findall.py
Note: this is the main script that actually runs the peak finder.
7.
findallnocontrol.py
Note: the README text pasted here repeated the conversion paragraph from item 1, so this script's own description is missing. The steps are file conversion, then correcting the sample against the control; the author recommends using the first script.
8.
NEW FEATURE of findall.py : as of version 2.0, you can
/ should use the -normalize option to calculate
everything as Reads Per Million (RPM). While we have
kept the original behavior as default, we will switch
-normalize to be the default in the next release.
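Note: new in findall.py: as of version 2.0 you can (and should) use the -normalize option to calculate everything as Reads Per Million (RPM). The original behavior is still the default for now; the next release will make -normalize the default.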
The philosophy of this peak finder is to define regions,
and then search for the motif. However, the findall
script can report the actual peaks in the region with
the -listpeak option.
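Note: the philosophy of this peak finder is to define regions first and then search them for the motif. Even so, the findall script can report the actual peaks within each region via the -listpeak option.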
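The "regions first, peaks second" idea can be sketched as follows. This is not the algorithm in findall.py, whose thresholds and scoring are not described here; `find_regions`, `summit`, `max_gap`, `min_reads` and `read_length` are all hypothetical names and values chosen for illustration.

```python
from collections import Counter


def find_regions(read_starts, max_gap=50, min_reads=3):
    """Cluster read starts: a gap larger than max_gap closes a region."""
    regions = []
    current = []
    for pos in sorted(read_starts):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_reads:
                regions.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_reads:
        regions.append(current)
    return regions


def summit(region_starts, read_length=25):
    """Return the leftmost position covered by the most reads in a region."""
    depth = Counter()
    for start in region_starts:
        for pos in range(start, start + read_length):
            depth[pos] += 1
    best = max(depth.values())
    return min(pos for pos, d in depth.items() if d == best)


starts = [100, 110, 120, 130, 900, 2000, 2010, 2015]
for region in find_regions(starts):
    print(region[0], region[-1], "peak at", summit(region))
```

The isolated read at 900 never reaches `min_reads`, so it is dropped before any peak is looked for.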
9.
The rest of the analysis depends heavily on Cistematic
to run. The following scripts find associated genes and
analyze their GO ontology enrichment, if any:
getallgenes.py
analyzego.py
Note: the rest of the analysis relies on Cistematic; these scripts find the associated genes and analyze their GO enrichment.
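For readers unfamiliar with GO enrichment, a common way to score it is a one-sided hypergeometric test. The sketch below shows that test only as an illustration; the README does not say which statistic analyzego.py or Cistematic uses, and the function name and numbers are made up.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when drawing `sample` genes without replacement.

    population: all genes considered
    annotated:  genes in the population carrying the GO term
    sample:     genes associated with the peak regions
    hits:       genes in the sample carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy example: 10 genes, 3 carry the term, 2 genes near peaks, 1 carries it.
print(hypergeom_enrichment_p(10, 3, 2, 1))
```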
10.
The following scripts, also requiring Cistematic, retrieve
the sequence in the enriched regions, find motifs using
MEME and map motif sites in regions around the peaks:
getfasta.py
findMotifs.py
getallsites.py
Note: these scripts also require Cistematic. They retrieve the sequences of the enriched regions, find motifs with MEME, and map motif sites in the regions around the peaks.
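Mapping motif sites amounts to scanning each region's sequence for matches to a motif. The sketch below scans the forward strand for an IUPAC consensus string; it is a simplified stand-in, not the matching that getallsites.py performs with Cistematic motifs, and `find_sites` and the toy sequence are hypothetical.

```python
# Standard IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_sites(sequence, consensus):
    """Return 0-based offsets where `consensus` matches the forward strand."""
    seq = sequence.upper()
    pattern = consensus.upper()
    width = len(pattern)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if all(base in IUPAC[code] for base, code in zip(window, pattern)):
            hits.append(i)
    return hits


# Toy sequence; "R" accepts A or G in the last position.
print(find_sites("ACGTTGCATGCG", "TGCR"))
```

A real scan would also check the reverse complement and would normally score matches against a weight matrix rather than require an exact consensus match.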
11.
The output of findMotifs.py and input of getallsites.py
are motifs in the Cistematic .mot format. A modified
version of getallsites.py to output NRSEs that uses
multiple motifs is:
getallNRSE.py
NRSE2.mot
NRSE2left.mot
NRSE2right.mot
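Note: the output of findMotifs.py and the input of getallsites.py are both in the Cistematic .mot format.
getallNRSE.py is a modified version of getallsites.py that outputs NRSEs using multiple motifs.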
12.
The remaining scripts are just helper scripts to allow
comparison between runs and/or move data into UCSC format.
bedtoregion.py
makesitetrack.py
regiontobed.py
regionintersects.py
siteintersects.py
Note: the remaining scripts are helpers for comparing runs and for converting data into UCSC format.
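Comparing two runs mostly comes down to asking which regions overlap. Here is a minimal sketch of that check, assuming each region is a `(chrom, start, end)` tuple with half-open coordinates; it is not the code of regionintersects.py, and both function names are hypothetical.

```python
def regions_overlap(a, b):
    """True when two (chrom, start, end) half-open regions share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def intersect_runs(run_a, run_b):
    """Return the regions of run_a that overlap at least one region of run_b."""
    return [a for a in run_a if any(regions_overlap(a, b) for b in run_b)]


run_a = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
run_b = [("chr1", 150, 250), ("chr2", 300, 400)]
print(intersect_runs(run_a, run_b))
```

The nested loop is quadratic, which is fine for a few thousand regions; larger sets would call for sorting both lists and sweeping through them once.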
II. A test run of the program.