[Learning Bioinformatics Together] Computing coverage depth and target capture efficiency from BAM files

news/2024/11/26 4:20:55/

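Computing coverage depth and target capture efficiency from BAM files is a routine step in genome sequencing analysis. I had implemented it before in Python and Perl, but those scripts were slow. Today I came across a tool called bamdst (https://github.com/shiquan/bamdst). It is written in C, runs fast, and reports a wide range of statistics: mapping statistics, target capture statistics, flanking-region statistics, and depth and coverage statistics.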

It is easy to use; see the GitHub repository for detailed usage.
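
As a rough illustration of what a run might look like when driven from a script, here is a minimal Python sketch. The `-q` option and the 200 bp default flank size come from the item descriptions in this post; the `-p` (BED file), `-o` (output directory) and `-f` (flank size) flags are my recollection of the project README and should be checked against bamdst's own help output. The function names and the file names `target.bed`, `bamdst_out` and `sample.bam` are hypothetical.

```python
import shlex
import subprocess
from pathlib import Path


def build_bamdst_command(bed, outdir, bams, flank=200, mapq=20):
    """Assemble a bamdst argument list.

    Assumption: -p/-o/-f/-q are the flags for BED file, output directory,
    flank size and mapping-quality cutoff; verify them with bamdst's help.
    """
    cmd = ["bamdst", "-p", str(bed), "-o", str(outdir),
           "-f", str(flank), "-q", str(mapq)]
    cmd.extend(str(b) for b in bams)  # one or more input BAM files
    return cmd


def run_bamdst(bed, outdir, bams, **kwargs):
    """Create the output directory, then run bamdst and fail loudly on error."""
    Path(outdir).mkdir(parents=True, exist_ok=True)
    cmd = build_bamdst_command(bed, outdir, bams, **kwargs)
    print("Running:", shlex.join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical paths; only prints the command so nothing is executed here.
    print(shlex.join(build_bamdst_command("target.bed", "bamdst_out", ["sample.bam"])))
```

Building the command as a list rather than a shell string avoids quoting problems when paths contain spaces.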

Below are the items the software reports, with an explanation of each.

| Item | Annotation |
| --- | --- |
| [Total] Raw Reads (All reads) | All reads in the BAM file(s). |
| [Total] QC Fail reads | Number of reads that failed QC. This flag is set by other software, such as bwa. See the flag field in the BAM format. |
| [Total] Raw Data(Mb) | Total read data in the BAM file(s). |
| [Total] Paired Reads | Number of paired reads. |
| [Total] Mapped Reads | Number of mapped reads. |
| [Total] Fraction of Mapped Reads | Ratio of mapped reads to raw reads. |
| [Total] Mapped Data(Mb) | Mapped data in the BAM file(s). |
| [Total] Fraction of Mapped Data(Mb) | Ratio of mapped data to raw data. |
| [Total] Properly paired | Paired reads with a proper insert size. See the BAM format specification for details. |
| [Total] Fraction of Properly paired | Ratio of properly paired reads to mapped reads. |
| [Total] Read and mate paired | Read (read1) and mate read (read2) are paired. |
| [Total] Fraction of Read and mate paired | Ratio of read-and-mate-paired reads to mapped reads. |
| [Total] Singletons | Read mapped but mate read unmapped, or vice versa. |
| [Total] Read and mate map to diff chr | Read and mate read mapped to different chromosomes, usually because of mapping errors or structural variants. |
| [Total] Read1 | First reads in paired-end sequencing. |
| [Total] Read2 | Mate reads. |
| [Total] Read1(rmdup) | First reads after removing duplicates. |
| [Total] Read2(rmdup) | Mate reads after removing duplicates. |
| [Total] forward strand reads | Number of forward-strand reads. |
| [Total] backward strand reads | Number of backward (reverse) strand reads. |
| [Total] PCR duplicate reads | Number of PCR duplicate reads. |
| [Total] Fraction of PCR duplicate reads | Ratio of PCR duplicate reads. |
| [Total] Map quality cutoff value | Mapping quality cutoff. It can be set with -q; the default is 20, because some variant callers such as GATK only consider high-quality reads. |
| [Total] MapQuality above cutoff reads | Number of reads with a mapping quality greater than or equal to the cutoff. |
| [Total] Fraction of MapQ reads in all reads | Ratio of reads with a mapping quality at or above the cutoff to raw reads. |
| [Total] Fraction of MapQ reads in mapped reads | Ratio of reads with a mapping quality at or above the cutoff to mapped reads. |
| [Target] Target Reads | Number of reads covering the target regions (specified by the BED file). |
| [Target] Fraction of Target Reads in all reads | Ratio of target reads to raw reads. |
| [Target] Fraction of Target Reads in mapped reads | Ratio of target reads to mapped reads. |
| [Target] Target Data(Mb) | Total bases covering the target regions. If a read only partly covers a target region, only the covering bases are counted. |
| [Target] Target Data Rmdup(Mb) | Total bases covering the target regions after removing PCR duplicates. |
| [Target] Fraction of Target Data in all data | Ratio of target bases to raw bases. |
| [Target] Fraction of Target Data in mapped data | Ratio of target bases to mapped bases. |
| [Target] Len of region | Total length of the target regions. |
| [Target] Average depth | Average depth of the target regions, calculated as "target bases / length of regions". |
| [Target] Average depth(rmdup) | Average depth of the target regions after removing PCR duplicates. |
| [Target] Coverage (>0x) | Ratio of target bases with depth greater than 0x, i.e. the fraction of the target regions that is covered at all. |
| [Target] Coverage (>=4x) | Ratio of target bases with depth greater than or equal to 4x. |
| [Target] Coverage (>=10x) | Ratio of target bases with depth greater than or equal to 10x. |
| [Target] Coverage (>=30x) | Ratio of target bases with depth greater than or equal to 30x. |
| [Target] Coverage (>=100x) | Ratio of target bases with depth greater than or equal to 100x. |
| [Target] Target Region Count | Number of target regions. In normal practice, this is the total number of exons. |
| [Target] Region covered > 0x | Number of target regions with average depth greater than 0x. |
| [Target] Fraction Region covered > 0x | Ratio of target regions with average depth greater than 0x. |
| [Target] Fraction Region covered >= 4x | Ratio of target regions with average depth greater than or equal to 4x. |
| [Target] Fraction Region covered >= 10x | Ratio of target regions with average depth greater than or equal to 10x. |
| [Target] Fraction Region covered >= 30x | Ratio of target regions with average depth greater than or equal to 30x. |
| [Target] Fraction Region covered >= 100x | Ratio of target regions with average depth greater than or equal to 100x. |
| [flank] flank size | Size of the flank to be counted, 200 bp by default. Oligos can also capture regions adjacent to the target regions. |
| [flank] Len of region (not include target region) | Total length of the flank regions (target regions are not counted). |
| [flank] Average depth | Average depth of the flank regions. |
| [flank] flank Reads | Total number of reads covering the flank regions. Note: reads that cover the edge of a target region are also counted as flank reads. |
| [flank] Fraction of flank Reads in all reads | Ratio of reads covering the flank regions to raw reads. |
| [flank] Fraction of flank Reads in mapped reads | Ratio of reads covering the flank regions to mapped reads. |
| [flank] flank Data(Mb) | Total bases in the flank regions. |
| [flank] Fraction of flank Data in all data | Ratio of total bases in the flank regions to raw data. |
| [flank] Fraction of flank Data in mapped data | Ratio of total bases in the flank regions to mapped data. |
| [flank] Coverage (>0x) | Ratio of flank bases with depth greater than 0x. |
| [flank] Coverage (>=4x) | Ratio of flank bases with depth greater than or equal to 4x. |
| [flank] Coverage (>=10x) | Ratio of flank bases with depth greater than or equal to 10x. |
| [flank] Coverage (>=30x) | Ratio of flank bases with depth greater than or equal to 30x. |
| [flank] Coverage (>=100x) | Ratio of flank bases with depth greater than or equal to 100x. |
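
To make the depth and coverage definitions above concrete, here is a small Python sketch that computes them from per-base depths. It is not bamdst's implementation, only a restatement of the definitions (average depth = bases / region length, base-level coverage at each threshold, and region-level fractions based on each region's average depth). The function names and the toy depth values are made up for illustration.

```python
from typing import Dict, List, Sequence

THRESHOLDS = (4, 10, 30, 100)  # the ">=Nx" cutoffs listed in the report


def average_depth(depths: Sequence[float]) -> float:
    """Average depth = total bases over the region / length of the region."""
    return sum(depths) / len(depths) if depths else 0.0


def threshold_fractions(values: Sequence[float]) -> Dict[str, float]:
    """Fraction of values that are >0x and >=4x, >=10x, >=30x, >=100x."""
    n = len(values)
    if n == 0:
        return {}
    out = {">0x": sum(v > 0 for v in values) / n}
    for t in THRESHOLDS:
        out[f">={t}x"] = sum(v >= t for v in values) / n
    return out


def summarize(regions: List[List[int]]) -> Dict[str, object]:
    """Summarize a list of regions, each given as per-base depths."""
    all_bases = [d for region in regions for d in region]
    region_means = [average_depth(region) for region in regions]
    return {
        "region_count": len(regions),                          # Target Region Count
        "region_length": len(all_bases),                       # Len of region
        "average_depth": average_depth(all_bases),             # Average depth
        "base_coverage": threshold_fractions(all_bases),       # Coverage (...)
        "region_coverage": threshold_fractions(region_means),  # Fraction Region covered
    }


# Toy example: three regions of four bases each (made-up depths).
toy_regions = [[0, 5, 10, 40], [0, 0, 0, 0], [120, 100, 30, 4]]
toy_summary = summarize(toy_regions)
```

Note the difference between the two kinds of fraction: "Coverage" counts individual bases, while "Fraction Region covered" counts whole regions by their average depth, so a region with a few deep bases can pass a threshold that most of its bases do not.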

 

 


