Repairing broken paired-end reads
Before we start
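When you analyze paired-end FASTQ data that are not raw, that is, files that have already been through some processing step, some reads may have lost their mates. Downstream tools that expect properly paired reads are then likely to fail with an error. Before analyzing non-raw paired-end reads, it is therefore worth checking for unpaired reads and removing them. A recommended tool for this is repair.sh from BBMap; BBMap can be installed with conda.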
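As a quick first check before running repair.sh, you can compare the number of reads in the two files. The sketch below is only an illustration: the helper names count_reads and same_read_count are made up for this post, and it assumes standard four-line FASTQ records. Equal counts do not prove that the mates line up, but unequal counts are a sure sign that some mates are missing.

```shell
# BBMap is commonly installed from Bioconda, for example (assumed channel setup):
#   conda install -c bioconda bbmap

# Count reads in a FASTQ file (plain or gzip-compressed).
# Assumes standard 4-line records, i.e. no wrapped sequence lines.
count_reads() {
  case "$1" in
    *.gz) gzip -dc "$1" ;;
    *)    cat "$1" ;;
  esac | awk 'END { print int(NR / 4) }'
}

# Return 0 when both files hold the same number of reads, 1 otherwise.
same_read_count() {
  [ "$(count_reads "$1")" = "$(count_reads "$2")" ]
}
```

For example, same_read_count sample_R1.fq.gz sample_R2.fq.gz || echo "read counts differ" flags a pair of files that needs repairing.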
repair.sh parameters
Run repair.sh with no arguments in a terminal to print its parameters and their descriptions:
Written by Brian Bushnell
Last modified November 9, 2016
Description: Re-pairs reads that became disordered or had some mates eliminated.
Please read bbmap/docs/guides/RepairGuide.txt for more information.
Usage: repair.sh in=<input file> out=<pair output> outs=<singleton output>
Input may be fasta, fastq, or sam, compressed or uncompressed.
Parameters:
in=<file> The 'in=' flag is needed if the input file is not the first
parameter. 'in=stdin' will pipe from standard in.
in2=<file> Use this if 2nd read of pairs are in a different file.
out=<file> The 'out=' flag is needed if the output file is not the second
parameter. 'out=stdout' will pipe to standard out.
out2=<file> Use this to write 2nd read of pairs to a different file.
outs=<file> (outsingle) Write singleton reads here.
overwrite=t (ow) Set to false to force the program to abort rather than
overwrite an existing file.
showspeed=t (ss) Set to 'f' to suppress display of processing speed.
ziplevel=2 (zl) Set to 1 (lowest) through 9 (max) to change compression
level; lower compression is faster.
fint=f (fixinterleaving) Fixes corrupted interleaved files using read
names. Only use on files with broken interleaving - correctly
interleaved files from which some reads were removed.
repair=t (rp) Fixes arbitrarily corrupted paired reads by using read
names. Uses much more memory than 'fint' mode.
ain=f (allowidenticalnames) When detecting pair names, allows
identical names, instead of requiring /1 and /2 or 1: and 2:
Java Parameters:
-Xmx This will set Java's memory usage, overriding autodetection.
-Xmx20g will specify 20 gigs of RAM, and -Xmx200m will
specify 200 megs. The max is typically 85% of physical memory.
-eoom This flag will cause the process to exit if an out-of-memory
exception occurs. Requires Java 8u92+.
-da Disable assertions.
Please contact Brian Bushnell at bbushnell@lbl.gov if you encounter any problems.
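To make the flags in the help text more concrete, here is a small dry-run helper that only prints a repair.sh command for one sample instead of running it. The function name build_repair_cmd and the file-naming pattern (sample prefix plus _R1/_R2 suffixes) are assumptions chosen for illustration; the flags themselves (in, in2, out, out2, outs, overwrite, -Xmx) are the ones listed in the help text.

```shell
# Print (do not run) a repair.sh command for one sample.
# Usage: build_repair_cmd <sample prefix> [java heap size, e.g. 20g]
# The file-naming pattern is hypothetical; adapt it to your own data.
build_repair_cmd() {
  sample="$1"
  heap="${2:-}"
  cmd="repair.sh"
  # -Xmx overrides the automatic memory detection (see Java Parameters).
  if [ -n "$heap" ]; then
    cmd="$cmd -Xmx$heap"
  fi
  # overwrite=f makes the program abort rather than overwrite existing output.
  echo "$cmd in=${sample}_R1.fq.gz in2=${sample}_R2.fq.gz" \
       "out=${sample}_paired_R1.fastq out2=${sample}_paired_R2.fastq" \
       "outs=${sample}_singletons.fastq overwrite=f"
}
```

Printing the command first makes it easy to review what will be run for each sample before executing anything, which helps when the default repair mode needs a larger heap than autodetection provides.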
Typical usage
repair.sh in1=sample_R1.fq.gz in2=sample_R2.fq.gz out1=sample_paired_R1.fastq out2=sample_paired_R2.fastq outs=sample_singletons.fastq
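After repairing, it can be reassuring to confirm that the two paired output files really list the same reads in the same order. The following is a minimal sketch under several assumptions: the function name pairs_in_sync is made up, the inputs are uncompressed four-line FASTQ files, mates are named either with /1 and /2 suffixes or with a space-separated comment field, and the read names of the first file fit in memory.

```shell
# Return 0 if two FASTQ files contain the same read names in the same order.
# Assumes uncompressed, 4-line FASTQ records; names of file 1 are held in memory.
pairs_in_sync() {
  awk '
    # Reduce a FASTQ header line to its bare read name.
    function read_id(line) {
      sub(/^@/, "", line)         # drop the leading "@"
      sub(/[ \t].*$/, "", line)   # drop the comment field, e.g. " 1:N:0:ACGT"
      sub(/\/[12]$/, "", line)    # drop a trailing /1 or /2
      return line
    }
    FNR % 4 == 1 {
      if (FILENAME == ARGV[1]) { ids[FNR] = read_id($0); n1++ }
      else { n2++; if (ids[FNR] != read_id($0)) bad++ }
    }
    END { exit (bad > 0 || n1 != n2) ? 1 : 0 }
  ' "$1" "$2"
}
```

With the output names used above, pairs_in_sync sample_paired_R1.fastq sample_paired_R2.fastq && echo "pairs OK" gives a simple confirmation before moving on to downstream analysis.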

