Tuesday, August 7, 2018

One-liner: get a sorted BAM file from bowtie

Get a sorted BAM file in one line

$ read=collapsed_read.fasta       # collapsed FASTA produced by fastx_collapser
$ genome_index=Genome             # basename of bowtie's .ebwt index files
$ cat "$read" | fastx_uncollapser | bowtie -S -f "$genome_index" - | samtools view -u -h -F 4 - | samtools sort - -o aln.sorted.bam
"-"    Means <STDIN>, i.e. the <STDOUT> of the previous command/program in the pipe.

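To see the "-" convention in isolation, here is a minimal sketch that uses only the standard sort utility instead of bowtie or samtools. The read names are made up for illustration; the point is just that "-" sits where a file name would normally go.

```shell
# "sort -" takes its input from the pipe because "-" replaces the file name.
# read1/read2 are placeholder lines, not real alignment records.
sorted=$(printf 'read2\nread1\n' | sort -)
echo "$sorted"
```

bowtie, samtools view and samtools sort accept "-" in the same position in the one-liner above.
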
fastx_uncollapser [-i INFILE] [-o OUTFILE]

-i [INFILE]      FASTA input file. default is STDIN.

-o [OUTFILE]     FASTA output file. default is STDOUT.
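
For intuition only, the awk sketch below imitates what uncollapsing does. It assumes the collapsed headers look like ">ID-COUNT" (the form fastx_collapser writes) and that each sequence sits on a single line. The function name uncollapse_sketch and the ">ID-N" naming of the expanded records are my own choices and may not match the real tool's output.

```shell
# Rough imitation of uncollapsing: repeat each sequence COUNT times.
# Assumes headers of the form ">ID-COUNT" and one sequence line per record.
uncollapse_sketch() {
  awk '
    /^>/ { split(substr($0, 2), parts, "-"); id = parts[1]; count = parts[2] + 0; next }
    NF   { for (i = 1; i <= count; i++) printf ">%s-%d\n%s\n", id, i, $0 }
  '
}

# Two collapsed records: ACGT seen 3 times, TTGA seen once.
printf '>1-3\nACGT\n>2-1\nTTGA\n' | uncollapse_sketch
```

Like fastx_uncollapser in the one-liner, the sketch reads STDIN and writes STDOUT, so it can sit in the middle of a pipe.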

bowtie [options] <ebwt> <s> [<hit>]

<s>              The query input file(s).
                 If - is specified, bowtie reads from the "standard in" filehandle.
-f               The query input files (<s>) are FASTA files.
-S/--sam         Print alignments in SAM format.

samtools view [options] <in.bam>|<in.sam>|<in.cram>

-h               Include the header in the output.

-u               Output uncompressed BAM. 
                 This option saves time spent on compression/decompression 
                 and is thus preferred when the output is piped to another 
                 samtools command.

-F [INT]         Do not output alignments with any bits set in INT present 
                 in the FLAG field. Here, 4 (0x4) is the "read unmapped" bit,
                 so -F 4 drops unmapped reads.
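
The -F test is a bitwise AND between the FLAG and the mask. The sketch below reproduces that check with shell arithmetic; keep_read is a hypothetical helper, not part of samtools, and the FLAG values are examples (0 = mapped, 16 = mapped to the reverse strand, 4 and 77 both carry the unmapped bit).

```shell
# keep_read FLAG MASK: succeeds when none of the MASK bits are set in FLAG,
# which is the condition under which "samtools view -F MASK" keeps a record.
keep_read() {
  [ $(( $1 & $2 )) -eq 0 ]
}

for flag in 0 4 16 77; do
  if keep_read "$flag" 4; then
    echo "FLAG $flag: kept"
  else
    echo "FLAG $flag: dropped"
  fi
done
```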


samtools sort [-o out.sam|out.bam|out.cram] [in.sam|in.bam|in.cram]

-m [INT]         Approximately the maximum required memory per thread, 
                 specified either in bytes or with a K, M, or G suffix.

-@ [INT]         Set number of sorting and compression threads. 
                 By default, operation is single-threaded
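
Because -m is a per-thread limit, the two options multiply. The arithmetic below is a rough estimate, not a guarantee of actual usage; the values 4 threads and 1G are arbitrary examples, and the samtools call is left as a comment so the snippet runs without samtools installed.

```shell
threads=4             # example value for -@
mem_per_thread_gb=1   # example value for -m, in gigabytes

# Approximate memory the sort may use: per-thread limit times thread count.
total_gb=$(( threads * mem_per_thread_gb ))
echo "approximate sort memory: ${total_gb}G"

# The corresponding last stage of the pipe would look like:
#   ... | samtools sort -@ "$threads" -m "${mem_per_thread_gb}G" - -o aln.sorted.bam
```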


Finally, create the BAM index file that some software requires

$ samtools index aln.sorted.bam aln.sorted.bam.bai
