aDNA-trim produces invalid merged FASTQ files in some cases #3

@shyama-mama

Hi guys,

I am using aDNA-trim as follows:
seqtk mergepe R1.fastq R2.fastq | adna-trim -p aDNA_trim_pe - > aDNA_trim_merged.fastq
The data is a NovaSeq sample pre-processed with fastp to trim polyG tails.

This is the original read pair:
R1

@A00488:28:HJ3THDSXX:2:1146:24189:25441 1:N:0:ACCAACT
TCCAGAGTTATTGCTGTGATACAGGCAGAGATGCTATAACTGAGTTTGTATTCTAGGGGGGGGGGGCCGATGTTAACGGGGAAAAGATAAAAATTTAACTTAATTGATACAGTGATATTAAATACGGACGAGCACACGACTAAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,,FF:,,F,,FF,,:F,:,,,,F,,::,,,F:,,:FF,F,,F,F,F,F,F,::F,

R2

@A00488:28:HJ3THDSXX:2:1146:24189:25441 2:N:0:ACCAACT
CCTGTATCACTAAAGTTACATTATTATCTTTTCCCTGTTAACGTCGGGGGGGGCGGGGGGGGTGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
+
FFFFFFFFFFFFFFFF,,FFFFFFF,FFFFFFFFF,F,,,,:,,,,F,F,:,,,,,,::FFF,::,,,:FF::,:F:FFF:,:FF:FF,FFF:,F:F,F,,::,,,FFF,FF,::,:,:,F:,FFFF,FF,,:F:FFF:F:FF:

Running aDNA-trim on the original FASTQ files directly does not merge the reads, so I ran fastp to trim the polyG tail. This is the read pair after trimming:
R1

@A00488:28:HJ3THDSXX:2:1146:24189:25441 1:N:0:ACCAACT
TCCAGAGTTATTGCTGTGATACAGGCAGAGATGCTATAACTGAGTTTGTATTCTAGGGGGGGGGGGCCGATGTTAACGGGGAAAAGATAAAAATTTAACTTAATTGATACAGTGATATTAAATACGGACGAGCACACGACTAAC
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,,FF:,,F,,FF,,:F,:,,,,F,,::,,,F:,,:FF,F,,F,F,F,F,F,::F,

R2

@A00488:28:HJ3THDSXX:2:1146:24189:25441 2:N:0:ACCAACT
CCTGTATCACTAAAGTTACATTATTATCTTTTCCCTGTTAAC
+
FFFFFFFFFFFFFFFF,,FFFFFFF,FFFFFFFFF,F,,,,:

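Running aDNA-trim on the trimmed reads produces the following invalid record. The sequence line contains quality characters instead of bases and is much shorter than the quality line: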

@A00488:28:HJ3THDSXX:2:1146:24189:25441_22:*
,F,::F,
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,F!FFFFFFFFF,FF;FFF;,FFFFF;F-FFFFFFFF;FFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,,FF:,,F,,FF,,:F,:,,,,F,,::,,,F:,,:FF,F,,F,F,F,F,F,::F,
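For anyone who wants to find other affected records in their own output, here is a minimal Python sketch of a FASTQ sanity check. It is not part of aDNA-trim; the function names `check_fastq_record` and `iter_bad_records` are made up for this example, and it assumes plain four-line records with sequences restricted to A, C, G, T and N.

```python
import itertools

# Assumption: only unambiguous bases plus N are expected in the sequence line.
VALID_BASES = set("ACGTN")


def check_fastq_record(header, seq, plus, qual):
    """Return a list of problems found in one four-line FASTQ record."""
    problems = []
    if not header.startswith("@"):
        problems.append("header does not start with '@'")
    if not plus.startswith("+"):
        problems.append("separator line does not start with '+'")
    bad = set(seq.upper()) - VALID_BASES
    if bad:
        problems.append(
            "non-nucleotide characters in sequence: " + "".join(sorted(bad))
        )
    if len(seq) != len(qual):
        problems.append(
            f"sequence length {len(seq)} != quality length {len(qual)}"
        )
    return problems


def iter_bad_records(handle):
    """Yield (header, problems) for every malformed record in an iterable of lines."""
    stripped = (line.rstrip("\r\n") for line in handle)
    while True:
        record = list(itertools.islice(stripped, 4))
        if not record:
            break
        if len(record) < 4:
            yield record[0], ["truncated record (fewer than 4 lines)"]
            break
        problems = check_fastq_record(*record)
        if problems:
            yield record[0], problems


# Example use on the merged output from the command above:
#   with open("aDNA_trim_merged.fastq") as fh:
#       for header, problems in iter_bad_records(fh):
#           print(header, "; ".join(problems))
```

Applied to the record above, a check like this should flag both the non-nucleotide characters in the sequence line and the mismatch between sequence and quality lengths.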

The reads should merge into a valid record. For comparison, this is the merged read produced by fastp:

@A00488:28:HJ3THDSXX:2:1146:24189:25441 1:N:0:ACCAACT merged_113_0
TCCAGAGTTATTGCTGTGATACAGGCAGAGATGCTATAACTGAGTTTGTATTCTAGGGGGGGGGGGCCGATGTTAACGGGGAAAAGATAATAATGTAACTTTATTGATACAGG
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFF:FFFFFFFFFFFF:,,,,:F,F,:,FF,:FF,,:,F,FFF:F,F,,FFF,:F,:,,,,FF
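To illustrate why the pair looks mergeable, here is a naive Python sketch that slides the reverse complement of R2 along R1 and reports the placement with the fewest mismatches. This is only an illustration and not how aDNA-trim or fastp detect overlaps; `best_overlap`, `min_overlap` and `max_mismatch_rate` are hypothetical names and thresholds chosen for this example. It also only handles the case where the reverse-complemented R2 starts inside R1, not adapter read-through on very short inserts.

```python
# Translation table for complementing bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def best_overlap(r1, r2, min_overlap=20, max_mismatch_rate=0.2):
    """Naive overlap scan.

    Returns (offset, mismatches) for the placement of reverse-complemented
    R2 on R1 with the fewest mismatches, or None if no placement has at
    least `min_overlap` bases and at most `max_mismatch_rate` mismatches.
    """
    rc2 = reverse_complement(r2)
    best = None
    for offset in range(0, len(r1) - min_overlap + 1):
        overlap = min(len(r1) - offset, len(rc2))
        if overlap < min_overlap:
            continue
        # Count positions where R1 and reverse-complemented R2 disagree.
        mismatches = sum(
            a != b for a, b in zip(r1[offset:offset + overlap], rc2[:overlap])
        )
        if mismatches > max_mismatch_rate * overlap:
            continue
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best
```

Calling `best_overlap(r1_seq, r2_seq)` with the R1 sequence and the fastp-trimmed R2 sequence from this report returns either an `(offset, mismatches)` tuple or `None`. If it finds a placement, the merged length would be the offset plus the length of the trimmed R2, which can be compared against the `merged_113_0` tag in the fastp header above.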
