Delete lines shorter than a certain length and the line above each (remove short sequences in a FASTA file)
I have a file containing the following text:
>seq1
GAAAT
>seq2
CATCTCGGGA
>seq3
GAC
>seq4
ATTCCGTGCC
If a line that doesn't start with ">" is 5 characters or shorter, I want to delete it and the line right above it.
Expected output:
>seq2
CATCTCGGGA
>seq4
ATTCCGTGCC
I have tried sed -r '/^.{,5}$/d', but it also deletes the lines with ">".
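One possible way to express the rule is with awk instead of sed: hold each ">" header back and print it only once the sequence line after it turns out to be long enough. This is a minimal sketch, not a definitive solution. The function name filter_short_seqs and the variable max_len are hypothetical, and it assumes every record is exactly two lines (one header, one sequence line), as in the sample above.

```shell
# Hypothetical helper: print only the records whose sequence is longer
# than max_len characters (default 5, so 5 or fewer means "drop").
# Assumption: each record is a ">" header followed by ONE sequence line.
filter_short_seqs() {
    awk -v max_len="${2:-5}" '
        /^>/ { header = $0; next }   # remember the header, print nothing yet
        length($0) > max_len {       # sequence is long enough: emit the pair
            print header
            print
        }
    ' "$1"
}
```

Called as filter_short_seqs input.fa (where input.fa is a placeholder file name) on the sample data, this prints the seq2 and seq4 records. Sequences wrapped over several lines would need a different approach, since only the line directly after the header is measured here.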