How do I count the lines that start with a certain marker?

2

I want to open a file and get back the number of the last line that contains '>', counting only the lines that have this '>'. Alternatively, the script could count how many '>' markers exist in the file and return that in a form that lets me put each number (1, 2, 3, ...) into a variable.

The data appears this way:

  

'>' VE05.fasta.screen.Contig1
TTTTGTTTTTTTTTTTTTTTTTTTTATTTAATTTTTTTCTTTGGGGGGGG
GGAAAATTTTTTTTTCCCTCCCTTCTACAACACAAGAAAAAAAAACTTCC
'>' VE05.fasta.screen.Contig2
TTTTGTTTTTTTTTTTTTTTTTTTTATTTAATTTTTTTCTTTGGGGGGGG
GGAAAATTTTTTTTTCCCTCCCTTCTACAACACAA

I made this code, but I know it is incomplete.

open (my @number, '<', @n);

@number = chop (); 

print "Contig's final number:@num";

close @n;
    
asked by anonymous 30.06.2014 / 17:10

1 answer

3

The general form of open is:

open([FILEHANDLE], '[OPEN MODE]', '[DIRECTORY/FILE]');
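To make the open modes concrete, here is a minimal sketch of the three most common ones: '>' (write), '>>' (append) and '<' (read). The temporary file is only a stand-in so the example runs anywhere; it is not part of the original problem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for a real path (assumption for this sketch).
my ( $tmp_fh, $path ) = tempfile( UNLINK => 1 );
close $tmp_fh;

# '>' opens for writing and truncates any existing content.
open my $out, '>', $path or die "Could not open $path for writing: $!";
print {$out} "first line\n";
close $out;

# '>>' opens for appending and keeps the existing content.
open my $app, '>>', $path or die "Could not open $path for appending: $!";
print {$app} "second line\n";
close $app;

# '<' opens for reading.
open my $in, '<', $path or die "Could not open $path for reading: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines read\n";
```

Checking the result of open with "or die" and including $! in the message shows why a file could not be opened.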

If the @n array already holds the lines of the file, loop over the array directly:

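my $sequence = 0;    # running count of header lines

open my $new_file, '>', 'new_sequence_file'
        or die "Could not open file: $!";

# Assumes each element of @n is one line without its trailing newline.
for my $row (@n) {
  if ( $row =~ /^'>'/ ) {
    # Header line: prefix it with the next sequence number.
    print {$new_file} ++$sequence . $row, "\n";
  } else {
    print {$new_file} $row, "\n";
  }
}

close $new_file;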

If you need to read the file directly:

Open it in read mode:

open my $fh, '<', '/usr/bin/TESTE/new_sequence_file.txt'
        or die "Could not open file: $!";

Create a new file to hold the numbered output; if you prefer, you can overwrite the original file instead.

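# '>>' appends to the file; use '>' to start from an empty file.
open my $new_file, '>>', 'new_sequence_file'
        or die "Could not open file: $!";

my $sequence = 0;    # running count of header lines

while ( my $row = <$fh> ) {
  chomp $row;    # drop the newline; it is added back when printing
  if ( $row =~ /^'>'/ ) {
    # Header line: prefix it with the next sequence number.
    print {$new_file} ++$sequence . $row, "\n";
  } else {
    print {$new_file} $row, "\n";
  }
}

close $fh;
close $new_file;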

Output:

1'>'VE05.fasta.screen.Contig1
TTTTGTTTTTTTTTTTTTTTTTTTTATTTAATTTTTTTCTTTGGGGGGGG
GGAAAATTTTTTTTTCCCTCCCTTCTACAACACAAGAAAAAAAAACTTCC
2'>'VE05.fasta.screen.Contig2
TTTTGTTTTTTTTTTTTTTTTTTTTATTTAATTTTTTTCTTTGGGGGGGG
GGAAAATTTTTTTTTCCCTCCCTTCTACAACACAA

If you need the number of header lines that contain '>', use the $sequence variable: after the loop it holds the total count.
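If each number has to be available separately, one option is to collect them in an array instead of a single counter. The following is only a sketch: the sub name count_headers is hypothetical, the sample data is a shortened copy of the data in the question, and an in-memory filehandle replaces the real file so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the sequence numbers (1, 2, 3, ...) of all
# lines that start with the literal '>' marker, quotes included.
sub count_headers {
    my ($fh) = @_;
    my @numbers;
    while ( my $row = <$fh> ) {
        push @numbers, scalar(@numbers) + 1 if $row =~ /^'>'/;
    }
    return @numbers;
}

# Sample data based on the question; the in-memory filehandle is used
# only so the sketch runs without a file on disk.
my $data = join "\n",
    "'>' VE05.fasta.screen.Contig1",
    "TTTTGTTTTTTTTTTTTTTTTTTTTATTTAATTTTTTTCTTTGGGGGGGG",
    "'>' VE05.fasta.screen.Contig2",
    "GGAAAATTTTTTTTTCCCTCCCTTCTACAACACAA";

open my $fh, '<', \$data or die "Could not open in-memory data: $!";
my @numbers = count_headers($fh);
close $fh;

# The last element is the final contig number; 0 means no header was found.
my $last = @numbers ? $numbers[-1] : 0;
print "Contig's final number: $last\n";
```

Here @numbers holds (1, 2), so $numbers[0] is the first contig number and $numbers[-1] the last.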

    
02.07.2014 / 18:58