Assuming the file has a format similar to this:
>SEQUENCE 1
MTEITAAMVKELRESTGAGMMDCKNALSETNGDFDKAVQLLREKGLGKAAKKADRLAAEG
LVSVKVSDDFTIAAMRPSYLSYEDLDMTFVENEYKALVAELEKENEERRRLKDPNKPEHK
IPQFASRKQLSDAILKEAEEKIKEELKAQGKPEKIWDNIIPGKMNSFIADNSQLDSKLTL
MGQFYVMDDKKTVEQVIAEKEKEFGGKIKIVEFICFEVGEGLEKKTEDFAAEVAAQL
>SEQUENCE 2
SATVSEINSETDFVAKNDQFIALTKDTTAHIQSNSLQSVEELHSSTINGVKFEEYLKSQI
ATIGENLVVRRFATLKAGANGVVNGYIHTNGRVGVVIAAACDSAEVASKSRDLLRQICMH
I assume that what you want to remove are the header lines of the form >SEQUENCE xxxx (or similar).
I should say up front that I do not really know this format beyond a quick look at Wikipedia, but I think your goal is simple if it really just comes down to reading the FASTA file line by line.
arquivo = 'foo.dat'  # Your "fasta" file
with open(arquivo, 'r') as f:  # Open for reading; the file is closed automatically
    lines = f.readlines()  # Read the lines into a list
relist = []  # New list that keeps only the lines of interest
for line in lines:
    if not line.startswith('>'):  # Skip the lines that start with >
        relist.append(line)
print(relist)  # Show the list in the output
If what you actually want is to remove the first line, whatever it is, just call .pop(0) on the list of lines, like this:
arquivo = 'foo.dat'
with open(arquivo, 'r') as f:
    lines = f.readlines()  # Read the lines into a list
firstLine = lines.pop(0)  # Remove the first line (pop is called on the list, not on the file)
print(lines)
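A small sketch of what .pop(0) does, using an in-memory list instead of a real file (the sample lines are shortened and purely illustrative, and the name first_line is only for this sketch). Note that only the first header goes away; any later > line stays in the list, which is the difference from the filtering loop in the first example.

```python
# Hypothetical, shortened sample standing in for the result of f.readlines()
lines = ['>SEQUENCE 1\n', 'MTEITAAMVK\n', '>SEQUENCE 2\n', 'SATVSEINSE\n']

first_line = lines.pop(0)  # removes and returns the element at index 0

print(repr(first_line))  # the removed header line
print(lines)             # the remaining lines, still including '>SEQUENCE 2\n'
```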
To turn the list ("array") into a string ("text"), just use str.join(array). For the first example it would look like this:
''.join(relist)
And like this for the second:
''.join(lines)
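One detail worth knowing: readlines() keeps the trailing newline on every line, so ''.join(...) gives back text that still contains the line breaks. If you would rather have the sequence as one continuous string, strip each line first. A minimal sketch, assuming a shortened in-memory sample as a stand-in for the file (the helper name join_sequence is made up for illustration):

```python
def join_sequence(lines):
    """Drop '>' header lines and join the rest into one string without line breaks."""
    return ''.join(line.strip() for line in lines if not line.startswith('>'))

# Hypothetical sample standing in for the result of f.readlines()
sample = ['>SEQUENCE 1\n', 'MTEITAAMVK\n', 'ELRESTGAGM\n']

# Plain join: headers filtered out, line breaks preserved
with_breaks = ''.join(line for line in sample if not line.startswith('>'))
print(repr(with_breaks))

# Stripped join: one continuous sequence
print(join_sequence(sample))
```

If the file holds several > records, this helper merges all of them into a single string, so split on the headers first if you need each sequence separately.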