I have code that replaces certain substrings with an empty string:
dados = '[{"Id":12345,"Date":"2018-11-03T00:00:00","Quality":"Goodão","Name":"X","Description":null,"Url":"x.com.br/qweqwe","ParseUrl":"x-art","Status":"Ativa","Surveys":0,"KeySearch":"x Art","QualityId":3,"Type":"Tecnology"},{"Id":12346,"Date":"2018-11-03T00:00:00","Quality":"Good","Name":"YYy","Description":"Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.","Url":"https://www.y.com.br/sdfsfs","ParseUrl":"y beautiful","Status":"Ativa","Surveys":0,"KeySearch":"y like","QualityId":3,"Type":"Tecnology"},{"Id":12347,"Date":"2018-11-03T00:00:00","Quçality":"Pending","Name":"z Z","Description":"Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur","Url":"http://www.z.com.br/asdasdas","ParseUrl":null,"Status":"Ativa","Surveys":112,"KeySearch":"z plant","QualityId":4,"Type":"Agro"},{"Id":12335,"Date":"2018-11-03T00:00:00","Quality":"óéGood","Name":"J","Description":null,"Url":"www.j.com.br","ParseUrl":"x-art","Status":"Ativa","Surveys":0,"KeySearch":"x Art","QualityId":3,"Type":"Tecnology"},{"Id":12332,"Date":"2018-11-03T00:00:00","Quality":"óéGood","Name":"J","Description":null,"Url":"www.j.com.br/","ParseUrl":"x-art","Status":"Ativa","Surveys":0,"KeySearch":"x Art","QualityId":3,"Type":"Tecnology"}]'
dados = dados.replace('http://', '')
dados = dados.replace('https://', '')
print(dados)
Result:
[{"Id":12345,"Date":"2018-11-03T00:00:00","Quality":"Good�o","Name":"X","Description":null,"Url":"x.com.br/qweqwe","ParseUrl":"x-art","Status":"Ativa","Surveys":0,"KeySearch":"x Art","QualityId":3,"Type":"Tecnology"},{"Id":12346,"Date":"2018-11-03T00:00:00","Quality":"Good","Name":"YYy","Description":"Lorem Ipsum has been the industrys standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book.","Url":"www.y.com.br/sdfsfs","ParseUrl":"y beautiful","Status":"Ativa","Surveys":0,"KeySearch":"y like","QualityId":3,"Type":"Tecnology"},{"Id":12347,"Date":"2018-11-03T00:00:00","Qu�ality":"Pending","Name":"z Z","Description":"Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur","Url":"www.z.com.br/asdasdas","ParseUrl":null,"Status":"Ativa","Surveys":112,"KeySearch":"z plant","QualityId":4,"Type":"Agro"},{"Id":12335,"Date":"2018-11-03T00:00:00","Quality":"��Good","Name":"J","Description":null,"Url":"www.j.com.br","ParseUrl":"x-art","Status":"Ativa","Surveys":0,"KeySearch":"x Art","QualityId":3,"Type":"Tecnology"},{"Id":12332,"Date":"2018-11-03T00:00:00","Quality":"��Good","Name":"J","Description":null,"Url":"www.j.com.br/","ParseUrl":"x-art","Status":"Ativa","Surveys":0,"KeySearch":"x Art","QualityId":3,"Type":"Tecnology"}]
In this case the result is as expected, but I cannot get the same kind of replacement to work with a regex (I have already tried several approaches).
As you can see below, the regex matches only once, at the first slash, and removes everything after it, so the rest of the dados variable is lost:
import re

dados = re.sub(re.compile('(/.*)', re.MULTILINE), '', dados)
print(dados)
Result:
[{"Id":12345,"Date":"2018-11-03T00:00:00","Quality":"Good�o","Name":"X","Description":null,"Url":"x.com.br
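The same effect can be reproduced on a much shorter string. The sample value below is made up for illustration and is not the real data. Because the whole JSON sits on a single line and . matches every character except a newline, the greedy /.* runs from the first slash to the end of the string, and re.MULTILINE does not change that:

```python
import re

# Hypothetical shortened sample, not the real data
sample = '[{"Url":"x.com.br/qweqwe","Name":"X"},{"Url":"www.j.com.br/","Name":"J"}]'

# '.*' is greedy and '.' matches any character except a newline,
# so the match starts at the first '/' and runs to the end of the line
truncated = re.sub(r'/.*', '', sample)
print(truncated)  # [{"Url":"x.com.br
```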
I understand why this happens, but I wonder whether there is a way to do the replacement with a regex, in the same way the replace function does.
The goal is to keep only the domain and remove everything else. For example, in x.com.br/qweqwe I consider /qweqwe to be "garbage", because only x.com.br is important.
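One possible direction, as a minimal sketch rather than a definitive solution: replace the greedy .* with a negated character class, so the match stops at the closing quote of the value, and anchor the pattern to the "Url" key so other fields are left alone. The sample string and the name URL_FIELD are made up for illustration. The sketch assumes compact JSON (no space after the colon) and Url values without escaped quotes or escaped slashes.

```python
import re

# Hypothetical shortened sample in the same shape as dados
sample = (
    '[{"Id":1,"Url":"x.com.br/qweqwe","Type":"A"},'
    '{"Id":2,"Url":"https://www.y.com.br/sdfsfs","Type":"B"},'
    '{"Id":3,"Url":"www.j.com.br","Type":"C"}]'
)

# group 1: the '"Url":"' prefix, kept unchanged
# (?:https?://)?: optional scheme, dropped
# group 2: the domain, everything up to the first '/' or the closing quote
# trailing [^"]*: the rest of the value (the path), dropped
URL_FIELD = re.compile(r'("Url":")(?:https?://)?([^/"]*)[^"]*')

cleaned = URL_FIELD.sub(r'\1\2', sample)
print(cleaned)
```

Because the pattern never crosses a double quote, each Url value is handled separately, and entries such as "Url":null or "ParseUrl":"x-art" do not match at all. If the string is always valid JSON, parsing it with json.loads and cleaning the Url of each item may be more robust than a regex on the raw text.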