<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" ...>
<metadata>...</metadata>
<trk>
<trkseg>
<trkpt lon="00" lat="00">
<ele>000</ele>
<time>2014-01-01T00:00:00.000Z</time>
<extensions>
<gpxtpx:TrackPointExtension>
<gpxtpx:hr>99</gpxtpx:hr>
</gpxtpx:TrackPointExtension>
</extensions>
</trkpt>
....
<trkpt ...>
...
<extensions>
...
</extensions>
</trkpt>
</trkseg>
</trk>
</gpx>
The system generates a <trkpt> element at each reading (geographical + physiological + other devices). I need to remove every <extensions> element inside each <trkpt>, together with all of its content. I tried using the ElementTree library with the following code:
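import xml.etree.ElementTree as ET

tree = ET.parse('input.gpx')
root = tree.getroot()
# Walk every <trkpt> under root[1][2]; list() takes a snapshot so removal is safe
for trkpt in list(root[1][2].iter('{http://www.topografix.com/GPX/1/1}trkpt')):
    ext = trkpt.find('{http://www.topografix.com/GPX/1/1}extensions')
    if ext is not None:
        # remove() only works when called on the element's direct parent
        trkpt.remove(ext)
tree.write('output.gpx')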
The code does remove the elements, but there are three things I don't like about it:
The first is that the library prefixes every element name with the XML namespace URI. I lost a lot of time before I understood why my code did not find the elements ...
The second is the root[1][2] indexing, which I need only to get a reference to the parent of the elements I want to remove. I would like to be able to reach the elements directly by calling root.iter('{...}extensions').
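To illustrate the kind of direct access I have in mind, here is a minimal sketch, assuming the GPX 1.1 namespace URI from the code above. The helper name strip_extensions and the inline sample document are made up for the example. ElementTree elements keep no reference to their parent, so the sketch iterates over the <trkpt> elements from the root (no positional indexing) and removes the <extensions> child from each one.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"  # namespace URI used in the code above


def strip_extensions(root):
    """Remove every <extensions> child of every <trkpt>; return how many were removed.

    Hypothetical helper, not part of ElementTree.
    """
    removed = 0
    # root.iter() searches the whole tree, so no root[1][2] indexing is needed.
    # list() takes a snapshot so the tree can be modified while looping.
    for trkpt in list(root.iter(f"{{{GPX_NS}}}trkpt")):
        for ext in trkpt.findall(f"{{{GPX_NS}}}extensions"):
            trkpt.remove(ext)  # remove() must be called on the direct parent
            removed += 1
    return removed


# Made-up sample standing in for the real file
sample = """<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">
  <trk><trkseg>
    <trkpt lat="0" lon="0"><ele>0</ele><extensions><hr>99</hr></extensions></trkpt>
    <trkpt lat="1" lon="1"><ele>1</ele><extensions><hr>98</hr></extensions></trkpt>
  </trkseg></trk>
</gpx>"""

root = ET.fromstring(sample)
count = strip_extensions(root)
remaining = root.findall(f".//{{{GPX_NS}}}extensions")
print(count, len(remaining))
```

This still goes through the parent (<trkpt>), because Element.remove() only accepts a direct child, but it no longer depends on the position of <trk> or <trkseg> in the file.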
And finally, the most serious problem: when writing the result to the file, I noticed that the library renames the tags and breaks the original format. The result looks like this:
<?xml version='1.0' encoding='UTF-8'?>
<ns0:gpx ...>
<ns0:metadata>...</ns0:metadata>
<ns0:trk>...</ns0:trk>
</ns0:gpx>
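For context on the ns0 prefix: ElementTree seems to invent prefixes such as ns0 for any namespace that has no registered prefix. Below is a minimal sketch, assuming the GPX 1.1 namespace URI from the code above and a made-up inline sample. It registers that URI with an empty prefix through ET.register_namespace, so the namespace is written out as the default one.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"  # namespace URI used in the code above

# Map the GPX namespace to the empty prefix before serializing.
# Note: this registration is global for the ElementTree module.
ET.register_namespace("", GPX_NS)

# Made-up sample standing in for the real file
sample = (
    '<gpx xmlns="http://www.topografix.com/GPX/1/1" version="1.1">'
    "<trk><trkseg/></trk></gpx>"
)
root = ET.fromstring(sample)

serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

Any other namespace in the file, such as the one bound to the gpxtpx prefix, would presumably need its own register_namespace call with its actual URI. Otherwise it would still receive a generated prefix.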
As I have no experience with this library, perhaps I missed some configuration option in my superficial reading of the documentation. I am looking for a solution to my problem, with this or another library.