How to index a standard Google Shopping XML feed in Sphinx Search?
XML:
<?xml version="1.0"?>
<rss version="2.0"
xmlns:g="http://base.google.com/ns/1.0">
<channel>
<title>The name of your data feed</title>
<link>http://www.example.com</link>
<description>A description of your content</description>
<item>
<title>Red wool sweater</title>
<link>http://www.example.com/item1-info-page.html</link>
<description>Comfortable and soft, this sweater will keep you warm on cold winter nights.</description>
<g:image_link>http://www.example.com/imagem1.jpg</g:image_link>
<g:price>25</g:price>
<g:condition>new</g:condition>
<g:id>1a</g:id>
</item>
<!-- ... -->
The Sphinx index will use the xmlpipe2 data source.
Will I need to convert the XML to the xmlpipe2 document format before indexing it?
xmlpipe2 document format:
<?xml version="1.0" encoding="utf-8"?>
<sphinx:docset>
<sphinx:schema>
<sphinx:field name="subject"/>
<sphinx:field name="content"/>
<sphinx:attr name="published" type="timestamp"/>
<sphinx:attr name="author_id" type="int" bits="16" default="1"/>
</sphinx:schema>
<sphinx:document id="1234">
<content>this is the main content <![CDATA[[and this <cdata> entry
must be handled properly by xml parser lib]]></content>
<published>1012325463</published>
<subject>note how field/attr tags can be
in <b class="red">randomized</b> order</subject>
<misc>some undeclared element</misc>
</sphinx:document>
<!-- ... -->
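To make the question concrete, here is a minimal sketch in Python of the kind of conversion I have in mind, assuming a conversion is needed at all. Everything named here is hypothetical and not taken from Sphinx or Google: the function `feed_to_xmlpipe2`, the field names `title` and `description`, and the attributes `price`, `feed_id` and `link`. Two assumptions are baked in. First, Sphinx document ids must be unique non-zero integers, and a `g:id` such as `1a` is not numeric, so the sketch numbers the items sequentially and keeps the original id as an attribute. Second, it assumes the `string` attribute type is available in the Sphinx version being used.

```python
import sys
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Namespace used by the g:* elements in the feed.
G = "{http://base.google.com/ns/1.0}"


def child_text(item, tag):
    """Return the stripped text of a direct child element, or "" if absent."""
    node = item.find(tag)
    return (node.text or "").strip() if node is not None else ""


def parse_price(raw):
    """Keep only the leading number of values such as "25" or "25.00 BRL"."""
    try:
        return float(raw.split()[0])
    except (IndexError, ValueError):
        return 0.0


def feed_to_xmlpipe2(feed_xml):
    """Convert a Google Shopping RSS feed (str or bytes) to an xmlpipe2 docset."""
    root = ET.fromstring(feed_xml)
    out = [
        '<?xml version="1.0" encoding="utf-8"?>',
        "<sphinx:docset>",
        "<sphinx:schema>",
        # Hypothetical schema: two full-text fields and three attributes.
        '<sphinx:field name="title"/>',
        '<sphinx:field name="description"/>',
        '<sphinx:attr name="price" type="float"/>',
        '<sphinx:attr name="feed_id" type="string"/>',
        '<sphinx:attr name="link" type="string"/>',
        "</sphinx:schema>",
    ]
    # Assumption: sequential integers are acceptable as document ids.
    for doc_id, item in enumerate(root.iter("item"), start=1):
        values = {
            "title": child_text(item, "title"),
            "description": child_text(item, "description"),
            "price": "%.2f" % parse_price(child_text(item, G + "price")),
            "feed_id": child_text(item, G + "id"),
            "link": child_text(item, "link"),
        }
        out.append('<sphinx:document id="%d">' % doc_id)
        for name, value in values.items():
            # escape() protects &, < and > inside the element text.
            out.append("<%s>%s</%s>" % (name, escape(value), name))
        out.append("</sphinx:document>")
    out.append("</sphinx:docset>")
    return "\n".join(out) + "\n"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python convert.py feed.xml  (convert.py is a hypothetical file name)
    with open(sys.argv[1], "rb") as fh:
        sys.stdout.buffer.write(feed_to_xmlpipe2(fh.read()).encode("utf-8"))
```

If this is the right direction, I assume the script would be wired into a `type = xmlpipe2` source through `xmlpipe_command`, so that the indexer reads the converted stream from its standard output. One caveat of the sequential ids is that they change whenever the item order in the feed changes, so a stable numeric mapping of `g:id` may be preferable.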