I'm building a Web API that generates an XML file. The XML is read several times a day, so on the first request I serialize the whole object and save the result to disk; for the next 24 hours the API reads the file from disk instead of serializing the entire object again.
I do this because the file is requested many times, the XML is large (some files are up to 300 MB), and the information can be cached for 24 hours.
The problem is the description field. I believe it could be 'compressed', or better, I could minify the XML before writing it to disk. For now I'm only trying to remove whitespace and line breaks from this field, which alone would already save a good few megabytes.
I'm using Web API in C#, with Redis and MSSQL.
Today I'm sending it like this:
<description><![CDATA[SOBRADO
Área Terreno: 8 x 28
Área Construída: 170m²
Pavimento Superior:
2 dormitórios sendo 1 dormitorio com armario embutido planejado e um maste
banheiro
jardim de inverno
sacada
Pavimento Térreo:
2 salas
Copa
Cozinha
Corredor lateral
jardim na frente
quintal
Edícula:
1 dormitórios
banheiro
lavanderia
deposito
4 vagas
IPTU R$ 1.200,00 anual]]></description>
I would like to send this:
<description><![CDATA[SOBRADO Área Terreno: 8 x 28 Área Construída: 170m²...
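To make the target concrete, the transformation amounts to collapsing every run of spaces, tabs, and line breaks into a single space. Below is a minimal sketch of that idea, not what my code currently does; the class `DescriptionMinifier` and the method `CollapseWhitespace` are hypothetical names used only for illustration, and it assumes `Regex.Replace` with the `\s+` pattern is acceptable for this field.

```csharp
using System;
using System.Text.RegularExpressions;

internal static class DescriptionMinifier
{
    // One or more consecutive whitespace characters (spaces, tabs, \r, \n).
    private static readonly Regex WhitespaceRun = new Regex(@"\s+", RegexOptions.Compiled);

    // Hypothetical helper: replaces each whitespace run with a single space
    // and trims leading/trailing whitespace. Null or empty input is returned unchanged.
    internal static string CollapseWhitespace(string text)
    {
        if (string.IsNullOrEmpty(text)) return text;
        return WhitespaceRun.Replace(text, " ").Trim();
    }
}
```

Under these assumptions, `DescriptionMinifier.CollapseWhitespace("SOBRADO\r\nÁrea Terreno: 8 x 28")` would produce `SOBRADO Área Terreno: 8 x 28`.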
I use two functions to try to clean up the text, but the result isn't quite what it should be.
description = Biblioteca.RemoveTroublesomeCharacters(Biblioteca.CorrigeDescricao(imovel.Descricao)),
internal static string CorrigeDescricao(string descricao)
{
    var tab = '\u0009';
    descricao = descricao.Replace(" ", " ");
    descricao = descricao.Replace("=\r\n", "");
    descricao = descricao.Replace(";\r\n", "");
    descricao = descricao.Replace("\t", " ");
    descricao = descricao.Replace(tab.ToString(), "");
    return RemoveHtml(descricao);
}
And
internal static string RemoveTroublesomeCharacters(string inString)
{
    if (inString == null) return null;
    var newString = new StringBuilder();
    char ch;
    for (int i = 0; i < inString.Length; i++)
    {
        ch = inString[i];
        // remove any characters outside the valid UTF-8 range as well as all control characters
        // except tabs and new lines
        //if ((ch < 0x00FD && ch > 0x001F) || ch == '\t' || ch == '\n' || ch == '\r')
        //if using .NET version prior to 4, use above logic
        if (XmlConvert.IsXmlChar(ch)) //this method is new in .NET 4
        {
            newString.Append(ch);
        }
    }
    return newString.ToString();
}