Xml string compression?  
Author Message
Sailu





PostPosted: Architecture General, Xml string compression? Top

I am looking for a way to compress a huge XML string.

Xceedsoft has a compression library, but should we really buy it just for that?

Any suggestions will be appreciated.

Thanks,

Sailu




 
 
SupunLiyanage






There are many commercial packages for XML compression, but you can also write your own "tag-based" compression module. It works like this:

Example: if you have a tag called <staff-members> that repeats many times (which is usually the case), you can predefine a shorter alias for it, such as <sf> (or whatever you like).

This reduces the length of the XML string. You then transmit it to the processing module, which replaces <sf> with <staff-members> again.

This is only one technique (one that I have used in my projects); there are many others.
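To make the idea concrete, here is a minimal sketch of the tag substitution in C#. The class name TagShortener and the alias table are made up for illustration and are not from any library. It assumes tags carry no attributes and that the short names do not already occur as tags in the document.

```csharp
using System;
using System.Collections.Generic;

public static class TagShortener
{
    // Hypothetical alias table: long tag name -> short name agreed on by both sides.
    private static readonly Dictionary<string, string> Aliases =
        new Dictionary<string, string>
        {
            { "staff-members", "sf" }
        };

    // Sender side: replace long tag names with their short aliases.
    public static string Shorten(string xml)
    {
        foreach (KeyValuePair<string, string> pair in Aliases)
        {
            xml = Rename(xml, pair.Key, pair.Value);
        }
        return xml;
    }

    // Receiver side: put the original tag names back.
    public static string Restore(string xml)
    {
        foreach (KeyValuePair<string, string> pair in Aliases)
        {
            xml = Rename(xml, pair.Value, pair.Key);
        }
        return xml;
    }

    // Renames only exact opening and closing tags that have no attributes.
    private static string Rename(string xml, string from, string to)
    {
        return xml.Replace("<" + from + ">", "<" + to + ">")
                  .Replace("</" + from + ">", "</" + to + ">");
    }
}
```

Calling TagShortener.Shorten on the sender and TagShortener.Restore on the receiver should round-trip the document as long as the assumptions above hold. A general-purpose compressor already shortens repeated tag names, so it is worth measuring whether this step still helps once the data is also gzipped.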



 
 
Pranshu






Is it for transferring between machines? If so, you could use GZIP compression.

Pranshu



 
 
Diego Dagum






Regarding Pranshu's suggestion: .NET 2.0 includes a new API dedicated to compressing and decompressing streams:

System.IO.Compression Namespace

Hope it helps



 
 
SupunLiyanage






It could be for transfer between machines over networks, or for use within applications.



 
 
Pranshu






The GZIP compression that I was talking about is useful for the network only. Most applications (yours or third-party) will support HTTP compression. It works like this: the web server (or the HTTP server, which could be your application) reads the client's "Accept-Encoding" header. If it finds values such as gzip or deflate there, it knows that the client can decompress gzipped content, so it can compress the file on the fly and send it to the client. This adds to the processing load but reduces network traffic, and it is a good strategy for transferring files and other data to remote destinations.
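The header check described above can be sketched as follows. This is a simplified illustration, not how any particular web server implements it: the class and method names are hypothetical, and the code ignores quality values (a client can send "gzip;q=0" to refuse gzip, which this sketch would not honor).

```csharp
using System;

public static class EncodingNegotiation
{
    // Returns "gzip", "deflate", or null when the client lists neither.
    // acceptEncoding is the raw value of the Accept-Encoding request header.
    public static string Choose(string acceptEncoding)
    {
        if (string.IsNullOrEmpty(acceptEncoding)) return null;

        bool gzip = false;
        bool deflate = false;
        foreach (string part in acceptEncoding.Split(','))
        {
            // Drop any ";q=..." parameter and surrounding whitespace.
            string name = part.Split(';')[0].Trim().ToLowerInvariant();
            if (name == "gzip") gzip = true;
            else if (name == "deflate") deflate = true;
        }

        // Prefer gzip when the client lists both.
        if (gzip) return "gzip";
        if (deflate) return "deflate";
        return null;
    }
}
```

A server would call EncodingNegotiation.Choose with the request header value, compress the response body accordingly, and set the Content-Encoding response header to the chosen value; when the result is null it sends the body uncompressed.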

As far as the application is concerned, it always sees uncompressed data. So this does not help you optimize memory utilization, because the data is not compressed while the application is reading it, nor when you serialize it to a database or file.

For transferring data within the same machine, or between machines that sit next to each other, it might be better to save the CPU cost of zipping and unzipping and transfer the files uncompressed.

Pranshu



 
 
Narayanan Dayalan






Hi Friends,

The code below can compress any kind of string. If you like it, feel free to use it.

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Compresses a string with GZip and returns the result as Base64.
// The first 4 bytes of the payload hold the uncompressed length in bytes.
public static string Compress(string text)
{
 byte[] buffer = Encoding.UTF8.GetBytes(text);
 using (MemoryStream ms = new MemoryStream())
 {
  // leaveOpen = true keeps ms usable after the GZipStream is disposed;
  // disposing the GZipStream flushes the remaining compressed data into ms.
  using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
  {
   zip.Write(buffer, 0, buffer.Length);
  }

  byte[] compressed = ms.ToArray();
  byte[] gzBuffer = new byte[compressed.Length + 4];
  System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
  System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
  return Convert.ToBase64String(gzBuffer);
 }
}

// Reverses Compress: decodes the Base64 text, reads the length prefix, and inflates the rest.
public static string Decompress(string compressedText)
{
 byte[] gzBuffer = Convert.FromBase64String(compressedText);
 int msgLength = BitConverter.ToInt32(gzBuffer, 0);
 byte[] buffer = new byte[msgLength];

 using (MemoryStream ms = new MemoryStream(gzBuffer, 4, gzBuffer.Length - 4))
 using (GZipStream zip = new GZipStream(ms, CompressionMode.Decompress))
 {
  // Read may return fewer bytes than requested, so loop until the buffer is full.
  int total = 0;
  while (total < buffer.Length)
  {
   int read = zip.Read(buffer, total, buffer.Length - total);
   if (read == 0) break; // end of stream
   total += read;
  }
 }

 return Encoding.UTF8.GetString(buffer);
}

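As a rule of thumb, the strings need to be longer than about 400 characters; otherwise the compression ratio is not good enough. GZip adds a fixed header and trailer, this code adds a 4-byte length prefix, and Base64 encoding grows the result by roughly a third, so a short string can come out longer than it went in.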
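To check where the break-even point lies for your own data, a small helper like the one below can report the compressed size. It is only a sketch; the class name GzipSize is made up, and the exact numbers depend on the .NET version and on the content.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipSize
{
    // Returns the number of bytes GZip produces for the UTF-8 form of text
    // (before any length prefix or Base64 encoding is applied).
    public static int CompressedBytes(string text)
    {
        byte[] input = Encoding.UTF8.GetBytes(text);
        using (MemoryStream ms = new MemoryStream())
        {
            // Disposing the GZipStream flushes all compressed data into ms.
            using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
            {
                zip.Write(input, 0, input.Length);
            }
            return (int)ms.Length;
        }
    }
}
```

Comparing GzipSize.CompressedBytes(xml) with Encoding.UTF8.GetByteCount(xml) for a few representative documents shows whether compression pays off: a tiny string such as "<a>1</a>" grows, while a long document with many repeated tags shrinks considerably.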

For more information, see http://www.csharphelp.com/archives4/archive689.html