Some info about LZX compressing

 
 

Introduction
Although LhA is the most commonly used archive format on the Amiga, and Zip is the most used format of them all (regardless of platform), I decided to use the LZX format on the Amiga911 boot disk. The reason is simply the great degree of compression that can be achieved with this archive format, which leads to much smaller archives than can be achieved with LhA and Zip.

One common trick for making archive files as small as possible is to first create an uncompressed archive that contains only files of a similar type, and then add this archive to a new one that uses maximum compression (thus compressing the first archive as a single unit). As an example, take my AnArchiver program, which can be found here. If you scroll down to the bottom of that page and take a look at BGPix.lha, Data.lha, Font.lha and all three IconButtons archives in the list, you may notice that they are all compressed. This is because I used the method described above when creating the AnArchiver.lha archive, which in the end led to a much smaller file.
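The effect of this trick can be tried out with a small sketch. The one below uses Python's zipfile module rather than LhA, and the file names and contents are made up purely for illustration: twenty files that share a 400-byte chunk which does not compress on its own. It only shows the general idea, not how any Amiga archiver works internally.

```python
import io
import random
import zipfile

def make_zip(files, compression):
    """Build an in-memory zip from a {name: bytes} mapping and return its raw bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=compression) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def random_bytes(rng, count):
    """Return `count` pseudo-random bytes (poorly compressible on their own)."""
    return bytes(rng.getrandbits(8) for _ in range(count))

# Hypothetical sample data: every file shares the same 400-byte chunk
# and ends with 100 bytes that are unique to that file.
rng = random.Random(0)
common = random_bytes(rng, 400)
files = {f"icon{i:02d}.info": common + random_bytes(rng, 100) for i in range(20)}

# Approach 1: every file is compressed on its own.
separate = make_zip(files, zipfile.ZIP_DEFLATED)

# Approach 2: store the files uncompressed, then compress that archive as one unit.
stored = make_zip(files, zipfile.ZIP_STORED)
nested = make_zip({"inner.zip": stored}, zipfile.ZIP_DEFLATED)

print("compressed separately:  ", len(separate), "bytes")
print("stored, then compressed:", len(nested), "bytes")
```

The second approach comes out smaller here because the compressor sees the shared chunk once and can refer back to it for every following file, something it cannot do when each file is compressed in isolation.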

The thing about LZX is that it already has this type of functionality built in: when creating an archive, it can first take several files of similar type and merge them into a single block, then compress the entire block and add it to the archive. Upon extraction, each file in the block is decompressed into its original form. The downside of file merging in LZX has to do with deleting files from the archive: if a file in a block is to be deleted, the rest of the files in the same block have to be recompressed as well, which may slow things down a bit.
This is the main reason why the default maximum merge size in LZX is set to 260KB (meaning that each block can be at most 260KB in size), but it is possible to change this with the -M option on the command line. When creating lzx archives, Amiga911 Maker sets this value to -M8000 and the compression level to -9. This means a maximum block size of 8MB and the maximum compression possible, all in order to create archives that are as small as possible. To make full use of LZX's "merge files" functionality, all archives are always created from scratch; no files are ever added to or deleted from an existing lzx file. This also means that any existing lzx archive in your project dir will be overwritten!
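The trade-off between merge size and deletion cost can be illustrated with a rough model. The sketch below is not LZX's actual algorithm (its grouping rules are not described here); it simply assumes that files are grouped by extension and packed into blocks up to a maximum size, and that the merge size is given in KB. The function names, file names and sizes are all hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def group_into_blocks(files, max_block_size):
    """Group (name, size) pairs by extension into blocks of at most max_block_size bytes.

    Illustrative model only; the real LZX grouping rules may differ.
    """
    by_type = defaultdict(list)
    for name, size in files:
        by_type[PurePosixPath(name).suffix.lower()].append((name, size))

    blocks = []
    for members in by_type.values():
        current, current_size = [], 0
        for name, size in members:
            # Start a new block when the next file would exceed the limit.
            if current and current_size + size > max_block_size:
                blocks.append(current)
                current, current_size = [], 0
            current.append(name)
            current_size += size
        if current:
            blocks.append(current)
    return blocks

def files_to_recompress(blocks, deleted):
    """Return the other files in the block that contains `deleted`."""
    for block in blocks:
        if deleted in block:
            return [name for name in block if name != deleted]
    return []

# Hypothetical file list: (name, size in bytes)
sample = [("a.info", 100_000), ("b.info", 100_000),
          ("c.info", 100_000), ("readme.txt", 5_000)]

small_blocks = group_into_blocks(sample, 260 * 1024)    # default merge size
large_blocks = group_into_blocks(sample, 8000 * 1024)   # like -M8000

print(files_to_recompress(small_blocks, "a.info"))
print(files_to_recompress(large_blocks, "a.info"))
```

In this model, the larger merge size puts all three .info files into one block, so deleting one of them means recompressing the other two, while the default limit keeps the affected block smaller.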


The 68000 CPU limitation
As mentioned above, a compression level of -9 is used when creating lzx archives. But if you use Amiga911 Maker on an Amiga with a 68000 or 68010 processor, LZX will use -3 compression instead of -9, since the latter isn't supported by the 68000 version of LZX. This results in slightly larger lzx archives. The same limitation applies when extracting files with 68000 LZX.
There are, however, no problems with UnLZX and the XADmaster system, since both are able to extract -9 compressed lzx archives.


The Y2K bug in LZX
If you take a look at the contents of lzx archives made in the year 2000 or later, you may notice that the file dates can be messed up (just look at the example below). This is because LZX suffers from the Y2K bug, which is actually the biggest flaw in LZX. There have been a few attempts at creating patches to fix this, but unfortunately they are not reliable in all situations. It is difficult to create a proper patch since no LZX source code is available. In the Amiga911 context, however, the date problem is not really relevant, since the UnLZX tool used for extracting simply ignores the dates in the archives and uses the current date instead.


An example of LZX vs. LhA & Zip
Just to give you an example of the advantages of LZX compared to LhA and Zip, I created a new project in Amiga911 Maker 1.46: a Normal boot disk with the default settings. In addition, I added the following programs to the project: FileMaster3, HDInstTools, JanoEditor and TransADF. After the lzx archives were created, I converted them to lha and zip archives as well (using the maximum compression LhA and Zip can offer). The results are shown below:
 

System1 archives             System2 archives             Programs archives
System1.lzx  286443 bytes    System2.lzx  193242 bytes    Programs.lzx  207434 bytes
System1.lha  335463 bytes    System2.lha  217647 bytes    Programs.lha  229964 bytes
System1.zip  347527 bytes    System2.zip  213303 bytes    Programs.zip  225815 bytes
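To put the numbers in the table above into perspective, the short sketch below adds up the three archives per format and works out how much smaller the lzx set is. The sizes are copied from the table; the variable and function names are just for illustration.

```python
# Archive sizes in bytes, copied from the table above
sizes = {
    "lzx": {"System1": 286443, "System2": 193242, "Programs": 207434},
    "lha": {"System1": 335463, "System2": 217647, "Programs": 229964},
    "zip": {"System1": 347527, "System2": 213303, "Programs": 225815},
}

# Total size of all three archives for each format
totals = {fmt: sum(archives.values()) for fmt, archives in sizes.items()}

def saving_percent(smaller, larger):
    """How much smaller `smaller` is than `larger`, in percent of `larger`."""
    return 100.0 * (larger - smaller) / larger

for fmt in ("lha", "zip"):
    saved = totals[fmt] - totals["lzx"]
    percent = saving_percent(totals["lzx"], totals[fmt])
    print(f"lzx vs {fmt}: {saved} bytes smaller ({percent:.1f}%)")
```

In total, the lzx archives are 95955 bytes smaller than the lha ones and 99526 bytes smaller than the zip ones, which is a lot on a floppy disk.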


You can take a look at the contents of the System1 archives by clicking the links. If you open the System1.lzx.text file, you can see that all files in the archive are grouped into 4 blocks: the first block contains all icons, the second contains preferences files + 1 datatype, the third contains executables and libraries, and the fourth contains ASCII text files. At the end of the archive there are some empty directories. Below each block you can see both the original and the compressed size of the entire block.

I decided to take the example above further, so I created the Amiga911 disk and then replaced the lzx archives with the lha and zip ones to really see the differences. I also replaced UnLZX (20.6KB) with LhEX (36.3KB) and UnZip (90.8KB) respectively for each test, and these were the final results:

LZX Disk: 726KB used, 153KB free (83%)
LhA Disk: 837.5KB used, 41.5KB free (95%)
Zip Disk: Could not fit all files, 16.5KB too much! *
 
*  This is because of the rather large UnZip tool; unfortunately, no smaller and simpler alternative to it exists.
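The percentages above follow directly from the used and free figures. As a quick check, the sketch below recomputes them; it assumes that "16.5KB too much" means the Zip variant would need 16.5KB more than the disk's capacity, and the function name is just for illustration.

```python
def disk_usage_percent(used_kb, free_kb):
    """Share of the disk that is in use, in percent."""
    return 100.0 * used_kb / (used_kb + free_kb)

# Used and free figures (in KB) from the results above
capacity_kb = 726 + 153                      # same total as 837.5 + 41.5
lzx_percent = disk_usage_percent(726, 153)
lha_percent = disk_usage_percent(837.5, 41.5)

# Assumption: the Zip disk overshoots the capacity by 16.5KB
zip_needed_kb = capacity_kb + 16.5

print(round(lzx_percent), round(lha_percent))   # prints: 83 95
print(zip_needed_kb - 726)                      # extra KB compared to the LZX disk
```

Both the LZX and LhA figures add up to the same usable capacity of 879KB, so the two results are consistent with each other.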


So why do I still use LhA & Zip?
Even though LZX offers superior archive compression compared to LhA and Zip, these two archivers have their share of advantages over LZX as well. One of these is that both LhA and Zip can take the name of a text file as an argument on the command line, where the text file contains a list of all files to add to, extract from or delete from the archive (instead of specifying the files directly). LZX, on the other hand, lacks this feature. Amiga911 Maker uses this method quite a lot for extracting files, and this is the main reason why all standard archives in the A911MakerData drawer are in the lha format instead of lzx.

Another thing is that the lzx archive format is much more obscure, and it's not really well supported on platforms other than the Amiga. As an example, there's only one tool for extracting lzx archives in Windows/DOS (afaik). Things are a bit different with the lha (aka lzh) format, since it's more common on other platforms. As a matter of fact, the original Amiga version of LhA was ported from the Unix version. Zip is of course the most used archive format of them all.





© Roger E. Håseth 2011 - 2014