• ZX0 variant only okay...

    From Harry Potter@21:1/5 to All on Sat Mar 11 11:18:47 2023
    Hi! I'm working on several compression techniques, and most of them do pretty well. Some of them are variants of the ZX0 compression technique for 8-bit systems. One of those variants is doing only okay, but another one is doing very well on most files.
    That second variant does significantly better on most files but significantly worse on one. :( I am adding the following to the base technique:

    * Adaptive Huffman codes
    * My own replacement of the Elias technique for writing lengths and part of LZ77 offsets
    * Last16, which shortens a repeated LZ77 block to a reference to the previous copy, based on the number of blocks back to that copy
    * Shortened literals when a literal is the same as one of the last two literals
    * A different way to do BPE

    I am asking for ideas to add to these. Thank you for listening.
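
For readers who want a baseline for the length-coding item in the list above, here is a minimal sketch of standard Elias gamma coding. It is not the replacement scheme described in the post, and it is not ZX0's interlaced bit layout either; it only shows the textbook code. The function names and the use of a '0'/'1' string in place of a real bit stream are assumptions made for illustration.

```python
def elias_gamma_encode(n: int) -> str:
    """Return the textbook Elias gamma code of n (n >= 1) as a '0'/'1' string."""
    if n < 1:
        raise ValueError("Elias gamma codes are defined for n >= 1")
    binary = bin(n)[2:]                      # binary digits, most significant first
    # (len - 1) zeros announce how many bits follow the leading 1
    return "0" * (len(binary) - 1) + binary


def elias_gamma_decode(bits: str, pos: int = 0):
    """Decode one value starting at bits[pos]; return (value, next_position)."""
    zeros = 0
    while bits[pos + zeros] == "0":          # count the unary length prefix
        zeros += 1
    end = pos + 2 * zeros + 1                # prefix + (zeros + 1) payload bits
    return int(bits[pos + zeros:end], 2), end


if __name__ == "__main__":
    stream = "".join(elias_gamma_encode(n) for n in (1, 2, 5))
    print(stream)
```

Small values get short codes (1 costs one bit, 2 and 3 cost three bits), which is why any replacement scheme has to beat this on the length distribution actually seen in the data.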

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Harry Potter@21:1/5 to All on Sun Apr 30 10:01:56 2023
    Okay. I added some optimizations to the Adaptive Huffman codes that sometimes update the Huffman code tree when writing an LZ77 or Last16 block or a reference to one of the last two literals (a derivative of my Placement Offset Basic technique; ask for more information). I also have some optimizations to LZ77 that will decrease the size of an LZ77 block if doing so produces better results. Right now, this technique does 2.6% better than Exomizer. My goal is 3% or better by the end of the day. I could apply the rllz technique mentioned in a previous post here. I plan to try it out at a later date but will need a way to identify which bytes are being filled by the rllz blocks. For now, I'm looking for other ways to do better. If you have any suggestions, I'd be happy to hear them.

  • From Harry Potter@21:1/5 to All on Sun Apr 30 10:21:01 2023
    I'm also skipping Adaptive Huffman codes when using them would result in larger data, sometimes updating the Huffman tree when one of the last two literals repeats during literal compression, and making certain literal compression techniques optional when they are not being used in a block of data.
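
The "skip the coder when it makes the data larger" idea can be sketched as a per-block fallback. This is only an illustration under stated assumptions: `zlib` stands in for the Adaptive Huffman coder from the post, the one-byte `RAW`/`PACKED` tags are hypothetical (a real 8-bit format would more likely spend a single flag bit), and the function names are made up.

```python
import zlib

RAW, PACKED = 0, 1  # hypothetical one-byte block tags


def pack_block(data: bytes, encode=zlib.compress) -> bytes:
    """Encode a block, but store it raw if encoding does not shrink it."""
    packed = encode(data)
    if len(packed) < len(data):
        return bytes([PACKED]) + packed
    return bytes([RAW]) + data               # fallback: encoder would grow the block


def unpack_block(block: bytes, decode=zlib.decompress) -> bytes:
    """Reverse pack_block by checking the tag byte."""
    tag, body = block[0], block[1:]
    return decode(body) if tag == PACKED else body


if __name__ == "__main__":
    for sample in (b"a" * 1000, b"xyz"):
        block = pack_block(sample)
        print(len(sample), "->", len(block), "tag", block[0])
```

With this layout the worst case is bounded at the raw size plus the tag, at the cost of trying the encoder on every block.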

  • From Harry Potter@21:1/5 to All on Fri May 5 04:05:44 2023
    I also have a version that uses MTF instead of Adaptive Huffman. Right now, that version is doing way too well to be true. In case it is true, what can I add to MTF to make it better?
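
For reference, a minimal sketch of the plain move-to-front (MTF) transform over bytes. This is the textbook transform, not the variant in the post; the function names are assumptions, and the output is a list of indices that would still need an entropy or length coder behind it.

```python
def mtf_encode(data: bytes) -> list:
    """Replace each byte by its current position in a recency list."""
    table = list(range(256))                 # initial order: 0, 1, ..., 255
    out = []
    for byte in data:
        index = table.index(byte)
        out.append(index)
        table.insert(0, table.pop(index))    # move the byte to the front
    return out


def mtf_decode(indices) -> bytes:
    """Invert mtf_encode by replaying the same list updates."""
    table = list(range(256))
    out = bytearray()
    for index in indices:
        byte = table.pop(index)
        out.append(byte)
        table.insert(0, byte)
    return bytes(out)


if __name__ == "__main__":
    print(mtf_encode(b"aaab"))
```

Recently used bytes map to small indices, so runs and locally repetitive data turn into many zeros and small numbers, which is what the following coder exploits.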

  • From Harry Potter@21:1/5 to All on Fri May 5 04:42:38 2023
    I was wrong, and I'm sorry. :'( I was exiting a loop at the first byte in a compressed block. I am still doing better, but not nearly as much better as I thought.

  • From Harry Potter@21:1/5 to All on Mon Jun 19 07:57:28 2023
    I'm doing much better with this variant now and am almost ready to start optimizing and debugging. But first, I want to gain a few more points on the compression ratio. I have several ways to optimize LZ77: somebody online told me about hash tables; while processing this information, I thought to use an 8k buffer of bits, where each bit records whether an associated word was already encountered; and somebody else helped me optimize the main loop, and I was testing the current word in the LZ77 sliding dictionary before checking for a match. Are there any other ways I can optimize LZ77? I am targeting 8-bit computers as well as more modern computers, so memory may sometimes be at a premium.
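
The bit-buffer idea can be sketched as follows, assuming "8k" means 8192 bytes (65536 bits) and "word" means a 16-bit pair of adjacent bytes, so there is exactly one bit per possible word. Both readings are assumptions, as are the class and function names. The sketch only decides which positions deserve a match search; it does not perform the search itself.

```python
class SeenPairs:
    """One bit per 16-bit word: 65536 bits packed into 8192 bytes."""

    def __init__(self):
        self.bits = bytearray(8192)

    def test(self, key: int) -> bool:
        return bool(self.bits[key >> 3] & (1 << (key & 7)))

    def mark(self, key: int) -> None:
        self.bits[key >> 3] |= 1 << (key & 7)


def pair_key(data: bytes, pos: int) -> int:
    """Combine two adjacent bytes into a 16-bit key (little-endian)."""
    return data[pos] | (data[pos + 1] << 8)


def positions_worth_searching(data: bytes) -> list:
    """Return positions whose 2-byte word occurred earlier in the data."""
    seen = SeenPairs()
    hits = []
    for pos in range(len(data) - 1):
        key = pair_key(data, pos)
        if seen.test(key):
            hits.append(pos)                 # a match of length >= 2 may exist
        else:
            seen.mark(key)                   # first sighting: no search needed
    return hits


if __name__ == "__main__":
    print(positions_worth_searching(b"abcabc"))
```

Because bits are never cleared, a word that has slid out of the LZ77 window still tests as seen, so the filter can report false positives but never false negatives; it only saves searches that are certain to fail.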
