Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Color index array as FIFO #94

Closed
phoboslab opened this issue Dec 19, 2021 · 7 comments
Closed

Color index array as FIFO #94

phoboslab opened this issue Dec 19, 2021 · 7 comments

Comments

@phoboslab
Copy link
Owner

As proposed by @nigeltao and advocated for in #48, the fifo-branch implements the color index as a FIFO, instead of using a hash. It is up to implementers how to search through this FIFO in the encoder. The naive approach would be a linear search. The implementation in the fifo-branch uses a secondary hash table.

master:

A running array[64] (zero-initialized) of previously seen pixel values is 
maintained by the encoder and decoder. Each pixel that is seen by the encoder
and decoder is put into this array at the position formed by a hash function of
the color value. In the encoder, if the pixel value at the index matches the
current pixel, this index position is written to the stream as QOI_OP_INDEX. 
The hash function for the index is:
	
	index_position = (r * 3 + g * 5 + b * 7 + a * 11) % 64

fifo:

A running array[64] (zero-initialized) of pixel values is maintained by the
encoder and decoder. Every pixel en-/decoded by the QOI_OP_DIFF, QOI_OP_LUMA, 
QOI_OP_RGB and QOI_OP_RGBA chunks is written to this array. The write position
starts at 0 and is incremented with each pixel written. The position wraps back
to 0 when it reaches 64. I.e:

	index[index_pos % 64] = current_pixel;
	index_pos = index_pos + 1;

An encoder can search this array for the current pixel value and, if a match is
found, emit a QOI_OP_INDEX with the position within the array.

Pros:

  • not having to dictate a specific hash function. Implementers can search the index in whichever way suits their needs. Probably nice for hardware implementations or platforms that have trouble with multiplications. In practice I doubt that it matters.
  • slight speedup in the decoder (245mpps vs. 224mpps on my machine; funny enough this is still slower than using lookup tables for the color hash)
  • makes the decoder simpler

Cons:

  • somewhat harder to explain. We have to be careful to specify when exactly the color index is written to and at which position
  • puts more burden on the implementation to make the encoder fast, instead of being fast by default
  • makes the encoder more complicated

Compression ratio with the FIFO is only slightly better in total (even when using an exhaustive linear search). Slightly worse with some particular images.

I'm still undecided. In theory it's cool to not specify the hash function, but in practice we don't gain much here and (imho) make it more complicated. I'm curious to hear if there are any objections, specifically from other implementers. @pfusik, @MasterQ32?

@chocolate42
Copy link

Using hash_pos during encode is a neat solution because AFAICT it's scalable, if encoder memory is very tight hash_pos can be some lower multiple of index size which only mildly decreases efficiency due to more collisions in hash_pos. Some higher multiple seems fine for normal use, and maximal compressors can always use linear search if they wish. Eliminating the hash entirely from the decode path seems very beneficial, IMO it's far more critical that the decode path be as simple as possible.

@pfusik
Copy link
Contributor

pfusik commented Dec 19, 2021

On x86, x * 3 and x * 5 correspond to single LEA instructions. No need for lookup tables.

I like the simpler and faster decoder and the option to trade encoding time for improved compression ratio. As I understand, FIFO can have better compression ratio for every possible input?

Is your FIFO encoder slower? How much?

@phoboslab
Copy link
Owner Author

As I understand, FIFO can have better compression ratio for every possible input?

No. Even with exhaustive search, some images encode slightly worse compared to master. I assume this is because it's beneficial for these images to refer to "older" pixels, further back then 64.

(lfifo is linear search, hfifo is the current hash impl in the fifo-branch)

## Total for images/textures_photo
        decode ms   encode ms   decode mpps   encode mpps   size kb    rate
qoi.lfifo:    6.0        30.2        173.32         34.73      1984   64.6%
qoi.hfifo:    6.0         9.0        174.84        115.99      1985   64.6%
qoi.master:   6.6         9.4        158.82        111.93      1981   64.5%

In the majority of cases, fifo compresses better, though. Sometimes substantially.

## Total for images/textures_pk
        decode ms   encode ms   decode mpps   encode mpps   size kb    rate
qoi.lfifo:    0.3         1.2        165.44         37.53        72   54.0%
qoi.hfifo:    0.3         0.4        163.83        107.48        73   54.7%
qoi.master:   0.3         0.4        147.57        104.48        75   56.5%

Totals:

# Grand total for images
        decode ms   encode ms   decode mpps   encode mpps   size kb    rate
qoi.lfifo:    1.9         7.8        243.15         59.79       460   28.0%
qoi.hfifo:    1.9         2.7        244.66        174.31       462   28.1%
qoi.master:   2.1         2.9        224.66        161.11       463   28.2%

Is your FIFO encoder slower? How much?

As you can see, encoding is a bit faster with the fifo (hash impl) compared to master, for reasons I don't understand.

@wbd73
Copy link

wbd73 commented Dec 19, 2021

Those numbers look good for the fifo approach and the source code only looks slightly more complicated.
An advantage of fifo is that each color is guaranteed to stay in the queue for 64 color changes.
It's more predictable. With the original code it can be much longer but in case of bad luck (hash collision) also very short.

I do agree that it is somewhat harder to explain how it works.

@ikskuh
Copy link

ikskuh commented Dec 19, 2021

I think i'd still prefer the hash variant with 35711, as it's easier to implement in general and easier to explain.

For performance comparison, one would need to implement a nice vectorized lookup function for the fifo, as this is something that can nicely be vectorized, so it might not have much impact on encoding speed.

I personally would probably stay with the hash impl

@pfusik
Copy link
Contributor

pfusik commented Dec 19, 2021

I think a "SIMD FIFO" encoder would be possible, with 64-entry std::map implemented with SIMD.

Your only cons are implementation complexity, but QOI remains straightforward compared to JPEG, PNG and even GIF.

@phoboslab
Copy link
Owner Author

After careful consideration, I have decided against this. It doesn't feel right. QOI emphasizes simplicity (and explaining how it works is part of that) so much everywhere else that it would be weird to sacrifice it here for slight gains in compression and throughput.

This makes the current specification in master final.

Thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants