Shorthand encoding for positions #68

Open
Heath123 opened this issue Mar 11, 2022 · 0 comments

Say we wanted to encode "123". The first occurrence of this in Pi is at position 1924. However, a shorter way to encode this would be to store "123", which is shorthand for "the byte at the position in Pi where the byte 123 can be found". This stores the same data as storing "1924", but in a shorter form. It also skips the costly Pi lookup step, drastically improving performance.

For example, the byte sequence:

FA 01 7A D7 12 0B

would be encoded as:

FA 01 7A D7 12 0B

A function to convert between plain bytes and shorthand Pi offsets could look like this pseudocode:

char encode(char original) {
  return byteAtPiPosition(findByteInPi(original));
}

char decode(char encoded) {
  return byteAtPiPosition(findByteInPi(encoded));
}

However, we can skip some steps here, for an optimised version:

char encode(char original) {
  return original;
}

char decode(char encoded) {
  return encoded;
}
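
To make the optimised version concrete, here is a minimal C sketch that applies it to the example byte sequence from above. The helper name shorthand_round_trip_ok is made up for illustration, and unsigned char is used instead of char purely as an assumption to avoid sign surprises with bytes like FA.

```c
#include <stddef.h>

/* Optimised shorthand encoding: each byte stands for "the byte found at
   the position in Pi where this byte occurs", which is the byte itself. */
static unsigned char encode(unsigned char original) {
    return original;
}

static unsigned char decode(unsigned char encoded) {
    return encoded;
}

/* Hypothetical helper (not part of PiFS): returns 1 if every byte in the
   buffer is stored unchanged and decodes back to the original, else 0. */
static int shorthand_round_trip_ok(const unsigned char *data, size_t len) {
    for (size_t i = 0; i < len; i++) {
        unsigned char stored = encode(data[i]);
        if (stored != data[i]) {
            return 0; /* the "metadata" differs from the data */
        }
        if (decode(stored) != data[i]) {
            return 0; /* decoding did not recover the original byte */
        }
    }
    return 1;
}
```

Calling shorthand_round_trip_ok on the bytes FA 01 7A D7 12 0B should return 1, which matches the claim that the encoded form is byte-for-byte the same as the input.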

This would bring many of the advantages of traditional filesystems to PiFS, such as high performance, and would reduce the size of the metadata.

As a bonus, this encoding is fully compatible with traditional filesystem drivers, due to the output metadata being readable as if it were the original data. Therefore, you don't even have to reformat your disk to use this new implementation of PiFS!

But wait, it gets even better! All we have to do to add support for PiFS to existing drivers, such as EXT4 and NTFS, is to inject the encode and decode functions wherever the drivers write to and read from the disk. So, a read like this:

int var = read_from_disk(position);

will have to be changed to this:

int var = decode(read_from_disk(position));

If we mark the functions with always_inline, or allow the compiler to automatically inline the functions, then it will get converted to this:

int var = read_from_disk(position);

You may notice that this is completely identical to the original code! This means that we can skip the step of modifying, recompiling and replacing the code entirely!
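
The equivalence can be checked with a small C sketch. Everything here is an assumption made for illustration: fake_disk is an in-memory array standing in for a real device, read_from_disk is a stand-in for the driver's read routine, and decode takes an int so that it matches the int returned by the read. Plain static inline is used rather than a compiler-specific always_inline attribute.

```c
#include <stddef.h>

/* Hypothetical in-memory "disk" standing in for a real block device. */
static const int fake_disk[] = {0xFA, 0x01, 0x7A, 0xD7, 0x12, 0x0B};

/* Hypothetical stand-in for the driver's read routine. */
static int read_from_disk(size_t position) {
    return fake_disk[position];
}

/* Shorthand decode: the stored value already is the data. */
static inline int decode(int encoded) {
    return encoded;
}

/* The read as a traditional driver would perform it. */
static int read_plain(size_t position) {
    return read_from_disk(position);
}

/* The same read with PiFS shorthand decoding injected. */
static int read_pifs(size_t position) {
    return decode(read_from_disk(position));
}
```

For every valid position, read_plain and read_pifs return the same value, so once decode is inlined a compiler is free to emit the same code for both.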

Here is a simple 0-step tutorial to switch from a traditional filesystem to this new version of PiFS:

And you're done!

Also, since you do not need to modify the code, this even works on proprietary drivers like the Windows NTFS one. In fact, you have already been using it for as long as you've been using a computer, without even knowing it. Amazing!
