AVRO-4067: Optimize First Byte of Long Decode #3183

Open: wants to merge 1 commit into base: main
23 changes: 18 additions & 5 deletions lang/java/avro/src/main/java/org/apache/avro/io/BinaryDecoder.java
@@ -184,10 +184,25 @@ public int readInt() throws IOException {
   @Override
   public long readLong() throws IOException {
     ensureBounds(10);
-    int b = buf[pos++] & 0xff;
-    int n = b & 0x7f;
+
+    /*
+     * Long values appear throughout the spec; for example, a string is encoded as
+     * a long followed by that many bytes of UTF-8 encoded character data. As a
+     * result, long values tend to be small and often fit within the first byte of
+     * the variable-length encoding, so the first byte is handled first. A set
+     * high-order bit means more bytes follow. Because a Java byte is signed, that
+     * bit is also the sign bit: a first byte >= 0 has no following bytes and can
+     * be decoded directly.
+     */
     long l;
-    if (b > 0x7f) {
+    int b, n;
+    if ((b = buf[pos++]) == 0) {
+      return 0;
+    } else if (b > 0) {
+      // back to two's-complement (zig-zag)
+      return (b >>> 1) ^ -(b & 1);
+    } else {
+      n = b & 0x7f;
       b = buf[pos++] & 0xff;
       n ^= (b & 0x7f) << 7;
       if (b > 0x7f) {
@@ -209,8 +224,6 @@ public long readLong() throws IOException {
       } else {
         l = n;
       }
-    } else {
-      l = n;
     }
     if (pos > limit) {
       throw new EOFException();
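
The first-byte shortcut in the change above can be illustrated outside of Avro. The following is a minimal sketch, not the code in BinaryDecoder: the class name FirstByteLongSketch and the method decode are hypothetical, the general path uses a simple loop rather than the unrolled branches in the diff, and no bounds or end-of-input checks are performed. It assumes the input is a zig-zag, variable-length encoded long as described in the comment.

```java
// Hypothetical standalone sketch; not part of Avro's BinaryDecoder.
final class FirstByteLongSketch {

  private FirstByteLongSketch() {
  }

  /**
   * Decodes one zig-zag, variable-length long starting at buf[pos].
   * Assumption for illustration: the caller guarantees enough bytes are
   * available, so there is no bounds checking here.
   */
  static long decode(byte[] buf, int pos) {
    // Read the byte as a signed value: negative means the high-order
    // (continuation) bit is set and more bytes follow.
    int b = buf[pos++];
    if (b >= 0) {
      // Fast path: the whole value is in the 7 payload bits of one byte.
      // Undo zig-zag: even -> n / 2, odd -> -(n + 1) / 2.
      return (b >>> 1) ^ -(b & 1);
    }
    // General path: accumulate 7 bits per byte, least significant group first.
    long n = b & 0x7f;
    int shift = 7;
    do {
      b = buf[pos++];
      n |= (long) (b & 0x7f) << shift;
      shift += 7;
    } while (b < 0 && shift < 70); // at most 10 bytes for a 64-bit value
    return (n >>> 1) ^ -(n & 1);
  }
}
```

For example, under these assumptions the single bytes 0x00, 0x01, 0x02 and 0x7f decode to 0, -1, 1 and -64 through the fast path, while the two-byte sequence 0x80 0x01 decodes to 64 through the general path.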