[Kernel] Load the protocol and metadata from the CRC files when available #4077

Merged
merged 62 commits into from
Jan 31, 2025
Changes from 5 commits
dd9fe7b
[Kernel] Support for loading protocol and metadata from checksum file…
vkorukanti May 15, 2024
df12241
refactor according to the comments
huan233usc Jan 21, 2025
a404846
add logging and fix comments
huan233usc Jan 22, 2025
1e410c5
move comments
huan233usc Jan 22, 2025
1994b2e
update comments
huan233usc Jan 22, 2025
ddc0815
update comments
huan233usc Jan 22, 2025
02067f3
replace static test table with test generated
huan233usc Jan 22, 2025
dec4938
use optional in building crc and use rfii in crc test
huan233usc Jan 22, 2025
6eff126
resolve comments
huan233usc Jan 22, 2025
69c29df
update comments
huan233usc Jan 22, 2025
fec680e
fix test
huan233usc Jan 22, 2025
60981a6
updated param name
huan233usc Jan 22, 2025
97bc44f
update doc
huan233usc Jan 22, 2025
2f5b1a2
fix idention
huan233usc Jan 22, 2025
c7bf343
update the doc to reflect nullness
huan233usc Jan 22, 2025
dbf4490
fix javafmt
huan233usc Jan 22, 2025
f19d48b
check crc info's version, use snapshot for lower bound
huan233usc Jan 22, 2025
25912ed
update comments
huan233usc Jan 22, 2025
44c8445
clean up tests
huan233usc Jan 23, 2025
4122d70
clean unused test methods
huan233usc Jan 23, 2025
017260c
revert unused test methods
huan233usc Jan 23, 2025
ab1115a
clean up unused import
huan233usc Jan 23, 2025
7d89e55
revert unused test methods
huan233usc Jan 23, 2025
2270126
resolve comments
huan233usc Jan 23, 2025
ab3d62e
handle edge case
huan233usc Jan 23, 2025
78bb03c
prefer use crc over checkpoint
huan233usc Jan 23, 2025
9949e90
fix java format
huan233usc Jan 23, 2025
e1532e7
add tests and fix listing bug
huan233usc Jan 23, 2025
69acace
refactor per comments
huan233usc Jan 23, 2025
9993886
handle version is 0
huan233usc Jan 23, 2025
70829a0
merge from latest version
huan233usc Jan 23, 2025
733eaed
fix java format
huan233usc Jan 23, 2025
4202b31
rever accident deleted changes in conflict resolve
huan233usc Jan 23, 2025
37f4251
refactor
huan233usc Jan 23, 2025
7494fd3
fix test attempt 1
huan233usc Jan 23, 2025
3f0e05f
fix test attempt 2
huan233usc Jan 23, 2025
4bb7bad
add tests
huan233usc Jan 24, 2025
e42b70e
fix indent
huan233usc Jan 24, 2025
18ecc74
fix test
huan233usc Jan 24, 2025
7383397
add docs
huan233usc Jan 24, 2025
afd5db1
fix comment
huan233usc Jan 24, 2025
ec3a24b
fix comment
huan233usc Jan 24, 2025
412e0fb
resolve comments
huan233usc Jan 24, 2025
e3d86a9
format java
huan233usc Jan 24, 2025
0dbea3f
update check
huan233usc Jan 24, 2025
588b464
format java
huan233usc Jan 24, 2025
155cf2b
update internal utils to move filter away
huan233usc Jan 24, 2025
8fe533b
fix comment
huan233usc Jan 28, 2025
119d0e0
take while
huan233usc Jan 28, 2025
c189537
adding header
huan233usc Jan 28, 2025
cf54166
fix typo
huan233usc Jan 28, 2025
21ed78f
fix typo
huan233usc Jan 28, 2025
5bf479e
switch to checkpoint opt
huan233usc Jan 29, 2025
8a96791
remove unused import
huan233usc Jan 29, 2025
9882740
merge with conflict
huan233usc Jan 30, 2025
d05450d
resolving conflict 1 - remove unnecessary files
huan233usc Jan 30, 2025
4e7078d
resolving conflict 2 fix tests
huan233usc Jan 30, 2025
2408be8
resolving conflict 3 resolve accidental deleted suites
huan233usc Jan 30, 2025
aa82629
empty line
huan233usc Jan 30, 2025
5b6c03b
fix year
huan233usc Jan 30, 2025
bdba22a
fix year
huan233usc Jan 30, 2025
a1611b8
use inmemory list
huan233usc Jan 30, 2025
@@ -0,0 +1,65 @@
/*
* Copyright (2024) The Delta Lake Project Authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package io.delta.kernel.internal.replay;

import io.delta.kernel.data.ColumnarBatch;
import io.delta.kernel.engine.Engine;
import io.delta.kernel.internal.actions.Metadata;
import io.delta.kernel.internal.actions.Protocol;
import io.delta.kernel.types.StructType;

public class CRCInfo {

public static CRCInfo fromColumnarBatch(
Engine engine, long version, ColumnarBatch batch, int rowId) {
// fromColumnVector already takes care of nulls
Protocol protocol = Protocol.fromColumnVector(batch.getColumnVector(PROTOCOL_ORDINAL), rowId);
Metadata metadata = Metadata.fromColumnVector(batch.getColumnVector(METADATA_ORDINAL), rowId);
return new CRCInfo(version, metadata, protocol);
}

// We can add additional fields later
public static final StructType FULL_SCHEMA =
new StructType().add("protocol", Protocol.FULL_SCHEMA).add("metadata", Metadata.FULL_SCHEMA);

private static final int PROTOCOL_ORDINAL = 0;
private static final int METADATA_ORDINAL = 1;

private final long version;
private final Metadata metadata;
private final Protocol protocol;

protected CRCInfo(long version, Metadata metadata, Protocol protocol) {
this.version = version;
this.metadata = metadata;
this.protocol = protocol;
}

/** The version of the Delta table that this CRCInfo represents. */
public long getVersion() {
return version;
}

/** The {@link Metadata} stored in this CRCInfo. May be null. */
public Metadata getMetadata() {
return metadata;
}

/** The {@link Protocol} stored in this CRCInfo. May be null. */
public Protocol getProtocol() {
return protocol;
}
}
@@ -0,0 +1,129 @@
/*
* Copyright (2024) The Delta Lake Project Authors.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package io.delta.kernel.internal.replay;

import static io.delta.kernel.internal.replay.CRCInfo.fromColumnarBatch;
import static io.delta.kernel.internal.util.FileNames.checksumFile;
import static io.delta.kernel.internal.util.FileNames.isChecksumFile;
import static io.delta.kernel.internal.util.Utils.singletonCloseableIterator;

import io.delta.kernel.data.ColumnarBatch;
import io.delta.kernel.engine.Engine;
import io.delta.kernel.internal.fs.Path;
import io.delta.kernel.internal.util.FileNames;
import io.delta.kernel.utils.CloseableIterator;
import io.delta.kernel.utils.FileStatus;
import java.util.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/** Utility methods for loading the protocol and metadata from the Delta log checksum files. */
public class ChecksumReader {
private static final Logger logger = LoggerFactory.getLogger(ChecksumReader.class);

/**
* Load the protocol and metadata from the checksum file at the given version. If the checksum
* file is not found at the given version, try to find the latest checksum file that was
* created at or after the lower bound version and within the last 100 versions.
*
* @param engine the engine to use for reading the checksum file
* @param logPath the path to the Delta log
* @param readVersion the version to read the checksum file from
* @param lowerBoundOpt the inclusive lower bound version to search for the checksum file
* @return Optional {@link CRCInfo} containing the protocol and metadata, and the version of the
* checksum file. If no checksum file is found, an empty Optional is returned.
*/
public static Optional<CRCInfo> getCRCInfo(
Engine engine, Path logPath, long readVersion, Optional<Long> lowerBoundOpt) {
logger.info("Loading CRC file for version {}", readVersion);
// First try to load the CRC at given version. If not found or failed to read then try to
// find the latest CRC file that is created after the lower bound version or within the last 100
Collaborator: that is created at or after the lower bound version

Collaborator: "if no lower bound is provided." --> The lower bound is provided; it is not optional. We can remove this comment.

// versions if no lower bound is provided.
Path crcFilePath = checksumFile(logPath, readVersion);
Optional<CRCInfo> crcInfoOpt = readChecksumFile(engine, crcFilePath);
if (crcInfoOpt.isPresent()
||
// we don't expect any more checksum files as it is the first version
readVersion == 0) {
return crcInfoOpt;
}

// Try to list the last 100 CRC files and see if we can find a CRC that we can use
long lowerBound = Math.max(lowerBoundOpt.orElse(0L), Math.max(0, readVersion - 100));
logger.info(
"CRC file for version {} not found, attempt to loading version up to {}",
Collaborator: "CRC file for version {} not found, listing CRC files from version {}"

readVersion,
lowerBound);

Path lowerBoundFilePath = checksumFile(logPath, lowerBound);
try (CloseableIterator<FileStatus> crcFiles =
engine.getFileSystemClient().listFrom(lowerBoundFilePath.toString())) {
List<FileStatus> crcFilesList = new ArrayList<>();
crcFiles
.filter(file -> isChecksumFile(new Path(file.getPath())))
.forEachRemaining(crcFilesList::add);
Collaborator: @scottsand-db I wonder if this can be refactored to use any of the log listing methods in your PR?

// pick the last file which is the latest version that has the CRC file
if (crcFilesList.isEmpty()) {
logger.warn("No checksum files found in the range {} to {}", lowerBound, readVersion);
return Optional.empty();
}

FileStatus latestCRCFile = crcFilesList.get(crcFilesList.size() - 1);
return readChecksumFile(engine, new Path(latestCRCFile.getPath()));
} catch (Exception e) {
Collaborator: Is the intention to catch unchecked exceptions here as well? Just wondering. Otherwise, I wonder if we should be more specific and target only IOExceptions or something similar.

Collaborator (Author): It was added in the patched PR. I think IOException is better, but keeping Exception is not a deal breaker, so I didn't change it.
Updated.

logger.warn("Failed to list checksum files from {}", lowerBoundFilePath, e);
return Optional.empty();
}
}

private static Optional<CRCInfo> readChecksumFile(Engine engine, Path filePath) {
try (CloseableIterator<ColumnarBatch> iter =
engine
.getJsonHandler()
.readJsonFiles(
singletonCloseableIterator(FileStatus.of(filePath.toString())),
CRCInfo.FULL_SCHEMA,
Optional.empty())) {
// We do this instead of iterating through the rows or using `getSingularRow` so we
// can use the existing fromColumnVector methods in Protocol, Metadata, Format etc
if (!iter.hasNext()) {
logger.warn("Checksum file is empty: {}", filePath);
return Optional.empty();
}

ColumnarBatch batch = iter.next();
if (batch.getSize() != 1) {
String msg = "Expected exactly one row in the checksum file {}, found {} rows";
logger.warn(msg, filePath, batch.getSize());
return Optional.empty();
}

long crcVersion = FileNames.checksumVersion(filePath);

CRCInfo versionStats = fromColumnarBatch(engine, crcVersion, batch, 0 /* rowId */);
if (versionStats.getMetadata() == null || versionStats.getProtocol() == null) {
logger.warn("Invalid checksum file missing protocol and/or metadata: {}", filePath);
return Optional.empty();
}
return Optional.of(versionStats);
} catch (Exception e) {
// This can happen when the version does not have a checksum file
logger.warn("Failed to read checksum file {}", filePath, e);
return Optional.empty();
}
}
}
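The fallback search in `getCRCInfo` can be illustrated in isolation. The sketch below is not the Kernel implementation: the class `CrcCandidateSketch` and both method names are hypothetical, and it works on plain file names instead of `FileStatus` objects. It reproduces the lower-bound arithmetic from the diff and then picks the latest checksum version in the search window. It also assumes that checksum files newer than `readVersion` should be ignored; the diff as shown takes the last listed checksum file and does not state this upper bound explicitly.

```java
import java.util.List;
import java.util.Optional;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CrcCandidateSketch {
  // Same shape as the checksum file name pattern in the diff: "<digits>.crc".
  private static final Pattern CRC_NAME = Pattern.compile("(\\d+)\\.crc");

  /** Inclusive lower bound: the later of the caller's bound and readVersion - 100, never below 0. */
  public static long lowerBound(long readVersion, Optional<Long> lowerBoundOpt) {
    return Math.max(lowerBoundOpt.orElse(0L), Math.max(0, readVersion - 100));
  }

  /**
   * Latest checksum version within [lowerBound, readVersion], or empty if there is none.
   * The upper bound on readVersion is an assumption of this sketch.
   */
  public static OptionalLong latestCrcVersion(
      List<String> fileNames, long lowerBound, long readVersion) {
    return fileNames.stream()
        .map(CRC_NAME::matcher)
        .filter(Matcher::matches) // keep only names that look like checksum files
        .mapToLong(m -> Long.parseLong(m.group(1)))
        .filter(v -> v >= lowerBound && v <= readVersion)
        .max();
  }

  public static void main(String[] args) {
    List<String> names =
        List.of(
            "00000000000000000010.crc",
            "00000000000000000010.json",
            "00000000000000000012.crc",
            "00000000000000000015.crc");
    long bound = lowerBound(13, Optional.empty());
    System.out.println(latestCrcVersion(names, bound, 13));
  }
}
```

Taking the maximum version instead of the last listed element avoids relying on the order in which the listing returns files, at the cost of scanning every candidate.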
@@ -208,6 +208,27 @@ protected Tuple2<Protocol, Metadata> loadTableProtocolAndMetadata(
return new Tuple2<>(snapshotHint.get().getProtocol(), snapshotHint.get().getMetadata());
}

// Compute the lower bound for the CRC search
// If the snapshot hint is present, we can use it as the lower bound for the CRC search.
// TODO: this can be further improved to make the lower bound until the checkpoint version
Optional<Long> crcSearchLowerBound = snapshotHint.map(SnapshotHint::getVersion);
Optional<CRCInfo> crcInfoOpt =
ChecksumReader.getCRCInfo(engine, logSegment.logPath, snapshotVersion, crcSearchLowerBound);
if (crcInfoOpt.isPresent()) {
CRCInfo crcInfo = crcInfoOpt.get();
if (crcInfo.getVersion() == snapshotVersion) {
// CRC is related to the desired snapshot version. Load protocol and metadata from CRC.
return new Tuple2<>(crcInfo.getProtocol(), crcInfo.getMetadata());
}
// We found the protocol and metadata in a version older than the one we are looking
// for. We need to replay the actions to get the latest protocol and metadata, but
// update the hint to read the actions from the version we found to check if the
// protocol and metadata are updated in the versions after the one we found.
snapshotHint =
Optional.of(
new SnapshotHint(crcInfo.getVersion(), crcInfo.getProtocol(), crcInfo.getMetadata()));
}

Protocol protocol = null;
Metadata metadata = null;
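
The way this hunk consumes the CRC lookup result can be reduced to two decisions, shown here on version numbers only. This is a simplified sketch: `CrcHintSketch` and its methods are hypothetical names, and the real code carries `Protocol` and `Metadata` objects alongside the version. A checksum for exactly the snapshot version answers the request directly. An older checksum replaces the snapshot hint, so log replay only has to inspect the versions after it.

```java
import java.util.Optional;

public class CrcHintSketch {
  /** True when the checksum found belongs to exactly the snapshot version being loaded. */
  public static boolean crcAnswersDirectly(long snapshotVersion, Optional<Long> crcVersion) {
    return crcVersion.isPresent() && crcVersion.get() == snapshotVersion;
  }

  /**
   * Version to use as the hint for log replay. Because the CRC search starts at the hint
   * version, a checksum that was found is at least as recent as the hint and replaces it.
   */
  public static Optional<Long> replayHintVersion(
      Optional<Long> hintVersion, Optional<Long> crcVersion) {
    return crcVersion.isPresent() ? crcVersion : hintVersion;
  }

  public static void main(String[] args) {
    long snapshotVersion = 20;
    Optional<Long> hint = Optional.of(10L);
    Optional<Long> crc = Optional.of(17L);
    System.out.println(crcAnswersDirectly(snapshotVersion, crc));
    System.out.println(replayHintVersion(hint, crc));
  }
}
```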

@@ -44,6 +44,8 @@ private FileNames() {}

public static final String SIDECAR_DIRECTORY = "_sidecars";

private static final Pattern checksumFileRegex = Pattern.compile("(\\d+)\\.crc");

/** Returns the delta (json format) path for a given delta file. */
public static String deltaFile(Path path, long version) {
return String.format("%s/%020d.json", path, version);
@@ -75,6 +77,15 @@ public static String sidecarFile(Path path, String sidecar) {
return String.format("%s/%s/%s", path.toString(), SIDECAR_DIRECTORY, sidecar);
}

/** Returns the path to the checksum file for the given version. */
public static Path checksumFile(Path path, long version) {
return new Path(path, String.format("%020d.crc", version));
}

/** Returns the version encoded in the name of the given checksum file. */
public static long checksumVersion(Path path) {
return Long.parseLong(path.getName().split("\\.")[0]);
}

/**
* Returns the prefix of all delta log files for the given version.
*
@@ -150,6 +161,10 @@ public static boolean isCommitFile(String fileName) {
|| UUID_DELTA_FILE_REGEX.matcher(filename).matches();
}

public static boolean isChecksumFile(Path checksumFilePath) {
return checksumFileRegex.matcher(checksumFilePath.getName()).matches();
}

/**
* Get the version of the checkpoint, checksum or delta file. Throws an error if an unexpected
* file type is seen. These unexpected files should be filtered out to ensure forward
@@ -161,8 +176,8 @@ public static long getFileVersion(Path path) {
return checkpointVersion(path);
} else if (isCommitFile(path.getName())) {
return deltaVersion(path);
// } else if (isChecksumFile(path)) {
// checksumVersion(path);
} else if (isChecksumFile(path)) {
return checksumVersion(path);
} else {
throw new IllegalArgumentException(
String.format("Unexpected file type found in transaction log: %s", path));
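
The checksum file naming helpers added to `FileNames` form a round trip: format a version into a zero-padded name, recognise the name, and parse the version back. The standalone sketch below mirrors that logic on plain strings; the class `ChecksumFileNameSketch` and its method names are hypothetical, and it does not use the Kernel `Path` type.

```java
import java.util.regex.Pattern;

public class ChecksumFileNameSketch {
  // Same pattern as checksumFileRegex in the diff.
  private static final Pattern CHECKSUM_FILE = Pattern.compile("(\\d+)\\.crc");

  /** Zero-pads the version to 20 digits so that names sort lexicographically by version. */
  public static String checksumFileName(long version) {
    return String.format("%020d.crc", version);
  }

  /** True when the whole file name is digits followed by ".crc". */
  public static boolean isChecksumFileName(String name) {
    return CHECKSUM_FILE.matcher(name).matches();
  }

  /** Parses the version from the part of the name before the first dot. */
  public static long checksumVersion(String name) {
    return Long.parseLong(name.split("\\.")[0]);
  }

  public static void main(String[] args) {
    String name = checksumFileName(12);
    System.out.println(name + " -> " + checksumVersion(name));
  }
}
```

The fixed-width padding is what allows a listing that starts at the lower-bound file name to return checksum files in version order.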
@@ -75,6 +75,16 @@ public static FileStatus of(String path, long size, long modificationTime) {
return new FileStatus(path, size, modificationTime);
}

/**
* Create a {@link FileStatus} with the given path with size and modification time set to 0.
*
* @param path Fully qualified file path.
* @return {@link FileStatus} object
*/
public static FileStatus of(String path) {
return new FileStatus(path, 0 /* size */, 0 /* modTime */);
}

@Override
public boolean equals(Object o) {
if (this == o) {
Binary file not shown.
Binary file not shown.
@@ -0,0 +1 @@
{"tableSizeBytes":0,"numFiles":0,"numMetadata":1,"numProtocol":1,"protocol":{"minReaderVersion":1,"minWriterVersion":4},"metadata":{"id":"332a246e-eea2-43a5-8422-b65526c6cbe9","format":{"provider":"parquet","options":{}},"schemaString":"{\"type\":\"struct\",\"fields\":[{\"name\":\"name\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"age\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"birthday\",\"type\":\"date\",\"nullable\":true,\"metadata\":{}}]}","partitionColumns":["birthday"],"configuration":{"delta.enableChangeDataFeed":"true"},"createdTime":1664214549691},"histogramOpt":{"sortedBinBoundaries":[0,8192,16384,32768,65536,131072,262144,524288,1048576,2097152,4194304,8388608,12582912,16777216,20971520,25165824,29360128,33554432,37748736,41943040,50331648,58720256,67108864,75497472,83886080,92274688,100663296,109051904,117440512,125829120,130023424,134217728,138412032,142606336,146800640,150994944,167772160,184549376,201326592,218103808,234881024,251658240,268435456,285212672,301989888,318767104,335544320,352321536,369098752,385875968,402653184,419430400,436207616,452984832,469762048,486539264,503316480,520093696,536870912,553648128,570425344,587202560,603979776,671088640,738197504,805306368,872415232,939524096,1006632960,1073741824,1140850688,1207959552,1275068416,1342177280,1409286144,1476395008,1610612736,1744830464,1879048192,2013265920,2147483648,2415919104,2684354560,2952790016,3221225472,3489660928,3758096384,4026531840,4294967296,8589934592,17179869184,34359738368,68719476736,137438953472,274877906944],"fileCounts":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"totalBytes":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]},"txnId":"eabe3042-b6e7-4710-aa7f-70ab5cfac6cb"}
@@ -0,0 +1,3 @@
{"commitInfo":{"timestamp":1664214549848,"userId":"7953272455820895","userName":"[email protected]","operation":"CREATE TABLE","operationParameters":{"isManaged":"true","description":null,"partitionBy":"[\"birthday\"]","properties":"{\"delta.enableChangeDataFeed\":\"true\"}"},"notebook":{"notebookId":"3194333061073108"},"clusterId":"0819-204509-hill72","isolationLevel":"WriteSerializable","isBlindAppend":true,"operationMetrics":{},"engineInfo":"Databricks-Runtime/11.x-snapshot-aarch64-scala2.12","txnId":"eabe3042-b6e7-4710-aa7f-70ab5cfac6cb"}}
{"protocol":{"minReaderVersion":1,"minWriterVersion":4}}
{"metaData":{"id":"332a246e-eea2-43a5-8422-b65526c6cbe9","format":{"provider":"parquet","options":{}},"schemaString":"{\"type\":\"struct\",\"fields\":[{\"name\":\"name\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"age\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"birthday\",\"type\":\"date\",\"nullable\":true,\"metadata\":{}}]}","partitionColumns":["birthday"],"configuration":{"delta.enableChangeDataFeed":"true"},"createdTime":1664214549691}}
@@ -0,0 +1 @@
{"tableSizeBytes":791,"numFiles":1,"numMetadata":1,"numProtocol":1,"protocol":{"minReaderVersion":1,"minWriterVersion":4},"metadata":{"id":"332a246e-eea2-43a5-8422-b65526c6cbe9","format":{"provider":"parquet","options":{}},"schemaString":"{\"type\":\"struct\",\"fields\":[{\"name\":\"name\",\"type\":\"string\",\"nullable\":true,\"metadata\":{}},{\"name\":\"age\",\"type\":\"integer\",\"nullable\":true,\"metadata\":{}},{\"name\":\"birthday\",\"type\":\"date\",\"nullable\":true,\"metadata\":{}}]}","partitionColumns":["birthday"],"configuration":{"delta.enableChangeDataFeed":"true"},"createdTime":1664214549691},"histogramOpt":{"sortedBinBoundaries":[0,8192,16384,32768,65536,131072,262144,524288,1048576,2097152,4194304,8388608,12582912,16777216,20971520,25165824,29360128,33554432,37748736,41943040,50331648,58720256,67108864,75497472,83886080,92274688,100663296,109051904,117440512,125829120,130023424,134217728,138412032,142606336,146800640,150994944,167772160,184549376,201326592,218103808,234881024,251658240,268435456,285212672,301989888,318767104,335544320,352321536,369098752,385875968,402653184,419430400,436207616,452984832,469762048,486539264,503316480,520093696,536870912,553648128,570425344,587202560,603979776,671088640,738197504,805306368,872415232,939524096,1006632960,1073741824,1140850688,1207959552,1275068416,1342177280,1409286144,1476395008,1610612736,1744830464,1879048192,2013265920,2147483648,2415919104,2684354560,2952790016,3221225472,3489660928,3758096384,4026531840,4294967296,8589934592,17179869184,34359738368,68719476736,137438953472,274877906944],"fileCounts":[1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],"totalBytes":[791,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]},"txnId":"f7b5a77a-f86d-40b6-b266-5da97aecbe6e"}
@@ -0,0 +1,2 @@
{"commitInfo":{"timestamp":1664214552033,"userId":"7953272455820895","userName":"[email protected]","operation":"WRITE","operationParameters":{"mode":"Append","partitionBy":"[]"},"notebook":{"notebookId":"3194333061073108"},"clusterId":"0819-204509-hill72","readVersion":0,"isolationLevel":"WriteSerializable","isBlindAppend":true,"operationMetrics":{"numFiles":"1","numOutputRows":"1","numOutputBytes":"791"},"engineInfo":"Databricks-Runtime/11.x-snapshot-aarch64-scala2.12","txnId":"f7b5a77a-f86d-40b6-b266-5da97aecbe6e"}}
{"add":{"path":"birthday=2020-01-01/part-00000-21994048-f0e7-4655-a705-09e597411943.c000.snappy.parquet","partitionValues":{"birthday":"2020-01-01"},"size":791,"modificationTime":1664214552000,"dataChange":true,"stats":"{\"numRecords\":1,\"minValues\":{\"name\":\"1\",\"age\":1},\"maxValues\":{\"name\":\"1\",\"age\":1},\"nullCount\":{\"name\":0,\"age\":0}}","tags":{"INSERTION_TIME":"1664214552000000","MIN_INSERTION_TIME":"1664214552000000","MAX_INSERTION_TIME":"1664214552000000","OPTIMIZE_TARGET_SIZE":"268435456"}}}