Improve inline verifier mismatch errors #361

driv3r · 2024-08-29T11:17:20Z

No description provided.

driv3r

New message format:

From

[paginationKeys: 7 (source: boo, target: aaa) ]

To

[PKs: 7 (type: content difference, source: boo, target: aaa, column: name) ]

When a row is missing on either side, the source and target values are omitted from the message.

driv3r · 2024-09-10T10:50:03Z

inline_verifier.go

+	MismatchRowMissingOnSource    mismatchType = "row missing on source"
+	MismatchRowMissingOnTarget    mismatchType = "row missing on target"
+	MismatchContentDifference     mismatchType = "content difference"
+	MismatchChecksumDifference    mismatchType = "rows checksum difference"


Added a checksum difference type next to the content one, to differentiate between the whole-row checksum comparison and the per-column data comparison.

Aren't content and checksum differences the same thing? Especially now that we report missing rows and columns separately. If all the columns are present and the content differs, won't the checksum differ too?

There's a difference, because we check the checksum of the whole row and also compare each row's columns separately. The checksum difference is for when the SQL query generating the checksum for the whole row produces a value that doesn't match the one from the source, while the content difference is for when we compare each column of a row and encounter an issue there.

🤔 If I understand this right, then maybe we should rename these to MismatchFieldDifference and MismatchRowChecksumDifference?

driv3r · 2024-09-10T10:50:24Z

inline_verifier.go

 type InlineVerifierMismatches struct {
 	Pk             uint64
 	SourceChecksum string
 	TargetChecksum string
+	MismatchColumn string
+	MismatchType   mismatchType


Flattened the structure a bit, since we always want the mismatch type to be present.

driv3r · 2024-09-10T10:51:03Z

inline_verifier.go

+				if mismatch.TargetChecksum != "" {
+					messageBuf.WriteString(", target: ")
+					messageBuf.WriteString(mismatch.TargetChecksum)
+				}


The source and target checksums are optional: we don't include them when a row is missing on either the source or the target.
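For readers following along, here is a minimal sketch of how one entry of the new message format could be assembled with optional fields. The `mismatch` type and `formatMismatch` function are hypothetical simplifications for illustration, not the PR's actual code; only the field names and the output shape are taken from the description above.

```go
package main

import (
	"fmt"
	"strings"
)

// mismatch is a simplified, hypothetical stand-in for InlineVerifierMismatches.
type mismatch struct {
	Pk             uint64
	Type           string
	SourceChecksum string
	TargetChecksum string
	Column         string
}

// formatMismatch renders a single entry and skips every optional field that is empty.
func formatMismatch(m mismatch) string {
	var buf strings.Builder
	fmt.Fprintf(&buf, "%d (type: %s", m.Pk, m.Type)
	if m.SourceChecksum != "" {
		buf.WriteString(", source: ")
		buf.WriteString(m.SourceChecksum)
	}
	if m.TargetChecksum != "" {
		buf.WriteString(", target: ")
		buf.WriteString(m.TargetChecksum)
	}
	if m.Column != "" {
		buf.WriteString(", column: ")
		buf.WriteString(m.Column)
	}
	buf.WriteString(")")
	return buf.String()
}

func main() {
	fmt.Println(formatMismatch(mismatch{
		Pk: 7, Type: "content difference",
		SourceChecksum: "boo", TargetChecksum: "aaa", Column: "name",
	}))
	// A missing row carries no checksums, so only the type is printed.
	fmt.Println(formatMismatch(mismatch{Pk: 8, Type: "row missing on target"}))
}
```

Treating the empty string as "not set" keeps the struct flat, at the cost of not being able to distinguish an empty checksum from an absent one.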

driv3r · 2024-09-10T10:56:40Z

inline_verifier.go

 			mismatchSet[paginationKey] = InlineVerifierMismatches{
 				Pk:             paginationKey,
+				MismatchType:   MismatchChecksumDifference,


I couldn't find any reason why the !exists check came after the bytes comparison. It's counterintuitive, as we should first check whether the rows are there.

That's why I've split the !exists check from !bytes.Equal(sourceHash, targetHash), so we can properly define the mismatch type.

driv3r · 2024-09-10T10:57:20Z

inline_verifier.go

-		if !bytes.Equal(sourceHash, targetHash) || !exists {
+	for paginationKey, _ := range source {
+		_, exists := target[paginationKey]
+		if !exists {


If rows exist in both source and target, they were already compared above, so there's no need to check them again.
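The two-pass comparison described in the last two comments could look roughly like the sketch below. It assumes the first loop iterates over the target rows, which the visible diff hunks suggest but do not fully show; `compareHashes` and the unexported constant names are hypothetical, while the string values come from the diff above.

```go
package main

import (
	"bytes"
	"fmt"
)

type mismatchType string

const (
	rowMissingOnSource mismatchType = "row missing on source"
	rowMissingOnTarget mismatchType = "row missing on target"
	checksumDifference mismatchType = "rows checksum difference"
)

// compareHashes is a hypothetical helper that classifies each differing row.
func compareHashes(source, target map[uint64][]byte) map[uint64]mismatchType {
	mismatches := make(map[uint64]mismatchType)

	// First pass: check existence before comparing bytes, so that a missing
	// row is never reported as a checksum difference.
	for key, targetHash := range target {
		sourceHash, exists := source[key]
		if !exists {
			mismatches[key] = rowMissingOnSource
		} else if !bytes.Equal(sourceHash, targetHash) {
			mismatches[key] = checksumDifference
		}
	}

	// Second pass: existence only. Rows present on both sides were already
	// compared in the first pass.
	for key := range source {
		if _, exists := target[key]; !exists {
			mismatches[key] = rowMissingOnTarget
		}
	}
	return mismatches
}

func main() {
	source := map[uint64][]byte{1: []byte("a"), 2: []byte("b"), 3: []byte("c")}
	target := map[uint64][]byte{1: []byte("a"), 2: []byte("x"), 4: []byte("d")}
	fmt.Println(compareHashes(source, target))
}
```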

driv3r · 2024-09-10T10:59:03Z

inline_verifier.go

+					MismatchColumn: colName,
+				}
+				break // no need to compare other columns
+			} else if !bytes.Equal(sourceData, targetData) {


Again, split the !exists check from the bytes comparison.

For the bytes comparison, generate checksums of the column data (to avoid storing data that might be PII).

driv3r · 2024-09-10T11:00:01Z

inline_verifier.go

-				break
+		for colName := range sourceDecompressedColumns {
+			_, exists := targetDecompressedColumns[colName]
+			if !exists {


Again, there's no point in comparing the data a second time, since the result would be the same as in the loop above. Check for column existence only.
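A rough sketch of the column-level comparison discussed in the last two comments follows. Several things here are assumptions rather than details from the PR: the `compareColumns` and `checksum` helpers, the choice of SHA-256 as the hash, the "column missing on ..." type strings, and the sorted iteration order (added so that the first reported column is deterministic).

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// columnMismatch is a hypothetical result type for this sketch.
type columnMismatch struct {
	Type, Column, SourceChecksum, TargetChecksum string
}

// checksum returns a hex digest so that raw column data, which might be PII,
// never ends up in the mismatch report. SHA-256 is an assumption.
func checksum(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

func sortedKeys(m map[string][]byte) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// compareColumns reports the first differing column of one row, if any.
func compareColumns(source, target map[string][]byte) (columnMismatch, bool) {
	// First loop: existence check, then content comparison.
	for _, col := range sortedKeys(target) {
		sourceData, exists := source[col]
		if !exists {
			return columnMismatch{Type: "column missing on source", Column: col}, true
		}
		if !bytes.Equal(sourceData, target[col]) {
			return columnMismatch{
				Type:           "content difference",
				Column:         col,
				SourceChecksum: checksum(sourceData),
				TargetChecksum: checksum(target[col]),
			}, true
		}
	}
	// Second loop: existence only, since shared columns were compared above.
	for _, col := range sortedKeys(source) {
		if _, exists := target[col]; !exists {
			return columnMismatch{Type: "column missing on target", Column: col}, true
		}
	}
	return columnMismatch{}, false
}

func main() {
	source := map[string][]byte{"id": []byte("1"), "name": []byte("boo")}
	target := map[string][]byte{"id": []byte("1"), "name": []byte("aaa")}
	m, found := compareColumns(source, target)
	fmt.Println(found, m.Type, m.Column)
}
```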

driv3r · 2024-09-10T11:01:20Z

inline_verifier.go

-		}
+	compressedMismatch := compareDecompressedData(sourceData, targetData)
+	for paginationKey, mismatch := range compressedMismatch {
+		mismatches[paginationKey] = mismatch


With this, even if a hash comparison has already flagged the row, we get more detail on the column involved.
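The overlay in this hunk can be illustrated with a small sketch: column-level results replace hash-level results for the same pagination key. The `mismatch` type and `mergeMismatches` function are hypothetical, and unlike the diff, this version copies into a new map instead of mutating the existing one.

```go
package main

import "fmt"

// mismatch is a simplified, hypothetical stand-in for InlineVerifierMismatches.
type mismatch struct {
	Type   string
	Column string
}

// mergeMismatches overlays column-level results on top of hash-level ones,
// so the more detailed entry wins whenever both exist for a key.
func mergeMismatches(hashLevel, columnLevel map[uint64]mismatch) map[uint64]mismatch {
	merged := make(map[uint64]mismatch, len(hashLevel)+len(columnLevel))
	for key, m := range hashLevel {
		merged[key] = m
	}
	for key, m := range columnLevel {
		merged[key] = m
	}
	return merged
}

func main() {
	hashLevel := map[uint64]mismatch{
		7: {Type: "rows checksum difference"},
		9: {Type: "row missing on target"},
	}
	columnLevel := map[uint64]mismatch{
		7: {Type: "content difference", Column: "name"},
	}
	fmt.Println(mergeMismatches(hashLevel, columnLevel))
}
```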

driv3r · 2024-09-10T11:01:57Z

inline_verifier.go

+		for tableName, _ := range mismatches[schemaName] {
+			sortedTables = append(sortedTables, tableName)
+		}
+		sort.Strings(sortedTables)


Sort tables so the output order is predictable.
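Go randomizes map iteration order, which is why the sort matters for stable error messages and test assertions. A minimal sketch of the pattern, with a hypothetical `sortedTableNames` helper and made-up table names:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedTableNames returns the map's keys in ascending order, so callers can
// iterate over tables deterministically.
func sortedTableNames(mismatchesByTable map[string][]uint64) []string {
	names := make([]string, 0, len(mismatchesByTable))
	for name := range mismatchesByTable {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	mismatches := map[string][]uint64{"users": {7}, "accounts": {1, 2}, "orders": {42}}
	for _, table := range sortedTableNames(mismatches) {
		fmt.Printf("%s: %v\n", table, mismatches[table])
	}
}
```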

ilikeorangutans · 2024-09-10T15:07:29Z

inline_verifier_test.go

Proper unit tests! 🎉

driv3r and others added 4 commits September 3, 2024 14:37

Basic cleanup and unit test harness

853bebf

added mismatches to InlineVerifierMismatches

7d4ce23

added todo

6fd817c

formatMessage for the verify result

2dd8c90

driv3r force-pushed the improve-inline-verifier-mismatch-errors branch from 154e234 to 2dd8c90 Compare September 3, 2024 12:40

Run standard go test as well

3045341

driv3r self-assigned this Sep 10, 2024

driv3r marked this pull request as ready for review September 10, 2024 11:06

driv3r requested review from ilikeorangutans, coding-chimp and mtaner September 10, 2024 11:07

Squash types, make sure mismatchType is always there

53c5c66

driv3r force-pushed the improve-inline-verifier-mismatch-errors branch from 0593a70 to 53c5c66 Compare September 10, 2024 14:05

ilikeorangutans reviewed Sep 10, 2024

ilikeorangutans approved these changes Sep 10, 2024

driv3r added 3 commits September 10, 2024 18:57

Use better naming for mismatch types

535a320

Extend tests to have better error messages

e6eb592

Update asserted verifier error messages

c8e3c3f

driv3r merged commit d0a5d63 into main Sep 11, 2024
9 of 10 checks passed

driv3r deleted the improve-inline-verifier-mismatch-errors branch September 11, 2024 12:45
