-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented pre-filtering procedure on block metadata #1503
base: master
Are you sure you want to change the base?
Implemented pre-filtering procedure on block metadata #1503
Conversation
Co-authored-by: Johannes Kalmbach <[email protected]>
Co-authored-by: Johannes Kalmbach <[email protected]>
Co-authored-by: Johannes Kalmbach <[email protected]>
Co-authored-by: Johannes Kalmbach <[email protected]>
…neral implementation
Co-authored-by: Johannes Kalmbach <[email protected]>
Co-authored-by: Johannes Kalmbach <[email protected]>
…ng correct prefix in Constants.h
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1503 +/- ##
==========================================
+ Coverage 88.27% 88.29% +0.02%
==========================================
Files 361 363 +2
Lines 27198 27373 +175
Branches 3663 3692 +29
==========================================
+ Hits 24009 24170 +161
- Misses 1954 1958 +4
- Partials 1235 1245 +10 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for this intial implementation,
I have left some comments, I have looked at everything but the unit tests so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really going into the right direction, keep up the good work.
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some reviews,
This looks really good, I just have some questions about the complex setUnion
implementation, I'll think about it, you'll think about it, and then we'll find a solution:)
public: | ||
virtual ~PrefilterExpression() = default; | ||
|
||
virtual std::unique_ptr<PrefilterExpression> logicalComplement() const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a short comment for this function please.
// to be smaller than Ids of all other types. | ||
static auto getTiedMaskedTriple(const BlockMetadata::PermutedTriple& triple, | ||
size_t ignoreIndex = 3) { | ||
static const auto undefined = Id::makeUndefined(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a function Id::min()
which does the same, but is clearer here.
otherwise you can static_assert
that makeUndefined().getBits() == 0.
// HELPER FUNCTIONS | ||
//______________________________________________________________________________ | ||
// Given a PermutedTriple retrieve the suitable Id w.r.t. a column (index). | ||
static auto getIdFromColumnIndex(const BlockMetadata::PermutedTriple& triple, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function returns an ID, you can explicitly state the return type.
static const auto undefined = Id::makeUndefined(); | ||
switch (ignoreIndex) { | ||
case 3: | ||
return std::tie(triple.col0Id_, triple.col1Id_, triple.col2Id_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::tie returns a tuple of references, make sure that the lifetime works, or retunr std::tuple{...}
"values."); | ||
} | ||
// Check for order of IDs on evaluation Column. | ||
if (checkInvalid(std::greater<>{}, evaluationColumn + 1)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need fully sorted blocks, so you can always pass 3
as the second argument.
getTiedMaskedTriple(block2.lastTriple_, evalCol); | ||
} else { | ||
return lastTripleBlock1 < firstTripleBlock2; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IU somehow think that you here are masking in the wrong direction.
Can't you just compare the full triples on the first block
block1.firstTriple < block2.firstTriple
, and if they are equal you can even AD_CORRECTNESS_CHECK
that then the last triples are also equal, because they are the same block (otherwise you have violated invariants).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you can use the logic you already have, but without any masking (which I think shouldn't be necessary and again also think that it is wrong.
try { | ||
expression->evaluate(input, evaluationColumn); | ||
FAIL() << "Expected thrown error message."; | ||
} catch (const std::runtime_error& error) { | ||
EXPECT_EQ(std::string(error.what()), expectedErrorMessage); | ||
} catch (...) { | ||
FAIL() << "Expected an error of type std::runtime_error, but a " | ||
"different error was thrown."; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have AD_EXPECT_THROW_WITH_MESSAGE
int GtestHelpers.h for exactly this case.
using ad_utility::testing::VocabId; | ||
using sparqlExpression::TestContext; | ||
using namespace prefilterExpressions; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In test, move everything into the anonymouse namespace { <everything here }
to avoid linker problems with common names in different tests.
{blocks.b1, blocks.b2, blocks.b5, blocks.b6, blocks.b7, blocks.b8, | ||
blocks.b9, blocks.b10, blocks.b11, blocks.b12, blocks.b13, blocks.b14}); | ||
|
||
testExpression(2, IntId(-12), blocks.blocks, {blocks.b7, blocks.b14}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 2
and the blocks.blocks
are always redundant, write a common lambda that factors out that code, such that only the "moving parts" are visible (same for all tests below probably)
{blocks.b5, blocks.b6, blocks.b7, blocks.b11, blocks.b12, | ||
blocks.b13, blocks.b14}); | ||
testExpression(2, IntId(100), blocks.blocks, | ||
{blocks.b1, blocks.b2, blocks.b3, blocks.b4, blocks.b5, | ||
blocks.b6, blocks.b7, blocks.b8, blocks.b9, blocks.b10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if you write this test inside a test fixture (TEST_F in Gtest, just google it),which has the blocks b1 - b... directly as members you can always write b1
instead of blocks.b1
which also increases the readability.
What is added so far to this pull requst:
CompressedBlockPrefiltering.h
/CompressedBlockPrefiltering.cpp
CompressedBlockPrefilteringTest.cpp