-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PersonalizeSink: add client and configuration classes #4803
Conversation
Signed-off-by: Ivan Tse <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this is looking good. I have a few suggestions and comments that should be fairly straight-forward.
testImplementation testLibs.slf4j.simple | ||
} | ||
|
||
test { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need these three lines. The root project provides this configuration.
* Initialize {@link PersonalizeSinkService} | ||
*/ | ||
private void doInitializeInternal() { | ||
sinkInitialized = Boolean.TRUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sinkInitialized = Boolean.TRUE; | |
sinkInitialized = true; |
* Implementation class of personalize-sink plugin. It is responsible for receiving the collection of | ||
* {@link Event} and uploading to amazon personalize. | ||
*/ | ||
@DataPrepperPlugin(name = "personalize", pluginType = Sink.class, pluginConfigurationType = PersonalizeSinkConfiguration.class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm a user looking at Data Prepper documentation, I'm not sure I'd know what "personalize" means. Maybe aws_personalize
would be clearer? I'm ok keeping it as you have it as well, but want to point out the possible confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree with that. I'll change it to aws_personalize
this.personalizeSinkConfig = personalizeSinkConfig; | ||
this.sinkContext = sinkContext; | ||
|
||
sinkInitialized = Boolean.FALSE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sinkInitialized = Boolean.FALSE; | |
sinkInitialized = false; |
* @param records received records and add into buffer. | ||
*/ | ||
void output(Collection<Record<Event>> records) { | ||
LOG.info("{} records received", records.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOG.info("{} records received", records.size()); | |
LOG.trace("{} records received", records.size()); |
This is something that you would mostly use for testing the sink itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes this is placeholder until the sink gets implemented. Will change to trace
implementation 'org.jetbrains.kotlin:kotlin-stdlib:1.9.22' | ||
implementation libs.avro.core | ||
implementation(libs.hadoop.common) { | ||
exclude group: 'org.eclipse.jetty' | ||
exclude group: 'org.apache.hadoop', module: 'hadoop-auth' | ||
exclude group: 'org.apache.zookeeper', module: 'zookeeper' | ||
} | ||
implementation libs.parquet.avro | ||
implementation 'software.amazon.awssdk:apache-client' | ||
implementation 'org.jetbrains.kotlin:kotlin-stdlib-common:1.9.22' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implementation 'org.jetbrains.kotlin:kotlin-stdlib:1.9.22' | |
implementation libs.avro.core | |
implementation(libs.hadoop.common) { | |
exclude group: 'org.eclipse.jetty' | |
exclude group: 'org.apache.hadoop', module: 'hadoop-auth' | |
exclude group: 'org.apache.zookeeper', module: 'zookeeper' | |
} | |
implementation libs.parquet.avro | |
implementation 'software.amazon.awssdk:apache-client' | |
implementation 'org.jetbrains.kotlin:kotlin-stdlib-common:1.9.22' |
I do not believe you need any of these items. Certainly you shouldn't need Avro, Parquet, and Hadoop. And unless you are using a library that requires Kotlin, you can remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes let me clean up the build.gradle file
* Class responsible for creating PersonalizeEventsClient object, check thresholds, | ||
* get new buffer and write records into buffer. | ||
*/ | ||
public class PersonalizeSinkService { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public class PersonalizeSinkService { | |
class PersonalizeSinkService { |
Make this package protected since you will probably only use it in PersonalizeSink
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
import software.amazon.awssdk.core.retry.RetryPolicy; | ||
import software.amazon.awssdk.services.personalizeevents.PersonalizeEventsClient; | ||
|
||
public final class ClientFactory { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public final class ClientFactory { | |
final class ClientFactory { |
I think this class can also be package protected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do
personalizeSink = createObjectUnderTest(); | ||
Assertions.assertNotNull(personalizeSink); | ||
personalizeSink.doInitialize(); | ||
assertTrue(personalizeSink.isReady(), "personalize sink is not initialized and not ready to work"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertTrue(personalizeSink.isReady(), "personalize sink is not initialized and not ready to work"); | |
assertTrue(personalizeSink.isReady(), "Expected the personalize sink to be ready, but it is reporting it is not ready."); |
I think the current message may be confusing if the test fails in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, will change the messaging
void test_personalize_Sink_plugin_isReady_negative() { | ||
personalizeSink = createObjectUnderTest(); | ||
Assertions.assertNotNull(personalizeSink); | ||
assertFalse(personalizeSink.isReady(), "personalize sink is initialized and ready to work"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assertFalse(personalizeSink.isReady(), "personalize sink is initialized and ready to work"); | |
assertFalse(personalizeSink.isReady(), "Expected the personalize sink to report that it is not ready, but it is reporting it is ready."); |
Thank you @ivan-tse for starting this work and making this initial contribution! |
Signed-off-by: Ivan Tse <[email protected]>
@Override | ||
public void doInitialize() { | ||
try { | ||
doInitializeInternal(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This doesn't have to be a function if the function is just setting a boolean. Do you expect this function to do something different in future? If yes, keep it like it is now. Other wise I think it is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change this and add back in the future if necessary
Looks good to me. I can approve this once all tests pass. David is away this week, but I see that you have addressed all his comments. |
Signed-off-by: Ivan Tse <[email protected]>
Signed-off-by: Ivan Tse <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this PR mainly just boilerplate and configuration! Made it much easier to review
|
||
@AssertTrue(message = "sts_role_arn must be an IAM Role", groups = PersonalizeAdvancedValidation.class) | ||
boolean isValidStsRoleArn() { | ||
final Arn arn = getArn(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you will want to return false here if this method throws an IllegalArgumentException. Have you tested with invalid arn format and observed the exception?
Also this role should be optional in this configuration, so adding a null check to return true in the method should be enough. The reason it is optional here is because users can configure a default role in the data-prepper-config.yaml
(#4559)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes I have tested with invalid arn format.
I didn't know about the default in data-prepper-config.yaml
. I'll make this optional
Signed-off-by: Ivan Tse <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Thanks for this contribution and making the requested changes!
Description
Add client and configuration classes for Personalize Sink
Issues Resolved
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.