Update the reference paper for the SAC #268

Open
wants to merge 3 commits into base: master


merlintang
Contributor

What changes were proposed in this pull request?

Update the SAC reference paper in the readme.

How was this patch tested?

No test is needed.

@merlintang
Contributor Author

Can someone review this? Thanks, @ankuriitg @HeartSaVioR.

@HeartSaVioR
Collaborator

I'm sorry, but I'm not sure we can link to a paper that requires an account (meaning payment) to view.

I'm not sure whether you can make it public. If you can, please change the link and request a review again.

@HeartSaVioR
Collaborator

Thanks for updating. As I also explained in #260, we're working with the legal team to formalize a contribution guide with ICLA/CCLA for this project so that we can accept contributions from outside Cloudera. Thanks for your patience.

README.md Outdated
@@ -111,6 +111,10 @@ When running on cluster node, you will also need to distribute this keytab, belo

When Spark application is started, it will transparently track the execution plan of submitted SQL/DF transformations, parse the plan and create related entities in Atlas.

Reference
===
- Mingjie Tang, Saisai Shao, Weiqing Yang, Yanbo Liang, Yongyang Yu, Bikas Saha, Dongjoon Hyun. [SAC: A System for Big Data Lineage Tracking](http://merlintang.github.io/paper/sac_icde.pdf). In IEEE 35th International Conference on Data Engineering (ICDE), 2019
Contributor


Rather than referring to an individual's GitHub link, why not just have the PDF as part of the SAC repo itself?

Instead of a separate section, we could just list it under the "Spark Atlas Connector" section in the README: just a line like the one below, with the link.
SAC: A System for Big Data Lineage Tracking

Collaborator

@HeartSaVioR Jun 26, 2019


I wouldn't include anything unless it's safe to claim that it's under Cloudera's copyright, or the paper's license is clearly compatible with Apache License V2. Even if it is compatible, we would need to mention it explicitly in LICENSE. So why not just link to it and avoid dealing with any license issue?

Contributor Author


It is removed.

Collaborator

@HeartSaVioR Jun 26, 2019


No, I meant that linking to an external copy would be OK unless access is restricted. It's a different story if we "include" the paper in the repo as part of its content, which is what I was pointing out.

@merlintang
Contributor Author

merlintang commented Jun 25, 2019 via email

@merlintang
Contributor Author

merlintang commented Jun 26, 2019 via email

3 participants