Releases · G-Research/spark-dgraph-connector
v0.4.2 (Spark 2.4) - 2020-07-28
Fixed
- Fixed dependency conflicts between the connector's dependencies and Spark.
v0.4.1 (Spark 3.0) - 2020-07-27
Added
- Add an example of how to load Dgraph data in PySpark.
Fixed
- Fixed dependency conflicts between the connector's dependencies and Spark.
v0.4.1 (Spark 2.4) - 2020-07-27
Added
- Add an example of how to load Dgraph data in PySpark; a Scala sketch of the same load follows below.
Fixed
- Fixed dependency conflicts between the connector's dependencies and Spark.
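For reference, a minimal Scala sketch of loading triples through Spark's generic DataFrame reader, which is also the route PySpark takes; the format name `uk.co.gresearch.spark.dgraph.triples` and the target `localhost:9080` are assumptions based on the connector's package naming, not taken from these notes:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder()
  .appName("dgraph-connector-example")
  .getOrCreate()

// Assumed format name and Dgraph Alpha target; adjust to your deployment.
val triples: DataFrame = spark.read
  .format("uk.co.gresearch.spark.dgraph.triples")
  .load("localhost:9080")

triples.show()
```

From PySpark, the same format string would be passed to `spark.read.format(...)`.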
v0.4.0 (Spark 3.0) - 2020-07-24
Added
- Add Spark filter pushdown and projection pushdown to improve efficiency when loading only subgraphs. Filters like `.where($"revenue".isNotNull)` and projections like `` .select($"subject", $"`dgraph.type`", $"revenue") `` will be pushed to Dgraph so that only the relevant graph data is read (issue #7); see the sketch after this list.
- Improve performance of `PredicatePartitioner` for multiple predicates per partition. Restores the default of 1000 predicates per partition from before 0.3.0 (issue #22).
- The `PredicatePartitioner` combined with `UidRangePartitioner` is now the default partitioner.
- Add stream-like reading of partitions from Dgraph. Partitions are split into smaller chunks, which lets Spark read Dgraph partitions of any size.
- Add Dgraph metrics to measure throughput, visible on the Spark UI Stages page and through `SparkListener`.
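A rough Scala sketch of the pushdown: the `.where` and `.select` calls mirror the examples above, while the format name and target are assumptions about a local deployment.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("pushdown-example").getOrCreate()
import spark.implicits._

// Assumed format name and Dgraph Alpha target.
val nodes = spark.read
  .format("uk.co.gresearch.spark.dgraph.nodes")
  .load("localhost:9080")

// Both the filter and the projection are pushed down to Dgraph, so only
// nodes with a revenue value, and only these three columns, are read.
nodes
  .where($"revenue".isNotNull)
  .select($"subject", $"`dgraph.type`", $"revenue")
  .show()
```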
Security
- Move Google Guava dependency version to 24.1.1-jre due to a known security vulnerability fixed in 24.1.1.
v0.4.0 (Spark 2.4) - 2020-07-24
Added
- Add Spark filter pushdown and projection pushdown to improve efficiency when loading only subgraphs. Filters like `.where($"revenue".isNotNull)` and projections like `` .select($"subject", $"`dgraph.type`", $"revenue") `` will be pushed to Dgraph so that only the relevant graph data is read (issue #7).
- Improve performance of `PredicatePartitioner` for multiple predicates per partition. Restores the default of 1000 predicates per partition from before 0.3.0 (issue #22).
- The `PredicatePartitioner` combined with `UidRangePartitioner` is now the default partitioner.
- Add stream-like reading of partitions from Dgraph. Partitions are split into smaller chunks, which lets Spark read Dgraph partitions of any size.
- Add Dgraph metrics to measure throughput, visible on the Spark UI Stages page and through `SparkListener`; see the listener sketch after this list.
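A sketch of picking these metrics up via a `SparkListener`; it assumes the connector's metrics surface as named task accumulators whose names contain "dgraph", which these notes do not confirm:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("dgraph-metrics").getOrCreate()

// Assumption: the connector's Dgraph metrics appear as named task accumulators.
spark.sparkContext.addSparkListener(new SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    taskEnd.taskInfo.accumulables
      .filter(_.name.exists(_.toLowerCase.contains("dgraph")))
      .foreach(acc => println(s"${acc.name.get} = ${acc.value.getOrElse("-")}"))
  }
})
```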
Security
- Move Google Guava dependency version to 24.1.1-jre due to a known security vulnerability fixed in 24.1.1.
v0.3.0 (Spark 3.0) - 2020-06-22
Added
- Load data from Dgraph cluster as a GraphFrames `GraphFrame`.
- Use exact uid cardinality for uid range partitioning. Combined with predicate partitioning, large predicates get split into more partitions than small predicates (issue #2).
- Improve performance of `PredicatePartitioner` for a single predicate per partition (`dgraph.partitioner.predicate.predicatesPerPartition=1`). This becomes the new default for this partitioner; see the sketch after this list.
- Move to Spark 3.0.0 release (was 3.0.0-preview2).
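A sketch of tuning this setting: the option key is quoted from the note above, while passing it as a DataFrame reader option, the format name, and the target are assumptions.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("partitioner-example").getOrCreate()

// Option key taken from the release note above; format name and
// target ("localhost:9080") are assumptions about a local deployment.
val triples = spark.read
  .format("uk.co.gresearch.spark.dgraph.triples")
  .option("dgraph.partitioner.predicate.predicatesPerPartition", "1")
  .load("localhost:9080")

println(s"partitions: ${triples.rdd.getNumPartitions}")
```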
Fixed
- Dgraph groups with no predicates caused a `NullPointerException`.
- Predicate names need to be escaped in Dgraph queries.
v0.3.0 (Spark 2.4) - 2020-06-22
Added
- Use exact uid cardinality for uid range partitioning. Combined with predicate partitioning, large predicates get split into more partitions than small predicates (issue #2).
- Improve performance of `PredicatePartitioner` for a single predicate per partition (`dgraph.partitioner.predicate.predicatesPerPartition=1`). This becomes the new default for this partitioner.
- Move to Spark 2.4.6 release (was 2.4.5).
Fixed
- Dgraph groups with no predicates caused a `NullPointerException`.
- Predicate names need to be escaped in Dgraph queries.
v0.2.0 (Spark 2.4) - 2020-06-11
Added
- Load nodes from Dgraph cluster as wide nodes (fully typed property columns); see the sketch after this list.
- Added `dgraph.type` and `dgraph.graphql.schema` predicates to be loaded from Dgraph cluster.
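A sketch of reading wide nodes; the `dgraph.nodes.mode=wide` option key and value, the format name, and the target are all assumptions not confirmed by these notes:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("wide-nodes-example").getOrCreate()

// Assumed option key/value for wide mode, plus assumed format name and target.
val wideNodes = spark.read
  .format("uk.co.gresearch.spark.dgraph.nodes")
  .option("dgraph.nodes.mode", "wide")
  .load("localhost:9080")

// Expect one fully typed column per predicate, including dgraph.type.
wideNodes.printSchema()
```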
v0.2.0 (Spark 3.0) - 2020-06-11
Added
- Load nodes from Dgraph cluster as wide nodes (fully typed property columns).
- Added `dgraph.type` and `dgraph.graphql.schema` predicates to be loaded from Dgraph cluster.
v0.1.0 (Spark 3.0) - 2020-06-09
First release of the project
Added
- Load data from Dgraph cluster as triples (as strings or fully typed), edges, or node `DataFrame`s.
- Load data from Dgraph cluster as an Apache Spark GraphX `Graph`; see the sketch below.
- Partitioning by Dgraph Group, Alpha node, predicates, and uids.
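A sketch of the GraphX load; the import path and the `dgraph` reader method are assumptions about the connector's API, as is the target:

```scala
import org.apache.spark.sql.SparkSession
// Assumed import path for the connector's GraphX support.
import uk.co.gresearch.spark.dgraph.graphx._

val spark = SparkSession.builder().appName("dgraph-graphx").getOrCreate()

// Assumption: this package adds a `dgraph` method to DataFrameReader that
// returns a GraphX Graph; "localhost:9080" is a local Dgraph Alpha target.
val graph = spark.read.dgraph("localhost:9080")

println(s"vertices: ${graph.vertices.count()}, edges: ${graph.edges.count()}")
```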