v0.4.0 (Spark 2.4) - 2020-07-24

@EnricoMi EnricoMi released this 14 Jul 10:23

Added

  • Add Spark filter pushdown and projection pushdown to improve efficiency when loading only subgraphs.
    Filters like .where($"revenue".isNotNull) and projections like .select($"subject", $"`dgraph.type`", $"revenue")
    will be pushed to Dgraph and only the relevant graph data will be read (issue #7).
  • Improve performance of PredicatePartitioner for multiple predicates per partition.
    Restores the default of 1000 predicates per partition that applied before 0.3.0 (issue #22).
  • The PredicatePartitioner combined with UidRangePartitioner is now the default partitioner.
  • Add stream-like reading of partitions from Dgraph. Partitions are split into smaller chunks.
    This allows Spark to read Dgraph partitions of any size.
  • Add Dgraph metrics to measure throughput, visible in Spark UI Stages page and through SparkListener.
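
The filter and projection pushdown described above can be sketched as follows. This is a minimal, hedged example rather than a definitive recipe: it assumes a Dgraph alpha node reachable at localhost:9080, a graph that contains a revenue predicate, and the connector's wide nodes mode selected through the "dgraph.nodes.mode" option. The application name is a placeholder.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: the spark-dgraph-connector jar is on the classpath.
val spark = SparkSession.builder()
  .appName("dgraph-pushdown-sketch") // placeholder name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Assumption: a Dgraph alpha node listens on this gRPC address.
val target = "localhost:9080"

// Load nodes with one column per predicate (assumed "wide" nodes mode).
val nodes = spark.read
  .format("uk.co.gresearch.spark.dgraph.nodes")
  .option("dgraph.nodes.mode", "wide")
  .load(target)

// The projection and the filter from the release note: only these three
// columns, and only nodes that have a revenue value, should be requested.
val withRevenue = nodes
  .select($"subject", $"`dgraph.type`", $"revenue")
  .where($"revenue".isNotNull)

// Print the physical plan to check which filters and columns reach the source.
withRevenue.explain()
withRevenue.show()
```

If pushdown applies, the physical plan printed by explain() should show the filter and the reduced column set at the scan node instead of a separate filter step over the full graph.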

Security