JDBCRDD

JDBCRDD is an RDD of internal binary rows that represents a structured query over a table in a database accessed via JDBC.

Note	`JDBCRDD` represents a "SELECT requiredColumns FROM table" query.

JDBCRDD is created exclusively when JDBCRDD is requested to scanTable (when JDBCRelation is requested to build a scan).

Table 1. JDBCRDD’s Internal Properties (e.g. Registries, Counters and Flags)

Name | Description

columnList | Column names. Used when…FIXME

filterWhereClause | Filters as a SQL WHERE clause. Used when…FIXME
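
The following is a minimal sketch (not Spark's actual code) of how a column list and a filter WHERE clause could be combined into the "SELECT requiredColumns FROM table" query that `JDBCRDD` represents. The `JdbcQuerySketch` object and its methods are hypothetical names, the filters are assumed to be already compiled to SQL strings, and falling back to `1` for an empty column list is an assumption of this sketch.

```scala
// Illustrative stand-in only; none of these names are part of Spark's API.
object JdbcQuerySketch {

  // Comma-separated column names.
  // Assumption of this sketch: select the constant 1 when no columns are required.
  def columnList(columns: Array[String]): String =
    if (columns.isEmpty) "1" else columns.mkString(",")

  // Parenthesize each already-compiled predicate and join them with AND.
  def filterWhereClause(compiledFilters: Array[String]): String =
    compiledFilters.map(p => s"($p)").mkString(" AND ")

  // Assemble the final query; the WHERE part is omitted when there are no filters.
  def selectQuery(
      table: String,
      columns: Array[String],
      compiledFilters: Array[String]): String = {
    val where = filterWhereClause(compiledFilters)
    val whereSql = if (where.isEmpty) "" else s" WHERE $where"
    s"SELECT ${columnList(columns)} FROM $table$whereSql"
  }
}
```

For example, `JdbcQuerySketch.selectQuery("people", Array("id", "name"), Array("age > 21"))` gives `SELECT id,name FROM people WHERE (age > 21)`.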

Computing Partition (in TaskContext) — `compute` Method

compute(thePart: Partition, context: TaskContext): Iterator[InternalRow]

Note	`compute` is part of Spark Core’s `RDD` Contract to compute a partition (in a `TaskContext`).

compute…FIXME

`resolveTable` Method

resolveTable(options: JDBCOptions): StructType

resolveTable…FIXME

Note	`resolveTable` is used exclusively when `JDBCRelation` is requested for the schema.

`scanTable` Method

scanTable(
  sc: SparkContext,
  schema: StructType,
  requiredColumns: Array[String],
  filters: Array[Filter],
  parts: Array[Partition],
  options: JDBCOptions): RDD[InternalRow]

scanTable…FIXME

Note	`scanTable` is used when…FIXME

Creating JDBCRDD Instance

JDBCRDD takes the following when created:

SparkContext
Function to create a Connection (() ⇒ Connection)
Schema (StructType)
Array of column names
Array of Filter predicates
Array of Spark Core’s Partitions
Connection URL
JDBCOptions

JDBCRDD initializes the internal registries and counters.
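
The `() ⇒ Connection` argument listed above is a function rather than an open `Connection`. A likely reason is that a live JDBC connection cannot be shipped to tasks, whereas a function can be called wherever a partition is computed. A minimal sketch of such a function, using only the JDK's `java.sql` API, could look as follows. The `ConnectionFactorySketch` object and the `user` and `password` parameters are hypothetical and not taken from `JDBCOptions`.

```scala
import java.sql.{Connection, DriverManager}
import java.util.Properties

// Illustrative stand-in only; not Spark's own connection factory.
object ConnectionFactorySketch {

  // Returns a no-argument function that opens a new JDBC connection on every call.
  // Nothing is connected until the returned function is invoked.
  def connectionFactory(url: String, user: String, password: String): () => Connection = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    () => DriverManager.getConnection(url, props)
  }
}
```

The caller is responsible for closing each connection the function returns.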

`getPartitions` Method

getPartitions: Array[Partition]

Note	`getPartitions` is part of Spark Core’s `RDD` Contract to…FIXME

getPartitions simply returns the partitions (this JDBCRDD was created with).
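
A dependency-free sketch of that behaviour follows. The `Partition` trait here is a stand-in for Spark Core's `Partition`, and `JdbcPartitionSketch` and `ScanSketch` are hypothetical names used only for illustration.

```scala
// Illustrative stand-in only; it does not depend on or reproduce Spark's classes.
object GetPartitionsSketch {

  // Stand-in for Spark Core's Partition: every partition has an index.
  trait Partition { def index: Int }

  // Hypothetical partition that carries a per-partition WHERE fragment.
  final case class JdbcPartitionSketch(whereClause: String, index: Int) extends Partition

  // The partitions are supplied at construction time and handed back unchanged.
  class ScanSketch(partitions: Array[Partition]) {
    def getPartitions: Array[Partition] = partitions
  }
}
```

In this sketch no partitioning logic runs inside `getPartitions`; whoever creates the instance has already decided how the table is split.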
