
# Darwin Pipeline Operation


This importer queries data from the Darwin DB2 database.

## Configuration

Set the following properties in application.properties (an illustrative example follows the list):

- spring.batch.job.enabled=false (this should not change)
- darwin.server: hostname of the Darwin server
- darwin.port: port used for the DB2 connection
- darwin.database: database to query
- darwin.schema: schema within the database
- darwin.chunk_size: number of records processed per chunk
- darwin.*_view: view accessed in the Darwin db (e.g. darwin.timeline_view=DMT_TIMELINE_BRAINSPINE_V; check Datapedia for more info)
- darwin.*_filename: name of the corresponding output file
- darwin.username: database username
- darwin.password: database password
- darwin.driver: com.ibm.db2.jcc.DB2SimpleDataSource
- darwin.connection_string: connection string for the DB2 database
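A minimal sketch of what these settings might look like. All values below are placeholders for illustration only; use the views listed in Datapedia and the connection details for your DB2 deployment.

```properties
# application.properties (placeholder values, for illustration only)
spring.batch.job.enabled=false

darwin.server=darwin.example.org
darwin.port=50000
darwin.database=DARWIN
darwin.schema=MY_SCHEMA
darwin.chunk_size=100

# one *_view / *_filename pair per data type being imported
darwin.timeline_view=DMT_TIMELINE_BRAINSPINE_V
darwin.timeline_filename=data_timeline.txt

darwin.username=svc_darwin
darwin.password=changeit
darwin.driver=com.ibm.db2.jcc.DB2SimpleDataSource
# exact format depends on how the datasource is configured in your deployment
darwin.connection_string=jdbc:db2://darwin.example.org:50000/DARWIN
```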

## Usage

```
$JAVA -jar target/darwin-(version).jar -stage [stagingDirectory] -study [studyId (default is msk-impact)]
```
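For example, an invocation staging the default msk-impact study might look like the following. The jar version, JVM path, and staging directory are illustrative and will differ in your checkout.

```sh
# Illustrative invocation; adjust paths and jar version to your environment
JAVA=/usr/bin/java
$JAVA -jar target/darwin-1.0.0.jar -stage /data/staging/msk-impact -study msk-impact
```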

## Captured fields

To adjust the columns captured by the pipeline (a hypothetical sketch follows these steps):

  1. Edit the model class (.../cmo-pipelines/darwin/src/main/java/org/cbioportal/cmo/pipelines/darwin/model/), updating its field_names and header_names
  2. Edit the query in the associated reader (.../darwin/src/.../darwin/*reader.java)
  3. Re-import the data
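The sketch below illustrates the general pattern only; the class name, fields, and column names are invented for illustration and the real model classes in the repository may be structured differently. The key point is that the field names, the output headers, and the columns selected by the reader's query must stay in sync.

```java
// Hypothetical model class showing where captured fields are declared.
// Names here are illustrative, not copied from the repository.
import java.util.Arrays;
import java.util.List;

public class ExampleTimelineRecord {
    private String patientId;
    private String startDate;   // a newly captured column would be added here

    // Field names used when mapping the record to output
    public static List<String> getFieldNames() {
        return Arrays.asList("patientId", "startDate");
    }

    // Header names written to the output staging file
    public static List<String> getHeaders() {
        return Arrays.asList("PATIENT_ID", "START_DATE");
    }

    public String getPatientId() { return patientId; }
    public void setPatientId(String patientId) { this.patientId = patientId; }
    public String getStartDate() { return startDate; }
    public void setStartDate(String startDate) { this.startDate = startDate; }
}
```

Any column added to the model must also be added to the SELECT in the corresponding reader before re-importing the data.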

## Warnings

Spring Batch will log warnings about static job/step scope and about returning implementing classes. These are expected and can safely be ignored.