
# Darwin Pipeline Operation


This importer queries data from the Darwin DB2 database.

## Configuration

Set the following properties in application.properties (an illustrative example follows the list):

- spring.batch.job.enabled=false (this should not change)
- darwin.server: hostname of the Darwin server
- darwin.port: port used for the DB2 connection
- darwin.database: database to query
- darwin.schema: schema within the database
- darwin.chunk_size: number of records processed per chunk
- darwin.*_view: view accessed in the Darwin db (e.g. darwin.timeline_view=DMT_TIMELINE_BRAINSPINE_V; check Datapedia for more info)
- darwin.*_filename: name of the corresponding output file
- darwin.username: database username
- darwin.password: database password
- darwin.driver: com.ibm.db2.jcc.DB2SimpleDataSource
- darwin.connection_string: connection string for the DB2 database
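A minimal sketch of what these settings might look like. All values below are placeholders for illustration only; use the views listed in Datapedia and the connection details for your DB2 deployment.

```properties
# application.properties (placeholder values, for illustration only)
spring.batch.job.enabled=false

darwin.server=darwin.example.org
darwin.port=50000
darwin.database=DARWIN
darwin.schema=MY_SCHEMA
darwin.chunk_size=100

# one *_view / *_filename pair per data type being imported
darwin.timeline_view=DMT_TIMELINE_BRAINSPINE_V
darwin.timeline_filename=data_timeline.txt

darwin.username=svc_darwin
darwin.password=changeit
darwin.driver=com.ibm.db2.jcc.DB2SimpleDataSource
# exact format depends on how the datasource is configured in your deployment
darwin.connection_string=jdbc:db2://darwin.example.org:50000/DARWIN
```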

## Usage

```
$JAVA -jar target/darwin-(version).jar -stage [stagingDirectory] -study [studyId (default is msk-impact)]
```
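For example, an invocation staging the default msk-impact study might look like the following. The jar version, JVM path, and staging directory are illustrative and will differ in your checkout.

```sh
# Illustrative invocation; adjust paths and jar version to your environment
JAVA=/usr/bin/java
$JAVA -jar target/darwin-1.0.0.jar -stage /data/staging/msk-impact -study msk-impact
```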

## Captured fields

To adjust the columns captured by the pipeline (a hypothetical sketch follows these steps):

  1. Edit the model class (.../cmo-pipelines/darwin/src/main/java/org/cbioportal/cmo/pipelines/darwin/model/), updating its field_names and header_names
  2. Edit the query in the associated reader (.../darwin/src/.../darwin/*reader.java)
  3. Re-import the data
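The sketch below illustrates the general pattern only; the class name, fields, and column names are invented for illustration and the real model classes in the repository may be structured differently. The key point is that the field names, the output headers, and the columns selected by the reader's query must stay in sync.

```java
// Hypothetical model class showing where captured fields are declared.
// Names here are illustrative, not copied from the repository.
import java.util.Arrays;
import java.util.List;

public class ExampleTimelineRecord {
    private String patientId;
    private String startDate;   // a newly captured column would be added here

    // Field names used when mapping the record to output
    public static List<String> getFieldNames() {
        return Arrays.asList("patientId", "startDate");
    }

    // Header names written to the output staging file
    public static List<String> getHeaders() {
        return Arrays.asList("PATIENT_ID", "START_DATE");
    }

    public String getPatientId() { return patientId; }
    public void setPatientId(String patientId) { this.patientId = patientId; }
    public String getStartDate() { return startDate; }
    public void setStartDate(String startDate) { this.startDate = startDate; }
}
```

Any column added to the model must also be added to the SELECT in the corresponding reader before re-importing the data.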

## Warnings

Spring Batch will log warnings about static job/step scope and about returning implementing classes. These are expected and can safely be ignored.