COPY FROM / COPY TO for node-postgres. Stream from one database to another, and stuff.
Did you know the all-powerful PostgreSQL supports streaming binary data directly into and out of a table?
This means you can take your favorite CSV or TSV or whatever format file and pipe it directly into an existing PostgreSQL table.
You can also take a table and pipe it directly to a file, another database, stdout, even to /dev/null if you're crazy!
What this module gives you is a Readable or Writable stream directly into/out of a table in your database. This mode of interfacing with your table is very fast and very brittle. You are responsible for properly encoding and ordering all your columns. If anything is out of place, PostgreSQL will send you back an error. The stream works within a transaction so you won't leave things in a half-borked state, but it's still good to be aware of.
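To make "properly encoding and ordering all your columns" concrete, here is a minimal sketch of building one row for COPY ... FROM STDIN in PostgreSQL's default text format (tab-delimited, \N for NULL, backslash escapes). The helpers encodeValue and encodeRow are hypothetical names used for illustration; they are not part of this module, and they only cover strings, numbers, and nulls.

```javascript
// Hypothetical helpers, not part of pg-copy-streams.
// Assumes the default COPY text format: tab delimiter, \N for NULL.

function encodeValue(value) {
  // NULL is written as the two characters \N.
  if (value === null || value === undefined) return '\\N';
  return String(value)
    .replace(/\\/g, '\\\\') // escape backslashes first
    .replace(/\t/g, '\\t')  // a raw tab would be read as a column delimiter
    .replace(/\n/g, '\\n')  // a raw newline would be read as the end of the row
    .replace(/\r/g, '\\r');
}

function encodeRow(values) {
  // Values must be in the same order as the table's columns
  // (or the column list given in the COPY command).
  return values.map(encodeValue).join('\t') + '\n';
}
```

With a stream obtained from copyFrom, a row could then be written as stream.write(encodeRow([1, 'hello', null])). Dates, arrays, and binary values need their own handling and are left out of this sketch.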
If you're not familiar with the feature (I wasn't either) you can read this for a good introduction: http://www.postgresql.org/docs/9.3/static/sql-copy.html
var pg = require('pg');
var copyTo = require('pg-copy-streams').to;

pg.connect(function(err, client, done) {
  // Bail out if no client could be acquired.
  if (err) throw err;
  var stream = client.query(copyTo('COPY my_table TO STDOUT'));
  stream.pipe(process.stdout);
  // Release the client once the copy ends or fails.
  stream.on('end', done);
  stream.on('error', done);
});
var fs = require('fs');
var pg = require('pg');
var copyFrom = require('pg-copy-streams').from;

pg.connect(function(err, client, done) {
  // Bail out if no client could be acquired.
  if (err) throw err;
  var stream = client.query(copyFrom('COPY my_table FROM STDIN'));
  var fileStream = fs.createReadStream('some_file.tsv');
  // Release the client if either side fails.
  fileStream.on('error', done);
  stream.on('error', done);
  // 'end' fires once the database has acknowledged the COPY.
  stream.on('end', done);
  fileStream.pipe(stream);
});
Important: Even if pg-copy-streams.from is used as a Writable (via pipe), you should not listen for the 'finish' event and expect that the COPY command has already been correctly acknowledged by the database. Internally, a duplex stream is used to pipe the data into the database connection and the COPY command should be considered complete only when the 'end' event is triggered.
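One way to avoid acting on 'finish' by accident is to route completion through a single helper. The sketch below assumes the 'end' semantics described in the note above; whenCopyDone is a hypothetical name, not something this module exports.

```javascript
// Hypothetical helper, not part of pg-copy-streams.
// Calls back exactly once: with null on 'end', or with the first 'error'.
function whenCopyDone(stream, callback) {
  var called = false;

  function settle(err) {
    if (called) return; // ignore anything after the first outcome
    called = true;
    callback(err || null);
  }

  // 'end' is treated as the acknowledgement of the COPY command.
  stream.once('end', function () { settle(); });
  stream.once('error', settle);
  // No 'finish' listener on purpose: it only means the writes were flushed.
}
```

In the COPY FROM example, whenCopyDone(stream, done) could stand in for the separate 'end' and 'error' listeners on stream.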
$ npm install pg-copy-streams
This module only works with the pure JavaScript bindings. If you're using require('pg').native, please make sure to use normal require('pg') or require('pg.js') when you're using copy streams.
Before you set out on this magical piping journey, you really should read this: http://www.postgresql.org/docs/current/static/sql-copy.html, and you might want to take a look at the tests to get an idea of how things work.
Take note of the following warning in the PostgreSQL documentation:
COPY stops operation at the first error. This should not lead to problems in the event of a COPY TO, but the target table will already have received earlier rows in a COPY FROM. These rows will not be visible or accessible, but they still occupy disk space. This might amount to a considerable amount of wasted disk space if the failure happened well into a large copy operation. You might wish to invoke VACUUM to recover the wasted space.
Instead of adding a bunch more code to the already bloated node-postgres I am trying to make the internals extensible and work on adding edge-case features as 3rd party modules. This is one of those.
Please, if you have any issues with this, open an issue.
Better yet, submit a pull request. I love pull requests.
Generally how I work is if you submit a few pull requests and you're interested I'll make you a contributor and give you full access to everything.
Since this isn't a module with tons of installs and dependent modules I hope we can work together on this to iterate faster here and make something really useful.
The MIT License (MIT)
Copyright (c) 2013 Brian M. Carlson
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.