Skip to content

Java library for efficient Bulk Inserts to PostgreSQL using the Binary COPY Command

License

Notifications You must be signed in to change notification settings

patientslikeme/PgBulkInsert

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

61 Commits
 
 
 
 
 
 

Repository files navigation

PgBulkInsert

PgBulkInsert is a Java library for Bulk Inserts to PostgreSQL using the Binary COPY Protocol.

It provides a wrapper around the PostgreSQL Copy command:

The COPY command is a PostgreSQL specific feature, which allows efficient bulk import or export of data to and from a table. This is a much faster way of getting data in and out of a table than using INSERT and SELECT.

Setup

PgBulkInsert is available in the Central Maven Repository.

You can add the following dependencies to your pom.xml to include PgBulkInsert in your project.

<dependency>
	<groupId>de.bytefish</groupId>
	<artifactId>pgbulkinsert</artifactId>
	<version>0.9</version>
</dependency>

License

PgBulkInsert is released with under terms of the MIT License:

Supported PostgreSQL Types

Basic Usage

Imagine we want to bulk insert a large amount of persons into a PostgreSQL database. Each Person has a first name, a last name and a birthdate.

Database Table

The table in the PostgreSQL database might look like this:

 CREATE TABLE sample.unit_test
(
    first_name text,
    last_name text,
    birth_date date
);

Domain Model

The domain model in the application might look like this:

private class Person {

    private String firstName;

    private String lastName;

    private LocalDate birthDate;

    public Person() {}

    public String getFirstName() {
        return firstName;
    }

    public void setFirstName(String firstName) {
        this.firstName = firstName;
    }

    public String getLastName() {
        return lastName;
    }

    public void setLastName(String lastName) {
        this.lastName = lastName;
    }

    public LocalDate getBirthDate() {
        return birthDate;
    }

    public void setBirthDate(LocalDate birthDate) {
        this.birthDate = birthDate;
    }
    
}

Bulk Inserter

Then you have to implement the PgBulkInsert<Person>, which defines the mapping between the table and the domain model.

public class PersonBulkInserter extends PgBulkInsert<Person>
{
    public PersonBulkInserter() {
        super("sample", "unit_test");

        mapString("first_name", Person::getFirstName);
        mapString("last_name", Person::getLastName);
        mapDate("birth_date", Person::getBirthDate);
    }
}

Using the Bulk Inserter

And finally we can write a Unit Test to insert 100000 Persons into the database. You can find the entire Unit Test on GitHub: IntegrationTest.java.

@Test
public void bulkInsertPersonDataTest() throws SQLException {
    // Create a large list of Persons:
    List<Person> persons = getPersonList(100000);
    
    // Create the BulkInserter:
    PersonBulkInserter personBulkInserter = new PersonBulkInserter();
    
    // Now save all entities of a given stream:
    personBulkInserter.saveAll(PostgreSqlUtils.getPGConnection(connection), persons.stream());
    
    // And assert all have been written to the database:
    Assert.assertEquals(100000, getRowCount());
}

private List<Person> getPersonList(int numPersons) {
    List<Person> persons = new ArrayList<>();

    for (int pos = 0; pos < numPersons; pos++) {
        Person p = new Person();

        p.setFirstName("Philipp");
        p.setLastName("Wagner");
        p.setBirthDate(LocalDate.of(1986, 5, 12));

        persons.add(p);
    }

    return persons;
}

BulkProcessor

Resources

About

Java library for efficient Bulk Inserts to PostgreSQL using the Binary COPY Command

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 97.0%
  • Batchfile 3.0%