PgBulkInsert is a Java library for Bulk Inserts to PostgreSQL using the Binary COPY Protocol.
It provides a wrapper around the PostgreSQL Copy command:
The COPY command is a PostgreSQL specific feature, which allows efficient bulk import or export of data to and from a table. This is a much faster way of getting data in and out of a table than using INSERT and SELECT.
PgBulkInsert is available in the Central Maven Repository.
You can add the following dependencies to your pom.xml to include PgBulkInsert in your project.
<dependency>
<groupId>de.bytefish</groupId>
<artifactId>pgbulkinsert</artifactId>
<version>0.9</version>
</dependency>
PgBulkInsert is released with under terms of the MIT License:
- Numeric Types
- smallint
- integer
- bigint
- real
- double precision
- Date/Time Types
- timestamp
- date
- Character Types
- text
- Boolean Type
- boolean
- Binary Data Types
- bytea
- Network Address Types
- inet (IPv4, IPv6)
- UUID Type
- uuid
- Array Type
- One-Dimensional Arrays
Imagine we want to bulk insert a large amount of persons into a PostgreSQL database. Each Person
has a first name, a last name and a birthdate.
The table in the PostgreSQL database might look like this:
CREATE TABLE sample.unit_test
(
first_name text,
last_name text,
birth_date date
);
The domain model in the application might look like this:
private class Person {
private String firstName;
private String lastName;
private LocalDate birthDate;
public Person() {}
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
public LocalDate getBirthDate() {
return birthDate;
}
public void setBirthDate(LocalDate birthDate) {
this.birthDate = birthDate;
}
}
Then you have to implement the PgBulkInsert<Person>
, which defines the mapping between the table and the domain model.
public class PersonBulkInserter extends PgBulkInsert<Person>
{
public PersonBulkInserter() {
super("sample", "unit_test");
mapString("first_name", Person::getFirstName);
mapString("last_name", Person::getLastName);
mapDate("birth_date", Person::getBirthDate);
}
}
And finally we can write a Unit Test to insert 100000
Persons into the database. You can find the entire Unit Test on GitHub: IntegrationTest.java.
@Test
public void bulkInsertPersonDataTest() throws SQLException {
// Create a large list of Persons:
List<Person> persons = getPersonList(100000);
// Create the BulkInserter:
PersonBulkInserter personBulkInserter = new PersonBulkInserter();
// Now save all entities of a given stream:
personBulkInserter.saveAll(PostgreSqlUtils.getPGConnection(connection), persons.stream());
// And assert all have been written to the database:
Assert.assertEquals(100000, getRowCount());
}
private List<Person> getPersonList(int numPersons) {
List<Person> persons = new ArrayList<>();
for (int pos = 0; pos < numPersons; pos++) {
Person p = new Person();
p.setFirstName("Philipp");
p.setLastName("Wagner");
p.setBirthDate(LocalDate.of(1986, 5, 12));
persons.add(p);
}
return persons;
}