Add the function to embed images ICC profile #1437
Some considerations:
You can try using the jpeg-exif package. It has a sync method for parsing EXIF data.
@xh4010 I see where jpeg-exif falls short, and it looks like exifr doesn't even do what we really want, as you're having to reconstruct the ICC profile from the app segments. I propose we parse out the ICC profile from the app segments ourselves so we can prevent introducing async functions. I was able to get this working by modifying the @blikblum I'm curious what your thoughts are on this approach.
const MARKERS = [0xffc0, 0xffc1, 0xffc2, 0xffc3, 0xffc5, 0xffc6, 0xffc7, 0xffc8, 0xffc9, 0xffca, 0xffcb, 0xffcc, 0xffcd, 0xffce, 0xffcf];
const COLOR_SPACE_MAP = {
1: 'DeviceGray',
3: 'DeviceRGB',
4: 'DeviceCMYK'
};
class JPEG {
constructor(data, label) {
let marker;
this.data = data;
this.label = label;
if (this.data.readUInt16BE(0) !== 0xffd8) {
throw 'SOI not found in JPEG';
}
let pos = 2;
while (pos < this.data.length) {
marker = this.data.readUInt16BE(pos);
pos += 2;
if (MARKERS.includes(marker)) {
break;
}
pos += this.data.readUInt16BE(pos);
}
if (!MARKERS.includes(marker)) {
throw 'Invalid JPEG.';
}
pos += 2;
this.bits = this.data[pos++];
this.height = this.data.readUInt16BE(pos);
pos += 2;
this.width = this.data.readUInt16BE(pos);
pos += 2;
// const channels = this.data[pos++];
this.channels = this.data[pos++];
this.colorSpace = COLOR_SPACE_MAP[this.channels];
this.obj = null;
// Parse an embedded ICC profile
pos = 2;
while (pos < this.data.length) {
marker = this.data.readUInt16BE(pos);
pos += 2;
if (marker === 0xffe2) {
break;
}
pos += this.data.readUInt16BE(pos);
}
if (marker !== 0xffe2) {
console.log('no ICC profile found')
} else {
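// The segment length field counts its own 2 bytes, so the payload is length - 2 bytes
let length = this.data.readUInt16BE(pos);
pos += 2;
// Skip the 14-byte APP2 ICC header: "ICC_PROFILE\0" (12 bytes),
// chunk sequence number (1 byte) and chunk count (1 byte)
this.iccProfile = this.data.slice(pos, pos + length - 2).slice(14);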
}
}
embed(document) {
if (this.obj) {
return;
}
let colorSpace = this.colorSpace;
if (this.iccProfile) {
let profile = document.ref({
Alternate: colorSpace,
N: this.channels,
Length: this.iccProfile.length,
});
profile.end(this.iccProfile);
colorSpace = document.ref([`ICCBased ${profile}`])
colorSpace.end();
}
this.obj = document.ref({
Type: 'XObject',
Subtype: 'Image',
BitsPerComponent: this.bits,
Width: this.width,
Height: this.height,
ColorSpace: colorSpace,
Filter: 'DCTDecode'
}); // add extra decode params for CMYK images. By swapping the
// min and max values from the default, we invert the colors. See
// section 4.8.4 of the spec.
if (this.colorSpace === 'DeviceCMYK') {
this.obj.data['Decode'] = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0];
}
this.obj.end(this.data); // free memory
return this.data = null;
}
}
Lot better. The only issue is testing whether the parsing is correct, but one or two integration tests would be OK.
@fstrube You're right. Once we know the location and structure of the ICC profile, we can parse it ourselves without relying on other libraries. We have more work to do because the ICC profile may exist in multiple segments. Additionally, we need to verify that the extracted data is correct. Otherwise, writing it directly could result in an erroneous PDF. I will find some time to make the modifications. Thank you.
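To make the multi-segment concern concrete, here is a minimal sketch of how a profile split across several APP2 segments could be reassembled and sanity-checked synchronously. The function name `extractICCProfile` is borrowed from the discussion, but this body is an illustration and not the PR's actual code. It assumes the usual APP2 layout (the 12-byte `ICC_PROFILE\0` identifier, a 1-based chunk sequence number, and a chunk count) and checks only two ICC header fields: the declared profile size and the `acsp` signature at offset 36.

```javascript
// Illustrative sketch only; not the implementation from this PR.
const ICC_SIGNATURE = Buffer.from('ICC_PROFILE\0', 'latin1'); // 12 bytes

function extractICCProfile(jpeg) {
  const chunks = [];
  let total = 0;
  let pos = 2; // skip the SOI marker

  // Each iteration needs 4 readable bytes: marker (2) + segment length (2).
  while (pos + 4 <= jpeg.length) {
    const marker = jpeg.readUInt16BE(pos);
    if ((marker >> 8) !== 0xff) return null; // not positioned on a marker
    if (marker === 0xffda) break; // SOS: compressed image data follows

    const length = jpeg.readUInt16BE(pos + 2); // counts its own 2 bytes
    const start = pos + 4; // first payload byte
    const end = pos + 2 + length; // one past the last payload byte
    if (length < 2 || end > jpeg.length) return null; // malformed or truncated

    const isICC =
      marker === 0xffe2 &&
      length >= 16 &&
      jpeg.subarray(start, start + 12).equals(ICC_SIGNATURE);
    if (isICC) {
      const sequence = jpeg[start + 12]; // 1-based chunk index
      total = jpeg[start + 13]; // number of chunks in the profile
      chunks[sequence - 1] = jpeg.subarray(start + 14, end);
    }
    pos = end;
  }

  if (total === 0) return null; // no ICC profile present
  for (let i = 0; i < total; i++) {
    if (!chunks[i]) return null; // a chunk is missing
  }
  const profile = Buffer.concat(chunks.slice(0, total));

  // Basic validation before writing the bytes into a PDF stream.
  if (profile.length < 128) return null; // ICC header is 128 bytes
  if (profile.readUInt32BE(0) !== profile.length) return null;
  if (profile.toString('latin1', 36, 40) !== 'acsp') return null;
  return profile;
}
```

Returning `null` on any inconsistency lets the caller fall back to the plain device color space instead of embedding a damaged profile.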
I removed the exifr package and created a function extractICCProfile in the jpeg.js file to extract and parse the ICC profile, and added an option parameter embedICCProfile (default: true). The test output was correct.
let doc = new PDFDocument();
doc.pipe(fs.createWriteStream('test.pdf'));
doc.image('tests/images/landscape+icc.jpg', 40, 70, {
width: 200,
height: 267,
embedICCProfile: true
});
doc.end();
@xh4010 I like this a lot better! My initial thoughts are that you should abstract the ICC parsing into a separate ICCProfile class. Maybe structure it something like the snippet below. Can you reintroduce the support for PNGs with embedded profiles?
class ICCProfile {
static extractFromJPEG(jpeg) {
// ...
}
static extractFromPNG(png) {
// ...
}
constructor(buffer) {
this.buffer = buffer
this.data = this.parse(buffer)
}
parse(buffer) {
// ...
}
embed(doc) {
// ...
}
}
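For the PNG side of that request, a synchronous extraction could look roughly like the sketch below. It is an assumption about how an `extractFromPNG`-style helper might work, not code from this PR. It relies on the PNG `iCCP` chunk layout (profile name, a NUL separator, a one-byte compression method that must be 0, then zlib-compressed profile data) and on Node's `zlib.inflateSync`. The function name `extractICCFromPNG` is hypothetical, and CRCs are not verified.

```javascript
const zlib = require('zlib');

// Hypothetical helper: returns the decompressed ICC profile or null.
function extractICCFromPNG(png) {
  let pos = 8; // skip the 8-byte PNG signature

  // Every chunk is: length (4) + type (4) + data (length) + CRC (4).
  while (pos + 12 <= png.length) {
    const length = png.readUInt32BE(pos);
    const type = png.toString('latin1', pos + 4, pos + 8);
    const start = pos + 8;
    const end = start + length;
    if (end + 4 > png.length) return null; // truncated chunk

    if (type === 'iCCP') {
      const nul = png.indexOf(0, start); // end of the profile name
      if (nul === -1 || nul + 2 > end) return null;
      if (png[nul + 1] !== 0) return null; // only deflate (method 0) is defined
      return zlib.inflateSync(png.subarray(nul + 2, end));
    }
    // iCCP must appear before the image data, so stop at IDAT or IEND.
    if (type === 'IDAT' || type === 'IEND') return null;
    pos = end + 4; // skip data and CRC
  }
  return null;
}
```

Because `zlib.inflateSync` is synchronous, this keeps the image pipeline free of async functions, in line with the approach discussed for JPEG.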
Abstract the ICC parsing into a separate ICCProfile class
@fstrube
@xh4010 great work here! Is there anything I can help with to get this PR merged in?
static extractFromJPEG(jpeg) {
let pos = 2;
const buffers = [];
while (pos < jpeg.length) {
Testing this gives me an error, because const marker = jpeg.readUInt16BE(pos); only accepts offsets up to length - 2 (see https://www.geeksforgeeks.org/node-js-buffer-readuint16be-method/).
If I change the loop condition to while (pos < jpeg.length - 4) {, it finishes successfully.
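The failure is easy to reproduce in isolation, and the guard can be written so that each iteration only starts when both the 2-byte marker and the 2-byte segment length are readable. The sketch below uses a hypothetical helper, `listMarkers`, purely to illustrate the loop condition. The condition `pos + 4 <= jpeg.length` is my suggestion; the `pos < jpeg.length - 4` variant from the comment above is slightly stricter and also avoids the error.

```javascript
// Hypothetical helper: collect segment markers without reading past the end.
function listMarkers(jpeg) {
  const markers = [];
  let pos = 2; // skip the SOI marker

  // Require 4 readable bytes: marker (2) + segment length (2).
  while (pos + 4 <= jpeg.length) {
    const marker = jpeg.readUInt16BE(pos);
    const length = jpeg.readUInt16BE(pos + 2); // counts its own 2 bytes
    markers.push(marker);
    if (marker === 0xffda || length < 2) break; // image data or malformed length
    pos += 2 + length;
  }
  return markers;
}

// readUInt16BE throws a RangeError when fewer than 2 bytes remain at the offset.
function readPastEndThrows() {
  try {
    Buffer.from([0xff, 0xd8, 0xff]).readUInt16BE(2);
    return false;
  } catch (err) {
    return err instanceof RangeError;
  }
}
```

With the guard in place, a trailing 2-byte EOI marker at the end of the file simply ends the loop instead of triggering the out-of-range read.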
By adding two asynchronous methods, imageWithICC and embedWithICC, it is now possible to embed the ICC profile into the PDF file if the JPG image contains one.