feat(extractors): add image extractor
1 parent 8dcd6bc, commit f97e5f8
Showing 11 changed files with 143 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -30,3 +30,4 @@ logs | |
coverage | ||
cache | ||
.zed | ||
*.traineddata |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
Lorem ipsum | ||
|
||
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sapien ante conubia vestibulum | ||
ultrices quisque nam nascetur consectetur. Viverra amet lacinia massa donec gravida primis | ||
leo tellus. Montes nulla sit cras odio penatibus cum aenean metus. Per per eros fusce et | ||
platea et feugiat ullamcorper. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
at his touch of a certain icy pang along my blood. “Come, sir,” said I. | ||
“You forget that I have not yet the pleasure of your acquaintance. Be | ||
seated, if you please.” And I showed him an example, and sat down | ||
myself in my customary seat and with as fair an imitation of my or- | ||
dinary manner to a patient, as the lateness of the hour, the nature of | ||
my preoccupations, and the horror I had of my visitor, would suffer | ||
me to muster. | ||
|
||
“I beg your pardon, Dr. Lanyon,” he replied civilly enough. “What | ||
you say is very well founded; and my impatience has shown its heels | ||
to my politeness. I come here at the instance of your colleague, Dr. | ||
Henry Jekyll, on a piece of business of some moment; and I under- | ||
stood...” He paused and put his hand to his throat, and I could see, | ||
in spite of his collected manner, that he was wrestling against the | ||
approaches of the hysteria—“T understood, a drawer...” | ||
|
||
But here I took pity on my visitor's suspense, and some perhaps | ||
on my own growing curiosity. | ||
|
||
“There it is, sir,” said I, pointing to the drawer, where it lay on the | ||
floor behind a table and still covered with the sheet. | ||
|
||
He sprang to it, and then paused, and laid his hand upon his | ||
heart: I could hear his teeth grate with the convulsive action of his | ||
jaws; and his face was so ghastly to see that I grew alarmed both for | ||
his life and reason. | ||
|
||
“Compose yourself,” said I. | ||
|
||
He turned a dreadful smile to me, and as if with the decision of | ||
despair, plucked away the sheet. At sight of the contents, he uttered | ||
one loud sob of such immense relief that I sat petrified. And the | ||
next moment, in a voice that was already fairly well under control, | ||
“Have you a graduated glass?” he asked. | ||
|
||
I rose from my place with something of an effort and gave him | ||
what he asked. | ||
|
||
He thanked me with a smiling nod, measured out a few min- | ||
ims of the red tincture and added one of the powders. The mix- | ||
ture, which was at first of a reddish hue, began, in proportion as the |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
Lorem ipsum | ||
|
||
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sapien ante conubia vestibulum | ||
ultrices quisque nam nascetur consectetur. Viverra amet lacinia massa donec gravida primis | ||
leo tellus. Montes nulla sit cras odio penatibus cum aenean metus. Per per eros fusce et | ||
platea et feugiat ullamcorper. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
import { Buffer } from 'node:buffer'; | ||
import { createWorker } from 'tesseract.js'; | ||
import { defineTextExtractor } from '../extractors.models'; | ||
|
||
export const imageExtractorDefinition = defineTextExtractor({ | ||
name: 'image', | ||
mimeTypes: [ | ||
'image/png', | ||
'image/jpeg', | ||
'image/webp', | ||
'image/gif', | ||
], | ||
extract: async ({ arrayBuffer }) => { | ||
const buffer = Buffer.from(arrayBuffer); | ||
|
||
const worker = await createWorker(); | ||
|
||
const { data: { text } } = await worker.recognize(buffer); | ||
await worker.terminate(); | ||
|
||
return { content: text }; | ||
}, | ||
}); |
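
Review note: in the `extract` function above, `worker.terminate()` runs only when `worker.recognize()` resolves, so a failed recognition would leave the worker running. The sketch below shows one possible way to guarantee cleanup with `try`/`finally`. It is not part of the commit: `OcrWorker` and `extractTextWithCleanup` are hypothetical names, the interface only models the two calls the diff uses (`recognize` and `terminate`), and the worker factory is passed in as a parameter so the helper can be exercised without loading tesseract.js.

```typescript
// Hypothetical minimal shape of the worker, modelled on the two calls
// used in the diff (recognize and terminate). Not the tesseract.js typings.
interface OcrWorker {
  recognize: (image: Uint8Array) => Promise<{ data: { text: string } }>;
  terminate: () => Promise<unknown>;
}

// Hypothetical helper: runs OCR and always terminates the worker,
// even when recognize() rejects.
async function extractTextWithCleanup(
  createOcrWorker: () => Promise<OcrWorker>,
  arrayBuffer: ArrayBuffer,
): Promise<{ content: string }> {
  const worker = await createOcrWorker();

  try {
    // The commit wraps the bytes in a Node Buffer; a plain Uint8Array view
    // is used here to keep the sketch free of Node-specific imports.
    const { data: { text } } = await worker.recognize(new Uint8Array(arrayBuffer));

    return { content: text };
  } finally {
    // Runs on success and on failure, so the worker is never leaked.
    await worker.terminate();
  }
}
```

Under these assumptions, the extractor body could delegate to the helper by passing a factory that calls `createWorker()`; whether the real worker type fits the simplified interface without a cast has not been checked here.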