Page dewarp cuts text #11

Open
IonutQo2 opened this issue Apr 28, 2019 · 8 comments

Comments

@IonutQo2

IonutQo2 commented Apr 28, 2019

I have been running this from the command line with python page_dewarp.py image.jpg. It works on the included images, but with an image of my own the text in the output is cropped and cut off.

The image was a 2 MB JPEG with a resolution of 3400x4600 px.

These are the original and the resulting image:
image

Do you know what might be the problem? Thanks

@ghost

ghost commented May 28, 2019

Try setting PAGE_MARGIN_X and PAGE_MARGIN_Y to 0.
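To see why the margins can cut text, here is a minimal sketch. It assumes the two constants inset a rectangle from each image edge and that anything outside it is ignored, as the constants' comments ("px to ignore near L/R edge", "near T/B edge") suggest. The function names `page_extent` and `inside` and the image size are hypothetical and do not come from page_dewarp.py.

```python
def page_extent(width, height, margin_x, margin_y):
    """Return (xmin, ymin, xmax, ymax) of the area left after insetting the margins."""
    xmin, ymin = margin_x, margin_y
    xmax, ymax = width - margin_x, height - margin_y
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("margins leave no usable area")
    return xmin, ymin, xmax, ymax


def inside(extent, x, y):
    """True if pixel (x, y) lies inside the kept area."""
    xmin, ymin, xmax, ymax = extent
    return xmin <= x < xmax and ymin <= y < ymax


# Illustrative numbers only: a 600x800 working image with text starting 30 px from the left edge.
text_pixel = (30, 400)
print(inside(page_extent(600, 800, 50, 20), *text_pixel))  # margins 50/20: pixel is ignored
print(inside(page_extent(600, 800, 0, 0), *text_pixel))    # margins 0/0: pixel is kept
```

If the text in your scan runs close to the image border, a non-zero margin would exclude it under this assumption, which is consistent with the suggestion to set both values to 0.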

@jbarth-ubhd

I personally would prefer other defaults:

  • no cropping
  • no binarization (black/white)
  • no subsampling (full resolution)
  • in other words, dewarping only

@KyleWang-Hunter
How do I set these defaults in the code?

@jbarth-ubhd

diff --git a/page_dewarp.py b/page_dewarp.py
index 6ef5b33..d095244 100755
--- a/page_dewarp.py
+++ b/page_dewarp.py
@@ -20,8 +20,8 @@ import scipy.optimize
 # for some reason pylint complains about cv2 members being undefined :(
 # pylint: disable=E1101
 
-PAGE_MARGIN_X = 50       # reduced px to ignore near L/R edge
-PAGE_MARGIN_Y = 20       # reduced px to ignore near T/B edge
+PAGE_MARGIN_X = 0       # reduced px to ignore near L/R edge
+PAGE_MARGIN_Y = 0       # reduced px to ignore near T/B edge
 
 OUTPUT_ZOOM = 1.0        # how much to zoom output relative to *original* image
 OUTPUT_DPI = 300         # just affects stated DPI of PNG, not appearance
@@ -813,17 +813,13 @@ def remap_image(name, img, small, page_dims, params):
     image_y_coords = cv2.resize(image_y_coords, (width, height),
                                 interpolation=cv2.INTER_CUBIC)
 
-    img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
-
-    remapped = cv2.remap(img_gray, image_x_coords, image_y_coords,
+    remapped = cv2.remap(img, image_x_coords, image_y_coords,
                          cv2.INTER_CUBIC,
                          None, cv2.BORDER_REPLICATE)
 
-    thresh = cv2.adaptiveThreshold(remapped, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
-                                   cv2.THRESH_BINARY, ADAPTIVE_WINSZ, 25)
+    thresh = remapped
 
     pil_image = Image.fromarray(thresh)
-    pil_image = pil_image.convert('1')
 
     threshfile = name + '_thresh.png'
     pil_image.save(threshfile, dpi=(OUTPUT_DPI, OUTPUT_DPI))

@KyleWang-Hunter

Thank you very much.

@jbarth-ubhd

jbarth-ubhd commented Oct 11, 2022

PS: I ran a simulation of how the pages of an open book warp. Approximations of these page profiles (x, y) look like x^4 rather than x^3, but I don't know how this maps to the "text line curves":

image
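For readers wondering what the "x^3" refers to: presumably a cubic model of the page profile. The sketch below is an assumption, not a quote from page_dewarp.py. It takes a cubic on the interval [0, 1] that is pinned to zero at both ends and is parametrized by its two end slopes. The names `cubic_profile`, `alpha` and `beta` are illustrative.

```python
def cubic_profile(x, alpha, beta):
    """Cubic with f(0) = f(1) = 0, f'(0) = alpha and f'(1) = beta."""
    return (alpha + beta) * x**3 - (2 * alpha + beta) * x**2 + alpha * x


def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def profile(x):
    # Example slopes chosen only for illustration.
    return cubic_profile(x, 0.3, -0.2)


# Sample the profile across the page width.
for i in range(5):
    x = i / 4
    print(f"x={x:.2f}  z={profile(x):+.4f}")
```

Such a model has only two free shape parameters, so it cannot reproduce a quartic profile exactly. That may be the mismatch the simulation above points at, but this is a guess.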

@phamkhactu

phamkhactu commented Nov 23, 2022

Hi @jbarth-ubhd @KyleWang-Hunter, I have the same problem with cut text: the text is cut off at the end of the image.
image
I have already set margin_x and margin_y to zero. How can I fix it? Thanks in advance.

@jbarth-ubhd

For slightly skewed text like this, with no curvature from bent paper, I would use a much simpler algorithm, e.g. https://github.com/jbarth-ubhd/fix-perspective
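One common "simpler algorithm" for flat but skewed pages is a four-point perspective (homography) transform. The sketch below illustrates that general technique in plain Python. It is not taken from the linked repository, whose actual method is not described in this thread. The function names and the corner coordinates are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """3x3 matrix mapping four src points onto four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h, x, y):
    """Apply homography h to the point (x, y)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical page corners in the photo (TL, TR, BR, BL) and the upright target rectangle.
corners = [(10, 20), (300, 5), (320, 410), (0, 400)]
target = [(0, 0), (300, 0), (300, 400), (0, 400)]
H = homography(corners, target)
print(warp_point(H, 10, 20))  # lands near the target's top-left corner
```

In practice the same matrix would be applied to every pixel, for example with an image library's perspective-warp routine. Finding the four page corners automatically is the harder part and is not covered here.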
