Heroku tesseract build pack support? #123

Open
ansonl opened this issue Apr 15, 2018 · 3 comments

Comments

@ansonl
Contributor

ansonl commented Apr 15, 2018

Is there a known configuration that gets this working with one of the Heroku tesseract buildpacks, such as https://github.com/Dkevs/heroku-buildpack-tesseract?

When compiling the Go app, Heroku reports this error:

tessbridge.cpp:5:31: fatal error: tesseract/baseapi.h: No such file or directory
remote: compilation terminated.

I've tried setting CGO_CFLAGS on Heroku with heroku config:set CGO_CFLAGS='-I ${build_dir}/tesseract/../', to no avail.

I see the example Heroku project uses Docker and installs libtesseract-dev. Is gosseract only tested with Docker, and can you recommend a buildpack for libtesseract-dev?

@otiai10
Owner

otiai10 commented Apr 15, 2018

Hi, @ansonl

First, have you tried LD_LIBRARY_PATH?

I personally recommend using Docker for your Heroku application because it's more flexible and easier to handle.

I'll try the buildpack when I have time.

@ansonl
Contributor Author

ansonl commented Apr 24, 2018

Unfortunately, I wasn't able to get the libtesseract-dev buildpack working.
I ended up just calling the tesseract command through os/exec, for a project that runs tesseract a couple hundred times.
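
For anyone taking the same route, shelling out might look roughly like the sketch below. It assumes a tesseract binary is on PATH and relies on the CLI convention that passing stdout as the output base prints the recognized text to standard output. The names ocrCommand and ocrText are made up for illustration and are not part of gosseract.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// ocrCommand builds the CLI invocation: <bin> <image> stdout.
// Hypothetical helper; bin is usually "tesseract".
func ocrCommand(bin, imagePath string) *exec.Cmd {
	return exec.Command(bin, imagePath, "stdout")
}

// ocrText runs the command and returns the trimmed text it printed.
// Anything written to stderr is attached to the returned error.
func ocrText(bin, imagePath string) (string, error) {
	var stdout, stderr bytes.Buffer
	cmd := ocrCommand(bin, imagePath)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("tesseract failed: %w: %s",
			err, strings.TrimSpace(stderr.String()))
	}
	return strings.TrimSpace(stdout.String()), nil
}

func main() {
	text, err := ocrText("tesseract", "002-confusing.png")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

Each call starts a fresh process, so any memory the library holds is returned to the operating system when that process exits. The trade-off is process startup cost on every image.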

This also confirmed what seems to be a memory leak in the Tesseract BaseAPI End() function. After calling End() and letting the client go out of scope, memory usage decreases slightly, but a couple of megabytes remain in use for each client struct created.

This can be seen by running the test program below:

package main

import (
	"fmt"
	"time"

	"github.com/otiai10/gosseract"
)

func main() {
	var count int
	var clients []*gosseract.Client

	// Create 20 clients, one every 100 ms, and keep each one referenced.
	for range time.Tick(time.Millisecond * 100) {
		client := gosseract.NewClient()
		client.SetImage("002-confusing.png")
		text, _ := client.Text()
		_, _ = client.HOCRText()
		fmt.Println(text)
		// Hello, World!
		count++

		clients = append(clients, client)

		if count == 20 {
			break
		}
	}

	// Short pause before closing: count back down to zero in 10 ms ticks.
	for range time.Tick(time.Millisecond * 10) {
		count--
		if count == 0 {
			break
		}
	}

	// Close every client, which should call End() on the Tesseract BaseAPI.
	for _, c := range clients {
		c.Close()
	}

	// Keep the process alive so its memory usage can be observed.
	for range time.Tick(time.Second * 1) {
	}
}

For some reason, memory usage remains elevated after the clients are closed, at which point the Tesseract BaseAPI End() method should have been called. I have tried calling the Go garbage collector functions, and it makes no apparent difference. The only way I have found to release the memory is to exit the program.

I looked through the .cpp file and did not see any bugs, so this may be an issue in the tesseract library itself.
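
One detail that fits the observation above: Go's garbage collector only manages memory allocated by Go, so memory held by the C++ library behind cgo would not shrink after a GC run. A rough way to tell the two apart is to print the Go heap next to the process's resident set size. The sketch below is Linux-only because it reads /proc/self/status, and parseVmRSS and report are hypothetical helper names.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"runtime/debug"
	"strconv"
	"strings"
)

// parseVmRSS extracts the VmRSS value (in kB) from the text of
// /proc/self/status. It returns false if the field is missing.
func parseVmRSS(status string) (int64, bool) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "VmRSS:" {
			kb, err := strconv.ParseInt(fields[1], 10, 64)
			return kb, err == nil
		}
	}
	return 0, false
}

// report prints the Go heap in use and, where available, the process RSS.
func report(label string) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("%s: Go heap in use = %d kB", label, m.HeapInuse/1024)
	if data, err := os.ReadFile("/proc/self/status"); err == nil {
		if kb, ok := parseVmRSS(string(data)); ok {
			fmt.Printf(", process RSS = %d kB", kb)
		}
	}
	fmt.Println()
}

func main() {
	report("start")
	// In the test program, the client loop and Close() calls would go here.
	runtime.GC()
	debug.FreeOSMemory()
	report("after GC")
}
```

If the Go heap figure drops after the clients are closed but RSS stays high, the retained memory is most likely on the native side, which would point at the library or the allocator rather than at the Go wrapper.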

ansonl changed the title from "Heroku tesseract build pack support?" to "Heroku tesseract build pack support? + possible memory leak issue" on Apr 24, 2018
@otiai10
Owner

otiai10 commented Apr 24, 2018

@ansonl
Thank you.
Would you mind splitting these into separate issues, please?

ansonl closed this as completed on Apr 25, 2018
ansonl changed the title from "Heroku tesseract build pack support? + possible memory leak issue" to "Heroku tesseract build pack support?" on Apr 25, 2018
otiai10 reopened this on Apr 25, 2018