Take steps towards thread safety #307
This tests both parsing and expansion.
This is primarily so we can do things like CFLAGS=-fsanitize=thread. We can't compile src with it and test without it.
I found this was leading to accessing invalid memory and I suspect a segfault on Travis. I cannot see how the changes so far caused this invalid access.
Regarding the commit de-optimizing the matrix
I can't reproduce it on master, which is very suspicious. But I've been staring at the code and I can't see what change could have impacted it.
I see what caused it: The change to
Possibly the optimized version expects the number of values in the matrix to be a certain multiple. In that case, we could re-add the optimization if we update the test to respect that. We may also want to check the other uses to ensure they respect it too.
This makes libpostal_parse_address() and libpostal_expand_address() thread safe. The main change is to make them no longer mutate any shared memory (e.g., context).

The global loading and destruction calls are still not thread safe. I notice they were mentioned in #34. However, I think leaving them that way would be okay, as multithreaded programs could call them at startup (or take an exclusive lock or something), especially as they are quite expensive operations.
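To illustrate the "no shared mutation" idea, here is a minimal sketch in C. It is not libpostal code: toy_context_t, toy_parse() and TOY_MAX_TOKENS are hypothetical names made up for this example. The function writes its scratch data only into a context owned by the caller, so two threads that each pass their own context never touch the same memory.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TOKENS 16

/* Hypothetical per-call scratch state. This is NOT libpostal's
 * address_parser_context_t, only a stand-in for illustration. */
typedef struct {
    size_t num_tokens;
    size_t token_start[TOY_MAX_TOKENS];
} toy_context_t;

/* Records the start offset of each space-separated token in input.
 * All writes go to *ctx, which the caller owns; no global or static
 * state is read or modified, so concurrent calls with distinct
 * contexts do not interfere. Returns the number of tokens recorded
 * (at most TOY_MAX_TOKENS). */
size_t toy_parse(const char *input, toy_context_t *ctx) {
    assert(input != NULL && ctx != NULL);
    ctx->num_tokens = 0;
    int in_token = 0;
    for (size_t i = 0; input[i] != '\0'; i++) {
        if (input[i] == ' ') {
            in_token = 0;
        } else if (!in_token) {
            if (ctx->num_tokens == TOY_MAX_TOKENS) {
                break; /* context is full, stop recording */
            }
            in_token = 1;
            ctx->token_start[ctx->num_tokens++] = i;
        }
    }
    return ctx->num_tokens;
}
```

Each thread would declare its own toy_context_t (on its stack, for example) and pass a pointer to it. The cost is that the caller has to manage the context, which is the trade-off a caller-supplied context always carries.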
There is still more to do, and it's possible we will have to rework the approach, but I thought I would send this and get some feedback.
Some comments:
- libpostal_parse_address() that takes an address_parser_context_t from the caller and uses that.
- strerror() in the logging calls should be replaced, as it is not thread safe. We could possibly use strerror_r() instead.
- averaged_perceptron_tagger_predict() path in the parser. If there are instructions on loading that model, I can do that too. I've commented one spot that makes it not thread safe.

export CC=clang-5.0
export CFLAGS=-fsanitize=thread

-t 4 -i 1000. As input I used the sample inputs in the README under "examples of normalization". If there were thread safety issues, the sanitizer would print them out.
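On the strerror() point above, a rough sketch of what a replacement could look like. The helper name toy_strerror() is hypothetical and not part of libpostal. The one wrinkle is that strerror_r() comes in two flavours: the XSI one returns an int and fills the buffer, while the GNU one (glibc with _GNU_SOURCE) returns a char * that may or may not point at the buffer. The sketch handles both and always leaves the message in the caller's buffer.

```c
/* Request the XSI strerror_r() unless the build already chose a feature set. */
#if !defined(_GNU_SOURCE) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200112L
#endif

#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the message for errnum into the
 * caller-owned buffer and returns buf. Unlike strerror(), it does not
 * rely on a shared static buffer. */
const char *toy_strerror(int errnum, char *buf, size_t len) {
    assert(buf != NULL && len > 0);
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
    /* GNU variant: the returned pointer may be buf or a static string. */
    const char *msg = strerror_r(errnum, buf, len);
    if (msg != buf) {
        snprintf(buf, len, "%s", msg);
    }
#else
    /* XSI variant: returns 0 on success and fills buf. */
    if (strerror_r(errnum, buf, len) != 0) {
        snprintf(buf, len, "Unknown error %d", errnum);
    }
#endif
    return buf;
}
```

A logging call would then declare a small local buffer, e.g. char errbuf[128], and pass toy_strerror(errno, errbuf, sizeof errbuf) where it previously passed strerror(errno).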