Skip to content

Commit

Permalink
#950 refactored options line parser (#445)
Browse files Browse the repository at this point in the history
fixes #950
  • Loading branch information
tomuben authored Sep 5, 2024
1 parent 4ebbbac commit afc9ef6
Show file tree
Hide file tree
Showing 27 changed files with 463 additions and 176 deletions.
10 changes: 8 additions & 2 deletions .github/workflows/check_bazel_tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,13 @@ jobs:
build:
runs-on: ubuntu-latest

env:
USE_BAZEL_VERSION=7.2.1

steps:
- uses: actions/checkout@v4


- name: Search for duplicated error codes
run: bash find_duplicate_error_codes.sh

Expand All @@ -27,12 +31,14 @@ jobs:
sudo apt-get install -y openjdk-11-jdk libzmq3-dev
- name: Java Tests
run: |
export USE_BAZEL_VERSION=7.2.1
bazel test //base/javacontainer/test/...
working-directory: ./exaudfclient/
- name: ExaudfLib Tests
run: |
export USE_BAZEL_VERSION=7.2.1
bazel test //base/exaudflib/test/...
working-directory: ./exaudfclient/
- name: Script Options Parser Tests
run: |
bazel test //base/script_options_parser/test/...
working-directory: ./exaudfclient/

2 changes: 1 addition & 1 deletion exaudfclient/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ The usage of multiple linker namespace requires some precautions in the build pr

## Precautions in the build process

In the build process you need to be cautious which libraries you link together and that no link leaks symbols from a library in one namespace to a library in the other namespace. Furthermore, you have to build a shared library with all dependency linked to it as output target. In our case, we have to main output targets //:exaudfclient and //:libexaudflib.so. Both get loaded into different linker namespaces. The language container live in the same namespace as //:exaudfclient. This namespace must not know anyhing about protobuf and zeromq, because it is possible that a language container may load protobuf or zeromq in a different version. Protobuf and zeromq are only known in the namespace of //:libexaudflib.so. The target //:libexaudflib.so depends on //base/exaudflib:exaudflib which contains the logic of the exaudflib. You must not depend on //base/exaudflib:exaudflib in //:exaudfclient or the langauge container, because this would leak zeromq and protobuf. If you need to depend on the other dependency of //base/exaudflib:exaudflib which not depend on protobuf or zeromq them self, such as //base/exaudflib:script_data_transfer_objects, //base/exaudflib:script_data_transfer_objects_wrapper, //base/exaudflib:scriptoptionlines, use either their target as self, the collection of libraries //base/exaudflib:exaudflib-deps or the collection of headers //base/exaudflib:header.
In the build process you need to be cautious which libraries you link together and that no link leaks symbols from a library in one namespace to a library in the other namespace. Furthermore, you have to build a shared library with all dependency linked to it as output target. In our case, we have to main output targets //:exaudfclient and //:libexaudflib.so. Both get loaded into different linker namespaces. The language container live in the same namespace as //:exaudfclient. This namespace must not know anyhing about protobuf and zeromq, because it is possible that a language container may load protobuf or zeromq in a different version. Protobuf and zeromq are only known in the namespace of //:libexaudflib.so. The target //:libexaudflib.so depends on //base/exaudflib:exaudflib which contains the logic of the exaudflib. You must not depend on //base/exaudflib:exaudflib in //:exaudfclient or the langauge container, because this would leak zeromq and protobuf. If you need to depend on the other dependency of //base/exaudflib:exaudflib which not depend on protobuf or zeromq them self, such as //base/exaudflib:script_data_transfer_objects, //base/exaudflib:script_data_transfer_objects_wrapper, //base/script_options_parser:scriptoptionlines, use either their target as self, the collection of libraries //base/exaudflib:exaudflib-deps or the collection of headers //base/exaudflib:header.

## Precautions in the implementations

Expand Down
2 changes: 1 addition & 1 deletion exaudfclient/base/benchmark_container/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -4,5 +4,5 @@ cc_library(
name = "benchmark_container",
srcs = ["benchmark_container.cc","benchmark_container.h"],
hdrs = ["benchmark_container.h"],
deps = ["//base/exaudflib:exaudflib-deps","//base/exaudflib:header", "//base:debug_message_h"]+["//base/exaudflib:scriptoptionlines"]
deps = ["//base/exaudflib:exaudflib-deps","//base/exaudflib:header", "//base:debug_message_h"]+["//base/script_options_parser:scriptoptionlines"]
)
11 changes: 1 addition & 10 deletions exaudfclient/base/exaudflib/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -3,15 +3,6 @@ exports_files(["exascript.i"])

load("//:variables.bzl", "VM_ENABLED_DEFINES")

cc_library(
name = "scriptoptionlines",
hdrs = ["vm/scriptoptionlines.h"],
srcs = ["vm/scriptoptionlines.cc","vm/scriptoptionlines.h"],
copts= ["-fno-lto"],
# We deactivate link time optimization, because otherwise we get the following linker/compiler error while linking/compiling exaudfclient
# function ExecutionGraph::extractOptionLine(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, unsigned long&, std::function<void (char const*)>): error: undefined reference to 'std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const*)'
)

cc_library(
name = "script_data_transfer_objects",
srcs = ["swig/script_data_transfer_objects.cc","swig/script_data_transfer_objects.h"],
Expand Down Expand Up @@ -82,7 +73,7 @@ cc_library(
# The only target which must depend on it is //:libexaudflib-complete.so.
# If you depend on this target you will load protobuf into your linker namespace
# which might cause problems with tensorflow. If you need
# "//base/exaudflib:scriptoptionlines","//base/exaudflib:script_data_transfer_objects" or
# "//base/exaudflib:script_data_transfer_objects" or
# "//base/exaudflib:script_data_transfer_objects_wrapper" in the language container
# depend directly on them or depend on exaudflib-deps
cc_library(
Expand Down
2 changes: 1 addition & 1 deletion exaudfclient/base/exaudflib/swig/swig_meta_data.h
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ class SWIGMetadata {
if (impl!=nullptr) {
delete impl;
}
}
}
virtual const char* databaseName() { return impl->databaseName(); }
virtual const char* databaseVersion() { return impl->databaseVersion(); }
virtual const char* scriptName() { return impl->scriptName(); }
Expand Down
3 changes: 1 addition & 2 deletions exaudfclient/base/exaudflib/test/BUILD
Original file line number Diff line number Diff line change
@@ -1,9 +1,8 @@
cc_test(
name = "exaudflib-test",
srcs = ["script_data_transfer_objects_test.cpp", "script_option_lines_test.cpp"],
srcs = ["script_data_transfer_objects_test.cpp"],
deps = [
"//base/exaudflib:script_data_transfer_objects",
"//base/exaudflib:scriptoptionlines",
"@googletest//:gtest_main",
],
)
2 changes: 1 addition & 1 deletion exaudfclient/base/javacontainer/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ cc_library(
name = "javacontainer",
srcs = [":javacontainer.cc", ":javacontainer.h", ":javacontainer_impl.cc", ":javacontainer_impl.h", ":dummy"],
hdrs = [":filter_swig_code_exascript_java_h", "exascript_java_jni_decl.h"],
deps = ["@ssl//:ssl","@java//:java", ":exascript_java", "//base/exaudflib:header", "//base:debug_message_h","//base/exaudflib:scriptoptionlines"],
deps = ["@ssl//:ssl","@java//:java", ":exascript_java", "//base/exaudflib:header", "//base:debug_message_h","//base/javacontainer/script_options:java_scriptoptionlines"],
# copts= ["-O0","-fno-lto"],
alwayslink=True,
)
Expand Down
154 changes: 18 additions & 136 deletions exaudfclient/base/javacontainer/javacontainer_impl.cc
Original file line number Diff line number Diff line change
@@ -1,19 +1,18 @@
#include <iostream>
#include <dirent.h>
#include <sys/stat.h>
#include <openssl/md5.h>
#include <set>
#include <jni.h>
#include <unistd.h>
#include <sstream>

#include "base/exaudflib/swig/swig_meta_data.h"
#include "exascript_java_jni_decl.h"

#include "base/debug_message.h"
#include "base/javacontainer/javacontainer.h"
#include "base/javacontainer/javacontainer_impl.h"
#include "base/exaudflib/vm/scriptoptionlines.h"
#include "base/javacontainer/script_options/converter.h"


using namespace SWIGVMContainers;
using namespace std;
Expand All @@ -23,11 +22,23 @@ JavaVMImpl::JavaVMImpl(bool checkOnly, bool noJNI): m_checkOnly(checkOnly), m_ex

stringstream ss;
m_exaJavaPath = "/exaudf/base/javacontainer"; // TODO hardcoded path
DBG_FUNC_CALL(cerr,getScriptClassName()); // To be called before scripts are imported. Otherwise, the script classname from an imported script could be used
DBG_FUNC_CALL(cerr,importScripts());
DBG_FUNC_CALL(cerr,getExternalJvmOptions());



JavaScriptOptions::ScriptOptionsConverter optionsConverter([&](const std::string &msg){throwException(msg);},
m_jvmOptions);

DBG_FUNC_CALL(cerr,optionsConverter.getScriptClassName(m_scriptCode)); // To be called before scripts are imported. Otherwise, the script classname from an imported script could be used
DBG_FUNC_CALL(cerr,optionsConverter.convertImportScripts(m_scriptCode));
DBG_FUNC_CALL(cerr,optionsConverter.getExternalJvmOptions(m_scriptCode));
DBG_FUNC_CALL(cerr,setClasspath());
DBG_FUNC_CALL(cerr,addExternalJarPaths());
DBG_FUNC_CALL(cerr,optionsConverter.getExternalJarPaths(m_scriptCode));

for (set<string>::iterator it = optionsConverter.getJarPaths().begin(); it != optionsConverter.getJarPaths().end();
++it) {
addJarToClasspath(*it);
}

m_needsCompilation = checkNeedsCompilation();
if (m_needsCompilation) {
DBG_FUNC_CALL(cerr,addPackageToScript());
Expand Down Expand Up @@ -233,94 +244,6 @@ void JavaVMImpl::compileScript() {
check("F-UDF-CL-SL-JAVA-1037",calledUndefinedSingleCall);
}

void JavaVMImpl::addExternalJarPaths() {
const string jarKeyword = "%jar";
const string whitespace = " \t\f\v";
const string lineEnd = ";";
size_t pos;
while (true) {
string jarPath = ExecutionGraph::extractOptionLine(m_scriptCode, jarKeyword, whitespace, lineEnd, pos, [&](const char* msg){throwException(msg);});
if (jarPath == "")
break;
for (size_t start = 0, delim = 0; ; start = delim + 1) {
delim = jarPath.find(":", start);
if (delim != string::npos) {
string jar = jarPath.substr(start, delim - start);
if (m_jarPaths.find(jar) == m_jarPaths.end())
m_jarPaths.insert(jar);
}
else {
string jar = jarPath.substr(start);
if (m_jarPaths.find(jar) == m_jarPaths.end())
m_jarPaths.insert(jar);
break;
}
}
}
for (set<string>::iterator it = m_jarPaths.begin(); it != m_jarPaths.end(); ++it) {
addJarToClasspath(*it);
}
}

void JavaVMImpl::getScriptClassName() {
const string scriptClassKeyword = "%scriptclass";
const string whitespace = " \t\f\v";
const string lineEnd = ";";
size_t pos;
string scriptClass =
ExecutionGraph::extractOptionLine(
m_scriptCode,
scriptClassKeyword,
whitespace,
lineEnd,
pos,
[&](const char* msg){throwException("F-UDF-CL-SL-JAVA-1038: "+std::string(msg));}
);
if (scriptClass != "") {
m_jvmOptions.push_back("-Dexasol.scriptclass=" + scriptClass);
}
}

void JavaVMImpl::importScripts() {
SWIGMetadata *meta = NULL;
const string importKeyword = "%import";
const string whitespace = " \t\f\v";
const string lineEnd = ";";
size_t pos;
// Attention: We must hash the parent script before modifying it (adding the
// package definition). Otherwise we don't recognize if the script imports its self
m_importedScriptChecksums.insert(scriptToMd5(m_scriptCode.c_str()));
while (true) {
string scriptName =
ExecutionGraph::extractOptionLine(
m_scriptCode,
importKeyword,
whitespace,
lineEnd,
pos,
[&](const char* msg){throwException("F-UDF-CL-SL-JAVA-1039: "+std::string(msg));}
);
if (scriptName == "")
break;
if (!meta) {
meta = new SWIGMetadata();
if (!meta)
throwException("F-UDF-CL-SL-JAVA-1040: Failure while importing scripts");
}
const char *scriptCode = meta->moduleContent(scriptName.c_str());
const char *exception = meta->checkException();
if (exception)
throwException("F-UDF-CL-SL-JAVA-1041: "+std::string(exception));
if (m_importedScriptChecksums.insert(scriptToMd5(scriptCode)).second) {
// Script has not been imported yet
// If this imported script contains %import statements
// they will be resolved in this while loop.
m_scriptCode.insert(pos, scriptCode);
}
}
if (meta)
delete meta;
}

bool JavaVMImpl::check(const string& errorCode, string& calledUndefinedSingleCall) {
jthrowable ex = m_env->ExceptionOccurred();
Expand Down Expand Up @@ -448,15 +371,6 @@ bool JavaVMImpl::checkNeedsCompilation() {
return false == trimmedScriptCode.empty();
}

vector<unsigned char> JavaVMImpl::scriptToMd5(const char *script) {
MD5_CTX ctx;
unsigned char md5[MD5_DIGEST_LENGTH];
MD5_Init(&ctx);
MD5_Update(&ctx, script, strlen(script));
MD5_Final(md5, &ctx);
return vector<unsigned char>(md5, md5 + sizeof(md5));
}

void JavaVMImpl::addJarToClasspath(const string& path) {
string jarPath = path; // m_exaJavaPath + "/jars/" + path;

Expand Down Expand Up @@ -488,38 +402,6 @@ void JavaVMImpl::addJarToClasspath(const string& path) {
m_classpath += ":" + jarPath;
}

void JavaVMImpl::getExternalJvmOptions() {
const string jvmOption = "%jvmoption";
const string whitespace = " \t\f\v";
const string lineEnd = ";";
size_t pos;
while (true) {
string options =
ExecutionGraph::extractOptionLine(
m_scriptCode,
jvmOption,
whitespace,
lineEnd,
pos,
[&](const char* msg){throwException("F-UDF-CL-SL-JAVA-1064: "+std::string(msg));}
);
if (options == "")
break;
for (size_t start = 0, delim = 0; ; start = delim + 1) {
start = options.find_first_not_of(whitespace, start);
if (start == string::npos)
break;
delim = options.find_first_of(whitespace, start);
if (delim != string::npos) {
m_jvmOptions.push_back(options.substr(start, delim - start));
}
else {
m_jvmOptions.push_back(options.substr(start));
break;
}
}
}
}

void JavaVMImpl::setJvmOptions() {
bool minHeap = false;
Expand Down
8 changes: 0 additions & 8 deletions exaudfclient/base/javacontainer/javacontainer_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -35,22 +35,14 @@ class JavaVMImpl {
void throwException(const char *message);
void throwException(const std::exception& ex);
void throwException(const std::string& ex);
//void throwException(swig_undefined_single_call_exception& ex);
void importScripts();
void addExternalJarPaths();
void getExternalJvmOptions();
void getScriptClassName();
void setJvmOptions();
void addJarToClasspath(const std::string& path);
std::vector<unsigned char> scriptToMd5(const char *script);
bool m_checkOnly;
std::string m_exaJavaPath;
std::string m_localClasspath;
std::string m_scriptCode;
std::string m_exaJarPath;
std::string m_classpath;
std::set<std::vector<unsigned char> > m_importedScriptChecksums;
std::set<std::string> m_jarPaths;
bool m_exceptionThrown;
std::vector<std::string> m_jvmOptions;
JavaVM *m_jvm;
Expand Down
9 changes: 9 additions & 0 deletions exaudfclient/base/javacontainer/script_options/BUILD
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
package(default_visibility = ["//visibility:public"])


cc_library(
name = "java_script_option_lines",
hdrs = [":converter.h"],
srcs = [":parser.h", ":converter.h", ":converter.cc", ":parser_legacy.h", ":parser_legacy.cc"],
deps = ["//base/script_options_parser:script_option_lines_parser", "//base/exaudflib:header", "//base/exaudflib:exaudflib-deps"],
)
Loading

0 comments on commit afc9ef6

Please sign in to comment.