Motivation
Machine learning models are generally black boxes: they often provide state-of-the-art performance in predictive tasks, but their internal workings are difficult to inspect or explain. There could be opportunities to understand these black boxes better, for example through:
Global sensitivity analysis: By varying parameters within an ML model, the EMA Workbench can help identify which inputs significantly affect the model's outputs.
Scenario exploration: It allows for the exploration of a wide range of scenarios, including extreme cases, by systematically varying input parameters.
Uncertainty quantification: The workbench can quantify uncertainties associated with model inputs and outputs. In the context of ML, this means identifying how uncertainties in data or model parameters can affect predictions, aiding in the interpretation of model confidence and reliability.
Most other features of the workbench could be applied in the same way.
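To give a flavor of the first point, here is a minimal one-at-a-time sensitivity sketch in plain Python. `black_box_model` is a hypothetical stand-in for a trained ML model's predict function, not part of the workbench; the workbench's actual sensitivity tooling is far more sophisticated.

```python
def black_box_model(x1, x2, x3):
    # Hypothetical stand-in for a trained ML model's predict function.
    return 3.0 * x1 + 0.5 * x2 ** 2 + 0.01 * x3


def one_at_a_time_sensitivity(model, baseline, delta=0.1):
    """Estimate each input's local effect by perturbing it around a baseline."""
    base_output = model(**baseline)
    effects = {}
    for name, value in baseline.items():
        perturbed = dict(baseline)
        perturbed[name] = value + delta
        effects[name] = (model(**perturbed) - base_output) / delta
    return effects


effects = one_at_a_time_sensitivity(black_box_model, {"x1": 1.0, "x2": 1.0, "x3": 1.0})
# x1 dominates the output; x3 barely matters.
```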
Problem
Currently there are no connectors for using machine-learned models in the EMA Workbench.
One notable requirement is that both the input nodes (uncertainties / policy levers) and the output nodes (outcomes) need to be nameable, since most useful analyses with the workbench rely on named inputs and outputs.
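To illustrate the naming requirement, a sketch of the two translations any such connector has to perform (the feature and outcome names here are made up):

```python
def to_feature_vector(experiment, input_features):
    # Named workbench inputs -> positional vector in the order the model expects.
    return [experiment[name] for name in input_features]


def to_named_outcomes(predictions, output_variables):
    # Positional model outputs -> named outcomes the workbench can analyze.
    return dict(zip(output_variables, predictions))


experiment = {"rainfall": 1.2, "temperature": 18.0}  # as sampled by the workbench
X = to_feature_vector(experiment, ["temperature", "rainfall"])  # [18.0, 1.2]
outcomes = to_named_outcomes([0.7], ["flood_risk"])  # {"flood_risk": 0.7}
```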
Proposal
Create said connectors for the most common formats.
Sample implementation
These sample implementations support the most common model storage formats (Pickle/Joblib, HDF5, and ONNX). Given that these formats serve different types of models (e.g., scikit-learn models, TensorFlow/Keras deep learning models, and cross-platform ONNX models), there will be a generic MLModelConnector base class. This class will provide the foundational structure and methods, which specialized connectors can then extend and customize based on the specific requirements of each format.
Base Connector Class
The base class will define the common interface and utilities for initializing models, running experiments (making predictions), and resetting models if necessary. It assumes models are used for prediction in a policy analysis context, focusing on loading the model and making predictions based on input parameters provided by the EMA Workbench.
```python
from ema_workbench.em_framework.model import SingleReplication


class MLModelConnector(SingleReplication):
    def __init__(self, name, model_path=None, input_features=None, output_variables=None):
        super().__init__(name)
        self.model_path = model_path
        self.input_features = input_features or []
        self.output_variables = output_variables or []
        self.model = None

    def model_init(self, policy, **kwargs):
        raise NotImplementedError("Model initialization must be implemented by subclasses")

    def run_experiment(self, experiment):
        raise NotImplementedError("Experiment execution must be implemented by subclasses")

    def reset_model(self):
        # Reset logic here if applicable. Some models may not require a reset.
        pass
```
Pickle/Joblib Connector
This connector is tailored for loading and executing scikit-learn models (or any model) serialized with pickle or joblib.
```python
import joblib


class PickleJoblibConnector(MLModelConnector):
    def model_init(self, policy, **kwargs):
        # Load the model from a pickle or joblib file.
        self.model = joblib.load(self.model_path)

    def run_experiment(self, experiment):
        # Create the input array based on the named input_features.
        X = [experiment[feature] for feature in self.input_features]
        predictions = self.model.predict([X])
        # Map predictions to the named output_variables.
        return {self.output_variables[i]: prediction for i, prediction in enumerate(predictions)}
```
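For context, the serialization round-trip this connector relies on can be sketched with the standard library alone. `TinyModel` is a hypothetical stand-in for a fitted scikit-learn estimator: anything exposing a `predict` method can be stored and loaded the same way (joblib adds more efficient handling of large NumPy arrays, but the interface is analogous).

```python
import os
import pickle
import tempfile


class TinyModel:
    # Hypothetical stand-in for a fitted estimator.
    def predict(self, X):
        return [2.0 * row[0] + row[1] for row in X]


path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(TinyModel(), f)  # serialize the fitted model

with open(path, "rb") as f:
    model = pickle.load(f)  # what the connector does in model_init

prediction = model.predict([[1.0, 3.0]])  # [5.0]
```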
HDF5 Connector for Keras/TensorFlow Models
This connector focuses on deep learning models saved in the HDF5 format by TensorFlow/Keras.
```python
import numpy as np
from tensorflow.keras.models import load_model


class HDF5Connector(MLModelConnector):
    def model_init(self, policy, **kwargs):
        # Load the TensorFlow/Keras model from an HDF5 file.
        self.model = load_model(self.model_path)

    def run_experiment(self, experiment):
        # Create the input array based on the named input_features, ensuring the correct shape.
        X = np.array([[experiment[feature] for feature in self.input_features]])
        predictions = self.model.predict(X)
        # Assumes the model has a single output; adjust for models with multiple outputs.
        # Map the flattened predictions to the named output_variables.
        return {self.output_variables[i]: prediction for i, prediction in enumerate(predictions.flatten())}
```
ONNX Connector
The ONNX connector is for models exported in the ONNX format, enabling cross-platform interoperability.
```python
import numpy as np
import onnxruntime as ort


class ONNXConnector(MLModelConnector):
    def model_init(self, policy, **kwargs):
        # Initialize an ONNX Runtime inference session for the model.
        self.session = ort.InferenceSession(self.model_path)
        # Assumes a single input node; adjust as necessary.
        self.input_name = self.session.get_inputs()[0].name

    def run_experiment(self, experiment):
        # Create the input array based on the named input_features, shaped and typed for ONNX.
        X = np.array([[experiment[feature] for feature in self.input_features]], dtype=np.float32)
        input_dict = {self.input_name: X}
        # Get the output node names.
        outputs = [node.name for node in self.session.get_outputs()]
        predictions = self.session.run(outputs, input_dict)
        # Map predictions to the named output_variables. This assumes a direct mapping
        # and that the number of predictions matches the number of output_variables;
        # adjust as necessary for complex models with multiple outputs.
        return {self.output_variables[i]: predictions[0][0, i] for i in range(len(self.output_variables))}
```
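One detail worth emphasizing: ONNX Runtime is strict about input dtypes, which is why the explicit float32 cast in `run_experiment` matters. A small sketch (the feature names are made up):

```python
import numpy as np

experiment = {"x1": 1.0, "x2": 2.5}
input_features = ["x1", "x2"]
# A model exported with float32 inputs will reject NumPy's default float64 arrays,
# so the dtype must be set explicitly when building the input tensor.
X = np.array([[experiment[f] for f in input_features]], dtype=np.float32)
# X now has shape (1, 2) and dtype float32: a single-sample batch for the session.
```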