PRECEPT (Pre-Alpha Development)

Primitive device approximation, a machine learning extension for the PREDICT Toolbox, based on Lightning. Train neural networks with PREDICT data to model the behavior of primitive devices.

Setup

Dependencies and Requirements

Tested with:

  • conda: 4.9.2
  • pip: 21.0.1
  • python: 3.8.8
  • CUDA: 11.2
  • Nvidia Driver: 460.73.01

Everything else is specified in requirements.txt. Other or higher versions of these dependencies may work, but are untested.

Installation

Clone this repository:

$ git clone https://github.com/electronics-and-drives/precept.git

cd into the cloned precept directory and install the Python package:

$ pip install .

This makes the precept API available, as well as the two CLIs pct (for training) and prc (for inference).

CLI

precept comes with CLIs for both training and inference, based on .yml configuration files. For more information about all options, see the corresponding help:

$ pct --help

$ prc --help

Training

$ pct --config ./examples/train.yml

Precept comes with the following options:

model:
  learning_rate: <float, default = 0.001>
  beta_1:        <float, default = 0.9>
  beta_2:        <float, default = 0.999>
data:
  data_path:     <string>   # Path to HDF5 database
  params_x:      <[string]> # List of input column names
  params_y:      <[string]> # List of output column names
  trafo_mask_x:  <[string]> # List of input parameters that will be transformed
  trafo_mask_y:  <[string]> # List of output parameters that will be transformed
  batch_size:    <int, default = 2000>
  test_split:    <float, default = 0.2>
  num_workers:   <int, default = 6>
  rng_seed:      <int>
serialize:       <bool, default = true>
device_name:     <string> # File name for output
model_prefix:    <string> # Path where to store output

A default config can be generated by running:

$ pct --print_config > default.yml
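For orientation, a filled-in configuration might look like the following. All values are illustrative, and the column names are hypothetical; they must match the headers of the HDF5 database:

```yaml
model:
  learning_rate: 0.001
  beta_1:        0.9
  beta_2:        0.999
data:
  data_path:     ./data/nmos.h5     # hypothetical path
  params_x:      [W, L, Vgs, Vds]   # hypothetical input columns
  params_y:      [id, gm, gds]      # hypothetical output columns
  trafo_mask_x:  [W, L]
  trafo_mask_y:  [id]
  batch_size:    2000
  test_split:    0.2
  num_workers:   6
  rng_seed:      666
serialize:       true
device_name:     nmos-example
model_prefix:    ./models
```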

Additional documentation for Lightning-specific configuration can be found in the Lightning documentation.

Training Data

The preferred file format for training data is currently HDF5, for which two different layouts are supported. In the first, the file has two fields: one named columns, containing a list of strings corresponding to the operating point parameter names, and another named data, containing the data matrix.

In [1]: list(map(lambda k: f"{k}: {hdf_file[k].shape}", hdf_file.keys()))
Out[1]: ['columns: (18,)', 'data: (18, 16105100)']
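A file in this layout can be written with h5py, for example. This is a minimal sketch; the parameter names and sizes are placeholders, and the real names must match the params_x / params_y lists in train.yml:

```python
import h5py
import numpy as np

# Hypothetical operating-point parameter names and an illustrative
# number of operating points; replace with real PREDICT data.
columns = ["W", "L", "Vgs", "Vds", "id", "gm"]
rows    = 1000
data    = np.random.rand(len(columns), rows)

with h5py.File("train-data.h5", "w") as h5:
    # 'columns' holds the parameter names as a list of (byte) strings ...
    h5.create_dataset("columns", data=np.array(columns, dtype="S"))
    # ... and 'data' holds the matrix with one row per parameter.
    h5.create_dataset("data", data=data)

# Inspect the resulting layout.
with h5py.File("train-data.h5", "r") as h5:
    shapes = {k: h5[k].shape for k in h5.keys()}
```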

Alternatively, if storing or reading strings is not wanted or possible (command line utilities, Octave, ...), the file may be formatted such that each parameter gets its own group in the file.

In [2]: list(map(lambda k: f"{k}: {f[k].shape}", f.keys()))
Out[2]:
['L: (14641000,)',
 'Vbs: (14641000,)',
 'Vds: (14641000,)',
 'Vgs: (14641000,)',
 'W: (14641000,)',
 'cdb: (14641000,)',
 'cds: (14641000,)',
 'cgb: (14641000,)',
 'cgd: (14641000,)',
 'cgs: (14641000,)',
 'csb: (14641000,)',
 'fug: (14641000,)',
 'gbd: (14641000,)',
 'gbs: (14641000,)',
 'gds: (14641000,)',
 'gm: (14641000,)',
 'gmbs: (14641000,)',
 'id: (14641000,)',
 'vdsat: (14641000,)',
 'vth: (14641000,)']

The columns or group names are the headers for the stored data. They must align with the params_x and params_y specifications in the given train.yml.
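The group-per-parameter layout can also be produced with h5py. Again a sketch with hypothetical parameter names and an illustrative sample count; every parameter must have the same length:

```python
import h5py
import numpy as np

# Hypothetical parameter names; the real ones must match the
# params_x / params_y lists in train.yml.
params = ["W", "L", "Vgs", "Vds", "id", "gm"]
rows   = 1000                       # illustrative number of operating points

with h5py.File("train-data-grouped.h5", "w") as h5:
    # One flat dataset per parameter name, all of the same length.
    for name in params:
        h5.create_dataset(name, data=np.random.rand(rows))

# Inspect the resulting layout.
with h5py.File("train-data-grouped.h5", "r") as h5:
    shapes = {k: h5[k].shape for k in h5.keys()}
```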

If you need some toy data, check out pyrdict.

Inference

Flask

The inference interface prc is much simpler. Its configuration takes only a dictionary of all the models that should be served.

host: <string, default = localhost> # IP or hostname
port: <int, default = 5000>         # Port
models:
    <model-name>: 
        model_path:  <string> # Path to <name>-model.bin
        config_path: <string> # Path to <name>-model.yaml
    ...
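A concrete configuration serving a single model might look like this; the model name and paths are hypothetical:

```yaml
host: localhost
port: 5000
models:
    nmos:                                        # hypothetical model name
        model_path:  ./models/nmos-model.bin     # hypothetical path
        config_path: ./models/nmos-model.yaml    # hypothetical path
```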

Start the Flask server with the prc command and a configuration like the one shown in examples/infer.yml.

$ prc --config ./examples/infer.yml

 * Serving Flask app 'prc' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)

Then, models can be evaluated by sending requests with the following structure:

$ curl -X POST -H "Content-Type: application/json" \
       -d '{"<model-name>": {"<param 1>": [vals...], "<param 2>": [vals...], ... }}' \
       127.0.0.1:5000/predict

The values for each parameter must be given as a list, even if there is only one. All parameters must carry the same number of values, and every parameter must have been specified in params_x during training.
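The same request can be issued from Python using only the standard library. The model name ("nmos") and parameter names below are hypothetical and must match a served model and its params_x specification:

```python
import json
from urllib import request

# Hypothetical model and parameter names; one value per operating point,
# with the same number of values for every parameter.
payload = {"nmos": {"Vgs": [0.3, 0.6, 0.9],
                    "Vds": [0.6, 0.6, 0.6]}}

def predict(payload, url="http://127.0.0.1:5000/predict"):
    """POST the JSON payload to a running prc server and decode the reply."""
    req = request.Request(url,
                          data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Sanity check: every parameter carries the same number of values.
lengths = {len(v) for v in payload["nmos"].values()}
```

Calling predict(payload) with a prc server running would return the model's outputs for the three operating points encoded above.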

Compiled

An alternative inference method for TorchScript models is currently in the works, but still in very early development. It can be found in the precppt repository.

API

Examples

Commented API usage examples for both training and inference can be found in examples/training.py and examples/inference.py respectively.

Reference

soonTM

TODO

  • Split classes into separate modules
  • Install requirements in setup
  • Dump scaler and transformer in after-fit
  • Infer input and output size from x-parameters and y-parameters
  • Implement serialization and compile model trace to TorchScript
    • Implement model inference based on Flask
    • TorchScript C++ interface
  • CSV Data support
  • Don't hardcode CSV Format
  • TSV, ASCII, PSF, nutmeg, nutbin etc... support
  • Alternative scaling and transforming for better use with serialized models
  • Get rid of hard coded processing for additional parameter calculation
  • Add training and inference API examples
    • Add toy models for inference examples
    • Notebooks as well
  • Deprecate transformation, should be part of manual preprocessing
  • Add better logging
  • Add manpages for CLI(1), API(8) and CFG(5)
  • Add tests

License

Copyright (C) 2021, Electronics & Drives Lab

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.