Change NLU.DevOps.ModelPerformance to generic JSON compare #188

Open
rozele opened this issue Sep 21, 2019 · 2 comments

Comments

Contributor

rozele commented Sep 21, 2019

Change NLU.DevOps.ModelPerformance to a configurable JSON compare library such that any two JSON values can be compared.

We could have a simple interface like IConfusionMatrixEvaluator:

// Compares an expected and an actual JSON value and yields the resulting test cases.
public interface IConfusionMatrixEvaluator
{
    IEnumerable<TestCase> Evaluate(JToken expected, JToken actual);
}

E.g., for NLU today, we would create the following config:

{
  "intent": "default",
  "text": "string-no-punctuation",
  "entities": "NLU.DevOps.ModelPerformance.Comparers.EntitiesEvaluator"
}

Here, EntitiesEvaluator is an implementation of IConfusionMatrixEvaluator, and string-no-punctuation and default are syntactic sugar for other evaluator implementations.

We may want to have an option that allows you to evaluate all JSON properties in the expected and actual JSON values (not just the configured properties), in which case any unconfigured property value would just use the default evaluator.
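As a rough sketch of how such a config might be resolved into evaluators, assuming Newtonsoft.Json: everything here other than IConfusionMatrixEvaluator is hypothetical. The TestCase class is a stand-in whose shape is assumed, EvaluatorFactory and the evaluateAllProperties flag are made-up names, and the behavior of "default" (deep equality) and "string-no-punctuation" (compare after dropping punctuation) is a guess at what those aliases would mean.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical stand-in for the library's TestCase type; the real shape may differ.
public sealed class TestCase
{
    public TestCase(string path, bool isMatch) { Path = path; IsMatch = isMatch; }
    public string Path { get; }
    public bool IsMatch { get; }
}

public interface IConfusionMatrixEvaluator
{
    IEnumerable<TestCase> Evaluate(JToken expected, JToken actual);
}

// Assumed meaning of "default": structural equality of the two JSON values.
public sealed class DefaultEvaluator : IConfusionMatrixEvaluator
{
    public IEnumerable<TestCase> Evaluate(JToken expected, JToken actual)
    {
        var path = expected?.Path ?? actual?.Path ?? string.Empty;
        yield return new TestCase(path, JToken.DeepEquals(expected, actual));
    }
}

// Assumed meaning of "string-no-punctuation": equal once punctuation is removed.
public sealed class StringNoPunctuationEvaluator : IConfusionMatrixEvaluator
{
    public IEnumerable<TestCase> Evaluate(JToken expected, JToken actual)
    {
        var path = expected?.Path ?? actual?.Path ?? string.Empty;
        yield return new TestCase(path, Normalize(expected) == Normalize(actual));
    }

    private static string Normalize(JToken token)
    {
        var text = token?.ToString() ?? string.Empty;
        return new string(text.Where(c => !char.IsPunctuation(c)).ToArray());
    }
}

public static class EvaluatorFactory
{
    // Maps the sugar names to evaluators; anything else is treated as a type name.
    public static IConfusionMatrixEvaluator Create(string name)
    {
        switch (name)
        {
            case "default":
                return new DefaultEvaluator();
            case "string-no-punctuation":
                return new StringNoPunctuationEvaluator();
            default:
                // Without an assembly-qualified name, Type.GetType only searches
                // the calling assembly and the core library.
                var type = Type.GetType(name, throwOnError: true);
                return (IConfusionMatrixEvaluator)Activator.CreateInstance(type);
        }
    }

    // Evaluates the configured properties; when evaluateAllProperties is set,
    // unconfigured properties found in either value use the default evaluator.
    public static IEnumerable<TestCase> Compare(
        JObject config, JObject expected, JObject actual, bool evaluateAllProperties)
    {
        var names = config.Properties().Select(p => p.Name);
        if (evaluateAllProperties)
        {
            names = names
                .Union(expected.Properties().Select(p => p.Name))
                .Union(actual.Properties().Select(p => p.Name));
        }

        foreach (var name in names.ToList())
        {
            var evaluatorName = (string)config[name] ?? "default";
            var evaluator = Create(evaluatorName);
            foreach (var testCase in evaluator.Evaluate(expected[name], actual[name]))
            {
                yield return testCase;
            }
        }
    }
}
```

Resolving a full type name through reflection keeps the config open to custom evaluators such as EntitiesEvaluator, at the cost of failing only at runtime when a name is misspelled.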

Contributor Author

rozele commented Sep 23, 2019

It turns out this is likely more challenging than I originally thought. For NLU at least, it seems we need a custom comparer specific to NLU results for each property (intent, text, and entities).

Contributor Author

rozele commented Oct 4, 2019

A better idea is probably to use a prioritized list of comparers, each with a target JSONPath (jpath) query.
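One way the prioritized list could look, as a sketch only and assuming Newtonsoft.Json's SelectTokens for the path queries: ComparerRule and PrioritizedJsonCompare are hypothetical names, and the rule that the first query to select a path claims it is just one possible reading of "prioritized".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

// Hypothetical rule type: a JSONPath query plus the comparer to run on its matches.
public sealed class ComparerRule
{
    public ComparerRule(string query, Func<JToken, JToken, bool> comparer)
    {
        Query = query;
        Comparer = comparer;
    }

    public string Query { get; }
    public Func<JToken, JToken, bool> Comparer { get; }
}

public static class PrioritizedJsonCompare
{
    // Rules are ordered from highest to lowest priority. The first rule whose
    // query selects a path in the expected value handles that path; later rules
    // that select the same path skip it.
    public static IEnumerable<(string Path, bool IsMatch)> Compare(
        JToken expected, JToken actual, IReadOnlyList<ComparerRule> rules)
    {
        var claimed = new HashSet<string>();
        foreach (var rule in rules)
        {
            foreach (var expectedToken in expected.SelectTokens(rule.Query))
            {
                if (!claimed.Add(expectedToken.Path))
                {
                    continue;
                }

                // Look up the counterpart in the actual value by the same path;
                // this is null when the actual value has nothing at that path.
                var actualToken = actual.SelectToken(expectedToken.Path);
                yield return (expectedToken.Path, rule.Comparer(expectedToken, actualToken));
            }
        }
    }
}

public static class Example
{
    public static List<(string Path, bool IsMatch)> Run()
    {
        var rules = new[]
        {
            // Illustrative rule only: compare intent case-insensitively.
            new ComparerRule("$.intent", (e, a) =>
                string.Equals(e?.ToString(), a?.ToString(), StringComparison.OrdinalIgnoreCase)),
            // Fallback for every other top-level property: deep equality.
            new ComparerRule("$.*", (e, a) => JToken.DeepEquals(e, a)),
        };

        var expected = JObject.Parse("{\"intent\":\"Greet\",\"text\":\"hi\",\"score\":1}");
        var actual = JObject.Parse("{\"intent\":\"greet\",\"text\":\"hi\",\"score\":2}");
        return PrioritizedJsonCompare.Compare(expected, actual, rules).ToList();
    }
}
```

This sketch only walks the expected value, so properties that appear solely in the actual value go unreported, and a rule that claims a parent path does not stop a later rule from claiming its children. Both would need a decision in a real design.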
