Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implementation of option to blacklist rules when creating sessions. #445

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

wandersoncferreira
Copy link

The idea is to enable the users to blacklist a specific rule inside a namespace for example.

An example that motivated this implementation was because I have a namespace with many rules defined, however, for some clients I wish to "disable" some specific rules from being loaded into the session.

The workaround I have to do now is to load all the individual rules in the session, instead I wanted to load the entire namespace and only blacklist the ones I wish to remove.

Example of the implemented API:

(defrecord Values [id])
  (defrule testing->include
    [?value <- Values [v] (= ?id (:id v))]
    =>
    (println "INCLUDING!!! " ?value))
  (defrule testing->exclude
    [?value <- Values [v] (= ?id (:id v))]
    =>
    (println "EXCLUDING!!! " ?value))
  (def session-without (rl/mk-session 'clara.rules.compiler/testing->include
                                      'clara.rules.compiler/testing->exclude
                                      :blacklist ['clara.rules.compiler/testing->exclude]))

This is a working implementation. I don't know all the internals of the library to antecipate more drawbacks that this feature could bring. All the tests are working fine.

The idea is to enable the users to blacklist a specific rule inside a namespace for example.
@EthanEChristian
Copy link
Contributor

Not really related to the code, but more thinking of maintenance of consumer code(projects that contain rules).

Looking at the changes they seem pretty straight forward, remove productions from the productions loaded from all namespaces given a list of production names to be removed. Though this makes me wonder, given the scenario:

I have some superset of rules in a given namespace, two clients use this namespace. One of the clients(Client A) use the entire superset, where as the other(Client B) only uses a subset. Thus I have to define a blacklist for Client B, easy enough. Time goes by and Client A requires maintenance/enhancement, so rules are added to the namespace to achieve this. This would then require me to append any/all new productions to the Blacklist for Client B, which may or may not live close to the namespace itself.

I guess the root of my comment/question lies in the work required to maintain the blacklist when the namespace is ever changing. While it probably wouldn't be too big of a deal to maintain a handful of these "subset" clients, it would become a burden to handle them all in the event that the number was increasing. So in short I question wether it should be Blacklisting vs Whitelisting. I believe the latter is supported, but requires the productions to be provided rather than the namespace.

I don't think the answer to the question/comment above should block this enhancement as it is opt-in functionality, and won't affect consumers that don't opt in.

@wandersoncferreira
Copy link
Author

wandersoncferreira commented Jan 11, 2020

@EthanEChristian your comment about blacklist vs whitelist is very relevant. I endup in this situation because of shacky business rules. I have a namespace with "mandatory"-rules to all contracts that is acquired by the company. This namespace has only a small set of "critical"-rules that should be true to 100% of these contracts. However, 99% of time this assumption holds, and I did not want to break the logic of this namespace because of the 1% and this 1% can change any of the let's say 10 must-follow rules I have in the namespace.

So in the end, I could endup defining in 99% of the time each of the mandatory rules one-by-one, because of the 1%.

@EthanEChristian
Copy link
Contributor

@wandersoncferreira,
I think that answers my question, in the event that the "subset" clients are the minority case then it definitely makes sense to allow blacklisting.

Thanks for the clarification.

@WilliamParker
Copy link
Collaborator

Is your need for this in Clojure or Clojurescript? My first thought for implementing this in Clojure would be to implement the RuleSource protocol. Something like (defrecord NamespaceWithRulesBlacklist [ns blacklist] RulesSource (load-rules [] (remove (set blacklist) (load-rules ns)))

However this is only in Clojure, unfortunately ClojureScript is more restricted. I'd have to think about if there is a practical analogue there. I do see the utility of something like this, though I wonder if ClojureScript is likely to have the large rulebases that would make this most useful. This by itself isn't too complex, but I do worry about a growing core API to support similar cases. Perhaps also something like

:rule-filters [[:blacklist [rulename-1 rulename-2]]

in the session options, with the filters applied in order. This would have the benefit of allowing a single API for additions and combinations of the filters in the future if needed.

Note that when I refer to core API, that doesn't necessarily mean everything in the repo. I don't personally mind having some additional "ecosystem" in the repo, but there's a cost in complexity every time the core engine, compiler, etc. are grown, as well as additional cognitive burden for users when top-level APIs grow. That isn't to say we shouldn't ever do these things, just that there is always a cost-benefit calculation to be made.

Apologies for syntax, spelling, etc. errors, I'm on mobile at the moment. :)

@wandersoncferreira
Copy link
Author

wandersoncferreira commented Jan 13, 2020

My current need is in the Clojure spectrum only, however I see the benefits of adding this option to ClojureScript as well. I understand your point about growing the core API and I think the option of adding a :rule-filters is very interesting from the perspective that it provides space for productive ideas in the future.

it has the same burden for top level API readers and is more open to extensions.

@mrrodriguez
Copy link
Collaborator

I like the idea of :rule-filters although, I think it should be :production-filters to be inclusive of possibly queries (they don't need to be special cased). I have more to say on naming at the end of this though.

I think the question is then what can go in that sort of filter. I don't know that I'm a big fan of a "mini-DSL" in it like [:blacklist [:rule1]] etc.

I think perhaps better would be to allow 3 possible things:

  1. rule structures - filters rules that are = to that rule structure
    eg.
;; `rule-1` is a symbol resolving to a `defrule rule-1` sort of rule var in the `*ns*`
:production-filters [rule-1]
  1. symbols - filters rules with a name = to that symbol
:production-filters ['my.ns/rule-1]
  1. strings - filters rules with a name = to that string
:production-filters ["my-rule-1"]

Note: you can define productions without defrule & defquery, so there is nothing saying we have to have a ns-bound-symbol naming the rule - and strings are also supported by the schema I believe.

  1. regex - filter rules matching the regex
:production-filters [#"my.ns/rule.*"]

Points on above:

I'm not sure we need (4) at this point.
When matching with symbols, I'm not sure if we should do the = check with name on the symbols & the production name. I'm not sure in some cases that we strictly only use symbols vs strings in the production names. It may make sense to normalize both sides of the comparison with name first to compare the underlying strings.

A "filter" is Clojure should be semantically similar to the filter fn. This notably is a filter on rules to include, not exclude. So going down this path interestingly doesn't solve the "blacklist" issue very directly.

So I think it may be clearer and easier to work with to have :production-inclusions vs :production-exclusions as the options we provide.

@wandersoncferreira
Copy link
Author

wandersoncferreira commented Jan 13, 2020

@mrrodriguez Your point about the filter semantics is true, it should behave similar to Clojure fn. However, I am pro "mini-DSL" when it is well defined and its power is very restricted to some activities. The change in naming from rule to productions I think is very sane to keep the semantics of the project.

I think we can provide a simpler "mini-DSL" to use filtering from strings, symbols, structure and regexes [I like this one very much =)]

:production-filters [[:exclusions [rulename-1 rulename-2]]
                    [:inclusions ['my.ns/rule-1 "my.ns/rule-1" #"my.ns/rule.*"]]]

Or if we want to provide better "operators" to the DSL, a :not operator would be nice too instead of "exclusions" for example.

I think the point of providing an expressive DSL over production filtering has the side effect to enable more use-cases for the future. All the development can be done in a different namespace from compiler and probably only providing some interface to handle these cases in there.

@mrrodriguez
Copy link
Collaborator

@wandersoncferreira So it sounds like our opinions differ on the subject of to mini-DSL or not.

I'm still more of a fan, DSL or not, of having :production-exclusions and :production-inclusions be separate. However, overall I'm just waiting to see if anyone else has any thoughts at this point.

@EthanEChristian and/or @WilliamParker in particular may be interested in weighing in.

@wandersoncferreira
Copy link
Author

wandersoncferreira commented Jan 16, 2020

@mrrodriguez as a fact, I don't have a strong opinion though. I was trying to accomodate the previous comments as well, e.g the increase in core API described by @WilliamParker when he suggested the DSL.

I am also waiting for some agreement before changing the implementation =)

"If you don't have enough information and you can postpone a decision, is always clever to do so" rsrs

@WilliamParker
Copy link
Collaborator

My reason for suggesting having the exclusions and inclusions in a single ordered data structure was to allow for having well-defined priority between them. Otherwise expressing use-cases like "blacklist all rules matching regex A, but include rule B that matches it" and "whitelist all rules matching regex A, but exclude rule B that matches it" would require a new addition to the top-level API. That said, I share @mrrodriguez 's misgivings about adding new DSLs and their attendant complexity. My intent was to try to contain the number of new options and/or DSLs we might need later. :) These requirements seem likely to differ between users quite a lot, but I recognize that because of the limitations in the ClojureScript compilation model it is difficult to do this without core support.

Perhaps just having a :filter-rules-fn option that for ClojureScript would be required to be something that could be eval'ed in a Clojure environment would be an option. This could be a fully-qualified symbol pointing to a function in a Clojure namespace, or an s-expression that could be evaluated with nothing except clojure.core present. In any individual use-case the requirements for filtering can probably be expressed in a modest amount of code; the complexity here I think comes from trying to support such use-cases in general. Does this seem reasonable to you @mrrodriguez ? You've probably spent more time thinking about the ClojureScript compilation model than I have, is this something that could be reliably done with ClojureScript's API contracts?

@mrrodriguez
Copy link
Collaborator

mrrodriguez commented Jan 23, 2020

@WilliamParker

My reason for suggesting having the exclusions and inclusions in a single ordered data structure was to allow for having well-defined priority between them. Otherwise expressing use-cases like "blacklist all rules matching regex A, but include rule B that matches it" and "whitelist all rules matching regex A, but exclude rule B that matches it" w

Yeah, I considered the clashes between option issue, but I think trying to facilitate this in via interleaved exclusions/inclusions sequence to make a well-defined priority is just overly complex. We don't need to facilitate an option to be overly used/abused in my opinion.

In many other clj libs out there with a concept of inclusions/exclusions (pretty common really), it is typically documented that one is prioritized over the other, or that one is done first. I think the semantics for that are clearer there. Even if you give the full control to the user, it's just semantically confusing to keep track of the interleaved filtering logic. At that point, I think using this option is probably a bad thing and instead the rules should have been broken down into better "modules" to use mk-session with. Of course some of this is just my opinion (subjective).

Perhaps just having a :filter-rules-fn option that for ClojureScript would be required to be something that could be eval'ed in a Clojure environment would be an option. This could be a fully-qualified symbol pointing to a function in a Clojure namespace, or an s-expression that could be evaluated with nothing except clojure.core present.

I don't like any option that tries to do clojure-side user-given eval things during cljs compilation. (We need to stop doing that in general in the clara cljs impl, as discussed in #388 ) It is only impl-dependent that you can get away with that sometimes. It seems to be an abuse of the compiler leaking the clj environment during macro expansion anyway you try it. eval during cljs macroexpansion is not something that it really supports, it's more "accidental" that it sometimes works out (Relates to this comment )

If we are instead restricted to something like described like I suggested with (1)-(4) in the above comment it'd just be normal macroexpansion clj-side logic.
Your option of providing a symbol instead of a form is safer than arbitrary form. However, I'd hope it wouldn't need to be fully-qualified. It'd also probably be a little confusing from an API perspective that it was restricted to that. We'd probably at least have to name it filter-rules-fn-name. Even then, it's a bit odd from an API perspective.

@WilliamParker
Copy link
Collaborator

OK fair enough on the issues surrounding ClojureScript eval. The API change probably doesn't add too much cognitive overhead for users as an option ":production-exclusions" is easily understood even without additional doc once one understands that a production is either a rule or query. I'm fine with a :production-exclusions option as you outlined @mrrodriguez ; I also agree that we should maintain consistency with clojure.core in the semantic meaning of "filter". I do suggest that we be thoughtful in writing the doc on this in the session-options so that this is as quickly understood as possible for someone just quickly skimming through the options.

@EthanEChristian thoughts?

@EthanEChristian
Copy link
Contributor

@WilliamParker,
I am good with that, changing the name to production-exclusions is probably for the best and i agree we should be clear with the doc of all these opts we are adding

@wandersoncferreira
Copy link
Author

cool! I will try to submit a PR with the implementation this weekend! Great discussions around how to evolve an API. o/

@EthanEChristian EthanEChristian changed the base branch from master to main July 9, 2020 15:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants