Draft for reading configuration from config files #1722

Draft
wants to merge 5 commits into
base: master

Conversation

rPraml
Contributor

@rPraml rPraml commented Nov 8, 2024

This could be a first draft for #1720

@rPraml rPraml marked this pull request as draft November 8, 2024 09:10
@gbrail
Collaborator

gbrail commented Nov 13, 2024

I think that this approach makes sense in that it lets configuration come from the environment, system properties, file, or classpath. I think that's a good thing and that we should follow this pattern.

For actually checking the setting of various flags, however, I think that the config class should parse all this input and set boolean (or enum) values that can be checked directly, rather than relying on a hash lookup. I think that can give us two advantages:

  1. It keeps all the logic about naming and renaming things in one place
  2. It will be faster, since many of these properties, like the various debug flags, will be checked many millions of times potentially while Rhino runs, and all those hash table lookups will dominate performance pretty soon.
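
A minimal sketch of the difference between the two styles. The names `DebugFlags`, `PRINT_TREES` and the key `rhino.printTrees` are assumptions for illustration and are not taken from the PR:

```java
import java.util.Map;

/** Hypothetical holder: each flag is parsed once, when the class is initialized. */
final class DebugFlags {
    // Stand-in for the merged configuration (assumption); a real implementation
    // would fill this from files, system properties and the environment.
    static final Map<String, String> SOURCE = Map.of("rhino.printTrees", "true");

    // Parsed exactly once; hot code then reads a plain static final boolean.
    static final boolean PRINT_TREES = Boolean.parseBoolean(SOURCE.get("rhino.printTrees"));

    // The alternative being argued against: a hash lookup plus a string parse on every check.
    static boolean printTreesViaLookup() {
        return Boolean.parseBoolean(SOURCE.get("rhino.printTrees"));
    }

    private DebugFlags() {}
}
```

Code on a hot path would then test `DebugFlags.PRINT_TREES` directly instead of calling a lookup method.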

@rPraml
Contributor Author

rPraml commented Nov 20, 2024

I think I can implement this next week or so...

@gbrail
Collaborator

gbrail commented Nov 20, 2024

If you're looking for the kinds of debug flags I'd like to potentially replace with a real debugging mechanism, I would look at these:

public static final boolean printTrees = false;

private static final boolean DEBUGSTACK = false;

And as for feature flags, the first one IMO should be for the reflect and proxy support that's currently languishing in a PR by @rbri

@rPraml
Contributor Author

rPraml commented Dec 13, 2024

Unfortunately, I didn't have as much time as I thought, but I wanted to update the draft before Christmas to show how I want to do it now.

The idea is to check all sources (classpath, config file, system properties, environment; the last one wins) for the settings and parse them immediately into the correct data type. When a variable is bound to the property rhino.debugLinker, for example, I also try the upper-case, underscore-separated variant RHINO_DEBUG_LINKER.
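
A rough sketch of that lookup, assuming a hypothetical `LayeredConfig` class (not the PR's actual code): layers are added in load order, the last layer that defines a key wins, and each key is tried in both spellings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of "last source wins" with two spellings per key. */
final class LayeredConfig {
    private final List<Map<String, String>> layers = new ArrayList<>();

    /** Layers are added in load order, e.g. classpath, config file, system properties, env. */
    void addLayer(Map<String, String> layer) {
        layers.add(layer);
    }

    /** Converts e.g. rhino.debugLinker to RHINO_DEBUG_LINKER. */
    static String toEnvName(String key) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c == '.') {
                sb.append('_');
                continue;
            }
            // Insert an underscore at each lower-to-upper camel-case boundary.
            if (Character.isUpperCase(c) && i > 0 && Character.isLowerCase(key.charAt(i - 1))) {
                sb.append('_');
            }
            sb.append(Character.toUpperCase(c));
        }
        return sb.toString();
    }

    /** Returns the value from the most recently added layer that defines the key, or null. */
    String get(String key) {
        String envName = toEnvName(key);
        for (int i = layers.size() - 1; i >= 0; i--) {
            Map<String, String> layer = layers.get(i);
            String value = layer.get(key);
            if (value == null) {
                value = layer.get(envName);
            }
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

With a file layer containing `rhino.debugLinker=false` and a later environment layer containing `RHINO_DEBUG_LINKER=true`, `get("rhino.debugLinker")` returns the environment value.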

@gbrail I hope this goes in the right direction... Unfortunately (or fortunately) I won't be able to continue until next year due to vacation. :)

@rPraml
Contributor Author

rPraml commented Jan 17, 2025

So, I found some time to update this PR.

What do we have now

The RhinoProperties class is a container for several config maps that are loaded from different locations:

  • It looks for rhino.config & rhino-test.config in several locations. (CHECKME: do we need rhino-test.config, similar to logback-test.xml?)
  • It adds system-env & system-properties.
  • Optionally, you can implement the RhinoConfigLoader service with your own loading strategy.
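
To illustrate the last point, here is a minimal sketch of how a pluggable loader could be discovered with `java.util.ServiceLoader`. The interface `ExampleConfigLoader`, its method and the fallback are assumptions; the PR's RhinoConfigLoader may look different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.ServiceLoader;

/** Hypothetical service interface; not the signature used in the PR. */
interface ExampleConfigLoader {
    List<Map<String, String>> loadConfigs();
}

final class ExampleConfigBootstrap {
    /** Uses registered loaders if any exist, otherwise falls back to a default strategy. */
    static List<Map<String, String>> load() {
        List<Map<String, String>> configs = new ArrayList<>();
        // Providers are registered via META-INF/services/<interface name> on the classpath.
        for (ExampleConfigLoader loader : ServiceLoader.load(ExampleConfigLoader.class)) {
            configs.addAll(loader.loadConfigs());
        }
        if (configs.isEmpty()) {
            // Assumed default for this sketch: only the process environment.
            configs.add(System.getenv());
        }
        return configs;
    }

    private ExampleConfigBootstrap() {}
}
```

When no provider is registered, `ExampleConfigBootstrap.load()` returns a single layer backed by the environment.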

When checking for a value, RhinoProperties tries all configs in the order they were loaded, using both the exact spelling and an upper-case, underscore-separated conversion of the camel-case name (e.g. rhino.printICode and RHINO_PRINT_ICODE; you can use either spelling in any config source). (CHECKME: This conversion is required for system-env. Do we want it for other config sources, too? In theory it slows down parsing, as several permutations are checked, but every config value is parsed only once if it is assigned to a static field, so this should not have a big impact.)

There is a helper class RhinoConfig which should be used in most cases. It adds some helper methods to get properties as String, Integer or Enum.
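
As a sketch of what such typed helpers might look like (the class `TypedConfig`, its method names and the property keys are hypothetical, not the PR's RhinoConfig API):

```java
import java.util.Properties;

/** Hypothetical typed view over a set of properties. */
final class TypedConfig {
    private final Properties props;

    TypedConfig(Properties props) {
        this.props = props;
    }

    /** Returns the raw string, or the default if the key is absent. */
    String get(String key, String defaultValue) {
        return props.getProperty(key, defaultValue);
    }

    /** Returns the value parsed as an int, or the default if absent or malformed. */
    int get(String key, int defaultValue) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }
}
```

Falling back to the default on malformed input is a design choice of this sketch; failing fast with an exception would be an equally valid policy.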

> For actually checking the setting of various flags, however, I think that the config class should parse all this input and set boolean (or enum) values that can be checked directly, rather than relying on a hash lookup. I think that can give us two advantages:
>
>   1. It keeps all the logic about naming and renaming things in one place
>   2. It will be faster, since many of these properties, like the various debug flags, will be checked many millions of times potentially while Rhino runs, and all those hash table lookups will dominate performance pretty soon.

I tried that, but did not like the idea of putting configs from different packages into one class:

  1. This is not optimal if we want to split Rhino into more modules. I can imagine having one config class per module or package (please give feedback).
  2. Performance should not be a problem in the current PR: each config value is assigned to a static constant, so it is read only once.

Other findings:

  • I tried to attach the config to the current context (so that one context could run with PRINT_ICODE and another without), but this would require passing the context/config object down to Token etc. So I think we have to accept that there is only ONE config per classloader and every Rhino context uses the same settings. In the future it will be important to decide whether individual config values should be global or per context.
  • I read all properties (including system-properties & env). So anyone who has access to this class can effectively bypass the (deprecated) security manager.

It would be good to get some feedback on whether this is going in the direction we want.

@gbrail FYI

@rPraml
Contributor Author

rPraml commented Jan 23, 2025

@gbrail ping... Have you already found time to give some feedback?
Thanks

Collaborator

@gbrail gbrail left a comment

I think this is good progress, and minus the attached minor comments, I think that this is the right implementation.

I also want to know how we can test this -- there are a lot of options with properties files, system properties, environment variables, and a service loader.

I really think that we should go forward with this, but only if we can put in some way to exercise more of this codebase in our test suite.
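
One possible way to make the file-based path testable without touching the file system is to parse config text from memory. This is only a sketch under that assumption; the helper `InMemoryConfig` is hypothetical and not part of the PR:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical test helper: builds a Properties object from literal config text. */
final class InMemoryConfig {
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            // Same parser as for a real rhino.config file, but reading from a string.
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    private InMemoryConfig() {}
}
```

A test could then feed the resulting `Properties` to the config class as one more source and assert on the parsed flags. System properties can be set and restored around a test in a similar way, while environment variables usually need an injectable lookup.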

if (ret != null) {
    Class<T> enumType = (Class<T>) defaultValue.getClass();
    // We assume, that enums all are in UPPERCASES
    return Enum.valueOf(enumType, ret.toString().toUpperCase(Locale.ROOT));
Collaborator

Should we catch "IllegalArgumentException" in case the input value is not valid and return the default in that case?
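
A sketch of what that could look like. The class `EnumParsing` and the enum `Mode` are made up for illustration. This version also uses `getDeclaringClass()` instead of `getClass()`, which is a deliberate deviation from the snippet above: it avoids an unchecked cast and still works for enum constants that have their own class body.

```java
import java.util.Locale;

final class EnumParsing {
    /** Hypothetical enum, only for demonstration. */
    enum Mode { OFF, ON, VERBOSE }

    /** Returns the matching constant, or the default if the input is missing or invalid. */
    static <T extends Enum<T>> T parse(Object raw, T defaultValue) {
        if (raw == null) {
            return defaultValue;
        }
        Class<T> enumType = defaultValue.getDeclaringClass();
        try {
            // Assumes enum constants are written in upper case.
            return Enum.valueOf(enumType, raw.toString().trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            // Unknown constant name: fall back instead of failing class initialization.
            return defaultValue;
        }
    }

    private EnumParsing() {}
}
```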

if (ret instanceof Boolean) {
    return (Boolean) ret;
} else {
    return "1".equals(ret) || "true".equals(ret);
Collaborator

I wonder if you'd consider numeric values other than 1 as "true," and doing a case-insensitive comparison to "true"
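
For illustration, a more lenient parser along those lines might look like this. The class `BooleanParsing` is hypothetical, and treating any non-zero integer as true is just one possible interpretation of the suggestion:

```java
final class BooleanParsing {
    /** Accepts Boolean objects, "true" in any case, and non-zero integers as true. */
    static boolean parse(Object raw, boolean defaultValue) {
        if (raw == null) {
            return defaultValue;
        }
        if (raw instanceof Boolean) {
            return (Boolean) raw;
        }
        String s = raw.toString().trim();
        if (s.equalsIgnoreCase("true")) {
            return true;
        }
        try {
            return Long.parseLong(s) != 0;
        } catch (NumberFormatException e) {
            // Anything else ("false", "no", garbage) counts as false in this sketch.
            return false;
        }
    }

    private BooleanParsing() {}
}
```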

    props.load(new InputStreamReader(in, StandardCharsets.UTF_8));
    addConfig(props);
} catch (IOException e) {
    System.err.println(
Collaborator

I don't love that we "println" anywhere, but I don't see that we have a choice in this case. But if it happens, it'll be buried in someone's big log file and they won't know what it means -- should we consider adding "Rhino:" or something to identify this message so that people can know why it's appearing?
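
A small sketch of such a prefixed message. The helper `ConfigWarnings` and the exact wording are assumptions, not the text used in the PR:

```java
final class ConfigWarnings {
    /** Builds a warning line that names its origin, so it can be found in a large log. */
    static String format(String source, Exception cause) {
        return "Rhino: could not load configuration from " + source + ": " + cause.getMessage();
    }

    /** Prints the warning to stderr, since no logging framework is assumed here. */
    static void warn(String source, Exception cause) {
        System.err.println(format(source, cause));
    }

    private ConfigWarnings() {}
}
```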

    props.load(new InputStreamReader(in, StandardCharsets.UTF_8));
    addConfig(props);
} catch (IOException e) {
    System.err.println(
Collaborator

Same here -- we should add "Rhino" or something to this.
