Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-evaluate proc macro approach #3

Open
Kixunil opened this issue Nov 6, 2018 · 12 comments
Open

Re-evaluate proc macro approach #3

Kixunil opened this issue Nov 6, 2018 · 12 comments
Labels
v1.0 These issues must be resolved before 1.0 release

Comments

@Kixunil
Copy link
Owner

Kixunil commented Nov 6, 2018

Pros:

  • Easy to use
  • User can immediately see the struct
  • Syntax checking of type names etc by rustc
  • Possible to add other derives

Cons:

  • possibly harder to debug (can be resolved by some attribute?)
  • possibly weird man page generation (have filename in the source code?)
  • hygiene?
@Kixunil Kixunil added the v1.0 These issues must be resolved before 1.0 release label Nov 19, 2018
@Kixunil
Copy link
Owner Author

Kixunil commented Sep 28, 2019

Any standardization that sandboxes proc macros might kill this.

@Kixunil
Copy link
Owner Author

Kixunil commented Feb 13, 2020

I think that after cutting man page generation out from build script, this may be better, but we have to implement some kind of scanning. dtolnay/proc-macro2#213 may be a blocker for this.

@jaskij
Copy link

jaskij commented Dec 5, 2021

@Kixunil I've got a proposal written up for this, do you want it here or in a separate issue? The TL;DR is that #[path = "src/config.rs"] mod config is legal in build.rs.

@Kixunil
Copy link
Owner Author

Kixunil commented Dec 5, 2021

Sounds interesting, feel free to write it here! I also had an idea recently to do something similar to prost - they have both build script and a proc macro. My thinking was to generate the struct using build script and proceed from there using proc macro. So people who don't like build script can still use proc macro but they'll lose man page generation.

@jaskij
Copy link

jaskij commented Dec 5, 2021

TL;DR

#[path = "src/config.rs"] mod config is legal in build.rs.

What to do:

Add a #[derive] which allows for parsing, and can be used to generate what is needed in build.rs. And then import the mod with the config in build.rs.

Example

Imagine a code structure like this:

 .
├──  build.rs
├──  Cargo.toml
└──  src
   ├──  app.rs
   ├──  bin
   ├──  config.rs
   ├──  main.rs
   └──  receiver.rs

We have a config struct which (preferably) sits alone in config.rs.

#[derive(::configure_me::Config)]
struct Config {
    #[cfg_optional(true), cfg_doc("Port to listen on."]
    port: u16,
    #[cfg_default(::std::Ipv4Addr::new(0, 0, 0, 0)), cfg_doc("IP address to bind to."]
    bind_addr: ::std::net::Ipv4Addr,
    #[cfg_doc("IP address to bind to."]
    tls_cert: String,
}

With build.rs like this:

use configure_me_codegen::ConfigExt;

#[path = "src/config.rs"] mod my_app_config;

fn main() {
    my_app_config::generate_configure_toml();
}

The #[derive] would take care of the parser, while build.rs would generate config_spec.toml (for consumption by external tools like cfg_me) like in the README.

[[param]]
name = "port"
type = "u16"
optional = false
doc = "Port to listen on."

[[param]]
name = "bind_addr"
type = "::std::net::Ipv4Addr" # Yes, this works and you can use your own types as well! (impl. Deserialize and ParseArg)
default = "::std::net::Ipv4Addr::new(0, 0, 0, 0)" # Rust expression that creates the value.
doc = "IP address to bind to."

[[param]]
name = "tls_cert"
type = "String"
doc = "Path to the TLS certificate. The connections will be unsecure if it isn't provided."

The Downside

And it's a big one: this requires somehow importing the config struct in build.rs. Most likely through specifying a module with a path. This feels quite hacky. But it does work.

There is an alternative approach with moving the configs to a separate (sub)crate, but that probably has even more downsides (publishing?) than the mod approach.

Discussion

What I'm relying on heavily is the compiler - to be precise, the fact that it will drop unused functions from resulting binaries. So, if #[derive] does not introduce any new data members to the struct, there should be no impact on the code.

If further separation is desired, an extension trait to be imported only in build.rs is an option.

As a bonus including additional outputs should be fairly trivial at that point.

Closing thoughts

This is just an idea, here for discussion. To state the obvious, it is not intended to replace the the current config_spec.toml, but rather complement it, allowing defining it in code. Unless there's a good way to actually get the config struct in build.rs, this should be an optional extras.

All the details like API, generated names, etc, are temporary.

@jaskij
Copy link

jaskij commented Dec 5, 2021

Sounds interesting, feel free to write it here! I also had an idea recently to do something similar to prost - they have both build script and a proc macro. My thinking was to generate the struct using build script and proceed from there using proc macro. So people who don't like build script can still use proc macro but they'll lose man page generation.

My thinking is completely the other way around - make the struct Rust, allow making everything that's possible with just proc_macros to be done with [#derive], and use the module in build.rs to generate the TOML config and other stuff as needed.

Re: Prost, it's one of the tools which call an external process to generate stuff. Taking a page from their book adding a nice wrapper around calling cfg_me for build.rs would be nice. The use case is to have a single command build without requiring anything but cargo and a Rust toolchain.

@jaskij
Copy link

jaskij commented Dec 5, 2021

(Sorry for multiple comments, I don't feel comfortable editing in stuff in issue comments).

While Prost is a good example, I wouldn't go too far in how they do things - the use case is entirely different.

My main pain points with configure_me are 1) that the current TOML is a bit unwieldy, 2) bad IDE support for generated code. In Prost's (and Protobuf's) case 1) is not an issue because Protobuf is much more concise and 2) is necessary. We have a project where proto files are used in three different programs, written in three different languages. configure_me doesn't really need that kind of versatility.

@Kixunil
Copy link
Owner Author

Kixunil commented Dec 5, 2021

Hmm, the idea of generating Toml from .rs sounds interesting! I see one problem with it that's probably fixable: the module may contain types that are not available in build.rs (IOW some build dependencies might need to be duplicated)

The fix would be to use a proc macro that prevents rustc seeing the struct - NOT derive macro.

I will think about this more and write my thoughts.

@jaskij
Copy link

jaskij commented Dec 5, 2021

Hmm, the idea of generating Toml from .rs sounds interesting! I see one problem with it that's probably fixable: the module may contain types that are not available in build.rs (IOW some build dependencies might need to be duplicated)

True, although that could be worked around by separating the config struct in a separate module. Personally I find the idea of 1k+ line files abhorrent (I try to stick to under a hundred, five hundred max), but I've seen a few crates where the whole thing is stuffed into lib.rs.

One thing I probably have forgotten to mention earlier, as an intermediary step in the implementation, toml could be generated and the rest of the process happen like it did before.

@dr-orlovsky
Copy link

dr-orlovsky commented Dec 5, 2021

@Kixunil you may be interested in checking amplify_syn crate, which provides a lot of simplification over designing and controlling proc macro. For instance, see how deriving Getters is done with it in amplify_derive.

@jaskij
Copy link

jaskij commented Dec 13, 2021

@Kixunil for now, to avoid the hassle of generated code, I'm testing an option of doing #[derive(Deserialize, Merge, StructOpt)], Merge coming from the merge crate.

My use case does not need debconf or a manpage, so this feels like it should be enough if everything works as intended.

So far this seems to require a lot of boilerplate, due in part to structopt's maintainer's refusal to support the Default trait (see structopt#150 for discussion. The boilerplate required is practically unchanged from the sample in the OP. I have yet to check if merge::Merge and serde::Deserialize behave properly in this setting, but doubt there would be any issues.

@Kixunil
Copy link
Owner Author

Kixunil commented Dec 13, 2021

Interesting, the merge crate looks strangely familiar. From quick look at the issue about Default I agree with their reasoning if I understood it right. If you look at the generated code of my crate it works by creating a second struct that uses Option for each field and only structs with optional fields are merged and finally converted to non-option field (returning an error if a field is missing).

So it seems you'd need some way of generating the secondary struct with those three derives and then generating some code that does the final processing. There's one more issue though. To work properly all arguments must behave as optional but if some fields are mandatory --help should say it. So we're hitting a problem where structopt would have to know about the original struct too. So I don't think it can be fully composable. Maybe with collaboration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v1.0 These issues must be resolved before 1.0 release
Projects
None yet
Development

No branches or pull requests

3 participants