Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snake_case vs camelCase URL query params #10804

Open
chris48s opened this issue Jan 10, 2025 · 12 comments
Open

snake_case vs camelCase URL query params #10804

chris48s opened this issue Jan 10, 2025 · 12 comments
Labels
developer-experience Dev tooling, test framework, and CI

Comments

@chris48s
Copy link
Member

Over the last 10+ years we have accumulated a lot of code written by a lot of different people.
In general, I think we've done a reasonable job of enforcing relatively consistent code styles and naming conventions, via documentation and/or lint rules, but there are obviously some exceptions.

One significant "blind spot" where we have a lot of variance is whether we use snake_case or camelCase URL query params.

There's a bunch of places where we are using camelCase. For example:

const queryParamSchema = Joi.object({
metadataUrl: optionalUrl.required(),
versionPrefix: Joi.string().optional(),
versionSuffix: Joi.string().optional(),
}).required()

export const queryParamSchema = Joi.object({
pypiBaseUrl: optionalUrl,
}).required()

const queryParamSchema = Joi.object({
server: optionalUrl.required(),
queryOpt: Joi.string()
.regex(/(:[\w.]+=[^:]*)+/i)
.optional(),
nexusVersion: Joi.equal('2', '3'),
}).required()

and a bunch of places where we are using snake_case. For example:

const queryParamSchema = Joi.object({
repository_url: optionalUrl.required(),
include_prereleases: Joi.equal(''),
}).required()

const queryParamSchema = Joi.object({
path: relativeUri,
display_timestamp: Joi.string()
.valid(...displayEnum)
.default('author'),
gitea_url: optionalUrl,
}).required()

const queryParamSchema = Joi.object({
up_message: Joi.string(),
down_message: Joi.string(),
up_color: Joi.alternatives(Joi.string(), Joi.number()),
down_color: Joi.alternatives(Joi.string(), Joi.number()),
}).required()

I think the other day we acquired the first case where we have one of each next to each other on the same service 🤦

const queryParamSchema = Joi.object({
server_fqdn: Joi.string().hostname(),
fetchMode: Joi.string()
.valid(...fetchModeEnum)
.default('guest'),
}).required()

I have not counted exactly, but roughly speaking we have about half and half.

The standard params that apply to all badges are all camelCase: labelColor, logoColor, logoSize

I think we've at least managed to avoid having any params that use snake-case 🤞 but maybe somewhere in the codebase there is an example 😬

There are a couple of ways we can change the names of query params without making a breaking change. One way we can do it is with redirects. Another would be to write the queryParamSchemas to accept both formats but only document one for the services where we want to fix this. Having got to the stage where we have hundreds of service integrations, this is going to be quite difficult to unpick, and I wouldn't want to do it all at once. It would be nice to gradually work towards fixing this though.

I suggest we:

  1. Pick one or the other
  2. Document it
  3. Make some kind of a lint/danger rule to catch this
  4. At least get to a stage where everything we add from now onwards matches this convention
  5. Gradually bring services into line with the convention

Given:

  1. We use camelCase for variable names (which makes translating URL query params straight to correctly named variables easier)
  2. The standard params that apply to all badges are all camelCase: labelColor, logoColor, logoSize

I'm gong to suggest we standardise on camelCase, but I also feel like there is a reason why we used snake_case in a lot of cases, and I can't remember what it is.

@chris48s chris48s added the developer-experience Dev tooling, test framework, and CI label Jan 10, 2025
@jNullj
Copy link
Member

jNullj commented Jan 10, 2025

I think camelCase is used more around js/node devs, so thats a pro for that.
I think that snake_case is more readable but that's personal and we can't really use this as a good point, unless the whole team thinks so, in that case it would make sense.
My last point is that as it mainly visuals we might want to stick with whatever is most common in our codebase already to avoid extra work converting to one or the other

@jNullj
Copy link
Member

jNullj commented Jan 10, 2025

I'm gong to suggest we standardise on camelCase, but I also feel like there is a reason why we used snake_case in a lot of cases, and I can't remember what it is.

Maybe it makes sense when upstream api uses snake_case?

@chris48s
Copy link
Member Author

With a badge, the underlying API is something the user doesn't necessarily interact with so I don't think there's any benefit to vertical consistency with the upstream API here. I'd say horizontal consistency across badges gives more of a benefit for users in this case.

@cyb3rko
Copy link
Contributor

cyb3rko commented Jan 18, 2025

I'd say horizontal consistency across badges gives more of a benefit for users in this case.

I agree, I don't think you take a look e.g. at the Matrix API before creating a badge for your README.
The biggest headache I have with this topic still is how to implement that change into older badges.

@jNullj
Copy link
Member

jNullj commented Jan 18, 2025

another pro for camelCase is that it makes url more compact, this makes it a bit easier to read as part of a url thats mostly not very short.

It seems to me this discussion tends towards camelCase.
If no other objection is presented i suggest i break this into issues in a few days and start work on them (as mentioned in first post):

  1. document camelCase in our docs.
  2. add ci rule to make sure we keep consistent coding style
  3. convert the codebase (an issue to map progress)

once the follow up questions are created i think we can close this issue.

@chris48s
Copy link
Member Author

chris48s commented Jan 18, 2025

how to implement that change into older badges

Given the number of these we have, I think I would want to avoid doing it with redirects.

Given that constraint, lets sketch out a couple of quite different approaches that I have partially thought through.

Pattern 1: The Scalpel

Lets say we're starting off with this badge class:

const queryParamSchema = Joi.object({
  foo_bar: optionalUrl,
}).required()

class FakeService {
  static route = {
    base: 'service/noun',
    pattern: ':variable',
    queryParamSchema,
  }

  static openApi = {
    '/service/noun/{variable}': {
      get: {
        summary: 'Service Noun',
        parameters: [
          pathParam({
            name: 'variable',
            example: 'example',
          }),
          queryParam({
            name: 'foo_bar',
            example: 'example',
          }),
        ],
      },
    },
  }

  async handle({ project }, params) {
    const fooBar = params.foo_bar
    return { message: fooBar }
  }
}

The problem we've got is we want to change foo_bar to camelCase without breaking for existing users.

OK, so first lets define a global helper function we can use, given this will need to happen in lots of places. As a starting point, lets define it like this:

function getQueryParam(queryParams, ...keys) {
  for (const key of keys) {
    if (queryParams[key] !== undefined) {
      return queryParams[key]
    }
  }
  return undefined
}

This will allow us to pass an object and one or more keys we're going to look for in that object in order.

Now lets use that to update our class.

// modify the query param schema so it will accept both foo_bar and fooBar
const queryParamSchema = Joi.object({
  foo_bar: optionalUrl,
  fooBar: optionalUrl,
}).required()

class FakeService {
  static route = {
    base: 'service/noun',
    pattern: ':variable',
    queryParamSchema,
  }

  static openApi = {
    '/service/noun/{variable}': {
      get: {
        summary: 'Service Noun',
        parameters: [
          pathParam({
            name: 'variable',
            example: 'example',
          }),

          // document only fooBar in the frontend
          queryParam({
            name: 'fooBar',
            example: 'example',
          }),
        ],
      },
    },
  }

  async handle({ project }, queryParams) {
    // use our helper function to look for queryParams.fooBar first, and then fall back to queryParams.foo_bar second
    const fooBar = getQueryParam(queryParams, 'fooBar', 'foo_bar')
    return { message: fooBar }
  }
}

This would allow the user to use either ?fooBar= or ?foo_bar= (preferring fooBar if both are supplied), but we'd only show fooBar in the docs - the snake_case variant would be hidden but except for legacy compatibility, nudging new users to that.

This would be fine for optional query params. We'll have to be a bit more clever on the schema so that either one variant or the other is required, but not both.

One nice thing about this pattern is it gives us the ability to gradually migrate services case-by-case. The bad thing is we have this compatibility code in lots of different places.

Pattern 2: The Sledgehammer

If the bad thing about pattern 1 is we end up with this compatibility code sprinkled throughout the codebase, how about we centralise it?

Every badge request is processed by BaseService.invoke(). Specifically, this block here:

const { queryParamSchema } = this.route
let transformedQueryParams
if (!serviceError && queryParamSchema) {
try {
transformedQueryParams = validate(
{
ErrorClass: InvalidParameter,
prettyErrorMessage: 'invalid query parameter',
includeKeys: true,
traceErrorMessage: 'Query params did not match schema',
traceSuccessMessage: 'Query params after validation',
},
queryParams,
queryParamSchema,
)
trace.logTrace(
'inbound',
emojic.crayon,
'Query params after validation',
queryParams,
)
} catch (error) {
serviceError = error
}
} else {
transformedQueryParams = {}
}

What if we centralise that logic. Before we validate the queryParams against the schema, we run camelCase() on every key in the object, again we prioritise fooBar over foo_bar if both exist,

If we do that, any foo_bar will be transformed to fooBar. In this world, we update our code to use only the camelCase variants. No compatibility needed at the service layer.

This has slightly different tradeoffs.

The good things about this are:

  • It is less total code to write/think about
  • We don't have compatibility code spread all over the codebase
  • The process is completely transparent to contributors just looking at service classes
  • It prevents any snake_case query param making its way through to the service class or schema

however there are some drawbacks:

  • It is harder to break down into smaller jobs. We have to update every schema and all the code all at once (which is a lot - probably hundreds of files need to be touched in a single PR) because any schema that contains the snake_case will fail once we do this in BaseService.invoke(). We could at least update the documentation examples separately.
  • It essentially means we provide both camelCase and snake_case variants of every param. Even params where a snake_case variant never existed. Even new ones we add in future
  • If there are any edge cases I'm not thinking about (are there any classes that bypass BaseService for some strange reason? Does this cause any problem with compatibility redirects?) we might do a lot of work before we realise it, etc

So I guess the next questions are:

  • Which of these seems like the most sensible pattern?
  • Is there another good idea that I have not suggested?
  • What am I not thinking of here (particularly with pattern 2)?
  • Having seen those solution: If we end up with one of those solutions.. is the consistency gain worth it? Would the inconsistency actually be the lesser evil?

@jNullj
Copy link
Member

jNullj commented Jan 24, 2025

I think we can remove almost all drawbacks mentioned from pattern 2 by making a little change

  1. Make a list of backward compatible snake_case classes.
  2. Write a function to apply backwards compatibility programticly.
    something like:
const keepSnakeCaseCompatibility = [FakeService]; // Add all target classes here

for (const Class of keepSnakeCaseCompatibility) {
  addSnakeCaseCompatibility(Class);
}
function getQueryParam(params, camelCaseKey, snakeCaseKey) {
  return params[camelCaseKey] !== undefined ? params[camelCaseKey] : params[snakeCaseKey];
}

function addSnakeCaseCompatibility(Class) {
  const originalSchema = Class.route.queryParamSchema;

  // Extend the schema to include both snake_case and camelCase
  Class.route.queryParamSchema = originalSchema.keys(
    Object.fromEntries(
      Object.keys(originalSchema.describe().keys).map((key) => [
        key.replace(/_([a-z])/g, (_, char) => char.toUpperCase()),
        originalSchema.extract(key),
      ])
    )
  );

  // Wrap the `handle` method
  const originalHandle = Class.prototype.handle;
  Class.prototype.handle = async function (context, queryParams) {
    const normalizedParams = Object.fromEntries(
      Object.entries(queryParams).map(([key, value]) => [
        key.replace(/_([a-z])/g, (_, char) => char.toUpperCase()),
        value,
      ])
    );
    return originalHandle.call(this, context, { ...queryParams, ...normalizedParams });
  };

  return Class;
}
  1. Now we break it down to many PR, one service at a time refactor to camelCase and add to the array.
  2. When we search github in the future and see snake_case usage is near 0 we retire the class from the array.

pros:

  • can break down into small chunks of work
  • single global code not mixed all over the project
  • can easily retire snake_case in the future.
    cons:
  • more complex - easier to miss edge cases.
  • harder to notice when contributor change related services.

@cyb3rko
Copy link
Contributor

cyb3rko commented Jan 24, 2025

@jNullj I really like that approach!

@cyb3rko
Copy link
Contributor

cyb3rko commented Jan 24, 2025

Does this discussion also consider inconsistency between path parameter and query parameter cases?
Because there are multiple services who have both cases combined:

@jNullj
Copy link
Member

jNullj commented Jan 25, 2025

Yea, i think we should, didn't think the mixed cases

@chris48s
Copy link
Member Author

@cyb3rko - I don't think the case of pathParams really matters in the same way because they're not exposed to that user. What I mean by this is pathParams are positional, wheras queryParams are key->value mapping.

So for example, regardless of whether we define the route as

  • service/noun/:user_name/:repo or
  • service/noun/:userName/:repo

the URL the user calls is service/noun/badges/shields so we can change user_name to userName without having any impact on badges in the wild. It is a backwards-compatible change.

Conversely, queryParams are exposed to the user. So if I change ?user_name=badges to ?userName=badges that impacts URLs in the wild and we need to care about preserving backwards compatibility.

So if we have any pathParams that are using snake_case, we can just change them. I suspect the number of path params using snake_case on the internal identifiers is actually zero or close to zero though, but if there are any we can just change them without any consequences.

@chris48s
Copy link
Member Author

@jNullj I do like the idea of applying a solution centrally but constraining it to a hard-coded subset of classes/params so that we don't introduce two variants of every param.

When we search github in the future and see snake_case usage is near 0

Based on previous experience maintaining compatibility for older URLs, this will never happen.
If you put a badge in your README 5 years ago and it still renders today, there's no reason you're going to go back and change that URL.
Any solution we implement here should be viewed as permanent. We'll be maintaining it essentially forever.


Here's another idea that has occured to me that might be helpful in combination with one of these approaches. Joi is very flexible in what you can do at the validation stage. You can make requirements of the input object but also apply transofmrations. So for example, here is a hard-coded Joi schema that requires an object to have either a key called foo_bar OR a key called fooBar. Then applies a normalisation function to always use fooBar in the output object regardless of whether the input object used foo_bar or fooBar.

const queryParamSchema = Joi.object({
  foo_bar: Joi.string(),
  fooBar: Joi.string(),
})
  .xor('foo_bar', 'fooBar') // Ensure only one of them is present
  .custom((value, helpers) => {
    // Normalise the output to always use `fooBar`
    if (value.foo_bar) {
      value.fooBar = value.foo_bar;
    }
    delete value.foo_bar;
    return value;
  })
  .required();

The advantage of doing this is that everything we need to do happens at the validation stage. handle() doesn't need to care at all that there are 2 variants accepted.
Obviously the example I've shown here is hard-coded to specific param names, but if we can write a plugin/helper/wrapper for Joi that can do this in a generic way, that might be a technique we can combine with one of these approaches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
developer-experience Dev tooling, test framework, and CI
Projects
None yet
Development

No branches or pull requests

3 participants