Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should Input Validation be Level 1 #2476

Closed
tghosth opened this issue Dec 17, 2024 · 22 comments
Closed

Should Input Validation be Level 1 #2476

tghosth opened this issue Dec 17, 2024 · 22 comments
Labels
requirement level Issue related to requirement levels V5 Temporary label for grouping input validation, sanitization, encoding, escaping related requirements _5.0 - prep This needs to be addressed to prepare 5.0

Comments

@tghosth
Copy link
Collaborator

tghosth commented Dec 17, 2024

We currently have an explanation of the thinking around L1 which can be found here.

There are a few major areas where we have differing opinions about whether a particular type of control falls into the category of L1 for version 5.0.

This issue is to discuss input validation requirements from the current section 5.1

The currently disputed requirements are 5.1.3, 5.1.4,and 5.1.7 (ignore the level marking below). 5.1.5 is a different discussion and it is agreed that the others in this section are not L1)

# Description L1 L2 L3 CWE
5.1.3 [MODIFIED] Verify that all input is validated using positive validation, against an allowed list of values, patterns or ranges to enforce business or functional expectations for that input. 20
5.1.4 [MODIFIED, SPLIT TO 5.1.7] Verify that data items with an expected structure are validated according to the pre-defined rules. 20
5.1.7 [ADDED, SPLIT FROM 5.1.4] Verify that the application ensures that combinations of related data items are reasonable according to the pre-defined rules. 20

I personally believe that these requirements do not belong in L1 for the following reasons:

  • In all except very specific cases, input validation is a defense in depth issue. There will usually be another first line control (e.g. output encoding, parametrization, context specific sanitization, etc) which is needed to prevent a security vulnerability from being exploited. For instances where a specific mechanism is not mentioned, we already have 5.2.2 to cover those.
  • As a second layer of defense, there is less risk reduction compared to the first line control.
  • Comprehensive input validation is very labour intensive, to be done correctly it requires a full mapping of all possible inputs and their business reasoning. This is a problem because:
    • It is hard to justify the effort for a second line control at the lowest level of maturity.
    • In general, it violates the idea that L1 should have a low barrier to entry.
    • Security cannot provide guidance for this alone but will rather need the business/product management team to agree to restrict inputs for which there is likely to be pushback.
    • If I am honest, I am not sure how exactly this would be verified anyway? Certainly it would be problematic to verify "black box" style.
@tghosth tghosth added 1) Discussion ongoing Issue is opened and assigned but no clear proposal yet _5.0 - prep This needs to be addressed to prepare 5.0 V5 Temporary label for grouping input validation, sanitization, encoding, escaping related requirements requirement level Issue related to requirement levels labels Dec 17, 2024
@elarlang
Copy link
Collaborator

elarlang commented Dec 17, 2024

Although discussed, the provided arguments were a bit one-sided.

Some may think, that security is only confidentiality and maybe availability, ignoring the integrity and data quality part.

Some may also think, that input validation is a defense or replacement for output encoding, but although it may have a defensive effect in certain cases, it is not - building the output is a completely separate topic.

Why are applications made? To provide some automated solutions to satisfy the business need. And it is not possible to achieve that without input validation.

  • email address - how do you send emails? password recovery?
  • phone - if there is a need to get in contact, how do you achieve that?
  • address - if you need to send a package to this address and if it is not validated, you may spend a lot of money to handle the return or delivery mess
  • social security number / person id code - this identifier one person from another
  • date - from .. to choices? the "after date" is before the "before date"? nice solution for insurance, traveling etc.
  • prices, amounts, quantities - numbers? allowed range? int vs float?

The number of different validation rules is not usually that large.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 17, 2024

Although discussed, the provided arguments were a bit one-sided.

Yes I deliberately wanted to let you set out your own arguments rather than me paraphrasing them, apologies if that was not clear.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 17, 2024

email address - how do you send emails? password recovery?
phone - if there is a need to get in contact, how do you achieve that?

Maybe we need a specific requirement to validate an email address as there is a specific security purpose for this, it doesn't justify the full "input validation for everything" in the requirements above.

Same for phone if there is a security purpose, otherwise this is a functionality thing and not a first priority for security.

address - if you need to send a package to this address and if it is not validated, you may spend a lot of money to handle the return or delivery mess
date - from .. to choices? the "after date" is before the "before date"? nice solution for insurance, traveling etc.

These are functionality issues, maybe should be a priority for QA but not a first priority for security.

social security number / person id code - this identifier one person from another

Not convinced this should be used for any security function as it is so hard to change. As such, not a first priority for security.

prices, amounts, quantities - numbers? allowed range? int vs float?

Covered by 11.1.3

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 17, 2024

And it is not possible to achieve that without input validation.

Any input validation which causes the business pain, the business will have already implemented and we don't need to mandate. For anything which doesn't cause them pain, trying to convince them that this is an essential business need is not a job for ASVS 5.0 L1.

@elarlang
Copy link
Collaborator

elarlang commented Dec 17, 2024

So in short, you say, that if the data in the application is trash, it is not a security issue?

And taking into account #2476 - you don't care about data quality, but you want to protect it with cryptography?

(for further comments from here I could get another notice through other channels :) )

@jmanico
Copy link
Member

jmanico commented Dec 17, 2024

Maybe we need a specific requirement to validate an email address as there is a specific security purpose for this, it doesn't justify the full "input validation for everything" in the requirements above.

I agree. Before accepting an email address, especially as part of authentication, I suggest:

  1. Validate the email specifically with the HTML5 email validation rules. https://html.spec.whatwg.org/multipage/input.html#email-state-(type=email)
  2. Send email to that address and seek confirmation that the email is active (a common practice).

Screenshot 2024-12-17 at 1 05 50 PM

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 18, 2024

So in short, you say, that if the data in the application is trash, it is not a security issue?
And taking into account #2476 - you don't care about data quality, but you want to protect it with cryptography?

I think that there is an important mindset point to consider here.

Firstly, if there is data that we are specifically basing security controls on then we need to ensure that is collected correctly. But that is a relatively minor subset of the data we are collecting. My reasoning below relates to the rest of the data.

In the software industry, there is a long time attitude that security is it's own special thing and not part of the software requirements.

I literally had a client tell me this week that their engineering organization sees performance as a key quality indicator but not security. From their perspective, poor response times would be considered a bug but security vulnerabilities would not be. Once a security vulnerability is exploited, only then is it considered a bug to be fixed.

As a software security professional, my job is to try and promote the importance of changing that mindset to implement key security controls. I want L1 to be a primary tool for doing that.

Do I care about rubbish data in the application? Yes.

Could rubbish data be considered a security issue? In theory, yes.

Is it my primary concern? No.

Why not? Because my main concern is things that will get our data stolen or messed up on a wide scale. If someone enters junk data and it breaks the functionality of the application, then someone else like a quality manager or a product manager or application owner will be the person to push for it to be fixed.

It is not a battle that a security person needs to fight. Security people have too many other battles to fight and certainly no one is going to say, "why didn't security tell us not accept addresses in a phone number field?"

However, what they will say is, why didn't security tell us that using "Base64 encryption" on that data that got stolen out of our database was not secure.

@elarlang
Copy link
Collaborator

My reasoning below relates to the rest of the data.

The problem is - do not have input validation for the list of data that is not "critical for security" you proposed to remove the input validation from everything, opening up quite a security hole.

I literally had a client tell me this week that their engineering organization sees performance as a key quality indicator but not security. From their perspective, poor response times would be considered a bug but security vulnerabilities would not be. Once a security vulnerability is exploited, only then is it considered a bug to be fixed.

It is normal, that engineers sees security differently and it is often philosophical question. Using this attitude - inserting ' and " to some field that causes error in executing SQL query or incorrect output when building an HTML, it's just a bug till not exploited as SQL injection or "XSS".

why didn't security tell us not accept addresses in a phone number field?

what?

@tghosth tghosth added the next meeting Filter for leaders label Dec 18, 2024
@jmanico
Copy link
Member

jmanico commented Dec 21, 2024

In general I can see why good input validation is not a L1 control. Stopping injection attacks is usually about escaping, encoding and parameterization. Input validation is usually a secondary defense.

@elarlang
Copy link
Collaborator

elarlang commented Dec 22, 2024

In general I can see why good input validation is not a L1 control. Stopping injection attacks is usually about escaping, encoding and parameterization. Input validation is usually a secondary defense.

Well, details, but let me repeat that: #2476 (comment)

Some may also think, that input validation is a defense or replacement for output encoding, but although it may have a defensive effect in certain cases, it is not - building the output is a completely separate topic.

And the point is: if you measure input validation level as "secondary defense" for output encoding, then it does not take into account the data quality and business-logic area, where it is actually is important and pre-condition to achieve the application providing the business needs.

@jmanico
Copy link
Member

jmanico commented Dec 22, 2024 via email

@elarlang
Copy link
Collaborator

I have proposed it some time ago (too lazy to search where exactly), but it was not agreed back then - but I do it again.

As input validation is for data quality, that the application can make any decision based on trusted data, we should move input validation to "business logic" section, and maybe rename it to just "logic".

Why this proposal: to send a clear message that input validation is not a defense against injection attacks and it is part of the application logic. This seems to be really underestimated, undervalued and misinterpreted here as well.

We are going to have another "potentially heated" discussion on how to re-order chapters - so I think can we start with:

  • V1 Validation, Sanitization and Encoding (keep the syntax correct)
  • V2 Business Logic (keep the application flows and logic correct)

And on topic: the data used by the application for any decision, it must be trusted. To have a trusted data, we need to have input validation - so for me it is a clear L1 situation.

If there is some data that is not used as a base info for making any decision, it can (maybe) be left out from this priority, but how to draw the line here is a good question.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 23, 2024

As input validation is for data quality, that the application can make any decision based on trusted data, we should move input validation to "business logic" section.

I find this direction interesting.

and maybe rename it to just "logic"

I don't love this idea...

I would make the following suggestions:

# Description Level Suggested action
5.1.1 [MODIFIED] Verify that the application has defenses against HTTP parameter pollution attacks, particularly if the application framework makes no distinction about the source of request parameters (query string, body parameters, cookies, or header fields). L1>L2 This is specific to a technical attack so I think it should stay here.
5.1.2 [MOVED TO 10.4.4] N/A
5.1.3 [MODIFIED] Verify that input on which business or security decisions will be made is validated using positive validation, against an allowed list of values, patterns or ranges to enforce business or functional expectations for that input. L1 Move to Business Logic
5.1.4 [MODIFIED, SPLIT TO 5.1.7] Verify that data items with an expected structure on which business or security decisions will be made are validated according to the pre-defined rules. L1 Move to Business Logic
5.1.5 [MODIFIED, SPLIT TO 50.8.1] Verify that the application will only automatically redirect the user to a different URL directly from an application URL where the destination appears on an allowlist. L1 or L1>L2 Prevents technical attack, should stay here
5.1.6 [ADDED] Verify that the application validates that user-controlled input in HTTP request header fields does not exceed the server's maximum header field size limit (usually 4kB or 8kB) to prevent client-based denial of service attacks. L2 Prevents technical attack, should stay here
5.1.7 [ADDED, SPLIT FROM 5.1.4] Verify that the application ensures that combinations of related data items are reasonable according to the pre-defined rules. L1>L2 Move to Business Logic

@elarlang
Copy link
Collaborator

elarlang commented Dec 23, 2024

One option is to have 5.1.1, 5.1.5 and 5.1.6 in V10.4 Defensive Coding.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 23, 2024

One option is to have 5.1.1, 5.1.5 and 5.1.6 in V10.4 Defensive Coding.

I think the problem is that many things could fit into Defensive Coding but I prefer to keep it for things which we really can't fit anywhere else. These requirements are all around untrusted content used in a specific, security sensitive way, and therefore I prefer to keep them here

@elarlang
Copy link
Collaborator

elarlang commented Dec 23, 2024

Entire point is to not have input validation as part of current V5. If to keep some here and some there, it is even more confusing.

I would say, that 5.1.1 is not a input validation as such. It is more how the programming language, framework or whatever takes the input processes the input. Validation is - accept or reject it, I think it is not perfect match for input validation and belongs to defensive coding. This is my proposal despite what will happen with all other changes here.

V5.1.5 is not just a technical input validation - it relies on logic and a defined list to define, what is a trusted and allowed destination.

V5.1.6 it is a technical check, but during a building the header. It may involve more than one input or other logic - just that the combined length that is used in one header field value can not exceed defined limit. It is a question of program code logic and fits to defensive coding.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 23, 2024

I don't see this move as taking input validation out of V5, I see this as reframing several input validation requirements to focus more on the use of the inputs in business/security logic which is why we have the move.

I think that the primary chapter for untrusted content should still be V5.

Again, I think that you could argue that almost any requirement implemented at the code level could go into V10 but I only want to put those in there that are strictly related to coding principles (ideally language specific) and don't really fit anywhere else.

With that being said...

I would say, that 5.1.1 is not a input validation as such. It is more how the programming language, framework or whatever takes the input processes the input. Validation is - accept or reject it, I think it is not perfect match for input validation and belongs to defensive coding. This is my proposal despite what will happen with all other changes here.

I can accept that this is a little language specific so maybe fits better in V10

V5.1.5 is not just a technical input validation - it relies on logic and a defined list to define, what is a trusted and allowed destination.

5.1.5 sounds like classic "accept or reject it" validation so I think this should stay put.

V5.1.6 it is a technical check, but during a building the header. It may involve more than one input or other logic - just that the combined length that is used in one header field value can not exceed defined limit. It is a question of program code logic and fits to defensive coding.

This is not language specific but rather a conceptual question of how you incorporate untrusted content into a header. I still think it is more about input than logic but I am not strongly opinionated.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 23, 2024

@elarlang do you agree with my levels and changes on 5.1.3, 5.1.4 and 5.1.7?

@elarlang
Copy link
Collaborator

In big picture yes, but some wording need to be improved and moving requirements can not be done out of context (without taking everything else into account). next meeting.

@tghosth
Copy link
Collaborator Author

tghosth commented Dec 26, 2024

Discussed with @elarlang and @tghosth to open a PR (and a new issue) as per the points below.

# Description Level Suggested action
5.1.1 [MODIFIED] Verify that the application has defenses against HTTP parameter pollution attacks, particularly if the application framework makes no distinction about the source of request parameters (query string, body parameters, cookies, or header fields). L1>L2 Move to defensive coding as this is very language/framework specific.
5.1.2 [MOVED TO 10.4.4] N/A
5.1.3 [MODIFIED] Verify that input on which business or security decisions will be made is validated using positive validation, against an allowed list of values, patterns or ranges to enforce business or functional expectations for that input. For L2, input validation should be implemented globally. L1 Move to "Input Validation" within V11 Business Logic
5.1.4 [MODIFIED, SPLIT TO 5.1.7] Verify that data items with an expected structure on which business or security decisions will be made are validated according to the pre-defined rules. For L2, input validation should be implemented globally. L1 Move to "Input Validation" within V11 Business Logic
5.1.5 [MODIFIED, SPLIT TO 50.8.1] Verify that the application will only automatically redirect the user to a different URL directly from an application URL where the destination appears on an allowlist. L1 or L1>L2 Leave here but open a separate issue to discuss location
5.1.6 [ADDED] Verify that the application validates that user-controlled input in HTTP request header fields does not exceed the server's maximum header field size limit (usually 4kB or 8kB) to prevent client-based denial of service attacks. L2 Leave here but open a separate issue to discuss location
5.1.7 [ADDED, SPLIT FROM 5.1.4] Verify that the application ensures that combinations of related data items are reasonable according to the pre-defined rules. L1>L2 Move to "Input Validation" within V11 Business Logic

@tghosth tghosth added 5) awaiting PR A proposal hs been accepted and reviewed and we are now waiting for a PR and removed 1) Discussion ongoing Issue is opened and assigned but no clear proposal yet next meeting Filter for leaders labels Dec 26, 2024
@tghosth
Copy link
Collaborator Author

tghosth commented Dec 26, 2024

Opened #2488 but levels are going to need some love.

Also opened #2487 to deal with remaining 5.1.5 and 5.1.6

Also opened #2486 to remind me to update the several levels doc after this gets merged.

@elarlang
Copy link
Collaborator

elarlang commented Jan 2, 2025

As it is merged now - is it something to solve in scope of this issue or it can be closed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
requirement level Issue related to requirement levels V5 Temporary label for grouping input validation, sanitization, encoding, escaping related requirements _5.0 - prep This needs to be addressed to prepare 5.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants