Experimental proof-of-concept of contextual autoescaping in PUG #2895

mikesamuel · 2017-11-03T19:26:20Z

This is a proof of concept, and is not presented as ready for merging in its current form.

Make it easier to write XSS-free PUG frontends by removing a couple
of major XSS vectors

URL attribute values that are not constant expressions are now
filtered so that javascript: URLs will not cause execution.
This is the single largest XSS vector after injecting <script>
&attributes(...) are now checked so that an attacker who
controls a key and value cannot inject JavaScript via
event handlers attributes.
Text interpolation into <script> elements is JSON escaped.

For example, given the template

//-
  Use an opaque predicate to trick Constantinople into treating this as
  requiring evaluation at render time
- var js = Math.random() <= 2 ? "javascript:alert(1)//" : ""
- var attributes = {}
- attributes.href = js + '/'

p
  a(href='javascript:alert(1)') Link with constant href
p
  a&attributes({ href: 'javascript:alert(1)' }) Constant href via ampersand
  | TODO: This is not ideal.
p
  <a href="javascript:alert(1)">
    | Static link not in pug syntax
  </a>
p
  <a href="#{js}">
    | Dynamic link not in pug syntax
  </a>
ul
  li
    a(href=js + '/') Concatenation
  li
    a(href=`${js}/`) Template string
  li
    a(href!=js + '/') Unescaped concatenation
  li
    a&attributes({href: js + '/'}) Ampersand with map constructor
  li
    a&attributes(attributes) Ampersand with expression

without this change results in

<p><a href=\"javascript:alert(1)\">Link with constant href</a></p>
<p><a href=\"javascript:alert(1)\">Constant href via ampersand</a>TODO: This is not ideal.</p>
<p>
  <a href=\"javascript:alert(1)\">Static link not in pug syntax</a></p>
<p><a href=\"javascript:alert(1)//\">Dynamic link not in pug syntax</a></p>
<ul>
  <li><a href=\"javascript:alert(1)///\">Concatenation</a></li>
  <li><a href=\"javascript:alert(1)///\">Template string</a></li>
  <li><a href=\"javascript:alert(1)///\">Unescaped concatenation</a></li>
  <li><a href=\"javascript:alert(1)///\">Ampersand with map constructor</a></li>
  <li><a href=\"javascript:alert(1)///\">Ampersand with expression</a></li>
</ul>

but with these changes produces

<p><a href="javascript:alert(1)">Link with constant href</a></p>
<p><a href="#zSafehtmlz">Constant href via ampersand</a>TODO: This is not ideal.</p>
<p>
  <a href="javascript:alert(1)">Static link not in pug syntax</a></p>
<p><a href="#zSafehtmlz">Dynamic link not in pug syntax</a></p>
<ul>
  <li><a href="#zSafehtmlz">Concatenation</a></li>
  <li><a href="#zSafehtmlz">Template string</a></li>
  <li><a href="#zSafehtmlz">Unescaped concatenation</a></li>
  <li><a href="#zSafehtmlz">Ampersand with map constructor</a></li>
  <li><a href="#zSafehtmlz">Ampersand with expression</a></li>
</ul>

This adds a packages/pug-autoesc which implements a contextual
autoescaper similar to:

It modifies the code generating code in pug-code-gen and pug-attrs
to consolidate code that appends output to pug_html so that, when
options.bufferHooksExperimental is present, the generated code
instead can write to a buffer object.

packages/pug-autoesc defines such a buffer class.

It modifies packages/pug-runtime to make the buffer class and
a few supporting functions available at runtime.

Similar systems allow type-safe exceptions to the rule via typed
strings like Angular's [SafeValue][] type as detailed in "Securing
the Tangled Web".

Should the runtime check if require('@angular/platform-browser')
succeeds at runtime and treat such values as privileged when
such values are written at runtime.

SafeValue: https://angular.io/api/platform-browser/SafeValue

Once we have context about which scripts and styles are under
template author control, we can also make it easier to deploy
the templates under a Content-Security-Policy.
Would PUG users benefit from auto-noncing?

Pug has a plugin mechanism that can operate on the source files,
token stream, ASTs, or generated JS.

This proof-of-concept is not written as a plugin. Instead it
uses hooks so that generated code of the form
(pug_html = pug_html + ...) can be replaced with writes
that accumulate content onto a buffer along with information
about whether that content comes from the template author or
not.

It might be possible to implement equivalent functionality as an
AST plugin that adds calls to filter functions to the AST.
Doing this conservatively might require complicated static analysis
that is aware of template inheritance and which takes into account
the contexts in which mixins are called.

If this approach is taken, the AST pass first need to translate
constructs like

<a href="#{...}">...

to

a(href=`$(...)`)

or just reject them, and behave similarly on unescaped buffered code
interpolations like

p
  != thereBeTagsOfUnkownProvenanceHere

Doing things this way would make the functioning of the plugin
dependent on any later AST processing plugins, and it would have to
panic if there are new kinds of AST nodes added to be conservative.

ForbesLindesay · 2018-01-29T12:43:19Z

For your three examples:

href="javascript:..." - I think this is worth considering. Lots of people probably don't realise this attack vector exists. If we want to solve it though, we should do so regardless of whether it's a constant string or a dynamic value. We should tell people to use != if they want to set an href to a javascript url. What I'm unsure about is that it doesn't seem to me like there's a "correct" way to escape these. Maybe we should just error, but I don't really like that we might do that on some user input.
&attributes are already unescaped and there's loads of attack vectors. I want to replace them with a much safer operator, but I think we should update to a "spread" syntax while we're at it. I think thought on what exactly that should look like, and if it should have any clever interaction with the attributes variable within mixins, is worth having, but a separate issue.
text interpolation in script attributes - This would be a huge breaking change. I do like the idea of a separate "JSON Encoded" escaped interpolation operator (using js-stringify for proper handling of inlining into JavaScript) but I'm not sure I like overloading the existing interpolation operation. The same problem exists (for example) with interpolation in CSS and there isn't an obvious JSON equivalent there. One option would be to start throwing an error on un-escaped interpolation/buffering within script & style tags, since this is always actually unsafe unless you are doing additional escaping yourself. This would be a significant breaking change, but I think I would probably support it.

I suggest starting separate threads for any of these you wish to discuss further.

mikesamuel · 2018-02-06T16:09:27Z

I split out an issue for discussing javascript: URL filter.

The current PR does actually post-process &attributes so improving the security seems feasible even if the plan is to deprecate them in favor of something else. Spread syntax does sound intuitive.

Per interpolation in <script>, would an opt-in contextual mode be feasible or is addressing that a non-starter?

ForbesLindesay · 2018-02-07T15:43:41Z

would an opt-in contextual mode be feasible

no, I don't want to add modes to pug. I only ever want to have one "latest" version of pug. What we could do is deprecate and ultimately error on escaped interpolation inside script/style tags and instead tell people to choose between the more obviously unsafe unescaped interpolation, and some new contextual interpolation

mikesamuel · 2018-02-07T16:06:24Z

Fair enough.

Scheduled deprecation seems like a fine idea for unsafe features.

Would it make sense to also have a scheduled ramp-up to contextual mode?

Add the hooks from this PR or something like it to Pug
Make contextual escaping available as a plugin to early adopters
Communicate that something like contextual mode will become part of Pug-core and encourage early adopters to try out the plugin
Refine the plugin based on feedback
Pull the plugin into core and warn if the, now extraneous, plugin is detected in config

ForbesLindesay · 2018-02-07T16:45:59Z

It doesn't need to be a mode, we can just add the new syntax - possibly in a major version if the code would previously have been valid. As long as the syntax is obscure enough I suspect we won't need to be that gradual in implementing it.

mikesamuel · 2018-02-09T17:48:15Z

How about I separate out all the changes to produce a semantically-neutral PR that just provides the hooks that enable the rest?
If that PR ends up merging, I can follow on with a CL that uses those hooks to intercept URL parameter assignments.

I can also file an issue to capture discussion on spread syntax if that'd be helpful and maybe do some implementation work.

ForbesLindesay · 2018-02-09T18:15:22Z

Yes, implementation work for spread would be awesome.

TimothyGu · 2018-02-11T19:02:25Z

See lexer implementation here: pugjs/pug-lexer#52

This consolidates code generator code that produces instructions to append content to the output and allows customizing it. This enables features like pugjs#2895 which distinguishes between strings definitely authored by the template author and those possibly controlled by an attacker to escape strings in context. This approach allows the same hooks to both handle idiomatic pug like ```pug a [href=x] ``` and inlne html like ```html <a href="#{x}"> ``` See also pugjs#2952

mikesamuel · 2018-02-12T15:21:33Z

Separated hooks out into #2958
Original patch at https://github.com/mikesamuel/pug/tree/context-autoesc

mikesamuel mentioned this pull request Feb 6, 2018

Prevent untrusted javascript: URLs from reaching URL attributes. #2952

Open

mikesamuel closed this Feb 12, 2018

mikesamuel force-pushed the master branch from 0736fc2 to 926f7c7 Compare February 12, 2018 15:02

mikesamuel mentioned this pull request Feb 12, 2018

pug-codegen: Abstract parts of the code generator that accumulate output #2958

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Experimental proof-of-concept of contextual autoescaping in PUG #2895

Experimental proof-of-concept of contextual autoescaping in PUG #2895

mikesamuel commented Nov 3, 2017

ForbesLindesay commented Jan 29, 2018 •

edited

Loading

mikesamuel commented Feb 6, 2018

ForbesLindesay commented Feb 7, 2018

mikesamuel commented Feb 7, 2018

ForbesLindesay commented Feb 7, 2018

mikesamuel commented Feb 9, 2018

ForbesLindesay commented Feb 9, 2018

TimothyGu commented Feb 11, 2018

mikesamuel commented Feb 12, 2018

Experimental proof-of-concept of contextual autoescaping in PUG #2895

Experimental proof-of-concept of contextual autoescaping in PUG #2895

Conversation

mikesamuel commented Nov 3, 2017

ForbesLindesay commented Jan 29, 2018 • edited Loading

mikesamuel commented Feb 6, 2018

ForbesLindesay commented Feb 7, 2018

mikesamuel commented Feb 7, 2018

ForbesLindesay commented Feb 7, 2018

mikesamuel commented Feb 9, 2018

ForbesLindesay commented Feb 9, 2018

TimothyGu commented Feb 11, 2018

mikesamuel commented Feb 12, 2018

ForbesLindesay commented Jan 29, 2018 •

edited

Loading