GitHub Security Lab CTF 3 write-up: XSS-unsafe jQuery plugins

Challenge

The challenge was to write a CodeQL query which can detect a dangerous XSS vector pattern: $(<css-selector>) in bootstrap plugins for which the <css-selector> can be reached from one of the plugin options. The reason why this is dangerous is because clients of the plugin might not be aware of this vector and might pass unsanitized user-input as an option leading to XSS.

Steps

The challenge was broken into steps which made our query progressively better.

Step 0: Finding the arguments to potentially unsafe jQuery function calls

This step is about understanding some basic in-built models and finding valid xss sinks.

jquery(): a Data Flow node corresponding to $. From what I understood it is useful if we want to catch not only direct $(‘css-selector’) invocations but also patterns like f('css-selector') where f may have been assigned the value $
JQuery::MethodCall::interpretsArgumentAsHtml: Even better as it this can help us find other possible function calls which can be dangerous because they might be interpreting the argument as Html.

Predicate definition:

predicate isMethodArgumentInterpretedAsHtml(string name) {
  name = "after" or
  name = "append" or
  name = "appendTo" or
  name = "before" or
  name = "html" or
  name = "insertAfter" or
  name = "insertBefore" or
  name = "prepend" or
  name = "prependTo" or
  name = "replaceWith" or
  name = "wrap" or
  name = "wrapAll" or
  name = "wrapInner"
}
--omitted--									
predicate interpretsArgumentAsHtml(DataFlow::Node node) {
  // some methods interpret all their arguments as (potential) HTML
  JQuery::isMethodArgumentInterpretedAsHtml(name) and
  node = getAnArgument()
  or
  // for `$, it's only the first one
  name = "$" and
  node = getArgument(0)
}

So it’s not only $(<css-selector>) that is a valid sink, but also functions like something.after(arg) or something.append(arg) if arg happens to be a user-controlled css-selector. Query for a sink:

import javascript
import semmle.javascript.frameworks.jQuery
										
from JQuery::MethodCall mc, DataFlow::SourceNode n
where mc.interpretsArgumentAsHtml(n)
select mc.getArgument(0)

which found 133 results as the question expected.

Step 1: Finding jQuery plugins and their options

Plugin definitions can be simple like: $.fn.copyText = function() { ... } which can be detected by query like:

import javascript
							
from
  DataFlow::FunctionNode fn, DataFlow::PropWrite jqdotfndotx, DataFlow::SourceNode jq,
  DataFlow::PropRead jqdotfn
where
  jq = jquery() and // Data-flow model for $
  jqdotfn = jq.getAPropertyRead("fn") and // Reads of $.fn
  jqdotfndotx = jqdotfn.getAPropertyWrite() and // Write to $.fn.<x>
  fn = jqdotfndotx.getRhs() // RHS for the assignment to $.fn.x
select jqdotfndotx, fn

Note that $.fn.something = something_else is considered to be a prperty read for “fn” and a property write for “x”.

There might also be plugin definitions which are indirect assignments to $.fn. like:

let fn = $.fn;
let f = function() { ... }			
fn.copyText = f;

which can be if caught data-flow is taken into account with a query like:

import javascript

from
  DataFlow::FunctionNode fn, DataFlow::PropWrite jqdotfndotx, DataFlow::PropRead jqdotfn,
  DataFlow::Node intermediate, DataFlow::SourceNode jq
where
  jq = jquery() and
  jqdotfn = jq.getAPropertyRead("fn") and
  jqdotfn.flowsTo(intermediate) and
  jqdotfndotx.getBase() = intermediate and
  fn.flowsTo(jqdotfndotx.getRhs())
select jqdotfndotx, fn

which finds 13 results as expceted.

Note: The data-flow happens to the RHS of the property write (jqdotfndotx.getRhs()) and not the assignment itself (jqdotfndotx).

After finding the plugin definitions, the next step is to find plugin options. The plugin option object can be found as one of the arguments in the plugin definition. The query below selects the last parameter (as in the challenge description) or any parameter with “option” in its name:

import javascript

from
  DataFlow::FunctionNode fn, DataFlow::PropWrite jqdotfndotx, DataFlow::PropRead jqdotfn,
  DataFlow::Node intermediate, DataFlow::SourceNode jq, DataFlow::ParameterNode p
where
  jq = jquery() and
  jqdotfn = jq.getAPropertyRead("fn") and
  jqdotfn.flowsTo(intermediate) and
  jqdotfndotx.getBase() = intermediate and
  fn.flowsTo(jqdotfndotx.getRhs()) and
  (
    p = fn.getAParameter() and 
    p.getName().matches("%option%") // a parameter with "option" in its name
    or
    p = fn.getLastParameter() and // the last parameter if no such parameter exists
    not exists(DataFlow::ParameterNode p2 |
      p2 = fn.getAParameter() and var options = typeof option == 'object' && option
      p2.getName().matches("%option%")
    )
  )
select jqdotfndotx, fn, p

Results: https://lgtm.com/query/1611065064630697765/

Direct property reads of the options object gives 0 results because it gets modified before a property is read. Like this for example:

var options = typeof option == 'object' && option

In order to account for this the TaintTracking needs to be used. This is done by extending the TaintTracking:Configuration class and defining the isSource and isSink predicates for the taint flow. Valid sources are the options object and sinks are the possibly vulnerable function calls which might interpret their arguments as html (Step 0):

/**
 * @name Cross-site scripting vulnerable plugin
 * @kind path-problem
 * @id js/xss-unsafe-plugin
 */

import javascript
import DataFlow::PathGraph

class Configuration extends TaintTracking::Configuration {
  Configuration() { this = "XssUnsafeJQueryPlugin" }

  override predicate isSource(DataFlow::Node source) {
    exists(
      DataFlow::FunctionNode fn, DataFlow::PropWrite jqdotfndotx, DataFlow::PropRead jqdotfn,
      DataFlow::Node intermediate, DataFlow::SourceNode jq
    |
      jq = jquery() and
      jqdotfn = jq.getAPropertyRead("fn") and
      jqdotfn.flowsTo(intermediate) and
      jqdotfndotx.getBase() = intermediate and
      fn.flowsTo(jqdotfndotx.getRhs()) and
      (
        source = fn.getAParameter() and
        source.(DataFlow::ParameterNode).getName().matches("%option%")
        or
        source = fn.getLastParameter() and
        not exists(DataFlow::ParameterNode p2 |
          p2 = fn.getAParameter() and
          p2.getName().matches("%option%")
        )
      )
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    exists(JQuery::MethodCall mc | mc.interpretsArgumentAsHtml(sink))
  }
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
selectj sink.getNode(), source, sink, "Potential XSS vulnerability in plugin."

Because isSource and isSink are predicates (with boolean return values), the queries from previous steps needed to be restructured within exists() clauses. Comparing the results of running this query on the patched and unpatched versions of bootstrap codebase:

Patched:

File	Line	Classification	Explanation
js/affix.js	19	False Positive	The ternary conditional before the vulnerable sink ensures that malicious user input doesn’t reach it. So it is not vulnerable anymore.
js/scrollspy.js	113	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code
js/scrollspy.js	127	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code

Unpatched:

File	Line	Classification	Explanation
js/affix.js	19	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/collapse.js	140	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/scrollspy.js	113	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/scrollspy.js	127	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/tooltip.js	54	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/tooltip.js	207	False Negative	My query does not handle taint propagated through (pseudo) class fields

Step 3: Triaging the query results

The query finds variants that were not patched: at js/scrollspy.js lines 113,127. An exploit proof-of-concept would look like:

      var templateHTML = '<div id="scrollspyTarget"></div>';
      $(templateHTML).appendTo(document.body);

      $("#scrollspyTarget").scrollspy({
        target: "<img src=1 onerror='alert(0)'>//"
      });

The query also generated a false-negative at: js/tooltip:207. That happened because the taint tracking library does not take taint propagated through (pseudo) class fields into account by default. This essentially means that the following is not considered a taint step by default:

Object.property = source;
.
.
V
sink = Object.property;

This can be accounted for by overriding another predicate isAdditionalTaintStep for the Configuration class as follows:

  override predicate isAdditionalTaintStep(DataFlow::Node pred, DataFlow::Node succ) {
    exists(DataFlow::ClassNode c, string s |
      pred = c.getAnInstanceReference().getAPropertyWrite(s).getRhs() and
      succ = c.getAnInstanceReference().getAPropertyRead(s)
    )
  }

Note: Later going through the documentation, I came across the library function DataFlow::localFieldStep which does this.

Adding this to the original query, the comparison between the patched and unpatched version becomes:
Unpatched:

File	Line	Classification	Explanation
js/affix.js	19	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/collapse.js	140	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/scrollspy.js	113	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/scrollspy.js	127	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/tooltip.js	54	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/tooltip.js	207	True Positive	My query does not handle taint propagated through (pseudo) class fields
js/tooltip.js	432	False Positive	This is an intended sink for an HTML string with a default value that indicates the same

Patched:

File	Line	Classification	Explanation
js/affix.js	19	False Positive	The ternary conditional before the vulnerable sink ensures that malicious user input doesn’t reach it. So it is not vulnerable anymore.
js/scrollspy.js	113	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/scrollspy.js	127	True Positive	Intended usage is for a css selector but can be abused to execute JavaScript code. $(document).find should be used instead.
js/tooltip.js	432	False Positive	This is an intended sink for an HTML string with a default value that indicates the same

Step 4: Inspiration for further improvements

While using JQuery::MethodCall::interpretsArgumentAsHtml: functions like .append(…) are considered valid sinks by the query even though it does not interpret its argument as a css selector. A more precise query should consider only those methodcalls as valid sinks for which the argument can be ambiguously interpretted as both html and css:

  override predicate isSink(DataFlow::Node sink) {
    exists(JQuery::MethodCall mc |
      mc.interpretsArgumentAsHtml(sink) and mc.interpretsArgumentAsSelector(sink)
    )
  }

Full query: https://lgtm.com/query/8326532560602353549/

Another thing that the query might miss is deliberate XSS sinks for which the programmer intends dynamically constructed HTML to be inserted there. Taking this into account would remove the false positive found earlier at js/tooltip:432 because the default value of template is an HTML string and not a CSS selector:

<--omitted-->
string getOptionName(DataFlow::PathNode sink, Configuration cfg) {
  exists(
    DataFlow::PropRef prBase, DataFlow::PropRef prOption, DataFlow::PathNode source,
    DataFlow::PathNode prOptionPN
  |
    prOptionPN = source.getASuccessor*() and
    prOption = prOptionPN.getNode() and
    sink = prOptionPN.getASuccessor*() and
    (
      prBase = prOption.getBase() and
      prBase.getPropertyName() = "options"
      or
      prOption.getBase().toString() = "options"
    ) and
    result = prOption.getPropertyName() and
    cfg.hasFlowPath(source, sink)
  )
}

predicate isDefaultHTML(string optionName) {
  exists(
    DataFlow::ClassNode cn, DataFlow::Node n, DataFlow::PropWrite defaults,
    DataFlow::PropWrite option, StringLiteral possibleHTML
  |
    cn.flowsTo(n) and
    defaults.getBase() = n and
    defaults.getPropertyName() = "DEFAULTS" and
    option.getPropertyName() = optionName and
    option.getBase().asExpr().getParent*() = defaults.getRhs().asExpr() and
    possibleHTML = option.getRhs().asExpr() and
    possibleHTML.getStringValue().matches("<%>")
  )
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink, string optionName
where
  cfg.hasFlowPath(source, sink) and
  optionName = getOptionName(sink, cfg) and
  not isDefaultHTML(optionName)
select sink.getNode(), source, sink, "Potential XSS vulnerability in plugin."

Full query here: https://lgtm.com/query/4286196275218248538/

The query would also probably generate more false-positives if it considers sanitized arguments as valid sinks too. These can be detected by extending the TaintTracking::SanitizerGuardNode and overriding the isSanitizerGuard predicate of our configuration class accordingly. Query at: https://lgtm.com/query/2562618391260740405/.
Another improvement to the query would be to have the plugin and option name in the alert message.
The final query looks like https://lgtm.com/query/5711000722013907995/

Actual solution

Answers & Feedback - GitHub Security Lab CTF 3 has the ideal answers and feedback to all the CTF submissions. My query was definitely not ideal but I think it was good enough for a start. I won first place and a nintendo switch which is awesome too :)

Big thanks to Github Security Lab for creating this CTF.