Display SVG files as images instead of text #14101
Labels
No Label
backport/done
backport/v1.0
backport/v1.1
backport/v1.10
backport/v1.11
backport/v1.12
backport/v1.13
backport/v1.14
backport/v1.15
backport/v1.2
backport/v1.3
backport/v1.4
backport/v1.5
backport/v1.6
backport/v1.7
backport/v1.8
backport/v1.9
bounty
changelog
dependencies
frontport/done
frontport/main
good first issue
Hacktoberfest
hacktoberfest-accepted
in progress
kind/api
kind/breaking
kind/bug
kind/build
kind/deployment
kind/deprecated
kind/docs
kind/enhancement
kind/feature
kind/lint
kind/misc
kind/moderation
kind/package
kind/proposal
kind/question
kind/refactor
kind/regression
kind/security
kind/summary
kind/testing
kind/translation
kind/ui
kind/upstream-related
kind/usability
kind/ux
lgtm/done
lgtm/need 1
lgtm/need 2
performance/bigrepo
performance/cpu
performance/memory
performance/speed
priority/critical
priority/low
priority/maybe
priority/medium
proposal/rejected
reviewed/confirmed
reviewed/duplicate
reviewed/fixed
reviewed/invalid
reviewed/not-a-bug
reviewed/wontfix
skip-changelog
stale
status/blocked
status/needs-feedback
status/wip
theme/2fa
theme/authentication
theme/avatar
theme/backup-restore
theme/docker
theme/federation
theme/issues
theme/kanban
theme/markdown
theme/migration
theme/mobile
theme/pr
theme/signing
theme/sqlite
theme/timetracker
theme/webhook
theme/wiki
No Milestone
No project
No Assignees
3 Participants
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: lunny/gitea#14101
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "jt/svg"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This is an attempt to close #1095. It makes the following changes:
Content-Type: image/svg+xml
instead oftext/plain
,text/html
, ortext/xml
. This allows users to view an SVG as an image when browsing a repo's tree and when it's embedded in a markdown file.Content-Security-Policy: default-src 'none'; style-src 'unsafe-inline'; sandbox
andX-Content-Type-Options: nosniff
. This prevents the execution of scripts in untrusted SVG files.I tested this with https://github.com/edgar-bonet/test-svg-mime in Firefox and Chrome.
Thank you for this PR, and especially thank you for including tests & documentation :)
This is not an exhaustive review, just a few preliminary comments.
Could this be grouped with other ui settings (eg.
[ui.svg]
)?Another note, could the setting itself be
ENABLE_RENDER
?Once the above have been modified, could you also add the additional settings to the "cheat sheet" found in docs/ directory.
Could you update the year to 2020?
This fixes the SMIL animation in Firefox (which counts as inline style according to this) and matches the CSP GitHub uses.
We usually try to keep these in sync.
@ -399,0 +400,4 @@
isDisplayingRendered := !isDisplayingSource
isRepresentableAsText := base.IsRepresentableAsText(buf)
ctx.Data["IsRepresentableAsText"] = isRepresentableAsText
if !isRepresentableAsText {
So we should also change
ctx.Data["IsDisplayingSource"]
?@ -47,2 +47,4 @@
ctx.Resp.Header().Set("Content-Disposition", fmt.Sprintf(`inline; filename="%s"`, name))
ctx.Resp.Header().Set("Access-Control-Expose-Headers", "Content-Disposition")
if base.IsSVGImageFile(buf) {
ctx.Resp.Header().Set("Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline'; sandbox")
Is
unsafe-inline
necessary here? ref. https://content-security-policy.com/unsafe-inline/@ -399,0 +400,4 @@
isDisplayingRendered := !isDisplayingSource
isRepresentableAsText := base.IsRepresentableAsText(buf)
ctx.Data["IsRepresentableAsText"] = isRepresentableAsText
if !isRepresentableAsText {
Fixed.
@ -47,2 +47,4 @@
ctx.Resp.Header().Set("Content-Disposition", fmt.Sprintf(`inline; filename="%s"`, name))
ctx.Resp.Header().Set("Access-Control-Expose-Headers", "Content-Disposition")
if base.IsSVGImageFile(buf) {
ctx.Resp.Header().Set("Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline'; sandbox")
Without it, many SVG images look nothing like they're intended to look. Many SVG attributes have a style alternative.
That said, I originally omitted it, but I was asked to add it in another comment in this PR. As long as it's safe, I think that including it is a good thing.
@ -47,2 +47,4 @@
ctx.Resp.Header().Set("Content-Disposition", fmt.Sprintf(`inline; filename="%s"`, name))
ctx.Resp.Header().Set("Access-Control-Expose-Headers", "Content-Disposition")
if base.IsSVGImageFile(buf) {
ctx.Resp.Header().Set("Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline'; sandbox")
Please see my ref article, It said
@ -47,2 +47,4 @@
ctx.Resp.Header().Set("Content-Disposition", fmt.Sprintf(`inline; filename="%s"`, name))
ctx.Resp.Header().Set("Access-Control-Expose-Headers", "Content-Disposition")
if base.IsSVGImageFile(buf) {
ctx.Resp.Header().Set("Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline'; sandbox")
I read the article. I'm having trouble understanding how nonce or hash is actually better. If I understand it correctly, a hash prevents modification between the server and client, which is not more than what HTTPS would do. So this is to make the HTTP case safer.
If you think that's what we should do, it's not clear to me how to actually implement it. It seems like we would need to parse the SVG file enough to either insert nonces at the appropriate places in the SVG or read all the inline styles so that we can compute their hashes. Is this what you're saying? I'm guessing this must include all the
style="..."
attributes. There could potentially be many of them. I'll think about it some more.@ -47,2 +47,4 @@
ctx.Resp.Header().Set("Content-Disposition", fmt.Sprintf(`inline; filename="%s"`, name))
ctx.Resp.Header().Set("Access-Control-Expose-Headers", "Content-Disposition")
if base.IsSVGImageFile(buf) {
ctx.Resp.Header().Set("Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline'; sandbox")
@lunny The suggestion from the ref article is not applicable in this case, as the CSP is applied when rendering user-generated content (the SVG file). The general purpose of using nonce is so that website can include inline script in non-user controlled HTML content while at the same time benefiting from CSP blocking possible XSS attacks.
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
mostly we use 1024 bytes - I think that's what works best for most external libraries too.
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
would it be possible to add a test to ensure that if a file contains SVG beyond the sniff length that the SVG is still stripped? I could imagine it being something that unfamiliar developers might want to remove and we must not allow it to sneak back in unsafely.
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
I copied this from the
go http module
that was being used before. I can add a test, but I'm pretty sure 512 is what was being used before.@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
I stand corrected! I'm misremembering the bytes/runes thing and lfs etc.
Would you be able to add a test that ensures a file with the svg tag after 512 chars still gets sanitized?
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
(I know your code doesn't change this - it's protecting against someone else removing the sanitization later by mistake
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
I'd be happy to add a test, but to be clear, there is no sanitizing or stripping in this PR. The security is by the nature of how
<img>
tags and content-security-policies work. So I'm not sure what the test would actually assert. If the svg tag weren't in the first 512 bytes, the file would get detected as something else, such as text/plain, text/html, or text/xml and served as that. When viewing the file, gitea would display it as a text file. When viewing the raw file, the browser will render it as text. When embedded in an<img>
tag from a markdown file, the image would fail to display.@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
BTW, as it's currently implemented, we only attempt to look for an SVG content type if it's already been detected as one of: text/plain, text/html, text/xml. So it's not possible that having the svg tag later in the file than 512 bytes would cause a file to be detected as something other than those three content types.
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
That's exactly my point - we want the HTML file with embedded SVG after 512 to have the SVG sanitised away and displayed without it.
The test would simply assert that there were no SVG els.
The point is to prevent some well meaning change to the sanitizer leading to unsanitised SVG leaking.
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
If you really don't want to do it - it can be added later but it'd be good to get the test in as an example of something that is not going to be supported and also something we don't want to happen.
@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
I think there's a misunderstanding. Maybe I'm not being clear enough. If there's a file that's detected as text/html with a root
<svg>
tag after 512 bytes, and we serve it withContent-Type: text/html
, the browser will never up-convert toimage/svg+xml
because we're also serving the headerX-Content-Type-Options: nosniff
, which tells the browser, "take our word for it; don't try to do your own sniffing".So I don't see a need to assert that there are no
<svg>
tags. Am I missing something?@ -31,1 +38,4 @@
var svgTagRegex = regexp.MustCompile(`(?s)\A\s*(?:<!--.*?-->\s*)*<svg\b`)
var svgTagInXMLRegex = regexp.MustCompile(`(?s)\A<\?xml\b.*?\?>\s*(?:<!--.*?-->\s*)*<svg\b`)
// EncodeMD5 encodes string to md5 hex value.
I'm working on a test that asserts that a file with a root
<svg>
tag after 512 bytes is detected as text. Hopefully, that's what you mean.@ -47,2 +47,4 @@
ctx.Resp.Header().Set("Content-Disposition", fmt.Sprintf(`inline; filename="%s"`, name))
ctx.Resp.Header().Set("Access-Control-Expose-Headers", "Content-Disposition")
if base.IsSVGImageFile(buf) {
ctx.Resp.Header().Set("Content-Security-Policy", "default-src 'none'; style-src 'unsafe-inline'; sandbox")
nonce of CSP requires a change to the svg content dynamiclly which will make the thing complicated.