Last active December 18, 2023 18:57
Detect whether a string is a data URL. This doesn't try to parse the string or determine its validity; it's just a quick check that the string appears to be a data URL.
// Detecting data URLs
// data URI - MDN
// The "data" URL scheme:
// Valid URL Characters:
function isDataURL(s) {
    return !!s.match(isDataURL.regex);
}
isDataURL.regex = /^\s*data:([a-z]+\/[a-z]+(;[a-z\-]+\=[a-z\-]+)?)?(;base64)?,[a-z0-9\!\$\&\'\,\(\)\*\+\,\;\=\-\.\_\~\:\@\/\?\%\s]*\s*$/i;
// Strings that should be detected as data URLs
var yes = [
    " data:,Hello%2C%20World!",
    " data:,Hello World!",
    " data:text/plain;base64,SGVsbG8sIFdvcmxkIQ%3D%3D",
    " data:text/html,%3Ch1%3EHello%2C%20World!%3C%2Fh1%3E"
];
// Strings that should be rejected
var no = [
    "data:text/html;charset,%3Ch1%3EHello!%3C%2Fh1%3E",
    "data:base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAAQAQMAAAAlPW0iAAAABlBMVEUAAAD///+l2Z/dAAAAM0lEQVR4nGP4/5/h/1+G/58ZDrAz3D/McH8yw83NDDeNGe4Ug9C9zwz3gVLMDA/A6P9/AFGGFyjOXZtQAAAAAElFTkSuQmCC"
];
var log = document.createElement("pre");
document.body.appendChild(log);
function printError(msg) {
    var message = document.createElement("span");
    message.style.color = "red";
    message.textContent = msg + "\n";
    log.appendChild(message);
}
function printSuccess(msg) {
    var message = document.createElement("span");
    message.style.color = "green";
    message.textContent = msg + "\n";
    log.appendChild(message);
}
yes.forEach(function(s) {
    if (!isDataURL(s)) {
        printError("Expected yes, got no: " + s);
    }
    else {
        printSuccess("Expected yes, got yes: " + s);
    }
});
no.forEach(function(s) {
    if (isDataURL(s)) {
        printError("Expected no, got yes: " + s);
    }
    else {
        printSuccess("Expected no, got no: " + s);
    }
});
dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
mediatype  := [ type "/" subtype ] *( ";" parameter )
data       := *urlchar
parameter  := attribute "=" value

where "urlchar" is imported from [RFC2396], and "type", "subtype", "attribute" and "value" are the corresponding tokens from [RFC2045], represented using URL escaped encoding of [RFC2396] as necessary.

Attribute values in [RFC2045] are allowed to be either represented as tokens or as quoted strings. However, within a "data" URL, the "quoted-string" representation would be awkward, since the quote mark is itself not a valid urlchar. For this reason, parameter values should use the URL Escaped encoding instead of quoted string if the parameter values contain any "tspecial".

The ";base64" extension is distinguishable from a content-type parameter by the fact that it doesn't have a following "=" sign.
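The grammar above can be illustrated with a small splitter. This is only a sketch: `splitDataURL` is a hypothetical helper that is not part of the gist, it cuts at the first comma, and it does not validate the individual tokens.

```javascript
// Hypothetical helper (not part of the gist): splits a string along the
// RFC 2397 grammar without validating type, subtype, or parameter tokens.
function splitDataURL(s) {
    if (s.slice(0, 5).toLowerCase() !== "data:") {
        return null;
    }
    var comma = s.indexOf(",");
    if (comma === -1) {
        return null; // the "," separator before the data is mandatory
    }
    var header = s.slice(5, comma);
    var data = s.slice(comma + 1);
    // ";base64" has no "=" after it, so it cannot be mistaken for a parameter.
    var isBase64 = /;base64$/i.test(header);
    var mediatype = isBase64 ? header.slice(0, -";base64".length) : header;
    return { mediatype: mediatype, base64: isBase64, data: data };
}

console.log(splitDataURL("data:text/plain;base64,SGVsbG8="));
console.log(splitDataURL("data:,Hello%2C%20World!"));
```

Note that "data:base64,..." yields a mediatype of "base64" with the base64 flag unset, because the ";" is missing.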

guag commented May 13, 2015

I like this, but just a question: any particular reason you went with String.match(regex) instead of the faster regex.test(String)? According to this Stack Overflow Q&A, test() can be 30-60% faster, and it seems like the better option here since you don't need the array of results that match() provides.

Take care :)
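A minimal sketch of the test() variant suggested above, assuming the gist's original pattern is kept unchanged. The variable name `dataURLRegex` is only for illustration.

```javascript
// Same pattern as the gist's original regex, stored in a standalone variable.
var dataURLRegex = /^\s*data:([a-z]+\/[a-z]+(;[a-z\-]+\=[a-z\-]+)?)?(;base64)?,[a-z0-9\!\$\&\'\,\(\)\*\+\,\;\=\-\.\_\~\:\@\/\?\%\s]*\s*$/i;

function isDataURL(s) {
    // test() returns a boolean directly, so no match array is built
    // and the !! coercion is no longer needed.
    return dataURLRegex.test(s);
}

console.log(isDataURL("data:,Hello%2C%20World!"));
console.log(isDataURL("data:text/html;charset,%3Ch1%3EHello!%3C%2Fh1%3E"));
```

One behavioral difference: test() converts its argument to a string, so a non-string input returns false here, whereas s.match() throws when s is null or undefined.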

guag commented May 17, 2015

One other thing I noticed: this seems to fail for data URLs of types that aren't images or text, such as audio/mp3 and video/x-ms-wmv. Check out my fork of your fiddle to see what I mean:

ghost commented Jun 23, 2015

I tweaked the regex to

isDataURL.regex = /^\s*data:([a-z]+\/[a-z0-9\-]+(;[a-z\-]+\=[a-z\-]+)?)?(;base64)?,[a-z0-9\!\$\&\'\,\(\)\*\+\,\;\=\-\.\_\~\:\@\/\?\%\s]*\s*$/i;

and then it handles those additional content types. (I just added 0-9\- to the part following the forward slash).

Mottie commented Jan 16, 2016

The regex doesn't work with data:image/svg+xml;base64,... or data:image/svg+xml;charset=utf-8,... (demo)

Change the regex to the following to fix it (demo)

isDataURL.regex = /^\s*data:([a-z]+\/[a-z0-9\-\+]+(;[a-z\-]+\=[a-z0-9\-]+)?)?(;base64)?,[a-z0-9\!\$\&\'\,\(\)\*\+\,\;\=\-\.\_\~\:\@\/\?\%\s]*\s*$/i;

tansongyang commented Jun 28, 2016

I made a slight tweak so that the regex will accept types with a . character, like application/ I also removed some unnecessary backslashes (\).

isDataURL.regex = /^\s*data:([a-z]+\/[a-z0-9-+.]+(;[a-z-]+=[a-z0-9-]+)?)?(;base64)?,([a-z0-9!$&',()*+;=\-._~:@\/?%\s]*)\s*$/i;
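To see what the successive tweaks in this thread change, here is a quick comparison of the original pattern against the revised one above. The sample strings are illustrative; in particular, application/vnd.ms-excel is just one example of a subtype containing a "." and is not taken from the comment.

```javascript
// Original pattern from the gist.
var originalRegex = /^\s*data:([a-z]+\/[a-z]+(;[a-z\-]+\=[a-z\-]+)?)?(;base64)?,[a-z0-9\!\$\&\'\,\(\)\*\+\,\;\=\-\.\_\~\:\@\/\?\%\s]*\s*$/i;
// Revised pattern: subtype may contain digits, "-", "+" and ".";
// parameter values may contain digits.
var revisedRegex = /^\s*data:([a-z]+\/[a-z0-9-+.]+(;[a-z-]+=[a-z0-9-]+)?)?(;base64)?,([a-z0-9!$&',()*+;=\-._~:@\/?%\s]*)\s*$/i;

var samples = [
    "data:audio/mp3;base64,AAAA",
    "data:video/x-ms-wmv;base64,AAAA",
    "data:image/svg+xml;charset=utf-8,%3Csvg%3E%3C%2Fsvg%3E",
    "data:application/vnd.ms-excel;base64,AAAA"
];

// Each line prints: original result, revised result, sample.
samples.forEach(function(s) {
    console.log(originalRegex.test(s), revisedRegex.test(s), s);
});
```

Every sample is rejected by the original pattern and accepted by the revised one.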

Pamblam commented Feb 3, 2017

killmenot commented Feb 13, 2018

I used this solution to create the npm package valid-data-url and got an email explaining that my package is vulnerable to a ReDoS exploit. I checked this one, and it is vulnerable too. I found a tool that helps check a regex for such exploits. Take a look:

Going back to the initial solution at the top, I recommend:

  1. drop \s* from the beginning and from the end of the regex
  2. use !!s.trim().match(isDataURL.regex); to preserve the existing behavior and fix the exploit at the same time

Hope this helps
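The two steps above could look roughly like this when applied to the original pattern at the top. This is a sketch, not the code of the valid-data-url package.

```javascript
function isDataURL(s) {
    // Surrounding whitespace is removed here instead of inside the regex.
    return !!s.trim().match(isDataURL.regex);
}
// Original pattern with the leading and trailing \s* dropped. The data part
// still allows \s, so whitespace inside the payload is accepted as before.
isDataURL.regex = /^data:([a-z]+\/[a-z]+(;[a-z\-]+\=[a-z\-]+)?)?(;base64)?,[a-z0-9\!\$\&\'\,\(\)\*\+\,\;\=\-\.\_\~\:\@\/\?\%\s]*$/i;

console.log(isDataURL("  data:,Hello World!  "));
console.log(isDataURL("data:base64,iVBORw0KGgo"));
```

Without the trailing \s* there is only one quantifier left that can consume whitespace at the end of the input, which removes the overlap between the data character class and \s*.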

S3gillu commented Feb 20, 2018

Thanks, it works great, here is a demo

