@khanzadimahdi
Last active December 1, 2024 19:17
regex pattern base64 data uri according to RFC 2397

pattern:

^data:((?:\w+\/(?:(?!;).)+)?)((?:;[\w\W]*?[^;])*),(.+)$

test this pattern on regexr: https://regexr.com/4inht
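A minimal sketch of applying the pattern from Python, assuming the standard `re` module treats this expression the same way the regexr engine does (it only relies on a negative lookahead and basic character classes). The helper name `split_data_url` is hypothetical and not part of the gist.

```python
import re

# The pattern from this gist, unchanged.
DATA_URL = re.compile(r"^data:((?:\w+\/(?:(?!;).)+)?)((?:;[\w\W]*?[^;])*),(.+)$")


def split_data_url(url):
    """Return (mime_type, parameters, data), or None if the pattern does not match."""
    match = DATA_URL.match(url)
    if match is None:
        return None
    # Group 1: mime-type (may be empty), group 2: all ";..." parts, group 3: data.
    return match.group(1), match.group(2), match.group(3)


print(split_data_url("data:image/jpeg;base64,UEsDBBQAAAAI"))
# ('image/jpeg', ';base64', 'UEsDBBQAAAAI')
```

Note that group 2 keeps every `;`-prefixed part as one string, so the caller still has to split it to tell `;base64` apart from `key=value` parameters.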

regex pattern to match RFC 2397 data URL

syntax:

dataurl    := "data:" [ mediatype ] [ ";base64" ] "," data
mediatype  := [ type "/" subtype ] *( ";" parameter )
data       := *urlchar
parameter  := attribute "=" value

examples:

example1: simple

data:image/jpeg;base64,UEsDBBQAAAAI

example2: with meta key=value

data:image/jpeg;key=value;base64,UEsDBBQAAAAI

example3: without base64 key name

data:image/jpeg;key=value,UEsDBBQAAAAI

example4: without mime-type

data:;base64;sdfgsdfgsdfasdfa=s,UEsDBBQAAAAI

example5: without mime-type, base64, and meta key=value

data:,UEsDBBQAAAAI
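To see what the pattern captures for each of the five examples, here is a small Python sketch. It assumes Python's `re` module as the engine; the names `PATTERN`, `EXAMPLES`, and `capture_groups` are made up for illustration.

```python
import re

# Same pattern as in the gist.
PATTERN = re.compile(r"^data:((?:\w+\/(?:(?!;).)+)?)((?:;[\w\W]*?[^;])*),(.+)$")

# The five example URLs listed above, in order.
EXAMPLES = [
    "data:image/jpeg;base64,UEsDBBQAAAAI",
    "data:image/jpeg;key=value;base64,UEsDBBQAAAAI",
    "data:image/jpeg;key=value,UEsDBBQAAAAI",
    "data:;base64;sdfgsdfgsdfasdfa=s,UEsDBBQAAAAI",
    "data:,UEsDBBQAAAAI",
]


def capture_groups(url):
    """Return the three captured groups as a tuple, or None when there is no match."""
    match = PATTERN.match(url)
    return match.groups() if match else None


for number, url in enumerate(EXAMPLES, start=1):
    print(f"example{number}:", capture_groups(url))
```

All five examples match. For example4 and example5 the first group is an empty string, and for example5 the second group is empty as well.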

@khanzadimahdi
Author

As for https://bit.dev/chriso/validator-js/is-data-uri, keep in mind that it is an npm package, and any package can have bugs.

@Gpinchon

Gpinchon commented Feb 2, 2021

Ok, thanks for the clarifications there 👍
My understanding was that parameters, being part of [mediatype], had to be placed before the [";base64"] token and after the [type "/" subtype] token if present, but I guess I was wrong.

@Gpinchon

Gpinchon commented Feb 2, 2021

@khanzadimahdi wrote: "As for https://bit.dev/chriso/validator-js/is-data-uri, keep in mind that it is an npm package, and any package can have bugs."

Sure, if you know a more reliable tool to validate data URIs, I would be very interested.

@wfoojjaec

Even in a valid data URL, the data part is allowed to be empty, so it might be a good idea to replace
,(.+)$ with ,(.*)$
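The effect of that change can be checked with a short Python sketch. It assumes Python's `re` module; `ORIGINAL` and `RELAXED` are illustrative names, and the two test URLs are made-up inputs with an empty data part.

```python
import re

# Pattern as posted in the gist: the data part needs at least one character.
ORIGINAL = re.compile(r"^data:((?:\w+\/(?:(?!;).)+)?)((?:;[\w\W]*?[^;])*),(.+)$")

# Same pattern with the suggested change: the data part may be empty.
RELAXED = re.compile(r"^data:((?:\w+\/(?:(?!;).)+)?)((?:;[\w\W]*?[^;])*),(.*)$")

for url in ("data:,", "data:text/plain;base64,"):
    print(url, bool(ORIGINAL.match(url)), bool(RELAXED.match(url)))
# data:, False True
# data:text/plain;base64, False True
```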

@thced

thced commented Sep 1, 2021

One should perhaps also be humble enough to admit that one's own work can have bugs. I would second that example #4 is wrong: data:;base64;sdfgsdfgsdfasdfa=s,UEsDBBQAAAAI is not valid afaik.

The reason for me saying so is that if you look at the standard you posted, you see that:

  1. attribute = value is a parameter
  2. a parameter is placed after type / subtype as part of media-type
  3. a parameter is prefixed by a semi-colon
  4. a parameter is optional
  5. a media-type is always placed before ";base64"

A valid example according to these statements would be: data:;sdfgsdfgsdfasdfa=s;base64,UEsDBBQAAAAI

I may of course be wrong.. 👍
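The stricter reading in the list above could be sketched in Python as follows. This is not the gist's pattern: `STRICT` and `is_strict_data_url` are hypothetical names, and the token character classes are a simplification chosen for illustration (for instance, quoted parameter values are not handled).

```python
import re

# Hypothetical stricter pattern: every parameter is an attribute=value pair,
# and the optional ";base64" marker must come last, right before the comma.
STRICT = re.compile(
    r"^data:"
    r"(?P<type>[\w.+-]+/[\w.+-]+)?"        # optional type "/" subtype
    r"(?P<params>(?:;[\w.+-]+=[^;,]+)*)"   # zero or more ";attribute=value"
    r"(?P<base64>;base64)?"                # optional ";base64" after the media type
    r",(?P<data>.*)$"                      # data, allowed to be empty
)


def is_strict_data_url(url):
    """Return True when the URL fits the stricter ordering sketched above."""
    return STRICT.match(url) is not None


print(is_strict_data_url("data:;sdfgsdfgsdfasdfa=s;base64,UEsDBBQAAAAI"))  # True
print(is_strict_data_url("data:;base64;sdfgsdfgsdfasdfa=s,UEsDBBQAAAAI"))  # False
```

Under this reading, example4 is rejected while the reordered variant and the other four examples are accepted.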
