@shraiwi
Created August 14, 2024 20:45
SiMin

A lightweight format for writing numbers with SI prefixes instead of scientific notation.

I wrote this to make it easy for applications to accept values with units gracefully. Although many languages implement scientific notation, I never found it as ergonomic as just using SI prefixes. Would you rather have a user enter a unitless 1.024e+6 or 1_000KiB?

How does it work?

A SiMin number has three parts: a value, a prefix, and a unit. Let's look at an example:

tokenize('-120mT')  // [ '-120', 'm', 'T' ]
  • '-120' is the value. It can optionally start with a + or -, and consists of digits, decimal points, and underscores. All of the following are valid values:
    • 9_800
    • +1.20030
    • -2300.123_456
  • 'm' is the prefix. It is the longest valid prefix (SI or binary, see the tables below) that the parser can find directly after the value. If nothing matches, the prefix is empty.
  • 'T' is the unit. It's just the rest of the string after the prefix.
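
The longest-match rule is what lets 'da' (deca) win over 'd' (deci) when both could apply. Here is a minimal, self-contained sketch of that rule. The names `KNOWN_PREFIXES` and `longestPrefix` are hypothetical and only cover a handful of prefixes; the real lookup lives in `tokenize`.

```javascript
// Hypothetical, reduced prefix list for illustration only.
const KNOWN_PREFIXES = ['da', 'd', 'Mi', 'M', 'm'];

// Return the longest known prefix at the start of `rest`, or '' if none matches.
function longestPrefix(rest) {
  // Try two-character candidates before one-character ones.
  for (let len = 2; len > 0; len--) {
    const candidate = rest.slice(0, len);
    if (candidate.length === len && KNOWN_PREFIXES.includes(candidate)) {
      return candidate;
    }
  }
  return '';
}

console.log(longestPrefix('daN'));    // 'da'
console.log(longestPrefix('dB'));     // 'd'
console.log(longestPrefix('Mibps'));  // 'Mi'
console.log(longestPrefix('V'));      // ''
```

Trying the longer candidates first means a one-letter prefix is only chosen when no two-letter prefix fits.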

More examples

// parsing numbers
console.log(parseUnit("10MB"));     // [ 10000000, 'B' ]
console.log(parseUnit("100mV"));    // [ 0.1, 'V' ]
console.log(parseUnit("-1.2uT"));   // [ -0.0000012, 'T' ]
console.log(parseUnit("123.45A"));  // [ 123.45, 'A' ]

// you can use underscores to make things more readable
console.log(parseUnit("100_000_000"));  // [ 100000000, '' ]

// automatic BigInt for really large prefixes.
console.log(parseUnit("10_000EiB"));    // [ 11529215046068469760000n, 'B' ]

// use parse to ignore the unit
console.log(parse("920d"));         // 92

// tokenizing numbers (if you want to parse it yourself)
console.log(tokenize("50Mibps"));   // [ '50', 'Mi', 'bps' ]

// you can also get the info of any SI prefix
console.log(prefixData("Mi"));      // [ 2, 20, [Function: parseFloat] ]
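
Since `parseUnit` hands the unit back as a plain string, an application will usually want to check it before trusting the value. Here is one way that check might look. `requireUnit` is a hypothetical helper, not part of SiMin, and the pairs passed in are copied from the `parseUnit` outputs listed above so the snippet runs on its own.

```javascript
// Hypothetical helper: accept a [value, unit] pair shaped like the
// parseUnit output and return the value only if the unit matches.
function requireUnit(pair, expectedUnit) {
  const [value, unit] = pair;
  if (unit !== expectedUnit) {
    throw new RangeError(`expected a value in '${expectedUnit}', got '${unit}'`);
  }
  return value;
}

// Stand-in for parseUnit("10MB")
const bufferSize = requireUnit([10000000, 'B'], 'B');
console.log(bufferSize);  // 10000000

// Stand-in for parseUnit("100mV"), rejected because a byte count was expected
try {
  requireUnit([0.1, 'V'], 'B');
} catch (err) {
  console.log(err.message);  // expected a value in 'B', got 'V'
}
```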

Prefixes

Important

SiMin numbers that use the prefixes marked with 🔢 will parse to BigInt values instead of Numbers.

Prefix  Base  Exponent
E 🔢    10    18
P 🔢    10    15
T       10    12
G       10    9
M       10    6
k       10    3
h       10    2
da      10    1
d       10    -1
c       10    -2
m       10    -3
u, μ    10    -6
n       10    -9
p       10    -12
f       10    -15
a       10    -18

Prefix  Base  Exponent
Ei 🔢   2     60
Pi 🔢   2     50
Ti      2     40
Gi      2     30
Mi      2     20
Ki      2     10
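
Because the 🔢 prefixes produce BigInt results while every other prefix produces a Number, calling code may need to handle both types. The sketch below shows one possible approach. `toSafeNumber` is a hypothetical helper, not part of SiMin. It also demonstrates two standard BigInt behaviours worth knowing here: BigInt and Number cannot be mixed in arithmetic, and `BigInt()` rejects strings with a decimal point, so a fractional value combined with a 🔢 prefix would be expected to throw.

```javascript
// Hypothetical helper: collapse a Number-or-BigInt result to a Number,
// refusing BigInts that a Number cannot represent exactly.
function toSafeNumber(value) {
  if (typeof value !== 'bigint') return value;
  if (value > BigInt(Number.MAX_SAFE_INTEGER) ||
      value < BigInt(Number.MIN_SAFE_INTEGER)) {
    throw new RangeError(`${value} does not fit in a Number without losing precision`);
  }
  return Number(value);
}

console.log(2n ** 60n);                      // 1152921504606846976n (the Ei multiplier)
console.log(toSafeNumber(3n * 10n ** 15n));  // 3000000000000000

try {
  toSafeNumber(10000n * 2n ** 60n);          // far beyond 2^53 - 1
} catch (err) {
  console.log(err.name);                     // RangeError
}

// Mixing BigInt and Number in arithmetic throws, so convert first.
try {
  console.log(1n + 1);
} catch (err) {
  console.log(err.name);                     // TypeError
}

// BigInt() only accepts integer strings.
try {
  BigInt('1.5');
} catch (err) {
  console.log(err.name);                     // SyntaxError
}
```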

// Value parsers: plain floats for most prefixes, BigInt for the largest ones.
// Note: BigInt() only accepts integer strings, so a value with a decimal
// point combined with a BigInt prefix (E, P, Ei, Pi) throws a SyntaxError.
const numParse = Number.parseFloat;
const bigParse = BigInt;

// Prefix tables, indexed by (prefix length - 1).
// Each entry is [ base, exponent, valueParser ].
const prefixes = [
  { // one-letter prefixes
    E: [10n, 18n, bigParse],
    P: [10n, 15n, bigParse],
    T: [10, 12, numParse],
    G: [10, 9, numParse],
    M: [10, 6, numParse],
    k: [10, 3, numParse],
    h: [10, 2, numParse],
    d: [10, -1, numParse],
    c: [10, -2, numParse],
    m: [10, -3, numParse],
    u: [10, -6, numParse],
    μ: [10, -6, numParse],
    n: [10, -9, numParse],
    p: [10, -12, numParse],
    f: [10, -15, numParse],
    a: [10, -18, numParse],
  },
  { // two-letter prefixes
    Ei: [2n, 60n, bigParse],
    Pi: [2n, 50n, bigParse],
    Ti: [2, 40, numParse],
    Gi: [2, 30, numParse],
    Mi: [2, 20, numParse],
    Ki: [2, 10, numParse],
    da: [10, 1, numParse],
  },
];

// Split a SiMin string into [ value, prefix, unit ] without converting anything.
function tokenize(s) {
  // consume a leading +/- if present
  let head = (s[0] === '+' || s[0] === '-') ? 1 : 0;

  // scan the value: digits, decimal points, and underscores
  for (; head < s.length; head++) {
    const c = s[head];
    if ((c < '0' || c > '9') && c !== '.' && c !== '_') break;
  }
  const value = s.slice(0, head);

  // find the longest prefix that directly follows the value
  let prefix = '';
  for (let len = Math.min(s.length - head, prefixes.length); len > 0; len--) {
    const prefixTable = prefixes[len - 1];
    const queryPrefix = s.slice(head, head + len);
    if (queryPrefix in prefixTable) {
      head += len;
      prefix = queryPrefix;
      break;
    }
  }

  // whatever remains is the unit
  const unit = s.slice(head);
  return [ value, prefix, unit ];
}

// Strip the underscore digit separators before numeric parsing.
function sanitize(s) {
  return s.replaceAll('_', '');
}

// Look up [ base, exponent, valueParser ] for a prefix. Empty or unknown
// prefixes (including ones longer than any table) fall back to a multiplier of 1.
function prefixData(prefix) {
  return prefixes[prefix.length - 1]?.[prefix] ?? [ 1, 1, numParse ];
}

// Parse a SiMin string into [ scaledValue, unit ].
function parseUnit(s) {
  const [ value, prefix, unit ] = tokenize(s);
  const [ base, exp, valueParser ] = prefixData(prefix);
  return [ valueParser(sanitize(value)) * (base ** exp), unit ];
}

// Parse a SiMin string and return only the scaled value, ignoring the unit.
function parse(s) {
  return parseUnit(s)[0];
}