@airtonix
Last active August 19, 2024 01:37
Partial Lark Grammar for Hledger CSV transform rules
// https://hledger.org/1.34/hledger.html#csv
// https://lark-parser.readthedocs.io/en/latest/grammar.html
//
// TODO:
// - IF tables (import_rule)
// - IF empty row (import_rule)
//
// NEEDS TESTING:
// - every rule except import and include
//
// Mini Help
//
// TERMINAL: an uppercase name; matches a piece of text (a token)
// rule: a lowercase name; matches a sequence of terminals and rules and produces a tree node
// _UPPERCASE_WORD: a terminal with a leading underscore is filtered out of the tree
// "string": match exactly this text; don't create an output node
// /regex/: match this pattern
// %ignore: the lexer discards matches of this terminal wherever they occur, so rules never see them
// zero or more rules, each optionally followed by newlines, or only blank lines
start: rule* | _NEWLINE*
// a rule is one of the hledger CSV directives, a field assignment, an "include" or an "if" block
rule: source_rule
    | separator_rule
    | skip_rule
    | date_format_rule
    | timezone_rule
    | newest_first_rule
    | intra_day_reversed_rule
    | decimal_mark_rule
    | field_list_rule
    | field_assignment_rule
    | balance_type_rule
    | include_rule
    | import_rule
// https://hledger.org/1.34/hledger.html#source
source_rule: "source" source_value _NEWLINE*
source_value: /[^\n]+/
// https://hledger.org/1.34/hledger.html#separator
separator_rule: "separator" separator_value _NEWLINE*
// the keywords come before "." because regex alternation is tried left to right
separator_value: /(TAB|SPACE|.)/i
// https://hledger.org/1.34/hledger.html#skip
skip_rule: "skip" skip_row_count _NEWLINE*
skip_row_count: /[0-9]+/
// https://hledger.org/1.34/hledger.html#date-format
date_format_rule: "date-format" date_format _NEWLINE*
date_format: /[^\n]+/
// https://hledger.org/1.34/hledger.html#timezone
timezone_rule: "timezone" timezone_value _NEWLINE*
timezone_value: /[^\n]+/
// https://hledger.org/1.34/hledger.html#newest-first
newest_first_rule: "newest-first" _NEWLINE*
// https://hledger.org/1.34/hledger.html#intra-day-reversed
intra_day_reversed_rule: "intra-day-reversed" _NEWLINE*
// https://hledger.org/1.34/hledger.html#decimal-mark
decimal_mark_rule: "decimal-mark" decimal_mark_value _NEWLINE*
// "!" keeps the matched token, so the tree records which mark was used
!decimal_mark_value: "," | "."
// https://hledger.org/1.34/hledger.html#fields-list
field_list_rule: "fields " (field_header|_HEADER_SEPARATOR)* _NEWLINE*
field_header: /\w+/
_HEADER_SEPARATOR: /,\s*/
// https://hledger.org/1.34/hledger.html#field-assignment
field_assignment_rule: assignment_field assignment_value _NEWLINE*
assignment_field: _HLEDGER_FIELD_KEY
assignment_value: /[^\n]+/
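// balance-type: the kind of balance assertion generated for balance fields
// "!" keeps the matched token, so the tree records which type was chosen
balance_type_rule: "balance-type" balance_type _NEWLINE*
!balance_type: "="
    | "=*"
    | "=="
    | "==*"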
// Include
include_rule: "include" include_path _NEWLINE*
include_path: /[^\n]+/
// import rules start with "if"
import_rule: _IF (match_field | match_line)+ transform+ _NEWLINE*
match_line: match_line_value _NEWLINE
match_line_value: /^[^%](.+)/im
match_field: (match_field_or_key|match_field_and_key) match_field_value _NEWLINE
match_field_or_key: /%[a-zA-Z0-9-_]+/
match_field_and_key: /&\ %?[a-zA-Z0-9-_]+/
match_field_value: /.+/
transform: _INDENT transform_key transform_value _NEWLINE?
transform_key: _HLEDGER_FIELD_KEY
transform_value: /[^\n]+/
_HLEDGER_FIELD_KEY: "date"
    | "date2"
    | "status"
    | "code"
    | "description"
    | /comment\d?/
    | /account\d?/
    | /amount\d?/
    | /amount-in(-\d)?/
    | /amount-out(-\d)?/
    | /currency\d?/
    | /balance\d?/
_WS: /^\s+$/i
_INDENT: " "+
_IF: /if.*\n?/i
_MATCH_KEY: _MATCH_OR_KEY | _MATCH_AND_KEY
_MATCH_OR_KEY: "&"
_MATCH_AND_KEY: "& %"
%import common.NEWLINE -> _NEWLINE
%import common.WORD -> _WORD
%import common.CNAME -> _CNAME
%import lark.COMMENT -> _COMMENT
%import common.WS_INLINE -> _WS_INLINE
%import common.WS
%ignore _COMMENT _NEWLINE?
%ignore _WS
from lark import Lark
hledger_parser = Lark.open(
"hledger-csv-rule.lark",
rel_to=__file__,
parser="lalr",
)
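By default Lark compiles regex terminals with Python's `re` module, so individual terminal patterns can be sanity-checked on their own before the whole grammar is loaded. The sketch below is not part of the gist: `first_match` is a hypothetical helper, and the patterns are illustrative stand-ins for terminals such as `skip_row_count` and `separator_value`. It shows two pitfalls that are easy to hit: a digit range written without brackets, and an alternation whose catch-all branch comes first.

```python
import re


def first_match(pattern, text, flags=0):
    """Return the text matched at the start of `text`, or None (hypothetical helper)."""
    m = re.match(pattern, text, flags)
    return m.group(0) if m else None


# Without brackets, 0-9 is the literal text "0-9", not a digit range.
print(first_match(r"0-9", "3"))        # None
print(first_match(r"[0-9]+", "12"))    # 12

# Alternation is tried left to right, so a leading "." wins over the keywords.
print(first_match(r"(.|TAB|SPACE)", "TAB", re.IGNORECASE))   # T
print(first_match(r"(TAB|SPACE|.)", "TAB", re.IGNORECASE))   # TAB
print(first_match(r"(TAB|SPACE|.)", ";", re.IGNORECASE))     # ;
```

This only checks each pattern in isolation. Inside Lark the lexer also has to choose between competing terminals, so a pattern that passes here can still need testing against real rules files.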
if
SOME DESCRIPTION
%a-field-name some value in field
    account2 expenses:self:entertainment:eatingout ;something
    comment icon:🍫
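To make the shape of an `if` block concrete, here is a minimal standard-library sketch that splits one block into its matcher lines and its field assignments. It is not the Lark parser from this gist: `split_if_block` is a hypothetical helper, and it assumes that assignment lines are indented while matcher lines are not, which is the distinction the `transform` rule relies on through `_INDENT`.

```python
SAMPLE = """\
if
SOME DESCRIPTION
%a-field-name some value in field
    account2 expenses:self:entertainment:eatingout ;something
    comment icon:🍫
"""


def split_if_block(text):
    """Split one `if` block into (matchers, assignments). Hypothetical helper."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or lines[0].split()[0] != "if":
        raise ValueError("block must start with 'if'")

    matchers = []
    assignments = {}

    # Anything after "if" on the first line counts as the first matcher.
    inline = lines[0].strip()[2:].strip()
    if inline:
        matchers.append(inline)

    for line in lines[1:]:
        if line[0] in " \t":
            # Indented line: "<field> <value>" assignment.
            key, _, value = line.strip().partition(" ")
            assignments[key] = value.strip()
        else:
            # Unindented line: a matcher (plain regex or %field regex).
            matchers.append(line)
    return matchers, assignments


matchers, assignments = split_if_block(SAMPLE)
print(matchers)
print(assignments)
```

A parse tree from the grammar should carry the same two groups, as `match_line` / `match_field` nodes and `transform` nodes, so a helper like this can serve as a rough cross-check when testing `import_rule`.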