Skip to content

Instantly share code, notes, and snippets.

@dmfay
Last active September 6, 2017 17:56
Show Gist options
  • Save dmfay/c4017b8768519679e7948706fc09532e to your computer and use it in GitHub Desktop.
Save dmfay/c4017b8768519679e7948706fc09532e to your computer and use it in GitHub Desktop.
Collapse JOIN query result sets into a hierarchical object graph
'use strict';
/**
* Collapses tabular result sets into a (hierarchical) object graph based on
* column nomenclature. Given a query that selects parent and child columns as
* parent__id, parent__val, children__id, children__val, this will return an
* array of objects in the form
*
* {id: 1, val: 'parent val', children: [{id: 11, val: 'child val'}]}
*
* and so on.
*
* The optional options parameter may be an object with properties of the same
* name as any child entities (so 'children' in the example query above). Each
* child property defines behavior for the matching entity.
*
* If the options for 'children' contain {single: true} then children will be
* a single object rather than an array.
*
* If the options for 'children' contain {pk: 'my_id'} then the primary key of
* rows in children will be defined as my_id for purposes of filtering
* duplicates; otherwise, the primary key name will be assumed to be the same
* as the 'pk' parameter.
*
* Sample options object:
* {1to1child: {single: true}, custompkchild: {pk: 'my_id'}}
*
* @param {[type]} parent Name of the parent entity, the first part of parent__id.
* @param {[type]} pk Name of the parent primary key field, also applied to children unless overridden.
* @param {[type]} options Options defining special behavior (see above).
* @param {[type]} data Data to operate on.
* @return {[type]} Transformed object graph.
*/
exports = module.exports = function (parent, pk, options, data) {
if (data === undefined) {
data = options;
options = {};
}
if (!data || data.length === 0) {
return [];
}
/* schemata defines the structural relationships of the entity-models and the fields each model consists of, and maps
* the final field names to the original columns in the query resultset.
* example: {id: parent__id, name: parent__name, children: {id: children__id, name: children__name}} */
var schemata = Object.keys(data[0]).reduce(function (acc, c) {
var tuple = c.split('__');
var entity = acc;
var name;
do {
name = tuple.shift();
if (name !== parent) { // avoid creating a parent schema, we want that to be the root
// this almost certainly does Bad things if the graph is cyclic
// but fortunately we don't need to worry about that since the
// column name format can't define a backwards relationship
if (!entity.hasOwnProperty(name)) {
entity[name] = {};
}
entity = entity[name];
}
} while (tuple.length > 1); // walk as deep as we need to for child__grandchild__greatgrandchild__fieldname etc
entity[tuple.pop()] = c; // set {fieldname: path__to__fieldname} pair
return acc;
}, {});
/* mapping is a nested dictionary of id:entity but otherwise in the form of the final structure we're trying to build,
* effectively hashing ids to ensure we don't duplicate any entities in cases where multiple dependent tables are
* joined into the source query.
*
* example: {1: {id: 1, name: 'hi', children: {111: {id: 111, name: 'ih'}}} */
var mapping = data.reduce(function (acc, row) {
return (function build (obj, schema, parents, name) {
var opts = options[name] || {};
var pkField = name + '__' + (opts.pk || pk);
if (parents.length) {
pkField = parents.join('__') + '__' + pkField; // anything deeper than child__id needs to build the full column name
}
var id = row[pkField];
if (id === null) { // null id means this entity doesn't exist and was likely outer joined in
return;
} else if (!obj.hasOwnProperty(id)) { // this entity is new
obj[id] = {};
}
Object.keys(schema).forEach(function (c) {
if (typeof schema[c] === 'string') { // c is a field
obj[id][c] = row[schema[c]];
} else { // c is a relation
if (!obj[id].hasOwnProperty(c)) {
obj[id][c] = {}; // current object does not have relation defined, initialize it
}
// if parent isn't the root schema include that when we recurse, otherwise ignore
build(obj[id][c], schema[c], (name !== parent) ? parents.concat([name]): parents, c);
}
});
return obj;
})(acc, schemata, [], parent);
}, {});
/* Build the final graph. The structure and data already exists in mapping, but we need to transform the {id: entity} structures
* into arrays of entities (or flat objects if required).
*
* example: [{id: 1, name: 'hi', children: [{id: 111, name: 'ih'}]}] */
return (function transform(schema, map, accumulator) {
// for every id:entity pair in the current level of mapping, if the schema defines any dependent
// entities recurse and transform them, then push the current object into the accumulator and return
return Object.keys(map).reduce(function (acc, k) {
Object.keys(schema)
.filter(function (c) { return typeof schema[c] === 'object'; }) // just structure now
.forEach(function (c) {
// we have to init & pass the accumulator into the *next* recursion since the single
// option is defined on the child rather than the parent
var accumulator = options[c] && options[c].single ? {} : [];
map[k][c] = transform(schema[c], map[k][c], accumulator);
if (options[c] && options[c].sort) {
var sort = options[c].sort;
map[k][c].sort(function (a, b) {
if (a[sort] > b[sort]) { return 1; }
else if (a[sort] < b[sort]) { return -1; }
return 0;
});
}
});
if (Array.isArray(accumulator)) { acc.push(map[k]); }
else { acc = map[k]; }
return acc;
}, []);
})(schemata, mapping, []);
};
'use strict';
var assert = require('chai').assert;
var collapse = require('../../lib/collapse');
describe('lib/collapse', function () {
it('should return empty if given empty', function () {
assert.deepEqual([], collapse('parent', 'id', []));
});
it('should collapse simple tree structures', function () {
var data = [
{parent__id: 1, parent__val: 'p1', children__id: 11, children__val: 'c1'},
{parent__id: 1, parent__val: 'p1', children__id: 12, children__val: 'c2'}
];
data = collapse('parent', 'id', data);
assert.deepEqual(data, [{id: 1, val: 'p1', children: [{id: 11, val: 'c1'}, {id: 12, val: 'c2'}]}]);
});
it('should sort children if an option is specified', function () {
var source = [
{parent__id: 1, parent__val: 'p1', children__id: 11, children__val: 'c2'},
{parent__id: 1, parent__val: 'p1', children__id: 12, children__val: 'c1'}
];
var data = collapse('parent', 'id', source);
assert.deepEqual(data, [{id: 1, val: 'p1', children: [{id: 11, val: 'c2'}, {id: 12, val: 'c1'}]}]);
data = collapse('parent', 'id', {children: {sort: 'val'}}, source);
assert.deepEqual(data, [{id: 1, val: 'p1', children: [{id: 12, val: 'c1'}, {id: 11, val: 'c2'}]}]);
});
it('should collapse multiple children with the same parent', function () {
var data = [
{parent__id: 1, parent__val: 'p1', children1__id: 11, children1__val: 'c1', children2__id: 21, children2__val: 'd1'},
{parent__id: 1, parent__val: 'p1', children1__id: 12, children1__val: 'c2', children2__id: 22, children2__val: 'd2'},
{parent__id: 1, parent__val: 'p1', children1__id: 12, children1__val: 'c2', children2__id: 23, children2__val: 'd3'}
];
data = collapse('parent', 'id', data);
assert.deepEqual(data, [{
id: 1,
val: 'p1',
children1: [{id: 11, val: 'c1'}, {id: 12, val: 'c2'}],
children2: [{id: 21, val: 'd1'}, {id: 22, val: 'd2'}, {id: 23, val: 'd3'}]
}]);
});
it('should collapse children into other children', function () {
var data = [
{parent__id: 1, parent__val: 'p1', children1__id: 11, children1__val: 'c1', children1__children2__id: 21, children1__children2__val: 'd1'},
{parent__id: 1, parent__val: 'p1', children1__id: 12, children1__val: 'c2', children1__children2__id: 22, children1__children2__val: 'd2'},
{parent__id: 1, parent__val: 'p1', children1__id: 12, children1__val: 'c2', children1__children2__id: 23, children1__children2__val: 'd3'}
];
data = collapse('parent', 'id', data);
assert.deepEqual(data, [{
id: 1,
val: 'p1',
children1: [{
id: 11,
val: 'c1',
children2: [{id: 21, val: 'd1'}]
}, {
id: 12,
val: 'c2',
children2: [{id: 22, val: 'd2'}, {id: 23, val: 'd3'}]
}]
}]);
});
it('should create empty child arrays if given null children from outer joins', function () {
var data = [
{parent__id: 1, parent__val: 'p1', children__id: null, children__val: null},
{parent__id: 2, parent__val: 'p2', children__id: 11, children__val: 'c1'}
];
data = collapse('parent', 'id', data);
assert.deepEqual(data, [{id: 1, val: 'p1', children: []}, {id: 2, val: 'p2', children: [{id: 11, val: 'c1'}]}]);
});
it('should collapse 1:1 relations with options.single', function () {
var data = [
{parent__id: 1, parent__val: 'p1', child__id: 11, child__val: 'c1'},
];
data = collapse('parent', 'id', {child: {single: true}}, data);
assert.deepEqual(data, [{id: 1, val: 'p1', child: {id: 11, val: 'c1'}}]);
});
it('should collapse tree structures on a different child pk with options.pk', function () {
// this dataset is 'bad' in that you're not usually going to see 100% duplicate rows unless you've really screwed up
// but it's more legible than reproducing the 'multiple children' data and tests the deduplication just the same
var data = [
{parent__id: 1, parent__val: 'p1', children__child_id: 11, children__val: 'c1'},
{parent__id: 1, parent__val: 'p1', children__child_id: 12, children__val: 'c2'},
{parent__id: 1, parent__val: 'p1', children__child_id: 12, children__val: 'c2'}
];
data = collapse('parent', 'id', {children: {pk: 'child_id'}}, data);
assert.deepEqual(data, [{id: 1, val: 'p1', children: [{child_id: 11, val: 'c1'}, {child_id: 12, val: 'c2'}]}]);
});
it('should apply new parents only in the correct scope', function() {
var data = [
{
'this__id': 1,
'account__id': 1,
'this__name': 'Eduardo Luiz',
'contact__email': 'email',
'contact__phone': 'phone',
'this__notes': null,
'this__archived': false,
'address__zipCode': 'zip',
'address__street': 'street',
'address__number': 'number',
'address__complement': null,
'address__neighborhood': null,
'address__city': 'Sao Paulo',
'address__state': 'Sao Paulo',
'address__coords__latitude': '1',
'address__coords__longitude': '2',
'labels__id': '297726d0-301d-4de6-b9a4-e439b81f44ba',
'labels__name': 'Contrato',
'labels__color': 'yellow',
'labels__type': 1
}, {
'this__id': 1,
'account__id': 1,
'this__name': 'Eduardo Luiz',
'contact__email': 'email',
'contact__phone': 'phone',
'this__notes': null,
'this__archived': false,
'address__zipCode': 'zip',
'address__street': 'street',
'address__number': 'number',
'address__complement': null,
'address__neighborhood': null,
'address__city': 'Sao Paulo',
'address__state': 'Sao Paulo',
'address__coords__latitude': '1',
'address__coords__longitude': '2',
'labels__id': '1db6e07f-91e2-42fb-b65c-9a364b6bad4c',
'labels__name': 'Particular',
'labels__color': 'purple',
'labels__type': 1
}
];
var options = {
account: {
single: true
},
address: {
single: true
},
contact: {
single: true
},
coords: {
single: true
}
};
var models = collapse('this', 'id', options, data);
assert.deepEqual(models, [{
'id': 1,
'account': {
'id': 1
},
'name': 'Eduardo Luiz',
'contact': {
'email': 'email',
'phone': 'phone'
},
'notes': null,
'archived': false,
'address': {
'zipCode': 'zip',
'street': 'street',
'number': 'number',
'complement': null,
'neighborhood': null,
'city': 'Sao Paulo',
'state': 'Sao Paulo',
'coords': {
'latitude': '1',
'longitude': '2'
}
},
'labels': [
{
'id': '297726d0-301d-4de6-b9a4-e439b81f44ba',
'name': 'Contrato',
'color': 'yellow',
'type': 1
}, {
'id': '1db6e07f-91e2-42fb-b65c-9a364b6bad4c',
'name': 'Particular',
'color': 'purple',
'type': 1
}
]
}]);
});
});
@lfreneda
Copy link

lfreneda commented Feb 9, 2017

@dmfay need your help D:

The following code is not working propertly:

it('should returns as expected', function() {
        var data, model, options;
        data = [
          {
            "this__id": 1,
            "account__id": 1,
            "this__name": "Eduardo Luiz",
            "contact__email": "[email protected]",
            "contact__phone": "11965874523",
            "this__notes": null,
            "this__archived": false,
            "address__zipCode": "05422010",
            "address__street": "Rua dos Pinheiros",
            "address__number": "383",
            "address__complement": null,
            "address__neighborhood": null,
            "address__city": "Sao Paulo",
            "address__state": "Sao Paulo",
            "address__coords__latitude": "1",
            "address__coords__longitude": "2",
            "labels__id": "297726d0-301d-4de6-b9a4-e439b81f44ba",
            "labels__name": "Contrato",
            "labels__color": "yellow",
            "labels__type": 1
          }, {
            "this__id": 1,
            "account__id": 1,
            "this__name": "Eduardo Luiz",
            "contact__email": "[email protected]",
            "contact__phone": "11965874523",
            "this__notes": null,
            "this__archived": false,
            "address__zipCode": "05422010",
            "address__street": "Rua dos Pinheiros",
            "address__number": "383",
            "address__complement": null,
            "address__neighborhood": null,
            "address__city": "Sao Paulo",
            "address__state": "Sao Paulo",
            "address__coords__latitude": "1",
            "address__coords__longitude": "2",
            "labels__id": "1db6e07f-91e2-42fb-b65c-9a364b6bad4c",
            "labels__name": "Particular",
            "labels__color": "purple",
            "labels__type": 1
          }
        ];
        options = {
          account: {
            single: true
          },
          address: {
            single: true
          },
          contact: {
            single: true
          }
        };

        model = collapse('this', 'id', options, data);

        expect(model).to.deep.equal({
          "id": 1,
          "account": {
            "id": 1
          },
          "name": "Eduardo Luiz mapped",
          "contact": {
            "email": "[email protected]",
            "phone": "11965874523"
          },
          "notes": null,
          "archived": false,
          "address": {
            "zipCode": "05422010",
            "street": "Rua dos Pinheiros",
            "number": "383",
            "complement": null,
            "neighborhood": null,
            "city": "Sao Paulo",
            "state": "Sao Paulo",
            "coords": {
              "latitude": "1",
              "longitude": "2"
            }
          },
          "labels": [
            {
              "id": "297726d0-301d-4de6-b9a4-e439b81f44ba",
              "name": "Contrato",
              "color": "yellow",
              "type": 1
            }, {
              "id": "1db6e07f-91e2-42fb-b65c-9a364b6bad4c",
              "name": "Particular",
              "color": "purple",
              "type": 1
            }
          ]
        });

What I found was that when mapping (build function) labels are being considered children of address :(

Might be a problem here:

            Object.keys(schema).forEach(function (c) {
                if (typeof schema[c] === 'string') {	// c is a field
                    obj[id][c] = row[schema[c]];
                } else {								// c is a relation
                    if (!obj[id].hasOwnProperty(c)) {
                        obj[id][c] = {};				// current object does not have relation defined, initialize it
                    }
                    if (name !== parent) { parents.push(name); }	// since parent is just the root schema don't add it
                    build(obj[id][c], schema[c], parents, c);
                }
            });

Can you help me out? any tips?

@dmfay
Copy link
Author

dmfay commented Mar 10, 2017

Evidently I don't get email notifications on gist comments! Did you figure it out? If not I can take a closer look later on.

@dmfay
Copy link
Author

dmfay commented Mar 10, 2017

@lfreneda I found the problem: build() wasn't scoping parents properly, so once address was added it never fell back off. I've updated the gist and added your testcase. Doublecheck it -- your version was comparing the result array to an object model, which wouldn't work out.

As for publishing this, I'm hoping to bring it into Massive at some point....

@lfreneda
Copy link

lfreneda commented Sep 6, 2017

@dmfay you are GOD! :) Thank you so much for your help 👍

I would like to share with you some modifications that I've made. I just forked it and added some conventions.

If you have any free time, please check this out: https://gist.github.com/lfreneda/09d10429ad4f7d421955c1bffed097ca
Your feedback will be really appreciated.

The idea was to collapse without providing an options object;
If your recordset column is named ending with [] it will be treated as array otherwise as an object.

For example:

given the following rows:

var rows = [
   {'this.id': 1, 'this.val': 'p1', 'this.children[].id': 11, 'this.children[].val': 'c1'},
   {'this.id': 1, 'this.val': 'p1', 'this.children[].id': 12, 'this.children[].val': 'c2'}
];

will be collapsed to:

[{id: 1, val: 'p1', children: [{id: 11, val: 'c1'}, {id: 12, val: 'c2'}]}];

Let me know your thoughts on that.

@dmfay
Copy link
Author

dmfay commented Sep 6, 2017

That's a neat idea! I've actually integrated a version of this into Massive 4.0.0 just last night 😄 You do have to provide a decomposition schema, but you can use it with any resultset without having to name your output columns.

@lfreneda
Copy link

lfreneda commented Sep 6, 2017

@dmfay That's great!
I'm currently using massive 2.*, but I'm planning to upgrade to version 4 and I will definitely use it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment