Most efficient method to groupby on an array of objects

Source: Most efficient method to groupby on an array of objects

What is the most efficient way to groupby objects in an array?

For example, given this array of objects:

[ 
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 2", Value: "10" },
    { Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" },
    { Phase: "Phase 1", Step: "Step 2", Task: "Task 2", Value: "20" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 1", Value: "25" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 2", Value: "30" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 1", Value: "35" },
    { Phase: "Phase 2", Step: "Step 2", Task: "Task 2", Value: "40" }
]

I'm displaying this information in a table. I'd like to group it in different ways, but I want to sum the values.

I'm using Underscore.js for its groupby function, which is helpful, but doesn't do the whole trick, because I don't want the groups “split up” but “merged”, more like the SQL group by method.

What I'm looking for would be able to total specific values (if requested).

So if I did groupby Phase, I'd want to receive:

[
    { Phase: "Phase 1", Value: 50 },
    { Phase: "Phase 2", Value: 130 }
]

And if I did groupby Phase / Step, I'd receive:

[
    { Phase: "Phase 1", Step: "Step 1", Value: 15 },
    { Phase: "Phase 1", Step: "Step 2", Value: 35 },
    { Phase: "Phase 2", Step: "Step 1", Value: 55 },
    { Phase: "Phase 2", Step: "Step 2", Value: 75 }
]

Is there a helpful script for this, or should I stick to using Underscore.js, and then loop through the resulting object to do the totals myself?


#1

Reference: https://stackoom.com/question/ycCF/对对象数组进行分组的最有效方法


#2

This is probably more easily done with linq.js, which is intended to be a true implementation of LINQ in JavaScript:

var linq = Enumerable.From(data);
var result =
    linq.GroupBy(function(x){ return x.Phase; })
        .Select(function(x){
          return {
            Phase: x.Key(),
            Value: x.Sum(function(y){ return y.Value|0; })
          };
        }).ToArray();

Result:

[
    { Phase: "Phase 1", Value: 50 },
    { Phase: "Phase 2", Value: 130 }
]

Or, more simply, using the string-based selectors:

linq.GroupBy("$.Phase", "",
    "k,e => { Phase:k, Value:e.Sum('$.Value|0') }").ToArray();
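
For comparison, here is a minimal sketch of the same grouping without any library, using a plain object as the lookup table. The helper name groupAndSum and its parameters are made up for this illustration and are not part of linq.js; it assumes the summed field holds numeric strings, as in the question's data.

```javascript
// Hypothetical helper (not part of linq.js): group rows by one or more
// properties and total a numeric field.
function groupAndSum(rows, keyNames, valueName) {
    var index = {};   // composite key string -> result row
    var result = [];  // keeps first-seen group order

    rows.forEach(function (row) {
        // JSON.stringify of the key values gives an unambiguous composite key
        var id = JSON.stringify(keyNames.map(function (name) { return row[name]; }));
        if (!Object.prototype.hasOwnProperty.call(index, id)) {
            var group = {};
            keyNames.forEach(function (name) { group[name] = row[name]; });
            group[valueName] = 0;
            index[id] = group;
            result.push(group);
        }
        index[id][valueName] += Number(row[valueName]);
    });

    return result;
}

var sample = [
    { Phase: "Phase 1", Step: "Step 1", Task: "Task 1", Value: "5" },
    { Phase: "Phase 1", Step: "Step 2", Task: "Task 1", Value: "15" },
    { Phase: "Phase 2", Step: "Step 1", Task: "Task 1", Value: "25" }
];

var byPhase = groupAndSum(sample, ["Phase"], "Value");
// byPhase is [{ Phase: "Phase 1", Value: 20 }, { Phase: "Phase 2", Value: 25 }]
```

Passing ["Phase", "Step"] as the second argument groups on both properties at once.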

#3

Although the linq answer is interesting, it's also quite heavyweight. My approach is somewhat different:

var DataGrouper = (function() {
    var has = function(obj, target) {
        return _.any(obj, function(value) {
            return _.isEqual(value, target);
        });
    };

    var keys = function(data, names) {
        return _.reduce(data, function(memo, item) {
            var key = _.pick(item, names);
            if (!has(memo, key)) {
                memo.push(key);
            }
            return memo;
        }, []);
    };

    var group = function(data, names) {
        var stems = keys(data, names);
        return _.map(stems, function(stem) {
            return {
                key: stem,
                vals:_.map(_.where(data, stem), function(item) {
                    return _.omit(item, names);
                })
            };
        });
    };

    group.register = function(name, converter) {
        return group[name] = function(data, names) {
            return _.map(group(data, names), converter);
        };
    };

    return group;
}());

DataGrouper.register("sum", function(item) {
    return _.extend({}, item.key, {Value: _.reduce(item.vals, function(memo, node) {
        return memo + Number(node.Value);
    }, 0)});
});

You can see it in action on JSBin.

I didn't see anything in Underscore that does what the has helper does, although I might be missing it. It's much the same as _.contains, but uses _.isEqual rather than === for comparisons. Other than that, the rest of this is problem-specific, although it tries to be generic.
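
To see why a deep comparison is needed here, note that two separately created objects with the same contents are never === to each other. The sketch below uses a hand-rolled, shallow comparison named sameProps as a stand-in for _.isEqual; that helper is an assumption for illustration and only handles flat objects with primitive values.

```javascript
// Stand-in for _.isEqual, for flat objects with primitive values only
// (hypothetical helper for illustration).
function sameProps(a, b) {
    var aKeys = Object.keys(a);
    var bKeys = Object.keys(b);
    return aKeys.length === bKeys.length && aKeys.every(function (k) {
        return Object.prototype.hasOwnProperty.call(b, k) && a[k] === b[k];
    });
}

var seen = [{ Phase: "Phase 1", Step: "Step 1" }];
var candidate = { Phase: "Phase 1", Step: "Step 1" };

// Identity comparison: candidate is a different object, so it is not found
var foundByIdentity = seen.indexOf(candidate) !== -1;

// Structural comparison: same property values, so it is found
var foundByValue = seen.some(function (key) {
    return sameProps(key, candidate);
});
```

Without the structural check, every row would look like a new key and no grouping would happen.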

Now DataGrouper.sum(data, ["Phase"]) returns

[
    {Phase: "Phase 1", Value: 50},
    {Phase: "Phase 2", Value: 130}
]

And DataGrouper.sum(data, ["Phase", "Step"]) returns

[
    {Phase: "Phase 1", Step: "Step 1", Value: 15},
    {Phase: "Phase 1", Step: "Step 2", Value: 35},
    {Phase: "Phase 2", Step: "Step 1", Value: 55},
    {Phase: "Phase 2", Step: "Step 2", Value: 75}
]

But sum is only one potential function here. You can register others as you like:

DataGrouper.register("max", function(item) {
    return _.extend({}, item.key, {Max: _.reduce(item.vals, function(memo, node) {
        return Math.max(memo, Number(node.Value));
    }, Number.NEGATIVE_INFINITY)});
});

and now DataGrouper.max(data, ["Phase", "Step"]) will return

[
    {Phase: "Phase 1", Step: "Step 1", Max: 10},
    {Phase: "Phase 1", Step: "Step 2", Max: 20},
    {Phase: "Phase 2", Step: "Step 1", Max: 30},
    {Phase: "Phase 2", Step: "Step 2", Max: 40}
]

or if you registered this:

DataGrouper.register("tasks", function(item) {
    return _.extend({}, item.key, {Tasks: _.map(item.vals, function(item) {
      return item.Task + " (" + item.Value + ")";
    }).join(", ")});
});

then calling DataGrouper.tasks(data, ["Phase", "Step"]) will get you

[
    {Phase: "Phase 1", Step: "Step 1", Tasks: "Task 1 (5), Task 2 (10)"},
    {Phase: "Phase 1", Step: "Step 2", Tasks: "Task 1 (15), Task 2 (20)"},
    {Phase: "Phase 2", Step: "Step 1", Tasks: "Task 1 (25), Task 2 (30)"},
    {Phase: "Phase 2", Step: "Step 2", Tasks: "Task 1 (35), Task 2 (40)"}
]

DataGrouper itself is a function. You can call it with your data and a list of the properties you want to group by. It returns an array whose elements are objects with two properties: key is the collection of grouped properties, and vals is an array of objects containing the remaining properties not in the key. For example, DataGrouper(data, ["Phase", "Step"]) will yield:

[
    {
        "key": {Phase: "Phase 1", Step: "Step 1"},
        "vals": [
            {Task: "Task 1", Value: "5"},
            {Task: "Task 2", Value: "10"}
        ]
    },
    {
        "key": {Phase: "Phase 1", Step: "Step 2"},
        "vals": [
            {Task: "Task 1", Value: "15"}, 
            {Task: "Task 2", Value: "20"}
        ]
    },
    {
        "key": {Phase: "Phase 2", Step: "Step 1"},
        "vals": [
            {Task: "Task 1", Value: "25"},
            {Task: "Task 2", Value: "30"}
        ]
    },
    {
        "key": {Phase: "Phase 2", Step: "Step 2"},
        "vals": [
            {Task: "Task 1", Value: "35"}, 
            {Task: "Task 2", Value: "40"}
        ]
    }
]

DataGrouper.register accepts a name and a function, and creates a new function which accepts the initial data and the properties to group by. This new function then takes the grouped output in the format above and runs your function against each group in turn, returning a new array. The function that's generated is stored as a property of DataGrouper under the name you supply, and is also returned in case you just want a local reference.
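
As a rough illustration of what a converter receives, the sketch below applies a hypothetical "avg" converter to a single group object of the shape shown above. It uses Object.assign in place of _.extend so that it runs without Underscore; the names avgConverter and Avg are invented for this example.

```javascript
// One group in the {key, vals} shape produced by DataGrouper(data, ["Phase", "Step"])
var item = {
    key: { Phase: "Phase 1", Step: "Step 1" },
    vals: [
        { Task: "Task 1", Value: "5" },
        { Task: "Task 2", Value: "10" }
    ]
};

// Hypothetical converter: average of the Value fields in a group
function avgConverter(item) {
    var total = item.vals.reduce(function (memo, node) {
        return memo + Number(node.Value);
    }, 0);
    // Guard against an empty group to avoid dividing by zero
    var avg = item.vals.length ? total / item.vals.length : 0;
    return Object.assign({}, item.key, { Avg: avg });
}

var averaged = avgConverter(item);
// averaged is { Phase: "Phase 1", Step: "Step 1", Avg: 7.5 }
```

With the answer's code loaded, the same function could be passed as the second argument to DataGrouper.register("avg", ...).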

Well, that's a lot of explanation. The code is reasonably straightforward, I hope!


#4

I'd like to suggest my approach. First, separate grouping from aggregating. Let's declare a "group by" function on the Array prototype. It takes another function that produces a "hash" string for each array element to group by.

Array.prototype.groupBy = function(hash){
  var _hash = hash ? hash : function(o){return o;};

  var _map = {};
  var put = function(map, key, value){
    if (!map[_hash(key)]) {
        map[_hash(key)] = {};
        map[_hash(key)].group = [];
        map[_hash(key)].key = key;

    }
    map[_hash(key)].group.push(value); 
  }

  // forEach, not map: the callback is used only for its side effect
  this.forEach(function(obj){
    put(_map, obj, obj);
  });

  return Object.keys(_map).map(function(key){
    return {key: _map[key].key, group: _map[key].group};
  });
}

When grouping is done you can aggregate the data however you need. In your case:

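data.groupBy(function(o){return JSON.stringify({a: o.Phase, b: o.Step});})
    /* aggregating */
    .map(function(el){ 
         var sum = el.group.reduce(
           function(l,c){
             // always pass the radix to parseInt
             return l + parseInt(c.Value, 10);
           },
           0
         );
         // el.key is the first original item of the group, so build a new
         // object instead of overwriting its Value
         return {Phase: el.key.Phase, Step: el.key.Step, Value: sum};
    });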

In general it works. I have tested this code in the Chrome console. Feel free to improve it and point out mistakes ;)
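
If you would rather not extend Array.prototype, the same two-step idea (group first, aggregate second) can be sketched as a standalone function built on Map. The names groupBy and keyOf are illustrative only, and the key is assumed to be a string such as the output of JSON.stringify.

```javascript
// Standalone grouping: returns an array of { key, group } pairs,
// where key is whatever keyOf returns for the items in that group.
function groupBy(items, keyOf) {
    var groups = new Map(); // Map preserves insertion order
    items.forEach(function (item) {
        var key = keyOf(item);
        if (!groups.has(key)) {
            groups.set(key, []);
        }
        groups.get(key).push(item);
    });
    return Array.from(groups, function (entry) {
        return { key: entry[0], group: entry[1] };
    });
}

var rows = [
    { Phase: "Phase 1", Step: "Step 1", Value: "5" },
    { Phase: "Phase 1", Step: "Step 1", Value: "10" },
    { Phase: "Phase 1", Step: "Step 2", Value: "15" }
];

// Aggregation is a separate step applied to each group
var totals = groupBy(rows, function (o) {
    return JSON.stringify([o.Phase, o.Step]);
}).map(function (el) {
    var first = el.group[0];
    return {
        Phase: first.Phase,
        Step: first.Step,
        Value: el.group.reduce(function (sum, o) {
            return sum + parseInt(o.Value, 10);
        }, 0)
    };
});
// totals is [{ Phase: "Phase 1", Step: "Step 1", Value: 15 },
//            { Phase: "Phase 1", Step: "Step 2", Value: 15 }]
```

Because each result row is a new object, the input rows are left untouched.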


#5

I borrowed this method from underscore.js fiddler

window.helpers=(function (){
    var lookupIterator = function(value) {
        if (value == null){
            return function(value) {
                return value;
            };
        }
        if (typeof value === 'function'){
                return value;
        }
        return function(obj) {
            return obj[value];
        };
    },
    each = function(obj, iterator, context) {
        var breaker = {};
        if (obj == null) return obj;
        if (Array.prototype.forEach && obj.forEach === Array.prototype.forEach) {
            obj.forEach(iterator, context);
        } else if (obj.length === +obj.length) {
            for (var i = 0, length = obj.length; i < length; i++) {
                if (iterator.call(context, obj[i], i, obj) === breaker) return;
            }
        } else {
            var keys = []
            for (var key in obj) if (Object.prototype.hasOwnProperty.call(obj, key)) keys.push(key)
            for (var i = 0, length = keys.length; i < length; i++) {
                if (iterator.call(context, obj[keys[i]], keys[i], obj) === breaker) return;
            }
        }
        return obj;
    },
    // An internal function used for aggregate "group by" operations.
    group = function(behavior) {
        return function(obj, iterator, context) {
            var result = {};
            iterator = lookupIterator(iterator);
            each(obj, function(value, index) {
                var key = iterator.call(context, value, index, obj);
                behavior(result, key, value);
            });
            return result;
        };
    };

    return {
      groupBy : group(function(result, key, value) {
        // append to an existing group, or start a new one
        if (Object.prototype.hasOwnProperty.call(result, key)) {
            result[key].push(value);
        } else {
            result[key] = [value];
        }
        })
    };
})();

var arr=[{a:1,b:2},{a:1,b:3},{a:1,b:1},{a:1,b:2},{a:1,b:3}];
 console.dir(helpers.groupBy(arr,"b"));
 console.dir(helpers.groupBy(arr,function (el){
   return el.b>2;
 }));

#6

_.groupBy([{tipo: 'A' },{tipo: 'A'}, {tipo: 'B'}], 'tipo');
>> Object {A: Array[2], B: Array[1]}

From: http://underscorejs.org/#groupBy
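
The object that _.groupBy returns still has to be reduced to totals, as the question anticipates. A minimal sketch of that second step follows, using a hand-written object in the same shape so that it runs without Underscore (the valor field is invented for this example):

```javascript
// Same shape as a _.groupBy result: group name -> array of matching items
var grouped = {
    A: [{ tipo: "A", valor: 1 }, { tipo: "A", valor: 2 }],
    B: [{ tipo: "B", valor: 5 }]
};

// Collapse each group into a single row with a total
var summed = Object.keys(grouped).map(function (name) {
    return {
        tipo: name,
        valor: grouped[name].reduce(function (sum, item) {
            return sum + item.valor;
        }, 0)
    };
});
// summed is [{ tipo: "A", valor: 3 }, { tipo: "B", valor: 5 }]
```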
