Question about $group in MongoDB

3

I need to use MongoDB's $group grouping operator, but every explanation I find is too confusing.

How does this work and what is the benefit of using this operator?

    
asked by anonymous 19.02.2015 / 17:04

1 answer

5

$group is one of the stages of aggregate. The idea of aggregate is to set up a pipeline of operations over a collection that produces a particular output. It is an alternative to map-reduce offered by MongoDB. In MongoDB's aggregation documentation, the usage of aggregate is described in pseudo-code as:

db.collection.aggregate([ { <stage> }, ... ])

That is, db.collection.aggregate receives an array of stages, the steps of the pipeline (such as $group). Several stages are described in the documentation mentioned above. The simplest is probably $match, which simply filters the documents passing through it on their way to the next stage of the pipeline. For example:

db.collection.aggregate([
  { $match: { nome: 'Wallace' } },
  { $match: { idade: 10 } }
])

This first filters all documents by the nome field and then by the idade field. Note that this may look redundant and slower than running only { $match: { nome: 'Wallace', idade: 10 } }, but MongoDB optimizes the pipeline you define, and one of those optimizations combines consecutive $match stages into a single one.
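To see why the two chained $match stages select the same documents as the single combined one, here is a minimal sketch in plain JavaScript. It does not talk to MongoDB: the docs array and the match helper are hypothetical stand-ins for a collection and for an equality-only $match.

```javascript
// Hypothetical sample data standing in for a collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Pedro', idade: 10 },
];

// Simplified stand-in for $match: keeps the documents whose fields
// equal every value in `query` (equality conditions only).
function match(input, query) {
  return input.filter((doc) =>
    Object.keys(query).every((key) => doc[key] === query[key])
  );
}

// Two stages in sequence...
const chained = match(match(docs, { nome: 'Wallace' }), { idade: 10 });
// ...versus one stage with both conditions.
const combined = match(docs, { nome: 'Wallace', idade: 10 });

console.log(chained);
console.log(combined);
```

Both calls keep only the 10-year-old Wallace, which is why the server is free to merge the two stages.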

As for $group, the idea is to pass an _id field, which defines how you want to group the results of your pipeline, plus other fields that operate over all the documents in each group and produce some final result. For example:

db.collection.aggregate([
  { $match: { nome: 'Wallace' } },
  { $group: { _id: '$idade', total: { $sum: 1 } } }
])

This first filters all documents, keeping those with doc.nome == 'Wallace', and then groups them by idade. Thus, each group of documents with the same age is represented by a single object, with the format:

{
  _id: <some-age>,
  total: <0 + 1 for each grouped document (so: the total number of Wallaces with that age)>
}
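As a rough illustration of what that pipeline computes, here is a sketch in plain JavaScript that reproduces the filtering and counting by hand. The sample docs are hypothetical, and the code only imitates the semantics; it is not how MongoDB implements the stages.

```javascript
// Hypothetical sample data standing in for a collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Pedro', idade: 10 },
];

// Stage 1, like { $match: { nome: 'Wallace' } }.
const wallaces = docs.filter((doc) => doc.nome === 'Wallace');

// Stage 2, like { $group: { _id: '$idade', total: { $sum: 1 } } }:
// one output object per distinct idade, adding 1 for each document.
const groups = new Map();
for (const doc of wallaces) {
  const group = groups.get(doc.idade) || { _id: doc.idade, total: 0 };
  group.total += 1;
  groups.set(doc.idade, group);
}

const result = [...groups.values()];
console.log(result);
```

With this data there are two groups: age 10 with a total of 2 and age 12 with a total of 1. The Map here keeps insertion order, whereas $group itself does not promise any particular order for its output.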

The $sum above is an operator of the $group stage. It takes a parameter, which can be computed for each document, and produces the sum of the results over all documents in the group. If we write:

db.collection.aggregate([
  { $group: { _id: '$nome', somaDasIdades: { $sum: '$idade' } } }
])

We would receive the sum of all ages for each group of documents with the same name.
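The difference from the counting case is only what gets added for each document: the value of idade instead of the constant 1. A small plain-JavaScript sketch of that idea, using hypothetical sample data and no MongoDB calls:

```javascript
// Hypothetical sample data standing in for a collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Pedro', idade: 30 },
];

// Like { $group: { _id: '$nome', somaDasIdades: { $sum: '$idade' } } }:
// accumulate idade under each distinct nome.
const somaPorNome = {};
for (const doc of docs) {
  somaPorNome[doc.nome] = (somaPorNome[doc.nome] || 0) + doc.idade;
}

// Reshape into the { _id, somaDasIdades } objects that $group emits.
const result = Object.entries(somaPorNome).map(([nome, soma]) => ({
  _id: nome,
  somaDasIdades: soma,
}));
console.log(result);
```

Here the Wallace group sums to 22 and the Pedro group to 30.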

The complete list of operators for producing results in the $group stage is in the MongoDB documentation for $group.

The value next to $sum or _id is any valid expression, so it can be:

  • A literal value, such as 'Wallace'

  • A path to a field in the documents passing through, such as '$documento.campo'

  • An object that applies multiple expressions to specific fields

An example using an object as the _id would be:

db.collection.aggregate([
  {
    $group: {
      _id: {
        nome: '$nome',
        idade: '$idade'
      }
    }
  }
])

This will create groups (with no fields other than _id) of all documents with the same nome and the same idade.
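A sketch of the same grouping in plain JavaScript may make the compound key more concrete. The sample docs are hypothetical, and serializing the key with JSON.stringify is just a convenience of this illustration, not something MongoDB does.

```javascript
// Hypothetical sample data standing in for a collection.
const docs = [
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 10 },
  { nome: 'Wallace', idade: 12 },
  { nome: 'Pedro', idade: 10 },
];

// Like { $group: { _id: { nome: '$nome', idade: '$idade' } } }.
// A JavaScript Map compares object keys by identity, so the pair is
// serialized to a string to act as the grouping key.
const groups = new Map();
for (const doc of docs) {
  const id = { nome: doc.nome, idade: doc.idade };
  const key = JSON.stringify(id);
  if (!groups.has(key)) {
    groups.set(key, { _id: id });
  }
}

const result = [...groups.values()];
console.log(result);
```

The two identical Wallace documents collapse into one group, so four documents yield three distinct _id values.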

There is also a function in the REPL, db.collection.group, but it is just a helper for running an aggregate with only a $group stage.

I think this covers the basic notion, as far as it can be conveyed quickly. I strongly suggest you read the aggregation documentation mentioned above.

As for why you would use aggregate, I think it depends a lot on what you're going to do. Like map-reduce, this is the type of operation to use when the amount of data you are operating on is large enough that processing it in your application code is not worthwhile. In these cases, using something like aggregate will be more (much more) efficient than pulling a large amount of data into your application and processing it there.

    
02.03.2015 / 17:52