Mongo Aggregation


You can sort and then limit the query.

$top_five_other =
iterator_to_array($db->find(array('category'=>'other'))->sort(array('published_date'=>-1))->limit(5));
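
To make the order of operations concrete, here is a minimal in-memory sketch of what filter, then sort descending, then limit returns. It is plain JavaScript rather than a driver call, and the sample documents and the helper name topFive are assumptions made up for illustration.

```javascript
// Emulates find({category: "other"}).sort({published_date: -1}).limit(5)
// on a plain array. topFive is a hypothetical helper, not a driver method.
function topFive(docs) {
  return docs
    .filter((d) => d.category === "other")               // the find() criteria
    .sort((x, y) => y.published_date - x.published_date)  // -1 means newest first
    .slice(0, 5);                                         // limit(5)
}

// Hypothetical sample documents; only the field names come from the query.
const posts = [
  { title: "p1", category: "other", published_date: new Date("2014-01-01") },
  { title: "p2", category: "news",  published_date: new Date("2014-01-09") },
  { title: "p3", category: "other", published_date: new Date("2014-01-05") },
  { title: "p4", category: "other", published_date: new Date("2014-01-03") },
  { title: "p5", category: "other", published_date: new Date("2014-01-07") },
  { title: "p6", category: "other", published_date: new Date("2014-01-02") },
  { title: "p7", category: "other", published_date: new Date("2014-01-06") },
];

const topFiveOther = topFive(posts);
console.log(topFiveOther.map((d) => d.title));
```

The sort has to come before the limit; limiting first would return five arbitrary documents and only then order them.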


Mongo Age Group Aggregation

You were in the right place, but as $cond requires three arguments (the evaluation, the true result and the false result) you need to "nest" these operations, with each subsequent $cond as the false result. So your syntax here is a little off.

You can also do this directly in the $group to avoid passing through the whole collection with a separate $project. Based on the document structure you give as an example, you would form it like this:

$pipeline = array(
  array(
    '$group' => array(
      '_id' => array(
        '$cond' =>  array(
          array('$lt' => array( '$age', 18 )),
          'age_0_17',
          array(
            '$cond' => array(
              array( '$lte' => array( '$age', 25 )),
              'age_18_25',
              array(
                '$cond' => array(
                  array( '$lte' => array ( '$age', 32 )),
                  'age_26_32',
                  'age_Above_32'
                )
              )
            )
          )
        )
      ),
      'count' => array( '$sum' => 1 )
    )
  )
);

Also note that logical comparison operators such as $lt work differently in these stages from their query counterparts. They take an array of arguments, being the values to test and compare, and return true/false based on that comparison, which is what the first argument to $cond requires.

It is always handy to have a json_encode somewhere while you are debugging the form of pipeline queries, as JSON is the general form of examples:

echo json_encode( $pipeline, JSON_PRETTY_PRINT ) . "\n";

Which yields the common JSON structure:

[
    { "$group": {
        "_id": { 
            "$cond":[
                { "$lt":["$age",18] },
                "age_0_17",
                { "$cond":[
                    { "$lte":["$age",25] },
                    "age_18_25",
                    { "$cond":[
                        { "$lte":["$age",32] },
                        "age_26_32",
                        "age_Above_32"
                    ]}
                ]}
            ]
        },
        "count":{ "$sum": 1 }
    }}
]
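
To see how the nested $cond resolves for a given document, the following is a small plain-JavaScript sketch of an evaluator for just the three expression forms used here: field paths, $lt/$lte, and $cond. The function names evaluate and groupCount and the sample ages are assumptions for illustration; this is not how the server is implemented, and it ignores details such as missing fields.

```javascript
// Evaluates a tiny subset of aggregation expressions against one document.
// Unlike a real $cond, both branches are evaluated eagerly here, which is
// harmless because the expressions have no side effects.
function evaluate(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)];               // field path such as "$age"
  }
  if (expr === null || typeof expr !== "object") {
    return expr;                             // literal such as 18 or "age_0_17"
  }
  const [op] = Object.keys(expr);
  const args = expr[op].map((a) => evaluate(a, doc));
  switch (op) {
    case "$lt":   return args[0] < args[1];   // array of two values in, boolean out
    case "$lte":  return args[0] <= args[1];
    case "$cond": return args[0] ? args[1] : args[2];
    default: throw new Error("unsupported operator: " + op);
  }
}

// The same _id expression as in the pipeline above.
const idExpr = {
  $cond: [{ $lt: ["$age", 18] }, "age_0_17", {
    $cond: [{ $lte: ["$age", 25] }, "age_18_25", {
      $cond: [{ $lte: ["$age", 32] }, "age_26_32", "age_Above_32"],
    }],
  }],
};

// Emulates { $group: { _id: <expr>, count: { $sum: 1 } } }.
function groupCount(docs, expr) {
  const counts = {};
  for (const doc of docs) {
    const key = evaluate(expr, doc);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Hypothetical sample documents covering each boundary.
const people = [{ age: 10 }, { age: 18 }, { age: 25 }, { age: 26 }, { age: 32 }, { age: 40 }];
const ageCounts = groupCount(people, idExpr);
console.log(ageCounts);
```

Each false branch simply hands the document on to the next test, which is why the boundaries only need an upper bound at every level.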

Mongo aggregation with conditional date aggregation

A few problems in there so a bit beyond a comment. Mostly you were not enclosing several date operators with {} and as such producing invalid JSON inside an array. It helps if you change your indentation style as well to make it easier to spot formatting problems.

I also personally prefer to stick to fully strict JSON notation. It parses well with other languages and is easier to "lint", which is something you should look at to avoid syntax errors in the future:

db.transactions.aggregate([
    { "$group": {
        "_id": {
            "day": { "$dayOfMonth": "$creationDate" }, 
            "month": {
                "$cond": [
                    { "$gte": [ {"$month": "$creationDate"}, 9 ] },
                    9,
                    0 
                ]                                
            }, 
            "year": {
                "$cond": [
                    { "$gte": [ { "$year": "$creationDate" }, 2014] },
                    2014,
                    0
                ]                                
            }
        }, 
        "collected": {
            "$sum": {
                "$cond": [
                    { "$eq": [ "$paid", "true" ] }, 
                    "$totalamount",
                    0
                ]
            } 
        }
    }}
])

Also, the logical check with $eq was missing from that $sum at the end. Make sure you actually mean the "string" value of "true" and not true as a plain boolean in this case as well.
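
The string versus boolean distinction matters because $eq compares type as well as value. A minimal sketch in plain JavaScript, using strict equality as a stand-in for $eq on these scalar values; the sample transactions and the helper name collected are made up for illustration:

```javascript
// Hypothetical transactions: "paid" is stored inconsistently on purpose.
const transactions = [
  { paid: true,   totalamount: 10 },  // boolean true
  { paid: "true", totalamount: 20 },  // the string "true"
  { paid: false,  totalamount: 40 },
];

// Emulates { $sum: { $cond: [ { $eq: ["$paid", expected] }, "$totalamount", 0 ] } }
function collected(docs, expected) {
  return docs.reduce(
    (sum, d) => sum + (d.paid === expected ? d.totalamount : 0),
    0
  );
}

const collectedString = collected(transactions, "true");  // counts only the string form
const collectedBoolean = collected(transactions, true);   // counts only the boolean form
console.log(collectedString, collectedBoolean);
```

If the collection holds booleans, comparing against "true" silently sums to 0 rather than raising an error, so it is worth checking one stored document first.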


Mongo aggregation grouped by $sum?

Sure. The aggregation "pipeline" is exactly that, as you can "chain" or "pipe" stages together. To get your result you basically want two $group stages in succession:

db.commits.aggregate([
    { "$group": {
        "_id": "$name",
        "commits": { "$sum": 1 }  
    }},
    { "$group": {
        "_id": "$commits",
        "users": { "$push": "$_id" },
        "howMany": { "$sum": 1 }
    }},
    { "$sort": { "_id": -1 } }
])

So the first totals per "user" and the second collects them by "count". Optionally sorted descending into this form:

{ "_id" : 3, "users" : [ "c" ], "howMany": 1 }
{ "_id" : 2, "users" : [ "a" ], "howMany": 1 }
{ "_id" : 1, "users" : [ "d", "b" ], "howMany": 2 }

There is no restriction on how many times a stage can appear (within BSON size limitations), so you are not restricted to just having a single $group or other pipeline stage.
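
As a rough illustration of how the output of one $group becomes the input of the next, here is the same chain done by hand in plain JavaScript. The sample commits are assumed data chosen so that the counts line up with the result shown above; the order of names inside "users" is not guaranteed to match.

```javascript
// Hypothetical commit documents; only the "name" field is used.
const commits = [
  { name: "a" }, { name: "b" }, { name: "c" }, { name: "a" },
  { name: "c" }, { name: "d" }, { name: "c" },
];

// Stage 1: { $group: { _id: "$name", commits: { $sum: 1 } } }
const perUser = new Map();
for (const { name } of commits) {
  perUser.set(name, (perUser.get(name) || 0) + 1);
}

// Stage 2: { $group: { _id: "$commits", users: { $push: "$_id" }, howMany: { $sum: 1 } } }
// Its input is the output of stage 1, not the original collection.
const perCount = new Map();
for (const [user, count] of perUser) {
  const group = perCount.get(count) || { _id: count, users: [], howMany: 0 };
  group.users.push(user);
  group.howMany += 1;
  perCount.set(count, group);
}

// Stage 3: { $sort: { _id: -1 } }
const grouped = [...perCount.values()].sort((x, y) => y._id - x._id);
console.log(JSON.stringify(grouped));
```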


Mongo aggregation slow

If MongoDB has to load a large number of documents into memory (900,000 being a goodly amount) it is going to take some time. The ways to improve this are:

  • improve your hardware
  • use sharding to distribute the load

Sharding will work well if the group reduces the number of documents significantly. This is because the initial group work will be done on each shard and then re-done on the mongos.
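
The "group on each shard, then re-group on the merging node" idea can be sketched as two small functions in plain JavaScript. This only illustrates the principle for a count per key; the data, the field name and the function names partialGroup and mergeGroups are assumptions, and it does not reflect MongoDB internals.

```javascript
// Hypothetical data already distributed across two shards.
const shards = [
  [{ name: "a" }, { name: "b" }, { name: "a" }],
  [{ name: "a" }, { name: "c" }, { name: "b" }],
];

// Runs independently on each shard: count documents per key.
function partialGroup(docs) {
  const counts = {};
  for (const { name } of docs) {
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}

// Runs once on the merging node: add up the partial counts per key.
function mergeGroups(partials) {
  const totals = {};
  for (const partial of partials) {
    for (const [key, n] of Object.entries(partial)) {
      totals[key] = (totals[key] || 0) + n;
    }
  }
  return totals;
}

const partials = shards.map(partialGroup);
const totals = mergeGroups(partials);
console.log(partials, totals);
```

The merging step only sees one row per key per shard, so the fewer distinct keys there are relative to documents, the less data has to cross the network.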


How to find match in documents in Mongo and Mongo aggregation?

What you are really asking here is how to make MongoDB return something that is actually quite different from the form in which you store it in your collection. The standard query operations do allow a "limited" form of "projection", but even as the title on the page shared in that link suggests, this is really only about "limiting" the fields to display in results based on what is present in your document already.

So any form of "alteration" requires some form of aggregation; both the aggregate and mapReduce operations allow you to "re-shape" the document results into a form that is different from the input. Perhaps the main thing people miss with the aggregation framework in particular is that it is not all about "aggregating"; in fact the "re-shaping" concept is core to its implementation.

So in order to get results how you want, you can take an approach like this, which should be suitable for most cases:

db.collection.aggregate([
    { "$unwind": "$students" },
    { "$unwind": "$studentDept" },
    { "$group": {
        "_id": "$students.name",
        "tfee": { "$first": "$students.fee" },
        "tdept": {
            "$min": {
                "$cond": [
                    { "$eq": [ 
                        "$students.name", 
                        "$studentDept.name"
                    ]},
                    "$studentDept.dept",
                    false
                ]
            }
        }
    }},
    { "$match": { "tdept": { "$ne": false  } } },
    { "$sort": { "_id": 1 } },
    { "$project": {
        "_id": 0,
        "name": "$_id",
        "fee": "$tfee",
        "dept": "$tdept"
    }}
])

Or alternately just "filter out" the cases where the two "name" fields do not match and then just project the content with the fields you want, if crossing content between documents is not important to you:

db.collection.aggregate([
    { "$unwind": "$students" },
    { "$unwind": "$studentDept" },
    { "$project": {
        "_id": 0,
        "name": "$students.name",
        "fee": "$students.fee",
        "dept": "$studentDept.dept",
        "same": { "$eq": [ "$students.name", "$studentDept.name" ] }
    }},
    { "$match": { "same": true } },
    { "$project": {
        "name": 1,
        "fee": 1,
        "dept": 1
    }}
])

From MongoDB 2.6 and upwards you can even do the same thing "inline" to the document between the two arrays. You still want to reshape that array content in your final output though, but it is possibly done a little faster:

db.collection.aggregate([

  // Compares entries in each array within the document
  { "$project": {
    "students": {
      "$map": {
        "input": "$students",
        "as": "stu",
        "in": {
          "$setDifference": [
            { "$map": {
              "input": "$studentDept",
              "as": "dept",
              "in": {
                "$cond": [
                  { "$eq": [ "$$stu.name", "$$dept.name" ] },
                  {
                    "name": "$$stu.name",
                    "fee": "$$stu.fee",
                    "dept": "$$dept.dept"
                  },
                  false
                ]
              }
            }},
            [false]
          ]
        }
      }
    }
  }},

  // Students is now an array of arrays. So unwind it twice
  { "$unwind": "$students" },
  { "$unwind": "$students" },

  // Rename the fields and exclude
  { "$project": {
    "_id": 0,
    "name": "$students.name",
    "fee":  "$students.fee",
    "dept": "$students.dept"
  }},
])
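
The nested $map in that last pipeline can be read as two nested array maps. Here is a minimal plain-JavaScript sketch of the same logic on a single document; the document contents are made up, with only the field names taken from the pipelines above, and a filter stands in for $setDifference (which would additionally remove duplicate entries).

```javascript
// Hypothetical document; the field names follow the pipelines above.
const doc = {
  students: [
    { name: "ann", fee: 100 },
    { name: "bob", fee: 200 },
    { name: "cat", fee: 300 },   // no matching department entry
  ],
  studentDept: [
    { name: "bob", dept: "physics" },
    { name: "ann", dept: "maths" },
  ],
};

// Outer $map over students, inner $map over studentDept. Non-matching
// pairs become false and are then stripped from each inner array.
const nested = doc.students.map((stu) =>
  doc.studentDept
    .map((dept) =>
      stu.name === dept.name
        ? { name: stu.name, fee: stu.fee, dept: dept.dept }
        : false
    )
    .filter((entry) => entry !== false)
);

// The result is an array of arrays; flattening it plays the role of the
// two $unwind stages, and empty inner arrays simply disappear.
const rows = nested.flat();
console.log(rows);
```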

So where you want to essentially "alter" the structure of the output, you need to use one of the aggregation tools to do so. And you can, even if you are not really aggregating anything.


