As part of an ongoing quest to speed up an application I’m working on, I found myself tasked with writing a fairly complicated MongoDB aggregation pipeline. I found no existing documentation on how to accomplish the task at hand, so I figured I should pay it forward and document my solution for future generations.
Widgets and Icons
Imagine we have two MongoDB collections. Our first collection holds information about widgets
in our system:
db.getCollection('widgets').insert({
_id: 1,
name: 'Name',
info: [
{
iconId: 2,
text: 'Text'
}
]
});
Every widget has a name
and a list of one or more info
objects. Each info
object has a text
field and an associated icon referenced by an iconId
.
Our icons
collection holds some basic information about each icon:
db.getCollection('icons').insert({
_id: 2,
name: 'Icon',
uri: 'https://...'
});
The goal is to write an aggregation that returns our widgets with the associated icon documents attached to each corresponding info
object:
{
_id: 1,
name: 'Name',
info: [
{
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
]
}
Working Through the Pipeline
The aggregation that accomplishes this goal operates in six stages. Let’s work through each stage one by one. We’ll start by $unwind
ing our info
array:
db.getCollection('widgets').aggregate([
{ $unwind: '$info' }
]);
This creates a new document for every widget/info pair:
{
_id: 1,
name: 'Name',
info: {
iconId: 2,
text: 'Text',
}
}
Next, we’ll $lookup
the icon associated with the given iconId
:
db.getCollection('widgets').aggregate([
...
{
$lookup: {
from: 'icons',
localField: 'info.iconId',
foreignField: '_id',
as: 'info.icon'
}
}
]);
Our resulting document will now have a list of icons in the info.icon
field:
{
_id: 1,
name: 'Name',
info: {
iconId: 2,
text: 'Text',
icon: [
{
_id: 2,
name: 'Icon',
uri: 'https://...'
}
]
}
}
This is a step in the right direction, but we know that the info
to icons
relationship will always be a one to one relationship. We’ll always receive exactly one icon as a result of our $lookup
.
Armed with this knowledge, we know we can $unwind
on info.icon
and safely turn our info.icon
array into an object:
db.getCollection('widgets').aggregate([
...
{ $unwind: '$info.icon' }
]);
{
_id: 1,
name: 'Name',
info: {
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
}
But now we need to roll our info
back up into an array. We can accomplish this by $group
ing our widgets together based on their _id
. However, we need to be careful to preserve the original document to avoid clobbering the entire widget:
db.getCollection('widgets').aggregate([
...
{
$group: {
_id: '$_id',
root: { $mergeObjects: '$$ROOT' },
info: { $push: '$info' }
}
}
]);
Our resulting document contains our info
array and the original, pre-$group
widget document in the root
field:
{
root: {
_id: 1,
name: 'Name',
info: {
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
},
info: [
{
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
]
}
The next step in our pipeline is to replace our root document with the root
object merged with the actual root document. This will override the info
object in root
with our newly grouped together info
array:
db.getCollection('widgets').aggregate([
...
{
$replaceRoot: {
newRoot: {
$mergeObjects: ['$root', '$$ROOT']
}
}
}
]);
We’re getting close to our goal:
{
_id: 1,
name: 'Name',
info: [
{
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
],
root: {
_id: 1,
name: 'Name',
info: {
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
}
}
An unfortunate side effect of this merger is that our resulting document still has a root
object filled with superfluous data. As a final piece of housecleaning, let’s remove that field:
db.getCollection('widgets').aggregate([
...
{
$project: {
root: 0
}
}
]);
And with that we’re left with our original goal:
{
_id: 1,
name: 'Name',
info: [
{
iconId: 2,
text: 'Text',
icon: {
_id: 2,
name: 'Icon',
uri: 'https://...'
}
}
]
}
Success!
All Together
For posterity, here’s the entire aggregation pipeline in its entirety:
db.getCollection('widgets').aggregate([
{ $unwind: '$info' },
{
$lookup: {
from: 'icons',
localField: 'info.iconId',
foreignField: '_id',
as: 'info.icon'
}
},
{ $unwind: '$info.icon' },
{
$group: {
_id: '$_id',
root: { $mergeObjects: '$$ROOT' },
info: { $push: '$info' }
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: ['$root', '$$ROOT']
}
}
},
{
$project: {
root: 0
}
}
]);
I’ll be the first to say that I’m not a MongoDB expert, and I’m even less knowledgeable about building aggregation pipelines. There may be other, better ways of accomplishing this same task. If you know of a better, more efficient pipeline that gives the same results, please let me know!