Well, if it were that easy... 😉 Unfortunately I don't have any such attribute, and the data I have is in an 'as is' state. So I think the only reasonable way of aggregating the data is a spatial one. One thing I tried was to assign such a 'building-id' attribute to the walls by spatially joining them to building footprints, but that also failed. The problem is that you can't use multipatches in a spatial join; you first have to make a footprint feature class out of them. But when you do that, perfectly vertical walls are 'footprinted' into their 2D 'envelope', so their footprint envelopes often intersect two or three building footprints from the other feature class. BTW, vertical walls also seem to cause trouble elsewhere, e.g. when splitting the multipatches into a square grid in DI.
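One idea that might get around the envelope problem is to join on a single representative point per wall instead of on its whole footprint. Below is a minimal sketch in plain Python, not tied to any ArcGIS API. It assumes the wall vertices and the footprint rings have already been exported as coordinate lists; the function names and the `max_distance` tolerance are hypothetical, made up for illustration. Each wall is assigned to the footprint whose boundary is nearest to the wall's 2D midpoint, so a vertical wall ends up with one building only.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest 2D distance from point (px, py) to segment (a, b)."""
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: both ends are the same point.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_ring(px, py, ring):
    """Distance from a point to the boundary of a footprint ring."""
    n = len(ring)
    return min(
        point_segment_distance(px, py, *ring[i], *ring[(i + 1) % n])
        for i in range(n)
    )


def wall_midpoint(vertices):
    """Mean x/y of the wall's vertices; z is ignored."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def assign_building_ids(walls, footprints, max_distance=1.0):
    """Map each wall id to the id of the nearest footprint boundary.

    walls:        {wall_id: [(x, y, z), ...]}      (hypothetical export)
    footprints:   {building_id: [(x, y), ...]}     (hypothetical export)
    max_distance: assumed tolerance; walls farther away get None.
    """
    result = {}
    for wall_id, vertices in walls.items():
        x, y = wall_midpoint(vertices)
        best_id, best_dist = None, max_distance
        for b_id, ring in footprints.items():
            d = distance_to_ring(x, y, ring)
            if d < best_dist:
                best_id, best_dist = b_id, d
        result[wall_id] = best_id
    return result


footprints = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(12, 0), (22, 0), (22, 10), (12, 10)],
}
walls = {
    "w1": [(10, 2, 0), (10, 8, 0), (10, 8, 5), (10, 2, 5)],
    "w2": [(12.1, 2, 0), (12.1, 8, 0), (12.1, 8, 5), (12.1, 2, 5)],
    "w3": [(100, 100, 0), (100, 105, 0), (100, 105, 5), (100, 100, 5)],
}
assignment = assign_building_ids(walls, footprints)
```

This would still be ambiguous for a wall shared by two adjoining buildings, since its midpoint is equally close to both footprints; in this sketch the first footprint checked simply wins.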
I also thought of making polygons whose boundaries would be the roads' center lines (so in theory every building should lie inside one and only one such polygon) and aggregating the buildings on that basis. That might be some kind of solution, but it still doesn't solve the trouble with MeshMerger that I described.