How to Index an Item in More than One Tree Node

Some items belong to more than one category. For example, movies can be assigned a genre such as Drama or Comedy. There is also a standard Dramatic Comedy genre, which is both a type of Comedy and a type of Drama.

A similar approach can be taken to assign real estate listings to more than one neighborhood or region.

Dimensions

In general, a tree type dimension models this kind of hierarchy very well. Here is an incorrect tree type dimension for movie genres. It is incorrect because each element id must be unique within a dimension, and the id Dramatic Comedy appears twice.

Incorrect Dimension Definition with duplicated Ids

<dimension id="genres" key="genres" type="tree" delimiters=",">
  <element id="Comedy">
    <element id="Dramatic Comedy"/>
    <element id="Dark comedy"/>
  </element>
  <element id="Drama">
    <element id="Docudrama"/>
    <element id="Historical drama"/>
    <element id="Dramatic Comedy"/> <!-- NOT UNIQUE! -->
  </element>
</dimension>
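Duplicates like this are easy to miss in a large dimension. The following is a minimal sketch of a check you could run before loading a definition. It is not part of the product: the function name `duplicate_element_ids` is hypothetical, and it assumes only that the definition is well-formed XML using the `element` tag and `id` attribute shown above.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The incorrect definition from above, with the duplicated id.
INCORRECT = """
<dimension id="genres" key="genres" type="tree" delimiters=",">
  <element id="Comedy">
    <element id="Dramatic Comedy"/>
    <element id="Dark comedy"/>
  </element>
  <element id="Drama">
    <element id="Docudrama"/>
    <element id="Historical drama"/>
    <element id="Dramatic Comedy"/> <!-- NOT UNIQUE! -->
  </element>
</dimension>
"""

def duplicate_element_ids(dimension_xml: str) -> list[str]:
    """Return the element ids that occur more than once (hypothetical helper)."""
    root = ET.fromstring(dimension_xml)
    # iter("element") visits every nested <element>, at any depth.
    counts = Counter(el.get("id") for el in root.iter("element"))
    return sorted(el_id for el_id, n in counts.items() if n > 1)

print(duplicate_element_ids(INCORRECT))
```

An empty list means every element id in the dimension is unique.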

To resolve the duplicate ids, choose an arbitrary unique id each time a changeset property value needs to appear under an additional category. In the following example, we appended the number "2" to the duplicate id.

We also specify the value to associate with the new id, so that items carrying that value are still placed in the correct "bucket".

Correct Dimension Definition with unique Ids

<dimension id="genres" key="genres" type="tree" delimiters=",">
  <element id="Comedy">
    <element id="Dramatic Comedy"/>
    <element id="Dark comedy"/>
  </element>
  <element id="Drama">
    <element id="Docudrama"/>
    <element id="Historical drama"/>
    <element id="Dramatic Comedy 2" value="Dramatic Comedy"/>
  </element>
</dimension>

Queries

The query criterion for a tree type dimension can refer to the id or the value. With the above dimension definition, querying for id Comedy will return any movies with values associated with the following element ids: Comedy, Dramatic Comedy and Dark comedy.

The drilldown counts for Comedy and for Drama each include the items matching that element node and all of its descendants. Queries for id Dramatic Comedy and for id Dramatic Comedy 2 return identical results, because both elements are associated with the same value.

{
  "dimension": "genres",
  "id": "Comedy"
}
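To make the matching behaviour described above concrete, here is a small simulation. It is only a sketch and not the engine's implementation. It assumes that an element without a value attribute uses its id as its value, and that an item's genres property is a comma-delimited string, as the delimiters attribute suggests. The movie names and the helpers `build_paths` and `matching_items` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The corrected definition from above.
DIMENSION = """
<dimension id="genres" key="genres" type="tree" delimiters=",">
  <element id="Comedy">
    <element id="Dramatic Comedy"/>
    <element id="Dark comedy"/>
  </element>
  <element id="Drama">
    <element id="Docudrama"/>
    <element id="Historical drama"/>
    <element id="Dramatic Comedy 2" value="Dramatic Comedy"/>
  </element>
</dimension>
"""

# Hypothetical items: movie name -> comma-delimited genres property.
ITEMS = {
    "Movie A": "Dramatic Comedy",
    "Movie B": "Dark comedy",
    "Movie C": "Docudrama,Historical drama",
}

def build_paths(dimension_xml):
    """Map each element id to (value, ids from the root down to the element)."""
    paths = {}
    def walk(node, ancestors):
        for el in node.findall("element"):  # direct children only
            el_id = el.get("id")
            path = ancestors + [el_id]
            # Assumption: the value defaults to the id when not given.
            paths[el_id] = (el.get("value", el_id), path)
            walk(el, path)
    walk(ET.fromstring(dimension_xml), [])
    return paths

def matching_items(paths, items, query_id):
    """Names of items whose values fall under query_id or its descendants."""
    values = {value for value, path in paths.values() if query_id in path}
    return sorted(name for name, genres in items.items()
                  if values & set(genres.split(",")))

paths = build_paths(DIMENSION)
for query_id in ("Comedy", "Drama", "Dramatic Comedy", "Dramatic Comedy 2"):
    print(query_id, matching_items(paths, ITEMS, query_id))
```

In this model, Movie A is returned for both Comedy and Drama, and the queries for Dramatic Comedy and Dramatic Comedy 2 return the same single item.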