Skip to main content

Parent-Child Hierarchy

In the XpertAI platform, to flexibly represent business structures with self-referential relationships such as organizations, individuals, or categories, a Parent-Child Hierarchy is supported for modeling. Unlike traditional multi-level dimensions, this structure does not require a predefined fixed number of levels. Instead, it dynamically constructs hierarchical relationships of arbitrary depth using a self-referential data table.

🎯 Functional Overview

The parent-child hierarchy consists of a single logical level, but each member can specify its “parent member,” forming a dynamically generated tree structure. This structure is particularly suitable for the following typical business scenarios:

  • Organizational Structure: Each employee may have a direct superior (manager), who is also an employee, forming a hierarchical relationship.
  • Product Categories: Some subcategories can be nested under other categories, with no fixed depth.
  • Geographic Regions: Administrative regions may follow a structure like “Country > State > City > District,” with some paths having more or fewer levels.
  • Menu Structures or Knowledge Trees: Used for navigation, document classification, etc.

🔑 Key Components

When building a parent-child hierarchy, the following key fields need to be configured:

FieldDescription
Member IDA unique identifier for each member, used to establish parent-child relationships.
Parent IDSpecifies the parent of the current member, typically referencing another member in the same table.
Null Parent ValueIf a member has no parent (e.g., top-level organization or root category), a specific value is used to mark it as a top-level node, such as 0, -1, or an empty string.
Closure Table (Optional)A precomputed table storing all “ancestor-descendant” pairs to improve performance for multi-level queries, especially with large datasets.

Parent Member Field

This field specifies the “parent member ID” for each member and is the key to establishing hierarchical relationships.

  • Purpose: Defines the direct parent of the current member.
  • Requirement: Typically a foreign key referencing the primary key of the same table.
  • Data Type: Must match the member’s primary key (e.g., both integers or strings).
  • Typical Use Cases: “Manager ID” in an organizational structure or “Parent Category ID” in classifications.

📌 Example

In an employee table: employee_id = 1002, supervisor_id = 1001
Indicates that employee 1002’s direct superior is employee 1001.

Top-Level Node Identifier

Used to specify a special value for members with no parent, marking them as root nodes or top-level nodes in the tree.

  • Purpose: Identifies members with no parent, serving as the starting point of the hierarchy.
  • Common Values:
    • null (empty value)
    • 0
    • -1
    • Empty string ''
  • Recommendation: Maintain consistency and choose indexable values (e.g., avoid null to prevent index inefficiencies).

📌 Example

If a record has parent_id = 0, it indicates the member has no parent and is a top-level node.

Custom Parent Expression

Used to define parent-child relationships with custom logic instead of directly referencing a column.

  • Purpose: Provides a more flexible way to handle special parent-child logic.
  • Applicable Scenarios:
    • Parent relationships depend on a combination of multiple fields.
    • Field transformations, formatting, or conditional filtering are needed in parent-child mappings.
  • Expression Format: Typically an SQL expression or platform-supported computed field logic.

📌 Example Scenario

If the parent-child relationship is constructed as CONCAT(region_code, '-', parent_id),
the expression CONCAT(region_code, '-', parent_id) can be used to identify the parent member.

🔒 Note: When using a custom parent expression, ensure the logic is correct and readable to avoid impacting hierarchy accuracy or performance.

Closure Table

In parent-child hierarchies, deep levels or large member counts can degrade performance during aggregations (e.g., summing salaries or counting child nodes). To address this, configuring a Closure Table is recommended to significantly improve analysis performance.

A closure table is an additional data table that records all relationships between members and their ancestors, regardless of the level distance.

Simply put, while the original data only records the “direct parent,” the closure table supplements “all ancestor” relationships, forming complete tree paths.

🛠 Closure Table Structure Design

A typical closure table includes the following fields:

Field NameDescription
ancestor_idID of the ancestor member
descendant_idID of the current member
distanceLevel distance between ancestor and descendant (optional)

📌 Example:

ancestor_iddescendant_iddistance
110
121
132
231
351

Indicates: Member 1 is an ancestor of member 3 with a distance of 2; member 2 is the direct parent of member 3 with a distance of 1.

🧩 Configuration Recommendations

  1. Maintain Field Consistency
    Member ID fields in the closure table should match the data type of the dimension table (e.g., both integers or strings).

  2. Set Primary Keys and Indexes
    Create a unique index on (ancestor_id, descendant_id) to improve join performance.

  3. Synchronize Updates
    The closure table must be updated whenever the member structure changes (e.g., adding/deleting relationships). This can be automated via database scripts or ETL tools.

🔄 Closure Table Update Methods

The platform does not automatically maintain the closure table. The following methods are recommended during data loading:

  • Generate Closure Table with ETL Tools (e.g., Kettle, DataStage, Airbyte, etc.).
  • Build Recursive Stored Procedures in the Database to batch-generate ancestor-descendant relationships.
  • Incremental Update Logic: For new members, only append new paths.

📌 Example: MySQL Closure Table Logic (Pseudo-code)

-- Insert all members as their own ancestors
INSERT INTO closure (ancestor_id, descendant_id, distance)
SELECT id, id, 0 FROM employee;

-- Recursively insert ancestor-descendant relationships
WHILE (new child nodes exist) {
INSERT INTO closure (...)
SELECT ...
}

🧠 Why Use Parent-Child Hierarchy?

Compared to traditional fixed-level structures, parent-child hierarchies offer the following advantages:

  • Supports Irregular Levels: Each path can have a different depth, flexibly adapting to real-world business needs.
  • Easy to Maintain: Requires managing only a single self-referential table, eliminating the need for multiple level fields.
  • Intuitive Structure: The complete hierarchy tree can be derived from parent-child relationships.
  • Less Redundancy: Avoids reserving space for unnecessary levels, improving data clarity.

🚀 Application Examples

  1. Employee Structure Chart

    • Each employee record includes their “manager employee ID.”
    • Top-level managers (e.g., CEO) have no parent, with a parent field set to 0.
    • The resulting tree can be used for permission control, organizational reporting, etc.
  2. Dynamic Menu

    • Each menu item can specify a parent menu.
    • No need to predefined fixed fields like “Level 1, Level 2, Level 3.”
    • Ideal for building navigation structures with arbitrary nesting depths.
  3. Multi-Level Product Categories

    • Some category paths may be “Appliances > Kitchen Appliances > Juicer,” while others are “Appliances > Air Conditioner.”
    • Parent-child hierarchies automatically adapt to such irregular paths.

⚙ Performance Optimization Recommendations

  • For deep hierarchies or large datasets, enable the Closure Table to precompute all ancestor-descendant relationships, significantly improving analysis speed.
  • Ensure indexes are created for member ID and parent ID fields to optimize query efficiency.
  • Use a consistent identifier value for top-level nodes (e.g., 0 or -1) to avoid omissions.