Skip to main content

Multidimensional Models

Multidimensional models are models that associate multiple dimension tables with a fact table as the basis, separating dimension data from fact data for enhanced analytical capabilities.

Multidimensional models consist of cubes and dimensions, which can be annotated and extended to support complex query construction. Business Intelligence (BI) developers create cubes to support quick responses and provide a single data source for business reporting. Given the increasing importance of business intelligence across various levels of an organization, using a single analytical data source ensures minimizing differences (if differences cannot be completely eliminated).

Noun Explanations:โ€‹

  • Cube (ๅคš็ปดๆ•ฐๆฎ้›†)
  • Dimension (็ปดๅบฆ)
  • Shared Dimension (ๅ…ฑไบซ็ปดๅบฆ)
  • Hierarchy (ๅฑ‚ๆฌก็ป“ๆž„)
  • Level (ๅฑ‚็บง)
  • Measure (ๅบฆ้‡)
  • Dimension Member (็ปดๅบฆๆˆๅ‘˜)
  • Calculated Member (่ฎก็ฎ—ๆˆๅ‘˜)
  • Virtual Cube (่™šๆ‹Ÿๆ•ฐๆฎ้›†)
  • Role (่ง’่‰ฒ)

Dimensionsโ€‹

Database dimensions are collections of related objects (referred to as attributes) used to provide information about the fact data in one or more cubes. For example, typical attributes in a product dimension might include product name, product category, product line, product specifications, and product price. These objects are bound to one or more columns in one or more tables of the data source view. By default, these attributes, like attribute hierarchies, are visible and used to understand the fact data in cubes. Attributes can be organized into user-defined hierarchies to provide navigation paths to help users browse the data in cubes.

Cubes contain all the dimensions upon which users analyze fact data. An instance of a database dimension in a multidimensional dataset is called a multidimensional dataset dimension, and it is associated with one or more sets of measure values in the multidimensional dataset. Database dimensions can be used multiple times in a multidimensional dataset. For example, a fact data table might have multiple fact data related to time, and separate multidimensional dataset dimensions can be defined to analyze each fact data related to time. However, only one database dimension related to time needs to exist, which also means only one relationship database table related to time is required to support multiple time-based multidimensional dataset dimensions.

Hierarchyโ€‹

Hierarchies are used to organize dimension members into a hierarchical structure and provide navigation paths in cubes. For example, the table below defines the dimension table for the time dimension. The dimension table supports three attributes: year, quarter, and month.

YearQuarterMonth
1999Q1January
1999Q1February
1999Q1March
1999Q2April
1999Q2May
1999Q2June
1999Q3July
1999Q3August
1999Q3September
1999Q4October
1999Q4November
1999Q4December

A user-defined hierarchy (calendar) is constructed using the attributes year, quarter, and month to form the time dimension. The relationships between levels and members of the "Calendar" dimension (regular dimension) are illustrated in the relationship diagram below:

Middle

Parent-Child Hierarchyโ€‹

A dimension's parent-child hierarchy is defined by using a special attribute (called the parent attribute) to determine the relationships between members. The parent attribute is used to explain self-referencing relationships or self-joins within the main dimension table. Parent-child hierarchies are constructed based on a single parent attribute. The levels appearing in the hierarchy are formed by parent-child relationships between members associated with the parent attribute, thus only assigning one level to a parent-child hierarchy. The dimension architecture of a parent-child hierarchy relies on self-referencing relationships provided in the main dimension table.

Hierarchy

In this dimension table, the ParentOrganizationKey column is related to the OrganizationKey primary key column. In other words, each record in this table can be associated with other records in this table through parent-child relationships. This self-join is commonly used to represent hierarchical data within entities, such as employee management structures within a department.

When creating a parent-child hierarchy, the two attributes representing columns must have the same data type and must be in the same table. By default, any member's parent key, whether equal to its own member key, empty, 0, or a value not appearing in the member key column, is considered a top-level member (excluding the "(All)" level).

Dimension Membersโ€‹

Dimension members are points within a dimension determined by specific attribute values. For example, a gender hierarchy has two members, 'M' and 'F'. "San Francisco", "California", and "USA" are members of the store hierarchy.

Note that entities like "San Francisco", "California", and "USA", although they have a hierarchical relationship, are all members within the store hierarchy, meaning they are statistically analyzed based on different levels of stores they belong to.

Cubesโ€‹

Cubes are multidimensional structures containing information for analysis; they primarily consist of dimensions and measure values. Dimensions define the structure of the multidimensional dataset on which slicing operations are performed, while measure values provide aggregated numerical values of interest to end-users. As a logical structure, cubes allow client applications to retrieve values of measure values as if they were contained within the cells of the multidimensional dataset; a cell is defined for each possible summary value. Cells in cubes are defined by intersections of dimension members and contain aggregated values of measures at that specific intersection.

Cubes provide a single location to store all relevant data for analysis purposes.

Measuresโ€‹

Measures represent a column containing measurable data (usually numeric) that can be aggregated. Measures represent aspects of organizational activities expressed in monetary terms (such as revenue, profit, or cost), by count (inventory levels, number of employees, customers, or orders), or by more complex calculations based on business logic.

Each multidimensional dataset must have at least one measure, although most have many measures, sometimes reaching hundreds. Structurally, measures typically map to source columns in fact data tables, which provide values used to populate measures. Alternatively, you can define measures with MDX.

Measures are context-sensitive, operating on numerical data in the context determined by any dimension members coincidentally included in the query. For example, computing the "Distributor Sales" measure would be supported by the Sum operator and would aggregate sales for each dimension member included in the query. Whether the query specifies a single product, aggregates to a category, or slices by time or geography, the measure should produce operations relevant

to the dimensions included in the query.

Calculated Membersโ€‹

Calculated members are measures whose values are calculated using a set calculation expression. Calculated members provide flexible calculation capabilities to multidimensional models, enabling rich functionality.

MDXโ€‹

1. Core Advantages of MDX Compared to SQLโ€‹

DimensionMDX AdvantagesCompared to SQL
Multidimensional ModelingDesigned for multidimensional data (OLAP Cube)SQL is for relational 2D tables, lacks native multidimensional support
Time IntelligenceNative support for time hierarchies (e.g., YTD, QTD, Moving Avg)SQL requires complex custom window functions or subqueries
Dimension HierarchiesDirect access to hierarchies (e.g., Year โ†’ Quarter โ†’ Month)SQL requires manual hierarchy maintenance
Aggregation ExpressivenessSupports complex dimension member aggregation (e.g., Aggregate, YTD, PeriodToDate)SQL limited to GROUP BY aggregation
Calculated MembersDynamically creates calculated members with flexible logicSQL calculated fields limited by table structure and aggregation
Context SwitchingAutomatically adjusts context based on row/column coordinates (Slicer/Axis)SQL lacks context concept, requires manual filter management

2. Example Comparisonโ€‹

๐ŸŽฏ Business Requirement:โ€‹

Calculate yearly cumulative sales (YTD) for the "Mountain Bike" product category in the "West" region

๐Ÿ”ธ SQL Implementation (Simplified):โ€‹

SELECT
YEAR(OrderDate) AS [Year],
SUM(SalesAmount) OVER (PARTITION BY YEAR(OrderDate) ORDER BY OrderDate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS YTD_Sales
FROM Sales
WHERE Region = 'West' AND ProductCategory = 'Mountain Bike'
  • SQL requires explicit time windows, filter fields, and grouping logic.
  • Not easily reusable or dimension-switchable.

๐Ÿ”น MDX Implementation:โ€‹

WITH 
MEMBER [Measures].[YTD Sales] AS
YTD([Order Date].[Calendar].CurrentMember), [Measures].[Sales Amount]

SELECT
{[Measures].[YTD Sales]} ON COLUMNS,
{[Order Date].[Calendar].[Year].Members} ON ROWS
FROM [Sales]
WHERE (
[Product].[Category].[Mountain Bike],
[Geography].[Region].[West]
)
  • Uses YTD() for direct time hierarchy accumulation, no window functions needed.
  • WHERE slicer automatically adjusts context.
  • Easily extensible to other dimensions or hierarchies (e.g., quarter, month).

3. Use Case Comparison Summary:โ€‹

ScenarioRecommended LanguageReason
Raw Transaction Data OperationsSQLEfficient for detailed data, transactional queries
Multidimensional Analysis (e.g., Cube Reports, OLAP)MDXStronger in dimension navigation, hierarchy aggregation, calculated members
Building BI Reports, Dashboard IndicatorsMDXEasier for KPI expression, time intelligence analysis
Data Warehouse ETLSQLStructured data processing, cleaning, merging