It seems that whenever I have a cross-continent flight, Mondrian gets a new feature. This particular flight was from Florida back home to California, and this particular feature is a time-dimension generator.
I was on the way home from an all-hands at Pentaho's Orlando, Florida headquarters, where new CEO Quentin Gallivan had outlined his strategy for the company. I also got to spend time with the many smart folks from all over the world who work for Pentaho, among them Roland Bouman, formerly an evangelist for MySQL, now with Pentaho, but still passionately advocating for open source databases, open source business intelligence, and above all, keeping it simple.
Roland and I got talking about how to map Mondrian onto operational schemas. Though not designed as star schemas, some operational schemas nevertheless have a structure that can support a cube, with a central fact table surrounded by star or snowflake dimension tables. Often the one thing missing is a time dimension table. Since these time dimension tables look very much the same, how easy would it be for Mondrian to generate them on the fly? Not that difficult, I thought, as the captain turned off the "fasten seatbelts" sign and I opened my laptop. Here's what I came up with.
Here's how you declare a regular time dimension table in Mondrian 4:

    <PhysicalSchema>
      <Table name='time_by_day'/>
      <!-- Other tables... -->
    </PhysicalSchema>

Mondrian sees the table name 'time_by_day', checks that it exists, and finds the column definitions from the JDBC catalog. The table can then be used in various dimensions in the schema.
An auto-generated time dimension is similar:

    <PhysicalSchema>
      <AutoGeneratedDateTable name='time_by_day_generated' startDate='2012-01-01' endDate='2014-01-31'/>
      <!-- Other tables... -->
    </PhysicalSchema>

The first time Mondrian reads the schema, it notices that the table is not present in the database, and creates and populates it. Here is the DDL and data it produces.

    CREATE TABLE `time_by_day_generated` (
      `time_id` Integer NOT NULL PRIMARY KEY,
      `yymmdd` Integer NOT NULL,
      `yyyymmdd` Integer NOT NULL,
      `the_date` Date NOT NULL,
      `the_day` VARCHAR(20) NOT NULL,
      `the_month` VARCHAR(20) NOT NULL,
      `the_year` Integer NOT NULL,
      `day_of_month` VARCHAR(20) NOT NULL,
      `week_of_year` Integer NOT NULL,
      `month_of_year` Integer NOT NULL,
      `quarter` VARCHAR(20) NOT NULL)

JULIAN | YYMMDD | YYYYMMDD | DATE | DAY_OF_WEEK_NAME | MONTH_NAME | YEAR | DAY_OF_MONTH | WEEK_OF_YEAR | MONTH | QUARTER |
---|---|---|---|---|---|---|---|---|---|---|
2455928 | 120101 | 20120101 | 2012-01-01 | Sunday | January | 2012 | 1 | 1 | 1 | Q1 |
2455929 | 120102 | 20120102 | 2012-01-02 | Monday | January | 2012 | 2 | 1 | 1 | Q1 |
2455930 | 120103 | 20120103 | 2012-01-03 | Tuesday | January | 2012 | 3 | 1 | 1 | Q1 |
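
To make the "create it if it is missing" step concrete, here is a minimal Python sketch. It is not Mondrian's code: it uses an in-memory SQLite database, only three of the columns, and a hypothetical helper named `ensure_date_table`. It also assumes that `endDate` is inclusive, which the schema above does not state.

```python
import sqlite3
from datetime import date, timedelta

def ensure_date_table(conn, name, start, end):
    """Create and fill `name` if it is missing; return the number of rows added."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    if exists:
        return 0  # table already present: leave it untouched
    # SQL parameters cannot name tables, so the identifier is interpolated.
    # Only pass trusted names here.
    conn.execute(
        f'CREATE TABLE "{name}" ('
        "time_id INTEGER NOT NULL PRIMARY KEY, "
        "yyyymmdd INTEGER NOT NULL, "
        "the_date TEXT NOT NULL)"
    )
    rows = []
    day = start
    while day <= end:  # assumption: endDate is inclusive
        rows.append((
            day.toordinal() + 1721425,  # Julian day number
            day.year * 10000 + day.month * 100 + day.day,
            day.isoformat(),
        ))
        day += timedelta(days=1)
    conn.executemany(f'INSERT INTO "{name}" VALUES (?, ?, ?)', rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")
first = ensure_date_table(conn, "time_by_day_generated", date(2012, 1, 1), date(2014, 1, 31))
second = ensure_date_table(conn, "time_by_day_generated", date(2012, 1, 1), date(2014, 1, 31))
print(first, second)  # 762 0
```

The second call does nothing because the table already exists, which mirrors the "first time Mondrian reads the schema" behavior described above.
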
The columns present are all of the time-dimension domains:
Domain | Default column name | Default data type | Example | Description |
---|---|---|---|---|
JULIAN | time_id | Integer | 2454115 | Julian day number (0 = January 1, 4713 BC). Additional attribute 'epoch', if specified, changes the date at which the value is zero. |
YYMMDD | yymmdd | Integer | 120219 | Decimal date with two-digit year |
YYYYMMDD | yyyymmdd | Integer | 20120219 | Decimal date with four-digit year |
DATE | the_date | Date | 2012-12-31 | Date literal |
DAY_OF_WEEK_NAME | the_day | String | Friday | Name of day of week |
MONTH_NAME | the_month | String | December | Name of month |
YEAR | the_year | Integer | 2012 | Year |
DAY_OF_MONTH | day_of_month | String | 31 | Day ordinal within month |
WEEK_OF_YEAR | week_of_year | Integer | 53 | Week ordinal within year |
MONTH | month_of_year | Integer | 12 | Month ordinal within year |
QUARTER | quarter | String | Q4 | Name of quarter |
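
As a rough illustration of what each domain means, the sketch below computes all eleven values for a single date in Python. The function name `time_domains` is hypothetical, the English day and month names are hard-coded, and the week rule (weeks start on Sunday, and January 1 is always in week 1) is my assumption, chosen because it agrees with the sample values in this post. The optional `epoch` argument follows the description of the JULIAN domain: the value is zero on the epoch date.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]  # index matches date.weekday()
MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]
JULIAN_OFFSET = 1721425  # date.toordinal() + offset = Julian day number

def time_domains(d, epoch=None):
    """Return the time-domain values for date d, keyed by default column name."""
    if epoch is None:
        julian = d.toordinal() + JULIAN_OFFSET
    else:
        julian = (d - epoch).days  # zero on the epoch date
    # Assumed week rule: weeks start on Sunday, January 1 is in week 1.
    jan1_offset = (date(d.year, 1, 1).weekday() + 1) % 7  # Sunday = 0
    day_of_year = d.timetuple().tm_yday
    return {
        "time_id": julian,
        "yymmdd": (d.year % 100) * 10000 + d.month * 100 + d.day,
        "yyyymmdd": d.year * 10000 + d.month * 100 + d.day,
        "the_date": d,
        "the_day": DAY_NAMES[d.weekday()],
        "the_month": MONTH_NAMES[d.month - 1],
        "the_year": d.year,
        "day_of_month": str(d.day),  # a string, as in the table above
        "week_of_year": (day_of_year - 1 + jan1_offset) // 7 + 1,
        "month_of_year": d.month,
        "quarter": f"Q{(d.month - 1) // 3 + 1}",
    }

row = time_domains(date(2012, 1, 1))
print(row["time_id"], row["the_day"], row["quarter"])  # 2455928 Sunday Q1
```

A real implementation would take day names, month names, and the week rule from a locale rather than hard-coding them.
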
Suppose you wish to choose specific column names, or have more control over how values are generated. You can do that by including a `<ColumnDefs>` element within the table, and `<ColumnDef>` elements within that, just like a regular `<Table>` element. For example,

    <PhysicalSchema>
      <AutoGeneratedDateTable name='time_by_day_generated' startDate='2008-01-01' endDate='2020-01-31'>
        <ColumnDefs>
          <ColumnDef name='time_id'>
            <TimeDomain role='JULIAN' epoch='1996-01-01'/>
          </ColumnDef>
          <ColumnDef name='my_year'>
            <TimeDomain role='year'/>
          </ColumnDef>
          <ColumnDef name='my_month'>
            <TimeDomain role='MONTH'/>
          </ColumnDef>
          <ColumnDef name='quarter'/>
          <ColumnDef name='month_of_year'/>
          <ColumnDef name='week_of_year'/>
          <ColumnDef name='day_of_month'/>
          <ColumnDef name='the_month'/>
          <ColumnDef name='the_date'/>
        </ColumnDefs>
        <Key>
          <Column name='time_id'/>
        </Key>
      </AutoGeneratedDateTable>
      <!-- Other tables... -->
    </PhysicalSchema>

The first three columns have nested `<TimeDomain>` elements that tell the generator how to populate them. The other columns have the standard column name for a particular time domain, and therefore the `<TimeDomain>` element can be omitted. For instance, `<ColumnDef name='month_of_year'/>` is shorthand for

    <ColumnDef name='month_of_year' type='int'>
      <TimeDomain role="month"/>
    </ColumnDef>

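
The shorthand rule can be sketched as a lookup from default column name to time domain. The Python below is an illustration only, not how Mondrian resolves columns: `DEFAULT_ROLES` is copied from the table of domains above, `resolve_column_defs` is a hypothetical helper, and treating `role` as case-insensitive is an assumption based on the mixed-case values in the examples.

```python
import xml.etree.ElementTree as ET

# Default column name -> time domain, taken from the table of domains.
DEFAULT_ROLES = {
    "time_id": "JULIAN", "yymmdd": "YYMMDD", "yyyymmdd": "YYYYMMDD",
    "the_date": "DATE", "the_day": "DAY_OF_WEEK_NAME",
    "the_month": "MONTH_NAME", "the_year": "YEAR",
    "day_of_month": "DAY_OF_MONTH", "week_of_year": "WEEK_OF_YEAR",
    "month_of_year": "MONTH", "quarter": "QUARTER",
}

def resolve_column_defs(xml_text):
    """Return (column name, role, epoch) for each ColumnDef in the table."""
    table = ET.fromstring(xml_text)
    resolved = []
    for col in table.iter("ColumnDef"):
        name = col.get("name")
        domain = col.find("TimeDomain")
        if domain is not None:
            # Assumption: role names are matched case-insensitively.
            role = domain.get("role").upper()
            epoch = domain.get("epoch")
        elif name in DEFAULT_ROLES:
            role, epoch = DEFAULT_ROLES[name], None  # the shorthand case
        else:
            raise ValueError(f"column {name!r} needs an explicit TimeDomain")
        resolved.append((name, role, epoch))
    return resolved

SCHEMA = """
<AutoGeneratedDateTable name='time_by_day_generated'
    startDate='2008-01-01' endDate='2020-01-31'>
  <ColumnDefs>
    <ColumnDef name='time_id'>
      <TimeDomain role='JULIAN' epoch='1996-01-01'/>
    </ColumnDef>
    <ColumnDef name='my_year'>
      <TimeDomain role='year'/>
    </ColumnDef>
    <ColumnDef name='month_of_year'/>
  </ColumnDefs>
</AutoGeneratedDateTable>
"""

for column in resolve_column_defs(SCHEMA):
    print(column)
```

A column such as `my_year` has no entry in the lookup, so without its nested `<TimeDomain>` the sketch raises an error rather than guessing a domain.
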
The nested `<Key>` element makes that column valid as the target of a link (from a foreign key in the fact table, for instance), and also declares the column as a primary key in the CREATE TABLE statement. This has the pleasant side-effect, on all databases I know of, of creating an index. If you need other indexes on the generated table, create them manually.
The `<TimeDomain>` element could be extended further. For instance, we could add a locale attribute. This would allow different translations of month and weekday names, and also support locale-specific differences in how week-of-year and day-of-week numbers are calculated.
Note that this functionality is checked into the mondrian-lagunitas branch, so will only be available as part of Mondrian version 4. That release is still pre-alpha. We recently started to regularly build the branch using Jenkins, and you should see the number of failing tests dropping steadily over the next weeks and months. Already over 80% of tests pass, so it's worth downloading the latest build to kick the tires on your application.