In this blogpost I show how to calculate the correlation in DAX. This post will be refined in the future, also to show the comparison with R. The code is shown so you see how to run it in DAX studio. We will investigate the correlation between visits and sales. We define the standard deviation for visits:
DEFINE
MEASURE Visits[sdtotal_visits] =
CALCULATE (
STDEVX.P ( VALUES ( 'Visit Date'[Month] ), [Completed Visits] ),
ALLSELECTED ( 'Visit Date' )
)
We define the mean for visits:
MEASURE Visits[meantotal_visits] =
CALCULATE (
AVERAGEX ( VALUES ( 'Visit Date'[Month] ), [Completed Visits] ),
ALLSELECTED ( 'Visit Date' )
)
Same for value
MEASURE Visits[sdtotal_ov] =
CALCULATE (
STDEVX.P ( VALUES ( 'Visit Date'[Month] ), [Order Value] ),
ALLSELECTED ( 'Visit Date' )
)
MEASURE Visits[meantotal_ov] =
CALCULATE (
AVERAGEX ( VALUES ( 'Visit Date'[Month] ), [Order Value] ),
ALLSELECTED ( 'Visit Date' )
)
We multiply the two standard deviations:
MEASURE Visits[sdsd] =
CALCULATE ( [sdtotal_visits] * [sdtotal_ov] )
We calculate the deviation from the mean for each measure:
MEASURE Visits[afwijking_visits] =
CALCULATE ( ( [Completed Visits] - [meantotal_visits] ) )
MEASURE Visits[afwijking_value] =
CALCULATE ( ( [Order Value] - [meantotal_ov] ) )
We calculate the average of the product of the two deviations:
MEASURE Visits[avgproduct] =
CALCULATE (
AVERAGEX (
VALUES ( 'Visit Date'[Month] ),
[afwijking_visits] * [afwijking_value]
),
ALLSELECTED ( 'Visit Date' )
)
We devide this value of the product of the two SD's
MEASURE Visits[correlation2] =
CALCULATE ( [avgproduct] / [sdsd] )
Now the query to view the results:
EVALUATE
CALCULATETABLE (
ADDCOLUMNS (
SUMMARIZE ( 'Visit Date', 'Visit Date'[Month] ),
"Visits", [Completed Visits],
"order value", [Order Value],
"correlation2", [correlation2]
),
'Visit Date'[Month Key] = "2015M12"
|| 'Visit Date'[Month Key] = "2016M01"
|| 'Visit Date'[Month Key] = "2016M02"
)
No comments:
Post a Comment