I tried to tacle the above drawbacks and this resulted in the following formula:
Correlation =
var cv = VALUES ( Data[Perioden] )
var cas = ALLSELECTED ( Data[Perioden] )
var x_sd = CALCULATE (STDEVX.P ( cv, [x]), cas )
var y_sd = CALCULATE (STDEVX.P ( cv, [y]), cas )
var SD_Product = CALCULATE ( x_sd * y_sd )
var x_mean = CALCULATE ( AVERAGEX ( cv, [x] ), cas )
var y_mean = CALCULATE (AVERAGEX ( cv, [y] ), cas )
var vDiff_Mean_Product =
CALCULATE ( AVERAGEX (
cv,
( [y] - y_mean ) * ([x] - x_mean )
),
cas
)
return
CALCULATE ( vDiff_Mean_Product / SD_Product)
All you need to do is replace X and Y with the (calculated) measures for which you would like to know the correlation.
The first two variables cs and cas represent the dimension that define the array of values used to calculate the correlation. So your slicers could result in subset for a certain product, geo location, etc. Then with that selection made you would like to know the correlation over a selected period. In that case period is respresented by 'Data[Perioded]'.
By using variables, performance might be improved because the number of times that specific calculation is done will be limited.
I hope you find this calculation usefull. Suggestions and feedback are much appreciated.
No comments:
Post a Comment