Presence of null values skews the results and lead to inaccurate performance assessments.For an instance, if a KPI involves averaging values and some data points are null it can change the picture of the dataset and distort the overall performance of an existing pipeline.
To avoid such instances we can implement common pratices in handling nulls during KPI calculations.
- Substitute null values with zeros to maintain consistency in calculations.
We can replace NULL values using the COALESCE()
function
SELECT NULLIFZERO(0);
+ — — — — — — — -+
| NULLIFZERO(0) |
| — — — — — — — -|
| NULL |
+ — — — — — — — -+
NVL : If expr1
is NULL, returns expr2
, otherwise returns expr1
.
NVL2 :Returns values depending on whether the first input is NULL:
- If
expr1
is NOT NULL, then NVL2 returnsexpr2
. - If
expr1
is NULL, then NVL2 returnsexpr3
.
SELECT column1, ZEROIFNULL(column1)
FROM VALUES (1), (null), (5), (0), (3.14159);
+ — — — — -+ — — — — — — — — — — -+
| COLUMN1 | ZEROIFNULL(COLUMN1) |
| — — — — -+ — — — — — — — — — — -|
| 1.00000 | 1.00000 |
| NULL | 0.00000 |
| 5.00000 | 5.00000 |
| 0.00000 | 0.00000 |
| 3.14159 | 3.14159 |
+ — — — — -+ — — — — — — — — — — -+
2.Consider omiting records with null values after performing the data integration from source to target.To perform this action an extra layer on Target would be benifical as it wouldn’t lead to data loss in case of futher use.
3.Conduction analysis on KPI’s to access the impact of different approaches on KPI results like below:
Arithmetic operations involving NULL
always return NULL
.
Check the join conditions.