Video upload date:  · Duration: PT1H46M27S  · Language: EN

A data scientist is trying to use Spark ML databricks video



Full Certification Question

A data scientist is trying to use Spark ML to fill in missing values in their PySpark DataFrame 'features_df'. They want to replace the missing values in all numeric columns in 'features_df' with the median value of each corresponding numeric column. However, the code they have written does not perform the task correctly. Can you identify the reason why the code is not performing the imputation task as intended?

    my_imputer = imputer(strategy="median", inputCols=input_columns, outputCols=output_columns)
    imputed_df = my_imputer.transform(features_df)