


Full Certification Question

A data scientist is trying to use Spark ML to fill in missing values in their PySpark DataFrame 'features_df'. They want to replace the missing values in all numeric columns in 'features_df' with the median value of each corresponding numeric column. However, the code they have written does not perform the task correctly. Can you identify the reason why the code is not performing the imputation task as intended?

    my_imputer = imputer(strategy="median", inputCols=input_columns, outputCols=output_columns)
    imputed_df = my_imputer.transform(features_df)
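A hint toward the answer: in Spark ML, Imputer (pyspark.ml.feature.Imputer) is an Estimator. It only holds configuration until fit() is called on a DataFrame, and fit() returns an ImputerModel, which is the object that provides transform(). The snippet in the question skips the fit step. The sketch below mimics that two-step pattern in plain Python so it runs without a Spark cluster. ToyImputer, ToyImputerModel, _is_missing, and the sample rows are hypothetical names invented for illustration, not part of Spark, and the toy computes an exact median whereas Spark's median strategy is approximate.

```python
import math
import statistics


def _is_missing(value):
    # Treat both None and float NaN as missing (illustrative assumption).
    return value is None or (isinstance(value, float) and math.isnan(value))


class ToyImputerModel:
    """Fitted state: one learned median per column. Plays the transformer role."""

    def __init__(self, medians):
        self.medians = medians

    def transform(self, rows):
        # Return new rows; missing entries are replaced by the learned median.
        out = []
        for row in rows:
            new_row = dict(row)
            for col, median in self.medians.items():
                if _is_missing(new_row[col]):
                    new_row[col] = median
            out.append(new_row)
        return out


class ToyImputer:
    """Unfitted estimator: configuration only, so it deliberately has no transform()."""

    def __init__(self, input_cols):
        self.input_cols = input_cols

    def fit(self, rows):
        # Learn the median of the non-missing values in each configured column.
        medians = {}
        for col in self.input_cols:
            present = [r[col] for r in rows if not _is_missing(r[col])]
            medians[col] = statistics.median(present)
        return ToyImputerModel(medians)


# Hypothetical sample data standing in for features_df.
features = [
    {"age": 30.0, "income": None},
    {"age": None, "income": 50.0},
    {"age": 40.0, "income": 70.0},
    {"age": 50.0, "income": 90.0},
]

my_imputer = ToyImputer(["age", "income"])
imputer_model = my_imputer.fit(features)    # step 1: learn the medians
imputed = imputer_model.transform(features) # step 2: apply them
```

With the real PySpark API the same shape would be a fit call on the imputer followed by a transform call on the returned model, for example my_imputer.fit(features_df).transform(features_df).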