What should the Specialist do to optimize the data for training on SageMaker?
Use the SageMaker batch transform feature to transform the training data into a DataFrame.
Use AWS Glue to compress the data into the Apache Parquet format.
Transform the dataset into the RecordIO protobuf format.
Use the SageMaker hyperparameter optimization feature to automatically optimize the data.
Explanations:
SageMaker batch transform is an inference feature: it runs predictions over an entire dataset with an already trained model. It does not convert training data into a different format.
AWS Glue can convert data to Apache Parquet, a columnar format well suited to analytics queries, but Parquet is not the format that most SageMaker built-in algorithms are optimized to train on.
Most SageMaker built-in algorithms train most efficiently on the RecordIO protobuf format. Data in this format can be streamed from Amazon S3 in Pipe mode, so training can start without first downloading the entire dataset to the training instance.
The SageMaker hyperparameter optimization feature searches for the best hyperparameter values by running multiple training jobs. It does not change the format of the training data.
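To make the RecordIO answer more concrete, the following is a minimal sketch of the outer RecordIO framing only, written with the Python standard library. It assumes each record is a magic number, a 32-bit length, the payload, and zero padding to a 4-byte boundary, with little-endian integers and payloads small enough that the length field carries no continuation flag. The payloads here are placeholder byte strings; in real training data each payload would be a serialized protobuf record, and in practice a helper such as `write_numpy_to_dense_tensor` in the SageMaker Python SDK's `sagemaker.amazon.common` module would normally do the conversion. The function names `write_record` and `read_records` are hypothetical.

```python
import io
import struct

# Magic number assumed to open every RecordIO record.
RECORDIO_MAGIC = 0xCED7230A


def write_record(stream, payload: bytes) -> None:
    """Append one record: magic, payload length, payload, zero padding."""
    stream.write(struct.pack("<I", RECORDIO_MAGIC))
    stream.write(struct.pack("<I", len(payload)))
    stream.write(payload)
    # Pad with zero bytes so the next record starts on a 4-byte boundary.
    stream.write(b"\x00" * (-len(payload) % 4))


def read_records(stream):
    """Yield each payload in order until the stream is exhausted."""
    while True:
        header = stream.read(8)
        if not header:
            return
        if len(header) < 8:
            raise ValueError("truncated RecordIO header")
        magic, length = struct.unpack("<II", header)
        if magic != RECORDIO_MAGIC:
            raise ValueError("not a RecordIO record")
        payload = stream.read(length)
        stream.read(-length % 4)  # skip the padding bytes
        yield payload


# Placeholder payloads standing in for serialized protobuf records.
samples = [b"row-1", b"row-22", b"row-333"]

buffer = io.BytesIO()
for sample in samples:
    write_record(buffer, sample)

buffer.seek(0)
decoded = list(read_records(buffer))
```

Because every record carries its own length, a reader can consume the file sequentially as a stream, which is the property that makes this kind of format a good fit for streaming training input.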