A machine learning specialist is training on a classification model that classifies different types of sea creatures from their physical and biological features. The specialist has a massive data which contains different features for each creature. The dataset is currently on the specialist’s private S3 bucket. It has over 20,000 rows where each row contains 35 different features of the creature and a label containing the target “Name of creature”. There are 20 different creatures such that each one has exactly 1000 high-resolution images. The dataset is ordered such that row 1-1000 contains the first creature, row 1001-2000 contains the second creature and so on. The specialist noticed that after 20 epochs using stochastic gradient descent, the model cannot extract a pattern to differentiate those creatures. What is the most probable solution to this problem?