In this project, we study the impact of optimization methods on deep learning tasks, focusing in particular on adaptive learning rate optimizers (e.g., AdaGrad, RMSProp, and Adam): we describe each method and discuss its strengths, weaknesses, and the scenarios in which it excels or underperforms. We take an experimental approach to analyze their performance, generalization, computational efficiency, and hyperparameter sensitivity, comparing the adaptive optimizers against a traditional method (SGD) and a machine learning model that requires no hyperparameter tuning (LDA). Our empirical results show that Adam performs best on both the training and test sets in terms of accuracy, convergence speed, generalization, and computational efficiency.
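To make the adaptive learning rate idea concrete, the following is a minimal sketch (not the project's implementation) of a single Adam update step on one parameter, using the standard defaults from the original Adam paper; the variable names (`theta`, `m`, `v`, `t`) are illustrative:

```python
import math

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter.

    m, v are the running first/second moment estimates;
    t is the (1-indexed) step count used for bias correction.
    Returns the updated (theta, m, v).
    """
    # Update biased moment estimates from the current gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias-correct: early steps would otherwise be shrunk toward zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Per-parameter step size: scaled by the second-moment estimate,
    # which is what makes the learning rate "adaptive".
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Example: one step from theta = 1.0 with gradient 2.0 and lr = 0.1.
theta, m, v = adam_step(1.0, 2.0, m=0.0, v=0.0, t=1, lr=0.1)
```

Note that after the first step the bias-corrected update reduces to roughly `lr * sign(grad)`, which illustrates why Adam's effective step size is largely insensitive to the raw gradient scale.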