Ultrasound (US) is commonly used for the diagnosis of liver masses. Ensemble learning has been widely used for image classification, but its methods have not been fully optimized. This study investigated the usefulness of ensemble learning and compared several ensemble learning techniques that combine multiple trained convolutional neural network (CNN) models for image classification of liver masses in US images. The US imaging data set was classified into four categories: benign liver tumor (BLT, 6320 images), liver cyst (LCY, 2320 images), metastatic liver cancer (MLC, 9720 images), and primary liver cancer (PLC, 7840 images). For each class, 250 test images were randomly selected, for a total of 1000 test images, and the remaining images were used for training. Sixteen different CNNs were trained and tested on the US images. All four ensemble learning methods evaluated, soft voting (SV), stacking (ST), weighted average voting (WAV), and weighted hard voting (WHV), showed greater accuracy than a single CNN. All four methods also showed significantly better performance than ResNeXt101 alone. For image classification of liver masses using US images, ensemble learning improved the performance of deep learning over a single CNN.
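To make the voting schemes named above concrete, the following is a minimal sketch of soft voting, weighted average voting, and weighted hard voting for a single image, written in plain Python. It follows the textbook definitions of these schemes and is not necessarily the exact formulation used in this study. The function names, the three example models, their probability outputs, and the weights are all hypothetical. Stacking is omitted because it requires training a separate meta-learner on the CNN outputs.

```python
from typing import List, Sequence

# The four categories of the data set, in a fixed order.
CLASSES = ["BLT", "LCY", "MLC", "PLC"]


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def weighted_average_vote(probs: List[Sequence[float]],
                          weights: Sequence[float]) -> int:
    """Average the per-class probabilities with one weight per model."""
    total = sum(weights)
    n_classes = len(probs[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probs)) / total
        for c in range(n_classes)
    ]
    return argmax(averaged)


def soft_vote(probs: List[Sequence[float]]) -> int:
    """Soft voting: the weighted average with equal weights for all models."""
    return weighted_average_vote(probs, [1.0] * len(probs))


def weighted_hard_vote(probs: List[Sequence[float]],
                       weights: Sequence[float]) -> int:
    """Each model casts its weight as a vote for its top-scoring class."""
    tally = [0.0] * len(probs[0])
    for w, p in zip(weights, probs):
        tally[argmax(p)] += w
    return argmax(tally)


# Hypothetical softmax outputs of three CNNs for one image (assumed values).
example_probs = [
    [0.40, 0.05, 0.35, 0.20],  # model A favors BLT
    [0.10, 0.05, 0.60, 0.25],  # model B favors MLC
    [0.45, 0.05, 0.30, 0.20],  # model C favors BLT
]
# Hypothetical per-model weights, e.g. each model's validation accuracy.
example_weights = [0.80, 0.90, 0.70]

print(CLASSES[soft_vote(example_probs)])                               # MLC
print(CLASSES[weighted_average_vote(example_probs, example_weights)])  # MLC
print(CLASSES[weighted_hard_vote(example_probs, example_weights)])     # BLT
```

In this made-up example the probability-averaging schemes and the hard-voting scheme disagree: two models narrowly prefer BLT, but the third is confident enough in MLC to dominate the averaged probabilities. This illustrates why the choice of ensemble method can change the final classification.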