Bipolar disorder is a severe mood disorder and one of the top 20 causes of disability worldwide, imposing a heavy burden on society. In this study, prediction models for bipolar disorder were constructed based on the concept of knowledge distillation. The input data consisted of patients with bipolar disorder and matched controls, all selected from the open database MIMIC. Kernel density estimation (KDE) was used to generate probability density functions (PDFs) that characterize the distributions of the input data. The PDF values, referred to as soft labels, were combined with the input data to construct prediction models of bipolar disorder using a decision tree and an artificial neural network, respectively. According to the evaluation results, the indicators for identifying positive samples of bipolar disorder improved, as did the indicators for identifying negative samples. In addition, the branching attributes selected by the decision trees can be mapped back to specific disease diagnoses, all of which are associated with bipolar disorder. In conclusion, using KDE to generate soft-label information for the input data makes knowledge distillation workable and improves the performance of prediction models for bipolar disorder.
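
The KDE soft-label step described above can be illustrated with a minimal sketch. It is one possible reading of the approach, not the study's actual pipeline. The sketch makes several assumptions: scikit-learn is used, one Gaussian KDE is fitted per class, log-density values are appended as extra feature columns, and the data are synthetic rather than drawn from MIMIC. The helper names `kde_soft_labels` and `augment`, the bandwidth, and the tree depth are hypothetical choices made only for illustration.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.tree import DecisionTreeClassifier


def kde_soft_labels(X_fit, y_fit, X_eval, bandwidth=1.0):
    """Return one log-density column per class for the rows of X_eval.

    Assumption: a separate Gaussian KDE is fitted on the rows of each class.
    np.exp of a column recovers the PDF values themselves.
    """
    columns = []
    for cls in np.unique(y_fit):
        kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
        kde.fit(X_fit[y_fit == cls])
        columns.append(kde.score_samples(X_eval))  # log of the estimated PDF
    return np.column_stack(columns)


def augment(X_fit, y_fit, X_eval, bandwidth=1.0):
    """Append the KDE soft-label columns to the original features."""
    soft = kde_soft_labels(X_fit, y_fit, X_eval, bandwidth)
    return np.hstack([X_eval, soft])


# Synthetic stand-in data (NOT MIMIC): two Gaussian clusters, 4 features each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(1.0, 1.0, size=(200, 4)),   # hypothetical "patient" rows
    rng.normal(-1.0, 1.0, size=(200, 4)),  # hypothetical "control" rows
])
y = np.array([1] * 200 + [0] * 200)

# Random train/test split.
idx = rng.permutation(len(y))
train, test = idx[:300], idx[300:]

# Fit the KDEs on training rows only, then score both splits.
X_train_aug = augment(X[train], y[train], X[train])
X_test_aug = augment(X[train], y[train], X[test])

# Train a decision tree on the original features plus soft labels.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train_aug, y[train])
accuracy = model.score(X_test_aug, y[test])
print(f"held-out accuracy: {accuracy:.3f}")
```

In this sketch the density estimators are fitted on the training rows only, so the soft labels for the held-out rows do not depend on their true classes. An artificial neural network could be trained on the same augmented matrix in place of the decision tree.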