3D object reconstruction from depth image streams captured by Kinect-style depth cameras has been extensively studied. We propose an approach for accurate camera tracking and dense volumetric surface reconstruction, assuming that a known cuboid reference object is present in the scene. Our contribution is threefold: (a) we maintain drift-free camera pose tracking by incorporating the 3D geometric constraints of the cuboid reference object into the image registration process; (b) we reformulate depth stream fusion as a binary classification problem, enabling high-fidelity surface reconstruction, especially in concave zones of the objects; (c) we present a surface denoising strategy that facilitates the generation of noise-free triangle meshes, making the models more suitable for 3D printing and other applications. We extend our public dataset CU3D with several new image sequences, test our algorithm on these sequences, and compare the results with those of other state-of-the-art algorithms. Both our dataset and algorithm are available as open source at https://github.com/zhangxaochen/CuFusion, so that other researchers can reproduce and verify our results.