Data science techniques are increasingly employed in modern manufacturing systems to improve process efficiency, reduce energy consumption and operating costs, enable active process control, ensure consistent product quality, and support predictive maintenance. A central question arising from recent developments is: how can data models fundamentally transform manufacturing processes, and what are the primary barriers to their widespread adoption? Manufacturing sectors are progressively integrating data models into digital twin and digital shadow frameworks to enable real-time process optimization and data-informed decision-making. However, the inherent complexity of manufacturing processes, combined with the frequent scarcity of high-quality, balanced datasets, often limits the generalizability and interpretability of purely data-driven models. In practice, the quality, contextual relevance, representativeness, and richness of the data matter considerably more than its sheer volume when developing robust and reliable models. This paper provides a comprehensive overview of the application of data modeling in dynamic manufacturing environments. It examines data generation, sampling strategies, data preprocessing and handling, and model development methodologies across steady-state, transient, and generative process regimes.