As existing systems and services shift to non-face-to-face models, the concert industry is also moving toward virtual formats. Physical concerts require venues that can hold anywhere from hundreds to tens of thousands of spectators, whereas non-face-to-face methods still face various limitations in accommodating audiences of that size. Moreover, to raise the satisfaction of virtual concert attendees to the level of real-world concerts, it is important to implement interaction between performers and audiences. Modern metaverse platforms apply cutting-edge network technologies to accommodate numerous users within a single channel, and many researchers are adopting network technologies such as SDN (Software-Defined Networking) and CDN (Content Delivery Network) to build virtual concerts that can accommodate large audiences. In this paper, we propose a network framework for composing virtual concerts. In particular, we separate out a channel dedicated to interaction data so that performers and audiences can exchange interactions in an immersive way. Because a massive audience transmitting interaction data to the performer in a 1:N format can cause problems both in accepting the incoming data and in latency, we introduce a channel concept called a 'Zone' and propose an interaction data channel network framework that does not compromise immersion. The proposed framework transmits interaction data efficiently by combining network technologies for metaverse platforms, such as XDN, with clustering algorithms such as fuzzy c-means. We also propose a CDN-based architecture that can ensure low latency when performers transmit interaction data to the audience.
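To make the role of fuzzy c-means concrete, the following is a minimal sketch of how audience members could be grouped into Zones. It assumes, purely for illustration, that each audience member is described by a 2-D position in the virtual venue; the abstract does not specify which features are clustered. The function name `fuzzy_c_means`, its parameters, and the sample coordinates are all hypothetical. The sketch implements the standard fuzzy c-means updates and is not the framework's actual implementation.

```python
import math
import random


def fuzzy_c_means(points, n_zones, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length coordinate tuples.

    Returns (centers, memberships). memberships[i][k] is the degree to which
    point i belongs to zone k, and every row sums to 1.
    The fuzzifier m must be greater than 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_zones)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centre update: mean of all points, weighted by membership ** m.
        centers = []
        for k in range(n_zones):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))

        # Membership update from relative distances to every centre.
        max_change = 0.0
        for i, p in enumerate(points):
            # Clamp distances so a point sitting on a centre cannot divide by zero.
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            new_row = [
                1.0 / sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                          for j in range(n_zones))
                for k in range(n_zones)
            ]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once memberships have effectively stopped changing.
        if max_change < tol:
            break

    return centers, u


# Hypothetical avatar positions forming two groups in the venue.
audience = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
zone_centers, memberships = fuzzy_c_means(audience, n_zones=2)

# A hard Zone assignment can be taken as the zone with the highest membership.
zone_of = [row.index(max(row)) for row in memberships]
print(zone_of)
```

Because memberships are soft, an audience member near a boundary keeps a non-trivial degree of membership in more than one Zone, which a framework of this kind could use when deciding where to route that member's interaction data.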