Redundancy Avoidance for Big Data in Data Centers: A Conventional Neural Network Approach


As innovative data collection technologies are applied to every aspect of our society, data volume is skyrocketing. This growth poses tremendous challenges to data centers, particularly with respect to storage. In this paper, a hybrid-stream big data analytics model is proposed to perform multimedia big data analysis. The model comprises four procedures: data pre-processing, data classification, data recognition, and data load reduction. Specifically, a multi-dimensional Convolutional Neural Network (CNN) is proposed to assess the importance of each video frame, so that unimportant frames can be dropped by a reliable decision-making algorithm. To preserve video quality, minimal correlation and minimal redundancy (MCMR) criteria are combined to optimize the decision-making algorithm. Simulation results show that the amount of processed video is significantly reduced while video quality is preserved thanks to MCMR. The simulations also indicate that the proposed model performs steadily and is robust enough to scale up to accommodate the surge of big data in data centers.
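The frame-dropping idea described above can be illustrated with a minimal sketch. It is not the paper's MCMR formulation, which this summary does not specify. As a simplified stand-in, it assumes each frame is a flat list of pixel values with a precomputed importance score, and it drops a frame when the score is low or when the frame is highly correlated with the last kept frame. The names `pearson`, `select_frames`, `min_importance`, and `max_corr` are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation between two equally sized frames."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumption: a constant frame only counts as redundant if identical.
        return 1.0 if list(a) == list(b) else 0.0
    return cov / sqrt(var_a * var_b)


def select_frames(
    frames: Sequence[Sequence[float]],
    importance: Sequence[float],
    min_importance: float = 0.5,  # hypothetical threshold
    max_corr: float = 0.9,        # hypothetical redundancy cap
) -> List[int]:
    """Return indices of frames that are important and not redundant."""
    kept: List[int] = []
    for idx, (frame, score) in enumerate(zip(frames, importance)):
        if score < min_importance:
            continue  # judged unimportant
        if kept and abs(pearson(frame, frames[kept[-1]])) > max_corr:
            continue  # nearly duplicates the last kept frame
        kept.append(idx)
    return kept


frames = [
    [0.1, 0.2, 0.3, 0.4],  # kept
    [0.1, 0.2, 0.3, 0.5],  # near-duplicate of frame 0
    [0.9, 0.1, 0.8, 0.2],  # different content, kept
    [0.5, 0.5, 0.4, 0.6],  # low importance score
]
importance = [0.9, 0.8, 0.7, 0.2]  # illustrative scores, not CNN output
kept = select_frames(frames, importance)
print(kept)
```

In this toy input, frame 1 is dropped as redundant and frame 3 as unimportant, so only frames 0 and 2 remain.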

Existing System:

A strategy that reduces the transmission load at data centers and makes better use of the storage system is required to address these problems. Some studies on reducing the transmission load have focused on optimizing route selection as well as detecting and dropping anomalous traffic. The vertical handoff (VHO) decision algorithm performs well in heterogeneous wireless networks [13], but when the time dimension is considered in multimedia transmission and storage, VHO becomes extremely complex. A Convolutional Neural Network (CNN), which incorporates pooling, can improve generalization on pattern recognition problems by sharing weights and biases. The Hybrid Convolutional Neural Network (HCNN) combines a CNN with a winner-takes-all mechanism to further boost recognition speed [14].
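To make the weight-sharing and pooling remark concrete, here is a minimal pure-Python sketch of one convolution layer followed by max pooling. It is a generic illustration rather than the network used in the paper: the kernel values, the ReLU activation, and the 2x2 pooling window are assumptions chosen for the example.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix, bias: float = 0.0) -> Matrix:
    """Slide ONE shared kernel and ONE shared bias over every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out: Matrix = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = bias
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(max(acc, 0.0))  # ReLU activation (assumed)
        out.append(row)
    return out


def max_pool2d(fmap: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    out: Matrix = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            row.append(
                max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            )
        out.append(row)
    return out


# Toy 5x5 frame with a vertical edge between columns 1 and 2.
frame = [[0, 0, 1, 1, 1] for _ in range(5)]
# Four weights reused at all 16 output positions (hypothetical edge detector).
edge_kernel = [[-1, 1], [-1, 1]]

features = conv2d_valid(frame, edge_kernel)  # 4x4 feature map
pooled = max_pool2d(features)                # 2x2 after pooling
print(features)
print(pooled)
```

The same four kernel weights produce every entry of the feature map, which is the weight sharing the text refers to, and pooling keeps the edge response while shrinking the map from 4x4 to 2x2.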

Proposed System:

This paper focuses on finding a proper proportion of redundant data to remove, so that the loss of important data is kept low. Specifically, four main steps are presented: Video Pre-processing, Frame Classification, Frame-Load-Reduction Processing, and Video Decision. Pre-processing provides an intermediate layer that adapts input video streams to our model. The purpose of Frame Classification is to evaluate the significance of each frame. At the same time, Frame-Load-Reduction Processing and Video Decision are proposed to perform data redundancy avoidance and to ensure that vital videos are not dropped. Overall, our work focuses on enabling real-time video processing by improving storage efficiency, so that useful data is stored, and by further relieving multimedia data redundancy in data centers.
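The four steps listed above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the CNN classifier is replaced by a hypothetical `brightness_scorer`, the thresholds `threshold` and `vital_ratio` are invented for the example, and the rule that a video with mostly significant frames is kept whole is one possible reading of "ensure that vital videos are not dropped", not the paper's algorithm.

```python
from typing import Callable, List, Sequence

Frame = List[float]


def preprocess(raw_frames: Sequence[Sequence[float]], max_value: float = 255.0) -> List[Frame]:
    """Video Pre-processing: scale raw pixel values to the range [0, 1]."""
    return [[p / max_value for p in frame] for frame in raw_frames]


def brightness_scorer(frame: Frame) -> float:
    """Hypothetical stand-in for the CNN: mean pixel intensity."""
    return sum(frame) / len(frame)


def classify(frames: Sequence[Frame], scorer: Callable[[Frame], float]) -> List[float]:
    """Frame Classification: assign a significance score to every frame."""
    return [scorer(frame) for frame in frames]


def reduce_load(scores: Sequence[float], threshold: float) -> List[int]:
    """Frame-Load-Reduction: keep only frames scoring at or above threshold."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def video_decision(scores: Sequence[float], kept: List[int],
                   threshold: float, vital_ratio: float) -> List[int]:
    """Video Decision (assumed rule): keep a mostly significant video whole."""
    important = sum(1 for s in scores if s >= threshold)
    if scores and important / len(scores) >= vital_ratio:
        return list(range(len(scores)))
    return kept


def run_pipeline(raw_frames: Sequence[Sequence[float]],
                 scorer: Callable[[Frame], float] = brightness_scorer,
                 threshold: float = 0.5,
                 vital_ratio: float = 0.75) -> List[int]:
    frames = preprocess(raw_frames)
    scores = classify(frames, scorer)
    kept = reduce_load(scores, threshold)
    return video_decision(scores, kept, threshold, vital_ratio)


ordinary_video = [[200, 220], [10, 20], [255, 255], [30, 40]]
vital_video = [[200, 200], [255, 255], [230, 240], [250, 250], [10, 10]]

ordinary_kept = run_pipeline(ordinary_video)  # two weak frames dropped
vital_kept = run_pipeline(vital_video)        # 4 of 5 frames significant
print(ordinary_kept, vital_kept)
```

Passing the scorer as a parameter keeps the pipeline independent of the classifier, so a trained model could replace `brightness_scorer` without changing the other stages.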


In this paper, a hybrid-stream big data analytics model has been proposed to enhance classification precision and relieve the network and storage overload of data centers. The model speeds up video processing by recognizing the important frames in every video and deciding whether to drop the unimportant ones. Whereas conventional deep learning methods mainly address image analysis problems, this paper extends the approach to video analysis. In addition, the network and storage overload caused by video is treated as an optimization problem, which yields a practical algorithm for large-scale real-time data from numerous nodes. The simulations show that our model performs well on most of the data sets. Moreover, the hybrid-stream big data analytics model and the improved video recognition algorithm can deliver a fairly good video stream and save storage space in the Internet of Things. The model reduces network and storage overload without discarding the truly important videos.


[1] J. Wu, I. Bisio, C. Gniady, E. Hossain, M. Valla, and H. Li, “Context-aware networking and communications: Part 1 [guest editorial],” IEEE Communications Magazine, vol. 52, no. 6, pp. 14–15, June 2014.

[2] X. He, K. Wang, H. Huang, and B. Liu, “QoE-driven big data architecture for smart city,” IEEE Communications Magazine, vol. 56, no. 2, pp. 88–93, Feb 2018.

[3] C. Ge, Z. Sun, N. Wang, K. Xu, and J. Wu, “Energy management in cross-domain content delivery networks: A theoretical perspective,” IEEE Transactions on Network and Service Management, vol. 11, no. 3, pp. 264–277, Sept 2014.

[4] K. Wang, Y. Shao, L. Shu, C. Zhu, and Y. Zhang, “Mobile big data fault-tolerant processing for ehealth networks,” IEEE Network, vol. 30, no. 1, pp. 36–42, January 2016.

[5] H. Jiang, K. Wang, Y. Wang, M. Gao, and Y. Zhang, “Energy big data: A survey,” IEEE Access, vol. 4, pp. 3844–3861, 2016.

[6] L. Gu, D. Zeng, S. Guo, Y. Xiang, and J. Hu, “A general communication cost optimization framework for big data stream processing in geo-distributed data centers,” IEEE Transactions on Computers, vol. 65, no. 1, pp. 19–29, Jan 2016.

[7] Z. B. Liu, “The realization of virtual storage networking based on asymmetric architecture,” Computer Science, vol. 31, no. 6, pp. 52–55, 2004.

[8] C. F. Lai, Y. X. Lai, M. S. Wang, and J. W. Niu, “An adaptive energy-efficient stream decoding system for cloud multimedia network on multicore architectures,” IEEE Systems Journal, vol. 8, no. 1, pp. 194–201, March 2014.

[9] J. Wu, S. Guo, J. Li, and D. Zeng, “Big data meet green challenges: Big data toward green applications,” IEEE Systems Journal, vol. 10, no. 3, pp. 888–900, Sept 2016.

[10] K. Wang, Y. Wang, X. Hu, Y. Sun, D. J. Deng, A. Vinel, and Y. Zhang, “Wireless big data computing in smart grid,” IEEE Wireless Communications, vol. 24, no. 2, pp. 58–64, April 2017.

[11] K. Wang, H. Li, Y. Feng, and G. Tian, “Big data analytics for system stability evaluation strategy in the energy internet,” IEEE Transactions on Industrial Informatics, vol. 13, no. 4, pp. 1969–1978, Aug 2017.

[12] H. Liu, S. Liu, X. Meng, C. Yang, and Y. Zhang, “LBVS: A load balancing strategy for virtual storage,” in 2010 International Conference on Service Sciences, May 2010, pp. 257–262.

[13] S. Lee, K. Sriram, K. Kim, Y. H. Kim, and N. Golmie, “Vertical handoff decision algorithms for providing optimized performance in heterogeneous wireless networks,” IEEE Transactions on Vehicular Technology, vol. 58, no. 2, pp. 865–881, Feb 2009.

[14] K. Wang, J. Mi, C. Xu, Q. Zhu, L. Shu, and D.-J. Deng, “Real-time load reduction in multimedia big data for mobile internet,” ACM Trans. Multimedia Comput. Commun. Appl., vol. 12, no. 5s, pp. 76:1–76:20, Oct 2016.

[15] K. Simonyan and A. Zisserman, “Two-stream convolutional networks for action recognition in videos,” in Proceedings of the 27th International Conference on Neural Information Processing Systems, ser. NIPS’14. Cambridge, MA, USA: MIT Press, 2014, pp. 568–576.

[16] H. Ye, Z. Wu, R.-W. Zhao, X. Wang, Y.-G. Jiang, and X. Xue, “Evaluating two-stream CNN for video classification,” in Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ser. ICMR ’15. New York, NY, USA: ACM, 2015, pp. 435–442.

[17] B. Prabavathy, K. Priya, and C. Babu, “A load balancing algorithm for private cloud storage,” in 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), July 2013, pp. 1–6.

[18] L. Zhou, Y. C. Wang, J. L. Zhang, J. Wan, and Y. J. Ren, “Optimize block-level cloud storage system with load-balance strategy,” in 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops PhD Forum, May 2012, pp. 2162–2167.

[19] M. Baktashmotlagh, M. Harandi, B. C. Lovell, and M. Salzmann, “Discriminative non-linear stationary subspace analysis for video classification,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 12, pp. 2353–2366, Dec 2014.

[20] X. He, K. Wang, H. Huang, T. Miyazaki, Y. Wang, and S. Guo, “Green resource allocation based on deep reinforcement learning in content-centric IoT,” IEEE Transactions on Emerging Topics in Computing, pp. 1–1, 2018.