Current Issue
20 April 2025, Volume 52 Issue 2
  • Effective adversarial optical attacks on deep neural networks
    QI Fuqi, GAO Haichang, LI Boling, ZOU Xiang
    2025, 52(2):  1-12.  doi:10.19665/j.issn1001-2400.20241201

    With the continuous advancement of adversarial attack algorithms, the security risks that deep neural networks face are increasingly severe. Optical phenomena frequently occur in real-world scenarios, and robustness against optical adversarial attacks directly reflects the safety of deep neural networks. Nevertheless, current research on optical adversarial attacks commonly encounters challenges such as optical perturbation distortion and optimization instability. To solve these problems, this paper proposes a novel optical attack method named AdvFlare to help explore the effect of flare perturbations on the safety of deep neural networks. AdvFlare constructs a parameterized flare simulation model that captures multiple attributes of the flare pattern, such as shape and color, with high fidelity. On this basis, the paper addresses adversarial perturbation distortion and convergence difficulties through strategies such as parameter space constraints, random initialization, and stepwise optimization. Experimental results indicate that AdvFlare can induce misclassification in deep neural networks with a significantly higher success rate than existing methods, while also offering superior visual perturbation quality and stability. Furthermore, it is found that adversarial training with AdvFlare can markedly enhance the robustness of deep neural networks in both the digital and the physical world, providing valuable insights for improving model robustness in public transportation contexts.

    Secure transmission scheme for cooperative full-duplex relaying and rate splitting
    WEI Mingsheng, DUAN Siyi, LI Shidang, GAO Quanxue,...
    2025, 52(2):  13-24.  doi:10.19665/j.issn1001-2400.20241205

    In order to overcome inherent limitations of the traditional network framework, such as limited coverage and low system capacity, and considering the security problems of cooperative communication with an untrusted user relay and rate splitting, the weighted sum of the average secrecy rate of remote users and the maximum achievable rate of the common message is constructed as the objective. This paper combines Rate-Splitting Multiple Access (RSMA) technology with cooperative relaying for the first time, and accounts for the risk of common-message leakage caused by the broadcast nature of the common message. Owing to coupled variables and nonlinear constraints, the problem is non-convex and cannot be solved directly. Therefore, under the power budget constraints of the base station and the relay equipment, the precoding matrix, common-message splitting, and device-to-device transmission power are jointly optimized. The Successive Convex Approximation (SCA) method is adopted to introduce slack variables and linearize the non-convex constraints, converting the non-convex problem into a tractable convex one, and an iterative optimization algorithm for full-duplex cooperative rate splitting is designed to avoid wasting time and resources. Simulation results demonstrate the superiority of the proposed scheme. Compared with the existing half-duplex rate-splitting multiple access scheme, it converges better and improves the real-time hardware configuration. Meanwhile, compared with the non-cooperative rate-splitting multiple access scheme, it achieves a higher secure transmission rate and provides practical security for remote users.
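    The SCA idea described in the abstract (linearize the non-convex part at the current iterate, solve the resulting convex subproblem, repeat) can be sketched on a toy scalar difference-of-convex objective. The objective, the closed-form inner step, and all numbers below are our own illustration, not the paper's precoding problem.

    ```python
    def sca_minimize(x0, iters=50):
        """Successive convex approximation on f(x) = x^4 - 8x^2, a toy
        difference-of-convex objective. At each step the concave part -8x^2
        is replaced by its tangent at x_k, and the convex subproblem
        min_x x^4 - 16*x_k*x (+ const) is solved in closed form."""
        x = x0
        for _ in range(iters):
            # d/dx [x^4 - 16*x_k*x] = 0  ->  x_{k+1} = (4*x_k)**(1/3)
            x = (4.0 * x) ** (1.0 / 3.0)
        return x

    x_star = sca_minimize(1.0)
    print(round(x_star, 4))  # -> 2.0, a stationary point of the original f
    ```

    Each iteration solves a convex surrogate, so the objective value is monotonically non-increasing, which is the convergence property the paper's iterative full-duplex algorithm relies on.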

    Research on nonbinary polar codes with the multiplicative repetition spread spectrum scheme for burst pulse interference environment
    XU Rongchi, ZHU Min, BAI Baoming
    2025, 52(2):  25-32.  doi:10.19665/j.issn1001-2400.20250101

    With the emergence of a large number of services in wireless communication scenarios, wireless communication is affected by various interference factors and the electromagnetic environment has become increasingly complex. To address this problem and improve the reliability of a communication system, we study the burst pulse interference environment. First, we establish a mixed channel for the burst pulse interference environment, combining an additive white Gaussian noise channel and a burst erasure channel based on a compound Poisson distribution. Then, we study nonbinary polar codes with the multiplicative repetition spread spectrum scheme for this environment and propose a spread spectrum sequence selection method based on the Euclidean distance. Simulation results show that both the block error rate performance and the error floor performance of the nonbinary polar codes outperform those of binary polar codes. In addition, the proposed nonbinary polar codes with the multiplicative repetition spread spectrum scheme outperform those with the Hadamard spread spectrum scheme.

    Pilot mental fatigue assessment method based on the SSENet
    JIN Heng, SUN Yuochao, ZENG Yining, LIU Weicheng, ...
    2025, 52(2):  33-46.  doi:10.19665/j.issn1001-2400.20250204

    The landing task is characterized by time pressure and a complex operational process, making it crucial to accurately assess pilots' mental fatigue in order to enhance landing safety. To address the issue of evaluating pilots' mental fatigue during landing tasks, a series of simulated landing experiments of varying difficulty is conducted, with EEG signals collected from nine participants over a nine-day period. A cross-subject mental fatigue assessment model based on the SSENet is developed for the landing scenario. To capture spatial information coupling and channel feature information in the EEG signals, an SEConv module is designed within the model, targeting the spatial characteristics of the EEG signals and cross-subject training methods. The results show significant differences in the level of mental fatigue among participants across the various difficulty levels of landing tasks (p<0.001). The model achieves a maximum classification accuracy of 95.55% during five-fold cross-validation, with an average classification accuracy of 93.00%. Ablation experiments verify the effectiveness of each module, with a classification accuracy improvement of approximately 4% over the classical EEG signal training model EEGNet. The SSENet demonstrates promising results in the cross-subject mental fatigue assessment task, offering new strategies for research on landing safety.

    Lightweight YOLO model for small UAV object detection
    YANG Xiaobing, LI Zhao, XU Yanhong
    2025, 52(2):  47-56.  doi:10.19665/j.issn1001-2400.20250304

    Due to the small size of Unmanned Aerial Vehicles (UAVs), complex airspace backgrounds, and easy confusion with sky objects such as birds, existing object detection models lack sufficient accuracy. Although increasing the model size can improve the detection accuracy to a certain extent, it also reduces the inference speed and significantly increases the number of parameters and the computational complexity of the model. In addition, the lack of datasets suitable for small UAV object detection makes it challenging to provide adequate support for designing effective models. To address these deficiencies, this paper first constructs a dataset from existing open-source datasets using a target-area-compression based small object sample enhancement method, which can be utilized in small UAV object detection tasks. Then, we design a lightweight and high-accuracy network model called YOLO-LADC based on the YOLOv8. This model incorporates a novel downsampling convolution structure that reduces the number of model parameters and computations while enhancing the detection accuracy. Moreover, we add a small object detection branch to the neck network of the YOLO-LADC to obtain the YOLO-LADCS, which is better suited for small UAV object detection tasks. Comparative experiments show that the YOLO-LADCS improves the average accuracy on small objects by 1.1% with a 14% reduction in the number of parameters compared to the YOLOv8n (a lightweight version of the YOLOv8).

    Review of deep learning-based methods for driving facial animation
    LIU Long, LI Haosheng, ZHANG Mengxuan, DU Ying, CH...
    2025, 52(2):  57-84.  doi:10.19665/j.issn1001-2400.20240907

    Facial animation technology aims to dynamically drive static facial images using source data such as audio or video to produce realistic animation effects. The development of deep learning has greatly promoted the progress of facial animation technology: deep learning models can learn and capture facial features and movement patterns, achieving realistic and personalized facial animation through an automated driving process. Currently, there are numerous research achievements in deep learning-based facial animation. However, existing reviews focus mostly on specific technologies or single-modality driving sources. This paper systematically reviews deep learning-based facial animation driving technology, summarizing the state of research according to the process of audio- and video-driven facial animation. First, it introduces the common process of extracting facial features from input source data. Second, it analyzes in depth the key technologies of feature extraction and animation generation, and compares the advantages and disadvantages of different deep learning network architectures at each step. Finally, it summarizes the animation generation methods under different architectures and compares their similarities and differences. In addition, this paper lists the commonly used datasets and evaluation metrics for facial animation, summarizes the existing challenges in the field, and elaborates on future development trends and prospects, aiming to provide researchers with a more comprehensive perspective on the application of deep learning to facial animation.

    Dynamic graph reasoning transformer retinal vessel segmentation algorithm
    LIANG Liming, LU Baohe, LONG Pengwei, JIN Jiaxin, ...
    2025, 52(2):  85-100.  doi:10.19665/j.issn1001-2400.20241204

    To address the issues of excessive loss of vascular features at the encoding end, poor segmentation of vascular regions in lesion areas, and insufficient extraction of global contextual information in existing algorithms, this paper proposes a dynamic feature weighting and graph reasoning Transformer retinal vessel segmentation algorithm. First, an adaptive weighting encoder is designed to alleviate the vascular loss caused by repeated convolution and downsampling, thus enhancing vascular texture features. Second, a graph reasoning Transformer module is constructed to simultaneously extract pixel-level vascular features and the relationships between nodes, thereby effectively capturing both global and local information in the image data. Finally, dynamic feature enhancement modules are constructed at the decoder side and the encoder-decoder junction, effectively improving the ability to segment vessels in lesion areas. Experimental results on the DRIVE, CHASE-DB1 and STARE datasets show that the proposed algorithm exhibits superior segmentation performance and generalization ability with only 0.91M model parameters, with accuracies of 97.01%, 97.37%, and 97.42%, sensitivities of 82.51%, 84.47%, and 81.21%, and AUC-ROC values of 98.74%, 98.83%, and 98.94%, respectively, showing a certain clinical application value in the diagnosis of ophthalmic diseases.

    ARWCGAN: a method for high-quality multi-category SAR image generation
    ZHENG Yang, WANG Rongxu, GUO Kaitai, LIANG Jimin
    2025, 52(2):  101-112.  doi:10.19665/j.issn1001-2400.20250105

    In the field of Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR), the availability of high-quality training datasets is often severely limited. Existing SAR image generation methods based on Generative Adversarial Networks (GANs) suffer from training instability and low-quality outputs. To address these challenges, we propose the Attentional Residual Wasserstein Conditional Generative Adversarial Network (ARWCGAN) for generating high-quality multi-category SAR images. ARWCGAN features attentional residual layers to enhance SAR image feature extraction, thus improving the detail and texture of the generated images. It also combines the WGAN-GP (Wasserstein Generative Adversarial Network with Gradient Penalty) loss function with a classification loss function to improve training stability and the diversity of generated images. We conducted generation experiments on the MSTAR dataset and evaluated the generated images from three perspectives: qualitative visual inspection, quantitative quality assessment, and contribution to ATR model performance. Experimental results demonstrate that ARWCGAN is capable of generating high-quality images and significantly enhances the recognition accuracy of ATR models.

    Three-dimensional path planning for UAV in a multi-constrained unknown environment
    CUI Shuangpeng, QIN Ningning
    2025, 52(2):  113-127.  doi:10.19665/j.issn1001-2400.20241107

    Aiming at the low convergence efficiency and high algorithmic complexity of Unmanned Aerial Vehicle (UAV) path planning models caused by multiple factors such as wind conditions and obstacles in a multi-constraint unknown environment, we propose a path planning strategy based on progressive reinforcement learning (Progressive Deep Reinforcement Q-learning Network, PR-DQN). The algorithm adopts a classroom-style, staged training method: by constructing feature-differentiated scenarios and dynamically adjusting the UAV training scenarios during model training, it avoids the learning difficulties caused by facing complex tasks too early, keeps the model from falling into local optima, and improves learning efficiency. In addition, the algorithm comprehensively considers the impact of multiple constraints such as wind conditions, obstacles, and energy consumption on the UAV flight trajectory in the unknown environment, and constrains the path selection of the UAV in flight by constructing energy consumption, collision factor, and multi-constraint reward functions, ensuring that the UAV completes the path planning task whenever safety and energy budgets permit. Experimental results show that the average planning success rate of the proposed scheme is approximately 5.4% higher than that of similar algorithms, and the average training overhead is approximately 11.7% lower, making the PR-DQN algorithm highly promising for application in unknown environments where multiple types and numbers of obstacles and multivariate energy consumption coexist.
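    The Q-learning update that PR-DQN's deep variant builds on can be illustrated with a tabular sketch on a toy 1-D corridor. The environment, reward, and hyperparameters below are our own minimal example; the paper's deep network, wind/obstacle constraints, and curriculum schedule are not reproduced here.

    ```python
    import random

    def train_q(n_states=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
        """Tabular Q-learning on a 1-D corridor: the agent starts at state 0
        and receives reward 1 only on reaching the rightmost goal state.
        Update rule: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        random.seed(seed)
        Q = [[0.0, 0.0] for _ in range(n_states)]   # actions: 0 = left, 1 = right
        for _ in range(episodes):
            s = 0
            while s != n_states - 1:
                # epsilon-greedy action selection (ties break toward "right")
                if random.random() < eps:
                    a = random.randrange(2)
                else:
                    a = 0 if Q[s][0] > Q[s][1] else 1
                s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
                r = 1.0 if s2 == n_states - 1 else 0.0
                Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
                s = s2
        return Q

    Q = train_q()
    ```

    After training, the "move right" action dominates at every non-goal state, i.e. the learned policy heads for the goal, which is the behavior the reward-shaping in PR-DQN is designed to induce at much larger scale.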

    Multivariate long-term series forecasting based on multi-scale time-frequency domain learning
    HENG Hongjun, LI Yixin
    2025, 52(2):  128-142.  doi:10.19665/j.issn1001-2400.20241207

    To address two key issues in existing multivariate long-term series forecasting models, namely the inability to capture long-term dependencies using single-period-scale time-domain information and the difficulty of capturing effective multivariate dependencies, a multivariate long-term series forecasting model based on multi-scale time-frequency domain learning is proposed, built on multilayer perceptrons. The model first employs the Fourier transform to adaptively identify different periods of the sequence as multiple scales. Then, at each scale, the sequence is decomposed for two-stage learning in both the time and frequency domains, capturing local and global temporal dependencies. Subsequently, based on correlation analysis among the variables, the model adaptively constructs the variable dependencies within the multivariate time series. Finally, different aggregation methods are applied to the decomposed components of the sequence at different scales to achieve complementary integration of multi-scale information. Experiments on seven real-world datasets demonstrate that the model achieves optimal or second-best performance in over 90% of the tests. Compared with the decomposition-based linear model DLinear, the proposed model achieves an average reduction of 11% and a maximum reduction of 49.22% in MSE, as well as an average reduction of 10% and a maximum reduction of 33.03% in MAE. Furthermore, the model improves forecasting accuracy while maintaining high operational efficiency.
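    The first step above, identifying periods from the amplitude spectrum of the Fourier transform, can be sketched as follows. The naive DFT, the synthetic series, and the top-k selection are our own illustration of the general technique, not the paper's exact implementation.

    ```python
    import cmath
    import math

    def dominant_periods(x, k=1):
        """Return the k most energetic periods of a series, found by scanning
        the amplitude spectrum of a naive DFT and converting each peak
        frequency index f back to a period length n/f."""
        n = len(x)
        amps = []
        for f in range(1, n // 2):                  # skip DC and mirror half
            s = sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            amps.append((abs(s), f))
        amps.sort(reverse=True)                     # largest amplitude first
        return [round(n / f) for _, f in amps[:k]]

    # synthetic series with a clear period of 12 samples
    series = [math.sin(2 * math.pi * t / 12) for t in range(96)]
    print(dominant_periods(series))   # -> [12]
    ```

    In practice an FFT replaces the O(n^2) loop, but the scale-selection logic, pick the frequencies with the largest amplitudes and treat their periods as scales, is the same.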

    Gradient recursive optimization based injection coefficient algorithm for pansharpening
    DAI Huan, YANG Yong, LU Hangyuan, HUANG Shuying, C...
    2025, 52(2):  143-155.  doi:10.19665/j.issn1001-2400.20241103

    Pansharpening fuses panchromatic (PAN) and multispectral (MS) images to produce High-Resolution Multispectral (HRMS) images, which is helpful for applications such as ground object identification and land monitoring in the field of remote sensing. However, existing pansharpening methods based on multi-resolution analysis often overlook the relationship between image gradients, leading to inaccuracies in extracting detailed features from the source images and causing spatial distortion in the fusion results. To address these issues, this paper proposes a novel pansharpening method based on gradient recursion to optimize the injection coefficients. The method first analyzes the gradient relationship between the source images and the fused image, and constructs a recursive model between the ideal HRMS image and the source images at full scale. Then, a gradient regression algorithm is designed to solve for the injection coefficients iteratively. Finally, the injection coefficients are employed to refine the details obtained through multi-resolution analysis, and the optimized details are injected into the MS image to generate the optimal HRMS image. The method is tested through simulated and real experiments on three datasets: Pléiades, IKONOS, and WorldView-3. Compared to the second-best performing method, the ERGAS values improve by 3.59%, 4.46%, and 2.18% in the simulated experiments, respectively, and the QNR values improve by 3.83% and 1.92% in real experiments on the Pléiades and IKONOS datasets, while the QNR value achieves second-best performance on the WorldView-3 dataset. In ablation experiments, compared to a gradient-free pansharpening method, the ERGAS values improve by 11.33%, 14.08%, and 1.95%, respectively. The HRMS images generated by our method effectively integrate the spectral information of the MS images with the spatial information of the PAN images, significantly enhancing spectral and spatial resolution while remaining computationally efficient.

    Design and optimization of the TDC transposed convolution hardware accelerator
    WANG Guoqing, YAN Limin
    2025, 52(2):  156-166.  doi:10.19665/j.issn1001-2400.20250205

    The transposed convolution is widely used in DL (Deep Learning) tasks, but in the FSRCNN-s (Fast Super-Resolution Convolutional Neural Network-small) network it has become the primary performance bottleneck during the inference stage, so designing efficient transposed convolution hardware accelerators is essential. Based on the TDC (Transforming Deconvolution to Convolution) algorithm, the software inference process of the stride-2 transposed convolution is transformed into a 4-way parallel direct convolution hardware implementation. The correctness of the algorithm and the hardware accelerator is validated for imperfect mapping scenarios. After designing the transposed convolution accelerator, the FSRCNN-s ×2 network is selected for end-to-end deployment. The trade-off between transposed convolution inference accuracy and speed is addressed through hardware-software co-design and an INT8 (Integer 8-bit) quantization scheduling strategy. Experimental results demonstrate that the designed transposed convolution hardware accelerator incurs an accuracy loss of less than 0.5 dB and reduces the inference time to 17 ms compared to the CPU baseline. Compared to other transposed convolution accelerators, the designed integer inference accelerator significantly reduces DSP (Digital Signal Processor) resource utilization and improves DSP efficiency to 0.200 GOPS (Giga Operations Per Second)/DSP, offering a reference for the design of low-bit-width integer inference transposed convolution accelerators.
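    The core equivalence that TDC-style designs exploit, a strided transposed convolution can be computed as an ordinary convolution over a zero-inserted input, can be checked with a 1-D sketch. Both functions below are our own illustration; the accelerator in the paper further splits the equivalent convolution into 4 parallel sub-convolutions, which is omitted here.

    ```python
    def conv_transpose1d(x, w, stride=2):
        """Direct definition: each input element scatters a scaled copy of the
        kernel into the output at stride-spaced offsets."""
        out = [0.0] * ((len(x) - 1) * stride + len(w))
        for i, xi in enumerate(x):
            for k, wk in enumerate(w):
                out[i * stride + k] += xi * wk
        return out

    def tdc_conv1d(x, w, stride=2):
        """Equivalent form: insert stride-1 zeros between input samples, then
        run a plain full convolution with the same kernel."""
        up = []
        for xi in x[:-1]:
            up.extend([xi] + [0.0] * (stride - 1))
        up.append(x[-1])
        out = [0.0] * (len(up) + len(w) - 1)
        for n in range(len(out)):               # out[n] = sum_k w[k] * up[n-k]
            for k, wk in enumerate(w):
                if 0 <= n - k < len(up):
                    out[n] += wk * up[n - k]
        return out

    x, w = [1, 2, 3], [1, 0, -1]
    print(conv_transpose1d(x, w) == tdc_conv1d(x, w))   # -> True
    ```

    Because the zero-inserted convolution touches many known-zero operands, grouping its taps by output phase yields the stride-count (here 4 in 2-D) independent direct convolutions that the hardware runs in parallel.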

    Memristor compute-in-memory architecture-based BCH multi-bit error correction method
    CAI Gushun, LIU Jinhui, TAN Wendan, HUANG Zhao, WA...
    2025, 52(2):  167-178.  doi:10.19665/j.issn1001-2400.20241111

    The memristor compute-in-memory (CIM) architecture, as a new technique that integrates storage and computing, can effectively address the problems in traditional von Neumann architecture data error correction caused by the separation of storage and computation, such as a limited data transmission rate, frequent data migration, and increased transmission power consumption and delay, thereby improving the reliability and stability of satellite-borne electronic systems. However, existing CIM error correction techniques can only correct single-bit data errors and fail to handle continuous multi-bit error detection and correction. Thus, this paper proposes a memristor CIM-based BCH multi-bit error correction method. First, we convert traditional encoding and decoding operations such as modulo, multiply-add, and forward search into matrix operations to simplify the calculation process and reduce resource overhead. Second, we construct finite field multiply-accumulate and multiply calculation units separately, and, based on the operational requirements and data characteristics of each stage of the BCH algorithm, use parallel processing to adaptively select the corresponding computing cores, further improving operational efficiency. Finally, the proposed method is verified on the Cadence Calculator and MNSIM simulation platforms. Experimental results show that the proposed method achieves efficient and stable multi-bit error correction, with a data throughput of 8.8 MHz, an operating power consumption below 40 mW, and an area overhead of 3×10⁵ μm² in a 65 nm process. Compared to FPGA and IMPLY architectures, the computational efficiency increases by 7× and 400×, respectively.
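    The finite-field multiply units mentioned above operate over GF(2^m); a standard software sketch of that arithmetic uses exp/log tables built from a primitive polynomial. The GF(2^4) field and the table construction below are a textbook illustration of the underlying arithmetic, not the paper's memristor mapping.

    ```python
    def gf16_tables(prim_poly=0b10011):
        """Build exp/log tables for GF(2^4) with primitive polynomial
        x^4 + x + 1 (0b10011), the kind of field arithmetic underlying
        BCH syndrome and error-locator computations."""
        exp, log = [0] * 30, [0] * 16
        a = 1
        for i in range(15):
            exp[i] = a
            log[a] = i
            a <<= 1
            if a & 0b10000:          # reduce modulo the primitive polynomial
                a ^= prim_poly
        for i in range(15, 30):      # duplicate so log sums need no modulo
            exp[i] = exp[i - 15]
        return exp, log

    def gf16_mul(x, y, exp, log):
        """Multiply in GF(2^4): add discrete logs, look up the antilog."""
        if x == 0 or y == 0:
            return 0
        return exp[log[x] + log[y]]

    exp, log = gf16_tables()
    print(gf16_mul(3, 3, exp, log))   # (x+1)^2 = x^2 + 1  -> 5
    ```

    A table lookup of this form maps naturally onto CIM crossbars, since the multiply reduces to an addition plus a memory read rather than a polynomial reduction per operation.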

    Research on the adaptive positioning method for trains in the scenario of the tunnel caused by BeiDou failure
    CHEN Yong, TAO Xuan, YUAN Jiaojiao
    2025, 52(2):  179-189.  doi:10.19665/j.issn1001-2400.20250202

    High-reliability train positioning technology is the foundation of the train operation control system. When a train enters a severely obstructed tunnel scene, the BeiDou Navigation Satellite System (BDS) fails, leading to the accumulation of positioning errors and seriously degrading the positioning performance of the train. In order to improve train positioning performance in severely obstructed tunnel scenes, this paper proposes an adaptive positioning method for trains in such scenes under BeiDou failure. First, the positioning error of the SINS under BeiDou failure is analyzed, and the Dead Reckoning (DR) algorithm is used to solve the problem of SINS self-calibration failure. Second, an adaptive estimation and compensation method for positioning errors in severely obstructed tunnel scenes is presented: based on the analysis of positioning errors, the state equation and observation equation for tunnel train combined positioning are constructed. Finally, an Unscented Kalman Filter algorithm is designed to optimize the estimation of positioning errors, achieving calibration and compensation of the continuous position information of the train and improving tunnel positioning performance when the BeiDou satellite signal fails. The proposed method is validated through train positioning simulation experiments and measured data from the Yinxi high-speed railway tunnel route. The results show that, compared with the baseline methods, the proposed method effectively improves train positioning performance in severely obstructed tunnel scenes under BeiDou failure and has better stability.
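    The dead-reckoning fallback named above advances the position from the last valid fix using heading and speed. The single-step update below is a generic DR sketch with hypothetical values; the paper's method additionally estimates and compensates the accumulated error with a UKF, which is not reproduced here.

    ```python
    import math

    def dr_step(x, y, heading_rad, speed, dt):
        """One dead-reckoning update: integrate heading and speed over dt to
        advance the (x, y) position, the fallback used when the satellite
        fix is unavailable inside the tunnel."""
        return (x + speed * dt * math.cos(heading_rad),
                y + speed * dt * math.sin(heading_rad))

    # hypothetical example: a train moving due east at 50 m/s for 10 s
    pos = (0.0, 0.0)
    for _ in range(10):
        pos = dr_step(*pos, heading_rad=0.0, speed=50.0, dt=1.0)
    print(pos)   # -> (500.0, 0.0)
    ```

    Because each step integrates noisy sensor readings, DR error grows with distance travelled, which is exactly why the paper bounds it with filtered error estimation rather than using DR alone.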

    Privacy-preserving face recognition against adversarial sample perturbations
    MA Caixia, JIA Chunfu, CAI Zhipeng, DU Ruizhong, L...
    2025, 52(2):  190-200.  doi:10.19665/j.issn1001-2400.20241105

    While the wide application of face recognition technology brings great convenience, it also carries the risk of leaking identity and other private information. Some current attacks against face recognition can obtain the private information in the original image by reconstructing the face image. To prevent such attacks, this paper proposes a privacy-preserving face recognition model against adversarial sample perturbation (PPFR-ASP). Specifically, our scheme employs collaborative inference in the frequency domain. First, the face image is transformed into frequency-domain features, which are divided into high-frequency and low-frequency components, and adversarial sample perturbation is added to the frequency-domain features. Furthermore, a target image is prepared for each original image, and the same frequency-domain feature transformation is applied to the target image. The frequency component channels of the target image are used to mask the corresponding components of the original image, so that an attacker reconstructs the target image instead, thereby concealing the true identity of the original image. Finally, extensive experiments on multiple datasets demonstrate that the privacy protection performance of this scheme for facial image data exceeds that of comparative approaches, while its query accuracy and computational overhead are comparable to those of the unprotected ArcFace scheme.

    Dynamic balanced privacy model for data perturbation
    XIE Weixuan, GUO Ziyu, ZUO Jinxin, GUO Chenqing, L...
    2025, 52(2):  201-213.  doi:10.19665/j.issn1001-2400.20241202

    To address the limitations of current privacy protection schemes, particularly the insufficient research on data perturbation and the inadequate integration of privacy measurement with privacy protection, we propose a Dynamic Balanced Privacy Model for Data Disturbance (DBPM-DD). First, based on users' privacy preferences and data quality, we design a dynamic measurement mechanism for real-time evaluation of data privacy, precisely measuring the amount of private information contained in the data. Second, we propose a data reconstruction mechanism based on probability partitioning and accordingly provide a privacy measurement method for perturbed data, achieving multi-paradigm adaptation of private data. Finally, we introduce a noise scale adjustment method that adaptively tunes the noise intensity based on feedback from the privacy measurement results, ensuring user privacy while maximizing data utility and thereby achieving a dynamic balance between privacy protection and data utility. Experimental results show that under different noise scales, data sizes, and attack intensities, the model effectively enhances the degree of privacy protection while maintaining high data utility, providing consistent and effective privacy guarantees under various conditions and outperforming other privacy protection models. This research provides new theoretical support for privacy protection in data perturbation technologies and has significant practical application value and wide applicability.
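    The measure-then-adjust loop described above can be sketched with Laplace perturbation. Everything here is a hypothetical illustration in the spirit of DBPM-DD: the "privacy proxy" (mean absolute perturbation) and the geometric scale growth are our own stand-ins for the paper's preference- and quality-aware measurement mechanism.

    ```python
    import math
    import random

    def laplace_noise(scale, rng):
        """Sample Laplace(0, scale) by inverse-transform sampling."""
        u = rng.random() - 0.5
        return math.copysign(scale * math.log(1.0 - 2.0 * abs(u)), u)

    def perturb(values, target_privacy, scale=1.0, step=1.5, rounds=5, seed=0):
        """Feedback loop: perturb, measure a simple privacy proxy (mean
        absolute perturbation), and grow the noise scale until the target
        level of distortion is reached."""
        rng = random.Random(seed)
        noisy = list(values)
        for _ in range(rounds):
            noisy = [v + laplace_noise(scale, rng) for v in values]
            leak = sum(abs(n - v) for n, v in zip(noisy, values)) / len(values)
            if leak >= target_privacy:     # enough distortion: stop growing
                return noisy, scale
            scale *= step                  # feedback: increase noise intensity
        return noisy, scale

    noisy, final_scale = perturb([1.0] * 100, target_privacy=2.0)
    ```

    Stopping at the smallest scale that meets the target is what keeps utility high: any further noise would degrade the data without buying additional required privacy.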

    Fair federated learning framework based on the alliance chain
    ZHAO Yang, LIU Yue, LI Hexiang, WANG Wenhao
    2025, 52(2):  214-224.  doi:10.19665/j.issn1001-2400.20250108

    In order to address the potential issues of privacy leakage, single points of failure, and poisoning attacks in the central servers of traditional federated learning applications, a fair federated learning framework based on the alliance chain is proposed. Through the mutual selection of leader nodes and consensus committee nodes in each round, secure aggregation and updating of data are achieved, ensuring the decentralized and distributed characteristics of the system. Meanwhile, leveraging the immutability of the blockchain and its resilience to single-point attacks, a client-level data quality assessment method is designed within the consensus mechanism to provide the quantitative metrics needed for multi-party training, ensure the transparency and traceability of evaluation results, and optimize the node selection process, thereby ensuring the prioritization of high-quality clients. To improve the fairness of node selection, an improved algorithm based on the Shapley value is proposed that incorporates the historical behavior of clients to make contribution evaluation more flexible and accurate, reducing the proportion of low-quality nodes in contribution evaluation and mitigating the negative impact of low-quality data on model training. Experimental results show that the scheme significantly enhances the fairness of leader node elections and the accuracy of client marginal contribution assessments while maintaining model prediction accuracy. Through a dynamic node reward mechanism, the long-term fairness of the system is ensured, effectively addressing fairness issues in the alliance chain.
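    The plain Shapley value that the improved algorithm starts from averages each client's marginal contribution over all join orders. The exact enumeration below is a textbook sketch, feasible only for small groups; the toy additive utility is our own example, and the paper's historical-behavior weighting is omitted.

    ```python
    from itertools import permutations

    def shapley(players, value):
        """Exact Shapley values: average each player's marginal contribution
        value(S + {p}) - value(S) over every permutation (join order)."""
        phi = {p: 0.0 for p in players}
        count = 0
        for order in permutations(players):
            coalition = set()
            prev = value(frozenset())
            for p in order:
                coalition.add(p)
                cur = value(frozenset(coalition))
                phi[p] += cur - prev
                prev = cur
            count += 1
        return {p: total / count for p, total in phi.items()}

    # toy utility: each client's data quality adds directly to model value
    quality = {"A": 3.0, "B": 1.0, "C": 0.0}

    def model_value(coalition):
        return sum(quality[p] for p in coalition)

    print(shapley(list(quality), model_value))   # additive game: phi == quality
    ```

    For an additive utility the Shapley value recovers each client's individual quality exactly, which makes it a natural quantitative metric for prioritizing high-quality clients; real deployments approximate it by sampling permutations rather than enumerating all of them.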

    Research on the CNN network coding scheme for high-resolution image transmission
    LIU Na, YANG Yanbo, ZHANG Jiawei, LI Baoshan, MA J...
    2025, 52(2):  225-238.  doi:10.19665/j.issn1001-2400.20241206

    Network coding technology can effectively improve network throughput. However, traditional network coding involves high encoding and decoding complexity and has difficulty adapting to dynamic factors such as environmental noise, which easily leads to decoding distortion. In recent years, researchers have introduced neural networks to optimize the network coding process, but in high-resolution image transmission the existing neural network coding schemes have an insufficient ability to capture high-dimensional spatial information, resulting in large communication and computation overheads. To solve this problem, this paper proposes a joint source and network deep learning coding scheme that uses a two-dimensional Convolutional Neural Network (CNN) to parameterize the encoder and decoder of each network node, capturing deep spatial structure information and reducing the computational complexity of network nodes. At the source node, convolution layers reduce the dimensionality of the transmitted data and improve the data transmission rate; at the intermediate node, the data from the two sources are received and compressed by CNN coding for single-channel transmission; at the destination node, the received data are decoded by a CNN to restore the dimensionality and recover the original image. Experimental results show that under different channel bandwidth occupancy ratios and channel noise levels, the proposed scheme achieves excellent decoding performance in terms of peak signal-to-noise ratio and structural similarity.
