Defensive Future Studies

Three-Dimensional UAV Sound Source Localization Using Deep Audio Feature Learning

Document Type : Original Article

Author
Master's student in Artificial Intelligence & Robotics, Department of Computer Engineering, Malek Ashtar University, Tehran, Iran.
DOI: 10.22034/dfsr.2026.2070645.1943
Abstract
Objective: Acoustic localization of unmanned aerial vehicles (UAVs) plays a critical role in military and surveillance applications, as it enables the detection and tracking of hostile drones under real-world conditions. This study proposes a deep learning-based framework that leverages joint time-frequency features for three-dimensional UAV localization.
Methodology: A publicly available benchmark dataset of multichannel UAV flight recordings was used for training and evaluation. Mel-spectrograms were extracted from the recordings and processed by an integrated time-frequency scheme based on the Mamba architecture to estimate four spatial parameters: range, altitude, azimuth angle, and elevation angle.
Findings: Experimental results demonstrate that the proposed model achieves precise estimation of UAV spatial parameters and maintains robust performance under noisy conditions and across varying microphone distances.
Conclusion: The proposed approach, leveraging deep learning and multichannel audio data, can serve as an effective tool for defense and surveillance systems in the acoustic detection and tracking of UAVs.
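The feature-extraction step named in the Methodology can be illustrated with a minimal sketch that computes log-Mel-spectrograms for a multichannel recording using NumPy only. The sampling rate, FFT size, hop length, number of Mel bands, and all function names below are assumptions chosen for illustration; the abstract does not state the study's actual settings, and the Mamba-based model that consumes these features is not shown.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style Mel scale.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular Mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for m in range(n_mels):
        lo, center, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_mel_spectrogram(audio, sr, n_fft=1024, hop=512, n_mels=64):
    """audio: (channels, samples) -> log-Mel power in dB, (channels, n_mels, frames).

    Assumes every channel holds at least n_fft samples (hypothetical defaults).
    """
    audio = np.atleast_2d(np.asarray(audio, dtype=float))
    n_frames = 1 + (audio.shape[1] - n_fft) // hop
    # Index matrix that slices each channel into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = audio[:, idx] * np.hanning(n_fft)           # (C, frames, n_fft)
    power = np.abs(np.fft.rfft(frames, axis=-1)) ** 2    # (C, frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T    # (C, frames, n_mels)
    return 10.0 * np.log10(mel + 1e-10).transpose(0, 2, 1)

# Synthetic stand-in for a 4-microphone recording: a 1 kHz tone at different gains.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000.0 * t)
audio = np.stack([tone * g for g in (1.0, 0.8, 0.6, 0.4)])
features = log_mel_spectrogram(audio, sr)
print(features.shape)  # (4, 64, 30)
```

Keeping the channel axis separate preserves the level differences between microphones, which a downstream model could use alongside the time-frequency structure; how the study itself combines channels is not described in the abstract.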
