Master's student in Artificial Intelligence & Robotics, Department of Computer Engineering, Malek Ashtar University, Tehran, Iran.
10.22034/dfsr.2026.2070645.1943
Abstract
Objective: Acoustic localization of unmanned aerial vehicles (UAVs) plays a critical role in military and surveillance applications, as it enables the detection and tracking of hostile drones under real-world conditions. This study proposes a deep learning-based framework that leverages joint time-frequency features for three-dimensional UAV localization.
Methodology: A publicly available benchmark dataset consisting of multichannel UAV flight recordings was employed for training and evaluation. Spectral representations were extracted using Mel-spectrograms and subsequently analyzed through an integrated time-frequency processing scheme based on the Mamba architecture, allowing accurate estimation of spatial parameters including range, altitude, azimuth and elevation angles.
Findings: Experimental results demonstrate that the proposed model achieves precise estimation of UAV spatial parameters and maintains robust performance under noisy conditions and across varying microphone distances.
Conclusion: The proposed approach, leveraging deep learning and multichannel audio data, can serve as an effective tool for defense and surveillance systems in the acoustic detection and tracking of UAVs.