The Deeplomatics project aims to develop an anti-drone surveillance system based on deep learning applied to audio and video sensors. The system consists of several microphone arrays, each with its own embedded artificial intelligence, able to detect, recognize, and localize drone intrusions in real time. Once this information is fused, an optronic system is pointed at the target to confirm the intrusion with several cameras, which also carry their own artificial intelligence. The acoustic deep neural network is a variant of the BeamLearning architecture proposed by the authors, which jointly estimates the position and the nature of the source in real time. Each array can monitor an area of 100 hectares, within which the angular estimation error is below 3° and the recognition rate exceeds 85%. A confidence criterion included in the cost function makes it possible, at inference time, to assess the reliability of each of the 40 estimates produced per second. The project has gathered several tens of hours of drone flight recordings, acquired with microphone arrays suited to 3rd- to 5th-order ambisonic encoding. The benefit of this encoding is that the recordings can be played back in the laboratory on a 3D spatialization system driven by the GPS positions logged in the field. The data can then be re-recorded on any microphone array, allowing the deep neural network to be trained for arrays that were not present during the initial measurement campaigns. This approach also makes data augmentation straightforward (rotation of the sound scene, addition of realistic acoustic environments, etc.), yielding a system that is, ultimately, accurate and robust in real environments.
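The sound-scene rotation used for data augmentation can be sketched in the first-order (B-format) case, where a yaw rotation leaves the omnidirectional and vertical channels unchanged and rotates the two horizontal dipole channels like a 2-D vector; higher-order scenes rotate analogously with larger rotation matrices. The channel ordering (W, X, Y, Z) and the function name below are illustrative assumptions, not the project's actual pipeline:

```python
import numpy as np

def rotate_foa_yaw(bformat, theta):
    """Rotate a first-order ambisonic sound scene by `theta` radians
    around the vertical axis.

    bformat: array of shape (4, n_samples), assumed channel order W, X, Y, Z.
    """
    w, x, y, z = bformat
    c, s = np.cos(theta), np.sin(theta)
    # W (omni) and Z (vertical dipole) are invariant under a yaw rotation;
    # the horizontal dipoles X and Y rotate like a 2-D vector.
    x_rot = c * x - s * y
    y_rot = s * x + c * y
    return np.stack([w, x_rot, y_rot, z])
```

Applied to a plane-wave source at azimuth φ, this shifts the apparent azimuth to φ + θ, so a single recorded flight can be replayed from many directions when training the network.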