About MedBIoT data set
Creators
Alejandro Guerra Manzanares, Jorge Alberto Medina Galindo, Hayretdin Bahsi and Sven Nõmm
Department of Software Science, Center for Digital Forensics and Cyber Security; Tallinn University of Technology; Estonia.
Public release date: 27.02.2020 (Paper publication)
Data set information
The experimental setup was prepared by Jorge Alberto Medina Galindo as the core part of his master's thesis. The main aim of this research was to fill the gap left by the lack of data sets for IoT botnet detection. Its main features are:
- Combination of real and emulated IoT devices in a medium-sized network (i.e., 83 devices). No previous data set had combined these kinds of devices.
- Actual malware was deployed, providing real malware network data. Three prominent botnet malware families were deployed: Mirai, BashLite, and Torii.
- Labelled data set. The data set is split according to the traffic source (i.e., normal or malware traffic), which makes it easy to label the data and extract features from the raw pcap files.
- The data set is focused on the early stages of botnet deployment: spreading and C&C communication.
Device types in the network:
- Sonoff Tasmota smart switch
- TPLink smart switch
- TPLink light bulb
- Lock
- Switch
- Fan
- Light
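Because the captures are split by traffic source, every packet in a file can be given the same label. The following is a minimal sketch of that idea, not the authors' tooling: it assumes the files use the classic libpcap format (not pcapng), and the names `iter_packets` and `labelled_packets` are hypothetical helpers introduced here for illustration.

```python
import struct
from typing import BinaryIO, Iterator, Tuple

# Magic bytes of the classic pcap format, mapped to
# (struct byte-order prefix, timestamp fraction divisor).
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", 1_000_000),      # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1_000_000),      # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1_000_000_000),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1_000_000_000),  # big-endian, nanoseconds
}


def iter_packets(stream: BinaryIO) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp, captured bytes) for each record in a classic pcap stream."""
    header = stream.read(24)  # the global header is 24 bytes long
    if len(header) < 24 or header[:4] not in PCAP_MAGICS:
        raise ValueError("not a classic pcap file")
    endian, ts_divisor = PCAP_MAGICS[header[:4]]
    # Per-record header: ts_sec, ts_frac, incl_len, orig_len (4 x uint32).
    record = struct.Struct(endian + "IIII")
    while True:
        raw = stream.read(record.size)
        if len(raw) < record.size:
            return  # end of file (or truncated record header)
        ts_sec, ts_frac, incl_len, _orig_len = record.unpack(raw)
        data = stream.read(incl_len)
        if len(data) < incl_len:
            return  # truncated packet body
        yield ts_sec + ts_frac / ts_divisor, data


def labelled_packets(stream: BinaryIO, label: str) -> Iterator[Tuple[float, int, str]]:
    """Attach one label (e.g. "normal" or "malware") to every packet of a capture."""
    for timestamp, data in iter_packets(stream):
        yield timestamp, len(data), label
```

A caller would open one capture in binary mode and pass the label that matches its traffic source, for example `labelled_packets(open(path, "rb"), "malware")`, where `path` is whichever file was downloaded. Real feature extraction would decode the packet bytes further; this sketch only records the timestamp and captured length.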
Data set structure and files
At the moment, the data set is provided as raw pcap files in two main formats:
- Bulk: pcap files are provided for each data source type (i.e., legitimate, Mirai, BashLite, and Torii). They can be accessed here.
- Fine-grained: pcap files are provided for each data source, botnet phase, and device type. For example, mirai_mal_CC_lock.pcap refers to Mirai botnet malware data corresponding to C&C communication for lock devices. They can be accessed here.
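The fine-grained file names encode the data source, botnet phase, and device type, so they can be split into fields before processing. The sketch below assumes that every fine-grained file follows the same four-field, underscore-separated pattern as the single example `mirai_mal_CC_lock.pcap`; that pattern is inferred from this one name only, and `CaptureInfo` and `parse_capture_name` are hypothetical names used for illustration.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class CaptureInfo(NamedTuple):
    source: str   # e.g. "mirai"
    traffic: str  # e.g. "mal" (malware traffic)
    phase: str    # e.g. "CC" (C&C communication)
    device: str   # e.g. "lock"


def parse_capture_name(filename: str) -> CaptureInfo:
    """Split a fine-grained capture name into its four assumed fields."""
    stem = PurePosixPath(filename).stem  # drop directories and ".pcap"
    # Split at most three times so that any extra underscores
    # stay inside the last (device) field.
    parts = stem.split("_", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected capture name: {filename!r}")
    return CaptureInfo(*parts)
```

For the example above, `parse_capture_name("mirai_mal_CC_lock.pcap")` returns the fields `mirai`, `mal`, `CC`, and `lock`. Names that do not match the assumed pattern raise `ValueError` rather than being guessed at.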
Citation Request
Guerra-Manzanares, A.; Medina-Galindo, J.; Bahsi, H. and Nõmm, S. (2020). MedBIoT: Generation of an IoT Botnet Dataset in a Medium-sized IoT Network. In Proceedings of the 6th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-399-5, pages 207-218. DOI: 10.5220/0009187802070218
List of publications
If you publish a paper using this data set, please let us know so that we can add it to the list of publications using this data set, which will be kept up to date here.
Contact
If you have any questions or comments, or need further information, you can contact us at the following email address: alejandro.guerra@taltech.ee