YOLOv3 architecture

Jayakrishnan Ajayakumar
Andrew J. Curtis
Vanessa Rouzier
Jean William Pape
Sandra Bempah
Meer Taifur Alam
Md. Mahbubul Alam
Mohammed H. Rashid
Afsar Ali
John Glenn Morris

YOLOv3 utilizes Darknet-53 [40] as its backbone network for feature extraction. Each image in the training set, for example the muddy water (M.Water) image in Fig. 1, is divided into an N×N grid (N is typically 7). For each grid cell, the network outputs five bounding boxes, each with an "objectness" score, plus K class probabilities, where K is the total number of classes. Each grid cell therefore produces 25 + K values in total (5 boxes × 4 coordinates + 5 objectness scores + K class probabilities). Rather than predicting the absolute coordinates of the bounding box centers, YOLOv3 predicts an offset relative to the coordinates of the grid cell. Each grid cell is trained to predict only the bounding boxes whose centers lie in that cell. The confidence for predictions in each grid cell is given by Eq. 1.
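These per-cell quantities can be sketched as follows. The function names, the N = 7 grid, and the specific cell indices are illustrative assumptions, not the authors' code; the sigmoid-offset decoding follows the standard YOLOv3 formulation.

```python
import numpy as np

def decode_center(t_x, t_y, grid_x, grid_y, N=7):
    """Turn predicted offsets (t_x, t_y) for the cell at (grid_x, grid_y)
    into box-center coordinates normalized to [0, 1] over the image.
    The sigmoid constrains the center to stay inside its own grid cell."""
    sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
    b_x = (grid_x + sigmoid(t_x)) / N
    b_y = (grid_y + sigmoid(t_y)) / N
    return b_x, b_y

def values_per_cell(K, boxes=5):
    """Per-cell output size: 5 boxes x 4 coordinates + 5 objectness
    scores + K class probabilities = 25 + K for 5 boxes."""
    return boxes * 4 + boxes + K
```

For example, a zero offset predicted by the center cell of a 7×7 grid decodes to the center of the image, and a 20-class detector produces 45 values per cell.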

The YOLOv3 model. Object detection is posed as a regression problem

Here Pr(Object) is 1 if a target is in the grid cell and 0 otherwise. IOU^truth_pred (intersection over union) is defined as the overlap ratio between the predicted bounding box and the ground-truth bounding box (Eq. 2). The confidence therefore estimates both whether a grid cell contains an object and how accurate the predicted bounding box is.
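The definitions behind Eq. 1 and Eq. 2 can be sketched directly; this is a minimal illustration in which the (x_min, y_min, x_max, y_max) box format and function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union (Eq. 2) of two axis-aligned boxes,
    each given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes get zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def confidence(pr_object, iou_pred_truth):
    """Eq. 1: confidence = Pr(Object) * IOU^truth_pred."""
    return pr_object * iou_pred_truth
```

A perfectly predicted box in an object-containing cell yields confidence 1; an empty cell (Pr(Object) = 0) yields 0 regardless of overlap.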

In order to reduce the detection error, anchor boxes, which are a priori bounding boxes (5 per grid cell), are generated by applying a k-means algorithm to the heights and widths of the training-set bounding boxes. These priors make the network more likely to predict appropriately sized bounding boxes, which also speeds up training [40]. For training, YOLOv3 uses sum-squared error in the output as the optimization criterion. The loss function is a combination of errors on the bounding box prediction, object prediction, and class prediction (Eq. 3).
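The anchor-generation step can be sketched as a small k-means over (width, height) pairs. The 1 − IOU distance follows the YOLO papers' anchor clustering; the function names, random initialization, and stopping rule here are assumptions, not the authors' exact procedure.

```python
import numpy as np

def anchor_kmeans(wh, k=5, iters=100, seed=0):
    """Cluster (width, height) pairs with k-means under the 1 - IOU
    distance, where boxes are anchored at a common corner so only
    shape matters. Returns k anchor-box priors."""
    rng = np.random.default_rng(seed)
    centers = wh[rng.choice(len(wh), size=k, replace=False)]
    for _ in range(iters):
        # IOU between every box and every center (boxes share a corner).
        inter = (np.minimum(wh[:, None, 0], centers[None, :, 0]) *
                 np.minimum(wh[:, None, 1], centers[None, :, 1]))
        union = ((wh[:, 0] * wh[:, 1])[:, None] +
                 (centers[:, 0] * centers[:, 1])[None, :] - inter)
        # Minimizing 1 - IOU is the same as maximizing IOU.
        assign = np.argmax(inter / union, axis=1)
        new = np.array([wh[assign == j].mean(axis=0) if np.any(assign == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers
```

Using IOU rather than Euclidean distance keeps large and small boxes from being penalized unequally, so the resulting anchors track the shapes actually present in the training data.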
