SSD Algorithm Study and PyTorch Code Analysis [3]: Prior Box Matching

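SSD matches prior boxes to ground truths according to two rules:

  • For each ground truth in the image, find the prior box with the highest IoU; that prior box becomes a positive sample matched to this ground truth. A prior box that ends up matched to no ground truth is a negative sample.
  • For each remaining unmatched prior box, if its IoU with some ground truth exceeds a threshold (typically 0.5), that prior box is also matched to that ground truth.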
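
The two rules can be tried out on a toy IoU matrix. The following is a minimal plain-Python sketch for illustration only, not the code from the SSD repository; the function name `match_priors` and the sample IoU values are made up for this example. Rule 1 is applied last so that it overrides rule 2, which is the same order the PyTorch code uses.

```python
def match_priors(iou, threshold=0.5):
    """iou[g][p] is the IoU between ground truth g and prior p.

    Returns a list of length num_priors holding the matched
    ground-truth index, or -1 for a negative (background) prior.
    """
    num_gt = len(iou)
    num_priors = len(iou[0])
    matched = [-1] * num_priors

    # Rule 2: a prior whose best IoU exceeds the threshold
    # is matched to that ground truth.
    for p in range(num_priors):
        best_g = max(range(num_gt), key=lambda g: iou[g][p])
        if iou[best_g][p] > threshold:
            matched[p] = best_g

    # Rule 1: every ground truth claims its highest-IoU prior,
    # even if that IoU is below the threshold.
    for g in range(num_gt):
        best_p = max(range(num_priors), key=lambda p: iou[g][p])
        matched[best_p] = g

    return matched


# Hypothetical IoU values: 2 ground truths, 4 priors.
iou = [
    [0.60, 0.30, 0.55, 0.10],
    [0.20, 0.40, 0.10, 0.05],
]
result = match_priors(iou)
print(result)  # [0, 1, 0, -1]
```

Prior 1 has an IoU of only 0.40 with ground truth 1, yet it is still a positive sample because it is the best prior that ground truth has. Prior 3 matches nothing and is treated as background.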

The code shows how these rules are implemented:

def match(threshold, truths, priors, variances, labels, loc_t, conf_t, idx):
    """Match each prior box with the ground truth box of the highest jaccard
    overlap, encode the bounding boxes, then return the matched indices
    corresponding to both confidence and location preds.
    Args:
        threshold: (float) The overlap threshold used when matching boxes.
        truths: (tensor) Ground truth boxes, Shape: [num_obj, 4].
        priors: (tensor) Prior boxes from priorbox layers, Shape: [n_priors,4].
        variances: (tensor) Variances corresponding to each prior coord,
            Shape: [num_priors, 4].
        labels: (tensor) All the class labels for the image, Shape: [num_obj].
        loc_t: (tensor) Tensor to be filled w/ encoded location targets.
        conf_t: (tensor) Tensor to be filled w/ matched indices for conf preds.
        idx: (int) current batch index
    Return:
        The matched indices corresponding to 1)location and 2)confidence preds.
    """
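    # Jaccard index: IoU between every ground-truth box and every prior box
    overlaps = jaccard(
        truths,  # (xmin, ymin, xmax, ymax)
        point_form(priors)  # priors are (cx, cy, w, h); converted to (xmin, ymin, xmax, ymax)
    )  # 2-D tensor, Shape: [num_objects, num_priors]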
    
    # (Bipartite Matching)
    # [num_objects,1] best prior for each ground truth; keepdim=True keeps the result 2-D
    # best_prior_idx holds prior-box indices
    best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True)
    # [1,num_priors] best ground truth for each prior
    # best_truth_idx holds ground-truth indices
    best_truth_overlap, best_truth_idx = overlaps.max(0, keepdim=True)
    # num_priors is usually much larger than num_objects, so best_truth_idx
    # has more elements than best_prior_idx

    best_truth_idx.squeeze_(0) # [num_priors]
    best_truth_overlap.squeeze_(0)
    best_prior_idx.squeeze_(1) # [num_objects]
    best_prior_overlap.squeeze_(1)
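    # Set the overlap of each ground truth's best prior to 2 (higher than any IoU),
    # so that prior can never be relabeled as background by the threshold test below
    best_truth_overlap.index_fill_(0, best_prior_idx, 2)  # ensure best prior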

    # Force the best prior of ground truth j to point back to ground truth j
    for j in range(best_prior_idx.size(0)):  # j = 0 .. num_objects-1
        best_truth_idx[best_prior_idx[j]] = j
    # Index truths with best_truth_idx (length num_priors, holding ground-truth indices):
    # row i of the result is the ground-truth box matched to prior i
    matches = truths[best_truth_idx]  # Shape: [num_priors,4]
    
    # conf holds the class label of each prior; +1 because index 0 is reserved for the background class
    conf = labels[best_truth_idx] + 1         # Shape: [num_priors]
    conf[best_truth_overlap < threshold] = 0  # label as background
    loc = encode(matches, priors, variances)
    loc_t[idx] = loc    # [num_priors,4] encoded offsets to learn
    conf_t[idx] = conf  # [num_priors] top class label for each prior
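
The listing calls `point_form` and `jaccard` without showing them. As a rough aid, here is a plain-Python sketch of what they compute for a single pair of boxes. It assumes the corner form is (xmin, ymin, xmax, ymax); the names `point_form_single` and `iou_single` are hypothetical and are not the tensor-based helpers used by `match`, which work on whole batches of boxes at once.

```python
def point_form_single(box):
    """Convert one box from (cx, cy, w, h) to (xmin, ymin, xmax, ymax)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou_single(a, b):
    """Jaccard overlap (IoU) of two corner-form boxes."""
    # Width and height of the intersection, clamped at zero when the boxes are disjoint
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


truth = (0.0, 0.0, 0.5, 0.5)                     # already in corner form
prior = point_form_single((0.5, 0.5, 0.5, 0.5))  # center form converted to corner form
overlap = iou_single(truth, prior)
print(prior, overlap)
```

Here the two boxes share a 0.25 x 0.25 square, so the IoU is 0.0625 / 0.4375 = 1/7, well below the usual 0.5 threshold. Such a prior would be positive only if it happened to be the ground truth's best prior.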

