How Do You Define a Circle?

Perception and Computer Vision Diagnostics

Susan Vanderplas & Muxin Hua

2023-12-06

Problem Overview

Footwear Forensics

Shoe Scanner Outside Setup
  • Collect images of shoe soles from the population using the scanner

  • Identify features in the tread patterns w/ computer vision

  • Generate a local database of common pattern features

  • Characterize frequency of a new shoe w/ random match probability computed from database

"Quantifying the frequency of shoes in a local population is an unsolvable problem" (Leslie Hammer, Hammer Forensics, March 2018)

Footwear Forensics

  • Other researchers use the output from the CNN
    (without a trained model head)

    • hard to explain to practitioners

    • hard to understand meaning

    • for models to be accepted in forensics, they need to be explainable!

Our Assumption in 2018

 

Picture of an African elephant

Picture of an Asian elephant

If models can differentiate between types of elephants, they can identify shapes… right?

Circles

Quads

???

XKCD: Tasks

In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.

Complication: Different CV Models

We can reasonably pose this problem in 3 different ways:

Classification: label same-size regions with one or more classes

Object detection: propose a bounding box and label for each object in an image

Image segmentation: find regions of the image and label each region

Each method requires a different labeling schema, annotation method, and data format
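To make the difference concrete, here is a minimal sketch of how one annotated circle might be encoded under each formulation. The image size, chunk size, field names, and helper functions are illustrative assumptions, not the project's actual schema.

```python
# One hypothetical annotation: a circle of radius 20 centered at (40, 30)
# in a 128x64 px image. All names and sizes here are illustrative.
WIDTH, HEIGHT, CHUNK = 128, 64, 64
circle = {"label": "circle", "cx": 40, "cy": 30, "r": 20}


def to_bounding_box(shape):
    """Object detection: one (xmin, ymin, xmax, ymax) box plus a label."""
    return {
        "label": shape["label"],
        "box": (shape["cx"] - shape["r"], shape["cy"] - shape["r"],
                shape["cx"] + shape["r"], shape["cy"] + shape["r"]),
    }


def to_mask(shape, width, height):
    """Segmentation: a per-pixel 0/1 grid marking the region."""
    r2 = shape["r"] ** 2
    return [[1 if (x - shape["cx"]) ** 2 + (y - shape["cy"]) ** 2 <= r2 else 0
             for x in range(width)] for y in range(height)]


def to_chunk_labels(shape, width, height, chunk):
    """Classification: each same-size chunk gets the set of classes it overlaps.

    Simplification: overlap is tested with the shape's bounding box.
    """
    xmin, ymin, xmax, ymax = to_bounding_box(shape)["box"]
    labels = {}
    for top in range(0, height, chunk):
        for left in range(0, width, chunk):
            overlaps = (xmin < left + chunk and xmax > left and
                        ymin < top + chunk and ymax > top)
            labels[(left, top)] = {shape["label"]} if overlaps else set()
    return labels
```

The same source annotation yields three structurally different targets: a dictionary keyed by chunk, a single box, and a full pixel grid.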

In Search of Human-Friendly Model Output

(What we’ve tried so far)

Initial Approach (~2019)

  • Use VGG16 to classify 256x256 px chunks of images

  • Goal is to label the entire chunk with one or more classes

VGG16 Shoe Example approach

  • Hard to integrate predictions into the main image
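One way to see the integration difficulty: every chunk-level prediction has to be mapped back to pixel coordinates in the full image. A minimal sketch, assuming non-overlapping tiles with edge remainders dropped (the slides do not say how chunks were cut; the function names are hypothetical):

```python
def chunk_origins(width, height, chunk=256):
    """Top-left corners of non-overlapping chunk x chunk tiles.

    Assumption: partial tiles at the right/bottom edges are dropped.
    """
    return [(left, top)
            for top in range(0, height - chunk + 1, chunk)
            for left in range(0, width - chunk + 1, chunk)]


def chunk_region(origin, chunk=256):
    """Full-image region (xmin, ymin, xmax, ymax) covered by one chunk."""
    left, top = origin
    return (left, top, left + chunk, top + chunk)


# Hypothetical 1024x600 px scan: 4 columns x 2 rows of complete tiles.
origins = chunk_origins(1024, 600)
```

A chunk label only says that a class appears somewhere inside that 256x256 region, so localization can be no finer than the chunk itself.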

Not terrible, but there was a lot of class confusion, e.g. between Circle & Text, Quad & Polygon, and Quad & Triangle

Synthetic Data Test (2020)

  • If we create different shapes, can a neural network differentiate them?
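A minimal sketch of how such shapes could be generated, assuming the classes are regular polygons with 3 to 9 sides, a circle (approximated by a many-sided polygon), and a five-pointed star. The function names, radii, and class set are hypothetical and not taken from the original test.

```python
import math


def regular_polygon(n_sides, radius=1.0, cx=0.0, cy=0.0):
    """Vertices of a regular polygon, first vertex pointing straight up."""
    return [(cx + radius * math.sin(2 * math.pi * k / n_sides),
             cy - radius * math.cos(2 * math.pi * k / n_sides))
            for k in range(n_sides)]


def star(n_points=5, outer=1.0, inner=0.4, cx=0.0, cy=0.0):
    """Star outline: vertices alternate between outer and inner radii."""
    vertices = []
    for k in range(2 * n_points):
        r = outer if k % 2 == 0 else inner
        angle = math.pi * k / n_points
        vertices.append((cx + r * math.sin(angle), cy - r * math.cos(angle)))
    return vertices


# Hypothetical class set: polygons with 3-9 sides, a circle, and a star.
shapes = {str(n): regular_polygon(n) for n in range(3, 10)}
shapes["circle"] = regular_polygon(360)  # many-sided polygon approximates a circle
shapes["star"] = star()
```

Because each outline is produced by the function for its own class, every generated image is labeled without human annotation.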

Class Examples

Model output per class for two example images (images not shown):

Image     Circle   3   4   5   6   7   8   9   Star
(image)   .9999    0   0   0   0   0   0   0   0
(image)   .9999    0   0   0   0   0   0   0   0

  • Shoe data is complicated to label

  • Predictions made on ambiguous data don’t work as well as we’d like

Object Detection (2021-2023)

Object Detection: Propose a bounding box and label for each object in an image

  • re-encode labels using a different data format

  • Toolkit:

    • Started with FastAI, but its support and documentation were terrible
    • Eventually rewrote everything in PyTorch
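The slides do not say which label formats were involved. As an illustration only, the sketch below assumes labels start as polygon outlines and must become boxes in either corner format (xmin, ymin, xmax, ymax) or origin-plus-size format (x, y, width, height); the annotation and function names are hypothetical.

```python
def polygon_to_corner_box(points):
    """Tightest axis-aligned box around a polygon, as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def corner_to_xywh(box):
    """(xmin, ymin, xmax, ymax) -> (x, y, width, height)."""
    xmin, ymin, xmax, ymax = box
    return (xmin, ymin, xmax - xmin, ymax - ymin)


def xywh_to_corner(box):
    """(x, y, width, height) -> (xmin, ymin, xmax, ymax)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


# Hypothetical polygon annotation for one tread element.
annotation = {"label": "quad",
              "points": [(10, 20), (50, 22), (48, 60), (12, 58)]}
corner_box = polygon_to_corner_box(annotation["points"])
xywh_box = corner_to_xywh(corner_box)
```

Reducing a polygon to a box discards the outline, which is part of why the shape inside a box can become ambiguous.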

Fundamental Problem

What shape is in the box? Text? Circle? Triangle? Star?

  • Neural networks are trained on millions of human-annotated photos

  • Even shoe soles are artificial relative to a natural scene

  • Networks weren’t trained on the artificial patterns or layouts in shoe soles

  • Labeling is fraught with errors and incomplete information

  • Labeling schemas are very complex & must account for human perception

Approaching the Problem Backwards

  1. Generate a large library of synthetic data

    • pre-labeled

    • complex characteristics

    • Train preliminary model

  2. Run 2D patterns through an existing network to generate more realistic 3D images

    • Train 2nd-gen model

  3. Train on marketing-quality pictures labeled by humans

    • Update 2nd-gen model (transfer learning)

  4. Train on scanner photos

    • Update 3rd-gen model weights
      (account for lower-quality photos)

Measure performance/accuracy changes over time on a consistent set of stimuli created from real shoes
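A minimal sketch of that bookkeeping, assuming one reference label per stimulus and plain accuracy as the metric. The labels, predictions, and generation names below are made-up placeholders, not results from this project.

```python
def accuracy(predicted, truth):
    """Fraction of stimuli whose predicted label matches the reference label."""
    if len(predicted) != len(truth):
        raise ValueError("predictions and reference labels must align")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Reference labels for a fixed stimulus set (made-up placeholders).
truth = ["circle", "quad", "star", "text"]

# Made-up predictions from each hypothetical model generation on those stimuli.
predictions = {
    "gen1_synthetic": ["circle", "circle", "star", "circle"],
    "gen2_rendered": ["circle", "quad", "star", "circle"],
    "gen3_scanner": ["circle", "quad", "star", "text"],
}

accuracy_by_generation = {name: accuracy(p, truth)
                          for name, p in predictions.items()}
```

Keeping the stimulus set fixed is what makes the numbers comparable from one generation to the next.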

Synthetic Pattern Generation

Regions of distinct patterns

Synthetic Data Generation

Different Patterns

Snowflake

Hexagon Open Circle

Stud

Target Solid Circle

Circle Bar Across

Solid Circle Array

Targets with Arcs

6-pointed star

Shoe Outlines

Advantages

  • SVGs can include metadata

  • Easy scaling

  • SVG intersection operations will automatically mark partial objects

  • Flexible data format:

    • Region segmentation
    • Object Detection
    • Object Classification
      all generated from same source data
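A minimal sketch of the idea, using Python's standard `xml.etree.ElementTree`. It assumes labels are stored in a `data-label` attribute (a hypothetical convention) and substitutes a simple bounding-box clip against the region for a true SVG intersection operation, so the `partial` flag is only an approximation.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def make_svg(circles, width, height):
    """Write circles to an SVG string, storing each label as an attribute."""
    ET.register_namespace("", SVG_NS)
    root = ET.Element(f"{{{SVG_NS}}}svg", {
        "width": str(width), "height": str(height),
        "viewBox": f"0 0 {width} {height}"})
    for c in circles:
        ET.SubElement(root, f"{{{SVG_NS}}}circle", {
            "cx": str(c["cx"]), "cy": str(c["cy"]), "r": str(c["r"]),
            "data-label": c["label"]})
    return ET.tostring(root, encoding="unicode")


def detection_labels(svg_text, region):
    """Read circles back and clip their boxes to region (xmin, ymin, xmax, ymax)."""
    rx0, ry0, rx1, ry1 = region
    found = []
    for el in ET.fromstring(svg_text).iter(f"{{{SVG_NS}}}circle"):
        cx, cy, r = (float(el.get(k)) for k in ("cx", "cy", "r"))
        box = (cx - r, cy - r, cx + r, cy + r)
        if box[2] <= rx0 or box[0] >= rx1 or box[3] <= ry0 or box[1] >= ry1:
            continue  # box lies entirely outside the region
        clipped = (max(box[0], rx0), max(box[1], ry0),
                   min(box[2], rx1), min(box[3], ry1))
        found.append({"label": el.get("data-label"), "box": clipped,
                      "partial": clipped != box})
    return found


svg = make_svg([{"label": "circle", "cx": 50, "cy": 50, "r": 10},
                {"label": "target", "cx": 95, "cy": 50, "r": 10}], 100, 100)
labels = detection_labels(svg, (0, 0, 100, 100))
```

Here the second shape extends past the right edge of the region, so its box is clipped and flagged as partial, while the label travels with the geometry in a single file.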

Disadvantages

  • Manual SVG creation
    (52 images ≈ 8 h)

  • Creating a library to generate data

  • 3D rendering after 2D stage:

    • digital via OpenSCAD + SVG?
    • Can apply different surface colors
  • Lots of work required before we start in on photos

End Goal

Human-Friendly Model Outputs

  • Familiar features for database search

  • Data quality flexibility:

    • Messy photos for database creation
    • Neat images for search
  • Goal: reliable estimates of random match probability (RMP): the probability that someone in the area has a shoe with similar characteristics.
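As a rough illustration of the quantity, the sketch below assumes the database stores each shoe as a set of feature labels and treats "similar characteristics" as "contains every query feature". Both are simplifying assumptions, and the toy database is made up.

```python
def random_match_probability(database, query_features):
    """Share of database shoes that contain every feature in the query."""
    if not database:
        raise ValueError("database must contain at least one shoe")
    query = set(query_features)
    matches = sum(1 for shoe in database if query <= set(shoe))
    return matches / len(database)


# Toy database: each entry is the set of features found on one shoe.
database = [
    {"circle", "text"},
    {"quad", "star"},
    {"circle", "quad", "text"},
    {"triangle"},
]
rmp = random_match_probability(database, {"circle", "text"})
```

With human-friendly features, the query and the resulting estimate stay readable to a practitioner: two of the four toy shoes show both a circle and text.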

Questions?

Acknowledgements

This work was funded (or partially funded) by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreements 70NANB15H176 and 70NANB20H019 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, Duke University, University of California Irvine, University of Virginia, West Virginia University, University of Pennsylvania, Swarthmore College and University of Nebraska, Lincoln.

Students who have worked on this project:

  • Muxin Hua (2022-)
  • Jayden Stack (2021-2022)
  • Miranda Tilton (2018-2019)