Local Population Footwear Class Characteristics

An End-to-End Pipeline for Automatic Data Acquisition and Analysis

Susan Vanderplas & Rick Stone

Discussion

How are you currently using footwear forensics?

Use of Footwear Evidence

Some reasons we’ve heard

  • Few individuals trained

    • Collection of footwear impression evidence is difficult

    • First responders often damage evidence at the scene

    • Equipment for collecting prints is difficult to use and expensive

    • Insufficient detail in prints for RAC analysis

    • Insufficient people to perform RAC analysis

  • Not as useful in court as other types of evidence

How do we make footwear evidence more useful?

Random Match Probability

What is the probability of a coincidental match?

  1. Define the comparison population

  2. Sample from the comparison population
    \(N\) total shoes

  3. Identify similar shoes from the comparison population
    \(S\) similar shoes in the \(N\) shoe sample

  4. Estimate the probability of a coincidental match: \[\hat{p} = \frac{S}{N}\]
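
The four steps above reduce to a single ratio, which can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the counts in the usage note are made up, and the Wilson score interval is an addition (not on the slide) that shows how uncertain the estimate is when \(N\) is small.

```python
import math
from typing import Tuple


def random_match_probability(similar: int, total: int) -> float:
    """Point estimate p_hat = S / N from the sampled comparison population."""
    if total <= 0:
        raise ValueError("total (N) must be positive")
    if not 0 <= similar <= total:
        raise ValueError("similar (S) must be between 0 and total (N)")
    return similar / total


def wilson_interval(similar: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is roughly 95%).

    Assumption: the N sampled shoes are treated as independent draws from the
    comparison population, which a real local sample may not satisfy.
    """
    p_hat = random_match_probability(similar, total)
    denom = 1 + z ** 2 / total
    centre = (p_hat + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

For example, with hypothetical counts of S = 3 similar shoes in a sample of N = 200, `random_match_probability(3, 200)` gives 0.015, and `wilson_interval(3, 200)` returns a range around that value rather than a single number.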

Quantifying the frequency of shoes in a local population is an unsolvable problem - Leslie Hammer, Hammer Forensics, March 2018

Obstacles: Characterizing Comparison Populations

  • No 100% complete database of all shoes

    • manufacturer, model, size, tread style, manufacturing molds
  • Shoe purchases vs. frequency of wear (temperature, weather dependence)

  • Local populations may differ wildly

  • New tread patterns appear frequently

Relevant Features

  • Make, Model, Tread pattern, Size, Type of shoe

  • Cannot be used to identify an individual match

  • Used for exclusion

Relevant Features

Features other than make/model and size:

  • Knockoffs often have very similar tread patterns
  • Similar styles have similar tread patterns across brands
  • Unknown shoes can still be classified and assessed

Example boots with similar tread patterns across brands:

  • Dr. Martens: Work 2295 Rigger
  • Eastland: 1955 Edition Jett
  • Timberland: 6” Premium Boot

Automatic Shoe Data Acquisition

Design Philosophy

Requirements (Outdoor)

Requirements (Indoor)

Tech Specs

Scanner Demonstration

Automatic Feature Identification

Automatic Feature ID Goals

  • ID geometric features in outsole images

  • Robust

    • lighting conditions
    • rotation
    • image quality
    • tread colors
  • Fast processing of new images

  • Identify features using human-friendly terms

Statistical Importance

  • Assemble a database of shoe images

    • from local populations
    • with identified features
  • Calculate random match probability

  • Provide more weight to class characteristic comparisons

    • eventually, probabilistic comparisons?
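
To make the link between a feature-labeled database and the random match probability concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `ShoeRecord` type, the rule that a shoe is "similar" when it contains every queried feature, and the toy records, none of which come from the authors' data or software.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ShoeRecord:
    """One scanned shoe from the local population (hypothetical schema)."""
    shoe_id: str
    features: FrozenSet[str]  # geometric features identified in the outsole


def count_similar(database: List[ShoeRecord], query_features: Iterable[str]) -> int:
    """Count shoes whose feature set contains every queried feature.

    Assumption: "similar" means a superset match; other rules are possible.
    """
    query = frozenset(query_features)
    return sum(1 for shoe in database if query <= shoe.features)


def feature_match_probability(database: List[ShoeRecord],
                              query_features: Iterable[str]) -> float:
    """Estimate S / N, where N is the database size and S the similar count."""
    if not database:
        raise ValueError("database must contain at least one shoe")
    return count_similar(database, query_features) / len(database)


# Toy records, invented for this example only.
toy_database = [
    ShoeRecord("A", frozenset({"chevron", "text"})),
    ShoeRecord("B", frozenset({"circle", "line"})),
    ShoeRecord("C", frozenset({"chevron", "text", "line"})),
    ShoeRecord("D", frozenset({"star"})),
    ShoeRecord("E", frozenset({"chevron", "circle"})),
]
```

With these toy records, a query for `{"chevron", "text"}` matches shoes A and C, so the estimate is 2 / 5. Adding more features to the query can only shrink the set of similar shoes, which is how class characteristics could carry more weight in a comparison.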


Computer Vision

Baby’s First Feature Set

Feature classes (each shown with example images):

  • Bowtie
  • Chevron
  • Circle
  • Line
  • Polygon
  • Quadrilateral
  • Star
  • Text
  • Triangle

Used to separate shoes by make/model in (small) local samples

Labeling the Data

Screenshot from Label Studio demonstrating the labeling process.

Model Training

  • Provide images and labels to the algorithm

  • Algorithm tries to reduce the mismatch between its predictions and the labels
    (loss function)

  • End result is an algorithm which takes new images and outputs matching labels (with a corresponding probability)

Babies work similarly, but are a lot cuter (and a lot more needy)
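
The training loop described above can be sketched with a deliberately tiny model. This is not the network used for the outsole images: it is a two-feature logistic regression written in plain Python, with invented inputs and labels, meant only to show a loss function being reduced step by step and a trained model returning a probability for a new input.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    """Probability that input x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(weights: List[float], bias: float,
             inputs: List[List[float]], labels: List[int]) -> float:
    """Average mismatch between predicted probabilities and labels."""
    total = 0.0
    for x, y in zip(inputs, labels):
        p = min(max(predict(weights, bias, x), 1e-12), 1 - 1e-12)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(inputs)


def train(inputs: List[List[float]], labels: List[int], steps: int = 200,
          learning_rate: float = 0.5) -> Tuple[List[float], float, List[float]]:
    """Gradient descent: nudge the parameters to reduce the loss at every step."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    history = []
    n = len(inputs)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, labels):
            error = predict(weights, bias, x) - y  # prediction minus label
            for j, xj in enumerate(x):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
        history.append(log_loss(weights, bias, inputs, labels))
    return weights, bias, history


# Invented toy data: two numeric features per "image", binary label.
toy_inputs = [[0.0, 0.1], [0.2, 0.0], [0.9, 1.0], [1.0, 0.8]]
toy_labels = [0, 0, 1, 1]
weights, bias, history = train(toy_inputs, toy_labels)
```

Here `history` records the loss after each step, so comparing `history[0]` with `history[-1]` shows the mismatch shrinking, and `predict(weights, bias, x)` returns the label probability for a new input `x`.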

Classification vs. Detection

Classification assigns an image to one or more of a fixed set of categories

Is this a rabbit or a deer?

Detection identifies the location of objects in an image and assigns a label

All dogs are identified and have bounding boxes.
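
One way to see the difference is in the shape of each model's output. The sketch below is illustrative only: the type names, the 0.5 threshold, and all numbers are assumptions, not the output format of any particular library or of the authors' models.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ClassificationResult:
    """Classification: one set of class probabilities for the whole image."""
    probabilities: Dict[str, float]

    def labels(self, threshold: float = 0.5) -> List[str]:
        """Classes whose probability clears the (assumed) threshold."""
        return sorted(name for name, p in self.probabilities.items()
                      if p >= threshold)


@dataclass
class Detection:
    """Detection: one labeled, located object; an image yields a list of these."""
    label: str
    score: float
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def box_area(detection: Detection) -> int:
    x_min, y_min, x_max, y_max = detection.box
    return (x_max - x_min) * (y_max - y_min)


# Invented example outputs for a single outsole image.
classification = ClassificationResult({"circle": 0.91, "chevron": 0.62, "star": 0.03})
detections = [
    Detection("circle", 0.88, (10, 20, 110, 70)),
    Detection("chevron", 0.71, (40, 90, 90, 130)),
]
```

The classifier answers "which features appear somewhere in this image?", while the detector also says where each one is, at the cost of needing bounding-box labels for training.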

Results

When classifying images, we get fairly good results, though some classes are confused.
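
A common way to see which classes are confused is to tally (true label, predicted label) pairs. The sketch below uses invented labels purely to show the bookkeeping; it does not reproduce the results reported here, and the helper names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def confusion_counts(true_labels: List[str],
                     predicted_labels: List[str]) -> Counter:
    """Count how often each (true, predicted) pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def most_confused(counts: Counter, top: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    """Off-diagonal pairs (true != predicted), most frequent first."""
    errors = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return sorted(errors.items(), key=lambda item: (-item[1], item[0]))[:top]


# Invented labels for illustration only.
toy_true = ["quadrilateral", "quadrilateral", "polygon",
            "polygon", "circle", "quadrilateral"]
toy_pred = ["quadrilateral", "polygon", "polygon",
            "quadrilateral", "circle", "polygon"]
counts = confusion_counts(toy_true, toy_pred)
```

In this toy case `most_confused(counts)` ranks the quadrilateral-predicted-as-polygon pair first, the kind of overlap one might expect between geometrically related classes.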

Definitions

Classes get confusing

Blue: Prediction matches image label

Grey: Prediction does not match image label

Not everything is labeled correctly

What does the model see?

Unscaled heatmap - DC. Yellow = high activation

Blue: Prediction matches image label

Grey: Prediction does not match image label

Class Characteristic Labeling Activity

Questions

Discussion

  • Collaborate with us!

  • Collect population level data

  • Data sharing

  • Other uses for the scanner or software?

Susan Vanderplas: susan.vanderplas@unl.edu

Rick Stone: rstone@iastate.edu