Perception and Computer Vision Diagnostics
2023-12-06
Collect images of shoe soles from the population using the scanner
Identify features in the tread patterns w/ computer vision
Generate a local database of common pattern features
Characterize frequency of a new shoe w/ random match probability computed from database
Quantifying the frequency of shoes in a local population is an unsolvable problem - Leslie Hammer, Hammer Forensics, March 2018
Other researchers use the output from the CNN
(without a trained model head)
hard to explain to practitioners
hard to understand meaning
for models to be accepted in forensics, they need to be explainable!
If models can differentiate between types of elephants, they can identify shapes… right?
:::
We can reasonably pose this problem in 3 different ways:
Each method requires a different labeling schema, annotation method, and data format
(What we’ve tried so far)
Use VGG16 to classify 256x256 px chunks of images
Goal is to label the entire chunk with one or more classes
| Image | Circle | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Star |
|---|---|---|---|---|---|---|---|---|---|
| (image) | .9999 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| (image) | .9999 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
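The chunk-and-classify setup above can be sketched in a few lines. This is a minimal illustration, not the project's pipeline: it assumes non-overlapping chunks, drops edge pixels that do not fill a whole chunk, and uses a hypothetical 0.5 threshold to turn per-class scores (like those in the table) into one or more labels. The VGG16 model itself is not shown; `split_into_chunks` and `labels_from_scores` are names invented for this sketch.

```python
import numpy as np

CHUNK = 256  # chunk edge length in pixels, as stated in the text

# Class names copied from the table header
CLASS_NAMES = ["Circle", "3", "4", "5", "6", "7", "8", "9", "Star"]


def split_into_chunks(image, size=CHUNK):
    """Cut an (H, W, C) image into non-overlapping size x size chunks.

    Assumption: leftover edge pixels that do not fill a chunk are dropped.
    """
    h, w, c = image.shape
    rows, cols = h // size, w // size
    trimmed = image[: rows * size, : cols * size]
    # (rows, size, cols, size, C) -> (rows, cols, size, size, C)
    grid = trimmed.reshape(rows, size, cols, size, c).swapaxes(1, 2)
    return grid.reshape(rows * cols, size, size, c)


def labels_from_scores(scores, class_names=CLASS_NAMES, threshold=0.5):
    """Multi-label decision: keep every class whose score clears the threshold."""
    return [name for name, s in zip(class_names, scores) if s >= threshold]


# Placeholder scan (all zeros) standing in for a real image
scan = np.zeros((600, 520, 3), dtype=np.uint8)
chunks = split_into_chunks(scan)

# Scores matching a row of the table above
row = [0.9999, 0, 0, 0, 0, 0, 0, 0, 0]
chunk_labels = labels_from_scores(row)
```

Because each class is thresholded independently, a chunk can receive zero, one, or several labels, which is what "one or more classes" requires.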
Shoe data is complicated to label
Predictions made on ambiguous data don’t work as well as we’d like
re-encode labels using a different data format
Toolkit:
Neural networks are trained on millions of human-annotated photos
Even shoe soles are artificial relative to a natural scene
Networks weren’t trained on the artificial patterns or layouts in shoe soles
Labeling is fraught with errors and incomplete information
Labeling schemas are very complex & must account for human perception
Generate a large library of synthetic data
pre-labeled
complex characteristics
Train preliminary model
Run 2D patterns through an existing network to generate more realistic 3D images
Train on marketing-quality pictures labeled by humans
Train on Scanner Photos
Measure performance/accuracy changes over time on a consistent set of stimuli created from real shoes
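One way to read the last step is to score every training stage against the same fixed stimulus set, so that changes in accuracy reflect the model and not the test data. The sketch below assumes single-label stimuli and plain accuracy. The stage names, the stand-in predictors, and the stimuli are hypothetical toys, and the numbers they produce are not results from this project.

```python
def accuracy(predict, stimuli):
    """Fraction of (input, true_label) pairs that predict() gets right."""
    correct = sum(1 for x, y in stimuli if predict(x) == y)
    return correct / len(stimuli)


def track_stages(checkpoints, stimuli):
    """Score each (stage_name, predict) pair on the same fixed stimuli."""
    return {name: accuracy(predict, stimuli) for name, predict in checkpoints}


# Hypothetical fixed stimulus set: (stimulus id, true label)
STIMULI = [(0, "circle"), (1, "star"), (2, "quad"), (3, "circle")]
TRUTH = dict(STIMULI)

# Toy stand-ins for model checkpoints saved after successive stages
checkpoints = [
    ("stage_1", lambda x: "circle"),   # always guesses one class
    ("stage_2", lambda x: TRUTH[x]),   # looks up the true label
]

history = track_stages(checkpoints, STIMULI)
```

Keeping `STIMULI` frozen across stages is the design point: if the test set changed along with the training data, the accuracy curve would not be comparable over time.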
SVGs can include metadata
Easy scaling
SVG intersection operations will automatically mark partial objects
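A small sketch of these three properties, under stated assumptions: labels are stored in a `data-label` attribute (one possible convention, not necessarily the one used here), scaling comes from the `viewBox`, and partial objects are detected with axis-aligned bounding boxes as a simplified stand-in for true SVG path intersection. The helper names and the `quad` label are hypothetical.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def make_svg(shapes, width, height):
    """Serialize labeled rectangles to an SVG string.

    The viewBox lets a renderer scale the drawing to any output size.
    """
    ET.register_namespace("", SVG_NS)
    root = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": f"0 0 {width} {height}"})
    for s in shapes:
        ET.SubElement(root, f"{{{SVG_NS}}}rect", {
            "x": str(s["x"]), "y": str(s["y"]),
            "width": str(s["w"]), "height": str(s["h"]),
            "data-label": s["label"],  # assumed metadata convention
        })
    return ET.tostring(root, encoding="unicode")


def visibility(shape, window):
    """Classify a shape against a crop window (x, y, w, h).

    Returns 'full', 'partial', or 'outside' based on bounding-box overlap.
    """
    wx, wy, ww, wh = window
    ix = max(0, min(shape["x"] + shape["w"], wx + ww) - max(shape["x"], wx))
    iy = max(0, min(shape["y"] + shape["h"], wy + wh) - max(shape["y"], wy))
    overlap = ix * iy
    if overlap == 0:
        return "outside"
    return "full" if overlap == shape["w"] * shape["h"] else "partial"


shapes = [
    {"label": "quad", "x": 10, "y": 10, "w": 50, "h": 50},
    {"label": "quad", "x": 230, "y": 100, "w": 60, "h": 40},
    {"label": "quad", "x": 300, "y": 0, "w": 20, "h": 20},
]
svg_text = make_svg(shapes, 512, 512)
window = (0, 0, 256, 256)
flags = [visibility(s, window) for s in shapes]
```

Here the second rectangle straddles the right edge of the 256 px window, so it is flagged as partial without any manual annotation.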
Flexible data format:
Manual SVG creation
(52 images \(\approx\) 8 h)
Creating a library to generate data
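To illustrate why a generator scales better than manual drawing, here is a minimal sketch of a seeded generator that emits pre-labeled shape records. The shape vocabulary, size range, and canvas size are assumptions made for the example, not the project's actual library. The count of 52 simply mirrors the manual batch size mentioned above.

```python
import random

SHAPES = ("circle", "star", "quad")  # hypothetical label vocabulary


def generate_sample(rng, n_shapes, canvas=256):
    """Return one synthetic image description: a list of labeled shapes."""
    sample = []
    for _ in range(n_shapes):
        size = rng.randint(10, 60)
        sample.append({
            "label": rng.choice(SHAPES),
            "x": rng.randint(0, canvas - size),  # keep the shape on the canvas
            "y": rng.randint(0, canvas - size),
            "size": size,
        })
    return sample


def generate_library(n_images, seed=0):
    """Generate a reproducible library; labels come for free with the data."""
    rng = random.Random(seed)
    return [generate_sample(rng, rng.randint(1, 8)) for _ in range(n_images)]


library = generate_library(52)
```

Seeding the generator makes the library reproducible, so the same synthetic set can be regenerated instead of stored.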
3D rendering after 2D stage:
Lots of work required before we start in on photos
Familiar features for database search
Data quality flexibility:
Goal: reliable estimates of random match probability (RMP): the probability that someone in the area has a shoe with similar characteristics.
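As a rough illustration of how an RMP estimate could be read off a local database, the sketch below treats each shoe as a set of detected features and counts a database shoe as "similar" when it contains every feature of the query shoe. That matching rule, the feature names, and the toy database are all assumptions. A real estimate would also need to account for sampling uncertainty.

```python
def random_match_probability(query, database):
    """Estimate RMP as the share of database shoes matching the query.

    Assumption: a shoe matches when it has all of the query's features.
    """
    if not database:
        raise ValueError("database must contain at least one shoe")
    query = frozenset(query)
    matches = sum(1 for shoe in database if query <= set(shoe))
    return matches / len(database)


# Toy local database: one feature set per observed shoe (hypothetical)
database = [
    {"circle", "star"},
    {"circle", "line"},
    {"circle", "star", "chevron"},
    {"quad"},
]

rmp = random_match_probability({"circle", "star"}, database)
```

With this toy data, two of the four shoes contain both query features, so the estimate is 0.5. Rarer feature combinations yield smaller values.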
This work was funded (or partially funded) by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreements 70NANB15H176 and 70NANB20H019 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, Duke University, University of California Irvine, University of Virginia, West Virginia University, University of Pennsylvania, Swarthmore College and University of Nebraska, Lincoln.
Students who have worked on this project: