class: center, middle, inverse, title-slide # Welcome to Forensic Statistics ## (The Study of Doom) ### Susan Vanderplas
@srvanderplas
--- class: primary-blue ## A Brief Intro to Forensic Statistics - Big challenge: Validate ad-hoc visual methods scientifically - Unknowns: every shoe in the world, with every wear pattern possible, infinite potential for incidental damage - Simpler questions: - Can we distinguish shoes of the same make/model/size? - How do shoes accumulate damage over time? - What measurement methods are best? - Do lab methods translate to "real world" crime scenes? ??? Forensic science has been questioned in the past decade, particularly because examiners make statements like "This shoe made this print to the exclusion of all other shoes in the world", with the only evidence for that claim being the experience of the examiner. In order to validate forensic disciplines, we need to collect data. Since we knew assembling a database of every shoe in the world was impossible, we decided to start "small" - with a few questions that were seen as foundational to our ability to create algorithms to assess the statistical probability that two prints matched. Starting small, incidentally, meant that we tried to answer all 4 of these questions at the same time. --- class: secondary-blue ## Longitudinal Shoe Study - 160 Pairs of shoes, each worn for ~6 months - 2 shoe models with different patterns and sole materials - 4 half-sizes for each shoe model - 7 different methods for imaging the shoes - 3-4 different time points <table width="100%"> <tr class="blank"><td><img src="small-005772L_20171031_2_1_1_csafe_bpbryson.jpg" alt="Digital Scan Image - Nike"/></td><td><img src="small-005772L_20171213_4_1_1_csafe_jekruse.jpg" alt="Camera Image - Nike"/></td><td><img src="small-005772L_20171211_5_1_1_csafe_kruse-bryson-zwart.jpg" alt="Film Image - Nike"/></td><td><img src="small-005772L_20180411_6_1_1_kruse_zwart_Kruse.jpg" alt="Paper Image - Nike"/></td><td><img src="3d_shoes_Participant_005_Turntable_scan.png" alt="3D scan - high quality - Nike"/></td><td> <img src="small-005772L_20171108_7_1_1_csafe_bpbryson.jpg" alt = "Realistic crime scene print - Nike"/></td></tr> <tr class="blank"><td><img src="small-007961L_20171031_2_1_1_csafe_bpbryson.jpg" alt="Digital Scan Image - Adidas"/></td><td><img src="small-007961L_20171030_4_1_1_csafe_hdzwart.jpg" alt="Camera Image - Adidas"/></td><td><img src="small-007961L_20171211_5_1_1_csafe_kruse-bryson-zwart.jpg" alt="Film Image - Adidas"/></td><td><img src="small-007961L_20180411_6_1_1_kruse_zwart_Kruse.jpg" alt="Paper Image - Adidas"/></td><td><img src="3d_shoes_Participant_007_Turntable_scan.png" alt="3D scan - high quality - Adidas"/></td><td> <img src="small-007961L_20171108_7_1_1_csafe_bpbryson.jpg" alt = "Realistic crime scene print - Adidas"/></td></tr> </table> ??? We collaborated with a group at NIST who primarily work in metrology - how to measure things - to set this experiment up. We had 7 different methods for gathering data, and a thorough procedure to control for how the data was collected - right down to the socks the undergraduates wore as they created the shoe prints. As you can imagine, this experiment was pretty expensive - 160 pairs of shoes, plus the measurement equipment, plus the lab workers to take all of the measurements. It was a pretty cool (and extensive) study - bigger than anything anyone had done before, and we were making the data available to the community for free! So let me tell you how things fell apart. --- class: primary-red ## Things Fall Apart ... <img src="3d_shoes_Participant_005_Turntable_scan.png" width="35%" alt="3D scan using the turntable (high res)" style = "position:relative;left:10%;"/><img src="3d_shoes_Participant_005_Handheld_scan.png" width="35%" style = "position:absolute;right:10%;" alt="3D scan using the handheld protocol (low res)"/> - Not enough play time with the collection instruments ??? This particular issue was discovered at the end of the experiment - the 2d digital scanner came with a turntable no one had noticed. Turns out, the turntable was a bit slower, but produced much higher resolution scans. So of course, we changed the protocol to use the turntable for the last round of scans... too bad they didn't match the rest of the scans, right? -- - Modifying data collection procedures mid-experiment to account for helpful suggestions from collaborators ??? We went through several rounds of modifications to the digital photography protocol, involving backgrounds, rulers, and other accessories that made the photos more useful... but also made any automatic processing of those photographs nearly impossible. Every single batch of photos taken has different background, rulers, and general configuration. -- - We didn't control for who was wearing the shoes ??? It turns out that it really matters who is wearing the shoes - the features that show up in the prints are different when the weight distribution is different. So while we controlled for the socks people wore, we didn't control for who was wearing those socks. It's really hard to compare print-to-print when you don't control for who is wearing the shoes. -- - Analysis pipeline? It's just data, we can figure out how to analyze it later .center[<img width="33%" alt="inside out angry gif" src="https://media1.tenor.com/images/54d526fd183bb842780b9df05ebf6af6/tenor.gif?itemid=5628546"/>] ??? By far the biggest problem with this study, though, was that no one had a plan for how to analyze the data. There was no pipeline to remove e.g. the background information from the images, or to clean the images - because none of the people involved had ever worked with images before. They had no idea what they were getting into. So I'd been working on this for 6 months when I managed to actually get 2 images aligned *some* of the time, and my boss is looking at me like I am insane for doing a happy dance in the office, because she had no idea it was going to be that big of an issue. --- class:primary-cyan ## What we learned - The analysis pipeline is the most important part of the project, and can inform how you collect the data <br/><br/> - Certain patterns of shoes make for easier analysis... but the Nike/Adidas models we originally picked aren't among them. <br/><br/> - There's little interest in databases that don't have a demonstration of how the database should be analyzed <br/><br/> - We should teach statisticians how to collect data ??? The study has been completed for about 3 years now, and I'm still working on building a proper analysis pipeline. It's frustrating that sometimes you don't realize how little you know until you actually try to do something, but I don't think we're unique in this kind of mishap: we had some of the most respected specialists in forensics, experimental design, and metrology working with us to design this experiment. The best thing I can suggest to avoid this type of mishap in the future is to go *very* slow and to ensure you can analyze some pilot data before you jump into the deep end feet first. Otherwise, you end up with Terabytes of data, and more existential questions than answers. --- class:inverse-green,center # Acknowledgements <div style="margin-left:30%;margin-right:30%;text-align:left"> <h3>Collaborators</h3> <ul><li>Alicia Carriquiry<li>Guillermo Basulto-Elias<li>James Kruse<li>Soyoung Park</ul> </div> <div style="text-align:left;margin-top:80px;"> This work was partially funded by the Center for Statistics and Applications in Forensic Evidence (CSAFE) through Cooperative Agreement #70NANB15H176 between NIST and Iowa State University, which includes activities carried out at Carnegie Mellon University, University of California Irvine, and University of Virginia.</div>