How to Find a Needle in a Haystack: From the Insider Threat to Solo Perpetrators
Searching for a needle in a haystack is an important task in several contexts of data analysis and decision-making. Examples include identifying the insider threat within an organization, predicting failure in industrial production, and pinpointing the unique signature of a solo perpetrator, such as a school shooter or a lone-wolf terrorist. The challenge differs from identifying a rare event (e.g., a tsunami) or detecting anomalies because the "needle" is not easily distinguished from the haystack. It brings particular difficulties: the lack of sufficient data to train a machine learning model, the identification of the relevant features, and the painful price of false alarms, which might cause us to question the relevance of machine learning solutions even if they perform well according to common performance criteria. In this book, Prof. Neuman approaches the problem of finding the needle by focusing specifically on the human factor, from solo perpetrators to insider threats. Providing for the first time a deep, critical, multidimensional, and methodological analysis of the challenge, the book offers data scientists and decision makers a scientific foundation combined with a pragmatic approach that may guide them in searching for a needle in a haystack.
Acknowledgments
Preface
1 The Needle Challenge: From Shipping Vessels to the Insider Threat
2 What Is a Needle in a Haystack?: A Lesson from Miss Lucy and Vladimir Putin
3 How Are Rare Events Formed?: Modeling through the Galton Machine
4 Crying Wolf: False Alarms and Their Price
5 Why Is It Difficult to Find the Needle?: On Rare and Common Paths
6 Why Do We Fail to Find the Needle?: The Binary Fallacy and the Bayesian Approach
7 How to Reduce the Size of the Haystack: On Impostors, Cats, and False Positives
8 Needles in the Wild: Some Lessons from Nature
9 Lupus and the Needle: A Contextual-Dynamic Approach to the Needle Challenge
10 How to Deal with Tiny Datasets: The Power of AI
11 Concluding Discussion: Isolated Lights in the Abyss of Ignorance
Index
"This is a thoroughly informative and entertaining short book. The needle in the haystack refers primarily to ‘human needles’ like the ones who, among many with similar attributes, are those who will actually carry through a terrorist attack. Yair Neuman uses data science to help reduce the ‘search space’ of individuals who need to be most closely monitored. Because the book deals in part with a subject that the mainstream media prefers to avoid in case they are accused of ‘racism’, Yair Neuman thankfully avoids the ludicrous level of self-censorship imposed by most academics. The book is written in an accessible style that will be understandable to a very wide audience. This includes excellent lay explanations of some quite complex machine learning concepts. I was also very happy to see extensive use of the Bayesian approach to evidence evaluation in several chapters. I strongly recommend this book."
--Prof. Norman Fenton, Queen Mary University of London, author of "Risk Assessment and Decision Analysis with Bayesian Networks" (with M. Neil)
"Yair Neuman provides an original and captivating treatise on rare events classification, the ‘needle in a haystack’ problem that remains a pervasive challenge in academic research and applied data science wherever risk mitigation is concerned. He draws on decades of scientific and engineering experience to demystify rigorous and technical solutions through lucid, real-world examples on topics ranging from terrorist identification to machine failure. What results contains a hallmark of brilliant artwork as much as science: it becomes so entertaining readers will find themselves revisiting it to absorb the details they missed. This book is essential reading if you intend to analyze rare events, whether you are a student or trained machine learning professional."
--Dr. Joshua Tschantret, Emory University