Information Systems and Technology (CGU)
Databases and Information Systems | Theory and Algorithms
Much information that could help solve and prevent crimes is never gathered because the reporting methods available to citizens and law enforcement personnel are not optimal. Detectives do not have sufficient time to interview crime victims and witnesses. Moreover, many victims and witnesses are too scared or embarrassed to report incidents. We are developing an interviewing system that will help collect such information. We report here on one component, the crime information extraction module, which uses natural language processing to extract crime information from police reports, newspaper articles, and victims’ and witnesses’ crime narratives. We tested our approach with two types of documents: police and witness narrative reports. Our algorithms extract crime-related information, namely weapons, vehicles, time, people, clothes, and locations. We achieved high precision (96%) and recall (83%) for police narrative reports and comparable precision (93%) but somewhat lower recall (77%) for witness narrative reports. The difference in recall was significant at p < .05. We then applied a spell checker to the witness narratives to evaluate whether it would improve their processing. We found that both precision (94%) and recall (79%) improved slightly.
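For readers unfamiliar with the two evaluation measures reported above, the following is a minimal sketch of how precision and recall are conventionally computed for an extraction task. It is not the authors' evaluation code. The function name `precision_recall` and the example entities are hypothetical and are not data from the study.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted items against a gold standard.

    precision = correct extractions / all extractions
    recall    = correct extractions / all gold-standard items
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (entity type, text) pairs for one narrative; illustration only.
gold = {
    ("weapon", "knife"),
    ("vehicle", "red sedan"),
    ("location", "Main Street"),
    ("time", "9 pm"),
}
extracted = {
    ("weapon", "knife"),
    ("vehicle", "red sedan"),
    ("location", "Main Street"),
    ("people", "the clerk"),  # not in the gold standard: lowers precision
}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this sketch, an extracted item that is absent from the gold standard lowers precision, while a gold-standard item that is never extracted (here the time expression) lowers recall.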
© 2008 Chih Hao Ku, Alicia Iriberri, and Gondy Leroy
C. H. Ku, A. Iriberri, and G. Leroy, "Natural Language Processing and e-Government: Crime Information Extraction from Heterogeneous Data Sources," Ninth International Conference on Digital Government Research (DG.O 2008), May 18-21, 2008, Montreal, Canada.