CGU Faculty Publications and Research

Natural Language Processing and e-Government: Crime Information Extraction from Heterogeneous Data Sources

Chih Hao Ku '12, Claremont Graduate University
Alicia Iriberri '06, Claremont Graduate University
Gondy Leroy, Claremont Graduate University

Document Type

Poster

Department

Information Systems and Technology (CGU)

Publication Date

2008

Disciplines

Databases and Information Systems | Theory and Algorithms

Abstract

Much information that could help solve and prevent crimes is never gathered because the reporting methods available to citizens and law enforcement personnel are not optimal. Detectives do not have sufficient time to interview crime victims and witnesses. Moreover, many victims and witnesses are too scared or embarrassed to report incidents. We are developing an interviewing system that will help collect such information. We report here on one component, the crime information extraction module, which uses natural language processing to extract crime information from police reports, newspaper articles, and victims’ and witnesses’ crime narratives. We tested our approach with two types of documents: police and witness narrative reports. Our algorithms extract crime-related information, namely weapons, vehicles, time, people, clothes, and locations. We achieved high precision (96%) and recall (83%) for police narrative reports and comparable precision (93%) but somewhat lower recall (77%) for witness narrative reports. The difference in recall was significant at p < .05. We then applied a spell checker to the witness narratives to evaluate whether it would improve their processing. We found that both precision (94%) and recall (79%) improved slightly.
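For readers unfamiliar with the two measures reported above, the following is a minimal sketch of how precision and recall can be computed for extracted entities. It is not the authors' evaluation code. The function name `precision_recall` and the toy entities are hypothetical and chosen only for illustration, and the sketch assumes an extracted item counts as correct only when it exactly matches a gold-standard item.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted items against a gold standard.

    Assumption: an extracted item is correct only if it exactly matches
    an item in the gold standard (same entity type and same text).
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Precision: share of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: share of gold-standard items that were found.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the study.
gold = {
    ("weapon", "knife"),
    ("vehicle", "red truck"),
    ("location", "Main Street"),
    ("time", "9 pm"),
}
extracted = {
    ("weapon", "knife"),
    ("vehicle", "red truck"),
    ("location", "Main Street"),
}

p, r = precision_recall(extracted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case every extracted entity is correct but one gold entity is missed, so precision is higher than recall, the same direction as the results reported in the abstract.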

Rights Information

Terms of Use & License Information

Recommended Citation

C. H. Ku, A. Iriberri, and G. Leroy, "Natural Language Processing and e-Government: Crime Information Extraction from Heterogeneous Data Sources," Ninth International Conference on Digital Government Research (DG.O 2008), May 18-21, 2008, Montreal, Canada.

Included in

Databases and Information Systems Commons, Theory and Algorithms Commons
