Student Co-author

CGU Graduate

Document Type

Conference Proceeding


Information Systems and Technology (CGU)

Publication Date



Databases and Information Systems | Management Information Systems


Law enforcement uses crime reports to find criminals, prevent further violations, identify the problems that cause crime, and allocate government resources. Unfortunately, many crimes go unreported, which may lead to an inaccurate picture of crime and suboptimal responses to the actual situation. Our goal is to use a data mining approach to better understand when crime is and is not reported. Such an understanding could lead to new, more effective programs to fight crime or to changes in existing programs. We use the National Crime Victimization Survey (NCVS), which comprises data collected from 45,000 households about incidents, victims, suspects, and whether each incident was reported. We use decision trees to predict whether an incident is reported, and we compare trees built from domain knowledge with trees created automatically. For the automatically created trees, we compare three variable selection methods: two filters, Chi-squared and Cramér's V coefficient, and a forward selection wrapper. We found that the automatically constructed decision trees are as accurate as those based on domain knowledge, yet they show a different picture. We conclude that decision trees suggest several new hypotheses for criminologists and that, because they are constructed automatically and are easy to understand, they are practical and useful.
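
As an illustration of the two filter methods named in the abstract, the following minimal sketch computes the Chi-squared statistic and Cramér's V between a categorical variable and a reported/not-reported label, then ranks the variables by Cramér's V. It is not the authors' implementation: the variable names (`weapon_present`, `time_of_day`) and the eight toy records are invented for illustration and are not NCVS data.

```python
from collections import Counter
from math import sqrt


def chi_squared(xs, ys):
    """Pearson chi-squared statistic for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            observed = joint.get((r, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def cramers_v(xs, ys):
    """Cramér's V: chi-squared rescaled to the range [0, 1]."""
    n = len(xs)
    k = min(len(set(xs)), len(set(ys))) - 1
    if k == 0:  # a constant variable carries no association
        return 0.0
    return sqrt(chi_squared(xs, ys) / (n * k))


# Hypothetical toy records (NOT taken from the NCVS).
reported = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
candidates = {
    "weapon_present": ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
    "time_of_day": ["day", "night", "day", "night", "day", "night", "day", "night"],
}

# Filter-style selection: score each variable on its own, then rank.
scores = {name: cramers_v(values, reported) for name, values in candidates.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A filter scores each variable independently of any classifier, which is what distinguishes it from the forward selection wrapper: a wrapper would instead add variables one at a time and keep each only if it improves the accuracy of the decision tree itself.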

Rights Information

© 2007 AIS Electronic Library (AISeL)
