Student Co-author

CGU Graduate

Document Type

Conference Proceeding

Department

Information Systems and Technology (CGU)

Publication Date


Disciplines

Databases and Information Systems | Linguistics | Management Information Systems | Medicine and Health Sciences

Abstract

Although understanding health information is important, the texts provided are often difficult to understand. Readability formulas can measure difficulty levels, but there is little understanding of how linguistic structures contribute to these difficulties. We are developing a toolkit of linguistic metrics that are validated with representative users and can be measured automatically. In this study, we provide an overview of our corpus and of how readability differs by topic and source. We compare two documents on three groups of linguistic metrics. We then report on a user study evaluating one of the differentiating metrics: the percentage of function words in a sentence. Our results show that this percentage correlates significantly with ease of understanding as indicated by users, but not with the levels assigned by commonly used readability formulas. Our study is the first to propose a user-validated metric that is distinct from readability formulas.
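
The metric named in the abstract, the percentage of function words in a sentence, can be illustrated with a minimal sketch. This is not the authors' implementation: the word list `FUNCTION_WORDS`, the regular-expression tokenizer, and the function name `function_word_percentage` are all assumptions made for illustration, and a real study would rely on a fuller lexicon or a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately small function-word list (an assumption for
# illustration only; it is not the list used in the study).
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "if", "of", "in", "on", "at",
    "to", "for", "with", "by", "from", "is", "are", "was", "were", "be",
    "it", "this", "that", "he", "she", "they", "we", "you", "not",
})


def function_word_percentage(sentence: str) -> float:
    """Return the percentage of tokens in `sentence` that are function words."""
    # Naive tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0  # an empty sentence has no function words
    hits = sum(1 for token in tokens if token in FUNCTION_WORDS)
    return 100.0 * hits / len(tokens)


if __name__ == "__main__":
    print(function_word_percentage("The cat sat on the mat"))
```

In this sketch, three of the six tokens in the example sentence ("the", "on", "the") appear in the assumed list, so the call yields 50.0. The result depends entirely on the word list chosen.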

Comments

Nominated for Distinguished Paper Award

Rights Information

Copyright © 2008 AMIA - All rights reserved.

Terms of Use & License Information

Terms of Use for work posted in Scholarship@Claremont.