Abstract / Synopsis
A permutation π in S_n can be decomposed into its runs, π = τ_1 τ_2 ... τ_k, where a run of π is a maximal contiguous subsequence whose elements are in increasing order. If the first values of the runs are in increasing order, then π is said to be flattened. Motivated by the study of flattened permutations, we study the words in the Danish, German, English, Spanish, French, Italian, Dutch, and Norwegian languages. For each language considered, our work provides the following: a list of the longest flattened words, histograms for the proportion of words aggregated by their length and number of runs, and histograms for flattened words aggregated by number of runs and length. We also analyze purely numeric languages whose words have length at most seven and whose letters are elements of {1, 2, ..., 9}, so as to compare the distributions of natural languages to those of these numeric languages. We include a thorough description of our data-gathering process and the code used for our analysis, and a link to a webpage which implements the code and provides the histograms of our findings. We end with some directions for future work.
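As an illustration of the definitions above, the following is a minimal Python sketch of the run decomposition and the flattened test. It is not the authors' code: the function names `runs` and `is_flattened` are hypothetical, and the handling of repeated letters in words (a repeated letter starts a new run, and the first values must be strictly increasing) is an assumption, since the abstract only states the definitions for permutations.

```python
from typing import List, Sequence


def runs(seq: Sequence) -> List[list]:
    """Split seq into maximal contiguous strictly increasing runs.

    Assumption: "increasing" is read as strictly increasing, so a
    repeated entry starts a new run. This never matters for permutations.
    """
    result: List[list] = []
    for x in seq:
        if result and result[-1][-1] < x:
            result[-1].append(x)   # x extends the current run
        else:
            result.append([x])     # x starts a new run
    return result


def is_flattened(seq: Sequence) -> bool:
    """Return True if the first values of the runs are increasing."""
    leaders = [run[0] for run in runs(seq)]
    return all(a < b for a, b in zip(leaders, leaders[1:]))


# The permutation 1 3 2 4 has runs 13 and 24, with first values 1 < 2.
print(runs([1, 3, 2, 4]), is_flattened([1, 3, 2, 4]))
# The permutation 2 1 3 has runs 2 and 13, with first values 2 > 1.
print(runs([2, 1, 3]), is_flattened([2, 1, 3]))
# Words are compared letter by letter in alphabetical order.
print(runs("cab"), is_flattened("cab"))
```

Because Python compares single lowercase characters alphabetically, the same two functions apply to a lowercase word passed as a string; accented or mixed-case letters would need a normalization step that this sketch does not attempt.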
DOI
10.5642/jhummath.EWXJ2437
Recommended Citation
Jennifer Elder, Pamela E. Harris & Anthony Simpson, "Language Analysis via the Run and Flattened Statistics on Permutations," Journal of Humanistic Mathematics, Volume 14 Issue 2 (July 2024), pages 127-241. DOI: 10.5642/jhummath.EWXJ2437. Available at: https://scholarship.claremont.edu/jhm/vol14/iss2/7
Included in
English Language and Literature Commons, French and Francophone Language and Literature Commons, German Language and Literature Commons, Italian Language and Literature Commons, Mathematics Commons, Spanish and Portuguese Language and Literature Commons