Foundations and Techniques for Computational Analysis: Working With the 16th and 17th Century English Prose Fiction Corpus
This dissertation delineates and develops the foundational work necessary for a computational analysis of 16th- and 17th-century English prose fiction. The corpus of prose fiction in the period is defined. The work necessary to convert the corpus to plain text, in a standardized and computationally useful form, is described in detail and is accomplished to a degree sufficient to permit meaningful statistical results. Examples are provided of various tools frequently used to work with large corpora: POS tagging, topic analysis, LSA techniques, and psychometrics. The theoretical basis for analyzing literary works of art is discussed with reference to the neuroscientific and biological underpinnings of our use of words, language, and concepts of meaning. Because most computational techniques work at the level of individual words and chunks of words, problems with the techniques in relation to reading, specifically the reading of prose fiction, are discussed.