Givenness

Givenness module.

Givenness is defined as the amount of given information a text exposes over successive constituents [HDM+05]. Givenness can be used as a proxy for text complexity.

TRUNAJOD.givenness.pronoun_density(doc: spacy.tokens.doc.Doc) → float

Compute pronoun density.

This is a measure of text complexity: a text with a higher pronoun density is expected to be harder to read than a text with a lower pronoun density, because the reader must infer what each pronoun refers to. It is computed as the ratio of third-person pronouns to the total number of words in the text.

Parameters

doc (spaCy Doc) – Document to be processed.

Returns

Pronoun density

Return type

float
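The ratio described above can be sketched without loading a spaCy pipeline. The following is a minimal illustration and not TRUNAJOD's actual implementation: `Tok` is a hypothetical stand-in for a spaCy token, and excluding punctuation from the word count and returning 0.0 for an empty text are assumptions made for this example.

```python
from typing import NamedTuple, Sequence


class Tok(NamedTuple):
    """Hypothetical stand-in for a spaCy token (not part of TRUNAJOD)."""

    text: str
    pos: str          # coarse part-of-speech tag, e.g. "PRON", "NOUN"
    person: str = ""  # grammatical person, "3" for third person


def pronoun_density_sketch(tokens: Sequence[Tok]) -> float:
    """Ratio of third-person pronouns to total words."""
    # Assumption: punctuation does not count as a word.
    words = [t for t in tokens if t.pos != "PUNCT"]
    if not words:
        # Assumption: an empty text has density 0.0.
        return 0.0
    third_person = sum(1 for t in words if t.pos == "PRON" and t.person == "3")
    return third_person / len(words)


# "Ella lo compró ayer." has 2 third-person pronouns among 4 words.
sample = [
    Tok("Ella", "PRON", "3"),
    Tok("lo", "PRON", "3"),
    Tok("compró", "VERB"),
    Tok("ayer", "ADV"),
    Tok(".", "PUNCT"),
]
density = pronoun_density_sketch(sample)
print(density)  # 0.5
```

With a real pipeline, the part-of-speech tag and grammatical person would come from the processed spaCy Doc passed as `doc`.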

TRUNAJOD.givenness.pronoun_noun_ratio(doc: spacy.tokens.doc.Doc) → float

Compute pronoun-noun ratio.

This is an approximation of text complexity/readability, since pronouns are co-references to a proper noun or a noun. It is computed as the ratio of third-person pronouns to the total number of nouns.

Parameters

doc (spaCy Doc) – Text to be processed.

Returns

Pronoun-noun ratio

Return type

float
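As a rough illustration of this ratio, the sketch below works on plain `(pos, person)` pairs rather than a spaCy Doc. It is not TRUNAJOD's implementation: the function name is hypothetical, and counting both `NOUN` and `PROPN` as nouns and returning 0.0 when the text has no nouns are assumptions.

```python
from typing import Sequence, Tuple

# Each token is a (pos, person) pair; person is "3" for third person, else "".
TaggedToken = Tuple[str, str]


def pronoun_noun_ratio_sketch(tokens: Sequence[TaggedToken]) -> float:
    """Ratio of third-person pronouns to nouns."""
    pronouns = sum(1 for pos, person in tokens if pos == "PRON" and person == "3")
    # Assumption: proper nouns count as nouns, since pronouns may refer to either.
    nouns = sum(1 for pos, _ in tokens if pos in {"NOUN", "PROPN"})
    if nouns == 0:
        # Assumption: avoid division by zero by returning 0.0.
        return 0.0
    return pronouns / nouns


# "María compró un libro en la tienda y ella lo leyó."
tagged = [
    ("PROPN", ""),   # María
    ("VERB", ""),    # compró
    ("DET", ""),     # un
    ("NOUN", ""),    # libro
    ("ADP", ""),     # en
    ("DET", ""),     # la
    ("NOUN", ""),    # tienda
    ("CCONJ", ""),   # y
    ("PRON", "3"),   # ella
    ("PRON", "3"),   # lo
    ("VERB", ""),    # leyó
    ("PUNCT", ""),   # .
]
ratio = pronoun_noun_ratio_sketch(tagged)  # 2 pronouns / 3 nouns
```

A higher value means more referring expressions per noun, which under the reasoning above suggests more co-reference resolution for the reader.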

HDM+05

Christian F Hempelmann, David Dufty, Philip M McCarthy, Arthur C Graesser, Zhiqiang Cai, and Danielle S McNamara. Using LSA to automatically identify givenness and newness of noun phrases in written discourse. In Proceedings of the 27th Annual Conference of the Cognitive Science Society, 941–946. Erlbaum, Mahwah, NJ, 2005.