Entity Grids

Entity grid module for TRUNAJOD.

This module implements entity-grid-based features. It provides an implementation of the entity grid [BL08] and an implementation of entity graph coherence modeling [GS13].

Danger

This set of features depends heavily on dependency parsing accuracy, which in turn depends on the corpus the dependency parser was trained on. There is no guarantee that it will work with all types of text. In addition, the implementation is simple: we do not perform any coreference resolution for noun phrases and rely only on simple heuristics.

It is also worth noting that we consider entity grid transitions over two-sentence sequences, and the API currently does not provide a hyperparameter to change this.

class TRUNAJOD.entity_grid.EntityGrid(doc, model_name='spacy')

Entity grid class.

Creates an entity grid from a doc, which is the output of applying a loaded spaCy pipeline to a text (nlp(text)). This class therefore depends on the spacy module. It only supports entity grids with transitions of length two.
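To make the idea of a two-sentence transition concrete, here is a minimal, self-contained sketch that does not use TRUNAJOD or spaCy. The helper names (build_grid, transition_ratios) and the hand-tagged input are hypothetical. The normalization (transition count divided by the total number of length-two transitions) follows the description in [BL08] and is an assumption about, not a copy of, what EntityGrid computes internally.

```python
from collections import Counter
from itertools import product

ROLES = ("S", "O", "X", "-")


def build_grid(sentences):
    """Build {entity: [role per sentence]} from per-sentence {entity: role} dicts.

    An entity that does not occur in a sentence gets the role "-".
    """
    entities = sorted({entity for sent in sentences for entity in sent})
    return {e: [sent.get(e, "-") for sent in sentences] for e in entities}


def transition_ratios(grid):
    """Return the relative frequency of each length-two role transition."""
    counts = Counter()
    for roles in grid.values():
        # Pair the role in each sentence with the role in the next sentence.
        for prev, curr in zip(roles, roles[1:]):
            counts[prev + curr] += 1
    total = sum(counts.values())
    return {
        a + b: (counts[a + b] / total if total else 0.0)
        for a, b in product(ROLES, repeat=2)
    }


# Hypothetical input: one {entity: role} dict per sentence, tagged by hand.
sentences = [
    {"dog": "S", "ball": "O"},
    {"dog": "S", "park": "X"},
    {"ball": "S"},
]
grid = build_grid(sentences)
ratios = transition_ratios(grid)
print(grid["dog"])   # roles of "dog" across the three sentences
print(ratios["SS"])  # share of subject-to-subject transitions
```

The sixteen keys of the resulting dictionary correspond to the sixteen get_*_transitions() accessors listed for this class, with "n" in a method name standing for the "-" role.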

get_egrid() → dict

Return obtained entity grid (for debugging purposes).

Returns

entity grid represented as a dict

Return type

dict

get_nn_transitions() → float

Get -- transitions.

Returns

Ratio of transitions

Return type

float

get_no_transitions() → float

Get -O transitions.

Returns

Ratio of transitions

Return type

float

get_ns_transitions() → float

Get -S transitions.

Returns

Ratio of transitions

Return type

float

get_nx_transitions() → float

Get -X transitions.

Returns

Ratio of transitions

Return type

float

get_on_transitions() → float

Get O- transitions.

Returns

Ratio of transitions

Return type

float

get_oo_transitions() → float

Get OO transitions.

Returns

Ratio of transitions

Return type

float

get_os_transitions() → float

Get OS transitions.

Returns

Ratio of transitions

Return type

float

get_ox_transitions() → float

Get OX transitions.

Returns

Ratio of transitions

Return type

float

get_sentence_count() → int

Return sentence count obtained while processing.

Returns

Number of sentences

Return type

int

get_sn_transitions() → float

Get S- transitions.

Returns

Ratio of transitions

Return type

float

get_so_transitions() → float

Get SO transitions.

Returns

Ratio of transitions

Return type

float

get_ss_transitions() → float

Get SS transitions.

Returns

Ratio of transitions

Return type

float

get_sx_transitions() → float

Get SX transitions.

Returns

Ratio of transitions

Return type

float

get_xn_transitions() → float

Get X- transitions.

Returns

Ratio of transitions

Return type

float

get_xo_transitions() → float

Get XO transitions.

Returns

Ratio of transitions

Return type

float

get_xs_transitions() → float

Get XS transitions.

Returns

Ratio of transitions

Return type

float

get_xx_transitions() → float

Get XX transitions.

Returns

Ratio of transitions

Return type

float

TRUNAJOD.entity_grid.dependency_mapping(dep: str) → str

Map dependency tag to entity grid tag.

We consider the notation provided in [BL08]:

EGrid Tag    Dependency Tag
S            nsubj, csubj, csubjpass, dsubjpass
O            iobj, obj, pobj, dobj
X            any other dependency tag

Parameters

dep (string) – Dependency tag

Returns

EGrid tag

Return type

string
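The mapping above can be sketched as a simple set lookup. This is an illustrative reimplementation, not the library's source: the name map_dependency and the two constants are hypothetical, and the subject tag is assumed to be the spaCy-style nsubj.

```python
# Tag sets taken from the mapping table; "nsubj" is an assumed spelling.
SUBJECT_TAGS = {"nsubj", "csubj", "csubjpass", "dsubjpass"}
OBJECT_TAGS = {"iobj", "obj", "pobj", "dobj"}


def map_dependency(dep: str) -> str:
    """Map a dependency tag to an entity grid tag (S, O or X)."""
    if dep in SUBJECT_TAGS:
        return "S"
    if dep in OBJECT_TAGS:
        return "O"
    # Every other dependency tag falls back to X.
    return "X"


print([map_dependency(tag) for tag in ("nsubj", "dobj", "amod")])
```

Note that the lookup is case-sensitive, so tags are expected in the lowercase form that spaCy's dep_ attribute typically uses for these labels.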

TRUNAJOD.entity_grid.get_local_coherence(egrid: TRUNAJOD.entity_grid.EntityGrid) → [float, float, float, float]

Get local coherence from entity grid.

This method computes the coherence value using all the approaches described in [GS13]. These include:

  • local_coherence_PU

  • local_coherence_PW

  • local_coherence_PACC

  • local_coherence_PU_dist

  • local_coherence_PW_dist

  • local_coherence_PACC_dist

Parameters

egrid (EntityGrid) – An EntityGrid object.

Returns

Local coherence based on different heuristics

Return type

tuple of floats
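As a rough illustration of the six measures, the sketch below follows the projection-graph description in [GS13]: sentences are nodes, an edge links an earlier sentence to a later one when they share an entity, and local coherence is the average outdegree. It is a simplified stand-in written for this page, not TRUNAJOD's implementation. The function names, the scheme argument and the input format (one {entity: role} dict per sentence) are all hypothetical.

```python
from itertools import combinations

# Role weights as given in [GS13]: subject 3, object 2, other 1.
ROLE_WEIGHTS = {"S": 3, "O": 2, "X": 1}


def edge_weight(roles_i, roles_j, scheme):
    """Weight of the edge between two sentences under one projection scheme."""
    shared = [entity for entity in roles_i if entity in roles_j]
    if scheme == "PU":    # unweighted: 1 if any entity is shared
        return 1.0 if shared else 0.0
    if scheme == "PW":    # weighted: number of shared entities
        return float(len(shared))
    if scheme == "PACC":  # syntactic: sum of products of role weights
        return float(
            sum(ROLE_WEIGHTS[roles_i[e]] * ROLE_WEIGHTS[roles_j[e]] for e in shared)
        )
    raise ValueError(f"unknown scheme: {scheme}")


def local_coherence(sentences, scheme="PU", use_distance=False):
    """Average outdegree of the sentence projection graph."""
    n = len(sentences)
    if n == 0:
        return 0.0
    total = 0.0
    for i, j in combinations(range(n), 2):  # i < j, so edges point forward
        weight = edge_weight(sentences[i], sentences[j], scheme)
        if use_distance:
            weight /= j - i  # discount edges between distant sentences
        total += weight
    return total / n


# Hypothetical hand-tagged input: one {entity: role} dict per sentence.
sentences = [
    {"dog": "S", "ball": "O"},
    {"dog": "S", "park": "X"},
    {"ball": "S"},
]
for scheme in ("PU", "PW", "PACC"):
    print(scheme, local_coherence(sentences, scheme),
          local_coherence(sentences, scheme, use_distance=True))
```

The three schemes, each with and without the distance discount, correspond to the six names in the list above.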

TRUNAJOD.entity_grid.weighting_syntactic_role(entity_role: str) → int

Return weight given an entity grammatical role.

Weighting scheme for syntactic role of an entity. This uses the heuristic from [GS13], which is:

EGrid Tag    Weight
S            3
O            2
X            1
- (dash)     0

Parameters

entity_role (string) – Entity grammatical role (S, O, X, -)

Returns

Role weight

Return type

int
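The weighting scheme amounts to a dictionary lookup. The following is a minimal sketch with hypothetical names (ROLE_WEIGHTS, role_weight). How the library handles a role outside S, O, X and - is not documented here, so this sketch simply raises KeyError in that case.

```python
# Weights from the table above; "-" marks an entity absent from the sentence.
ROLE_WEIGHTS = {"S": 3, "O": 2, "X": 1, "-": 0}


def role_weight(entity_role: str) -> int:
    """Return the weight of an entity's grammatical role."""
    return ROLE_WEIGHTS[entity_role]  # raises KeyError for unknown roles


# Weights for one entity's roles across three sentences.
column = ["S", "-", "O"]
print([role_weight(role) for role in column])
```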

[BL08] Regina Barzilay and Mirella Lapata. Modeling local coherence: an entity-based approach. Computational Linguistics, 34(1):1–34, 2008.

[GS13] Camille Guinaudeau and Michael Strube. Graph-based local coherence modeling. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 93–103. 2013.