Extractor

aser_extractor

class aser.extract.aser_extractor.BaseASERExtractor(corenlp_path='', corenlp_port=0, **kw)

Bases: object

Base ASER Extractor to extract both eventualities and relations. It includes an instance of BaseEventualityExtractor and an instance of BaseRelationExtractor.

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters

close()

Close the extractor safely
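Because the extractor exposes close(), it can be wrapped in contextlib.closing from the standard library so that it is closed even when extraction raises. The sketch below is only an illustration of that pattern: FakeExtractor and run_and_close are hypothetical names, and the fake class stands in for a real extractor so the snippet runs without CoreNLP.

```python
from contextlib import closing


class FakeExtractor:
    """Hypothetical stand-in that mimics the documented close() method."""

    def __init__(self):
        self.closed = False

    def extract_from_text(self, text):
        # A real extractor would return eventualities and relations here.
        return ([], [])

    def close(self):
        self.closed = True


def run_and_close(extractor, text):
    """Run extract_from_text and always call close() afterwards."""
    with closing(extractor) as ex:
        return ex.extract_from_text(text)


extractor = FakeExtractor()
result = run_and_close(extractor, "My army will find your boat.")
```

With the real library, an extractor instance would be passed in place of FakeExtractor().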

extract_eventualities_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)

Extract eventualities from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities

Return type:

Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]

Input:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]

Output:

[[my army will find you boat],
 [i be sure, we could find you suitable accommodation]]
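The dependencies entries of a parsed sentence are index triples. Reading each triple as (governor index, relation, dependent index), which is what the example data above suggests, a small helper can map them back to tokens for inspection. This is an illustrative sketch on the documented data layout; readable_dependencies is a hypothetical name, not an ASER function.

```python
# First sentence of the parsed result shown above (only the fields used here).
sentence = {
    "dependencies": [(1, "nmod:poss", 0), (3, "nsubj", 1), (3, "aux", 2),
                     (3, "dobj", 5), (3, "punct", 6), (5, "nmod:poss", 4)],
    "tokens": ["My", "army", "will", "find", "your", "boat", "."],
}


def readable_dependencies(parsed_sentence):
    """Replace token indices in each dependency triple with the tokens themselves."""
    tokens = parsed_sentence["tokens"]
    return [(tokens[governor], relation, tokens[dependent])
            for governor, relation, dependent in parsed_sentence["dependencies"]]


triples = readable_dependencies(sentence)
for triple in triples:
    print(triple)
```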

extract_eventualities_from_text(text, output_format='Eventuality', in_order=True, use_lemma=True, annotators=None, **kw)

Extract eventualities from a raw text

Parameters:
  • text (str) – a raw text

  • output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities

Return type:

Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]

Input:

"My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations."

Output:

[[my army will find you boat],
 [i be sure, we could find you suitable accommodation]]
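The return type allows either a nested list (inner lists, as in the output above) or a flat list. Code that only needs the eventualities themselves can normalise both shapes. A minimal sketch, with plain strings standing in for Eventuality objects; flatten_eventualities is a hypothetical helper, not part of ASER.

```python
def flatten_eventualities(result):
    """Return a flat list whether `result` is nested (list of lists) or already flat."""
    flat = []
    for item in result:
        if isinstance(item, list):
            flat.extend(item)  # one inner list of eventualities
        else:
            flat.append(item)  # already a single eventuality
    return flat


# Strings stand in for Eventuality objects (an assumption for illustration).
nested = [["my army will find you boat"],
          ["i be sure", "we could find you suitable accommodation"]]
flat = flatten_eventualities(nested)
same = flatten_eventualities(flat)
```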

extract_from_parsed_result(parsed_result, eventuality_output_format='Eventuality', relation_output_format='Relation', in_order=True, use_lemma=True, **kw)

Extract both eventualities and relations from a parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • eventuality_output_format (str (default = "Eventuality")) – which format to return eventualities, “Eventuality” or “json”

  • relation_output_format (str (default = "Relation")) – which format to return relations, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities and relations

Return type:

Tuple[Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]], Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]]

Input:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]

Output:

([[my army will find you boat],
  [i be sure, we could find you suitable accommodation]],
 [[],
  [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
  [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]])

extract_from_text(text, eventuality_output_format='Eventuality', relation_output_format='Relation', in_order=True, use_lemma=True, annotators=None, **kw)

Extract both eventualities and relations from a raw text

Parameters:
  • text (str) – a raw text

  • eventuality_output_format (str (default = "Eventuality")) – which format to return eventualities, “Eventuality” or “json”

  • relation_output_format (str (default = "Relation")) – which format to return relations, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities and relations

Return type:

Tuple[Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]], Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]]

Input:

"My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations."

Output:

([[my army will find you boat],
  [i be sure, we could find you suitable accommodation]],
 [[],
  [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
  [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]])

extract_relations_from_parsed_result(parsed_result, para_eventualities, output_format='Relation', in_order=True, **kw)

Extract relations from a parsed result (of a paragraph) and extracted eventualities

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph

  • output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted relations

Return type:

Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]

Input:

    [{'dependencies': [(1, 'nmod:poss', 0),
                       (3, 'nsubj', 1),
                       (3, 'aux', 2),
                       (3, 'dobj', 5),
                       (3, 'punct', 6),
                       (5, 'nmod:poss', 4)],
      'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
      'mentions': [],
      'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
      'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
               '(PRP$ your) (NN boat)))) (. .)))',
      'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
      'text': 'My army will find your boat.',
      'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
     {'dependencies': [(2, 'case', 0),
                       (2, 'det', 1),
                       (6, 'nmod:in', 2),
                       (6, 'punct', 3),
                       (6, 'nsubj', 4),
                       (6, 'cop', 5),
                       (6, 'ccomp', 9),
                       (6, 'punct', 13),
                       (9, 'nsubj', 7),
                       (9, 'aux', 8),
                       (9, 'iobj', 10),
                       (9, 'dobj', 12),
                       (12, 'amod', 11)],
      'lemmas': ['in',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 'be',
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodation',
                 '.'],
      'mentions': [],
      'ners': ['O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O'],
      'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
               "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
               'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
               'accommodations)))))))) (. .)))',
      'pos_tags': ['IN',
                   'DT',
                   'NN',
                   ',',
                   'PRP',
                   'VBP',
                   'JJ',
                   'PRP',
                   'MD',
                   'VB',
                   'PRP',
                   'JJ',
                   'NNS',
                   '.'],
      'text': "In the meantime, I'm sure we could find you suitable "
              'accommodations.',
      'tokens': ['In',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 "'m",
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodations',
                 '.']}],
    [[my army will find you boat],
     [i be sure, we could find you suitable accommodation]]

    Output:

    [[],
     [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
     [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
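The printed relations above look like (head id, tail id, {sense: weight}) entries grouped into inner lists. Assuming that shape (it mirrors the printed output only; the actual Relation and "triplet" structures are not specified in this section), the weights can be totalled per sense as follows. sense_weights is a hypothetical helper.

```python
from collections import Counter

# Tuples mirror the printed output above; the exact shape is an assumption.
relations = [
    [],
    [("7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Co_Occurrence": 1.0})],
    [("8540897b645962964fd644242d4cc0032f024e86",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Synchronous": 1.0})],
]


def sense_weights(nested_relations):
    """Sum the weight of every relation sense across all inner lists."""
    totals = Counter()
    for group in nested_relations:
        for _head_id, _tail_id, senses in group:
            totals.update(senses)  # adds each sense's weight to its running total
    return dict(totals)


totals = sense_weights(relations)
```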

extract_relations_from_text(text, output_format='Relation', in_order=True, annotators=None, **kw)

Extract relations from a raw text

Parameters:
  • text (str) – a raw text

  • output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted relations

Return type:

Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]

Input:

"My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations."

Output:

[[],
 [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
 [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]

parse_text(text, annotators=None)

Parse a raw text by corenlp

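Parameters:
  • text (str) – a raw text

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

Returns: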

the parsed result

Return type:

List[Dict[str, object]]

Input:

"My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations."

Output:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]
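Each parsed sentence carries several per-token lists (tokens, lemmas, pos_tags, ners) plus dependency triples that index into them. Before handing a cached or hand-edited parsed result to an extractor, a quick consistency check can catch misaligned data. The following is a sketch under that reading of the output above; check_parsed_sentence is a hypothetical helper, not an ASER function.

```python
PER_TOKEN_KEYS = ("tokens", "lemmas", "pos_tags", "ners")


def check_parsed_sentence(sentence):
    """Return a list of human-readable problems; an empty list means consistent."""
    n_tokens = len(sentence["tokens"])
    problems = []
    for key in PER_TOKEN_KEYS:
        if len(sentence[key]) != n_tokens:
            problems.append(f"{key} has {len(sentence[key])} items, expected {n_tokens}")
    for governor, relation, dependent in sentence["dependencies"]:
        if not (0 <= governor < n_tokens and 0 <= dependent < n_tokens):
            problems.append(f"dependency ({governor}, {relation!r}, {dependent}) is out of range")
    return problems


# First sentence of the output above (fields not checked here are omitted).
first_sentence = {
    "dependencies": [(1, "nmod:poss", 0), (3, "nsubj", 1), (3, "aux", 2),
                     (3, "dobj", 5), (3, "punct", 6), (5, "nmod:poss", 4)],
    "lemmas": ["my", "army", "will", "find", "you", "boat", "."],
    "ners": ["O"] * 7,
    "pos_tags": ["PRP$", "NN", "MD", "VB", "PRP$", "NN", "."],
    "tokens": ["My", "army", "will", "find", "your", "boat", "."],
}
problems = check_parsed_sentence(first_sentence)

# A deliberately damaged copy with one lemma missing.
broken = dict(first_sentence, lemmas=first_sentence["lemmas"][:-1])
broken_problems = check_parsed_sentence(broken)
```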

class aser.extract.aser_extractor.DiscourseASERExtractor(corenlp_path='', corenlp_port=0, **kw)

Bases: BaseASERExtractor

ASER Extractor based on discourse parsing to extract both eventualities and relations (for ASER v2.0)

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters

extract_from_parsed_result(parsed_result, eventuality_output_format='Eventuality', relation_output_format='Relation', in_order=True, use_lemma=True, **kw)

Extract both eventualities and relations from a parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • eventuality_output_format (str (default = "Eventuality")) – which format to return eventualities, “Eventuality” or “json”

  • relation_output_format (str (default = "Relation")) – which format to return relations, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • kw (Dict[str, object]) – other parameters (e.g., syntax_tree_cache)

Returns:

the extracted eventualities and relations

Return type:

Tuple[Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]], Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]]

class aser.extract.aser_extractor.SeedRuleASERExtractor(corenlp_path='', corenlp_port=0, **kw)

Bases: BaseASERExtractor

ASER Extractor based on rules to extract both eventualities and relations (for ASER v1.0)

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters

eventuality_extractor

class aser.extract.eventuality_extractor.BaseEventualityExtractor(corenlp_path='', corenlp_port=0, **kw)

Bases: object

Base ASER eventuality extractor to extract eventualities

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters

close()

Close the extractor safely

extract_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)

Extract eventualities from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities

Return type:

Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]

Input:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]

Output:

[[my army will find you boat],
 [i be sure, we could find you suitable accommodation]]
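The use_lemma flag decides whether an eventuality is rendered from the lemmas list or from the tokens list of the parsed sentence. The sketch below illustrates the difference on the first example sentence. It is not the library's implementation: render_words is a hypothetical helper, the index list is chosen by hand, and lowercasing is an assumption made to match the printed output above.

```python
# Only the two fields needed for this illustration.
sentence = {
    "lemmas": ["my", "army", "will", "find", "you", "boat", "."],
    "tokens": ["My", "army", "will", "find", "your", "boat", "."],
}


def render_words(parsed_sentence, indices, use_lemma=True):
    """Join the selected words, taken from lemmas or from surface tokens."""
    words = parsed_sentence["lemmas"] if use_lemma else parsed_sentence["tokens"]
    # Lowercasing is an assumption that mirrors the printed output.
    return " ".join(words[i] for i in indices).lower()


word_indices = [0, 1, 2, 3, 4, 5]  # every token except the final period
with_lemma = render_words(sentence, word_indices, use_lemma=True)
without_lemma = render_words(sentence, word_indices, use_lemma=False)
```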

extract_from_text(text, output_format='Eventuality', in_order=True, use_lemma=True, annotators=None, **kw)

Extract eventualities from a raw text

Parameters:
  • text (str) – a raw text

  • output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities

Return type:

Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]

Input:

"My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations."

Output:

[[my army will find you boat],
 [i be sure, we could find you suitable accommodation]]

parse_text(text, annotators=None)

Parse a raw text by corenlp

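Parameters:
  • text (str) – a raw text

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

Returns: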

the parsed result

Return type:

List[Dict[str, object]]

Input:

"My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations."

Output:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]

class aser.extract.eventuality_extractor.DiscourseEventualityExtractor(corenlp_path='', corenlp_port=0, **kw)

Bases: BaseEventualityExtractor

ASER eventuality extractor based on constituency analysis to extract eventualities (for ASER v2.0)

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters

extract_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)

Extract eventualities from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities

Return type:

Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]

Input:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]

Output:

[[my army will find you boat],
 [i be sure, we could find you suitable accommodation]]
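
The `dependencies` entries in the input above read as (governor index, relation, dependent index) triples over 0-based token positions; for example, `(3, 'nsubj', 1)` links "find" to "army". A minimal sketch, assuming that reading, which resolves the indices to lemmas. `readable_dependencies` is a hypothetical helper and not part of aser.

```python
def readable_dependencies(sentence):
    """Resolve (governor, relation, dependent) index triples to lemmas."""
    lemmas = sentence["lemmas"]
    return [(lemmas[gov], rel, lemmas[dep])
            for gov, rel, dep in sentence["dependencies"]]


# First sentence of the example input, trimmed to the two fields used here.
sentence = {
    "dependencies": [(1, "nmod:poss", 0), (3, "nsubj", 1), (3, "aux", 2),
                     (3, "dobj", 5), (3, "punct", 6), (5, "nmod:poss", 4)],
    "lemmas": ["my", "army", "will", "find", "you", "boat", "."],
}

for triple in readable_dependencies(sentence):
    print(triple)
```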

class aser.extract.eventuality_extractor.SeedRuleEventualityExtractor(corenlp_path='', corenlp_port=0, **kw)[source]

Bases: BaseEventualityExtractor

ASER eventuality extractor based on rules to extract eventualities (for ASER v1.0)

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters, e.g., “skip_words” to drop sentences that contain such words

extract_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)[source]

Extract eventualities from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • use_lemma (bool (default = True)) – whether the returned eventuality uses lemma

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted eventualities

Return type:

Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]

Input:

[{'dependencies': [(1, 'nmod:poss', 0),
                   (3, 'nsubj', 1),
                   (3, 'aux', 2),
                   (3, 'dobj', 5),
                   (3, 'punct', 6),
                   (5, 'nmod:poss', 4)],
  'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
  'mentions': [],
  'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
  'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
           '(PRP$ your) (NN boat)))) (. .)))',
  'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
  'text': 'My army will find your boat.',
  'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
 {'dependencies': [(2, 'case', 0),
                   (2, 'det', 1),
                   (6, 'nmod:in', 2),
                   (6, 'punct', 3),
                   (6, 'nsubj', 4),
                   (6, 'cop', 5),
                   (6, 'ccomp', 9),
                   (6, 'punct', 13),
                   (9, 'nsubj', 7),
                   (9, 'aux', 8),
                   (9, 'iobj', 10),
                   (9, 'dobj', 12),
                   (12, 'amod', 11)],
  'lemmas': ['in',
             'the',
             'meantime',
             ',',
             'I',
             'be',
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodation',
             '.'],
  'mentions': [],
  'ners': ['O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O',
           'O'],
  'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
           "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
           'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
           'accommodations)))))))) (. .)))',
  'pos_tags': ['IN',
               'DT',
               'NN',
               ',',
               'PRP',
               'VBP',
               'JJ',
               'PRP',
               'MD',
               'VB',
               'PRP',
               'JJ',
               'NNS',
               '.'],
  'text': "In the meantime, I'm sure we could find you suitable "
          'accommodations.',
  'tokens': ['In',
             'the',
             'meantime',
             ',',
             'I',
             "'m",
             'sure',
             'we',
             'could',
             'find',
             'you',
             'suitable',
             'accommodations',
             '.']}]

Output:

[[my army will find you boat],
 [i be sure, we could find you suitable accommodation]]
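
The output above is nested, with one inner list per input sentence. When a single flat list is more convenient, the nesting can be removed with `itertools.chain`. This is a sketch only: plain strings stand in for `Eventuality` objects so that it runs without aser or CoreNLP.

```python
from itertools import chain

# Plain strings stand in for Eventuality objects (an assumption for illustration).
per_sentence = [
    ["my army will find you boat"],
    ["i be sure", "we could find you suitable accommodation"],
]

# Concatenate the per-sentence lists while keeping their original order.
flat = list(chain.from_iterable(per_sentence))
print(flat)
```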

relation_extractor

class aser.extract.relation_extractor.BaseRelationExtractor(**kw)[source]

Bases: object

Base ASER relation extractor to extract relations

close()[source]

extract_from_parsed_result(parsed_result, para_eventualities, output_format='Relation', in_order=True, **kw)[source]

Extract relations from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph

  • output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted relations

Return type:

Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]

Input:

    [{'dependencies': [(1, 'nmod:poss', 0),
                       (3, 'nsubj', 1),
                       (3, 'aux', 2),
                       (3, 'dobj', 5),
                       (3, 'punct', 6),
                       (5, 'nmod:poss', 4)],
      'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
      'mentions': [],
      'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
      'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
               '(PRP$ your) (NN boat)))) (. .)))',
      'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
      'text': 'My army will find your boat.',
      'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
     {'dependencies': [(2, 'case', 0),
                       (2, 'det', 1),
                       (6, 'nmod:in', 2),
                       (6, 'punct', 3),
                       (6, 'nsubj', 4),
                       (6, 'cop', 5),
                       (6, 'ccomp', 9),
                       (6, 'punct', 13),
                       (9, 'nsubj', 7),
                       (9, 'aux', 8),
                       (9, 'iobj', 10),
                       (9, 'dobj', 12),
                       (12, 'amod', 11)],
      'lemmas': ['in',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 'be',
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodation',
                 '.'],
      'mentions': [],
      'ners': ['O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O'],
      'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
               "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
               'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
               'accommodations)))))))) (. .)))',
      'pos_tags': ['IN',
                   'DT',
                   'NN',
                   ',',
                   'PRP',
                   'VBP',
                   'JJ',
                   'PRP',
                   'MD',
                   'VB',
                   'PRP',
                   'JJ',
                   'NNS',
                   '.'],
      'text': "In the meantime, I'm sure we could find you suitable "
              'accommodations.',
      'tokens': ['In',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 "'m",
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodations',
                 '.']}],
    [[my army will find you boat],
     [i be sure, we could find you suitable accommodation]]

    Output:

    [[],
     [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
     [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
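
Each relation in the output above is printed as (head eventuality id, tail eventuality id, {relation name: weight}). A minimal sketch, assuming relations are available as plain tuples of that shape, which sums the weights per (head, tail) pair across all groups. `merge_relations` is a hypothetical helper, and the short ids below are made up to stand in for the 40-character hashes.

```python
from collections import defaultdict


def merge_relations(groups):
    """Sum relation weights per (head id, tail id) pair across all groups."""
    merged = defaultdict(lambda: defaultdict(float))
    for group in groups:
        for head_id, tail_id, weights in group:
            for name, weight in weights.items():
                merged[(head_id, tail_id)][name] += weight
    # Convert to plain dicts for easier printing and comparison.
    return {pair: dict(rels) for pair, rels in merged.items()}


# Made-up ids and an extra repeated relation, for illustration only.
groups = [
    [],
    [("e1", "e2", {"Co_Occurrence": 1.0})],
    [("e3", "e2", {"Synchronous": 1.0}), ("e1", "e2", {"Co_Occurrence": 1.0})],
]
print(merge_relations(groups))
```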

class aser.extract.relation_extractor.DiscourseRelationExtractor(**kw)[source]

Bases: BaseRelationExtractor

ASER relation extractor based on discourse parsing to extract relations (for ASER v2.0)

extract_from_parsed_result(parsed_result, para_eventualities, output_format='triplet', in_order=False, **kw)[source]

Extract relations from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph

  • output_format (str (default = "triplet")) – which format to return, “Relation” or “triplet”

  • in_order (bool (default = False)) – whether the returned order follows the input token order

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted relations

Return type:

Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]

Input:

    [{'dependencies': [(1, 'nmod:poss', 0),
                       (3, 'nsubj', 1),
                       (3, 'aux', 2),
                       (3, 'dobj', 5),
                       (3, 'punct', 6),
                       (5, 'nmod:poss', 4)],
      'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
      'mentions': [],
      'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
      'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
               '(PRP$ your) (NN boat)))) (. .)))',
      'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
      'text': 'My army will find your boat.',
      'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
     {'dependencies': [(2, 'case', 0),
                       (2, 'det', 1),
                       (6, 'nmod:in', 2),
                       (6, 'punct', 3),
                       (6, 'nsubj', 4),
                       (6, 'cop', 5),
                       (6, 'ccomp', 9),
                       (6, 'punct', 13),
                       (9, 'nsubj', 7),
                       (9, 'aux', 8),
                       (9, 'iobj', 10),
                       (9, 'dobj', 12),
                       (12, 'amod', 11)],
      'lemmas': ['in',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 'be',
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodation',
                 '.'],
      'mentions': [],
      'ners': ['O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O'],
      'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
               "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
               'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
               'accommodations)))))))) (. .)))',
      'pos_tags': ['IN',
                   'DT',
                   'NN',
                   ',',
                   'PRP',
                   'VBP',
                   'JJ',
                   'PRP',
                   'MD',
                   'VB',
                   'PRP',
                   'JJ',
                   'NNS',
                   '.'],
      'text': "In the meantime, I'm sure we could find you suitable "
              'accommodations.',
      'tokens': ['In',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 "'m",
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodations',
                 '.']}],
    [[my army will find you boat],
     [i be sure, we could find you suitable accommodation]]

    Output:

    [[],
     [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
     [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]

class aser.extract.relation_extractor.SeedRuleRelationExtractor(**kw)[source]

Bases: BaseRelationExtractor

ASER relation extractor based on rules to extract relations (for ASER v1.0)

extract_from_parsed_result(parsed_result, para_eventualities, output_format='Relation', in_order=True, **kw)[source]

Extract relations from the parsed result

Parameters:
  • parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp

  • para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph

  • output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”

  • in_order (bool (default = True)) – whether the returned order follows the input token order

  • kw (Dict[str, object]) – other parameters

Returns:

the extracted relations

Return type:

Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]

Input:

    [{'dependencies': [(1, 'nmod:poss', 0),
                       (3, 'nsubj', 1),
                       (3, 'aux', 2),
                       (3, 'dobj', 5),
                       (3, 'punct', 6),
                       (5, 'nmod:poss', 4)],
      'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'],
      'mentions': [],
      'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'],
      'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP '
               '(PRP$ your) (NN boat)))) (. .)))',
      'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'],
      'text': 'My army will find your boat.',
      'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']},
     {'dependencies': [(2, 'case', 0),
                       (2, 'det', 1),
                       (6, 'nmod:in', 2),
                       (6, 'punct', 3),
                       (6, 'nsubj', 4),
                       (6, 'cop', 5),
                       (6, 'ccomp', 9),
                       (6, 'punct', 13),
                       (9, 'nsubj', 7),
                       (9, 'aux', 8),
                       (9, 'iobj', 10),
                       (9, 'dobj', 12),
                       (12, 'amod', 11)],
      'lemmas': ['in',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 'be',
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodation',
                 '.'],
      'mentions': [],
      'ners': ['O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O',
               'O'],
      'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP '
               "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD "
               'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS '
               'accommodations)))))))) (. .)))',
      'pos_tags': ['IN',
                   'DT',
                   'NN',
                   ',',
                   'PRP',
                   'VBP',
                   'JJ',
                   'PRP',
                   'MD',
                   'VB',
                   'PRP',
                   'JJ',
                   'NNS',
                   '.'],
      'text': "In the meantime, I'm sure we could find you suitable "
              'accommodations.',
      'tokens': ['In',
                 'the',
                 'meantime',
                 ',',
                 'I',
                 "'m",
                 'sure',
                 'we',
                 'could',
                 'find',
                 'you',
                 'suitable',
                 'accommodations',
                 '.']}],
    [[my army will find you boat],
     [i be sure, we could find you suitable accommodation]]

    Output:

    [[],
     [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})],
     [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]

parsed_reader

class aser.extract.parsed_reader.ParsedReader[source]

Bases: object

File reader to read parsed results from disk

close()[source]

generate_sid(sentence, file_name, line_no)[source]

Parameters:
  • sentence (str) – the raw text

  • file_name (str) – the file name

  • line_no (int) – the line number

Returns:

the corresponding sentence id

Return type:

str

get_parsed_paragraphs_from_file(processed_path)[source]

This method retrieves all paragraphs from a processed file

Parameters:

processed_path (str or None) – the file path of the processed file

Returns:

a list of lists of dicts

get_parsed_sentence_and_context(sid, context_window_size=0)[source]

Retrieve the parsed results of the corresponding sentence and its context

Parameters:
  • sid (str) – the sentence id

  • context_window_size (int (default = 0)) – the context window size

Returns:

a dictionary that contains the “sentence”, “left_context”, and “right_context”

Return type:

Dict[str, object]
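
To illustrate the shape of the returned dictionary, the following sketch selects a sentence and its neighbours from an in-memory list. It is not the library's implementation: it indexes by list position rather than by sentence id, and it assumes the window is measured in sentences on each side. `sentence_with_context` is a hypothetical name.

```python
def sentence_with_context(sentences, index, context_window_size=0):
    """Return the sentence at `index` plus up to N neighbours on each side."""
    left_start = max(0, index - context_window_size)
    return {
        "sentence": sentences[index],
        "left_context": sentences[left_start:index],
        "right_context": sentences[index + 1:index + 1 + context_window_size],
    }


# Placeholder strings stand in for parsed-sentence dicts.
paragraph = ["s0", "s1", "s2", "s3"]
print(sentence_with_context(paragraph, 1, context_window_size=2))
```

With `context_window_size=0`, both context lists are empty and only the sentence itself is returned.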

sentence_parser

class aser.extract.sentence_parser.SentenceParser(corenlp_path='', corenlp_port=0, **kw)[source]

Bases: object

Sentence parser to process files that contain raw texts

Parameters:
  • corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2

  • corenlp_port (int (default = 0)) – corenlp port, e.g., 9000

  • kw (Dict[str, object]) – other parameters

close()[source]

Close the parser safely

generate_sid(sentence, file_name, sid)[source]

Parameters:
  • sentence (str) – the raw text

  • file_name (str) – the file name

  • line_no (int) – the line number

Returns:

the corresponding sentence id

Return type:

str

parse(paragraph, annotators=None, max_len=1024)[source]

Parameters:
  • paragraph (str) – a raw text

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

  • max_len (int (default = 1024)) – the maximum length of a paragraph (constituency parsing cannot handle very long sentences)

Returns:

the parsed result

Return type:

List[Dict[str, object]]

parse_raw_file(raw_path, processed_path=None, annotators=None, max_len=1024)[source]

Parse all raw texts in the given file

Parameters:
  • raw_path (str) – the file path that contains raw texts

  • processed_path (str) – the file path that stores the parsed result

  • annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html

  • max_len (int (default = 1024)) – the maximum length of a paragraph (constituency parsing cannot handle very long sentences)

Returns:

the parsed result

Return type:

List[List[Dict[str, object]]]
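
Because the parser exposes `close()`, one way to make sure it is always closed is the standard-library `contextlib.closing` wrapper. The sketch below uses a stand-in class, `_StubParser`, so that it runs without aser or a CoreNLP server; the stub and its placeholder result are assumptions for illustration, not real parser output.

```python
from contextlib import closing


class _StubParser:
    """Stand-in with the same parse()/close() surface as documented above."""

    def __init__(self):
        self.closed = False

    def parse(self, paragraph, annotators=None, max_len=1024):
        # Placeholder result: a single dict holding the text, not CoreNLP output.
        return [{"text": paragraph}]

    def close(self):
        self.closed = True


parser = _StubParser()
# closing() calls parser.close() on exit, even if parse() raises.
with closing(parser) as p:
    parsed = p.parse("My army will find your boat.")

print(parsed, parser.closed)
```

With the real class, the stub would be replaced by a `SentenceParser` constructed with the `corenlp_path` and `corenlp_port` arguments documented above.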