Extractor
aser_extractor
- class aser.extract.aser_extractor.BaseASERExtractor(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
object
Base ASER Extractor to extract both eventualities and relations. It includes an instance of BaseEventualityExtractor and an instance of BaseRelationExtractor.
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters
- extract_eventualities_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)[source]
Extract eventualities from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities
- Return type:
Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}] Output: [[my army will find you boat], [i be sure, we could find you suitable accommodation]]
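The dependency triples in the sample input can be read against the token list. The sketch below assumes, based only on the sample data above, that each triple is (governor index, relation, dependent index) with 0-based indices into 'tokens'. The helper name readable_dependencies is hypothetical and not part of ASER.

```python
from typing import Dict, List, Tuple

# First sentence of the sample parsed result (only the fields used here).
sentence = {
    "tokens": ["My", "army", "will", "find", "your", "boat", "."],
    "dependencies": [
        (1, "nmod:poss", 0), (3, "nsubj", 1), (3, "aux", 2),
        (3, "dobj", 5), (3, "punct", 6), (5, "nmod:poss", 4),
    ],
}


def readable_dependencies(sent: Dict[str, list]) -> List[Tuple[str, str, str]]:
    """Replace token indices with token strings (hypothetical helper).

    Assumes each triple is (governor_index, relation, dependent_index).
    """
    tokens = sent["tokens"]
    return [(tokens[gov], rel, tokens[dep]) for gov, rel, dep in sent["dependencies"]]


edges = readable_dependencies(sentence)
for gov, rel, dep in edges:
    print(f"{gov} --{rel}--> {dep}")
```

Under that reading, the triple (3, 'nsubj', 1) says that "army" is the subject of "find", which matches the sentence.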
- extract_eventualities_from_text(text, output_format='Eventuality', in_order=True, use_lemma=True, annotators=None, **kw)[source]
Extract eventualities from a raw text
- Parameters:
text (str) – a raw text
output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities
- Return type:
Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]
Input: "My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations." Output: [[my army will find you boat], [i be sure, we could find you suitable accommodation]]
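The effect of use_lemma can be pictured with the per-token fields of a parsed sentence. This is only an illustrative sketch, not ASER's implementation: render_words is a hypothetical helper, the token indices are chosen by hand, and lowercasing is an assumption inferred from the sample output ("i be sure").

```python
# First sentence of the sample parsed result (only the fields used here).
sentence = {
    "tokens": ["My", "army", "will", "find", "your", "boat", "."],
    "lemmas": ["my", "army", "will", "find", "you", "boat", "."],
}


def render_words(sent, indices, use_lemma=True):
    """Join the selected positions using lemmas or surface tokens (hypothetical)."""
    words = sent["lemmas"] if use_lemma else sent["tokens"]
    # Lowercasing is an assumption based on the sample output above.
    return " ".join(words[i].lower() for i in indices)


with_lemma = render_words(sentence, range(6), use_lemma=True)
without_lemma = render_words(sentence, range(6), use_lemma=False)
print(with_lemma)
print(without_lemma)
```

With lemmas the possessive "your" becomes "you", which is why the sample output reads "my army will find you boat".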
- extract_from_parsed_result(parsed_result, eventuality_output_format='Eventuality', relation_output_format='Relation', in_order=True, use_lemma=True, **kw)[source]
Extract both eventualities and relations from a parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
eventuality_output_format (str (default = "Eventuality")) – which format to return eventualities, “Eventuality” or “json”
relation_output_format (str (default = "Relation")) – which format to return relations, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities and relations
- Return type:
Tuple[Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]], Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}], [[my army will find you boat], [i be sure, we could find you suitable accommodation]] Output: ([[my army will find you boat], [i be sure, we could find you suitable accommodation]], [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]])
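In the sample output, the relation list has one more entry than the eventuality list (three versus two). The sketch below pairs the first entries with their sentences and keeps the remainder separate. It is an assumption drawn from the sample, not a documented guarantee, that the leading entries align with sentences. The helper split_relations is hypothetical, and the eventualities and ids are written as plain strings for illustration.

```python
def split_relations(eventualities, relations):
    """Pair per-sentence eventualities with relations; return any leftover lists."""
    n = len(eventualities)
    per_sentence = list(zip(eventualities, relations[:n]))
    trailing = relations[n:]
    return per_sentence, trailing


# Sample output from above, with objects replaced by plain strings.
eventualities = [
    ["my army will find you boat"],
    ["i be sure", "we could find you suitable accommodation"],
]
relations = [
    [],
    [("7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Co_Occurrence": 1.0})],
    [("8540897b645962964fd644242d4cc0032f024e86",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Synchronous": 1.0})],
]

per_sentence, trailing = split_relations(eventualities, relations)
for sentence_eventualities, sentence_relations in per_sentence:
    print(len(sentence_eventualities), "eventualities,", len(sentence_relations), "relations")
print(len(trailing), "trailing relation list(s)")
```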
- extract_from_text(text, eventuality_output_format='Eventuality', relation_output_format='Relation', in_order=True, use_lemma=True, annotators=None, **kw)[source]
Extract both eventualities and relations from a raw text
- Parameters:
text (str) – a raw text
eventuality_output_format (str (default = "Eventuality")) – which format to return eventualities, “Eventuality” or “json”
relation_output_format (str (default = "Relation")) – which format to return relations, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities and relations
- Return type:
Tuple[Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]], Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]]
Input: "My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations." Output: ([[my army will find you boat], [i be sure, we could find you suitable accommodation]], [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]])
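A call to extract_from_text might be wrapped as follows. This is a minimal sketch: the keyword names and the class path are taken from the signatures on this page, while the helper names extract_as_plain_data and build_extractor are hypothetical. The ASER import is deferred so that the module loads without ASER installed. Actually running build_extractor requires ASER and a reachable CoreNLP installation or server.

```python
def extract_as_plain_data(extractor, text):
    """Ask for the "json" and "triplet" output formats (hypothetical wrapper).

    `extractor` is any object exposing extract_from_text with the
    keyword arguments documented above.
    """
    return extractor.extract_from_text(
        text,
        eventuality_output_format="json",
        relation_output_format="triplet",
    )


def build_extractor(corenlp_path="", corenlp_port=0):
    """Create a SeedRuleASERExtractor; imports ASER only when called."""
    from aser.extract.aser_extractor import SeedRuleASERExtractor

    return SeedRuleASERExtractor(corenlp_path=corenlp_path, corenlp_port=corenlp_port)
```

Usage would then look like `eventualities, relations = extract_as_plain_data(build_extractor(corenlp_port=9000), text)`, following the tuple return type documented above.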
- extract_relations_from_parsed_result(parsed_result, para_eventualities, output_format='Relation', in_order=True, **kw)[source]
Extract relations from a parsed result (of a paragraph) and extracted eventualities
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph
output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
kw (Dict[str, object]) – other parameters
- Returns:
the extracted relations
- Return type:
Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}], [[my army will find you boat], [i be sure, we could find you suitable accommodation]] Output: [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
- extract_relations_from_text(text, output_format='Relation', in_order=True, annotators=None, **kw)[source]
Extract relations from a raw text
- Parameters:
text (str) – a raw text
output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html
kw (Dict[str, object]) – other parameters
- Returns:
the extracted relations
- Return type:
Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]
Input: "My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations." Output: [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
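Relations in the sample output appear as (head id, tail id, {sense: weight}) entries grouped into lists. A sketch of tallying the weights per relation sense follows. It assumes the "triplet" format has exactly this shape and that ids are strings, which is inferred from the sample output rather than stated here. The helper sum_sense_weights is hypothetical.

```python
from collections import defaultdict


def sum_sense_weights(relations):
    """Sum the weights of each relation sense over all groups (hypothetical helper)."""
    totals = defaultdict(float)
    for group in relations:
        for _head_id, _tail_id, senses in group:
            for sense, weight in senses.items():
                totals[sense] += weight
    return dict(totals)


# Sample output from above, with ids written as strings.
relations = [
    [],
    [("7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Co_Occurrence": 1.0})],
    [("8540897b645962964fd644242d4cc0032f024e86",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Synchronous": 1.0})],
]

totals = sum_sense_weights(relations)
print(totals)
```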
- parse_text(text, annotators=None)[source]
Parse a raw text by corenlp
- Parameters:
text (str) – a raw text
annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html
- Returns:
the parsed result
- Return type:
List[Dict[str, object]]
Input: "My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations." Output: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}]
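Each sentence in the parsed result carries several per-token lists plus dependency triples. A small consistency check, sketched below, can catch malformed input before it reaches an extractor. The field names come from the sample output above; the helper check_sentence and its error messages are hypothetical, and it assumes the first and third elements of each dependency triple are token indices.

```python
PER_TOKEN_FIELDS = ("tokens", "lemmas", "pos_tags", "ners")


def check_sentence(sent):
    """Return the token count if the per-token fields and indices are consistent."""
    n = len(sent["tokens"])
    for field in PER_TOKEN_FIELDS:
        if len(sent[field]) != n:
            raise ValueError(f"{field} has {len(sent[field])} items, expected {n}")
    for gov, rel, dep in sent["dependencies"]:
        # Assumes the first and third elements are token indices.
        if not (0 <= gov < n and 0 <= dep < n):
            raise ValueError(f"dependency ({gov}, {rel!r}, {dep}) is out of range")
    return n


# First sentence of the sample output above.
sentence = {
    "tokens": ["My", "army", "will", "find", "your", "boat", "."],
    "lemmas": ["my", "army", "will", "find", "you", "boat", "."],
    "pos_tags": ["PRP$", "NN", "MD", "VB", "PRP$", "NN", "."],
    "ners": ["O", "O", "O", "O", "O", "O", "O"],
    "dependencies": [
        (1, "nmod:poss", 0), (3, "nsubj", 1), (3, "aux", 2),
        (3, "dobj", 5), (3, "punct", 6), (5, "nmod:poss", 4),
    ],
}

token_count = check_sentence(sentence)
print(token_count)
```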
- class aser.extract.aser_extractor.DiscourseASERExtractor(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
BaseASERExtractor
ASER Extractor based on discourse parsing to extract both eventualities and relations (for ASER v2.0)
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters
- extract_from_parsed_result(parsed_result, eventuality_output_format='Eventuality', relation_output_format='Relation', in_order=True, use_lemma=True, **kw)[source]
Extract both eventualities and relations from a parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
eventuality_output_format (str (default = "Eventuality")) – which format to return eventualities, “Eventuality” or “json”
relation_output_format (str (default = "Relation")) – which format to return relations, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
kw (Dict[str, object]) – other parameters (e.g., syntax_tree_cache)
- Returns:
the extracted eventualities and relations
- Return type:
Tuple[Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]], Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]]
- class aser.extract.aser_extractor.SeedRuleASERExtractor(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
BaseASERExtractor
ASER Extractor based on rules to extract both eventualities and relations (for ASER v1.0)
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters
eventuality_extractor
- class aser.extract.eventuality_extractor.BaseEventualityExtractor(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
object
Base ASER eventuality extractor to extract eventualities
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters
- extract_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)[source]
Extract eventualities from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities
- Return type:
Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}] Output: [[my army will find you boat], [i be sure, we could find you suitable accommodation]]
- extract_from_text(text, output_format='Eventuality', in_order=True, use_lemma=True, annotators=None, **kw)[source]
Extract eventualities from a raw text
- Parameters:
text (str) – a raw text
output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities
- Return type:
Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]
Input: "My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations." Output: [[my army will find you boat], [i be sure, we could find you suitable accommodation]]
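The sample output groups eventualities per sentence. When a single flat list is more convenient, the nesting can be removed on the caller's side, as in the sketch below. The helper flatten_unique is hypothetical, and eventualities are stood in for by plain strings. Whether the library itself returns a flat list under some settings is not stated here; the return type above only indicates that both shapes are possible.

```python
from itertools import chain


def flatten_unique(nested):
    """Flatten per-sentence lists, dropping repeats while keeping first-seen order."""
    seen = set()
    flat = []
    for item in chain.from_iterable(nested):
        if item not in seen:
            seen.add(item)
            flat.append(item)
    return flat


# Sample output from above, with eventualities written as strings.
per_sentence = [
    ["my army will find you boat"],
    ["i be sure", "we could find you suitable accommodation"],
]

flat = flatten_unique(per_sentence)
print(flat)
```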
- parse_text(text, annotators=None)[source]
Parse a raw text by corenlp
- Parameters:
text (str) – a raw text
annotators (Union[List, None] (default = None)) – annotators for corenlp, please refer to https://stanfordnlp.github.io/CoreNLP/annotators.html
- Returns:
the parsed result
- Return type:
List[Dict[str, object]]
Input: "My army will find your boat. In the meantime, I'm sure we could find you suitable accommodations." Output: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}]
- class aser.extract.eventuality_extractor.DiscourseEventualityExtractor(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
BaseEventualityExtractor
ASER eventuality extractor based on constituency analysis to extract eventualities (for ASER v2.0)
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters
- extract_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)[source]
Extract eventualities from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities
- Return type:
Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}] Output: [[my army will find you boat], [i be sure, we could find you suitable accommodation]]
- class aser.extract.eventuality_extractor.SeedRuleEventualityExtractor(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
BaseEventualityExtractor
ASER eventuality extractor based on rules to extract eventualities (for ASER v1.0)
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters, e.g., “skip_words” to drop sentences that contain such words
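The behaviour described for "skip_words" (dropping sentences that contain such words) can be pictured with the sketch below. It is an illustration only, not ASER's implementation: the helper drop_sentences_with_skip_words is hypothetical, and matching on lowercased surface tokens is an assumption.

```python
def drop_sentences_with_skip_words(parsed_result, skip_words):
    """Keep only sentences whose tokens contain none of the skip words."""
    skip = {word.lower() for word in skip_words}
    return [
        sent for sent in parsed_result
        if skip.isdisjoint(token.lower() for token in sent["tokens"])
    ]


# Token lists from the sample parsed result used throughout this page.
parsed_result = [
    {"tokens": ["My", "army", "will", "find", "your", "boat", "."]},
    {"tokens": ["In", "the", "meantime", ",", "I", "'m", "sure", "we", "could",
                "find", "you", "suitable", "accommodations", "."]},
]

kept = drop_sentences_with_skip_words(parsed_result, ["boat"])
print([" ".join(sent["tokens"]) for sent in kept])
```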
- extract_from_parsed_result(parsed_result, output_format='Eventuality', in_order=True, use_lemma=True, **kw)[source]
Extract eventualities from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
output_format (str (default = "Eventuality")) – which format to return, “Eventuality” or “json”
in_order (bool (default = True)) – whether the returned order follows the input token order
use_lemma (bool (default = True)) – whether the returned eventuality uses lemma
kw (Dict[str, object]) – other parameters
- Returns:
the extracted eventualities
- Return type:
Union[List[List[aser.eventuality.Eventuality]], List[List[Dict[str, object]]], List[aser.eventuality.Eventuality], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}] Output: [[my army will find you boat], [i be sure, we could find you suitable accommodation]]
relation_extractor
- class aser.extract.relation_extractor.BaseRelationExtractor(**kw)[source]
Bases:
object
Base ASER relation extractor to extract relations
- extract_from_parsed_result(parsed_result, para_eventualities, output_format='Relation', in_order=True, **kw)[source]
Extract relations from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph
output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
kw (Dict[str, object]) – other parameters
- Returns:
the extracted relations
- Return type:
Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}], [[my army will find you boat], [i be sure, we could find you suitable accommodation]] Output: [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
- class aser.extract.relation_extractor.DiscourseRelationExtractor(**kw)[source]
Bases:
BaseRelationExtractor
ASER relation extractor based on discourse parsing to extract relations (for ASER v2.0)
- extract_from_parsed_result(parsed_result, para_eventualities, output_format='triplet', in_order=False, **kw)[source]
Extract relations from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph
output_format (str (default = "triplet")) – which format to return, “Relation” or “triplet”
in_order (bool (default = False)) – whether the returned order follows the input token order
kw (Dict[str, object]) – other parameters
- Returns:
the extracted relations
- Return type:
Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}], [[my army will find you boat], [i be sure, we could find you suitable accommodation]] Output: [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
- class aser.extract.relation_extractor.SeedRuleRelationExtractor(**kw)[source]
Bases:
BaseRelationExtractor
Rule-based ASER relation extractor (for ASER v1.0)
- extract_from_parsed_result(parsed_result, para_eventualities, output_format='Relation', in_order=True, **kw)[source]
Extract relations from the parsed result
- Parameters:
parsed_result (List[Dict[str, object]]) – the parsed result returned by corenlp
para_eventualities (List[aser.eventuality.Eventuality]) – eventualities in the paragraph
output_format (str (default = "Relation")) – which format to return, “Relation” or “triplet”
in_order (bool (default = True)) – whether the returned order follows the input token order
kw (Dict[str, object]) – other parameters
- Returns:
the extracted relations
- Return type:
Union[List[List[aser.relation.Relation]], List[List[Dict[str, object]]], List[aser.relation.Relation], List[Dict[str, object]]]
Input: [{'dependencies': [(1, 'nmod:poss', 0), (3, 'nsubj', 1), (3, 'aux', 2), (3, 'dobj', 5), (3, 'punct', 6), (5, 'nmod:poss', 4)], 'lemmas': ['my', 'army', 'will', 'find', 'you', 'boat', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (NP (PRP$ My) (NN army)) (VP (MD will) (VP (VB find) (NP ' '(PRP$ your) (NN boat)))) (. .)))', 'pos_tags': ['PRP$', 'NN', 'MD', 'VB', 'PRP$', 'NN', '.'], 'text': 'My army will find your boat.', 'tokens': ['My', 'army', 'will', 'find', 'your', 'boat', '.']}, {'dependencies': [(2, 'case', 0), (2, 'det', 1), (6, 'nmod:in', 2), (6, 'punct', 3), (6, 'nsubj', 4), (6, 'cop', 5), (6, 'ccomp', 9), (6, 'punct', 13), (9, 'nsubj', 7), (9, 'aux', 8), (9, 'iobj', 10), (9, 'dobj', 12), (12, 'amod', 11)], 'lemmas': ['in', 'the', 'meantime', ',', 'I', 'be', 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodation', '.'], 'mentions': [], 'ners': ['O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O'], 'parse': '(ROOT (S (PP (IN In) (NP (DT the) (NN meantime))) (, ,) (NP (PRP ' "I)) (VP (VBP 'm) (ADJP (JJ sure) (SBAR (S (NP (PRP we)) (VP (MD " 'could) (VP (VB find) (NP (PRP you)) (NP (JJ suitable) (NNS ' 'accommodations)))))))) (. .)))', 'pos_tags': ['IN', 'DT', 'NN', ',', 'PRP', 'VBP', 'JJ', 'PRP', 'MD', 'VB', 'PRP', 'JJ', 'NNS', '.'], 'text': "In the meantime, I'm sure we could find you suitable " 'accommodations.', 'tokens': ['In', 'the', 'meantime', ',', 'I', "'m", 'sure', 'we', 'could', 'find', 'you', 'suitable', 'accommodations', '.']}], [[my army will find you boat], [i be sure, we could find you suitable accommodation]] Output: [[], [(7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Co_Occurrence': 1.0})], [(8540897b645962964fd644242d4cc0032f024e86, 25edad6781577dcb3ba715c8230416fb0d4c45c4, {'Synchronous': 1.0})]]
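The output above is a nested list whose inner entries are triples of two eventuality ids and a dict mapping relation senses to weights. The sketch below sums the weights per sense over such a structure. It assumes the ids can be handled as plain strings and that each triple reads as (first eventuality id, second eventuality id, senses); `sum_relation_weights` is a hypothetical helper, not part of ASER.

```python
from collections import Counter

# Values copied from the example output; ids are written as strings here (an assumption).
relations = [
    [],
    [("7d9ea9023b66a0ebc167f0dbb6ea8cd75d7b46f9",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Co_Occurrence": 1.0})],
    [("8540897b645962964fd644242d4cc0032f024e86",
      "25edad6781577dcb3ba715c8230416fb0d4c45c4",
      {"Synchronous": 1.0})],
]


# Hypothetical helper (not part of ASER): total weight per relation sense.
def sum_relation_weights(nested_relations):
    totals = Counter()
    for group in nested_relations:
        for _first_id, _second_id, senses in group:
            totals.update(senses)  # adds each sense's weight to its running total
    return dict(totals)


print(sum_relation_weights(relations))
```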
parsed_reader
- class aser.extract.parsed_reader.ParsedReader[source]
Bases:
object
File reader to read parsed results from disk
- generate_sid(sentence, file_name, line_no)[source]
- Parameters:
sentence (str) – the raw text
file_name (str) – the file name
line_no (int) – the line number
- Returns:
the corresponding sentence id
- Return type:
str
- get_parsed_paragraphs_from_file(processed_path)[source]
This method retrieves all paragraphs from a processed file
- Parameters:
processed_path (str or None) – the file path of the processed file
- Returns:
the parsed paragraphs, as a list of paragraphs where each paragraph is a list of per-sentence dicts
- get_parsed_sentence_and_context(sid, context_window_size=0)[source]
Retrieve the parsed results of the corresponding sentence and its context
- Parameters:
sid (str) – the sentence id
context_window_size (int (default = 0)) – the context window size
- Returns:
a dictionary that contains the “sentence”, “left_context”, and “right_context”
- Return type:
Dict[str, object]
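To illustrate the shape of the returned dictionary, here is a minimal sketch of a context window over an in-memory list of sentences. It is not the ASER implementation: `sentence_with_context` is a hypothetical helper that takes a list index instead of a sentence id, and clipping the window at paragraph boundaries is an assumption.

```python
# Hypothetical helper (not part of ASER): select a sentence and its neighbours.
def sentence_with_context(sentences, index, context_window_size=0):
    # Clip the left edge so the window never wraps around to the end of the list.
    left_start = max(0, index - context_window_size)
    return {
        "sentence": sentences[index],
        "left_context": sentences[left_start:index],
        "right_context": sentences[index + 1:index + 1 + context_window_size],
    }


sentences = ["s0", "s1", "s2", "s3"]
middle = sentence_with_context(sentences, 1, context_window_size=1)
first = sentence_with_context(sentences, 0, context_window_size=2)
print(middle)
print(first)
```

With the default `context_window_size=0`, both context lists are empty.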
sentence_parser
- class aser.extract.sentence_parser.SentenceParser(corenlp_path='', corenlp_port=0, **kw)[source]
Bases:
object
Sentence parser to process files that contain raw texts
- Parameters:
corenlp_path (str (default = "")) – corenlp path, e.g., /home/xliucr/stanford-corenlp-3.9.2
corenlp_port (int (default = 0)) – corenlp port, e.g., 9000
kw (Dict[str, object]) – other parameters
- generate_sid(sentence, file_name, sid)[source]
- Parameters:
sentence (str) – the raw text
file_name (str) – the file name
sid (int) – the line number
- Returns:
the corresponding sentence id
- Return type:
str
- parse(paragraph, annotators=None, max_len=1024)[source]
- Parameters:
paragraph (str) – a raw text
annotators (Union[List, None] (default = None)) – annotators for corenlp; see https://stanfordnlp.github.io/CoreNLP/annotators.html
max_len (int (default = 1024)) – the max length of a paragraph (constituency parsing cannot handle very long sentences)
- Returns:
the parsed result
- Return type:
List[Dict[str, object]]
- parse_raw_file(raw_path, processed_path=None, annotators=None, max_len=1024)[source]
Parse all raw texts in the given file
- Parameters:
raw_path (str) – the file path that contains raw texts
processed_path (str or None (default = None)) – the file path that stores the parsed result
annotators (Union[List, None] (default = None)) – annotators for corenlp; see https://stanfordnlp.github.io/CoreNLP/annotators.html
max_len (int (default = 1024)) – the max length of a paragraph (constituency parsing cannot handle very long sentences)
- Returns:
the parsed result
- Return type:
List[List[Dict[str, object]]]
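The return type is nested: the outer list holds paragraphs and each inner list holds one dict per sentence. As a sketch of walking that structure, the snippet below rebuilds paragraph text from the `text` field seen in the example inputs earlier in this page. `paragraph_texts` is a hypothetical helper, and the sample data is hand-written instead of produced by CoreNLP.

```python
# Hypothetical helper (not part of ASER): join sentence texts per paragraph.
def paragraph_texts(parsed_paragraphs):
    return [" ".join(sentence["text"] for sentence in paragraph)
            for paragraph in parsed_paragraphs]


# Hand-written stand-in for a parsed file with one paragraph of two sentences.
parsed_paragraphs = [
    [
        {"text": "My army will find your boat."},
        {"text": "In the meantime, I'm sure we could find you suitable accommodations."},
    ],
]

texts = paragraph_texts(parsed_paragraphs)
print(texts)
```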