soton_corenlppy.lexico.wordnet_lib module

WordNet lexicon processing

soton_corenlppy.lexico.wordnet_lib.expand_hypernyms(set_lexicon, syn, lang='eng', pos='asrnv', max_depth=3, depth=0, dict_lexicon_config=None)

Get all inherited WordNet hypernym synsets and sisters (immediate hyponyms), and add them to set_lexicon. Hyponym: Y is a hyponym of X if every Y is a (kind of) X (dog is a hyponym of canine). Troponym: the verb Y is a troponym of the verb X if the activity Y is doing X in some manner (to lisp is a troponym of to talk). In NLTK WordNet, hyponyms (for nouns) and troponyms (for verbs) are exposed as the same relation.

Parameters
  • set_lexicon (set) – set of WordNet lexicon synsets and lemma names

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • max_depth (int) – maximum depth for inherited search

  • depth (int) – current depth (internal recursive use only)

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()
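The idea of collecting inherited hypernyms together with their sisters can be sketched without WordNet at all. The following is a minimal illustration only, not the module's implementation: the HYPERNYMS mapping, the helper direct_hyponyms() and the function expand_with_sisters() are hypothetical names, the taxonomy is toy data, and the assumption that recursion stops once depth reaches max_depth is a guess at the semantics.

```python
# Toy taxonomy standing in for WordNet: child -> list of direct hypernyms.
# These entries are illustrative only, not real WordNet data.
HYPERNYMS = {
    "dog": ["canine"],
    "wolf": ["canine"],
    "canine": ["carnivore"],
    "feline": ["carnivore"],
    "carnivore": ["mammal"],
}


def direct_hyponyms(node):
    """Return the direct hyponyms of node by inverting the toy mapping."""
    return [child for child, parents in HYPERNYMS.items() if node in parents]


def expand_with_sisters(lexicon, node, max_depth=3, depth=0):
    """Add inherited hypernyms of node, plus each hypernym's direct hyponyms."""
    # Assumed stopping rule: no further expansion once max_depth is reached.
    if depth >= max_depth:
        return
    for parent in HYPERNYMS.get(node, []):
        lexicon.add(parent)
        # Sisters: the immediate hyponyms of this hypernym
        # (this includes the node we started from).
        lexicon.update(direct_hyponyms(parent))
        expand_with_sisters(lexicon, parent, max_depth, depth + 1)


lexicon = set()
expand_with_sisters(lexicon, "dog", max_depth=2)
print(sorted(lexicon))
```

With max_depth=2 the sketch climbs two levels (canine, carnivore) and picks up wolf and feline as sisters along the way; mammal is not reached.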

soton_corenlppy.lexico.wordnet_lib.find_base_word_form(lemma=None, morphy_pos_list=None, dict_lexicon_config=None)

Use WordNet to find the base form of a word. The original case of the lemma is preserved where possible.

Parameters
  • lemma (unicode) – lemma to lookup

  • morphy_pos_list (str) – WordNet POS_LIST entry (see morphy) or None

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()

Returns

base phrase after WordNet lookup, or None if none found

Return type

unicode
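One way case preservation could work is sketched below. This is an assumption-laden illustration rather than the module's actual logic: TOY_BASE_FORMS is a hypothetical stand-in for a WordNet morphological lookup, and toy_base_form() is a hypothetical helper that looks up the lowercased word and then reuses the caller's casing when the base form is a prefix of the input.

```python
# Toy stand-in for a WordNet morphological lookup (inflected form -> base form).
# The entries are illustrative only.
TOY_BASE_FORMS = {"dogs": "dog", "running": "run", "london": "london", "mice": "mouse"}


def toy_base_form(lemma):
    """Return the base form of lemma, keeping the original case where possible."""
    key = lemma.lower()
    base = TOY_BASE_FORMS.get(key)
    if base is None:
        return None  # nothing found
    # If the base form is a prefix of the input, slice the original string
    # so that its casing survives (e.g. 'Dogs' -> 'Dog', 'London' -> 'London').
    if key.startswith(base):
        return lemma[:len(base)]
    # Irregular forms share no prefix, so the looked-up base form is returned as-is.
    return base


for word in ["Dogs", "London", "Mice", "cats"]:
    print(word, "->", toy_base_form(word))
```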

soton_corenlppy.lexico.wordnet_lib.get_lemma(set_lexicon, syn, lang='eng', pos='asrnv', dict_lexicon_config=None)

Get all lemmas (direct and derived) for a WordNet synset and add them to set_lexicon.

Parameters
  • set_lexicon (set) – set of WordNet lexicon synsets and lemma names

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()

soton_corenlppy.lexico.wordnet_lib.get_lemma_with_freq(set_lexicon, syn, lang='eng', pos='asrnv', dict_lexicon_config=None)

Get all lemmas for a WordNet synset, each paired with its frequency count, and add them to set_lexicon.

Parameters
  • set_lexicon (set) – set of tuples = ( nltk.corpus.reader.wordnet.Lemma, count )

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()
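A set of ( lemma, count ) tuples is unordered, so a caller will usually want to rank it. The sketch below shows one way to do that; it uses plain strings in place of nltk.corpus.reader.wordnet.Lemma objects, and the counts are invented for illustration.

```python
# Toy lexicon: (lemma name, frequency count). Strings stand in for Lemma
# objects and the counts are made up for this example.
set_lexicon = {("dog", 42), ("domestic_dog", 0), ("Canis_familiaris", 0)}

# Rank by descending count, breaking ties alphabetically for a stable order.
ranked = sorted(set_lexicon, key=lambda item: (-item[1], item[0]))

for name, count in ranked:
    print(name, count)
```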

soton_corenlppy.lexico.wordnet_lib.get_synset(wordnet_synset_name, dict_lexicon_config=None)

Look up a synset name in WordNet and return the matching synset (or None).

Parameters
  • wordnet_synset_name (str) – valid wordnet name such as dog.n.01

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()

Returns

WordNet synset, or None if there is no match

Return type

nltk.corpus.reader.wordnet.Synset
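A synset name such as dog.n.01 has three dot-separated parts: a lemma, a one-letter POS tag and a sense number. The helper below, split_synset_name(), is a hypothetical illustration of that format and is not part of this module; it can be handy for validating a name before looking it up.

```python
def split_synset_name(name):
    """Split 'dog.n.01' into ('dog', 'n', 1); return None if malformed."""
    # Split from the right so that a lemma containing dots stays intact.
    parts = name.rsplit(".", 2)
    if len(parts) != 3:
        return None
    lemma, pos, sense = parts
    # WordNet POS tags: a (adjective), s (satellite adjective),
    # r (adverb), n (noun), v (verb).
    if not lemma or len(pos) != 1 or pos not in "asrnv" or not sense.isdigit():
        return None
    return lemma, pos, int(sense)


print(split_synset_name("dog.n.01"))
print(split_synset_name("dog"))
```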

soton_corenlppy.lexico.wordnet_lib.get_synset_names(lemma, pos='asrnv', lang='eng', dict_lexicon_config=None)

Look up a lemma and return all possible synset names.

Parameters
  • lemma (unicode) – lemma to lookup

  • pos (str) – WordNet POS filter

  • lang (str) – WordNet language

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()

Returns

list of nltk.corpus.reader.wordnet.Synset (empty if none found), e.g. [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01')]

Return type

list
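The pos argument is a string of one-letter POS tags, so a filter can test membership of a synset's POS tag in that string. The sketch below shows the idea on plain name strings; filter_by_pos() is a hypothetical helper, and the input list is sample data rather than output from this module.

```python
def filter_by_pos(synset_names, pos="asrnv"):
    """Keep only names whose POS tag (the middle field) appears in pos."""
    kept = []
    for name in synset_names:
        parts = name.rsplit(".", 2)
        # Skip anything that is not of the form lemma.pos.nn
        if len(parts) == 3 and len(parts[1]) == 1 and parts[1] in pos:
            kept.append(name)
    return kept


# Sample names for illustration only.
names = ["dog.n.01", "frump.n.01", "dog.n.03", "cad.n.01", "chase.v.01"]
print(filter_by_pos(names, pos="v"))
print(filter_by_pos(names, pos="n"))
```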

soton_corenlppy.lexico.wordnet_lib.inherited_entailments(set_lexicon, syn, lang='eng', pos='asrnv', max_depth=3, depth=0, dict_lexicon_config=None)

Get all inherited WordNet entailment synsets and add them to set_lexicon. Entailment: the verb Y is entailed by X if by doing X you must be doing Y (to sleep is entailed by to snore).

Parameters
  • set_lexicon (set) – set of WordNet lexicon synsets and lemma names

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • max_depth (int) – maximum depth for inherited search

  • depth (int) – current depth (internal recursive use only)

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()
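The max_depth and depth parameters suggest a depth-limited recursion in which each call passes depth + 1 to the next. A minimal sketch of that pattern follows, assuming that reading is right; the ENTAILS mapping is toy data (only snore -> sleep comes from the description above), and inherited() is a hypothetical name.

```python
# Toy entailment relation: verb -> verbs it entails.
# Only 'snore' -> 'sleep' comes from the documentation; the rest is made up.
ENTAILS = {
    "snore": ["sleep"],
    "sleep": ["rest"],
}


def inherited(relation, lexicon, node, max_depth=3, depth=0):
    """Add everything reachable from node via relation, up to max_depth steps."""
    if depth >= max_depth:
        return
    for target in relation.get(node, []):
        lexicon.add(target)
        # The depth limit guarantees termination even if the relation has cycles.
        inherited(relation, lexicon, target, max_depth, depth + 1)


shallow, deep = set(), set()
inherited(ENTAILS, shallow, "snore", max_depth=1)
inherited(ENTAILS, deep, "snore", max_depth=3)
print(sorted(shallow), sorted(deep))
```

Callers would leave depth at its default of 0; only the recursive calls change it.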

soton_corenlppy.lexico.wordnet_lib.inherited_hypernyms(set_lexicon, syn, lang='eng', pos='asrnv', max_depth=3, depth=0, dict_lexicon_config=None)

Get all inherited WordNet hypernym synsets and add them to set_lexicon. Hypernym: Y is a hypernym of X if every X is a (kind of) Y (canine is a hypernym of dog).

Parameters
  • set_lexicon (set) – set of WordNet lexicon synsets and lemma names

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • max_depth (int) – maximum depth for inherited search

  • depth (int) – current depth (internal recursive use only)

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()

soton_corenlppy.lexico.wordnet_lib.inherited_hyponyms(set_lexicon, syn, lang='eng', pos='asrnv', max_depth=3, depth=0, dict_lexicon_config=None)

Get all inherited WordNet hyponym synsets and add them to set_lexicon. Hyponym: Y is a hyponym of X if every Y is a (kind of) X (dog is a hyponym of canine). Troponym: the verb Y is a troponym of the verb X if the activity Y is doing X in some manner (to lisp is a troponym of to talk). In NLTK WordNet, hyponyms (for nouns) and troponyms (for verbs) are exposed as the same relation.

Parameters
  • set_lexicon (set) – set of WordNet lexicon synsets and lemma names

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • max_depth (int) – maximum depth for inherited search

  • depth (int) – current depth (internal recursive use only)

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()
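To see what max_depth controls when walking downwards, it can help to group hyponyms by the level at which they are found. The sketch below does this iteratively on toy data; HYPONYMS and hyponyms_by_level() are hypothetical names, and the entries mix a noun example and a verb (troponym) example purely for illustration.

```python
# Toy relation: word -> direct hyponyms (for verbs, troponyms).
# Illustrative data only, not taken from WordNet.
HYPONYMS = {
    "canine": ["dog", "wolf"],
    "dog": ["puppy"],
    "talk": ["lisp", "whisper"],
}


def hyponyms_by_level(root, max_depth=3):
    """Return a list of levels; level i holds the hyponyms i + 1 steps below root."""
    levels = []
    frontier = [root]
    for _ in range(max_depth):
        # Move one step down from every node in the current frontier.
        frontier = [child for node in frontier for child in HYPONYMS.get(node, [])]
        if not frontier:
            break
        levels.append(frontier)
    return levels


print(hyponyms_by_level("canine", max_depth=2))
print(hyponyms_by_level("talk", max_depth=1))
```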

soton_corenlppy.lexico.wordnet_lib.verb_groups(set_lexicon, syn, lang='eng', pos='v', dict_lexicon_config=None)

Get the verb groups for a synset and add them to set_lexicon. A verb group is a set of similar verb senses that the lexicographers have grouped manually.

Parameters
  • set_lexicon (set) – set of WordNet lexicon synsets and lemma names

  • syn (nltk.corpus.reader.wordnet.Synset) – WordNet synset

  • lang (str) – WordNet language

  • pos (str) – WordNet POS filter

  • dict_lexicon_config (dict) – config object returned from lexicon_lib.get_lexicon_config()