Frog
CGNTagger Class Reference

#include <cgn_tagger_mod.h>

Inherits BaseTagger.

Public Member Functions

 CGNTagger (TiCC::LogStream *l, TiCC::LogStream *d=0)
 
bool init (const TiCC::Configuration &)
 
void add_declaration (folia::Document &, folia::processor *) const
 
void post_process (frog_data &)
 
void add_tags (const std::vector< folia::Word * > &, const frog_data &) const
 
std::string getSubSet (const std::string &, const std::string &, const std::string &) const
 
- Public Member Functions inherited from BaseTagger
 BaseTagger (TiCC::LogStream *, TiCC::LogStream *, const std::string &)
 
virtual ~BaseTagger ()
 
virtual void Classify (frog_data &)
 
void add_provenance (folia::Document &, folia::processor *) const
 
std::string getTagset () const
 
std::string set_eos_mark (const std::string &)
 
bool fill_map (const std::string &, std::map< std::string, std::string > &)
 
std::vector< Tagger::TagResult > tagLine (const std::string &)
 
std::vector< Tagger::TagResult > tag_entries (const std::vector< tag_entry > &)
 
std::string version () const
 

Additional Inherited Members

- Protected Member Functions inherited from BaseTagger
void extract_words_tags (const std::vector< folia::Word * > &, const std::string &, std::vector< std::string > &, std::vector< std::string > &)
 
std::vector< Tagger::TagResult > call_server (const std::vector< tag_entry > &) const
 
 BaseTagger (const BaseTagger &)
 
- Protected Attributes inherited from BaseTagger
int debug
 
std::string _label
 
std::string tagset
 
std::string _version
 
std::string textclass
 
TiCC::LogStream * err_log
 
TiCC::LogStream * dbg_log
 
std::string base
 
std::string _host
 
std::string _port
 
MbtAPI * tagger
 
TiCC::UniFilter * filter
 
std::vector< std::string > _words
 
std::vector< Tagger::TagResult > _tag_result
 
std::map< std::string, std::string > token_tag_map
 

Constructor & Destructor Documentation

◆ CGNTagger()

CGNTagger::CGNTagger ( TiCC::LogStream *  l,
TiCC::LogStream *  d = 0 
)
inline explicit

Member Function Documentation

◆ add_declaration()

void CGNTagger::add_declaration ( folia::Document &  doc,
folia::processor *  proc 
) const
virtual

Adds the POS annotation as an AnnotationType to the document.

Parameters
doc: the Document to add to
proc: the processor to add

Implements BaseTagger.

◆ add_tags()

void CGNTagger::add_tags ( const std::vector< folia::Word * > &  wv,
const frog_data &  fd 
) const

Adds the tagger results to the folia::Word list.

Parameters
wv: the folia::Word vector to add to
fd: the frog_data structure with the tagger results

◆ getSubSet()

string CGNTagger::getSubSet ( const std::string &  val,
const std::string &  head,
const std::string &  fullclass 
) const

Gets a specific subset value (FoLiA output only).

Parameters
val: the value to look up
head: the head of the CGN POS-tag
fullclass: the original full CGN tag, used for error messages only
Returns
a string with the found value; throws when there is a problem
Note
for a well-trained CGN tagger, every value should belong to a subset, and a constraints conflict should never occur

A full class may be, for example, N(soort,ev,basis,zijd,stan); its head is then N.

For every value in 'soort,ev,basis,zijd,stan' we look up the subset in cgn_subsets, and when the constraints on the head are satisfied we return that subset.

For instance, 'soort' is found to belong to the subset 'ntype', and there are no 'head' constraints on 'ntype', so the lookup for 'soort' yields 'ntype'.

Had the full class been VNW(betr,pron,stan,vol,persoon,getal), then the subset for 'getal' is 'getal' and the constraints for 'getal' are 'VNW, N'. The head VNW satisfies these, so the result is 'getal'.
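The lookup described above can be illustrated with a small standalone sketch. It is not Frog's implementation: the table types `SubsetTable` and `ConstraintTable`, the helpers `parse_tag` and `lookup_subset`, and the table contents are all hypothetical, chosen only to mirror the two examples in the text. The sketch assumes that a value may map to more than one subset and that a subset without an entry in the constraints table is unconstrained.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the tables read from the subsets and
// constraints files; the real container types in Frog may differ.
using SubsetTable = std::multimap<std::string, std::string>;           // value  -> subset
using ConstraintTable = std::map<std::string, std::set<std::string>>;  // subset -> allowed heads

struct ParsedTag {
  std::string head;                 // e.g. "N"
  std::vector<std::string> values;  // e.g. soort, ev, basis, zijd, stan
};

// Split a full class such as "N(soort,ev,basis,zijd,stan)" into head and values.
ParsedTag parse_tag(const std::string &fullclass) {
  ParsedTag result;
  const std::string::size_type open = fullclass.find('(');
  result.head = fullclass.substr(0, open);
  if (open == std::string::npos) return result;  // tag without features
  const std::string::size_type close = fullclass.rfind(')');
  if (close == std::string::npos || close < open)
    throw std::runtime_error("malformed tag: " + fullclass);
  std::istringstream inner(fullclass.substr(open + 1, close - open - 1));
  std::string value;
  while (std::getline(inner, value, ',')) {
    if (!value.empty()) result.values.push_back(value);
  }
  return result;
}

// Return the first subset of 'val' whose head constraints accept 'head'.
// 'fullclass' is used for error messages only, as in the documented signature.
std::string lookup_subset(const SubsetTable &subsets,
                          const ConstraintTable &constraints,
                          const std::string &val, const std::string &head,
                          const std::string &fullclass) {
  const auto range = subsets.equal_range(val);
  if (range.first == range.second)
    throw std::runtime_error("unknown value '" + val + "' in " + fullclass);
  for (auto it = range.first; it != range.second; ++it) {
    const auto c = constraints.find(it->second);
    // Assumption: no entry in the constraints table means "no constraints".
    if (c == constraints.end() || c->second.count(head) > 0) return it->second;
  }
  throw std::runtime_error("constraints conflict for '" + val + "' in " + fullclass);
}

// Replays the two examples from the text on a toy table.
void run_examples() {
  SubsetTable subsets;
  subsets.emplace("soort", "ntype");
  subsets.emplace("getal", "getal");
  ConstraintTable constraints;
  constraints["getal"] = std::set<std::string>{"VNW", "N"};

  const std::string noun = "N(soort,ev,basis,zijd,stan)";
  assert(lookup_subset(subsets, constraints, "soort", parse_tag(noun).head, noun) == "ntype");

  const std::string pronoun = "VNW(betr,pron,stan,vol,persoon,getal)";
  assert(lookup_subset(subsets, constraints, "getal", parse_tag(pronoun).head, pronoun) == "getal");

  // A head outside the constraint set is reported as a conflict ("X" is made up).
  bool threw = false;
  try {
    lookup_subset(subsets, constraints, "getal", "X", "X(getal)");
  } catch (const std::runtime_error &) {
    threw = true;
  }
  assert(threw);
}
```

Throwing on an unknown value or a constraints conflict matches the documented behaviour of "throws when there is a problem"; the exception type used here is an assumption.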

◆ init()

bool CGNTagger::init ( const TiCC::Configuration &  config)
virtual

Initializes a CGN tagger from 'config'.

Parameters
config: the TiCC::Configuration
Returns
true on success, false otherwise

First BaseTagger::init() is called to set the generic values. Then the CGN-specific file names for the subsets and constraints are added, and those files are read, unless a file name has the value 'ignore'.

Reimplemented from BaseTagger.
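The 'ignore' handling described for init() might look roughly like the sketch below. It does not use TiCC::Configuration; `Settings` is a hypothetical flat key/value map, and the key names, the file names, and the fallback to a default file name when a key is absent are assumptions made for illustration only.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical flat key/value view of the tagger's configuration section.
using Settings = std::map<std::string, std::string>;

// Decide whether the file configured under 'key' should be read.
// Returns true and stores the file name in 'path', or returns false and
// clears 'path' when the setting carries the sentinel value "ignore".
// Falling back to 'fallback' for a missing key is an assumption.
bool file_to_read(const Settings &settings, const std::string &key,
                  const std::string &fallback, std::string &path) {
  const auto it = settings.find(key);
  const std::string value = (it == settings.end()) ? fallback : it->second;
  if (value == "ignore") {
    path.clear();
    return false;
  }
  path = value;
  return true;
}

// Exercises the three cases: explicit name, sentinel, and missing key.
void run_init_examples() {
  Settings settings;
  settings["subsets_file"] = "my.subsets";     // hypothetical key and file name
  settings["constraints_file"] = "ignore";

  std::string path;
  assert(file_to_read(settings, "subsets_file", "default.subsets", path));
  assert(path == "my.subsets");

  assert(!file_to_read(settings, "constraints_file", "default.constraints", path));
  assert(path.empty());

  assert(file_to_read(Settings(), "subsets_file", "default.subsets", path));
  assert(path == "default.subsets");
}
```

Treating 'ignore' as a sentinel keeps the decision in one place, so the code that actually reads the subsets and constraints files only ever sees real file names.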

◆ post_process()

void CGNTagger::post_process ( frog_data &  words)
virtual

Adds the found tagging results to the frog_data structure.

Parameters
words: the frog_data structure to extend

Implements BaseTagger.

