Frog
Mblem Class Reference

#include <mblem_mod.h>

Public Member Functions

 Mblem (TiCC::LogStream *, TiCC::LogStream *=0)
 create a Timbl-based lemmatizer
 
 ~Mblem ()
 
bool init (const TiCC::Configuration &)
 
void add_provenance (folia::Document &, folia::processor *) const
 
void Classify (frog_record &)
 
void Classify (const icu::UnicodeString &)
 
std::vector< std::pair< std::string, std::string > > getResult () const
 
std::string getTagset () const
 
std::string version () const
 
void filterTag (const std::string &)
 
void makeUnique ()
 
void add_lemmas (const std::vector< folia::Word * > &, const frog_data &) const
 

Constructor & Destructor Documentation

◆ Mblem()

Mblem::Mblem ( TiCC::LogStream *  errlog,
TiCC::LogStream *  dbglog = 0 
)
explicit

create a Timbl-based lemmatizer

Parameters
errlog  a LogStream for errors
dbglog  a LogStream for debugging

◆ ~Mblem()

Mblem::~Mblem ( )

Member Function Documentation

◆ add_lemmas()

void Mblem::add_lemmas ( const std::vector< folia::Word * > &  wv,
const frog_data &  fd 
) const

add the lemmas from 'fd' to the FoLiA Words in 'wv'

Parameters
wv  the folia::Word vector
fd  the frog_data with added lemmatizer results

◆ add_provenance()

void Mblem::add_provenance ( folia::Document &  doc,
folia::processor *  main 
) const

add provenance information to the FoLiA document

Parameters
doc  the FoLiA document we are working on
main  the main processor (presumably Frog) we want to add a new one to

◆ Classify() [1/2]

void Mblem::Classify ( const icu::UnicodeString &  uWord)

give the lemma for 1 word

Parameters
uWord  a Unicode string with the word

The mblemResult struct will be filled with one or more (alternative) solutions, each a lemma plus a POS tag.

◆ Classify() [2/2]

void Mblem::Classify ( frog_record &  fd)

add lemma information to the frog_record

Parameters
fd  the frog_record to add lemma information to

This handles some special cases like ABBREVIATION, the token-strip rules and the one-one rules. All 'normal' cases are handed over to the Timbl classifier.

◆ filterTag()

void Mblem::filterTag ( const std::string &  postag)

filter all non-matching tags out of the mblem results

Parameters
postag  the tag, given by the CGN tagger, that should match

Mblem produces a range of possible solutions with tags. We use the POS tag given by the CGN tagger to remove all solutions with a different tag.
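
The filtering step described above can be illustrated with a small stand-alone sketch. It is not Mblem's actual implementation: the LemmaTag alias and the filter_by_tag function are hypothetical names, the candidates are modelled as the lemma/tag pairs that getResult() returns, and matching is assumed to be plain string equality. What Mblem does when no candidate matches is not stated here; this sketch simply returns an empty list in that case.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one lemmatizer solution: (lemma, POS tag).
using LemmaTag = std::pair<std::string, std::string>;

// Return only the candidates whose tag equals the tagger's POS tag.
// Assumption: exact string comparison; the real matching rule may differ.
std::vector<LemmaTag> filter_by_tag(std::vector<LemmaTag> candidates,
                                    const std::string &postag) {
  candidates.erase(
      std::remove_if(candidates.begin(), candidates.end(),
                     [&postag](const LemmaTag &c) {
                       return c.second != postag;
                     }),
      candidates.end());
  return candidates;
}
```

The candidate list is taken by value so the caller's vector is left untouched; Mblem::filterTag itself modifies the object's internal results in place.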

◆ getResult()

vector< pair< string, string > > Mblem::getResult ( ) const

extract the results into a list of lemma/tag pairs

◆ getTagset()

std::string Mblem::getTagset ( ) const
inline

◆ init()

bool Mblem::init ( const TiCC::Configuration &  config)

initialize the lemmatizer using the config

Parameters
config  the Configuration to use

Returns
true when no problems are detected

◆ makeUnique()

void Mblem::makeUnique ( )

filter out all results that are equal
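
As a rough illustration of this kind of deduplication, the sketch below removes repeated entries from a list of lemma/tag pairs while keeping the first occurrence of each. The names LemmaTag and unique_results are hypothetical, and the sketch assumes that two results are "equal" when both lemma and tag are identical; the criterion Mblem actually applies may be different.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one lemmatizer solution: (lemma, POS tag).
using LemmaTag = std::pair<std::string, std::string>;

// Drop duplicate results, preserving the order of first occurrences.
// Assumption: results are equal when both lemma and tag are identical.
std::vector<LemmaTag> unique_results(const std::vector<LemmaTag> &candidates) {
  std::vector<LemmaTag> unique;
  for (const auto &c : candidates) {
    // Linear search is fine here: a word has only a handful of candidates.
    if (std::find(unique.begin(), unique.end(), c) == unique.end()) {
      unique.push_back(c);
    }
  }
  return unique;
}
```

Sorting followed by std::unique would also work, but it would reorder the solutions; the loop above keeps them in their original order.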

◆ version()

std::string Mblem::version ( ) const
inline

The documentation for this class was generated from the following files: