This paper presents so-called false pairs (false friends): lexemes in Croatian and Romanian whose formal similarity leads to mistranslation. It describes the features that gave rise to such cases. In terms of origin, these are most often lexemes inherited from Latin or later Romance borrowings, and of course Slavic ones, of which Romanian has a considerable number. The selected lexemes are arranged in a table that allows a clearer comparison and easier recognition.
Recent approaches to Word Sense Disambiguation (WSD) generally fall into two classes: (1) information-intensive approaches and (2) information-poor approaches. Our hypothesis is that for memory-based learning (MBL), a reduced amount of data is more beneficial than the full range of features used in the past. Our experiments show that MBL combined with a restricted set of features and a feature selection method that minimizes the feature set leads to competitive results, outperforming all systems that participated in the SENSEVAL-3 competition on the Romanian data. Thus, with this specific method, a tightly controlled feature set improves the accuracy of the classifier, reaching 74.0% in the fine-grained and 78.7% in the coarse-grained evaluation.
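The general idea of pairing a memory-based learner with an aggressively reduced feature set can be sketched as follows. This is a minimal illustration only, not the authors' system: the abstract does not name the learner implementation, the distance metric, or the selection procedure, so the overlap metric, the leave-one-out scoring, the greedy backward elimination, and the toy data for an ambiguous word are all assumptions made here for the example.

```python
from collections import Counter

# Hypothetical toy data: (previous word, following verb, noise token) -> sense.
# Invented for illustration; it is not SENSEVAL-3 data.
TOY_DATA = [
    (("river", "was", "a"), "shore"),
    (("river", "is", "b"), "shore"),
    (("river", "had", "c"), "shore"),
    (("savings", "was", "d"), "finance"),
    (("savings", "is", "e"), "finance"),
    (("savings", "had", "f"), "finance"),
]

def overlap_distance(a, b, features):
    """Count the selected feature positions on which two instances disagree."""
    return sum(1 for i in features if a[i] != b[i])

def classify(memory, instance, features, k=1):
    """Label an instance by majority vote among its k nearest stored examples."""
    ranked = sorted(memory, key=lambda ex: overlap_distance(ex[0], instance, features))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, features, k=1):
    """Leave-one-out accuracy using only the given feature positions."""
    hits = 0
    for i, (x, y) in enumerate(data):
        memory = data[:i] + data[i + 1:]  # every stored example except the test one
        hits += classify(memory, x, features, k) == y
    return hits / len(data)

def backward_elimination(data, k=1):
    """Greedily drop features while leave-one-out accuracy does not decrease."""
    features = list(range(len(data[0][0])))
    best = loo_accuracy(data, features, k)
    improved = True
    while improved and len(features) > 1:
        improved = False
        for f in list(features):
            trial = [g for g in features if g != f]
            score = loo_accuracy(data, trial, k)
            if score >= best:  # ties favor the smaller feature set
                features, best, improved = trial, score, True
                break
    return features, best

if __name__ == "__main__":
    print("all features:", loo_accuracy(TOY_DATA, [0, 1, 2]))
    print("after selection:", backward_elimination(TOY_DATA))
```

On this toy set, the uninformative features pull each instance toward neighbors with the wrong sense, and removing them leaves only the previous-word feature. Accepting ties in favor of the smaller set is one possible way to push toward a minimal feature set; other stopping criteria are equally plausible.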