Outline

History

VerbaLex Valency Lexicon

VerbaLex in Print

Preparing VerbaLex Printed Edition
Dana Hlaváčková, Aleš Horák, Karel Pala
Natural Language Processing Centre, Faculty of Informatics, Masaryk University
Botanická 68a, CZ-602 00 Brno, Czech Republic
E-mail: {hlavack, hales, pala}@fi.muni.cz

RASLAN 2013

Preparing VerbaLex Printed Edition

NLPCentre FI MU Brno


History until 2005

- Pala, Ševeček: Valence českých sloves (Valencies of Czech Verbs), 1997. Syntactic valency frames (called BRIEF), 15 000 verbs:
  opustit hPTc4,hPTc4-hPTc3r{kvůli},hPTc4-hPTc4r{pro}

- BalkaNet EU project, 2002. Czech WordNet valency frames, 3 000 verbs:
  Synonyms: opustit:6(opouštět), nechat:9(-)
  Valency: kdo1*AG(person:1)=koho4*PAT((person:1)|(animal:1)) ?(kvůli komu3|pro koho4)*CAUSE(person:1)

- Prague Vallex 1.0, 1 000 verbs:
  ~ impf: opouštět pf: opustit
  + ACT(1;obl) PAT(4;obl) CAUS(kvůli+3,pro+4;typ)

Problems – small coverage, machine processing features not unified


History 2006–2013

- development of VerbaLex valency lexicon
- start – inspiration from the Vallex development procedure
- all tools newly developed, HTML layout reused
- edit – formatted plain text in VIM
- transform tool to XML
- exports to HTML and LaTeX (PDF)
- main editing work by 5–10 linguists
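
The editing pipeline listed above (formatted plain text, a transform tool to XML, then exports) can be sketched roughly as follows. The slides do not show the actual plain-text layout or XML schema, so the "key: value" input format and the element names (`entry`, `synset`, `definition`, `class`, `example`) are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical plain-text entry in "key: value" lines.
# This is NOT the real VerbaLex editing format, only a stand-in.
SAMPLE = """\
synset: jíst:1, požít:2, požívat:2
definition: přijímat potravu
class: eat-39.1
example: synovec jedl zmrzlinu
example: dcera jí polévku lžící
"""

def text_to_xml(text: str) -> ET.Element:
    """Turn 'key: value' lines into child elements of one <entry>."""
    entry = ET.Element("entry")
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at the first colon only, so "jíst:1" stays intact in the value.
        key, _, value = line.partition(":")
        child = ET.SubElement(entry, key.strip())
        child.text = value.strip()
    return entry

entry = text_to_xml(SAMPLE)
print(ET.tostring(entry, encoding="unicode"))
```

An XML tree of this kind could then feed separate HTML and LaTeX exporters, which is the division of labour the list describes.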


VIM VerbaLex Editing


Current VerbaLex Editor


VerbaLex XML

get down:7, begin:1, get:34...
přistoupit ke konání něčeho (to set about doing something)
dát se
dávat se
...
stavební firma se pustila do stavby domu (the construction company set about building the house)
...


VerbaLex HTML Browser


VerbaLex Features

Valency Frames and Semantic Roles in Czech

VerbaLex:
- 10 500 Czech verb lemmata, 19 000 verb frames
- Surface and deep valencies
- Inventory of semantic roles
- Reasons for two-level notation – EWN Top Ontology, BCs – general labels
- Subcategorization features – literals from PWN 2
- Verb semantic classes
- Strong connection to WordNet – machine processing

Vallex:
- 6 500 verbs
- different approaches – oriented to linguistic processing


Basic Valency Frames

Surface and deep valencies

AG(who_nom; <person:1>; obl) VERB SUBS(what_acc; <food:1>; obl) INS(with what_ins; <cutlery:2>; opt)

Parts of the frame:
- AG, SUBS, INS – 1st-level semantic role (AG–agens, SUBS–substance, INS–instrument)
- who_nom, what_acc, with what_ins – pronoun and case
- <person:1>, <food:1>, <cutlery:2> – 2nd-level semantic role
- obl, opt – obligatory, optional
- VERB – verb position

- basic valency frames – predicate-argument structure of a verb
- semantic roles – subcategorization features (or selectional restrictions) required by the meaning of the verb
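
To make the slot structure of a basic valency frame concrete, here is a minimal parsing sketch. It assumes an ASCII rendering of the frame, with `<...>` for the 2nd-level role and `who_nom`-style labels for pronoun and case. The `Slot` class and `parse_frame` function are hypothetical names and not part of any VerbaLex tool.

```python
import re
from dataclasses import dataclass

@dataclass
class Slot:
    role: str          # 1st-level semantic role, e.g. "AG"
    surface: str       # pronoun and case, e.g. "who_nom"
    restriction: str   # 2nd-level role (PWN literal), e.g. "person:1"
    obligatory: bool   # True for "obl", False for "opt"

# ROLE(surface; <literal:sense>; obl|opt) in the assumed ASCII notation.
SLOT_RE = re.compile(
    r"([A-Z]+)\(\s*([^;]+?)\s*;\s*<([^>]+)>\s*;\s*(obl|opt)\s*\)"
)

def parse_frame(frame: str) -> list[Slot]:
    """Extract all argument slots; the bare VERB marker is skipped."""
    return [
        Slot(role, surface, restriction, flag == "obl")
        for role, surface, restriction, flag in SLOT_RE.findall(frame)
    ]

FRAME = ("AG(who_nom; <person:1>; obl) VERB "
         "SUBS(what_acc; <food:1>; obl) "
         "INS(with what_ins; <cutlery:2>; opt)")

slots = parse_frame(FRAME)
for slot in slots:
    print(slot)
```

Each parsed slot keeps the surface level (pronoun and case) and the deep level (the two semantic role layers) side by side, mirroring the frame's layout.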


Complex Valency Frames

Surface and deep valencies

jíst:1 (impf), požít:2 (pf), požívat:2 (impf) (eat:1)
definition: přijímat potravu (take in solid food)
class: eat-39.1
passive: yes

jíst:1 (eat:1) ≈
-frame: AG(who_nom; <person:1>; obl) VERB SUBS(what_acc; <food:1>; obl) INS(with what_ins; <cutlery:2>; opt)
-example: synovec jedl zmrzlinu (impf) (the nephew ate an ice cream)
-example: dcera jí polévku lžící (impf) (the daughter eats soup with a spoon)
-use: prim
-reflexivity: no

complex valency frames – verb sense, aspect, verb semantic classes, ability to form passive voice (transitive and intransitive verbs), reflexivity, behaviour of the verb in a context: primary, figurative, idiomatic usage


Semantic Roles

VerbaLex semantic roles
- 1st-level – 32 top PWN hypernyms, broad class
- 2nd-level – PWN literals, typical class

Substance – in VerbaLex a semantic role:
  1st-level – SUBS
  2nd-level, PWN hypernym – substance:1
Two-layer semantic role – SUBS(substance:1)
Hyponymic lexical units as specifiers: SUBS<solid:1>, SUBS<liquid:3>, SUBS<gas:2>, SUBS<food:1>, SUBS<beverage:1>, ...
Hyponymic subclass of particular examples: SUBS<beverage:1> = milk:1, alcohol:1, chocolate:1, fruit juice:1, soft drink:1, coffee:1, tea:1, drinking water:1, ...
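
The two-level notation lends itself to a simple check of whether a word sense fits a slot's 2nd-level restriction by walking up hypernym links. The sketch below uses a toy hypernym table built only from the examples on this slide. It flattens the real PWN hierarchy, and the `satisfies` function is a hypothetical helper.

```python
# Toy hypernym links based on the slide's examples only.
# The real PWN hierarchy has more intermediate levels.
HYPERNYM = {
    "milk:1": "beverage:1",
    "coffee:1": "beverage:1",
    "tea:1": "beverage:1",
    "beverage:1": "substance:1",
    "food:1": "substance:1",
    "liquid:3": "substance:1",
}

def satisfies(literal: str, restriction: str) -> bool:
    """True if `literal` equals `restriction` or lies below it in the toy table."""
    current = literal
    seen = set()
    while current is not None and current not in seen:
        if current == restriction:
            return True
        seen.add(current)               # guard against accidental cycles
        current = HYPERNYM.get(current)  # None once the top is reached
    return False

print(satisfies("milk:1", "beverage:1"))   # True
print(satisfies("milk:1", "substance:1"))  # True
print(satisfies("food:1", "beverage:1"))   # False
```

With a lookup of this kind, a slot such as SUBS<beverage:1> would accept milk:1 or tea:1 but reject an unrelated literal.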


VerbaLex and WordNet

- valency frames for (sub)synsets and verb senses, not verb lemmas
- synsets are linked to their English equivalents in PWN
- 3 686 entirely new synsets
- 15 % of them – no lexicalized equivalent in English – perfective, reflexive or prefixed verbs, verbs with expressive or metaphoric meaning:
  povyskočit ("jump up a little")
  povyskakovat ("jump out of [something] one after another")
  povyřizovat ("finish doing things successively")
  dovyplnit ("fill in additional information")


Semantic Classes

- classification system of English verbs by B. Levin – 48 classes
- extended by M. Palmer's VerbNet project – 83 classes
- VerbaLex – 109 classes

withdraw-80 – abdikovat:1, odstoupit:2, vzdát se:1, couvnout:2
contribute-13.2-1 – alokovat:1, přidělit:5, rozdělit:7
correlate-84 – adaptovat se:1, aklimatizovat se:1, přizpůsobit se:1
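
As a small illustration of how such class labels might be handled in software, the sketch below splits a class identifier into its name and numeric part and builds a verb-to-class index from the three example classes. The `split_class` helper and the dictionary layout are assumptions, not VerbaLex code.

```python
import re

# Class identifiers such as "contribute-13.2-1": a name, a hyphen,
# then a dotted number optionally followed by "-n" suffixes.
CLASS_RE = re.compile(r"^([a-z_]+)-(\d+(?:\.\d+)*(?:-\d+)*)$")

def split_class(class_id: str) -> tuple[str, str]:
    """Return (name, number) for a class identifier, e.g. ('withdraw', '80')."""
    match = CLASS_RE.match(class_id)
    if match is None:
        raise ValueError(f"unexpected class identifier: {class_id!r}")
    return match.group(1), match.group(2)

# The three example classes from the slide.
CLASSES = {
    "withdraw-80": ["abdikovat:1", "odstoupit:2", "vzdát se:1", "couvnout:2"],
    "contribute-13.2-1": ["alokovat:1", "přidělit:5", "rozdělit:7"],
    "correlate-84": ["adaptovat se:1", "aklimatizovat se:1", "přizpůsobit se:1"],
}

# Reverse index: verb sense -> class identifier.
verb_to_class = {
    verb: class_id for class_id, verbs in CLASSES.items() for verb in verbs
}

print(split_class("contribute-13.2-1"))
print(verb_to_class["odstoupit:2"])
```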


Print Preparation

Preparation of VerbaLex Book

- 500–800 verbs
- selection by frequency

Verb     Frequency
být      160 587 902
mít       31 663 942
moct      16 108 948
chtít      7 301 412
stát       6 207 847
jít        6 092 565
dát        5 138 742
vědět      4 846 575
říct       4 494 682
musit      4 394 941
vidět      3 943 882
dostat     3 568 890
začít      3 509 862
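
Selection by frequency can be sketched as a simple ranking step. The figures below are the ones on this slide, paired with the verbs in the order both are listed (the pairing is inferred from that order). The `select_top` function is a hypothetical helper, and a real selection of 500–800 verbs would run over the full frequency list.

```python
# Frequencies from the slide, paired with verbs by listing order (assumed pairing).
FREQ = {
    "být": 160_587_902, "mít": 31_663_942, "moct": 16_108_948,
    "chtít": 7_301_412, "stát": 6_207_847, "jít": 6_092_565,
    "dát": 5_138_742, "vědět": 4_846_575, "říct": 4_494_682,
    "musit": 4_394_941, "vidět": 3_943_882, "dostat": 3_568_890,
    "začít": 3_509_862,
}

def select_top(freq: dict[str, int], n: int) -> list[str]:
    """Return the n most frequent verbs, most frequent first."""
    ranked = sorted(freq.items(), key=lambda item: item[1], reverse=True)
    return [verb for verb, _count in ranked[:n]]

print(select_top(FREQ, 5))  # ['být', 'mít', 'moct', 'chtít', 'stát']
```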


VerbaLex Book

VerbaLex Book Format
