Chapter 2

Words


2.1    Introduction

This chapter details the range of possible words as the bottom layer of sentence structure, while chapter 3 considers phrase layers, and chapter 4 ascends to clause layers. This unravelling of the layers of grammatical structure will show all elements as inter-related components. Consequently, understanding of what is possible at one layer really requires the integration of knowledge from all of the other layers. But discovery of structure has to start somewhere, hence the focus here on words as foundational components.

    In this chapter, words are categorised into classes informed by how they function in phrases and clauses. Word analysis assigns to each word in context a TAG that names the WORD CLASS to which it belongs. This task is known as TAGGING.

    One criterion for tagging is to identify words that can be minimal components for phrase layers: words for noun phrases are discussed in section 2.2; words for adjective phrases, in section 2.3; and words for adverb phrases, in section 2.4. Section 2.5 covers other word types that appear under a noun phrase as modifiers.

    Section 2.6 presents tags for verbs, which are minimal components for clause layers. Section 2.7 lists tags for subject indicating words. Section 2.8 adds other word types that occur under a clause layer. Section 2.9 considers words that serve as connectives to combine layers. Section 2.10 looks at the treatment of punctuation. Section 2.11 deals with interjection, reaction signals, and formulaic expressions. Section 2.12 notes other possible elements of a parse.

2.1.1    Beware: Same character string in a different context with a different tag

Section 1.2.1 identifies words through the conventions of the English writing system as character strings delimited by a space on each side, while noting exceptions. Word identification is not a complete word analysis: The same character string might be used as a word in different ways in different contexts. Tagging completes the word analysis. For example, down in (2.1) occurs as a noun (N), adjective (ADJ), verb (VB), adverb particle (RP), and preposition (P-ROLE).

(2.1)
a.
(VBP;~Ipr filled) (P-ROLE with) (ADJ pure) (N down)
b.
(PRO she) (BED;~La was) (ADV very) (ADJ down)
c.
(PRO he) (VBD;~cat_Vt started) (TO to) (VB;~Tn down) (D the) (N whisky)
d.
(VB;~Tn.p tear) (RP down) (D that) (N wall)
e.
(PRO You) (MD;~cat_Vi may) (VB;~Ipr look) (ADV back) (P-ROLE down) (D the) (N mountain)
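
The point of (2.1) can be checked mechanically. The following is a minimal sketch, assuming only that each tagged word is written as (TAG word) with no whitespace inside the tag; the names read_tagged and tags_by_word are hypothetical helpers, not part of any annotation tool.

```python
import re
from collections import defaultdict

# Each tagged word is written as "(TAG word)"; the tag may carry
# extensions such as ";~Tn" but is assumed to contain no whitespace.
TAGGED_WORD = re.compile(r"\((\S+)\s+([^()\s]+)\)")


def read_tagged(line):
    """Return the (tag, word) pairs found in one bracketed line."""
    return TAGGED_WORD.findall(line)


def tags_by_word(lines):
    """Map each lower-cased word to the set of tags it was seen with."""
    seen = defaultdict(set)
    for line in lines:
        for tag, word in read_tagged(line):
            seen[word.lower()].add(tag)
    return seen


# The five lines of example (2.1).
EXAMPLE_2_1 = [
    "(VBP;~Ipr filled) (P-ROLE with) (ADJ pure) (N down)",
    "(PRO she) (BED;~La was) (ADV very) (ADJ down)",
    "(PRO he) (VBD;~cat_Vt started) (TO to) (VB;~Tn down) (D the) (N whisky)",
    "(VB;~Tn.p tear) (RP down) (D that) (N wall)",
    "(PRO You) (MD;~cat_Vi may) (VB;~Ipr look) (ADV back) (P-ROLE down) (D the) (N mountain)",
]

if __name__ == "__main__":
    # Collect every tag that the character string "down" receives.
    print(sorted(tags_by_word(EXAMPLE_2_1)["down"]))
```

Run on (2.1), the sketch collects five distinct tags for down, one per line of the example.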

The examples of (2.1) show how there might be different tags for what are different instances in different contexts of the same character string. Tag differences due to context can happen even within the same sentence. For example, that in (2.2) occurs as a complementizer (C), determiner (D), noun (N), relative pronoun (RPRO), and demonstrative pronoun (D;_nphd_).

(2.2)
IP-MAT,NP-SBJ,PRO,I
IP-MAT,VBP;~Tf,hope
IP-MAT,CP-THT-OB1,IP-SUB,C,that
IP-MAT,CP-THT-OB1,IP-SUB,NP-SBJ,D,that
IP-MAT,CP-THT-OB1,IP-SUB,NP-SBJ,N,that
IP-MAT,CP-THT-OB1,IP-SUB,NP-SBJ,IP-REL,NP-OB1,RPRO,that
IP-MAT,CP-THT-OB1,IP-SUB,NP-SBJ,IP-REL,NP-SBJ,D;_nphd_,that
IP-MAT,CP-THT-OB1,IP-SUB,NP-SBJ,IP-REL,VBD;~Tn,started
IP-MAT,CP-THT-OB1,IP-SUB,MD;~cat_Vi,will
IP-MAT,CP-THT-OB1,IP-SUB,IP-INF-CAT,VB;~I,end
IP-MAT,CP-THT-OB1,IP-SUB,IP-INF-CAT,ADVP-TMP,ADV,soon
ID,215_x_buffalo

Yet another possible tag for that is as a subordinating conjunction word (P-CONN), as seen in (2.3).

(2.3)
IP-IMP,VB;~Dn.*,tell
IP-IMP,NP-OB2,PRO,me
IP-IMP,PUNC,<comma>
IP-IMP,PP-SCON-CNT,P-CONN,that
IP-IMP,PP-SCON-CNT,IP-ADV,NP-SBJ,PRO,I
IP-IMP,PP-SCON-CNT,IP-ADV,MD;~cat_Vi,might
ID,90_a_wilde_2_1888
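
Listings such as (2.2) and (2.3) give one word per line. As a rough sketch, and assuming that every word line has the shape layer,...,layer,TAG,word and that the listing closes with an ID line, the rows can be read as follows. The Row type and the parse_rows function are illustrative names only.

```python
from typing import NamedTuple


class Row(NamedTuple):
    path: tuple  # layer labels from the root down, e.g. ("IP-IMP", "NP-OB2")
    tag: str     # word class tag, e.g. "PRO"
    word: str    # the word itself, e.g. "me"


def parse_rows(listing):
    """Split a listing into (rows, sentence_id).

    Assumption: the last field of a word line is the word, the field
    before it is the tag, and everything earlier names containing layers.
    """
    rows, sentence_id = [], None
    for line in listing.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        if fields[0] == "ID":
            sentence_id = ",".join(fields[1:])
        else:
            *path, tag, word = fields
            rows.append(Row(tuple(path), tag, word))
    return rows, sentence_id


# Example (2.3), copied from the text.
EXAMPLE_2_3 = """
IP-IMP,VB;~Dn.*,tell
IP-IMP,NP-OB2,PRO,me
IP-IMP,PUNC,<comma>
IP-IMP,PP-SCON-CNT,P-CONN,that
IP-IMP,PP-SCON-CNT,IP-ADV,NP-SBJ,PRO,I
IP-IMP,PP-SCON-CNT,IP-ADV,MD;~cat_Vi,might
ID,90_a_wilde_2_1888
"""

if __name__ == "__main__":
    rows, sentence_id = parse_rows(EXAMPLE_2_3)
    for row in rows:
        print(row.tag, row.word, "under", "/".join(row.path))
```

Splitting on commas is safe here only because a comma in the text is written as <comma> in the listing.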

2.2    Words that can head a noun phrase

Certain words form the minimal component required for the phrase layers that immediately contain them. Such a minimal component is called the HEAD of the phrase. The word class tags of Table 2.2 distinguish words that can head a noun phrase (NP; see section 3.2).

Table 2.2: Tags for words that can head a noun phrase

NS    plural common noun (e.g., children, revelations, times, wishes)
N    common noun not subclassified as NS, that is, either singular (e.g., child, revelation, time, wish), or neutral for number (e.g., committee, fish, information)
NPRS    plural proper noun (e.g., Heathers, Koreas)
NPR    proper noun not subclassified as NPRS, that is, either singular (e.g., Heather, Tokyo), or neutral for number (e.g., Andes, IBM)
PNX    reflexive pronoun (e.g., myself, yourself, itself, ourselves), or reciprocal pronoun (e.g., each_other)
PRO    personal pronoun (e.g., I, you, them, us)
PRO;_genm_    possessive pronoun, pre-nominal (Genitive 1 of Table 2.4 below) (e.g., my, your, our)
PRO;_ppge_    nominal possessive personal pronoun that is the only daughter of a genitive marked preposition phrase (Genitive 2 of Table 2.4 below) (e.g., mine, yours, ours)
WPRO    WH-pronoun (e.g., what, who, whom)
WPRO;_genm_    genitive WH-pronoun (whose)
RPRO    relative pronoun (e.g., which, who, whom, that)
RPRO;_genm_    genitive relative pronoun (whose)
Q;_nphd_    indefinite pronoun with quantification, which can be a compound (e.g., everybody, nothing), or a word which often occurs with the preposition of (e.g., much, many, a_lot)
D;_nphd_    indefinite pronoun not subclassified as Q;_nphd_ (e.g., someone, anything, another), and demonstrative pronoun (e.g., this, that, these, those)
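
Several tags in Table 2.2 differ from other tags only in the part after a semicolon (compare the determiner tag D with the pronoun tag D;_nphd_). The sketch below treats the semicolon as a separator between a base tag and its extensions, which is an assumption read off the tags themselves, and treats Table 2.2 as a closed set. The names split_tag and can_head_np are hypothetical.

```python
# Tags of Table 2.2, copied as a closed set.
NP_HEAD_TAGS = frozenset({
    "NS", "N", "NPRS", "NPR", "PNX", "PRO", "PRO;_genm_", "PRO;_ppge_",
    "WPRO", "WPRO;_genm_", "RPRO", "RPRO;_genm_", "Q;_nphd_", "D;_nphd_",
})


def split_tag(tag):
    """Split a tag into its base and any semicolon-separated extensions.

    Assumption: ";" always separates the base tag from its extensions.
    """
    base, *extensions = tag.split(";")
    return base, extensions


def can_head_np(tag):
    """True if the full tag, extension included, is listed in Table 2.2."""
    return tag in NP_HEAD_TAGS


if __name__ == "__main__":
    for tag in ("D", "D;_nphd_", "PRO;_genm_", "N"):
        print(tag, split_tag(tag), can_head_np(tag))
```

The check is made on the full tag because the extension can change the answer: D alone is not in the table, while D;_nphd_ is.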

The rest of this section describes morphosyntactic properties for identifying the words of Table 2.2.

2.2.1    Suffixes that derive nouns

There are many suffixes that derive nouns from other nouns or from words of other classes. Some examples include: {age} (package, usage), {er} (officer, teacher), {ness} (illness, awareness), {ship} (championship, relationship), {tion} (action, organisation), {ty} (ability, responsibility).

2.2.2    Count nouns and plural marking

Many nouns take the suffix {(e)s} when they refer to plural items:

(2.4)
(N hedgehog)      (NS hedgehogs)
(N revelation)    (NS revelations)
(N time)          (NS times)
(N wish)          (NS wishes)
etc.

Such nouns are instances of COUNT nouns, so called because they can combine with numerals (e.g., three wishes). A few count nouns take the irregular suffix {(r)en} in the plural:

(2.5)
(N child)         (NS children)
(N ox)            (NS oxen)

There are also count nouns that use the same form in the singular and plural. For example, nouns for the animal species in (2.6) look like singular nouns, and so are tagged N, but can also be used as plurals (e.g., three deer); while (2.7) is tagged NS for looking like a plural, but might also refer to a single pair.

(2.6)
(N deer)
(N fish)
(2.7)
(NS scissors)
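
The number marking patterns of this section can be summarised in a short sketch. It is deliberately simplified: the spelling rule covers only {s} and {es}, the irregular and invariant nouns are just those of (2.5) and (2.6), and other spelling changes (for example, city to cities) are not handled. The function names are illustrative.

```python
# Irregular and invariant nouns taken from examples (2.5) and (2.6).
IRREGULAR = {"child": "children", "ox": "oxen"}
INVARIANT = {"deer", "fish"}


def plural_form(noun):
    """Return a plural form for a count noun (simplified spelling rules)."""
    if noun in INVARIANT:
        return noun
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    # {es} after a sibilant spelling, {s} elsewhere.
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    return noun + "s"


def tagged_pair(noun):
    """Pair the N-tagged base with its plural; invariant nouns stay N."""
    plural_tag = "N" if noun in INVARIANT else "NS"
    return ("N", noun), (plural_tag, plural_form(noun))


if __name__ == "__main__":
    for noun in ("hedgehog", "wish", "child", "ox", "deer"):
        print(tagged_pair(noun))
```

Nouns such as scissors in (2.7), which are tagged NS without having an N-tagged base, fall outside this sketch.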

2.2.3    Mass nouns

In contrast to count nouns, there are MASS (or noncount) nouns that can neither combine with numerals nor inflect for number. For example:

(2.8)
(N gold)
(N information)

It is possible for the same noun to belong to both count and mass categories: in Her hair is brown, hair is a mass noun, but in I found a hair in my soup, it is a count noun.

2.2.4    Proper nouns

Proper nouns are written with an initial capital letter. Inflection for number is rare, but is possible in talk about more than one entity with the same name, e.g., There are two Heathers in my class.

(2.9)
(NPR Heather)
(NPRS Heathers)

2.2.5    Genitive case

Both proper nouns and common nouns can be case marked — but only for genitive case — by the genitive suffix {'s} (or {'} if the noun already has the plural suffix {s}). For parsing, the genitive marking is made to form a distinct word that is tagged GENM:

Table 2.3: Tag for genitive marker

GENM    The genitive marker {'s} or {'}

This marker is placed rightmost under an NP-GENV projection that comprises the genitive content. For example:

(2.10)
NP-SBJ,NP-GENV,NPR,Wendy
NP-SBJ,NP-GENV,GENM,<apos>s
NP-SBJ,N,boyfriend
ID,100_lucy_bnc_b33
(2.11)
NP-OB1,NP-GENV,D,a
NP-OB1,NP-GENV,N,couple
NP-OB1,NP-GENV,PP,P-ROLE,of
NP-OB1,NP-GENV,PP,NP,NS,hours
NP-OB1,NP-GENV,GENM,<apos>
NP-OB1,N,work
ID,212_christine_t37
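
Making the genitive marking a distinct word means splitting it off the written token. A minimal sketch of that step follows, assuming a straight apostrophe in the input and the <apos> spelling seen in the listings; split_genitive is a hypothetical helper. It cannot by itself decide that a final 's is a genitive marker, since 's is also listed as a contracted verb form in Tables 2.12, 2.14 and 2.16.

```python
def split_genitive(token):
    """Split a trailing genitive marker off a written token.

    Returns (stem, marker), where marker is "<apos>s", "<apos>", or None.
    Only the straight apostrophe (') is handled.
    """
    if token.endswith("'s"):
        return token[:-2], "<apos>s"
    if token.endswith("s'"):
        # The noun already ends in the plural suffix, so only {'} is added.
        return token[:-1], "<apos>"
    return token, None


if __name__ == "__main__":
    for token in ("Wendy's", "hours'", "boyfriend"):
        print(split_genitive(token))
```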

Double marking of a genitive is possible when the content of the genitive has both a following genitive marker and a preceding of preposition, as in (2.12).

(2.12)
NP-NSBJ,D,that
NP-NSBJ,N,microphone
NP-NSBJ,PP-GENV,P-ROLE,of
NP-NSBJ,PP-GENV,NP,NPR,Scott
NP-NSBJ,PP-GENV,NP,GENM,<apos>s
ID,249_christine_t29

2.2.6    Reciprocal, reflexive, and personal pronouns

This section presents an overview of reciprocal and reflexive pronouns (PNX), and personal pronouns (PRO). Such words are understood in the context of their occurrence, often by taking already mentioned noun phrases as antecedents. They can appear in full noun phrase positions, with the range of grammatical markings detailed in Table 2.4.

Table 2.4: Reciprocal and reflexive pronouns (PNX), and personal pronouns (PRO)

                         1st person           2nd person            3rd person
                         Singular  Plural     Singular  Plural      Singular                     Plural
                                                                    Masculine  Feminine  Neuter
Reciprocal (PNX)         each_other, one_another
Reflexive (PNX)          myself    ourselves  yourself  yourselves  himself    herself   itself  themselves
Subject (PRO)            I         we         you       you         he         she       it      they
Non-subject (PRO)        me        us         you       you         him        her       it      them
Genitive 1 (PRO;_genm_)  my        our        your      your        his        her       its     their
Genitive 2 (PRO;_ppge_)  mine      ours       yours     yours       his        hers      its     theirs

    The reciprocal pronouns each_other and one_another are used to indicate a relationship between conjoined nouns, for example, the love relationship in (2.13).

(2.13)
IP-MAT,NP-SBJ,NLYR,NP,NPR,John
IP-MAT,NP-SBJ,NLYR,CONJP,CONJ,and
IP-MAT,NP-SBJ,NLYR,CONJP,NP,NPR,Mary
IP-MAT,VBP;~Tn,love
IP-MAT,NP-OB1,PNX,each_other
ID,133_x_blue_book

Reflexive personal pronouns can indicate that the subject and object are the same entity, as in (2.14).

(2.14)
IP-MAT,NP-SBJ;{GUN},PRO,It
IP-MAT,VBP;~Tn,fires
IP-MAT,NP-OB1;{GUN},PNX,itself
ID,322_a_dick_1952

Reflexive forms are also used for emphasis, as seen with (2.15), where myself is annotated as an adverbial noun phrase with reflexive function (NP-RFL).

(2.15)
IP-MAT,NP-SBJ,PRO,I
IP-MAT,VBD;~Tn,made
IP-MAT,NP-OB1,PRO,it
IP-MAT,ADVP-MNR,ADV,all
IP-MAT,NP-RFL,PNX,myself
ID,2_lucy_child_09_m73

    A look across the columns of Table 2.4 shows that the ‘person’ category is useful when differentiating the various reflexive and personal pronouns. The first person refers to the speaker(s), the second person refers to the hearer(s), and the third person refers to other entities. Person is further distinguished into singular and plural reference. Further distinction is possible with third person singular pronouns (reflexive and personal), since they inflect for gender, giving masculine he/him(self)/his, feminine she/her(self)/hers, and neuter it(self)/its.

    A look down the rows of Table 2.4 shows that pronouns are marked for more grammatical cases than common nouns or proper nouns. In addition to genitive case, pronouns are marked for nominative case (marking the subject of the clause: I shrugged), and accusative case (marking the object acted on by the verb: James tickled me).

(2.16)
IP-MAT,NP-SBJ,PRO,I
IP-MAT,VBD;~I,shrugged
ID,50_a_loosechange
(2.17)
IP-MAT,NP-SBJ,NPR,James
IP-MAT,VBD;~Tn,tickled
IP-MAT,NP-OB1,PRO,me
ID,91_christine_t03

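    Genitive pronouns can be either dependent (PRO;_genm_), occurring before a head noun as in (2.18), or independent (PRO;_ppge_), standing alone as in (2.19).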

(2.18)
NP,NP-GENV,PRO;_genm_,our
NP,NS,friends
ID,10_lucy_child_09_m51
(2.19)
IP-MAT,NP-SBJ,PRO,it
IP-MAT,BEP;~Ln,<apos>s
IP-MAT,NP-PRD2,NP-GENV,PRO;_ppge_,theirs
ID,468_christine_t13

As shown by examples (2.18) and (2.19), with parsing, a genitive pronoun (dependent (PRO;_genm_) or independent (PRO;_ppge_)) is the only element of an NP-GENV layer. Furthermore, for PRO;_ppge_, the NP-GENV layer is the only element of the containing noun phrase.
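
The paradigm of Table 2.4 can be held as a small grid, which makes it easy to see that one form may fill more than one cell. The sketch below is an illustration only. It assumes that you and it serve as both subject and non-subject forms, that her serves as both non-subject and Genitive 1, and that his and its serve as both Genitive 1 and Genitive 2. The column labels and function names are invented for the example, and the reciprocal row is left out.

```python
# Column order follows Table 2.4; the short labels are invented here.
COLUMNS = ("1sg", "1pl", "2sg", "2pl", "3sg.m", "3sg.f", "3sg.n", "3pl")

# One entry per row: (row label, tag, forms by column).
# Assumption: you, it, her, his and its are repeated across rows as
# described in the accompanying text.
PARADIGM = (
    ("reflexive", "PNX", ("myself", "ourselves", "yourself", "yourselves",
                          "himself", "herself", "itself", "themselves")),
    ("subject", "PRO", ("I", "we", "you", "you", "he", "she", "it", "they")),
    ("non-subject", "PRO", ("me", "us", "you", "you", "him", "her", "it", "them")),
    ("genitive 1", "PRO;_genm_", ("my", "our", "your", "your",
                                  "his", "her", "its", "their")),
    ("genitive 2", "PRO;_ppge_", ("mine", "ours", "yours", "yours",
                                  "his", "hers", "its", "theirs")),
)


def pronoun(row, column):
    """Return (tag, form) for one cell of the paradigm."""
    for label, tag, forms in PARADIGM:
        if label == row:
            return tag, forms[COLUMNS.index(column)]
    raise KeyError(row)


def readings(form):
    """List every (row, column, tag) cell in which a form appears."""
    return [
        (label, column, tag)
        for label, tag, forms in PARADIGM
        for column, cell in zip(COLUMNS, forms)
        if cell == form
    ]


if __name__ == "__main__":
    print(pronoun("subject", "3sg.f"))
    print(readings("her"))
```

A form such as her comes back with two readings, so the choice between PRO and PRO;_genm_ has to be made from the context in which the word occurs.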

2.2.7    Other types of pronouns

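Other types of pronouns found in English include WH-pronouns (WPRO), as in (2.20) and (2.21); relative pronouns (RPRO), as in (2.22) and (2.23); indefinite pronouns with quantification (Q;_nphd_), as in (2.24) to (2.27); other indefinite pronouns (D;_nphd_), as in (2.28) to (2.30); and demonstrative pronouns (D;_nphd_), as in (2.32) and (2.33). For comparison, (2.31) has the personal pronoun you (PRO) in the same kind of clause as one in (2.30), and (2.34) has this as a determiner (D) rather than a pronoun.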

(2.20)
CP-QUE-MAT,IP-SUB,NP-SBJ,WPRO,Who
CP-QUE-MAT,IP-SUB,VBD;~I,came
CP-QUE-MAT,PUNC,?
ID,712_x_textbook_kisonihongo
(2.21)
CP-QUE-MAT,IP-SUB,NP-OB1,WPRO,What
CP-QUE-MAT,IP-SUB,DOD,did
CP-QUE-MAT,IP-SUB,NP-SBJ,PRO,it
CP-QUE-MAT,IP-SUB,VB;~Tn,mean
CP-QUE-MAT,PUNC,?
ID,131_susanne_n14
(2.22)
NP,D;_nphd_,anything
NP,IP-REL,NP-SBJ,RPRO,that
NP,IP-REL,BEP;~La,<apos>s
NP,IP-REL,ADJP-PRD2,ADJ,relevant
ID,88_christine_t29
(2.23)
NP-ESBJ,D,an
NP-ESBJ,N,iron
NP-ESBJ,N;@3,hook
NP-ESBJ,IP-REL,NP-OB1,RPRO,which
NP-ESBJ,IP-REL,NP-SBJ,PRO,he
NP-ESBJ,IP-REL,VBD;~Tn,unfastened
ID,38_susanne_n12
(2.24)
PP-SCON,P-CONN,as
PP-SCON,IP-ADV,NP-SBJ,Q;_nphd_,everyone
PP-SCON,IP-ADV,VBP;~I,knows
ID,80_susanne_g17
(2.25)
IP-MAT,NP-SBJ,Q;_nphd_,nobody
IP-MAT,VBP;~Tf,knows
ID,51_a_ted_talk_11
(2.26)
NP-OB1,Q;_nphd_,all
NP-OB1,PP,P-ROLE,of
NP-OB1,PP,NP,PRO,them
ID,90_a_ibm_1401
(2.27)
NP-SBJ,Q;_nphd_,none
NP-SBJ,PP,P-ROLE,of
NP-SBJ,PP,NP,PRO,them
ID,499_christine_t03
(2.28)
PP-SCON-CNT,P-CONN,in_case
PP-SCON-CNT,IP-ADV,NP-SBJ,D;_nphd_,somebody
PP-SCON-CNT,IP-ADV,VBP;~I,comes
ID,57_christine_t28
(2.29)
NP-SBJ,D;_nphd_,some
NP-SBJ,PP,P-ROLE,of
NP-SBJ,PP,NP,PRO,them
ID,26_susanne_a12
(2.30)
PP-SCON,P-CONN,as
PP-SCON,IP-ADV,NP-SBJ,D;_nphd_,one
PP-SCON,IP-ADV,MD;~cat_Vi,would
PP-SCON,IP-ADV,IP-INF-CAT,VB;~I,expect
ID,27_susanne_g12
(2.31)
PP-SCON-MOD,P-CONN,as
PP-SCON-MOD,IP-ADV,NP-SBJ,PRO,you
PP-SCON-MOD,IP-ADV,MD;~cat_Vi,can
PP-SCON-MOD,IP-ADV,IP-INF-CAT,VB;~I,see
ID,217_a_ted_talk_11
(2.32)
IP-MAT,NP-SBJ,PRO,I
IP-MAT,DOP,do
IP-MAT,NEG;_clitic_,n<apos>t
IP-MAT,VB;~Tn,want
IP-MAT,NP-OB1,D;_nphd_,that
ID,558_christine_t35
(2.33)
IP-MAT,NP-SBJ,PRO,I
IP-MAT,MD;~cat_Vi,<apos>ll
IP-MAT,IP-INF-CAT,HV;~Tn,have
IP-MAT,IP-INF-CAT,NP-OB1,D;_nphd_,these
ID,172_christine_t16
(2.34)
NP-OB1,D,this
NP-OB1,N,point
ID,5_lucy_student_e16

Table 2.5: Demonstrative pronouns (D;_nphd_)/determiners (D)

          Near speaker  Away from speaker
Singular  this          that
Plural    these         those

2.3    Words that can head an adjective phrase

For a word to be an adjective, it must be able to function as the head of an adjective phrase (ADJP; see section 3.3), often as the only element of the phrase. Adjectives are tagged as in Table 2.6 to distinguish comparative and superlative forms from general forms.

Table 2.6: Tags for words that can head an adjective phrase

ADJ    General adjective: an adjective not subclassified as ADJR or ADJS (e.g., old, good, male)
ADJR    Comparative adjective (e.g., older, better)
ADJS    Superlative adjective (e.g., oldest, best)

    The comparative form of an adjective is typically indicated by the suffix {er}, whereas the superlative form is typically indicated by the suffix {est}; see Table 2.7.

Table 2.7: Forms of adjectives

              General (ADJ)  Comparative (ADJR)  Superlative (ADJS)
Gradable      old            older               oldest
              good           better              best
Non-gradable  male

2.4    Words that can head an adverb phrase

For a word to be an adverb, it must be able to function as the head of an adverb phrase (ADVP; see section 3.4), often as the only element of the phrase. Adverbs are tagged as in Table 2.8 to distinguish comparative, superlative, and WH forms from general forms.

Table 2.8: Tags for words that can head an adverb phrase

ADV    General adverb: an adverb not subclassified as ADVR, ADVS, RADV, or WADV (e.g., often, well, really)
ADVR    Comparative adverb (e.g., more, less, farther)
ADVS    Superlative adverb (e.g., most, least, farthest)
RADV    Wh-adverb that is the relative adverb of a relative clause (e.g., how, when, where, whereby)
WADV    Wh-adverb (e.g., how, when, where, why)
RP    Adverbial particle (e.g., up, off, out)

2.5    Other noun phrase level words

Section 2.2 above has already discussed words that can head a noun phrase. Table 2.9 gives tags for other words that can be immediate components of a noun phrase, but that cannot be a noun phrase head. Rather, these words precede the head within noun phrases and function to modify the head in terms of definiteness, item under question, or quantity.

Table 2.9: Tags for words that can be immediate components of a noun phrase

D    Determiner, which includes articles (e.g., a, the) and demonstratives (e.g., this, that)
RD    Wh-determiner that is the relative determiner of a relative clause (e.g., what, whatever)
WD    Wh-determiner (e.g., which, what, whichever)
NUM    Numeral (e.g., one, 1975)
Q    Quantifier (e.g., every, no)

    Prototypical nouns can take one of two articles: indefinite (a, or some with a plural head) or definite (the). Articles for the singular head noun wish and the plural head noun wishes are seen in (2.35).

(2.35)
(D a) (N wish)                   (D the) (N wish)
(D some) (NS wishes)             (D the) (NS wishes)

2.6    Verbs

Verbs occur at clause levels of structure in the annotation. There are tags to subclassify verbs in accordance with their form, as set out in the subsections below for lexical verbs, HAVE, BE, DO, and modal verbs.

2.6.1    Lexical verbs

Verbs can change in shape to show tense. For example, the verb SUPPORT in John supports Peter takes a third person present tense {s} inflection, while in John supported Peter it has a past tense {ed} inflection. A verb that has tense is called a finite verb.

    Tenseless forms of verbs are called nonfinite verbs, which comprise infinitive forms, present participle forms, and past participle forms.

Note that participle forms are tenseless despite their full names! Infinitive forms occur in infinitive clauses, often preceded by the infinitive marker to (e.g., John happened to support Peter.). Present participles are used in the progressive construction (e.g., John is supporting Peter). Past participles are used in the perfect construction (e.g., John has supported Peter) and the passive construction (e.g., Peter is supported by John).

    The distinctions in verb forms just sketched are captured in the tags for lexical verbs of Table 2.10.

Table 2.10: Tags for lexical verbs

VBP    present tense form of lexical verbs (e.g., reaches, supports, writes, sinks, puts, reach, support, write, sink, put)
VBD    past tense form of lexical verbs (e.g., reached, supported, wrote, sank, put)
VB    infinitive form of lexical verbs (e.g., reach, support, write, sink, put)
VAG    present participle ({ing}) form of lexical verbs (used in the progressive construction) (e.g., reaching, supporting, writing, sinking, putting)
VVN    past participle ({ed}/{en}) form of lexical verbs (used in the perfect construction and the passive construction) (e.g., reached, supported, written, sunk, put)

    Table 2.11 further illustrates with examples the distinctions between the different lexical verb forms. This includes examples of irregular verbs that do not have a regular past tense inflection.

Table 2.11: Forms of lexical verb

           Tensed forms                              Tenseless forms
           Tense                                     Infinitive (VB)  Participles
           Present (VBP)                 Past (VBD)                   Present (VAG)  Past (VVN)
           3rd person singular  Other
Regular    reaches              reach    reached     reach            reaching       reached
           supports             support  supported   support          supporting     supported
Irregular  writes               write    wrote       write            writing        written
           sinks                sink     sank        sink             sinking        sunk
           puts                 put      put         put              putting        put

2.6.2    HAVE

The forms of HAVE are tagged as in Table 2.12.

Table 2.12: Tags for HAVE

HVP    present tense forms of the verb HAVE: have, 've, has, 's
HVD    past tense form of the verb HAVE: had, 'd
HV    infinitive form of the verb HAVE: have
HAG    present participle form of the verb HAVE: having
HVN    past participle form of the verb HAVE: had

HAVE has the non-contracted inflections of Table 2.13.

Table 2.13: Forms of HAVE

Tensed forms                              Tenseless forms
Tense                                     Infinitive (HV)  Participles
Present (HVP)                 Past (HVD)                   Present (HAG)  Past (HVN)
3rd person singular  Other
has                  have     had         have             having         had

2.6.3    BE

The forms of BE are tagged as in Table 2.14.

Table 2.14: Tags for BE

BEP    present tense forms of the verb BE: is, am, are, 'm, 're, 's
BED    past tense forms of the verb BE: was, were
BE    infinitive form of the verb BE: be
BAG    present participle form of the verb BE: being
BEN    past participle form of the verb BE: been

Table 2.15 presents an overview of the eight different non-contracted forms of BE. This is the widest range of distinct forms for the same verb lexeme in English, with extra person-number contrasts in the past and present tenses.

Table 2.15: Forms of BE

Tensed forms                                                         Tenseless forms
Tense                                                                Infinitive (BE)  Participles
Present (BEP)                                      Past (BED)                         Present (BAG)  Past (BEN)
3rd person singular  1st person singular  Other    singular  plural
is                   am                   are      was       were    be               being          been

2.6.4    DO

The forms of DO are tagged as in Table 2.16.

Table 2.16: Tags for DO

DOP    present tense forms of the verb DO: do, does, 's
DOD    past tense form of the verb DO: did
DO    infinitive form of the verb DO: do
DAG    present participle form of the verb DO: doing
DON    past participle form of the verb DO: done

DO has the non-contracted inflections of Table 2.17.

Table 2.17: Forms of DO

Tensed forms                              Tenseless forms
Tense                                     Infinitive (DO)  Participles
Present (DOP)                 Past (DOD)                   Present (DAG)  Past (DON)
3rd person singular  Other
does                 do       did         do               doing          done
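
Tables 2.12, 2.14 and 2.16 list some forms under more than one tag. The sketch below inverts the three tables into a lookup from form to candidate tags, as one way of seeing where context is needed to choose the tag. The data is copied from the tables; the names AUX_FORMS and auxiliary_tags are hypothetical.

```python
from collections import defaultdict

# Forms listed for each tag in Tables 2.12 (HAVE), 2.14 (BE) and 2.16 (DO).
AUX_FORMS = {
    "HVP": ["have", "'ve", "has", "'s"],
    "HVD": ["had", "'d"],
    "HV": ["have"],
    "HAG": ["having"],
    "HVN": ["had"],
    "BEP": ["is", "am", "are", "'m", "'re", "'s"],
    "BED": ["was", "were"],
    "BE": ["be"],
    "BAG": ["being"],
    "BEN": ["been"],
    "DOP": ["do", "does", "'s"],
    "DOD": ["did"],
    "DO": ["do"],
    "DAG": ["doing"],
    "DON": ["done"],
}

# Invert the tables: each form maps to every tag that lists it.
FORM_TO_TAGS = defaultdict(set)
for tag, forms in AUX_FORMS.items():
    for form in forms:
        FORM_TO_TAGS[form].add(tag)


def auxiliary_tags(form):
    """Return the candidate tags for a form of HAVE, BE or DO."""
    return set(FORM_TO_TAGS.get(form.lower(), ()))


if __name__ == "__main__":
    for form in ("'s", "had", "do", "been"):
        print(form, sorted(auxiliary_tags(form)))
```

The contracted form 's comes back with three candidates, one for each verb, while a form such as been has only one.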

2.6.5    Modal verbs

Modal verbs express meanings such as certainty, ability, or obligation. The main modal verbs are WILL, WOULD, CAN, COULD, MAY, MIGHT, SHALL, SHOULD, MUST and OUGHT. A modal verb only has finite forms and has no suffixes (e.g., I sing — he sings, but I must — he must). Modal verbs are tagged as in Table 2.18.

Table 2.18: Tags for modal verbs

MD;~cat_Vi    modal auxiliary verb (e.g., will, would, can, could, 'll, 'd)
MD;~cat_Vt    modal catenative (ought, used)

2.7    Subject indicating words

EX    existential there, i.e., there of the there is ... or there are ... construction co-occurring with an existential subject (NP-ESBJ)
PRO;_cleft_    cleft it occurring as part of a cleft construction (so it was you that got them together)
PRO;_expletive_    expletive it occurring, e.g., in a weather construction (it's raining)
PRO;_provisional_    provisional it occurring with extraposition (it bothered her that she probably would never know)

2.8    Other clause level words

Besides verbs, other clause level components are words with the tags of Table 2.19.

Table 2.19: Tags for clause level components

NEG    negative particle not
NEG;_clitic_    negative clitic particle n't
TO    Infinitive marker to
CONJ;_cl_    discourse coordination (e.g., And, But)

    We can see some of these clause level components in the annotation of (2.36). This begins with there (EX) to create an existential construction, and includes the negative clitic particle n't (NEG;_clitic_). The IP-INF-CAT as selected complement of the existential verb (BED;~ex_cat_Vt; see section 8.2.5) includes the infinitive marker to (TO).

(2.36)
IP-MAT,EX,There
IP-MAT,BED;~ex_cat_Vt,were
IP-MAT,NEG;_clitic_,n<apos>t
IP-MAT,NP-ESBJ,D,any
IP-MAT,NP-ESBJ,NS,incidents
IP-MAT,IP-INF-CAT,TO,to
IP-MAT,IP-INF-CAT,VB;~Ipr,get
IP-MAT,IP-INF-CAT,PP-CLR-LOC,P-ROLE,in
IP-MAT,IP-INF-CAT,PP-CLR-LOC,NP,D,the
IP-MAT,IP-INF-CAT,PP-CLR-LOC,NP,N,way
ID,143_lucy_bnc_c08
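
The clause level components of (2.36) can be picked out of the listing by looking at the layer label that immediately precedes each tag. The following sketch assumes the line shape layer,...,layer,TAG,word and matches on the label alone, so two separate layers with the same label would not be told apart. The function name is illustrative.

```python
def words_directly_under(listing, layer):
    """Return (tag, word) pairs whose immediately containing layer is `layer`."""
    found = []
    for line in listing.strip().splitlines():
        fields = line.strip().split(",")
        # Skip blank lines and the closing ID line.
        if len(fields) < 3 or fields[0] == "ID":
            continue
        *path, tag, word = fields
        if path[-1] == layer:
            found.append((tag, word))
    return found


# Example (2.36), copied from the text.
EXAMPLE_2_36 = """
IP-MAT,EX,There
IP-MAT,BED;~ex_cat_Vt,were
IP-MAT,NEG;_clitic_,n<apos>t
IP-MAT,NP-ESBJ,D,any
IP-MAT,NP-ESBJ,NS,incidents
IP-MAT,IP-INF-CAT,TO,to
IP-MAT,IP-INF-CAT,VB;~Ipr,get
IP-MAT,IP-INF-CAT,PP-CLR-LOC,P-ROLE,in
IP-MAT,IP-INF-CAT,PP-CLR-LOC,NP,D,the
IP-MAT,IP-INF-CAT,PP-CLR-LOC,NP,N,way
ID,143_lucy_bnc_c08
"""

if __name__ == "__main__":
    print(words_directly_under(EXAMPLE_2_36, "IP-MAT"))
    print(words_directly_under(EXAMPLE_2_36, "IP-INF-CAT"))
```

For IP-MAT this returns there, were and n't with their tags; for IP-INF-CAT it returns to and get.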

    It is also possible for a word tagged RP to occur as a clause level component. The RP tag is used to mark adverbial particles (e.g., up, off, out) and was seen in section 2.4 as the tag for a word that can head an adverb phrase. When an RP tagged word occurs as a clause level component it is part of a phrasal verb, as in (2.37).

(2.37)
IP-MAT,VBP;~phr_Vp,hold
IP-MAT,RP,on
IP-MAT,NP-TMP,D,a
IP-MAT,NP-TMP,N,second
ID,33_christine_t28

2.9    Connective words

So far we have considered words that serve as components of either phrases or clauses. There is a further class of words with the tags of Table 2.20 that serve as the means to connect phrases and clauses.

Table 2.20: Tags for connective words

CONJ    Coordinating conjunction (e.g., and, or, but)
C    The complementizer that
WQ    Marker of indirect question (whether or if)
P-CONN    Subordinating conjunction (e.g., although, when)
P-ROLE    Role preposition (e.g., in, of, under)

When phrases and clauses are connected they are said to be COMPLEX.


2.10    Punctuation

Punctuation points, quotation marks, and brackets (‘.’ ‘?’ ‘!’ ‘:’ ‘;’ ‘,’ ‘-’ ‘(’ ‘)’ etc.) are treated as words for the purposes of word tagging with the tags of Table 2.21.

Table 2.21: Tags for punctuation

PUNC    Punctuation: general separating mark, i.e., . , ! : ; - or ?
PULB    Punctuation: left bracket, i.e., ( or [
PURB    Punctuation: right bracket, i.e., ) or ]
PULQ    Punctuation: left quotation mark, i.e., ‘ or “
PURQ    Punctuation: right quotation mark, i.e., ’ or ”

This makes punctuation part of a sentence in its own right. When creating constituent structure, punctuation is placed as high as possible. For example, a full stop that ends a sentence is treated as the last constituent of the highest clause layer (IP/CP/FRAG).
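
Table 2.21 amounts to a lookup from a punctuation mark to its tag, sketched below. The marks for PUNC, PULB and PURB are those listed in the table. The curly quotation marks given for PULQ and PURQ are an assumption, and straight quotation marks are left out because they do not show direction. The names are hypothetical.

```python
# Build the lookup from groups of marks to tags.
PUNCT_TAGS = {}
for marks, tag in (
    (".,!:;-?", "PUNC"),
    ("([", "PULB"),
    (")]", "PURB"),
    ("\u2018\u201c", "PULQ"),  # assumed: left single and double curly quotes
    ("\u2019\u201d", "PURQ"),  # assumed: right single and double curly quotes
):
    for mark in marks:
        PUNCT_TAGS[mark] = tag


def punctuation_tag(token):
    """Return the Table 2.21 tag for a single punctuation mark, else None."""
    return PUNCT_TAGS.get(token)


if __name__ == "__main__":
    for token in ("?", "(", "]", "\u201c", "wall"):
        print(repr(token), punctuation_tag(token))
```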


2.11    Interjection, reaction signals, and formulaic expressions

Interjections, reaction signals, and formulaic expressions are treated as single words with the tags of Table 2.22. They have a high placement in structure, typically occurring as elements of clause or fragment layers.

Table 2.22: Tags for interjection, reaction signals, and formulaic expressions

INTJ    Interjection (e.g., aah, eh, ummmmm)
REACT    Reaction signal (e.g., good_grief, really, yes, wow)
FRM    Formulaic expression (e.g., good_afternoon, you_see, thank_you)
(2.38)
FRAG,INTJ,Well
FRAG,PUNC,<comma>
FRAG,REACT,no
FRAG,PUNC,<comma>
FRAG,ADVP-MOD,ADV,of_course
FRAG,NEG,not
ID,19_lucy_bnc_b19
(2.39)
FRAG,FRM,Thank_you
FRAG,NP-MNR,ADJP,ADVP,ADV,very
FRAG,NP-MNR,ADJP,ADJ,much
ID,43_a_ted_talk_11

2.12    Other tags

Tags for other possible elements of a parse are given in Table 2.23.

Table 2.23: Other tags

FO    Formula
FW    Foreign word
LS    List item (e.g., 1, a, i)
SYM    Symbol