Part-of-Speech (POS) tagging assigns a grammatical category (noun, verb, adjective, etc.) to each token. Chunking groups consecutive POS-tagged tokens into phrase-level units such as noun phrases and verb phrases. Dependency parsing maps the syntactic relationships between words — identifying which word is the subject, object, or modifier of another. Together, these three tasks form the syntactic analysis layer of the NLP pipeline and are prerequisites for information extraction, semantic analysis, and many downstream applications.
Real-life analogy: The grammar teacher
POS tagging is like a grammar teacher marking parts of speech in a sentence: "The quick (adjective) brown (adjective) fox (noun) jumps (verb) over the lazy (adjective) dog (noun)." Dependency parsing goes further — the teacher also draws arrows showing "jumps" is the root, "fox" is its subject, "over the dog" is its location modifier. Chunking groups "the quick brown fox" into a single noun phrase box.
POS Tagging — Penn Treebank tagset
| POS Tag | Meaning | Example |
|---|---|---|
| NN | Noun, singular | dog, city, model |
| NNS | Noun, plural | dogs, cities, models |
| VB | Verb, base form | run, eat, train |
| VBZ | Verb, 3rd person singular | runs, eats, trains |
| JJ | Adjective | quick, large, neural |
| RB | Adverb | quickly, very, never |
| DT | Determiner | the, a, an, this |
| IN | Preposition/conjunction | in, on, of, because |
| PRP | Personal pronoun | I, he, she, they |
| NNP | Proper noun singular | London, Google, Ravi |
POS tagging with NLTK and spaCy
```python
import nltk
nltk.download('punkt_tab', quiet=True)
nltk.download('averaged_perceptron_tagger_eng', quiet=True)
from nltk.tokenize import word_tokenize
from nltk import pos_tag

sentence = "The quick brown fox jumps over the lazy dog"
tokens = word_tokenize(sentence)
tagged = pos_tag(tokens)
print(tagged)
# [('The', 'DT'), ('quick', 'JJ'), ('brown', 'JJ'), ('fox', 'NN'),
#  ('jumps', 'VBZ'), ('over', 'IN'), ('the', 'DT'), ('lazy', 'JJ'), ('dog', 'NN')]
```
```python
# Modern approach: spaCy (faster, more accurate)
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying U.K. startup for $1 billion")
for token in doc:
    print(f"{token.text:<15} {token.pos_:<8} {token.tag_:<6} {token.dep_}")
# Apple           PROPN    NNP    nsubj
# is              AUX      VBZ    aux
# looking         VERB     VBG    ROOT
# at              ADP      IN     prep
# buying          VERB     VBG    pcomp
```
Chunking — shallow parsing
Chunking (shallow parsing) groups tagged tokens into multi-word phrases without building a full parse tree. The most common chunk type is Noun Phrase (NP): a determiner + adjectives + noun. Chunking uses regular expressions over POS tag sequences.
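The mechanics can be illustrated without NLTK: encode the tag sequence as a string and run an ordinary regex over it. This is a minimal sketch of the idea, not NLTK's actual internals — the `np_chunks` helper and its tag encoding are illustrative:

```python
import re

def np_chunks(tagged):
    """Find NP spans (optional DT + any JJs + one or more nouns) in a tagged sentence."""
    # Encode the tag sequence as a string like "<PRP><VBD><DT><JJ><NN>"
    tag_string = "".join(f"<{tag}>" for _, tag in tagged)
    spans = []
    for m in re.finditer(r"(<DT>)?(<JJ>)*(<NN[A-Z]*>)+", tag_string):
        # Map character offsets back to token indices by counting opening brackets
        start = tag_string[:m.start()].count("<")
        end = start + m.group(0).count("<")
        spans.append([word for word, _ in tagged[start:end]])
    return spans

tagged = [("He", "PRP"), ("bought", "VBD"), ("a", "DT"), ("new", "JJ"),
          ("car", "NN"), ("from", "IN"), ("a", "DT"), ("local", "JJ"),
          ("dealer", "NN")]
print(np_chunks(tagged))
# [['a', 'new', 'car'], ['a', 'local', 'dealer']]
```

The `<NN[A-Z]*>` part plays the role of NLTK's `<NN.*>`, matching NN, NNS, NNP, and NNPS alike.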
Noun phrase chunking with NLTK
```python
import nltk
nltk.download('punkt_tab', quiet=True)
nltk.download('averaged_perceptron_tagger_eng', quiet=True)
from nltk import RegexpParser, pos_tag
from nltk.tokenize import word_tokenize

sentence = "He bought a brand new electric car from a local dealer"
tokens = word_tokenize(sentence)
tagged = pos_tag(tokens)

# Grammar: NP = optional determiner + any number of adjectives + one or more nouns
grammar = r"NP: {<DT>?<JJ>*<NN.*>+}"
parser = RegexpParser(grammar)
tree = parser.parse(tagged)
print(tree)
# (S
#   He/PRP
#   bought/VBD
#   (NP a/DT brand/JJ new/JJ electric/JJ car/NN)
#   from/IN
#   (NP a/DT local/JJ dealer/NN))
```
Dependency Parsing
Dependency parsing builds a tree where each word points to its head (the word it modifies or depends on). Every sentence has exactly one root word (the main verb). The tree captures grammatical relations: subject (nsubj), direct object (dobj), modifier (amod), preposition (prep).
Example: She enjoys reading books. Dependency tree: enjoys(root) ← She(nsubj), enjoys → reading(xcomp), reading → books(dobj).
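Structurally, a dependency parse is just a head-pointer table: each word records the index of its head and the relation label. The tree above can be sketched in plain Python — the tuple layout here is illustrative, loosely mirroring spaCy's `token.head` and `token.dep_`:

```python
# Each entry: (word, head index, relation); the root points to itself.
parse = [
    ("She",     1, "nsubj"),   # She ← enjoys
    ("enjoys",  1, "ROOT"),    # root of the sentence
    ("reading", 1, "xcomp"),   # enjoys → reading
    ("books",   2, "dobj"),    # reading → books
]

def root(parse):
    """The root is the one word that is its own head."""
    return next(w for i, (w, h, _) in enumerate(parse) if h == i)

def children(parse, idx):
    """All words whose head is the word at idx."""
    return [(w, rel) for i, (w, h, rel) in enumerate(parse) if h == idx and i != idx]

print(root(parse))          # enjoys
print(children(parse, 1))   # [('She', 'nsubj'), ('reading', 'xcomp')]
print(children(parse, 2))   # [('books', 'dobj')]
```

Walking head pointers like this is exactly how downstream code extracts subject–verb–object triples from a parsed sentence.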
Universal Dependencies
The Universal Dependencies (UD) project defines a consistent set of dependency relations across 100+ languages, enabling cross-lingual NLP. spaCy, Stanza, and other modern parsers all support UD. Key relations: nsubj (nominal subject), dobj/obj (direct object), amod (adjectival modifier), advmod (adverbial modifier), prep (prepositional modifier), cc (coordinating conjunction).
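These labels can be looked up programmatically: spaCy's `spacy.explain` utility returns the glossary description for a tag or relation, and it needs no language model to be loaded. A quick sketch:

```python
import spacy

# spacy.explain looks up a label in spaCy's built-in glossary
for rel in ["nsubj", "dobj", "amod", "advmod", "cc"]:
    print(f"{rel:<8} {spacy.explain(rel)}")
# nsubj    nominal subject
# dobj     direct object
# amod     adjectival modifier
# advmod   adverbial modifier
# cc       coordinating conjunction
```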
| Analysis type | Output | Use case |
|---|---|---|
| POS tagging | Token → grammatical tag | Feature for NER, chunking, parsing |
| Chunking | Token spans → phrase type | IE, shallow syntax for fast pipelines |
| Constituency parsing | Full phrase-structure tree | Grammar checking, formal syntax analysis |
| Dependency parsing | Word → head + relation | NLU, semantic role labelling, QA, coreference |
Practice questions
- POS tag the sentence "The dog barks loudly." (Answer: The/DT dog/NN barks/VBZ loudly/RB ./.)
- What is the difference between constituency parsing and dependency parsing? (Answer: Constituency builds a phrase-structure tree (NP, VP). Dependency builds a word-relation tree showing which word governs which.)
- In the NP chunk grammar {<DT>?<JJ>*<NN.*>+}, what does the ? mean? (Answer: Optional — the determiner DT may appear zero or one time.)
- Why is POS tagging considered a sequence labelling problem? (Answer: The correct tag for a word depends on its context — "run" is NN in "a run" but VB in "I run". Models must consider the entire sequence.)
- Which dependency relation connects "She" to "enjoys" in "She enjoys music"? (Answer: nsubj — nominal subject. "She" is the subject of the root verb "enjoys".)
On LumiChats
When you paste a document into LumiChats and ask it to extract key entities or summarise the main actions, the underlying model applies dependency parsing to understand subject-verb-object relationships — enabling it to answer questions like who did what to whom.