In this talk,
Hashemi introduces their newly published work entitled “An Evaluation of Parser
Robustness for Ungrammatical Sentences”. In realistic settings, natural language sentences are not always correct and well-edited. In addition to heavily edited texts such as news articles and formal reports, there are massive amounts of noisier text, including microblogs, tweets, consumer reviews, English-as-a-Second-Language (ESL) writing, and machine translation (MT) output. Hashemi points out that, as the first step of natural language processing, parsing influences all downstream applications. For the same parser, the output on an ungrammatical sentence can differ drastically from the output on its grammatical counterpart. As shown in Figure 1, a single extra word “about” heavily changes the parse. Therefore, it is necessary to test the robustness of state-of-the-art parsers on ungrammatical sentences.
They compare the robustness of existing parsers by applying them to ungrammatical sentences. If a parser can overlook problems such as grammar mistakes and produce a parse tree that closely resembles the correct analysis of the intended sentence, then they say the parser is robust to ungrammatical sentences. Since manually annotated gold-standard trees for ungrammatical sentences are time-consuming and expensive to obtain, they propose a gold-standard-free approach. Specifically, they take the parse trees of the corresponding well-formed sentences as the gold standard. In this case the traditional evaluation metrics cannot be employed directly, because the words of an ungrammatical sentence and its grammatical counterpart do not necessarily match. They therefore introduce two definitions: error-related dependencies and shared dependencies. An error-related dependency is a dependency connected to an extra (erroneous) word, while a shared dependency is a dependency shared by the two trees. Hashemi then presents their measurements as:
· Precision is (# of shared dependencies) / (# of dependencies of the ungrammatical sentence - # of error-related dependencies of the ungrammatical sentence);
· Recall is (# of shared dependencies) / (# of dependencies of the grammatical sentence - # of error-related dependencies of the grammatical sentence); and
· F1 is the harmonic mean of precision and recall.
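To make the computation concrete, here is a minimal sketch (in Python, not the authors' code) of how these scores could be computed from the two dependency parses, assuming that the word alignment between the ungrammatical sentence and its grammatical counterpart, and the indices of the error words, are already known; the sentence pair, the toy parses, and the function name robustness_scores below are illustrative assumptions, not taken from the paper. Note that the error-related arcs are removed from both denominators, so a parser is neither rewarded nor penalized for the arcs it is forced to create around the error words.

def robustness_scores(gram_deps, ungram_deps, gram_error_words, ungram_error_words, align):
    # gram_deps / ungram_deps: sets of (head_index, dependent_index) arcs,
    # with index 0 reserved for the artificial root.
    # *_error_words: word indices involved in the grammatical error.
    # align: maps an ungrammatical-sentence word index to its grammatical counterpart.

    # Error-related dependencies: any arc that touches an error word.
    err_ungram = {d for d in ungram_deps
                  if d[0] in ungram_error_words or d[1] in ungram_error_words}
    err_gram = {d for d in gram_deps
                if d[0] in gram_error_words or d[1] in gram_error_words}

    # Shared dependencies: arcs of the ungrammatical parse that, once their
    # word indices are mapped through the alignment, also appear in the
    # grammatical (reference) parse.
    shared = {d for d in ungram_deps
              if (align.get(d[0]), align.get(d[1])) in gram_deps}

    precision = len(shared) / (len(ungram_deps) - len(err_ungram))
    recall = len(shared) / (len(gram_deps) - len(err_gram))
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Toy pair: "We discussed the issue" (grammatical) vs.
# "We discussed about the issue" (extra word "about" at index 3).
gram = {(0, 2), (2, 1), (2, 4), (4, 3)}            # root->discussed, discussed->We, discussed->issue, issue->the
ungram = {(0, 2), (2, 1), (2, 3), (3, 5), (5, 4)}  # noisy parse attaching "issue" under "about"
p, r, f = robustness_scores(gram, ungram,
                            gram_error_words=set(),
                            ungram_error_words={3},
                            align={0: 0, 1: 1, 2: 2, 4: 3, 5: 4})
print("precision=%.2f recall=%.2f f1=%.2f" % (p, r, f))   # precision=1.00 recall=0.75 f1=0.86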
In their
experiments, they test eight leading dependency parsers: Malt Parser, Mate
Parser, MST Parser, Stanford Neural Network Parser, SyntaxNet, Turbo Parser,
Tweebo Parser and Yara Parser. Their training data consists of Penn Treebank (50000
sentences) and Tweebank (1000 sentences). Their test data contains ESL and MT
sentences. They find that the parsers exhibit varying degrees of robustness across datasets. If the data is more similar to tweets, Malt or Turbo may be good choices; if it is more like MT output, SyntaxNet, Malt, and Turbo are good choices.
About the talk:
Speaker: Homa Hashemi
Date: Nov 18, 2016