<!DOCTYPE html>
<html lang="en"><!-- Beautiful Jekyll | MIT license | Copyright Dean Attali 2016 --><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0">
<title>Yang Xiao</title>
<meta name="author" content="Copy from Kai-Wei Chang">
<link rel="alternate" type="application/rss+xml" title="Yang Xiao" href="https://xiaoyang66.github.io/">
<!-- <link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.2.0/css/font-awesome.min.css" rel="stylesheet"> -->
<link rel="stylesheet" href="https://cdn.staticfile.org/font-awesome/4.7.0/css/font-awesome.css">
<!-- <link rel="stylesheet" href="./JinlanFu/font-awesome.min.css"> -->
<link rel="stylesheet" href="./JinlanFu/bootstrap.min.css">
<link rel="stylesheet" href="./JinlanFu/bootstrap-social.css">
<link rel="stylesheet" href="./JinlanFu/main.css">
<link rel="stylesheet" href="./JinlanFu/css">
<link rel="stylesheet" href="./JinlanFu/css(1)">
<body>
<nav class="navbar navbar-default navbar-fixed-top navbar-custom">
<div class="container-fluid">
<div class="navbar-header">
<button type="button" class="navbar-toggle" data-toggle="collapse" data-target="#main-navbar">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
</div>
<div class="collapse navbar-collapse" id="main-navbar">
<ul class="nav navbar-nav navbar-left">
<!-- <ul class="navbar-nav mr-auto"> -->
<li>
<a href="https://xiaoyang66.github.io">Yang Xiao</a>
</li>
<li>
<a href="https://xiaoyang66.github.io">Home</a>
</li>
<li>
<a href="#award">Award</a>
</li>
<li>
<a href="#project">Project</a>
</li>
<li>
<a href="#publication">Publication</a>
</li>
<!--<li>
<a href="#service">Service</a>
</li>-->
<!--<li>
<a href="#opening">Opening</a>
</li>-->
<!-- <li>
<a href="http://web.cs.ucla.edu/~kwchang/talks">Talks</a>
</li> -->
<!--
<li>
<a href="http://web.cs.ucla.edu/~kwchang/blog">News</a>
</li>
<li>
<a href="http://web.cs.ucla.edu/~kwchang/awards">Awards</a>
</li>
<li>
<a href="http://web.cs.ucla.edu/~kwchang/funding">Funding</a>
</li>
<li>
<a href="http://web.cs.ucla.edu/~kwchang/application">Opening</a>
</li> -->
</ul>
</div>
<!-- <div class="avatar-container">
<div class="avatar-img-border">
<a href="http://web.cs.ucla.edu/~kwchang">
<img class="avatar-img" src="./JinlanFu/uclanlp.png">
</a>
</div>
</div> -->
</div>
</nav>
<!-- TODO this file has become a mess, refactor it -->
<div class="intro-header"></div>
<div class="container" role="main">
<div class="row">
<div class="col-lg-10 col-lg-offset-1 col-md-10 col-md-offset-1">
<div class="jumbotron profile" id="main-profile">
<div class="container">
<div class="col-md-3" align="right">
<p><img src="./JinlanFu/xiaoyang.jpeg" alt="image-title-here" class="avatar-img"></p>
</div>
<div class="col-md-9">
<h2 align="center"> Yang Xiao </h2>
<!-- <h2 align="center"> XXXXX </h2> -->
<div class="col-md-7 col-md-offset-1" id="main-profile">
<ul>
<li><a href="https://xiaoyang66.github.io/">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-university fa-stack-1x fa-inverse"></i>
</span>
</a> Ph.D. student @ PolyU-CS </li>
<li><a href="mailto:[email protected]" title="Email me">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-envelope fa-stack-1x fa-inverse"></i>
</span>
</a> [email protected]</li>
<li><a href="https://xiaoyang66.github.io/">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-location-arrow fa-stack-1x fa-inverse"></i>
</span> </a> PolyU Rm VA316</li>
<li><a href="https://xiaoyang66.github.io/">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-hashtag fa-stack-1x fa-inverse"></i>
<!-- </span> </a> <span id="smallbox">Natural Language Processing<br> Machine Learning</span> -->
</span> </a> <span id="smallbox">Natural Language Processing</span>
</li>
</ul>
</div>
<div class="col-md-4" id="main-profile">
<ul>
<li><a href="https://scholar.google.com/citations?user=rLqDPtQAAAAJ&hl=en">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-graduation-cap fa-stack-1x fa-inverse"></i>
</span> Google Scholar </a><a href="https://scholar.google.com/citations?user=rLqDPtQAAAAJ&hl=en">
<i class="fa fa-rss"></i>
</a>
</li>
<!-- <li><a href="https://dblp.org/pid/218/7289.html">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-youtube fa-stack-1x fa-inverse"></i>
</span> DBLP</a>
</li> -->
<!--<li><a href="https://dblp.org/pid/218/7289.html">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-id-badge fa-stack-1x fa-inverse"></i>
</span> DBLP</a>
</li>-->
<li><a href="https://twitter.com/EtdfA361sIoLA61">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-twitter fa-stack-1x fa-inverse"></i>
</span> Twitter</a>
</li>
<li><a href="https://drive.google.com/file/d/1XqsNO0KSxC5ZlnwDgjav1LJ4oLYs4ajS/view?usp=sharing">
<span class="fa-stack fa-lg">
<i class="fa fa-circle fa-stack-2x"></i>
<i class="fa fa-id-badge fa-stack-1x fa-inverse"></i>
</span> Curriculum Vitae</a>
</li>
</ul>
</div>
</div>
</div>
</div>
<div class="row">
<h2> About Me </h2>
<p>
Hi there! I am currently a Ph.D. student at The Hong Kong Polytechnic University (PolyU), advised by Prof. <a class="urllink" href="https://scholar.google.com/citations?user=Rx5swD4AAAAJ&hl=en" rel="nofollow">Wenjie Li</a>. My research has mainly focused on generative agents, dialogue generation, and evaluation for NLP tasks, but I am open to other areas as well. My long-term research goal is to develop intelligent agents for the good of humanity.
</p>
<p>
I received my Bachelor’s degree from Fudan University in 2022, majoring in software engineering and computer science. During my undergraduate studies, I worked closely with Dr. <a class="urllink" href="http://pfliu.com/#" rel="nofollow">Pengfei Liu</a>, Dr. <a class="urllink" href="https://jinlanfu.github.io/" rel="nofollow">Jinlan Fu</a>, and Prof. <a class="urllink" href="http://www.phontron.com/" rel="nofollow">Graham Neubig</a>, which developed my interest in natural language processing.
<!--Hi there, I am a postdoc at the National University of Singapore, working with Prof. <a class="urllink" href="https://www.comp.nus.edu.sg/~ngsk/" rel="nofollow">See-Kiong Ng</a>.
I received my Ph.D. degree from the School of Computer Science, Fudan University (Sep. 2016 ~ Jul. 2021), supervised by Prof. <a class="urllink" href="http://nlp.fudan.edu.cn/xjhuang" rel="nofollow"> Xuanjing Huang </a> and Prof. <a class="urllink" href="http://qizhang.info/" rel="nofollow"> Qi Zhang </a>. -->
<!-- where the advisor is Prof. <a class="urllink" href="http://nlp.fudan.edu.cn/xjhuang" rel="nofollow"> Xuanjing Huang </a>. -->
<!--From Dec. 2019 to Jun. 2022, I was lucky to work closely (remotely) with Dr.
<a class="urllink" href="http://pfliu.com/" rel="nofollow"> Pengfei Liu </a> and Prof. <a class="urllink" href="http://www.phontron.com/" rel="nofollow"> Graham Neubig</a> of the Language Technologies Institute (LTI) at Carnegie Mellon University. -->
</p>
<!--<p id="opening">
My research focuses on the following perspectives for natural language processing:
</p>
<ul>
<li><p> Dialogue System; </p></li>
<li><p> Interpretable Analysis and Text Evaluation; </p></li>
<li><p> Information Extraction, Sequence Labeling; </p></li>
<li><p> Cross-lingual Transfer Learning. </p></li>
</ul>-->
<!-- <span style="color: deeppink;font-size:20px;"> -->
<!-- <span style="color: deeppink"> -->
<!--<span>
🔥🔥 We are looking for highly-motivated interns, Research Assistants, and Ph.D to work on Natural Language Processing. Please drop me an email with your CV if you are interested. </span>-->
</div>
<hr id="award">
<div class="row" >
<h2 id="about-me">Awards</h2>
<ul>
<li><p><b>PolyU Presidential PhD Fellowship, 2023</b></p></li>
<li>
<p><b>Outstanding Demo Paper Award, ACL 2022: </b> <a href="https://aclanthology.org/2022.acl-demo.18.pdf">DataLab: A Platform for Data Analysis and Intervention</a></p>
</li>
<li>
<p><b>Best Demo Paper Award, ACL 2021: </b> <a href="https://arxiv.org/pdf/2104.06387.pdf">ExplainaBoard: An Explainable Leaderboard for NLP</a></p>
</li>
<li>
<p><b>National Scholarship, Fudan University 2019</b>
</p>
</li>
<!--<li>
<p><b>Excellent Doctoral Dissertation of Chinese Information Society of China</b></p>
</li>-->
<!-- <li>
<p>Outstanding Graduate, Fudan University, 2021</p>
</li>
<li>
<p>2017-2018 and 2019-2020, National Scholarship </p>
</li> -->
</ul>
</div>
<hr id="project">
<div class="row" >
<h2 id="about-me">Projects</h2>
<ul class="bibliography">
<li>
<span class="project_text">DataLab: A Platform for Data Analysis and Intervention: </span>
<a href="https://datalab.nlpedia.ai/" class="pj_home">Homepage</a>
<a href="https://github.com/ExpressAI/DataLab" class="pj_home">Code</a>
</li>
<li>
<span class="project_text">ExplainaBoard: An Explainable Leaderboard for NLP: </span>
<a href="http://explainaboard.nlpedia.ai/" class="pj_home">Homepage</a>
<a href="https://github.com/neulab/ExplainaBoard" class="pj_home">Code</a>
</li>
</ul>
</div>
<hr id="publication">
<div class="row">
<h2 id="about-me">Selected Publications</h2>
<h4>A complete list is in <a href="https://scholar.google.com/citations?user=rLqDPtQAAAAJ&hl=en" target="_blank"><u>Google Scholar</u></a>.<br>
</h4>
<h4> * denotes the corresponding author. </h4>
<h2 class="bibliography">2022</h2>
<ul class="bibliography">
<!--<li>
<h4> <a href="https://arxiv.org/pdf/2204.14264.pdf"> Polyglot Prompt: Multilingual Multitask PrompTraining </a></h4>
<span id="fu2022poly"> <b>Jinlan Fu</b>, See-Kiong Ng, Pengfei Liu </span>
<br>
<span class="conf">EMNLP </span>
<a href="https://arxiv.org/pdf/2204.14264.pdf" class="my_details">Full Text</a>
<a href="https://github.com/jinlanfu/Polyglot_Prompt" class="my_code">Code</a>
<a data-toggle="collapse" href="#fu2022poly-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2022poly-bibtex" class="my_details">BibTeX</a>
<div id="fu2022poly-materials">
<pre id="fu2022poly-abstract" class="pre collapse">This paper aims for a potential architectural improvement for multilingual learning and asks: Can different tasks from different languages be modeled in a monolithic framework, i.e. without any task/language-specific module? The benefit of achieving this could open new doors for future multilingual research, including allowing systems trained on low resources to be further assisted by other languages as well as other tasks. We approach this goal by developing a learning framework named Polyglot Prompting to exploit prompting methods for learning a unified semantic space for different languages and tasks with multilingual prompt engineering. We performed a comprehensive evaluation of 6 tasks, namely topic classification, sentiment classification, named entity recognition, question answering, natural language inference, and summarization, covering 24 datasets and 49 languages. The experimental results demonstrated the efficacy of multilingual multitask prompt-based learning and led to inspiring observations. We also present an interpretable multilingual evaluation methodology and show how the proposed framework, multilingual multitask prompt training, works. We release all datasets prompted in the best setting and code. </pre>
<pre id="fu2022poly-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2022poly,
title = {Polyglot Prompt: Multilingual Multitask PrompTraining},
author = {Jinlan Fu, See-Kiong Ng, Pengfei Liu},
booktitle = {EMNLP},
year = {2022}
}
</pre>
</div>
</li>-->
<!--<li>
<h4> <a href="https://aclanthology.org/2022.coling-1.38.pdf"> CorefDiffs: Co-referential and Differential Knowledge Flow in Document Grounded Conversations </a></h4>
<span id="lin2022corefdiffs"> Lin Xu, Qixian Zhou, <b>Jinlan Fu</b>, Min-Yen Kan, See-Kiong Ng </span>
<br>
<span class="conf">COLING </span>
<a href="https://aclanthology.org/2022.coling-1.38.pdf" class="my_details">Full Text</a>
<a href="https://github.com/cathyxl/coref-diffs" class="my_code">Code</a>
<a data-toggle="collapse" href="#lin2022corefdiffs-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#lin2022corefdiffs-bibtex" class="my_details">BibTeX</a>
<div id="lin2022corefdiffs-materials">
<pre id="lin2022corefdiffs-abstract" class="pre collapse">Knowledge-grounded dialog systems need to incorporate smooth transitions among knowledge selected for generating responses, to ensure that dialog flows naturally. For document-grounded dialog systems, the inter- and intra-document knowledge relations can be used to model such conversational flows. We develop a novel Multi-Document Co-Referential Graph (Coref-MDG) to effectively capture the inter-document relationships based on commonsense and similarity and the intra-document co-referential structures of knowledge segments within the grounding documents. We propose CorefDiffs, a Co-referential and Differential flow management method, to linearize the static Coref-MDG into conversational sequence logic. CorefDiffs performs knowledge selection by accounting for contextual graph structures and the knowledge difference sequences. CorefDiffs significantly outperforms the state-of-the-art by 9.5%, 7.4%, and 8.2% on three public benchmarks. This demonstrates that the effective modeling of co-reference and knowledge difference for dialog flows are critical for transitions in document-grounded conversation </pre>
<pre id="lin2022corefdiffs-bibtex" class="pre pre-scrollable collapse">@inproceedings{lin2022corefdiffs,
title = {CorefDiffs: Co-referential and Differential Knowledge Flow in Document Grounded Conversations},
author = {Lin Xu, Qixian Zhou, Jinlan Fu, Min-Yen Kan, See-Kiong Ng},
booktitle = {COLING},
year = {2022}
}
</pre>
</div>
</li>-->
<li>
<h4> <a href="https://arxiv.org/pdf/2205.02129.pdf"> Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification </a></h4>
<span id="xiao2022eval"> <b>Yang Xiao</b>, Jinlan Fu*, See-Kiong Ng, Pengfei Liu </span>
<br>
<span class="conf">NAACL </span>
<a href="https://arxiv.org/pdf/2205.02129.pdf" class="my_details">Full Text</a>
<a href="https://github.com/ExpressAI/DataLab" class="my_code">Code</a>
<a href="https://datalab.nlpedia.ai/" class="my_code">DataLab</a>
<a data-toggle="collapse" href="#xiao2022eval-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#xiao2022eval-bibtex" class="my_details">BibTeX</a>
<div id="xiao2022eval-materials">
<pre id="xiao2022eval-abstract" class="pre collapse">In this paper, we ask the research question of whether all the datasets in the benchmark are necessary. We approach this by first characterizing the distinguishability of datasets when comparing different systems. Experiments on 9 datasets and 36 systems show that several existing benchmark datasets contribute little to discriminating top-scoring systems, while those less used datasets exhibit impressive discriminative power. We further, taking the text classification task as a case study, investigate the possibility of predicting dataset discrimination based on its properties (e.g., average sentence length). Our preliminary experiments promisingly show that given a sufficient number of training experimental records, a meaningful predictor can be learned to estimate dataset discrimination over unseen datasets. We released all datasets with features explored in this work on DataLab </pre>
<pre id="xiao2022eval-bibtex" class="pre pre-scrollable collapse">@inproceedings{xiao2022eval,
title = {Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification},
author = {Yang Xiao, Jinlan Fu, See-Kiong Ng, Pengfei Liu},
booktitle = {NAACL},
year = {2022}
}
</pre>
</div>
</li>
<li>
<h4> <a href="https://aclanthology.org/2022.acl-demo.18.pdf"> DataLab: A Platform for Data Analysis and Intervention </a></h4>
<span id="xiao2022datalab"> <b>Yang Xiao</b>, Jinlan Fu, Weizhe Yuan, Vijay Viswanathan, Zhoumianze Liu, Yixin Liu, Graham Neubig, Pengfei Liu</span>
<br>
<span class="conf">ACL-2022, Outstanding Demo </span>
<a href="https://aclanthology.org/2022.acl-demo.18.pdf" class="my_details">Full Text</a>
<a href="https://github.com/ExpressAI/DataLab" class="my_code">Code</a>
<a href="https://datalab.nlpedia.ai/" class="my_code">DataLab</a>
<a data-toggle="collapse" href="#xiao2022datalab-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#xiao2022datalab-bibtex" class="my_details">BibTeX</a>
<div id="xiao2022datalab-materials">
<pre id="xiao2022datalab-abstract" class="pre collapse">Despite data’s crucial role in machine learning, most existing tools and research tend to focus on systems on top of existing data rather than how to interpret and manipulate data.In this paper, we propose DataLab, a unified data-oriented platform that not only allows users to interactively analyze the characteristics of data but also provides a standardized interface so that many data processing operations can be provided within a unified interface. Additionally, in view of the ongoing surge in the proliferation of datasets, DataLab has features for dataset recommendation and global vision analysis that help researchers form a better view of the data ecosystem. So far, DataLab covers 1,300 datasets and 3,583 of its transformed version, where 313 datasets support different types of analysis (e.g., with respect to gender bias) with the help of 119M samples annotated by 318 feature functions. DataLab is under active development and will be supported going forward. We have released a web platform, web API, Python SDK, and PyPI published package, which hopefully, can meet the diverse needs of researchers. </pre>
<pre id="xiao2022datalab-bibtex" class="pre pre-scrollable collapse">@inproceedings{xiao2022datalab,
title = {DataLab: A Platform for Data Analysis and Intervention},
author = {Yang Xiao, Jinlan Fu, Weizhe Yuan, Vijay Viswanathan, Zhoumianze Liu, Yixin Liu, Graham Neubig, Pengfei Liu},
booktitle = {ACL},
year = {2022}
}
</pre>
</div>
</li>
<li>
<h4> <a href="https://arxiv.org/abs/2110.08555"> On the Robustness of Reading Comprehension Models to Entity Renaming </a></h4>
<span id="xiao2022datalab"> Jun Yan, <b>Yang Xiao</b>, Sagnik Mukherjee, Bill Yuchen Lin, Robin Jia, Xiang Ren</span>
<br>
<span class="conf">NACL </span>
<a href="https://arxiv.org/pdf/2110.08555.pdf" class="my_details">Full Text</a>
<a href="https://github.com/ExpressAI/DataLab" class="my_code">Code</a>
<a data-toggle="collapse" href="#yan2022datalab-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#yan2022datalab-bibtex" class="my_details">BibTeX</a>
<div id="yan2022datalab-materials">
<pre id="yan2022datalab-abstract" class="pre collapse">We study the robustness of machine read- ing comprehension (MRC) models to entity renaming—do models make more wrong pre- dictions when the same questions are asked about an entity whose name has been changed? Such failures imply that models overly rely on entity information to answer questions, and thus may generalize poorly when facts about the world change or questions are asked about novel entities. To systematically audit this is- sue, we present a pipeline to automatically gen- erate test examples at scale, by replacing entity names in the original test sample with names from a variety of sources, ranging from names in the same test set, to common names in life, to arbitrary strings. Across five datasets and three pretrained model architectures, MRC models consistently perform worse when enti- ties are renamed, with particularly large accu- racy drops on datasets constructed via distant supervision. We also find large differences be- tween models: SpanBERT, which is pretrained with span-level masking, is more robust than RoBERTa, despite having similar accuracy on unperturbed test data. We further experiment with different masking strategies as the contin- ual pretraining objective and find that entity- based masking can improve the robustness of MRC models.</pre>
<pre id="yan2022datalab-bibtex" class="pre pre-scrollable collapse">@article{yan2021robustness,
title={On the Robustness of Reading Comprehension Models to Entity Renaming},
author={Yan, Jun and Xiao, Yang and Mukherjee, Sagnik and Lin, Bill Yuchen and Jia, Robin and Ren, Xiang},
journal={arXiv preprint arXiv:2110.08555},
year={2021}
}
</pre>
</div>
</li>
<!--<li>
<h4> <a href="https://arxiv.org/pdf/1906.01378.pdf"> Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing </a></h4>
<span id="liu2021pretrain"> Pengfei Liu, Weizhe Yuan, <b>Jinlan Fu</b>, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig </span>
<br>
<span class="conf">ACM Computing Surveys </span>
<a href="https://arxiv.org/pdf/2107.13586.pdf" class="my_details">Full Text</a>
<a href="http://pretrain.nlpedia.ai/" class="my_code">Resource</a>
<a data-toggle="collapse" href="#liu2021pretrain-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#liu2021pretrain-bibtex" class="my_details">BibTeX</a>
<div id="liu2021pretrain-materials">
<pre id="liu2021pretrain-abstract" class="pre collapse">This paper surveys and organizes research works in a new paradigm in natural language processing, which we dub "prompt-based learning". Unlike traditional supervised learning, which trains a model to take in an input x and predict an output y as P(y|x), prompt-based learning is based on language models that model the probability of text directly. To use these models to perform prediction tasks, the original input x is modified using a template into a textual string prompt x' that has some unfilled slots, and then the language model is used to probabilistically fill the unfilled information to obtain a final string x, from which the final output y can be derived. This framework is powerful and attractive for a number of reasons: it allows the language model to be pre-trained on massive amounts of raw text, and by defining a new prompting function the model is able to perform few-shot or even zero-shot learning, adapting to new scenarios with few or no labeled data. In this paper we introduce the basics of this promising paradigm, describe a unified set of mathematical notations that can cover a wide variety of existing work, and organize existing work along several dimensions, e.g.the choice of pre-trained models, prompts, and tuning strategies. To make the field more accessible to interested beginners, we not only make a systematic review of existing works and a highly structured typology of prompt-based concepts, but also release other resources, e.g., a website http://pretrain.nlpedia.ai/ including constantly-updated survey, and paperlist. </pre>
<pre id="liu2021pretrain-bibtex" class="pre pre-scrollable collapse">@inproceedings{liu2021pretrain,
title = {Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing},
author = {Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig},
booktitle = {ACM Computing Surveys},
year = {2022}
}
</pre>
</div>
</li>-->
</ul>
<h2 class="bibliography">2021</h2>
<ul class="bibliography">
<!--<li>
<h4> <a href="https://arxiv.org/pdf/2104.07412.pdf"> XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation </a></h4>
<span id="ruder2021xtremer">Sebastian Ruder, Noah Constant, Jan Botha, Aditya Siddhant, Orhan Firat, <b>Jinlan Fu</b>, Pengfei Liu, Junjie Hu, Graham Neubig, Melvin Johnson </span>
<br>
<span class="conf">EMNLP</span>
<a href="https://arxiv.org/pdf/2104.07412.pdf" class="my_details">Full Text</a>
<a href="https://github.com/google-research/xtreme" class="my_code">Code</a>
<a href="https://sites.research.google/xtreme/" class="my_code">Leaderboard</a>
<a href="http://explainaboard.nlpedia.ai/leaderboard/xtreme/" class="my_code">ExplainaBoard</a>
<a data-toggle="collapse" href="#ruder2021xtremer-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#ruder2021xtremer-bibtex" class="my_details">BibTeX</a>
<div id="ruder2021xtremer-materials">
<pre id="ruder2021xtremer-abstract" class="pre collapse">Machine learning has brought striking advances in multilingual natural language processing capabilities over the past year. For example, the latest techniques have improved the state-of-the-art performance on the XTREME multilingual benchmark by more than 13 points. While a sizeable gap to human-level performance remains, improvements have been easier to achieve in some tasks than in others. This paper analyzes the current state of cross-lingual transfer learning and summarizes some lessons learned. In order to catalyze meaningful progress, we extend XTREME to XTREME-R, which consists of an improved set of ten natural language understanding tasks, including challenging language-agnostic retrieval tasks, and covers 50 typologically diverse languages. In addition, we provide a massively multilingual diagnostic suite and fine-grained multi-dataset evaluation capabilities through an interactive public leaderboard to gain a better understanding of such models.</pre>
<pre id="ruder2021xtremer-bibtex" class="pre pre-scrollable collapse">@inproceedings{ruder2021xtremer,
title = {XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation},
author = {Sebastian Ruder, Noah Constant, Jan Botha, Aditya Siddhant, Orhan Firat, Jinlan Fu, Pengfei Liu, Junjie Hu, Graham Neubig, Melvin Johnson},
booktitle = {EMNLP},
year = {2021}
}
</pre>
</div>
</li>-->
<li>
<h4> <a href="https://arxiv.org/pdf/2104.06387.pdf"> EXPLAINABOARD: An Explainable Leaderboard for NLP </a></h4>
<span id="meng2020integer">Pengfei Liu, Jinlan Fu, <b>Yang Xiao</b>, Weizhe Yuan, Shuaicheng Chang, Junqi Dai, Yixin Liu, Zihuiwen Ye, Zi-Yi Dou, Graham Neubig</span>
<br>
<span class="conf">ACL, Best Demo </span>
<a href="https://arxiv.org/pdf/2104.06387.pdf" class="my_details">Full Text</a>
<a href="https://github.com/neulab/ExplainaBoard" class="my_code">Code</a>
<a href="http://explainaboard.nlpedia.ai/" class="my_code">ExplainaBoard</a>
<a data-toggle="collapse" href="#liu2021explain-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#liu2021explain-bibtex" class="my_details">BibTeX</a>
<!-- <a href="http://web.cs.ucla.edu/~kwchang/bibliography/liu2021explain/" class="my_details">Details</a> -->
<div id="liu2021explain-materials">
<pre id="liu2021explain-abstract" class="pre collapse">With the rapid development of NLP research, leaderboards have emerged as one tool to track the performance of various systems on various NLP tasks. They are effective in this goal to some extent, but generally present a rather simplistic one-dimensional view of the submitted systems, communicated only through holistic accuracy numbers. In this paper, we present a new conceptualization and implementation of NLP evaluation: the ExplainaBoard, which in addition to inheriting the functionality of the standard leaderboard, also allows researchers to (i) diagnose strengths and weaknesses of a single system (e.g.~what is the best-performing system bad at?) (ii) interpret relationships between multiple systems. (e.g.~where does system A outperform system B? What if we combine systems A, B, and C?) and (iii) examine prediction results closely (e.g.~what are common errors made by multiple systems, or in what contexts do particular errors occur?). So far, ExplainaBoard covers more than 400 systems, 50 datasets, 40 languages, and 12 tasks. ExplainaBoard keeps updated and is recently upgraded by supporting (1) multilingual multi-task benchmark, (2) meta-evaluation, and (3) more complicated task: machine translation, which reviewers also suggested.} We not only released an online platform on the website \url{http://explainaboard.nlpedia.ai/} but also make our evaluation tool an API with MIT Licence at Github \url{https://github.com/neulab/explainaBoard} and PyPi \url{https://pypi.org/project/interpret-eval/} that allows users to conveniently assess their models offline. We additionally release all output files from systems that we have run or collected to motivate "output-driven" research in the future. </pre>
<pre id="liu2021explain-bibtex" class="pre pre-scrollable collapse">@inproceedings{liu2021explain,
title = {EXPLAINABOARD: An Explainable Leaderboard for NLP},
author = {Pengfei Liu, Jinlan Fu, Yang Xiao, Weizhe Yuan, Shuaicheng Chang, Junqi Dai, Yixin Liu, Zihuiwen Ye, Zi-Yi Dou, Graham Neubig},
booktitle = {ACL},
year = {2021}
}
</pre>
</div>
</li>
<!--<li>
<h4> <a href="https://arxiv.org/pdf/2106.00641.pdf"> SpanNER: Named Entity Re-/Recognition as Span Prediction </a></h4>
<span id="fu2021spanner"><b>Jinlan Fu</b>, Xuanjing Huang, Pengfei Liu </span>
<br>
<span class="conf">ACL</span>
<a href="https://arxiv.org/pdf/2106.00641.pdf" class="my_details">Full Text</a>
<a href="https://github.com/neulab/spanner" class="my_code">Code</a>
<a href="http://explainaboard.nlpedia.ai/leaderboard/task-ner/" class="my_code">Demo</a>
<a data-toggle="collapse" href="#fu2021spanner-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2021spanner-bibtex" class="my_details">BibTeX</a>
<div id="fu2021spanner-materials">
<pre id="fu2021spanner-abstract" class="pre collapse">Recent years have seen the paradigm shift of Named Entity Recognition (NER) systems from sequence labeling to span prediction. Despite its preliminary effectiveness, the span prediction model's architectural bias has not been fully understood. In this paper, we first investigate the strengths and weaknesses when the span prediction model is used for named entity recognition compared with the sequence labeling framework and how to further improve it, which motivates us to make complementary advantages of systems based on different paradigms. We then reveal that span prediction, simultaneously, can serve as a system combiner to re-recognize named entities from different systems' outputs. We experimentally implement 154 systems on 11 datasets, covering three languages, comprehensive results show the effectiveness of span prediction models that both serve as base NER systems and system combiners. We make all code and datasets available: \url{https://github.com/neulab/spanner}, as well as an online system demo: \url{http://spanner.sh}. Our model also has been deployed into the ExplainaBoard platform, which allows users to flexibly perform a system combination of top-scoring systems in an interactive way: \url{http://explainaboard.nlpedia.ai/leaderboard/task-ner/}.</pre>
<pre id="fu2021spanner-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2021spanner,
title = {SpanNER: Named Entity Re-/Recognition as Span Prediction},
author = {Jinlan Fu, Xuanjing Huang, Pengfei Liu},
booktitle = {ACL},
year = {2021}
}
</pre>
</div>
</li>-->
<!--<li>
<h4> <a href="https://arxiv.org/pdf/2104.04434.pdf"> Larger-Context Tagging: When and Why Does It Work? </a></h4>
<span id="fu2021spanner"><b>Jinlan Fu</b>, Liangjing Feng, Qi Zhang, Xuanjing Huang, Pengfei Liu </span>
<br>
<span class="conf">NAACL</span>
<a href="https://arxiv.org/pdf/2104.04434.pdf" class="my_details">Full Text</a>
<a href="http://explainaboard.nlpedia.ai/leaderboard/task-ner/" class="my_code">Demo</a>
<a data-toggle="collapse" href="#fu2021larger-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2021larger-bibtex" class="my_details">BibTeX</a>
<div id="fu2021larger-materials">
<pre id="fu2021larger-abstract" class="pre collapse">The development of neural networks and pretraining techniques has spawned many sentence-level tagging systems that achieved superior performance on typical benchmarks. However, a relatively less discussed topic is what if more context information is introduced into current top-scoring tagging systems. Although several existing works have attempted to shift tagging systems from sentence-level to document-level, there is still no consensus conclusion about when and why it works, which limits the applicability of the larger-context approach in tagging tasks. In this paper, instead of pursuing a state-of-the-art tagging system by architectural exploration, we focus on investigating when and why the larger-context training, as a general strategy, can work. To this end, we conduct a thorough comparative study on four proposed aggregators for context information collecting and present an attribute-aided evaluation method to interpret the improvement brought by larger-context training. Experimentally, we set up a testbed based on four tagging tasks and thirteen datasets. Hopefully, our preliminary observations can deepen the understanding of larger-context training and enlighten more follow-up works on the use of contextual information.</pre>
<pre id="fu2021larger-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2021larger,
title = {Larger-Context Tagging: When and Why Does It Work?},
author = {Jinlan Fu, Liangjing Feng, Qi Zhang, Xuanjing Huang, Pengfei Liu},
booktitle = {NAACL},
year = {2021}
}
</pre>
</div>
</li>-->
<!--<li>
<h4> <a href="https://arxiv.org/pdf/2102.05486.pdf"> Towards More Fine-grained and Reliable NLP Performance Prediction </a></h4>
<span id="ye2021towards">Zihuiwen Ye, Pengfei Liu, <b>Jinlan Fu</b>, Graham Neubig </span>
<br>
<span class="conf">EACL</span>
<a href="https://arxiv.org/pdf/2102.05486.pdf" class="my_details">Full Text</a>
<a href="https://github.com/neulab/Reliable-NLPPP" class="my_code">Code</a>
<a href="http://explainaboard.nlpedia.ai/leaderboard/task-ner/" class="my_code">Demo</a>
<a data-toggle="collapse" href="#ye2021towards-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#ye2021towards-bibtex" class="my_details">BibTeX</a>
<div id="ye2021towards-materials">
<pre id="ye2021towards-abstract" class="pre collapse">Performance prediction, the task of estimating a system's performance without performing experiments, allows us to reduce the experimental burden caused by the combinatorial explosion of different datasets, languages, tasks, and models. In this paper, we make two contributions to improving performance prediction for NLP tasks. First, we examine performance predictors not only for holistic measures of accuracy like F1 or BLEU but also fine-grained performance measures such as accuracy over individual classes of examples. Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration. We perform an analysis of four types of NLP tasks, and both demonstrate the feasibility of fine-grained performance prediction and the necessity to perform reliability analysis for performance prediction methods in the future. We make our code publicly available: \url{https://github.com/neulab/Reliable-NLPPP} </pre>
<pre id="ye2021towards-bibtex" class="pre pre-scrollable collapse">@inproceedings{ye2021towards,
title = {Towards More Fine-grained and Reliable NLP Performance Prediction},
author = {Zihuiwen Ye, Pengfei Liu, Jinlan Fu, Graham Neubig},
booktitle = {EACL},
year = {2021}
}
</pre>
</div>
</li>-->
<!--<li>
<h4> <a href="https://arxiv.org/pdf/2104.07412.pdf"> Textflint: Unified multilingual robustness evaluation toolkit for natural language processing </a></h4>
<span id="wang2021textflint">Xiao Wang, Qin Liu, Tao Gui, Qi Zhang, Yicheng Zou, Xin Zhou, Jiacheng Ye, Yongxin Zhang, Rui Zheng, Zexiong Pang, Qinzhuo Wu, Zhengyan Li, Chong Zhang, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, Yiding Tan, Yuan Hu, Qiyuan Bian, Zhihua Liu, Shan Qin, Bolin Zhu, Xiaoyu Xing, <b>Jinlan Fu</b>, Yue Zhang, Minlong Peng, Xiaoqing Zheng, Yaqian Zhou, Zhongyu Wei, Xipeng Qiu, Xuan-Jing Huang </span>
<br>
<span class="conf">ACL</span>
<a href="https://aclanthology.org/2021.acl-demo.41.pdf" class="my_details">Full Text</a>
<a href="https://github.com/textflint" class="my_code">Code</a>
<a href="https://www.textflint.io/textflint" class="my_code">Textflint</a>
<a data-toggle="collapse" href="#wang2021textflint-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#wang2021textflint-bibtex" class="my_details">BibTeX</a>
<div id="wang2021textflint-materials">
<pre id="wang2021textflint-abstract" class="pre collapse">TextFlint is a multilingual robustness evaluation toolkit for NLP tasks that incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analyses. This enables practitioners to automatically evaluate their models from various aspects or to customize their evaluations as desired with just a few lines of code. TextFlint also generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model in terms of its robustness. To guarantee acceptability, all the text transformations are linguistically based and all the transformed data selected (up to 100,000 texts) scored highly under human evaluation. To validate the utility, we performed large-scale empirical evaluations (over 67,000) on state-of-the-art deep learning models, classic supervised methods, and real-world systems. The toolkit is already available at https://github.com/textflint with all the evaluation results demonstrated at textflint.io.</pre>
<pre id="wang2021textflint-bibtex" class="pre pre-scrollable collapse">@inproceedings{wang2021textflint,
title = {TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing},
author = {Xiao Wang, Qin Liu, Tao Gui, Qi Zhang, Yicheng Zou, Xin Zhou, Jiacheng Ye, Yongxin Zhang, Rui Zheng, Zexiong Pang, Qinzhuo Wu, Zhengyan Li, Chong Zhang, Ruotian Ma, Zichu Fei, Ruijian Cai, Jun Zhao, Xingwu Hu, Zhiheng Yan, Yiding Tan, Yuan Hu, Qiyuan Bian, Zhihua Liu, Shan Qin, Bolin Zhu, Xiaoyu Xing, Jinlan Fu, Yue Zhang, Minlong Peng, Xiaoqing Zheng, Yaqian Zhou, Zhongyu Wei, Xipeng Qiu, Xuan-Jing Huang},
booktitle = {ACL},
year = {2021}
}
</pre>
</div>
</li>-->
</ul>
<!-- </div> -->
<!--<h2 class="bibliography">2020</h2>
<ul class="bibliography">
<li>
<h4> <a href="https://arxiv.org/pdf/2011.06854.pdf"> Interpretable Multi-dataset Evaluation for Named Entity Recognition </a></h4>
<span id="fu2020interpret"><b>Jinlan Fu</b>, Pengfei Liu, Graham Neubig </span>
<br>
<span class="conf">EMNLP</span>
<a href="https://arxiv.org/pdf/2011.06854.pdf" class="my_details">Full Text</a>
<a href="https://github.com/neulab/InterpretEval" class="my_code">Code</a>
<a href="http://explainaboard.nlpedia.ai/leaderboard/task-ner/" class="my_code">Demo</a>
<a data-toggle="collapse" href="#fu2020interpret-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2020interpret-bibtex" class="my_details">BibTeX</a>
<div id="fu2020interpret-materials">
<pre id="fu2020interpret-abstract" class="pre collapse">With the proliferation of models for natural language processing tasks, it is even harder to understand the differences between models and their relative merits. Simply looking at differences between holistic metrics such as accuracy, BLEU, or F1 does not tell us why or how particular methods perform differently and how diverse datasets influence the model design choices. In this paper, we present a general methodology for interpretable evaluation for the named entity recognition (NER) task. The proposed evaluation method enables us to interpret the differences in models and datasets, as well as the interplay between them, identifying the strengths and weaknesses of current systems. By making our analysis tool available, we make it easy for future researchers to run similar analyses and drive progress in this area: https://github.com/neulab/InterpretEval. </pre>
<pre id="fu2020interpret-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2020interpret,
title = {Interpretable Multi-dataset Evaluation for Named Entity Recognition},
author = {Jinlan Fu, Pengfei Liu, Graham Neubig},
booktitle = {EMNLP},
year = {2020}
}
</pre>
</div>
</li>
<li>
<h4> <a href="https://arxiv.org/pdf/2011.06858.pdf"> RethinkCWS: Is Chinese Word Segmentation a Solved Task? </a></h4>
<span id="fu2020rethinkcws"> <b>Jinlan Fu</b>, Pengfei Liu, Qi Zhang, Xuanjing Huang </span>
<br>
<span class="conf">EMNLP</span>
<a href="https://arxiv.org/pdf/2011.06858.pdf" class="my_details">Full Text</a>
<a href="https://github.com/neulab/InterpretEval" class="my_code">Code</a>
<a href="http://explainaboard.nlpedia.ai/leaderboard/task-cws/" class="my_code">Demo</a>
<a data-toggle="collapse" href="#fu2020rethinkcws-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2020rethinkcws-bibtex" class="my_details">BibTeX</a>
<div id="fu2020rethinkcws-materials">
<pre id="fu2020rethinkcws-abstract" class="pre collapse">The performance of the Chinese Word Segmentation (CWS) systems has gradually reached a plateau with the rapid development of deep neural networks, especially the successful use of large pre-trained models. In this paper, we take stock of what we have achieved and rethink what's left in the CWS task. Methodologically, we propose a fine-grained evaluation for existing CWS systems, which not only allows us to diagnose the strengths and weaknesses of existing models (under the in-dataset setting), but enables us to quantify the discrepancy between different criterion and alleviate the negative transfer problem when doing multi-criteria learning. Strategically, despite not aiming to propose a novel model in this paper, our comprehensive experiments on eight models and seven datasets, as well as thorough analysis, could search for some promising direction for future research. We make all codes publicly available and release an interface that can quickly evaluate and diagnose user's models: https://github.com/neulab/InterpretEval. </pre>
<pre id="fu2020rethinkcws-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2020rethinkcws,
title = {RethinkCWS: Is Chinese Word Segmentation a Solved Task?},
author = {Jinlan Fu, Pengfei Liu, Qi Zhang, Xuanjing Huang},
booktitle = {EMNLP},
year = {2020}
}
</pre>
</div>
</li>
<li>
<h4> <a href="https://arxiv.org/pdf/2001.03844.pdf"> Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study </a></h4>
<span id="fu2020rethinking"> <b>Jinlan Fu</b>, Pengfei Liu, Qi Zhang, Xuanjing Huang </span>
<br>
<span class="conf">AAAI</span>
<a href="https://arxiv.org/pdf/2001.03844.pdf" class="my_details">Full Text</a>
<a href="http://pfliu.com/InterpretNER/interpretNER.html" class="my_code">Data</a>
<a data-toggle="collapse" href="#fu2020rethinking-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2020rethinking-bibtex" class="my_details">BibTeX</a>
<div id="fu2020rethinking-materials">
<pre id="fu2020rethinking-abstract" class="pre collapse">While neural network-based models have achieved impressive performance on a large body of NLP tasks, the generalization behavior of different models remains poorly understood: Does this excellent performance imply a perfect generalization model, or are there still some limitations? In this paper, we take the NER task as a testbed to analyze the generalization behavior of existing models from different perspectives and characterize the differences of their generalization abilities through the lens of our proposed measures, which guides us to better design models and training methods. Experiments with in-depth analyses diagnose the bottleneck of existing neural NER models in terms of breakdown performance analysis, annotation errors, dataset bias, and category relationships, which suggest directions for improvement. We have released the datasets:(ReCoNLL, PLONER) for the future research at our project page: http://pfliu.com/InterpretNER/. </pre>
<pre id="fu2020rethinking-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2020rethinking,
title = {Rethinking Generalization of Neural Models: A Named Entity Recognition Case Study.},
author = {Jinlan Fu, Pengfei Liu, Qi Zhang, Xuanjing Huang},
booktitle = {AAAI},
year = {2020}
}
</pre>
</div>
</li>
<li>
<h4> <a href="https://dl.acm.org/doi/10.1145/3336191.3371817"> Recurrent Memory Reasoning Network for Expert Finding in Community Question Answering </a></h4>
<span id="fu2020recurrent"> <b>Jinlan Fu</b>, Yi Li, Qi Zhang, Qinzhuo Wu, Renfeng Ma, Xuanjing Huang, Yu-Gang Jiang </span>
<br>
<span class="conf">WSDM</span>
<a href="https://dl.acm.org/doi/10.1145/3336191.3371817" class="my_details">Full Text</a>
<a data-toggle="collapse" href="#fu2020recurrent-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#fu2020recurrent-bibtex" class="my_details">BibTeX</a>
<div id="fu2020recurrent-materials">
<pre id="fu2020recurrent-abstract" class="pre collapse">Expert finding is a task designed to enable recommendation of the right person who can provide high-quality answers to a requester's question. Most previous works try to involve a content-based recommendation, which only superficially comprehends the relevance between a requester's question and the expertise of candidate experts by exploring the content or topic similarity between the requester's question and the candidate experts' historical answers. However, if a candidate expert has never answered a question similar to the requester's question, then existing methods have difficulty making a correct recommendation. Therefore, exploring the implicit relevance between a requester's question and a candidate expert's historical records by perception and reasoning should be taken into consideration. In this study, we propose a novel \textslrecurrent memory reasoning network (RMRN) to perform this task. This method focuses on different parts of a question, and accordingly retrieves information from the histories of the candidate expert.Since only a small percentage of historical records are relevant to any requester's question, we introduce a Gumbel-Softmax-based mechanism to select relevant historical records from candidate experts' answering histories. To evaluate the proposed method, we constructed two large-scale datasets drawn from Stack Overflow and Yahoo! Answer. Experimental results on the constructed datasets demonstrate that the proposed method could achieve better performance than existing state-of-the-art methods. </pre>
<pre id="fu2020recurrent-bibtex" class="pre pre-scrollable collapse">@inproceedings{fu2020rethinkcws,
title = {Recurrent Memory Reasoning Network for Expert Finding in Community Question Answering},
author = {Jinlan Fu, Yi Li, Qi Zhang, Qinzhuo Wu, Renfeng Ma, Xuanjing Huang, Yu-Gang Jiang},
booktitle = {WSDM},
year = {2020}
}
</pre>
</div>
</li>
</ul>
<h2 class="bibliography">2019</h2>
<ul class="bibliography">
<li>
<h4> <a href="https://arxiv.org/pdf/1906.01378.pdf"> Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning </a></h4>
<span id="peng2019distantly">Minlong Peng, Xiaoyu Xing, Qi Zhang, <b>Jinlan Fu</b>, Xuanjing Huang </span>
<br>
<span class="conf">ACL</span>
<a href="https://arxiv.org/pdf/1906.01378.pdf" class="my_details">Full Text</a>
<a href="https://github.com/v-mipeng/LexiconNER" class="my_code">Code</a>
<a data-toggle="collapse" href="#peng2019distantly-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#peng2019distantly-bibtex" class="my_details">BibTeX</a>
<div id="peng2019distantly-materials">
<pre id="peng2019distantly-abstract" class="pre collapse">In this work, we explore the way to perform named entity recognition (NER) using only unlabeled data and named entity dictionaries. To this end, we formulate the task as a positive-unlabeled (PU) learning problem and accordingly propose a novel PU learning algorithm to perform the task. We prove that the proposed algorithm can unbiasedly and consistently estimate the task loss as if there is fully labeled data. A key feature of the proposed method is that it does not require the dictionaries to label every entity within a sentence, and it even does not require the dictionaries to label all of the words constituting an entity. This greatly reduces the requirement on the quality of the dictionaries and makes our method generalize well with quite simple dictionaries. Empirical studies on four public NER datasets demonstrate the effectiveness of our proposed method. We have published the source code at \url{https://github.com/v-mipeng/LexiconNER}. </pre>
<pre id="peng2019distantly-bibtex" class="pre pre-scrollable collapse">@inproceedings{peng2019distantly,
title = {Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning},
author = {Minlong Peng, Xiaoyu Xing, Qi Zhang, Jinlan Fu, Xuanjing Huang},
booktitle = {ACL},
year = {2019}
}
</pre>
</div>
</li>
<li>
<h4> <a href="https://arxiv.org/pdf/1905.12277.pdf"> Learning Task-specific Representation for Novel Words in Sequence Labeling </a></h4>
<span id="peng2019learning">Minlong Peng, Qi Zhang, Xiaoyu Xing, Tao Gui, <b>Jinlan Fu</b>, Xuanjing Huang </span>
<br>
<span class="conf">IJCAI</span>
<a href="https://arxiv.org/pdf/1905.12277.pdf" class="my_details">Full Text</a>
<a data-toggle="collapse" href="#peng2019learning-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#peng2019learning-bibtex" class="my_details">BibTeX</a>
<div id="peng2019learning-materials">
<pre id="peng2019learning-abstract" class="pre collapse">Word representation is a key component in neural-network-based sequence labeling systems. However, representations of unseen or rare words trained on the end task are usually poor for appreciable performance. This is commonly referred to as the out-of-vocabulary (OOV) problem. In this work, we address the OOV problem in sequence labeling using only training data of the task. To this end, we propose a novel method to predict representations for OOV words from their surface-forms (e.g., character sequence) and contexts. The method is specifically designed to avoid the error propagation problem suffered by existing approaches in the same paradigm. To evaluate its effectiveness, we performed extensive empirical studies on four part-of-speech tagging (POS) tasks and four named entity recognition (NER) tasks. Experimental results show that the proposed method can achieve better or competitive performance on the OOV problem compared with existing state-of-the-art methods. </pre>
<pre id="peng2019learning-bibtex" class="pre pre-scrollable collapse">@inproceedings{peng2019learning,
title = {Learning Task-specific Representation for Novel Words in Sequence Labeling},
author = {Minlong Peng, Qi Zhang, Xiaoyu Xing, Tao Gui, Jinlan Fu, Xuanjing Huang},
booktitle = {IJCAI},
year = {2019}
}
</pre>
</div>
</li>
</ul>
<h2 class="bibliography">2018</h2>
<ul class="bibliography">
<li>
<h4> <a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/16432/16127"> Adaptive Co-Attention Network for Named Entity Recognition in Tweets </a></h4>
<span id="zhang2018adaptive">Qi Zhang, <b>Jinlan Fu</b>, Xiaoyu Liu, Xuanjing Huang </span>
<br>
<span class="conf">AAAI</span>
<a href="https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/viewFile/16432/16127" class="my_details">Full Text</a>
<a href="https://github.com/jlfu/NERmultimodal" class="my_code">Code</a>
<a data-toggle="collapse" href="#zhang2018adaptive-abstract" class="my_details">Abstract</a>
<a data-toggle="collapse" href="#zhang2018adaptive-bibtex" class="my_details">BibTeX</a>
<div id="zhang2018adaptive-materials">
<pre id="zhang2018adaptive-abstract" class="pre collapse">In this study, we investigate the problem of named entity recognition for tweets. Named entity recognition is an important task in natural language processing and has been carefully studied in recent decades. Previous named entity recognition methods usually only used the textual content when processing tweets. However, many tweets contain not only textual content, but also images. Such visual information is also valuable in the name entity recognition task. To make full use of textual and visual information, this paper proposes a novel method to process tweets that contain multimodal information. We extend a bi-directional long short term memory network with conditional random fields and an adaptive co-attention network to achieve this task. To evaluate the proposed methods, we constructed a large scale labeled dataset that contained multimodal tweets. Experimental results demonstrated that the proposed method could achieve a better performance than the previous methods in most cases. </pre>