Homework 2 old
Jinho D. Choi edited this page Dec 31, 2016
Your task is to implement a named entity recognizer. You are allowed to work in groups of at most 2. Submit your work by Nov. 11th before the class.
- Clone the Emory NLP project.
- Download the benchmark dataset and Brown clusters.
- Run `NERTrain` with the default settings in `config_train_ner.xml`.
- Improve the named entity recognizer.
- Download the ontology from `DBPedia` and use it as `ambiguity_class` in `NERFeatureTemplate`.
- Ensure the output of all chunks follows the `BILOU` notation.
- Evaluate the accuracy of your system (precision, recall, and F1) on both the development and evaluation sets.
- Write a report (4-8 pages) in the ACL format. Your report must include an abstract, introduction, related work, approach, experiments, and conclusion.
- Commit all your work to your GitHub repository.
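
The evaluation step in the list above can be sketched in a few lines. The following is a minimal, standalone Python illustration and is not part of the Emory NLP project: it assumes that scoring is done at the chunk level with exact match on both boundaries and entity type (the assignment does not state the matching criterion), and the function names `bilou_to_spans` and `precision_recall_f1` are hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index inclusive, entity type)
Span = Tuple[int, int, str]


def bilou_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one sentence of BILOU tags.

    Malformed fragments (e.g. an I- tag with no preceding B- tag)
    are skipped rather than repaired.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        if prefix == "U":                      # single-token chunk
            spans.add((i, i, etype))
            start, label = None, None
        elif prefix == "B":                    # chunk begins
            start, label = i, etype
        elif prefix == "I" and start is not None and etype == label:
            continue                           # chunk continues
        elif prefix == "L" and start is not None and etype == label:
            spans.add((start, i, etype))       # chunk ends
            start, label = None, None
        else:                                  # "O" or a malformed tag
            start, label = None, None
    return spans


def precision_recall_f1(gold: List[List[str]], pred: List[List[str]]):
    """Chunk-level precision, recall, and F1 over parallel sentences."""
    correct = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        g_spans, p_spans = bilou_to_spans(g), bilou_to_spans(p)
        correct += len(g_spans & p_spans)
        n_gold += len(g_spans)
        n_pred += len(p_spans)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Gold tags of the sample sentence vs. a made-up prediction
# that mislabels "Britain" as an organization.
gold = [["O", "U-LOC", "O", "U-LOC", "O", "U-PER", "O", "O", "O"]]
pred = [["O", "U-LOC", "O", "U-ORG", "O", "U-PER", "O", "O", "O"]]
p, r, f1 = precision_recall_f1(gold, pred)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")
```

Running the same function once on the development set and once on the evaluation set gives the two sets of numbers the report asks for.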
Only only RB _ O
France france NNP _ U-LOC
and and CC _ O
Britain britain NNP _ U-LOC
backed back VBD _ O
Fischler fischler NNP _ U-PER
's 's POS _ O
proposal proposal NN _ O
. . . _ O
Each column represents:
- `0`: word-form.
- `1`: lemma (predicted).
- `2`: POS tag (predicted).
- `3`: extra features (blank).
- `4`: named entity tag (gold).
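
To make the column layout concrete, here is a small standalone Python sketch that reads data in this format and checks that the tags in column 4 form a well-formed BILOU sequence. It rests on two assumptions that the page does not state: columns are separated by whitespace, and sentences are separated by blank lines. The helper names `read_sentences` and `is_valid_bilou` are hypothetical and do not come from the Emory NLP code.

```python
from typing import Iterable, List

# Column indices as described above.
FORM, LEMMA, POS, FEATS, NER = range(5)


def read_sentences(lines: Iterable[str]) -> List[List[List[str]]]:
    """Group token lines into sentences; a blank line ends a sentence."""
    sentences, current = [], []
    for line in lines:
        fields = line.split()          # assumption: whitespace-delimited
        if not fields:
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(fields)
    if current:
        sentences.append(current)
    return sentences


def is_valid_bilou(tags: List[str]) -> bool:
    """True if every chunk is U-X alone or B-X (I-X)* L-X with one type."""
    open_type = None                   # type of the chunk currently open
    for tag in tags:
        prefix, _, etype = tag.partition("-")
        if prefix != "O" and not etype:
            return False               # B/I/L/U must carry an entity type
        if prefix in ("O", "U", "B"):
            if open_type is not None:
                return False           # previous chunk was never closed
            if prefix == "B":
                open_type = etype
        elif prefix in ("I", "L"):
            if open_type is None or etype != open_type:
                return False           # continuation without a matching B-
            if prefix == "L":
                open_type = None
        else:
            return False               # unknown prefix
    return open_type is None           # no chunk left open at sentence end


sample = """Only only RB _ O
France france NNP _ U-LOC
and and CC _ O
Britain britain NNP _ U-LOC
backed back VBD _ O
Fischler fischler NNP _ U-PER
's 's POS _ O
proposal proposal NN _ O
. . . _ O
""".splitlines()

for sentence in read_sentences(sample):
    forms = [token[FORM] for token in sentence]
    tags = [token[NER] for token in sentence]
    print(forms, is_valid_bilou(tags))
```

A check like this can be run on system output before evaluation, since a sequence such as `B-PER O` leaves a chunk open and would violate the BILOU requirement.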
Element | Value |
---|---|
algorithm | `perceptron`, `softmax`, `adagrad`, `adagrad-mini-batch`, `adadelta-mini-batch`, `adagrad-regression` |
l1_regularization | L1 regularization, lower-bound (for `adagrad*`) |
learning_rate | Learning rate |
max_epochs | Maximum number of epochs |
batch_size | Number of sentences used in mini-batch |
roll_in | Gold label probability, upper-bound |
bias | Bias value |
Index | Type | DBPedia |
---|---|---|
0 | PERSON | Person, PersonFunction, Mayor, Name |
1 | NORP | GeopoliticalOrganisation, Legislature, Parliament, PoliticalParty, ReligiousOrganisation, EthnicGroup |
2 | FACILITY | ArchitecturalStructure, Cemetery, ConcentrationCamp, Garden, HistoricPlace, Mine, Monument, SkiResort, SportFacility, Park, Street |
3 | ORGANIZATION | GovernmentAgency, Broadcaster, Company, EducationalInstitution, EmployersOrganisation, NonProfitOrganisation, SambaSchool, SportsLeague, SportsTeam, Website |
4 | GPE | Country, Settlement, State |
5 | LOCATION | Region, NaturalRegion, HistoricalRegion, Street, Territory, ProtectedArea, SkiArea, Island, NaturalPlace, Continent |
6 | PRODUCT | Aircraft, Automobile, Locomotive, MilitaryVehicle, Motorcycle, Rocket, Ship, SpaceShuttle, Spacecraft, Train, Device, Drug, Food |
7 | EVENT | NaturalEvent, Competition, SocietalEvent |
8 | WORK_OF_ART | Artwork, Cartoon, CollectionOfValuables, Document, Film, Musical, MusicalWork, WrittenWork, TelevisionShow |
9 | LANGUAGE | Language |
10 | DATE | TimePeriod |
11 | MONEY | Currency |
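
One way to turn the table above into a feature is a reverse lookup from a DBPedia ontology class to the type indices it falls under. The sketch below is a standalone Python illustration under stated assumptions: how `NERFeatureTemplate` actually consumes `ambiguity_class` is not described on this page, so the string encoding (sorted indices joined by `_`) and the names `build_lookup` and `ambiguity_class` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Rows copied from the table above; list position is the Index column.
TYPE_TABLE = [
    ("PERSON", ["Person", "PersonFunction", "Mayor", "Name"]),
    ("NORP", ["GeopoliticalOrganisation", "Legislature", "Parliament",
              "PoliticalParty", "ReligiousOrganisation", "EthnicGroup"]),
    ("FACILITY", ["ArchitecturalStructure", "Cemetery", "ConcentrationCamp",
                  "Garden", "HistoricPlace", "Mine", "Monument", "SkiResort",
                  "SportFacility", "Park", "Street"]),
    ("ORGANIZATION", ["GovernmentAgency", "Broadcaster", "Company",
                      "EducationalInstitution", "EmployersOrganisation",
                      "NonProfitOrganisation", "SambaSchool", "SportsLeague",
                      "SportsTeam", "Website"]),
    ("GPE", ["Country", "Settlement", "State"]),
    ("LOCATION", ["Region", "NaturalRegion", "HistoricalRegion", "Street",
                  "Territory", "ProtectedArea", "SkiArea", "Island",
                  "NaturalPlace", "Continent"]),
    ("PRODUCT", ["Aircraft", "Automobile", "Locomotive", "MilitaryVehicle",
                 "Motorcycle", "Rocket", "Ship", "SpaceShuttle", "Spacecraft",
                 "Train", "Device", "Drug", "Food"]),
    ("EVENT", ["NaturalEvent", "Competition", "SocietalEvent"]),
    ("WORK_OF_ART", ["Artwork", "Cartoon", "CollectionOfValuables",
                     "Document", "Film", "Musical", "MusicalWork",
                     "WrittenWork", "TelevisionShow"]),
    ("LANGUAGE", ["Language"]),
    ("DATE", ["TimePeriod"]),
    ("MONEY", ["Currency"]),
]


def build_lookup(table) -> Dict[str, List[int]]:
    """Map each DBPedia class name to every type index that lists it."""
    lookup = defaultdict(list)
    for index, (_, classes) in enumerate(table):
        for name in classes:
            lookup[name].append(index)
    return dict(lookup)


def ambiguity_class(dbpedia_classes: Iterable[str],
                    lookup: Dict[str, List[int]]) -> str:
    """Encode all matching type indices as one string (assumed format)."""
    indices = sorted({i for c in dbpedia_classes for i in lookup.get(c, [])})
    return "_".join(str(i) for i in indices)


lookup = build_lookup(TYPE_TABLE)
print(ambiguity_class(["Country"], lookup))
print(ambiguity_class(["Street"], lookup))
```

Note that `Street` is listed under both FACILITY and LOCATION in the table, so a single class can map to more than one type, which is why the sketch returns a set of indices rather than a single label.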
Copyright © 2015-2019 Emory University - All Rights Reserved.