Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tdl from grammar #13

Open
wants to merge 43 commits into
base: master
Choose a base branch
from
Open

Tdl from grammar #13

wants to merge 43 commits into from

Conversation

fcbond
Copy link
Owner

@fcbond fcbond commented Jun 13, 2020

@goodmami

could you please have a look at this? I would especially welcome some input on:

  • the tdl logging in tdl2db.py
  • how I read bash arguments
  • better linking to fftb (maybe should be an issue there not here)

Copy link
Collaborator

@goodmami goodmami left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a huge PR so there are many comments. Some general things I noticed are:

  • inconsistent use of %, .format, and f-strings for formatting (sometimes causing issues which I've noted); but I get that this is updating old code so I didn't mention each of these
  • variables in Bash scripts should almost always be quoted when passed as arguments so that spaces do not break them into multiple arguments (I was getting a bit snarky toward the end because I was putting a comment on each time I saw this; sorry)
  • some parts looked like the output of 2to3, which is fine, but the result isn't always idiomatic code. I generally did not comment on these, as there should be a separate commit to resolve style issues

I also did not attempt to run the code so I don't know if it works as a whole, but I'll leave that to you. The interfacing with PyDelphin's TDL module looked pretty good.

README.rst Outdated
``LKBFOS=~/delphin/lkb_fos/lkb.linux_x86_64``.

Install dependencies (in ubuntu):

.. code:: bash

sudo apt-get install apache2 xmlstarlet
sudo apt-get install python-docutils python3-docutils python3-lxml
sudo apt-get install python3-docutils python3-lxml

sudo pip install pydelphin --upgrade
sudo pip3 install pydelphin --upgrade
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for both pip and pip3 if you're dropping support for Python 2. Also I don't suggest sudo for pip.

* hyperlinked types
* types without glb


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personal todo lists don't serve much purpose in the repository. Actionable things can be made into issues, and personal notes can be left in unversioned files

gold2db.py Outdated Show resolved Hide resolved
html/ltdb.py Outdated Show resolved Hide resolved
html/ltdb.py Outdated Show resolved Hide resolved
makehome.py Show resolved Hide resolved
print("Adding files from %s to %s" % (grmdir, dbfile), file=sys.stderr)
(script, grammar, dbfile) = sys.argv
print("Adding files from %s to %s" % (grammar, dbfile), file=sys.stderr)


## make a log in the same directory as the database
log = open(os.path.join(os.path.dirname(dbfile),"tdl.log"), 'w')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've imported Path, so you could also do:

log = open(Path(dbfile).parent / 'tdl.log', 'w')

path = Path(grammarfile)
base = path.parent
for event, obj, lineno in tdl.iterparse(grammarfile):
if event == "LineComment":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary since the only other code block is if event == "EndEnvironment", so it's already filtered. But this makes sense if you plan to parse line comments later.

orths = obj.conjunction.get('ORTH', default=None)
try:
orth=' '.join([str(s) for s in orths.values()])
except:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blanket exception. So you expect this to fail if orths is None? How about something like this (not tested):

if 'ORTH' in obj:
    orths = ' '.join(map(str, obj['ORTH'].values()))
else:
    orth = ''
    print('No Orthography', obj.indentifier, sep='\t', file=log)
...

If this generates an error, perhaps it's a bug in PyDelphin or some other case we need to handle separately.

tdl2db.py Show resolved Hide resolved
make-ltdb.bash Outdated
LISP="
${extralisp}
(format t \"~%LTDB Read Grammar~%\")
LISP="""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think bash has triple-quoted (""") strings. This might work anyway if it's evaluated as an empty string "" followed by a multiline string.

@fcbond
Copy link
Owner Author

fcbond commented Jun 29, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants