Retrieving the parsed structures for external processing by third-party tooling #5
It's certainly technically possible to dump the pre-render stage data structures to JSON. One concern is that this creates an API contract that has an expectation of stability, while for rendered content there's much more latitude for change. So that's a new thing that would require some thought.

My other main question is how to deal with markdown and references. Leave them entirely unparsed/raw in the JSON? In LuaDox, the renderer takes care of parsing and converting markdown to HTML, and at that time all references (both …

How do you think this should be handled with JSON renders? Leave them raw/unresolved?
Thanks for the quick response! I haven't thought about the design much, but here's a few ideas:
If I had a JSON of all the functions, modules, etc. organized by file, I would probably construct URLs based on them so that they fit into the existing website. For example, the …

I don't really know how LuaDox handles this internally so I can't comment on what would work best. But I guess if you consider this an experimental feature you would have plenty of opportunity to iterate after seeing how it turns out in practice :)
Has anyone done any more thinking about this?
This is a substantial refactoring with the primary goal of reducing reliance on duck typing by using more concrete types, in order to benefit from static type checking. The Reference object has been broken up into untyped and typed references, where typed refs are subclasses of Reference that implement the various data needed for rendering. This also begins paving the way to supporting multiple renderers, which will be needed for #5. In order to use slightly more recent type hinting features, Python 3.8 is now required. That makes this commit a breaking change, meaning the next release will require a major version bump.
I'm planning to implement this for the next release (LuaDox 2.0). I've begun some refactoring work to enable this (among other things, such as support for other annotation conventions), and in the process have been thinking about how best to approach it. First, the basic idea is that the JSON structure will reflect a hierarchical layout:
Each element will contain an …

In terms of markdown, there are a couple of wrinkles that seem to necessitate a bit of extra complexity. LuaDox has some tags that need to be parsed, yet don't directly map to any markdown. Currently these are …

So I'm thinking about handling this by representing markdown content fields in the JSON as an array instead of a string, where the array would contain a list of objects that represent either a markdown string, or some more complex parsed field such as an admonition. For example:

{
  "id": "foo.baz",
  "type": "class",
  "content": [
    {
      "markdown": "### Some heading\n\nSome text goes here."
    },
    {
      "type": "admonition",
      "level": "warning",
      "title": "Beware!",
      "content": [
        {
          "markdown": "Markdown within the admonition that has a @see tag"
        },
        {
          "type": "see",
          "ids": [
            "bar.one",
            "bar.two"
          ]
        }
      ]
    },
    {
      "markdown": "More markdown after the admonition [with a link](luadox:bar.two)"
    }
  ],
  "functions": [
    "stuff goes here"
  ],
  "fields": [
    "stuff goes here"
  ]
}

Or, as YAML, because it'll be trivial to support:

id: foo.baz
type: class
content:
  - markdown: |-
      ### Some heading

      Some text goes here.
  - type: admonition
    level: warning
    title: Beware!
    content:
      - markdown: Markdown within the admonition that has a @see tag
      - type: see
        ids:
          - bar.one
          - bar.two
  - markdown: More markdown after the admonition [with a link](luadox:bar.two)
functions:
  - stuff goes here
fields:
  - stuff goes here

That isn't a fully baked document; it just depicts how a single collection might be represented within the larger document, and how markdown content is split up into an array like that. Let me know what you think.
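For anyone consuming this from a script, here is a minimal sketch of how the proposed `content` array could be walked. It is only an illustration against the draft layout above, not LuaDox code: the `flatten_content` helper is hypothetical, and it assumes that markdown entries nested inside an admonition use the same `markdown` key as the top-level ones.

```python
# Hypothetical sample mirroring the draft structure proposed above.
doc = {
    "id": "foo.baz",
    "type": "class",
    "content": [
        {"markdown": "### Some heading\n\nSome text goes here."},
        {
            "type": "admonition",
            "level": "warning",
            "title": "Beware!",
            "content": [
                {"markdown": "Markdown within the admonition that has a @see tag"},
                {"type": "see", "ids": ["bar.one", "bar.two"]},
            ],
        },
        {"markdown": "More markdown after the admonition [with a link](luadox:bar.two)"},
    ],
}


def flatten_content(items, depth=0):
    """Yield (depth, kind, payload) for every entry in a content array.

    Hypothetical helper: plain markdown entries are passed through, and
    admonitions are descended into recursively.
    """
    for item in items:
        if "markdown" in item:
            yield depth, "markdown", item["markdown"]
        elif item.get("type") == "admonition":
            yield depth, "admonition", f'{item["level"]}: {item["title"]}'
            yield from flatten_content(item.get("content", []), depth + 1)
        elif item.get("type") == "see":
            yield depth, "see", ", ".join(item["ids"])
        else:
            # Unknown entry kinds are surfaced rather than silently dropped.
            yield depth, "unknown", item


for depth, kind, payload in flatten_content(doc["content"]):
    print("  " * depth + f"[{kind}] {payload!r}")
```

Representing content as an array keeps the consumer simple: each entry is either raw markdown or a typed object, so a single recursive pass covers nesting.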
I can't really comment on the design, but if you have a working prototype I'm happy to give it a spin to get you some feedback :) Since this would be the input for scripts and other tools, it's probably not too important how the structures are laid out exactly.
Reference resolution logic has been moved from the renderer to the parser (invoked by the prerenderer), where refs are now converted to markdown links using an intermediate `luadox:<refid>` link format. It's up to the renderer to resolve these links to whatever is appropriate.

This required introducing the notion of an id to references. Ids are globally unique opaque strings that are tracked by the parser, which the renderer can consult in order to convert an id to a Reference object.

This refactoring continues to pave the way for #5 and will allow for different kinds of renderers (not just HTML), where the common logic that applies to all renderers has been moved to the parser and run during the prerender stage.

Additionally, tag parsing within content blocks (e.g. handling @tparam, @note, etc.) has been rewritten and hopefully simplified. (Parser.parse_raw_content())

Finally, this commit includes some optimizations:

* Compiled regexp objects are now cached and reused, reducing compilation overhead
* First sentence detection has been rewritten using a more naive, lower level approach that is significantly faster. During profiling, get_first_sentence() was the most disproportionately expensive function called.
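To illustrate what resolving the intermediate links could look like on the consuming side, here is a minimal sketch. It is not LuaDox's renderer: the `resolve_links` function, the regular expression, and the id-to-URL table are all assumptions made for the example; only the `luadox:<refid>` link format comes from the commit message above.

```python
import re

# Matches markdown links whose target uses the intermediate luadox:<refid> form.
LUADOX_LINK = re.compile(r"\[([^\]]+)\]\(luadox:([^)\s]+)\)")


def resolve_links(markdown, urls):
    """Rewrite luadox:<refid> link targets using a caller-supplied id -> URL map.

    Hypothetical helper: ids missing from the map degrade to plain link text
    instead of leaving a dangling link behind.
    """
    def repl(match):
        text, refid = match.group(1), match.group(2)
        url = urls.get(refid)
        return f"[{text}]({url})" if url else text

    return LUADOX_LINK.sub(repl, markdown)


# Hypothetical URL table; a real consumer would build this from the ids in the output.
urls = {"bar.two": "/api/bar#two"}

print(resolve_links("More markdown after the admonition [with a link](luadox:bar.two)", urls))
print(resolve_links("See [something else](luadox:bar.three)", urls))
```

Because ids are opaque, the consumer only ever needs a lookup table; how the URLs are laid out stays entirely up to the target site.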
This is implemented in master now, if anyone's interested in trying it out. You can install and run out of master using a pipx editable install:
The structure isn't documented yet, but hopefully it's obvious enough to figure out. Probably the most counterintuitive thing is that for classes and modules, the …

Feedback welcome.
I tried to generate …

/home/rdw/.local/bin/luadox test.lua --renderer yaml
2023-09-22 13:12:39,507 [INFO] parsing /tmp/luadox-json-test/test.lua
2023-09-22 13:12:39,510 [INFO] prerendering 1 pages
2023-09-22 13:12:39,511 [ERROR] unhandled error rendering around /tmp/luadox-json-test/test.lua:-1: No option 'name' in section: 'project'
Traceback (most recent call last):
  File "/usr/lib/python3.11/configparser.py", line 805, in get
    value = d[option]
            ~^^^^^^^^
  File "/usr/lib/python3.11/collections/__init__.py", line 1004, in __getitem__
    return self.__missing__(key)  # support subclasses that define __missing__
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/collections/__init__.py", line 996, in __missing__
    raise KeyError(key)
KeyError: 'name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/luadox-json-test/luadox/luadox/main.py", line 249, in main
    renderer.render(toprefs, out)
  File "/tmp/luadox-json-test/luadox/luadox/render/yaml.py", line 47, in render
    project = self._generate(toprefs)
              ^^^^^^^^^^^^^^^^^^^^^^^
  File "/tmp/luadox-json-test/luadox/luadox/render/json.py", line 33, in _generate
    name = self.config.get('project', 'name')
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/configparser.py", line 808, in get
    raise NoOptionError(option, section)
configparser.NoOptionError: No option 'name' in section: 'project'

A few observations:
This was on WSL (Kali Linux). I could test on other systems as well, but it doesn't seem like a platform-specific issue.
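For reference, the failure can be reproduced with the standard library alone, which supports the guess that it is not platform-specific. The sketch below is not the fix that was committed (I don't know what that looks like); it only shows that `ConfigParser.get()` raises `NoOptionError` when the option is absent, and that the `fallback` keyword avoids it. The `"untitled"` default is a made-up value for the example.

```python
import configparser

config = configparser.ConfigParser()
# The [project] section exists, but it has no 'name' option, as in the traceback.
config.read_string("[project]\n")

try:
    config.get("project", "name")
except configparser.NoOptionError as exc:
    print(f"without fallback: {exc}")

# With a fallback, a missing option yields the default instead of raising.
# "untitled" is a hypothetical default chosen for this example.
name = config.get("project", "name", fallback="untitled")
print(f"with fallback: {name}")
```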
@Duckwhale silly oversight on my part, sorry about that. Just committed a fix.
BTW @Duckwhale, a specific renderer for Docusaurus is theoretically possible now, and I'd like to have that capability natively in LuaDox. So I'm quite interested in your findings here, and really more generally any advice or thoughts you might have on the subject. I've not used Docusaurus yet (and it certainly generates significantly more polished output than LuaDox's current HTML renderer :)) so I don't yet have any intuitions on the ideal approach.
Just FYI, I've started building a prototype to see if I can use the JSON output to generate something remotely close to my manually-created docs. I've written down a bunch of feedback already, but it'll take some time to get more insights.

One thing that I can say already is that I wanted a way to find out which source file a (top-level) entry originates from. This is so I can add project-specific tags that likely wouldn't have to be added to the tool itself, such as "FFI/Unsafe API" or "External", which are useful things to list in documentation but needn't necessarily be custom tags.

Or maybe it's already possible to get this info? I guess it would be possible to chain …
I can definitely add a …
I was having trouble getting this to run from source. I'm sure it was my lack of Python experience. Maybe add this to the front page for other people. It was really useful for me.
Hi,
is there any way to get the parsed structures in a format that can be processed by other applications?
I'd like to use a JSON or Lua table representation of the API to generate content dynamically so that it can be embedded in a static documentation website (created by Docusaurus). While serviceable, the website that luadox generates by default doesn't lend itself well to customization, and I can think of other ways to process the API structures that would be useful to me as well.
I see they're stored in the `Parser` class, but I'm not sure if the internal layout is appropriate if dumped. Ideally, I would have something along the lines of Blizzard's WOW API documentation that could easily be used to populate React components in Docusaurus, but any standardized format would likely work.

Do you think this would be possible, and if yes what approach would you suggest?