Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow builds of targets even when other targets have incorrect specification in project.ptx #623

Closed
oscarlevin opened this issue Oct 17, 2023 · 29 comments

Comments

@oscarlevin
Copy link
Member

See https://groups.google.com/g/pretext-support/c/5UeK0ywGrLU

@bnmnetp
Copy link
Contributor

bnmnetp commented Oct 17, 2023

Oddly, I had 3 targets with a bad publication tag... but I only got two warnings. the targets were latex, pdf, and subset.

For now I've just deleted them from the project file.

@StevenClontz
Copy link
Member

So we validate the whole XML file, then throw up if validation fails.

Would it make sense (and be possible with PydanticXml) to have a weaker validation at the <project/> level, ignoring the contents of each <target/>? Then run validation on each <target/> separately and only choke completely when a target is named that is invalid?

@StevenClontz
Copy link
Member

The other issue here - is this not an issue with validation of XML persay, but a non-existent publication file? In that case, should we fall back to our default publication file settings when the file does not exist (tossing out a warning, not an error)?

@oscarlevin
Copy link
Member Author

Right, so the problem is that the user has specified a publication file, but one that doesn't exist (or is invalid). If you tried to build with such a file, I think a breaking error is appropriate. Given the number of messages in the log, a warning will be too hidden and I think users are likely to miss it (and then not understand why changing their publication file isn't having the desired effect).

For the original issue, I agree that I don't know how or if this can be done with pydantic. I don't want to reimplement the original reading of project.ptx just to solve the issue.

@bjones1
Copy link
Collaborator

bjones1 commented Oct 18, 2023

This is possible to do -- line 1090 of pretext/project/__init__.py performs the post- validation step that's raising this error. So, if we know which target(s) to validate. But, this means that we'd need to pass which targets to validate when parsing/loading the project file.

@bnmnetp
Copy link
Contributor

bnmnetp commented Oct 18, 2023

To repeat a little of what I said in the original email to -support.

I did pretext build runestone and only got the following messages:

warning: Provided publication file /Users/bmiller/Runestone/books/pretext/examples/sample-book/publication/publication.ptx does not exist.
warning: Provided publication file /Users/bmiller/Runestone/books/pretext/examples/sample-book/publication/publication.ptx does not exist.

So I checked my target... Thats not the publication file I specified for Runestone! grumble grumble

How did I get into this predicament? I'm sure I ran pretext init to create the project.ptx file. And never bothered to update anything for the pdf, latex or subset targets.

pretext core will already break and tell you that a publication file does not exist, so this seems like duplicated checking to me.

@bjones1
Copy link
Collaborator

bjones1 commented Oct 18, 2023

So I checked my target... Thats not the publication file I specified for Runestone! grumble grumble

Sigh, that's frustrating. We're hoping to reduce these problems in the future using better up-front validation.

pretext core will already break and tell you that a publication file does not exist, so this seems like duplicated checking to me.

The CLI has to read the publication file before invoking the core in order to know what directory generated files will be placed in. However, it doesn't need this info if it's not building a target with validation errors.

@bnmnetp
Copy link
Contributor

bnmnetp commented Nov 1, 2023

Even worse, when you spit out the warning messages and then exit, it does not exit with an error code that indicates failure. So my build scripts think that the build has actually succeeded when the script bails without doing anything.

@bnmnetp
Copy link
Contributor

bnmnetp commented Nov 1, 2023

I guess, at a minimum I would request that any kind of validation error that means you are not going to actually do the requested build would actually raise an exception or exit with a non-zero exit code so that scripts would know that the build didn't really happen.

@bnmnetp
Copy link
Contributor

bnmnetp commented Nov 5, 2023

After today's update, I have no idea which books have rebuilt and which books have not updated because of a problem with the project.ptx file. If you are not going to build the book, please make this an error where the build fails. Otherwise the current behavior completely defeats my scripts looking at the result code of trying to run pretext build --clean runestone

@oscarlevin
Copy link
Member Author

Let me explore the exit code thing. We are trying to balance having nice error messages for users who are using the command line with making sure that a call to the CLI via a script behaves predictably.

@bnmnetp
Copy link
Contributor

bnmnetp commented Nov 7, 2023

Thanks Oscar,

After some quick checking, it appears that about half of the pretext builds for Runestone failed silently this last weekend, and have been doing so for about two months!

I understand trying to balance things, but if the program is not going to do what you ask it to do it should not exit with a 0 exit code. In my case this is simply using subprocess to run "pretext build --clean runestone". I understand that maybe I should update the code to use the newer api functions that you make available, but in the meantime I'm very sad that books I thought were getting updated really were not! Even worse, I don't know how to check for this other than trying to look at stdout for this one particular error message.

@oscarlevin
Copy link
Member Author

In #633, I have (I think) ensured that whenever there is a failure to build for any reason, pretext will report an exit code of 1. Note,

  1. This does not apply to a build that fails to generate an asset (other than webwork, which is required to build the book)
  2. This would still require a subprocess calling pretext to check for a non-zero exit code and handle it appropriately.

@oscarlevin
Copy link
Member Author

@bnmnetp , if you upgrade to 2.3.1 you can try this out. Let me know if additional changes are needed.

I'll still leave this original issue open, as it would be nice to not need non-used targets to pass validation.

@oscarlevin
Copy link
Member Author

@bjones1, when you have time, it would be great if you could help me sort this out. Essentially what I would like to do is whenever a project is parsed, be it legacy or current, to look at each target, and if that target validates, include it. If not, give a warning.

Then if a user tries to build a target that validates, but has a non-valid second target, they just get a warning. If they try to build the target that doesn't validate, they get an error because that isn't a valid target.

@bjones1
Copy link
Collaborator

bjones1 commented Nov 8, 2023

I'd be happy to! Let me take a look at the code. Should I use log.warning to produce warnings, or is there some other mechanism?

@bjones1
Copy link
Collaborator

bjones1 commented Nov 8, 2023

Here are my current thoughts, which I'm posting now as notes to myself. I may have found a better solution, which I'll write up in another post.

I took a look, and realized this will be much harder than I though. A guess at a solution:

  1. Parse the whole file. If there are no errors, done.
  2. If there are errors, create a clone of the project with all targets removed. Add a target in, parse it, then check for errors.
  3. Remove all targets that failed validation.
  4. Reparse this modified project file.

Overall, this is a heavy lift. IMHO, this is a lot of work for a small amount of benefit.

Background: validation errors don't provide the exact location in the XML tree, which would simplify the process somewhat. Here's some sample output from a validation error; note that the loc field only gives the tag, not the path to the tag, of the error. We also don't get any sort of line number information in terms of where this was in the project file.

[
  {
    "type": "value_error",
    "loc": [
      "output_filename"
    ],
    "msg": "Value error, The output-filename must not be present when the format is HTML.",
    "input": "not-allowed",
    "ctx": {
      "error": "The output-filename must not be present when the format is HTML."
    },
    "url": "https://errors.pydantic.dev/2.4/v/value_error"
  }
]

Thoughts on where to go from here:

  1. Improve the quality of error messages produced by the validator. I'd really like to provide a line number, since this would help users understand what's wrong then fix them. Often, the fixes are pretty simply, but only if the source and underlying reason is well explained.

@bjones1
Copy link
Collaborator

bjones1 commented Nov 8, 2023

Another possibility: pydantic-xml supports raw element fields. We might be able to define a target as an element, then parse this individually, creating an empty target element if validation fails. Thoughts on this approach"

  1. In the Project class, declare a targets_raw: t.List[Element] = pxml.wrapped("targets", pxml.element(tag="target", default=[])) and a _targets: t.List[Target]. Rename all references to targets in the code with _targets.
  2. In the Project's model validator, include code which would iterate through each targets_raw item, then validate it using BaseXmlModel.from_xml_tree. If this validates, add it to the list of targets; otherwise, issue a warning. Assign the result to _targets.

Thoughts/comments?

@oscarlevin
Copy link
Member Author

I really like this idea. It is roughly what I thought we would do, just didn't know how it could be implemented.

Is that roughly what we do with legacy projects? In that code, we seem to validate targets separately, starting with the LegacyTarget class in xml.py

@bjones1
Copy link
Collaborator

bjones1 commented Nov 8, 2023

@StevenClontz and I talked, and he came up with a simpler workaround: if a v1 project file's publication file doesn't exist, issue a warning and remove it, so that the legacy project will use the default publication file. See https://github.com/PreTeXtBook/pretext-cli/tree/v1pubfix.

@oscarlevin
Copy link
Member Author

That would solve the issue with an incorrect publication file, but we still need the other fix in general, in case an unused target is broken for any reason but the requested target is correct.

@bjones1
Copy link
Collaborator

bjones1 commented Nov 9, 2023

In this case (a v2 project file) I think our emphasis should be on helping authors correct their project file, rather than on accepting invalid project file bits that aren't necessary for the current build.

Mainly, this is motivated by looking at the challenges of implementing this. It looks non-trivial; I don't think the pain justifies the gain.

@oscarlevin
Copy link
Member Author

Alright. I'll try to add a better error message (something like, inspect ALL your targets for errors) for when validation fails.

@bnmnetp
Copy link
Contributor

bnmnetp commented Nov 9, 2023

But @bjones1 -- having 1/2 of the rebuilds fail silently because a target I don't care about happens to have a bad value is not good behavior, not to mention super confusing to a user. If I try to build the runestone target and I get a message about missing publisher file that isn't even referenced in that target what am I supposed to do with that information?

I think its way over the top validation to fail because something unrelated to what I'm trying to do causes my thing to fail.

Maybe having a pretext check command that makes sure the entire project file is valid would make sense. Otherwise just make sure the current target is valid.

@bjones1
Copy link
Collaborator

bjones1 commented Nov 10, 2023

A requirements to make this work is knowledge of what target to build before loading the project file. If we know what target to build, then we can edit the targets to remove everything but the desired target. @oscarlevin and @StevenClontz, do we know this before loading the project file?

@StevenClontz
Copy link
Member

I don't think so. Consider this pattern:

p = Project() # loads project file
p.get_target('web').build()
p.get_target('print').build()

@bjones1
Copy link
Collaborator

bjones1 commented Nov 10, 2023

Good point. I think improving current feedback (non-zero exit code on failure, which @oscarlevin already fixed), better error messages if a publication file isn't valid, provide line numbers for errors in the project file, etc. would provide a better user experience within the current implementation.

@oscarlevin
Copy link
Member Author

In #639, I have improved the error messages we give when a project manifest is invalid, including a warning to look through the entire file for issues.

I have also relaxed the check for a publication file existing. If the publication file doesn't exist and we happen to call that target, then we later check that the file exists, so we can catch it there. Anyway, having a valid path that just happens to point somewhere that doesn't contain the file isn't really a problem with the manifest; it's a problem with the user's computer.

Come to think of it, I should probably do a similar check for other files that are specified by the manifest.

@oscarlevin
Copy link
Member Author

Actually, we have never validated that @source exists. So I'm happy now that we are consistent with the publication file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants