Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: Lightweight in-place XpsDocument modification APIs #10361

Open
edwardneal opened this issue Jan 29, 2025 · 5 comments
Open

[API Proposal]: Lightweight in-place XpsDocument modification APIs #10361

edwardneal opened this issue Jan 29, 2025 · 5 comments
Labels
API suggestion Early API idea and discussion, it is NOT ready for implementation

Comments

@edwardneal
Copy link

edwardneal commented Jan 29, 2025

Background and motivation

Relates to #4000, providing the API proposal for the ability to read and rewrite the structure of an XPS document.

There are three objectives with this proposal:

  1. Allow access to an IXpsFixedDocumentSequenceWriter for an XpsDocument's existing FixedDocumentSequence. At present, the only way to access an IXpsFixedDocumentSequenceWriter is to create one - and only one can be created per XpsDocument, so it's not possible to amend existing XpsDocuments.
  2. Allow navigation from a FixedDocumentSequence, FixedDocument and FixedPage's reader APIs to their writer APIs. This should be simple enough: the underlying implementation implements both interfaces, so I expect these three properties could potentially just return this.
  3. Improve the writer APIs, ensuring that every property which can be read from IXps*Reader can be modified from IXps*Writer.

The wider goal is to allow an XPS document's XML structure to be inspected in a way which doesn't involve instantiating full WPF controls (with the thread affinity this implies) or using STA threads in the way that the existing XpsDocumentWriter API requires. With that said, this enables some functionality too - being able to navigate from a page's IXpsFixedPageReader to its IXpsFixedPageWriter means that we can get to the raw XmlWriter APIs, so any future XPS support can build on these types.

Although #4000 relates to modifying and printing XPS documents, this proposal only addresses the modification aspect - I think that's where the largest speed/memory usage improvement lies.

API Proposal

namespace System.Windows.Xps.Packaging;

public interface IXpsFixedDocumentSequenceReader
{
    // Existing
    ReadOnlyCollection<IXpsFixedDocumentReader> FixedDocuments { get; }
    PrintTicket PrintTicket { get; }
    XpsThumbnail Thumbnail { get; }
    Uri Uri { get; }

    IXpsFixedDocumentReader GetFixedDocument (Uri documentSource);

    // New
+   IXpsFixedDocumentSequenceWriter FixedDocumentSequenceWriter { get { throw new NotImplementedException(); } }
}

public interface IXpsFixedDocumentReader : IDocumentStructureProvider
{
    // Existing
    int DocumentNumber { get; }
    XpsStructure DocumentStructure { get; }
    ReadOnlyCollection<IXpsFixedPageReader> FixedPages { get; }
    PrintTicket PrintTicket { get; }
    ICollection<XpsSignatureDefinition> SignatureDefinitions { get; }
    XpsThumbnail Thumbnail { get; }
    Uri Uri { get; }
    
    void AddSignatureDefinition (XpsSignatureDefinition signatureDefinition);
    void CommitSignatureDefinition ();
    IXpsFixedPageReader GetFixedPage (Uri pageSource);
    RemoveSignatureDefinition (XpsSignatureDefinition signatureDefinition);

    // New
+   IXpsFixedDocumentWriter FixedDocumentWriter { get { throw new NotImplementedException(); } }
}

public interface IXpsFixedPageReader : IStoryFragmentProvider
{
    // Existing
    ICollection<XpsColorContext> ColorContexts { get; }
    ICollection<XpsFont> Fonts { get; }
    ICollection<XpsImage> Images { get; }
    int PageNumber { get; }
    PrintTicket PrintTicket { get; }
    ICollection<XpsResourceDictionary> ResourceDictionaries { get; }
    XpsStructure StoryFragment { get; }
    XpsThumbnail Thumbnail { get; }
    Uri Uri { get; }
    XmlReader XmlReader { get; }

    XpsColorContext GetColorContext (Uri uri);
    XpsFont GetFont (Uri uri);
    XpsImage GetImage (Uri uri);
    XpsResource GetResource (Uri resourceUri);
    XpsResourceDictionary GetResourceDictionary (Uri uri);

    // New
+   IXpsFixedPageWriter FixedPageWriter { get { throw new NotImplementedException(); } }
}

public interface IXpsFixedDocumentSequenceWriter
{
    // Existing
    PrintTicket PrintTicket { set; }
    Uri Uri { get; }

    IXpsFixedDocumentWriter AddFixedDocument ();
    XpsThumbnail AddThumbnail (XpsImageType imageType);
    void Commit ();

    // New
+   void RemoveFixedDocument (IXpsFixedDocumentWriter document) => throw new NotImplementedException();
+   void RemoveThumbnail (XpsThumbnail thumbnail) => throw new NotImplementedException();
}

public interface IXpsFixedDocumentWriter : IDocumentStructureProvider
{
    // Existing
    int DocumentNumber { get; }
    PrintTicket PrintTicket { set; }
    Uri Uri { get; }

    IXpsFixedPageWriter AddFixedPage ();
    XpsThumbnail AddThumbnail (XpsImageType imageType);
    void Commit ();

    // New
+   void RemovePage (IXpsFixedPageWriter page) => throw new NotImplementedException();
+   void RemoveThumbnail (XpsThumbnail thumbnail) => throw new NotImplementedException();
}

public interface IXpsFixedPageWriter : IStoryFragmentProvider
{
    // Existing
    IList<string> LinkTargetStream { get; }
    int PageNumber { get; }
    PrintTicket PrintTicket { set; }
    Uri Uri { get; }
    XmlWriter XmlWriter { get; }

    XpsColorContext AddColorContext ();
    XpsFont AddFont ();
    XpsFont AddFont (bool obfuscate);
    XpsFont AddFont (bool obfuscate, bool addRestrictedRelationship);
    XpsImage AddImage (string mimeType);
    XpsImage AddImage (XpsImageType imageType);
    XpsResource AddResource (Type resourceType, Uri resourceUri);
    XpsResourceDictionary AddResourceDictionary ();
    XpsThumbnail AddThumbnail (XpsImageType imageType);
    void Commit ();

    // New
+   void RemoveColorContext (XpsColorContext colorContext) => throw new NotImplementedException();
+   void RemoveFont (XpsFont font) => throw new NotImplementedException();
+   void RemoveImage (XpsImage image) => throw new NotImplementedException();
+   void RemoveResource (XpsResource resource) => throw new NotImplementedException();
+   void RemoveResourceDictionary (XpsResourceDictionary resourceDictionary) => throw new NotImplementedException();
+   void RemoveThumbnail (XpsThumbnail thumbnail) => throw new NotImplementedException();
}

public interface IDocumentStructureProvider
{
    // Existing
    XpsStructure AddDocumentStructure ();

    // New
+   void RemoveDocumentStructure (XpsStructure documentStructure) => throw new NotImplementedException();
}

public interface IStoryFragmentProvider
{
    // Existing
    XpsStructure AddStoryFragment ();

    // New
+   void RemoveStoryFragment (XpsStructure storyFragment) => throw new NotImplementedException();
}

All of these members have default interface members which throw a NotImplementedException; this is to avoid introducing source breaking changes.

API Usage

const string XpsPath = "...";
const string LastPageUrl = "...";
using var existingDocument = new XpsDocument(XpsPath, FileAccess.ReadWrite, CompressionOption.Normal);

var fdsReader = existingDocument.FixedDocumentSequenceReader;
var fdReader = fdsReader.FixedDocuments[0];

ReplaceLastPage(fdReader);
AttachDocumentThumbnail(existingDocument);

void ReplaceLastPage(IXpsFixedDocumentReader fixedDocument)
{
    var fdWriter = fixedDocument.FixedDocumentWriter;
    var lastPage = fixedDocument.GetFixedPage(new Uri(LastPageUrl, UriKind.Relative));
    var lastPageWriter = lastPage.FixedPageWriter;

    fdWriter.RemovePage(lastPageWriter);
    lastPageWriter = fdWriter.AddFixedPage();

    AttachPageThumbnail(lastPageWriter);
}

void AttachPageThumbnail(IXpsFixedPageWriter newPage)
{
    var pgThumbnail = newPage.AddThumbnail(XpsImageType.PngImageType);

    using var newThumbnail = GenerateThumbnail(newPage);
    SetThumbnail(pgThumbnail, newThumbnail);
}

void AttachDocumentThumbnail(XpsDocument document)
{
    var fdsReader = document.FixedDocumentSequenceReader;
    var fdsWriter = fdsReader.FixedDocumentSequenceWriter;
    var fdsThumbnail = fdsReader.Thumbnail ?? fdsWriter.AddThumbnail(XpsImageType.PngImageType);

    using var newThumbnail = GenerateThumbnail(document);
    SetThumbnail(fdsThumbnail, newThumbnail);
}

void SetThumbnail(XpsThumbnail thumbnail, Stream thumbnailStream)
{
    using var xpsThumbnailStream = thumbnail.OpenStream();

    xpsThumbnailStream.SetLength(0);
    thumbnailStream.CopyTo(xpsThumbnailStream);
}

Stream GenerateThumbnail(XpsDocument document) => null;
Stream GenerateThumbnail(IXpsFixedPageWriter newPage) => null;

Alternative Designs

XPS support could be moved into its own library, separate to WPF and similar to OpenXML. This could make XPS support cross-platform, but it'd need a much larger effort and it'd need someone to maintain that library.

For the moment, the key obstacle is the inability to read and update XPS documents without instantiating WPF controls and by using multiple threads. I expect that once it becomes possible to open an XPS document, iterate through its pages and access each page's XmlReader and XmlWriter, this functionality can be used as building blocks if somebody wants to build a strongly-typed API later.

Risks

No response

@miloush
Copy link
Contributor

miloush commented Jan 29, 2025

I think it would be helpful if the API changes also included the current members, or at least the relevant ones (like Add counterparts).

Both the reader and writer interface derive from IDocumentStructureProvider which has AddDocumentStructure. Do you need a remove equivalent of that?

I am not a fan of *Reader interfaces having *Writer properties. They are unrelated, and changing the interfaces is breaking. Unless I overlooked something, there is no *Reader properties defined on existing interfaces, so there shouldn't be need to have interfaces for *Writer properties. You can define a Writer property directly on e.g. XpsDocument.

Furthermore, this assumes that the reader/writer are 1) random access 2) mutable. I think it is perfectly reasonable and desirable to have fast forward-only readers/writers. You could remove say a page by copying - implementing a reader that does not write it to the writer.

If we are to have mutable writers, we should have a new set of interfaces with the Remove methods.

@edwardneal
Copy link
Author

Thanks @miloush. I've added the current interface members, and expanded it to include IDocumentStructureProvider and IStoryFragmentProvider. You're correct, I'd missed those - apologies.

I understand the hesitation about blurring the *Reader and *Writer boundaries. These boundaries are already blurred though - IXpsFixedPageReader and IXpsFixedPageWriter both inherit from IStoryFragmentProvider, which has an AddStoryFragment method. Perhaps adding a method to get a fixed page/etc. by its URI would be better here?

Are the interface changes really a breaking change? It's my understanding that adding a DIM to an interface is described by the runtime documentation as permitted (link.)

@miloush
Copy link
Contributor

miloush commented Jan 30, 2025

I think it is more useful for consumers to check whether something implements an interface with the stuff they want rather than getting exceptions when trying to use it.

By the way have you seen https://learn.microsoft.com/en-us/windows/win32/printdocs/documents-xps? Maybe these interfaces could be used when creating a standalone library.

@edwardneal
Copy link
Author

I agree, but the new member's default implementation is purely to prevent the proposal from becoming a breaking change: If another class implementing the interface was compiled against the .NET 9.0 version of the interface, that type can be loaded in .NET 10.0+ and its methods can be bound to the new version. The WPF-provided implementation naturally wouldn't throw.

Whether we add a new interface or modify the existing one, none of our options are ideal. If we add a new "IXpsFixedPageWriter2" interface which derives from IXpsFixedPageWriter and adds the relevant methods, that new interface needs to be exposed somehow. Changing the return type of XpsDocument.AddFixedDocumentSequence is a breaking change, but without changing it we find ourselves writing code like this:

var pgWriter = documentSequenceWriter.Pages[0]; // or similar

// .NET vNext
if (pgWriter is IXpsFixedPageWriter2 pgWriter2)
{
    pgWriter2.RemoveFont(fontToRemove);
}
else
{
    // .NET 9.0 doesn't support this method, use alternative approach here
}

When upgrading to .NET vNext, this code becomes:

var pgWriter2 = documentSequenceWriter.Pages[0] as IXpsFixedPageWriter2;

pgWriter2.RemoveFont(fontToRemove);

In this situation, our documented return type is no longer as accurate - consumers have to "just know" that to (for e.g.) remove fonts from a page, they need to cast the return value to its derived type. From a newcomer's perspective, that lack of discoverability could be confusing; even if documented there's no way to fix the root cause of that confusion without introducing a breaking change.

On the other hand: adding a default implementation means that if a downstream library implements IXpsFixedPageWriter but doesn't implement the new methods, those methods will throw at runtime. That's a poor consumer experience too, but there's at least a path to fix it without needing any breaking changes.

What do you think about adding CanAddChildren and CanRemoveChildren properties to the writer interfaces, and documenting that consumers should check these properties before calling Add or Remove methods? CanAddChildren could always return true and CanRemoveChildren could have a default implementation which always returns false (but which the WPF implementation overrides to return true.) The writer interfaces would start to look a little like Stream (with its Seek and CanSeek, etc. properties.)

By the way have you seen https://learn.microsoft.com/en-us/windows/win32/printdocs/documents-xps? Maybe these interfaces could be used when creating a standalone library.

Thanks for these. I did see those and did consider writing a library - but while I was testing, I found that the majority of the performance improvements with XPS processing actually came from being able to modify individual pages in an XPS document in their own threads. I think that if we can work out a sensible way to make this API proposal to let the writer interfaces modify XPS documents (rather than just create them) then it'll actually provide the majority of the performance improvements.

@himgoyalmicro himgoyalmicro added the API suggestion Early API idea and discussion, it is NOT ready for implementation label Jan 31, 2025
@miloush
Copy link
Contributor

miloush commented Jan 31, 2025

What about say IXpsFixedPageEditor : IXpsFixedPageWriter? I guess that would still be breaking to the IXpsFixedDocumentWriter implementors.

I also noticed that there are these implementations:

  1. XpsFixedDocumentReaderWriter : IXpsFixedDocumentReader, IXpsFixedDocumentWriter
  2. XpsFixedDocumentSequenceReaderWriter : IXpsFixedDocumentSequenceReader, IXpsFixedDocumentSequenceWriter
  3. XpsFixedPageReaderWriter : IXpsFixedPageReader, IXpsFixedPageWriter

So in practice you don't need to have the writer properties on readers, you can just cast them to writers.

The majority of the performance improvements with XPS processing actually came from being able to modify individual pages in an XPS document in their own threads.

Taking a step back, how do the proposed changes enable this? Removing pages seems to be the only useful thing that the API allows. Without being able to edit or access the page content, the others like removing font and images don't seem to be very useful. This also corresponds to your example usage. There is also nothing to guarantee thread safety. All the classes share an instance of XpsManager under the hood, which does not seem to be managing its collections in a thread-safe manner. (That said, you have researched this more than me, so feel free to correct me.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API suggestion Early API idea and discussion, it is NOT ready for implementation
Projects
None yet
Development

No branches or pull requests

3 participants