Struct namespaces with Serde #218

Open
Richterrettich opened this issue Jun 22, 2020 · 16 comments
Labels
enhancement namespaces Issues related to namespaces support serde Issues related to mapping from Rust types to XML

Comments

@Richterrettich

Hi,
it would be really cool to have a feature to define common namespaces for structs using serde. Something like this:

#[derive(Serialize, Deserialize)]
#[namespace(F,foourn)]
struct Foo {
  id: String,
  #[serde(flatten)]
  #[namespace(B,barurn)]
  bar: Bar
}

#[derive(Serialize, Deserialize)]
struct Bar {
  name: String,
  desc: String
}

resulting in:

<F:foo xmlns:F="foourn" xmlns:B="barurn">
      <F:id>123</F:id>
      <B:name>asdf</B:name>
      <B:desc>foobar </B:desc>
</F:foo>

Background

I sometimes face (old) APIs that require XML to be structured in a way similar to the given example. They use namespaces to distinguish between entities to form some sort of inheritance tree.
A feature like this would make interacting with these kinds of APIs very easy.

@mlevkov

mlevkov commented Jun 28, 2020

I do not know how to handle this here, but I've used the yaserde crate for such cases @Richterrettich

@tafia
Owner

tafia commented Jul 17, 2020

This would be a nice feature indeed. Unfortunately I don't have much time but I'd be happy to integrate it if someone finds the time to do it.

@WhyNotHugo

I'd love to see this happen, but I've no idea where to even start. What kind of macro would namespace be in order for its data to somehow be accessible in the serde (de)serialization stage?

Mingun removed their assignment Feb 18, 2023

@meghfossa

meghfossa commented May 26, 2023

Is there any workaround for this, or is : not supported at all? I don't mind if the workaround is verbose or unintuitive. Right now, I can't parse any field with : in its name.

@WhyNotHugo

WhyNotHugo commented May 26, 2023 via email

@Mingun
Collaborator

Mingun commented May 26, 2023

If you don't want to write deserialization code manually, you could look at xmlserde. Support for namespaces for serde is not an easy task, unfortunately.

@dralley
Collaborator

dralley commented Jul 26, 2023

Whoever picks this up, consider starting from #466

@jespersm

jespersm commented Jan 2, 2025

I want to give this one a try.

I'm aiming for the following feature set:

  • Easy specification of namespace information into serde derive attributes
  • Ability to ensure space-efficient serialization (i.e. by explicitly pushing the namespace definitions to the start of the serialized XML)
  • Ser/de for xml:lang and xml:space
  • Support for custom ser/de of QNames, as used in e.g. XML Schema files, WSDL, and XSLT.
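
As a rough illustration of the second bullet (pushing namespace declarations to the start of the document), here is a minimal std-only sketch. The function name `root_declarations` and the generated `ns0`, `ns1`, ... prefixes are assumptions made for this example; they are not part of quick-xml or of the work in progress.

```rust
/// Assigns one generated prefix per distinct namespace URI, in first-seen
/// order, and renders the `xmlns:*` declarations for the root element.
/// Note: this sketch does not escape attribute values.
fn root_declarations(used_namespaces: &[&str]) -> String {
    // Deduplicate while preserving the order in which namespaces were seen.
    let mut seen: Vec<&str> = Vec::new();
    for ns in used_namespaces {
        if !seen.contains(ns) {
            seen.push(*ns);
        }
    }
    seen.iter()
        .enumerate()
        .map(|(i, ns)| format!("xmlns:ns{}=\"{}\"", i, ns))
        .collect::<Vec<_>>()
        .join(" ")
}
```

Declaring each namespace once on the root means descendant elements only need the short prefix, which is where the space saving would come from.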

The main problem is that serde's container, variant, and field attributes offer few places to encode namespace information; really only 'rename', as in the following suggested format (some have called it James Clark notation):

    /// Type where one field represented by an attribute and one by an element
    #[derive(Debug, Deserialize, PartialEq)]
    #[serde(rename = "{urn:example:a}mixed-ns")]
    struct Mixed {
        #[serde(rename = "@{urn:example:b}float")]
        float: f64, // Note: It's an XML attribute
        #[serde(rename = "{urn:example:c}string")]
        string: String,
    }

Which could be used to deserialize XML like this:

<elements xmlns="urn:example:a" xmlns:bbb="urn:example:b" xmlns:ccc="urn:example:c" bbb:float="42.0">
  <ccc:string>answer</ccc:string>
</elements>

Note that to deserialize, you don't need to know the prefixes in advance. For serializing, you don't either, but you may have preferences you want to express in the generated XML.

Repeating the same namespace over and over again is not pretty, I know.
However, the alternative would need a separate proc macro and/or some gruesome linker tricks to look up the per-type namespace information (by type ID or similar), if I understand serde's architecture correctly. I'd rather not go that way.
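
For readers unfamiliar with the `{namespace}local` notation, the following std-only sketch shows one way such a `rename` value could be split into its parts. `ExpandedName` and `parse_expanded_name` are hypothetical names for this illustration, not the API of the prototype.

```rust
/// A field name decoded from the `{namespace}local` notation.
#[derive(Debug, PartialEq)]
struct ExpandedName<'a> {
    /// `true` when the name started with `@`, i.e. it maps to an XML attribute.
    is_attribute: bool,
    /// Namespace URI found between the braces, if any.
    namespace: Option<&'a str>,
    /// Local part of the name, without prefix or namespace.
    local: &'a str,
}

/// Splits a serde `rename` value such as `@{urn:example:b}float`.
/// Returns `None` when an opening brace has no matching closing brace.
fn parse_expanded_name(raw: &str) -> Option<ExpandedName<'_>> {
    // A leading `@` marks an attribute, as in the struct above.
    let (is_attribute, rest) = match raw.strip_prefix('@') {
        Some(rest) => (true, rest),
        None => (false, raw),
    };
    match rest.strip_prefix('{') {
        Some(after_brace) => {
            let (namespace, local) = after_brace.split_once('}')?;
            Some(ExpandedName { is_attribute, namespace: Some(namespace), local })
        }
        None => Some(ExpandedName { is_attribute, namespace: None, local: rest }),
    }
}
```

Because matching is done on the namespace URI rather than the prefix, the `bbb:` and `ccc:` prefixes in the XML above would not need to be known ahead of time.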

@WhyNotHugo

WhyNotHugo commented Jan 3, 2025 via email

@jespersm

jespersm commented Jan 3, 2025

> This would imply that namespaces need to be known at compile time, and can't be read when deserialising either, right? I don't think this is a problem, but I still want to ensure that limitations are clear.

In the general case (deserialize instances of a known "ordinary" schema into Rust datastructures), you know the relevant namespaces in advance, and the prefixes do not matter.

In the case of more creative uses of XML Namespaces, such as XML Schema, WSDL and the like, you know the namespaces of the structures ahead of time, but to deserialize them properly, you need access to the namespaces of the content which the structures are describing.

Example:

<xsd:schema targetNamespace="http://www.example.com/items"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:items="http://www.example.com/items"
    elementFormDefault="qualified">

    <xsd:element name="order">
        <xsd:complexType>
            <xsd:sequence>
                <xsd:element name="item" type="items:itemType" maxOccurs="unbounded" />
            </xsd:sequence>
        </xsd:complexType>
    </xsd:element>

    <xsd:complexType name="itemType">
        <xsd:simpleContent>
            <xsd:extension base="xsd:string">
                <xsd:attribute name="itemId" type="xsd:ID" />
            </xsd:extension>
        </xsd:simpleContent>
    </xsd:complexType>

</xsd:schema>

Note here how the value of /xsd:schema/xsd:element/xsd:complexType/xsd:sequence/xsd:element/@type mentions a namespace prefix ("items") which is declared in the schema instance (i.e. the schema document for 'order' and 'items'), but is not part of the value space of XML Schema itself. This is the crux of the fourth bullet point above, about deserializing (and serializing) QNames.
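
To make the QName problem concrete, here is a minimal std-only sketch of resolving a value like `items:itemType` against the prefix bindings in scope. `resolve_qname` is a hypothetical helper, and treating the empty key `""` as the default namespace is a convention chosen for this example.

```rust
use std::collections::HashMap;

/// Resolves a QName value such as `items:itemType` to `(namespace, local)`
/// using the prefix bindings in scope where the value appeared.
/// Unprefixed values use the default namespace (key `""`), if one is bound.
/// Returns `None` when the prefix has no binding.
fn resolve_qname<'a>(
    value: &'a str,
    bindings: &'a HashMap<&'a str, &'a str>,
) -> Option<(Option<&'a str>, &'a str)> {
    match value.split_once(':') {
        Some((prefix, local)) => {
            let ns = bindings.get(prefix)?;
            Some((Some(*ns), local))
        }
        None => Some((bindings.get("").copied(), value)),
    }
}
```

The point of the sketch is that the bindings come from the document being read, not from the Rust type, which is why a plain `String` field cannot recover the namespace on its own.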

For XML files which are entirely mixed-form, like XSLT, I really can't tell if deserialization is a productive strategy to pursue. It would require doing everything in mixed mode, deserializing the "content" subtrees into an in-memory XML tree, while deserializing the XSLT elements themselves. All XPath expressions would need access to namespace mappings, since elements like <xsl:value-of select="//library:book/@isbn:isbn-number"/> can contain several different QNames which need to be parsed out carefully. Writing a custom parser using NsReader would likely be more productive in that case -- you can't be all things to all people.

Ideally, the namespaces could be made known to any custom Serialize or Deserialize you use, perhaps by somehow extending the contract or hooking into the NsReader state -- thread locals?
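
The thread-local idea could look roughly like the std-only sketch below. Everything here (`NS_BINDINGS`, `with_bindings`, `lookup_prefix`) is hypothetical and only shows the shape of the approach; it is not how NsReader exposes its state.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

thread_local! {
    /// Prefix-to-namespace bindings currently in scope on this thread.
    static NS_BINDINGS: RefCell<HashMap<String, String>> = RefCell::new(HashMap::new());
}

/// Runs `f` with `bindings` installed, then restores the previous bindings.
/// Not panic-safe: if `f` panics, the previous bindings are not restored.
fn with_bindings<R>(bindings: HashMap<String, String>, f: impl FnOnce() -> R) -> R {
    let previous = NS_BINDINGS.with(|cell| cell.replace(bindings));
    let result = f();
    NS_BINDINGS.with(|cell| cell.replace(previous));
    result
}

/// What a custom `Deserialize` impl could call to look up a prefix.
fn lookup_prefix(prefix: &str) -> Option<String> {
    NS_BINDINGS.with(|cell| cell.borrow().get(prefix).cloned())
}
```

The trade-off is that the context is implicit: a custom impl silently sees no bindings when it runs outside `with_bindings`.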

@WhyNotHugo

WebDAV uses custom namespaces for properties defined by extensions.

A client might receive elements with unknown namespaces (which some other client created). But clients need dedicated support to do something useful with this data, so it's usually okay to ignore unknown elements with unknown namespaces.

I know of one tricky situation where different clients use different namespaces for the same property (a not-fully-standard one; calendar colour). But I guess that with the proposed implementation, a client could just serialise both variations into separate fields.
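
A small sketch of the "separate fields" idea, using only std. The struct, its field names, and both namespace URIs in the comments are placeholders invented for this example; the real WebDAV property namespaces are not given in this thread.

```rust
/// Calendar properties as a client might model them.
#[derive(Debug, Default)]
struct CalendarProps {
    /// Would map to something like `{urn:example:vendor-a}calendar-color`.
    colour_vendor_a: Option<String>,
    /// Would map to something like `{urn:example:vendor-b}calendar-color`.
    colour_vendor_b: Option<String>,
}

impl CalendarProps {
    /// Returns whichever variant the server sent, preferring vendor A.
    fn colour(&self) -> Option<&str> {
        self.colour_vendor_a
            .as_deref()
            .or(self.colour_vendor_b.as_deref())
    }
}
```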

@jespersm

jespersm commented Jan 3, 2025

> WebDAV uses custom namespaces for properties defined by extensions.

Ouch, WebDAV is a dumpster fire of incompatibility, I just had a major run-in with it. I'll keep the WebDAV namespace in mind. Examples welcome.

@Caellian

Caellian commented Jan 3, 2025

SVGs are similar with <metadata>.

> Support for namespaces for serde is not an easy task, unfortunately.

It's not possible. Namely:

> specification of namespace information into serde derive attributes

requires addition of namespaces to serde. This isn't some additional data that can be encoded in (de)serialized data, it's a parser/generator metadata, so it really needs to be supported by the serde itself (or another library).

Every element has to be aware of namespaces defined for it or any defined above it in the tree. Deserializer has to be able to provide the top-level/default namespace. And so on...
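
The scoping rule described here (bindings are inherited from ancestors, and inner declarations shadow outer ones) can be sketched with a simple stack. `NamespaceScopes` is a hypothetical type for illustration and uses only std; the empty prefix `""` stands for the default namespace.

```rust
/// Stack of in-scope namespace bindings; one frame per open element.
#[derive(Default)]
struct NamespaceScopes {
    frames: Vec<Vec<(String, String)>>,
}

impl NamespaceScopes {
    /// Called at a start tag with the `xmlns`/`xmlns:*` declarations on it.
    fn push(&mut self, declarations: &[(&str, &str)]) {
        let frame: Vec<(String, String)> = declarations
            .iter()
            .map(|(prefix, ns)| (prefix.to_string(), ns.to_string()))
            .collect();
        self.frames.push(frame);
    }

    /// Called at the matching end tag.
    fn pop(&mut self) {
        self.frames.pop();
    }

    /// Searches from the innermost frame outwards, so inner bindings win.
    fn resolve(&self, prefix: &str) -> Option<&str> {
        self.frames
            .iter()
            .rev()
            .flat_map(|frame| frame.iter().rev())
            .find(|(p, _)| p.as_str() == prefix)
            .map(|(_, ns)| ns.as_str())
    }
}
```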

I saw that no serde issues mention namespaces so I created one: serde-rs#2877. I'm not 100% sure this issue will be accepted though because afaict, the problem requires a context-aware parser and serde wasn't designed for this.

@jespersm

jespersm commented Jan 3, 2025

> It's not possible. Namely:
>
> > specification of namespace information into serde derive attributes
>
> requires addition of namespaces to serde. This isn't some additional data that can be encoded in (de)serialized data, it's a parser/generator metadata, so it really needs to be supported by the serde itself (or another library).

I'm sure it's doable, either with extra derives as already attempted, or by encoding the required namespaces into the renames as prototyped by me already.

But yes, it's a hard problem, with a diverse set of trade-offs.

@Caellian

Caellian commented Jan 4, 2025

Not sure why, but adding it to name and working around that feels a bit hackish to me.

While creating the issue on serde I got an idea of storing additional metadata in (de)serializer as HashMap<TypeId, XMLSpecificTypeInfo>, maybe a trait that builds on top of serde infra could be added to specify attribute/inner and namespace.
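
A minimal std-only sketch of the `HashMap<TypeId, ...>` idea follows. The struct is spelled `XmlSpecificTypeInfo` to follow Rust naming conventions, and its fields and the `XmlTypeRegistry` wrapper are assumptions for this example. How a serializer would learn which Rust type it is currently handling is left open here.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Hypothetical per-type XML metadata.
#[derive(Debug, Clone, PartialEq)]
struct XmlSpecificTypeInfo {
    namespace: &'static str,
    preferred_prefix: &'static str,
}

/// Registry a (de)serializer could own, keyed by Rust type.
#[derive(Default)]
struct XmlTypeRegistry {
    infos: HashMap<TypeId, XmlSpecificTypeInfo>,
}

impl XmlTypeRegistry {
    /// Associates metadata with the type `T`, replacing any earlier entry.
    fn register<T: 'static>(&mut self, info: XmlSpecificTypeInfo) {
        self.infos.insert(TypeId::of::<T>(), info);
    }

    /// Looks up the metadata registered for `T`, if any.
    fn get<T: 'static>(&self) -> Option<&XmlSpecificTypeInfo> {
        self.infos.get(&TypeId::of::<T>())
    }
}
```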

@jespersm

I've pushed my work in progress to https://github.com/jespersm/quick-xml/tree/serde_namespace_support - starting with the deserialization side of things, but it's not reviewable yet.
