-
Notifications
You must be signed in to change notification settings - Fork 245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement write_serializable_content
on element writer
#508
Conversation
Codecov Report
@@ Coverage Diff @@
## master #508 +/- ##
==========================================
+ Coverage 61.33% 61.51% +0.17%
==========================================
Files 33 33
Lines 15393 15456 +63
==========================================
+ Hits 9441 9507 +66
+ Misses 5952 5949 -3
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work, but I'm not confident of how the result of serde serialization is merged into Writer
. It would be much more useful if the type name won't be produce an inner element. I also suggest to move test to the doctest comment and include to it an attribute in the struct definition that will be merged with manually added attributes.
Also pay attention on failed checks.
@@ -330,6 +333,48 @@ impl<'a, W: Write> ElementWriter<'a, W> { | |||
.write_event(Event::End(self.start_tag.to_end()))?; | |||
Ok(self.writer) | |||
} | |||
|
|||
/// Serialize an arbitrary value inside the current element |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, it would be valuable to add a test as doc example here. Pay particular attention to which tag name the type will be serialized with
r#"<paired attr1="value1" attr2="value2"> | ||
<Foo> | ||
<bar> | ||
<baz>42</baz> | ||
<bat>43</bat> | ||
</bar> | ||
<val>foo</val> | ||
</Foo> | ||
</paired>"# | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect
<paired attr1="value1" attr2="value2">
<bar>
<baz>42</baz>
<bat>43</bat>
</bar>
<val>foo</val>
</paired>
here. This also means that you need to test that the attribute lists will be merged correctly. It also would be valuable to move the entire test to doctests for write_serializable_content
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mingun I would expect it to produce the same XML it would if you were to serialize it in a different context, but I don't think that would be the case if it produced the XML you suggest?
If you were to just serialize a Foo
, then I would expect <foo>...inner_contents...</foo>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I asked myself this question and I can see both use cases being valid. Maybe we should leave this as an option for that function? Although I'm not sure how to tell the serializer to not produce the <foo>
tags if we don't want them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming it works as it does in your test case and in my description, it just feels extremely niche. The ElementWriter
construct only allows you to write one inner value wrapped in a pair of tags, and I can't think of many XML structures where you'd want 2 levels of nesting like that for only one possible value inner value. The construct is basically just too limiting to make it broadly useful.
Wheras with the other approach of only supporting it on the writer, you can still accomplish the same thing, and it would be more flexible - if slightly more verbose for that one strange case.
writer
.create_element("paired")
.with_attribute(("attr1", "value1"))
.with_attribute(("attr2", "value2"))
.write_inner_content(|writer| {
writer.write_serializable(&content1).unwrap();
writer.write_serializable(&content2).unwrap();
writer.write_serializable(&content3).unwrap();
})?;
If it were to work the way @Mingun describes then none of this necessarily applies. I'm just not sure I understand why the outer tags wouldn't be included when on the deserialization side you would expect some struct fields to potentially be sourced from attributes. Shouldn't it work the same in the opposite direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dralley, I proceed from the following considerations:
- We expect that in
<foo>...</foo>
foo
could be mapped to a field of a struct. This is a basic expectation. - We also expect to get the inner content of
<foo>...</foo>
, so we could in the result a type which content is mapped to the...
in our XML. This is a second basic expectation - Ok, so we expect that
is mapped to
<...> <!-- a far-far inside some xml... --> <foo>...</foo> </...>
struct { foo: ???, }
- To understand what is
???
let's expand our example:To access... <foo> <bar>...</bar> </foo> ...
bar
content we can use pathfoo.bar
. That can be naturally expressed with nested structs:struct { foo: FooType, } struct FooType { bar: ???, }
- Because we want to build a mapping, which can be applied to XML with any nesting depth, obviously,
bar
should have aBarType
similar toFooType
. Our rules should be recursive. - Now take a closer look at
BarType
and how it is presented in XML:<bar>...</bar>
We may notice that structstruct BarType { /*...*/, }
BarType
is a<bar>
element. Remember, that this XML can be described with following XSD:Quite obvious, that rust's<xs:schema> <xs:complexType name="BarType"> <xs:sequence> <!--...--> </xs:sequence> </xs:complexType> <xs:element name="bar" type="BarType"/> </xs:schema>
BarType
can be mapped to XSD'sBarType
. - The example above gives an interesting question -- where is the
bar
? XML responds by saying that thebar
is part of the document. For example, XmlBeans implements this concept by generation a classBarDocument
, which is different from classBarType
that represents type of<bar>
element and contains actual data. - serde is very bond to json format, and json has no dedicated conception of a document -- each piece of JSON is JSON. XML is opposite -- not all pieces of XML is a document (but each document is XML piece). XmlBeans names that pieces is a Fragment. So JSON in terms of XML is a fragment.
- Returning to Rust, we have to admit that the Rust types will model fragments. We could introduce a special methods / types to work with documents (now we can say, that this is an object, that holds the root element name and a technical XML stuff -- namespace definitions, DTD and so on), but they will outside serde API.
I think, now it is obvious why I think, that serialization of a Rust type should not generate an element name -- because that name is not part of an xml-fragment (a type). There is, however, one significant semi-exception to this rule. enum
at the top level is convenient to represent as a document. But note that even in that case the (root) element name mapped to the variant name, not the enum
's type name.
I'll plan to write a doc about that in #369.
Although I must say that there are several subtle moments that may not be quite obvious. For example, serialization of sequences and enum
s. It is best to create tests and imagine what result we expect in each case? Will the selected mapping option be consistent?
If you were to just serialize a
Foo
, then I would expect<foo>...inner_contents...</foo>
This depends on whether Foo
is represent the root element type or not. In the latter case, you most likely expect that the name of the element will have a name of struct field of type Foo
as I shown in the beginning of that comment.
The Deserializer
and Serializer
are worked as I described. For convenience, the Foo
type can simultaneously represent both the data type and the document type -- the root tag can be derived from the type name, this is the only place in the serializer where the type name is used. The deserializer never uses a type name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mingun So let's that Foo
had some fields mapped to attributes, how does that work? If the Foo
element is erased and only its fields are serialized, any fields mapped to attributes will have nowhere to be serialized to.
I can appreciate that there are limitations with the serde model which may limit our ability to achieve certain outcomes, but I would rather not do something than to do it in a way that is incorrect or illogical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[derive(Serialize)]
struct Foo {
#[serde(rename = "@attribute")]
attribute: &'static str,
element: Bar,
list: Vec<&'static str>,
#[serde(rename = "$text")]
text: &'static str,
}
let content = Foo {
attribute: "attribute",
element: Bar { baz: 42, bat: 43 },
list: vec!["first element", "second element"],
text: "text",
};
writer
.create_element("paired")
.with_attribute(("attr1", "value1"))
.with_attribute(("attr2", "value2"))
.write_serializable_content(content)
.expect("failure");
I would expect (indentation depends on settings):
<paired attr1="value1" attr2="value2" attribute="attribute">
<element>
<baz>42</baz>
<bat>43</bat>
</element>
<list>first element</list>
<list>second element</list>
text
</paired>
src/writer.rs
Outdated
|
||
/// Serialize an arbitrary value inside the current element | ||
#[cfg(feature = "serialize")] | ||
pub fn write_serializable_content<T: Serialize>(self, content: T) -> Result<&'a mut Writer<W>> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the usage example in tests I think that the name with_content
could be better, because:
writer.create_element("foo")
.with_attribute(("a", "b"))
.with_content(&bar)
It is also better to take a reference instead of value, because Serialize
is not automatically implemented for the references.
Summing up all the suggestions:
pub fn write_serializable_content<T: Serialize>(self, content: T) -> Result<&'a mut Writer<W>> { | |
pub fn with_content<T: Serialize>(self, content: &T) -> DeResult<&'a mut Writer<W>> { |
where
// in crate::error::serialize
type Result<T> = std::result::Result<T, DeError>;
use crate::error::serialize::Result as DeResult;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mingun I'm not sure which usage you are referring to, but the scheme you suggest is inconsistent with all of the existing methods such as write_text_content()
, write_empty()
, etc.
https://github.com/tafia/quick-xml/blob/master/src/writer.rs#L285-L332
They're named write_
because they perform the write, they don't push into a separate buffer. with_
implies that you can chain another method onto the end, which shouldn't be the case. Attributes are different because there's already a separate buffer for storing them which you can append to.
Likewise using DeResult
would be inconsistent with the other methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I disagree slightly with the overall approach here, we should support this directly on the writer as opposed to on ElementWriter
. If you want to use it in conjunction with ElementWriter
it would be easy to do with write_inner_content()
https://github.com/tafia/quick-xml/blob/master/src/writer.rs#L285-L332
This specific use case just feels too niche to support as an independent thing as - if it worked like the other ElementWriter
methods - it would be writing a single object inside a custom outer pair of tags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g.
writer
.create_element("paired")
.with_attribute(("attr1", "value1"))
.with_attribute(("attr2", "value2"))
.write_inner_content(|writer| {
writer.write_serializable(&content)
})?;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mingun I'm not sure which usage you are referring to, but the scheme you suggest is inconsistent with all of the existing methods such as
write_text_content()
,write_empty()
, etc.
I mean, that we have with_attribute
, so another with_*
looks familiar. But you're right -- with_*
methods does not end the chain. Ok, let it be write_serializable_content
.
Likewise using
DeResult
would be inconsistent with the other methods.
That looks not so inconsistent, if we took into account that this is another serialization level which have its own error type. I do not like hiding real errors in library code -- the caller should decide how to handle them. We cannot include DeError
into Error
because DeError
already includes Error
. So the logical step is just return DeError
here.
I think I disagree slightly with the overall approach here, we should support this directly on the writer as opposed to on
ElementWriter
.
I would see this as an auxiliary method for a frequent use case. We can have methods both on the Writer
and ElementWriter
.
Also add an entry to changelog |
@calops Is this ready for re-review? |
My apologies, I didn't have any time lately to finish this. There isn't much to do to address all the feedback here though, it should be ready in the next few days. |
@calops Do you have a moment to finish in the next few days or can I do so? |
I'm refiling it here: #609 Will try to address the feedback before un-drafting |
Resolves #499
This introduces a new
write_serializable_content
method in order to serialize arbitrary types throughserde
within manual manipulation of an XML writer.I feel it's a bit rough, especially how indentation is handled, but I'm not sure how to improve this without a bigger overhaul of some of the deeper stuff.