When I import the EnzymeML_Template_Example.xlsm file into an EnzymeMLDocument using pyenzyme, and then try to look at the data, I get a dataframe with the "absorbance" and "concentration" data concatenated on top of each other. I have a couple of questions about this:
Is this the expected behavior? I would have expected to get a dataframe that roughly corresponded to the input data from the Excel template file. In this case, one column for time, and two columns for the pyruvate (species s0) data, one corresponding to concentration, and one to absorbance.
If you have a datatype for absorbance, is there any place to store information about the absorbance wavelength?
From an EnzymeMLDocument object, how do I find the data type? I would have expected this information to be accessible from the measurement_dict object.
Regards,
Dan
Hi Dan! Thanks for submitting the issue and your questions. Happy to answer your questions:
Is this the expected behavior? I would have expected to get a dataframe that roughly corresponded to the input data from the Excel template file. In this case, one column for time, and two columns for the pyruvate (species s0) data, one corresponding to concentration, and one to absorbance.
This is expected behavior, but it was implemented to support the modeling platforms we communicate with. I am happy to add a flag that disables this behavior and places the species columns side by side.
If you have a datatype for absorbance, is there any place to store information about the absorbance wavelength?
At this point, EnzymeML has no place to store the wavelength of an absorbance measurement, but this is work in progress and will be implemented soon.
From an EnzymeMLDocument object, how do I find the data type? I would have expected this information to be accessible from the measurement_dict object.
The data_type information is tied to the Replicate object, which is a container for the measured values of a species. The Measurement object, on the other hand, represents a set of Replicates and initial concentrations. Hence, you can access the individual data types by getting the replicates. Here is an example that uses the EnzymeML_Template_Example.xlsm spreadsheet:
# Get the measurement with the id "m1"
measurement = enzmldoc.getMeasurement("m1")

# Get the reactant with the id "s0"
s1 = measurement.getReactant("s0")

# Finally, get all replicates and print their data types
for replicate in s1.replicates:
    print(replicate.data_type)

# Out:
# DataTypes.ABSORPTION
# DataTypes.CONCENTRATION
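The same loop could be used to keep only the replicates of one data type. The sketch below is an illustration rather than pyenzyme code: `DataTypes` and `Replicate` are hypothetical stand-ins that mirror only the attribute and member names shown above, `replicates_of_type` is a made-up helper, and the values are placeholders. With a real document you would pass the reactant's `replicates` instead.

```python
from dataclasses import dataclass
from enum import Enum


class DataTypes(Enum):
    # Stand-in enum; member names copied from the printed output above.
    ABSORPTION = "abs"
    CONCENTRATION = "conc"


@dataclass
class Replicate:
    # Hypothetical stand-in holding only the fields this sketch needs.
    id: str
    data_type: DataTypes
    data: list


def replicates_of_type(replicates, data_type):
    """Return the replicates whose data_type matches the requested one."""
    return [r for r in replicates if r.data_type == data_type]


# Placeholder values, not taken from the template spreadsheet.
replicates = [
    Replicate("r0", DataTypes.ABSORPTION, [0.1, 0.2]),
    Replicate("r1", DataTypes.CONCENTRATION, [1.0, 2.0]),
]

concentration_only = replicates_of_type(replicates, DataTypes.CONCENTRATION)
print([r.id for r in concentration_only])
```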
Would you prefer the DataFrame export to filter certain data types? This way there wouldn't be a mix up of different types.
I think that only concentration data should be exported by default, since that is the data that is most likely to be used by people other than the creator of the EnzymeML document.
Without wavelength information, the absorbance data does not seem particularly useful. I agree that it should be saved for archival purposes, but this is one more reason to exclude it from the default export.
If the absorbance data is exported, it should be exported as a separate column. To me, having one column that contains both the expected concentration data, and the unexpected (and differently scaled) absorbance data, seems very confusing.
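To make the separate-column layout concrete, here is a rough sketch in plain Python. It assumes the stacked export can be read as rows that carry a time, a data type label, and a value for species s0. That row shape, the column names, the `to_wide` helper, and the numbers are all assumptions made for illustration, not what pyenzyme actually produces.

```python
from collections import defaultdict

# Stacked layout: one row per (time, data type). Values are made up.
stacked = [
    {"time": 0, "data_type": "absorbance", "s0": 0.50},
    {"time": 1, "data_type": "absorbance", "s0": 0.40},
    {"time": 0, "data_type": "concentration", "s0": 10.0},
    {"time": 1, "data_type": "concentration", "s0": 8.0},
]


def to_wide(rows, species="s0"):
    """Return one row per time point with one column per data type."""
    by_time = defaultdict(dict)
    for row in rows:
        column = f"{species}_{row['data_type']}"
        by_time[row["time"]][column] = row[species]
    return [{"time": t, **cols} for t, cols in sorted(by_time.items())]


wide = to_wide(stacked)
for row in wide:
    print(row)
```

In this layout the absorbance and concentration values for a time point sit in the same row under differently named columns, so the two scales are never mixed in one column.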