Open Questions

ehennestad edited this page Nov 30, 2022 · 16 revisions

EH

Manifest classes for item tables

  • There is a lot of commonality between the OphysManifest class and the EphysManifest class. It would make sense to gather this in a superclass, and the Manifest class is an obvious candidate. Would that be ok to do, or would that violate some principles of design?

    • VI: Much of OOP design pattern lore advises to "favor association over inheritance" (often phrased as "composition over inheritance"), which is what's done currently. However, I believe this lore largely stems from Java and strongly reflects its lack of multiple inheritance. I agree with your instinct to favor inheritance where rational, such as here. Seems a worthwhile refactor to consider. And regardless of whether the relationship is association or inheritance, refactoring to centralize common functionality (e.g. updateManifest as noted below) would be worthwhile. As your next question gets at, this may involve detective work to determine what is or could/should be common.
    • EH: This is now implemented. A handful of methods that were shared between EphysManifest and OphysManifest are now methods of the Manifest superclass.
  • The Ephys- and OphysManifest classes do not have the same internal logic for retrieving and caching manifests. In the OphysManifest, all the different tables are fetched, put into a struct and immediately added to the disk cache, whereas in the EphysManifest they are handled more individually: the final processed tables are initially cached in memory, and only added to the disk cache when the property get method for a particular item type is invoked. Is there a particular reason for this (e.g. performance considerations during retrieval)?

    • VI: Great that you're analyzing this. Perhaps DM has insights.
    • DM: The EphysManifest handling adheres very closely to the Allen SDK. The OphysManifest handling predates my involvement, but the OphysManifest tables are also much smaller, so re-fetching or re-processing portions is less costly. So the additional caching in EphysManifest might be warranted anyway.
    • EH: I found that the biggest discrepancy was that item tables for the OphysManifest were all lumped together in one struct in the disk cache, whereas the ephys item tables were treated individually. I made the change to treat ophys tables individually as well, and I think this provides some improvements. For example, fetching ophys sessions now only fetches the session table, whereas before it fetched the session, experiment and cell tables (and the cell table is huge). Also, the underlying code is now more "symmetric". Thus, it should now be easier to apply the same (post)processing to ophys tables as to ephys tables, i.e. joining tables and adding counts for subtables.
  • Would it be cleaner to use the OnDemandProperties mixin for memory caching of tables in the Manifest classes?

    • VI: So far OnDemandProperties has only been used for user-facing properties of item objects. I guess adapting it to private properties would be fairly trivial, but I would have to analyze how intertwined/specialized it is for item objects today. Conceptually it seems sound, but I would tend to prioritize it lower if it doesn't prove highly straightforward.
    • EH: This was straightforward, and since item table get-access is public, the tables are in theory user-facing (although users should not have to interact with them directly).
  • What is the purpose of the UpdateManifests method - Why is it called update? It seems more like a clear/reset method...

    • VI: I believe it was intended to allow a fresh check of the data source (whether API or now possibly S3), which is always subject to change (which has occurred from time to time during the project, affecting demos etc). Maybe "Refresh" would have been a better word than "Update", but the implementation is a clear/reset type operation as you say.
    • EH: I changed the name to resetManifest, and added a new method updateManifest that does the reset, but additionally fetches all the manifests again.
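
The design points discussed in this list (a shared superclass, per-table caching that only fetches what is requested, and the split between resetting and updating) can be illustrated with a small language-neutral sketch. This is not the toolbox's MATLAB code: it is written in Python for brevity, a plain dict stands in for the disk cache, and every name here (`get_item_table`, `_fetch_item_table`, `fetch_count`, the table names) is hypothetical. Only the reset/update distinction mirrors the `resetManifest` and `updateManifest` methods described above.

```python
from abc import ABC, abstractmethod


class Manifest(ABC):
    """Hypothetical superclass holding logic shared by both manifest types."""

    # Subclasses list the names of their item tables (illustrative names only).
    ITEM_TABLE_NAMES = ()

    def __init__(self, disk_cache=None):
        # Assumption: a plain dict stands in for the on-disk cache.
        self._disk_cache = {} if disk_cache is None else disk_cache
        self._memory_cache = {}
        self.fetch_count = 0  # counts simulated fetches from the data source

    @abstractmethod
    def _fetch_item_table(self, name):
        """Retrieve one item table from the data source (subclass-specific)."""

    def get_item_table(self, name):
        """Return a single item table, fetching it only on first access."""
        if name not in self.ITEM_TABLE_NAMES:
            raise KeyError(name)
        if name not in self._memory_cache:
            key = (type(self).__name__, name)
            if key not in self._disk_cache:
                # Each table is cached individually, so asking for one
                # table never triggers a fetch of the others.
                self._disk_cache[key] = self._fetch_item_table(name)
                self.fetch_count += 1
            self._memory_cache[name] = self._disk_cache[key]
        return self._memory_cache[name]

    def reset_manifest(self):
        """Clear both caches; nothing is fetched until the next access."""
        self._memory_cache.clear()
        for name in self.ITEM_TABLE_NAMES:
            self._disk_cache.pop((type(self).__name__, name), None)

    def update_manifest(self):
        """Reset, then eagerly fetch every item table again."""
        self.reset_manifest()
        for name in self.ITEM_TABLE_NAMES:
            self.get_item_table(name)


class OphysManifest(Manifest):
    ITEM_TABLE_NAMES = ("sessions", "experiments", "cells")

    def _fetch_item_table(self, name):
        # Placeholder rows; a real implementation would query the data source.
        return [{"table": name, "id": 1}]
```

In this sketch, requesting the `sessions` table leaves the (large) `cells` table untouched, and a second manifest type would only need to supply its own table names and fetch logic.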

General note: I changed some syntax internally in these classes, so now all the tables are referred to as item tables (in general) and session table, unit table etc. in particular. A manifest would then be the collection of all item tables for a dataset. I think this makes things clearer, as before a manifest could be both an individual table, but also the collection of tables (and, as we discussed before, cache maps are also referred to as manifests).

General syntax suggestions:

  • To give the manifests more of a MATLAB flavour, I would suggest renaming item table properties like 'ophys_sessions' or 'ephys_sessions' to 'Sessions'.
  • Along the same lines, within the classes (manifest classes, but also others), consistently refer to objects of the class with the variable name obj instead of various acronyms that are specific to individual classes.

Caching / API

  • In the bot.internal.cache/CachedApiCall method, there are a lot of optional arguments, like: nPageSize, strFormat, strRMAPrefix, strHost, strScheme. However, none of the calls to this method make use of them, and some, like strScheme, strHost and strRMAPrefix, do not seem to be really variable in the context of the Allen Brain Observatory API. Was there any past scenario where these were useful, or is there any future scenario where they might be put to use?
    • DM: This was my conservative implementation, as I learned how to use the back-end HTTP API. Future-proofing in case Allen changes the way the API should be accessed?
  • Do you know if documentation exists for the RMA API that is specific to the Brain Observatory? Or are the current API calls in MATLAB derived from the Python toolbox?
    • DM: Unknown to me
  • Is there a reason why behavioral data and pupil data are not added as linked files? Are they not available as well_known_files from the API?
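
To make the trade-off around the optional arguments of CachedApiCall concrete, here is a minimal sketch of how such arguments might be kept as keyword defaults so that callers never have to pass them, while an override remains possible if the API location changes. It is written in Python rather than MATLAB and is not the toolbox's implementation: the function name `build_rma_url`, the default values, and the query parameter names are all assumptions chosen for illustration and should be checked against the actual API documentation.

```python
from urllib.parse import urlencode, urlunsplit


def build_rma_url(criteria, *, scheme="http", host="api.brain-map.org",
                  rma_prefix="api/v2/data", fmt="query.json",
                  page_size=5000, start_row=0):
    """Assemble a query URL from its parts.

    All defaults and query parameter names are illustrative assumptions;
    they loosely mirror strScheme, strHost, strRMAPrefix, strFormat and
    nPageSize from the discussion above.
    """
    query = urlencode({
        "criteria": criteria,
        "num_rows": page_size,
        "start_row": start_row,
    })
    path = "/" + rma_prefix.strip("/") + "/" + fmt
    return urlunsplit((scheme, host, path, query, ""))


# Typical call: only the criteria string is supplied.
default_url = build_rma_url("model::Example")

# Override used only if the API location or scheme ever changes.
custom_url = build_rma_url("model::Example", scheme="https", host="example.org")
```

Keeping the rarely used parts as keyword-only arguments costs little, but if they are never expected to vary they could equally be hard-coded constants in one place.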

VI

  • Are there H5 files in the Visual Coding 2P dataset at this point? The "SessH5" linked file appears defunct in various ophys sessions I've retrieved lately.