
Data Model: Status

webbhm edited this page Jul 21, 2018 · 2 revisions

Status

Status is often overlooked, as it has little value to students and people who are just trying to grow plants. As long as they see some data on their temperature chart, they are happy. Still, as a product, I think status is important for supporting the MVP and development. What do we do when a teacher contacts us saying their sensor is not working? How do we inject test data into the system without impacting reporting or operations? Status is needed for these issues today, and it becomes critical for scaled-up operations. Much of research management's attention is focused on error trending: watching the failure rate of a sensor to determine when it should be replaced, or looking at how process changes improve (or hurt) process performance. Status is not just about sensors; it is also used to track failure of germination or failure of fruit to ripen (a fungus infection can be defined as a failure status qualifier reason). It is easy to add status now and take advantage of its small benefits, and then it is there when things scale up and it is really needed. Status codes have an advantage over logs in that they are part of the data system: they can be managed with standard reporting, and other data can be combined into the status reports (e.g. which sensor is trending toward failure? what is the average time to failure for a particular light fixture? which plant is giving the most trouble with germination?).

Fractal Status Reporting

Status information needs to be reported at several levels. There is an inverse relationship between the ease of querying and reporting and the amount of information contained. A high-level boolean "Success"/"Failure" is easy to search, but it doesn't tell you much (it does not answer "why?"). Comments can convey a lot of detailed, specific information, but they are almost impossible to process and categorize automatically. At a practical level, status needs to be reported at about four levels:

  1. Status: The state of a process: whether it is "In Process", "Complete" (finished) or "Canceled". "Canceled" is used to mark mistaken entries, or processes that were stopped by administration. Status is not needed for punctiliar (momentary) actions that lack a start_date and end_date, as they are by definition always "Complete". It is needed for long-running processes (Trial or Experiment).
  2. Status Qualifier: A simple (almost boolean) "Success" or "Failure" applied to a record when it is complete. This allows for quick query filtering of good data and easy reporting of failures. For reports, select the records where the status qualifier equals "Success" and ignore all the others. In practice, several more codes are useful. "Canceled" is needed when a process is stopped for a business reason: it didn't complete, so it is not a "Success", but nothing went wrong to make it a "Failure" either. It is treated almost as if it never existed. "Test" is used for injecting things into the live environment without corrupting the data. This can be used for experimental code, for checking whether things are working, or for any other purpose. With legacy systems it was also helpful to have an "Unknown" status. This was needed when converting legacy data into a new system and it was impossible to determine the status of a record.
  3. Status Qualifier Reason: This is almost exclusively used to record failure reasons, and is usually a semi-custom set of codes for each activity type. For a germination activity this could contain values like "Failed to grow", "Fungus infection" or "Tray dropped". This data field is used for administrative reporting to support process improvement.
  4. Comment: This is for when the status qualifier reason does not cover the situation, or when more details need to be recorded (e.g. the Exception caught in a try/catch block of code). This field cannot easily be used for analytic reporting (as it is free text), but it is invaluable for recording important notes.
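
The four levels above can be sketched as fields on a single record. This is only an illustrative Python sketch, not the project's actual schema: the class names (`Status`, `StatusQualifier`, `ActivityRecord`), the `activity_type` and `subject` fields, the helper functions and the sample data are all assumptions made up for this example. Only the code values ("In Process", "Success", "Fungus infection", and so on) come from the text.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    """Level 1: state of a long-running process."""
    IN_PROCESS = "In Process"
    COMPLETE = "Complete"
    CANCELED = "Canceled"


class StatusQualifier(Enum):
    """Level 2: near-boolean outcome, plus the extra codes from the text."""
    SUCCESS = "Success"
    FAILURE = "Failure"
    CANCELED = "Canceled"
    TEST = "Test"
    UNKNOWN = "Unknown"


@dataclass
class ActivityRecord:
    """Hypothetical record layout; field names are assumptions."""
    activity_type: str                              # e.g. "germination"
    subject: str                                    # e.g. a tray or sensor id
    status: Status = Status.COMPLETE                # momentary actions are always complete
    status_qualifier: Optional[StatusQualifier] = None
    status_qualifier_reason: Optional[str] = None   # level 3: coded failure reason
    comment: Optional[str] = None                   # level 4: free text


def reportable(records):
    """Keep only "Success" records; test, canceled and failed rows drop out."""
    return [r for r in records if r.status_qualifier is StatusQualifier.SUCCESS]


def failure_reasons(records, activity_type):
    """Count coded failure reasons for one activity type."""
    return Counter(
        r.status_qualifier_reason
        for r in records
        if r.activity_type == activity_type
        and r.status_qualifier is StatusQualifier.FAILURE
    )


# Made-up sample data, for illustration only.
records = [
    ActivityRecord("germination", "tray-1",
                   status_qualifier=StatusQualifier.SUCCESS),
    ActivityRecord("germination", "tray-2",
                   status_qualifier=StatusQualifier.FAILURE,
                   status_qualifier_reason="Fungus infection"),
    ActivityRecord("germination", "tray-3",
                   status_qualifier=StatusQualifier.FAILURE,
                   status_qualifier_reason="Fungus infection",
                   comment="Spotted on day 4, two trays from the same batch."),
    ActivityRecord("germination", "tray-4",
                   status_qualifier=StatusQualifier.FAILURE,
                   status_qualifier_reason="Tray dropped"),
    ActivityRecord("germination", "tray-5",
                   status_qualifier=StatusQualifier.TEST),
]

good = reportable(records)
reasons = failure_reasons(records, "germination")
```

Here the "Test" record never reaches `reportable`, so injected test data stays out of reports, and `failure_reasons` gives the kind of coded tally that free-text comments cannot provide.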