
Hackathon discussion forum #2

Open · AruniRC opened this issue Feb 18, 2017 · 4 comments

Comments


AruniRC commented Feb 18, 2017

Purpose: use GitHub issues as a discussion forum for this hackathon.

You can ask questions about the dataset, technical issues, or any theoretical or implementation problems you may be having with your models.
We'll try to help to the best of our ability.

thanks,
Aruni

@AruniRC AruniRC closed this as completed Feb 23, 2017
@AruniRC AruniRC reopened this Feb 23, 2017
@djsaunde

The SampleTime column in the "DATA" worksheet of the "Copper_Iron_and_Lead_for_GriD.xlsx" workbook seems to have duplicate time entries (e.g., 12:21 appears multiple times for house 8, and 8:12 appears multiple times for house 17). Is there a way to resolve this? If we're interested in time series data, it could be hard to work with multiple data points that occur at the same time.
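For reference, a minimal pandas sketch to surface those duplicates; SampleTime is taken from the sheet, while "House" is a guess at the property-identifier column name and may differ in the actual workbook:

```python
import pandas as pd

# Load the "DATA" worksheet (reading .xlsx needs openpyxl installed).
df = pd.read_excel("Copper_Iron_and_Lead_for_GriD.xlsx", sheet_name="DATA")

# "House" is a placeholder for the property-identifier column; rename as needed.
dupes = df[df.duplicated(subset=["House", "SampleTime"], keep=False)]
print(dupes.sort_values(["House", "SampleTime"])[["House", "SampleTime"]])
```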


ospiro commented Feb 25, 2017

The samples were taken in quick succession and times were not recorded to the second. However, the samples are in chronological order, and the Samp_No column can be used to order them within each property.
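For example, a short pandas sketch of that ordering; Samp_No comes from the sheet, while "House" again stands in for whatever the property-identifier column is actually called:

```python
import pandas as pd

df = pd.read_excel("Copper_Iron_and_Lead_for_GriD.xlsx", sheet_name="DATA")

# Order samples within each property by Samp_No; "House" is a placeholder
# for the actual property-identifier column in the sheet.
df = df.sort_values(["House", "Samp_No"]).reset_index(drop=True)

# Optional: a 0-based draw index per property, useful as a pseudo-time axis.
df["draw_idx"] = df.groupby("House").cumcount()
```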

@heyitsjoe

Correct. Samples are in chronological order and were drawn in fairly rapid succession. My guess is that minimal time elapsed between draws and the seconds were simply not recorded by whoever took the samples.

@AruniRC AruniRC closed this as completed Feb 25, 2017
@AruniRC AruniRC reopened this Feb 25, 2017
@heyitsjoe

I had another thought on this. The samples that start with DS (the distribution samples) were drawn last, and have a different recorded time than the other samples at each respective location. One could calculate the elapsed time between the first sample and the DS sample, then divide by the total number of samples at each location, to come up with an individual estimated time for each sample drawn, if desired. This assumes a consistent interval between draws, which I think is a defensible assumption. Just a thought; probably not required.
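A rough sketch of that interpolation, assuming evenly spaced draws, that the DS sample carries the highest Samp_No at each location, and that the property-identifier column is named "House" (a placeholder):

```python
import pandas as pd

def spread_times(group):
    """Assign evenly spaced estimated times between the first recorded time
    and the DS (distribution) sample's time, assuming a constant draw
    interval and that the DS sample has the highest Samp_No per location."""
    group = group.sort_values("Samp_No")
    start, end = group["_time"].iloc[0], group["_time"].iloc[-1]
    n = len(group)
    step = (end - start) / max(n - 1, 1)
    group["EstTime"] = [start + i * step for i in range(n)]
    return group

df = pd.read_excel("Copper_Iron_and_Lead_for_GriD.xlsx", sheet_name="DATA")

# SampleTime may load as strings or time objects depending on the cell format;
# parse via string so time arithmetic works (the date part is arbitrary).
df["_time"] = pd.to_datetime(df["SampleTime"].astype(str), errors="coerce")

# "House" is a placeholder for the property-identifier column.
df = df.groupby("House", group_keys=False).apply(spread_times)
```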
