Implement minor changes from Weather branch into Auxiliary for PR #182

Open
wants to merge 39 commits into
base: main
from

Conversation

gmanatole
Contributor

@gmanatole gmanatole commented Jul 20, 2024

The Weather branch is a large upcoming PR. We are breaking it down into smaller PRs, starting with the changes to Auxiliary.
Here is a short list of the modifications:
utils/timestamp_utils:
Function to_timestamp():
Timestamp formats already handled: "%Y-%m-%dT%H:%M:%S.%fZ" or "%Y-%m-%dT%H-%M-%S_%fZ".
We add handling of "%Y-%m-%dT%H:%M:%S.%f%z" and of data that is already in timestamp format.
to_timestamp() can now be applied to any df, regardless of whether the data is already in timestamp format.
The check_epoch function applies to_timestamp to any df and creates an epoch column (POSIX time).
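
For readers who want to see the idea in isolation, here is a minimal standard-library sketch of the format fallback described above. It is not the code in this PR: the names KNOWN_FORMATS, to_timestamp_sketch and to_epoch_sketch are hypothetical, it works on single values rather than a df, and treating timezone-less values as UTC is an assumption.

```python
from datetime import datetime, timezone

# The three string formats named in the PR description.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%S.%fZ",
    "%Y-%m-%dT%H-%M-%S_%fZ",
    "%Y-%m-%dT%H:%M:%S.%f%z",
)


def to_timestamp_sketch(value):
    """Return a timezone-aware datetime from a string or a datetime."""
    if isinstance(value, datetime):
        # Data already in timestamp form is passed through unchanged.
        parsed = value
    else:
        for fmt in KNOWN_FORMATS:
            try:
                parsed = datetime.strptime(value, fmt)
                break
            except ValueError:
                continue
        else:
            raise ValueError(f"Unrecognised timestamp format: {value!r}")
    # Assumption: values without an explicit offset are in UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def to_epoch_sketch(value):
    """Return POSIX time in seconds, as an epoch column would hold."""
    return to_timestamp_sketch(value).timestamp()
```

Calling to_epoch_sketch on the same instant written in any of the three formats, or on a datetime object, should give the same POSIX value, which is the property the epoch column relies on.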

Spectrogram :
We switch audio_foldername from a variable to an attribute that will be used in Auxiliary

Auxiliary :
Added a saving method. For now, joined dataframes are stored in data/auxiliary/{self.spectro_duration}_{self.samplerate}.csv (i.e. the file is named after self.audio_foldername).
Note that the dataframe only depends on spectro_duration.
Moved the fetch_data method from the Weather class to the join_acoustic method in Auxiliary.
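
As a rough illustration of the naming scheme above (a sketch, not the saving method added in this PR), the output path could be built as follows. The function name auxiliary_csv_path and the dataset_root parameter are hypothetical.

```python
from pathlib import Path


def auxiliary_csv_path(dataset_root, spectro_duration, samplerate):
    """Build <dataset_root>/data/auxiliary/{spectro_duration}_{samplerate}.csv."""
    # Same "<duration>_<samplerate>" pattern as the audio folder name.
    audio_foldername = f"{spectro_duration}_{samplerate}"
    return Path(dataset_root) / "data" / "auxiliary" / f"{audio_foldername}.csv"
```

With spectro_duration=59 and samplerate=32000 this yields a file named 59_32000.csv inside data/auxiliary.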

gmanatole and others added 30 commits May 6, 2024 17:47
* Update README.md with new "OSEkit" name

* Update README.md "osekit" renamed to "OSEkit"

* correct name of processed file (#149)

there was a bug, it was joining two absolute paths..

* correct pending jobs list in the case of resampling only mode

* do not send jobs with zscore other than original , to be changed later

---------

Co-authored-by: Elodie <[email protected]>
Co-authored-by: cazaudo <[email protected]>
Removed commented functions
Figure showing how Auxiliary module can be used
Adding section on auxiliary module
Modifying figure
…dio_foldername attribute, changing timestamp handling) to join branch for PR
@gmanatole gmanatole changed the title from "Implement minor changes from Weather branch into for PR" to "Implement minor changes from Weather branch into Auxiliary for PR" Jul 20, 2024
@cazaudo
Member

cazaudo commented Jul 22, 2024

Please format your code using black (poetry run black .).

@cazaudo
Member

cazaudo commented Jul 22, 2024

I cannot see the acoustic features in the saved dataframe, e.g. /home/datawork-osmose/dataset/glider_WHOI_2014_we04/data/auxiliary/59_32000.csv. We just have to set ", acoustic = True)", right?

@cazaudo
Member

cazaudo commented Jul 22, 2024

It is not easy to see how you will make the fcs input parameter settable by the user, but it will be necessary. Maybe by exposing the join_acoustics method to the user? That should bypass automatic_join when it is called from outside? I don't know.

@cazaudo
Member

cazaudo commented Jul 22, 2024

Suggestion: another possible commit implementing a dropna method in the Auxiliary class, i.e. inst.df = inst.df.dropna()?

@cazaudo
Member

cazaudo commented Jul 22, 2024

I am not sure it is a good idea to name a dataframe column with a numerical value (e.g. fcs=[8000] will give a column named 8000, and inst.df.8000 does not work).

@gmanatole
Contributor Author

> I am not sure it is a good idea to name a dataframe column with a numerical value (e.g. fcs=[8000] will give a column named 8000, and inst.df.8000 does not work).

Do you have any suggestions? I don't mind calling inst.df['8000'], which I find intuitive, but we could change the column name.

@gmanatole
Contributor Author

> I cannot see the acoustic features in the saved dataframe, e.g. /home/datawork-osmose/dataset/glider_WHOI_2014_we04/data/auxiliary/59_32000.csv. We just have to set ", acoustic = True)", right?

Modified acoustic join to create full_band acoustic join.

@gmanatole gmanatole closed this Jul 22, 2024
@gmanatole gmanatole reopened this Jul 22, 2024
@cazaudo
Member

cazaudo commented Jul 22, 2024

"Do you have any suggestions? I don't mind calling inst.df['8000'], which I find intuitive, but we could change the column name."

I would use SPL_8000, and SPL_broadband by default.
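
To make the naming proposal concrete, here is a small sketch; it is only an illustration of the suggestion, not code from this PR. The helper names spl_column_name and spl_column_names are hypothetical, and casting frequencies to integers in Hz is an assumption. It also shows why a purely numeric column name cannot be used with attribute access: "8000" is not a valid Python identifier, whereas "SPL_8000" is.

```python
def spl_column_name(fc=None):
    """Return 'SPL_broadband' when no frequency is given, else 'SPL_<fc>'."""
    # Assumption: frequencies are whole numbers of Hz.
    return "SPL_broadband" if fc is None else f"SPL_{int(fc)}"


def spl_column_names(fcs=None):
    """Map a list of frequencies to column names, defaulting to broadband."""
    if not fcs:
        return ["SPL_broadband"]
    return [spl_column_name(fc) for fc in fcs]


# Attribute access such as inst.df.<name> needs a valid identifier.
numeric_name_ok = "8000".isidentifier()       # False
prefixed_name_ok = "SPL_8000".isidentifier()  # True
```

The SPL prefix is just the wording proposed here; any other agreed prefix would work the same way.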

@cazaudo
Member

cazaudo commented Jul 22, 2024

> Modified acoustic join to create full_band acoustic join.

Sorry, the dedicated term is broadband rather than fullband.

@gmanatole
Contributor Author

> "Do you have any suggestions? I don't mind calling inst.df['8000'], which I find intuitive, but we could change the column name."
>
> I would use SPL_8000, and SPL_broadband by default.

@cazaudo Are all npz files storing sound pressure level data? Scipy's welch function returns power spectral densities.
Could we use NL for noise level?

@cazaudo
Member

cazaudo commented Jul 22, 2024

> @cazaudo Are all npz files storing sound pressure level data? Scipy's welch function returns power spectral densities.
> Could we use NL for noise level?

Good point, we need to be more rigorous with our vocabulary. I'll let you look at the gen_spectro method for details on what is saved. If in doubt, talk about it with @GabrielDubus, he is the last contributor to that code.

@GabrielDubus
Contributor

GabrielDubus commented Jul 22, 2024

> > @cazaudo Are all npz files storing sound pressure level data? Scipy's welch function returns power spectral densities.
> > Could we use NL for noise level?
>
> Good point, we need to be more rigorous with our vocabulary. I'll let you look at the gen_spectro method for details on what is saved. If in doubt, talk about it with @GabrielDubus, he is the last contributor to that code.

If self.data_normalization is instrument, and assuming that the input data are in pascals, gen_spectro returns:

  • Pa**2 (scaling = spectrum), which corresponds to the power spectrum
  • Pa**2/Hz (scaling = density), which corresponds to the PSD (power spectral density)

If self.data_normalization is zscore, gen_spectro returns the square of dimensionless values, so "log_spectro = 10 * np.log10(Sxx + (1e-20))" is scaled around 0 dB.
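
A tiny sketch of the decibel conversion quoted above, using the standard-library math module in place of np.log10 so it runs without NumPy. The function name to_db is hypothetical and this is not the gen_spectro code itself.

```python
import math

# Offset from the quoted expression; it keeps log10 defined when Sxx is 0.
EPSILON = 1e-20


def to_db(sxx):
    """Convert a squared-amplitude value (e.g. Pa**2 or Pa**2/Hz) to dB."""
    return 10 * math.log10(sxx + EPSILON)
```

A squared value of 1 maps to about 0 dB, which matches the "scaled around 0 dB" remark for zscore-normalized data, and a value of exactly 0 is floored at about -200 dB by the offset instead of raising an error.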

Anatole GROS-MARTIAL added 3 commits July 22, 2024 21:14

3 participants