GitHub - BuildingEnergySimulationTools/tide: Visualization and pipeline creation for time series

Pipeline Development and Data Visualization for Time Series in Physical Measurements

Tide is a tool for developing data processing pipelines and visualizing time series data, particularly suited for physical measurements. Key features include:

Efficient Data Management
- Organize and select data using a tagging system
Pipeline Construction
- Store and retrieve pipelines easily with JSON-based dictionary structures
- Build dynamic pipelines that adjust based on the selected data
Interactive Visualization
- Create interactive plots with Plotly to explore data
- Visualize the effect of a pipeline, or of a slice of a pipeline, on the data
Custom Data Enrichment
- Integrate external weather data sources
- Implement autoregressive models for gap filling
- Develop and incorporate custom data processors

Tide uses pandas DataFrames and Series for data handling, bigtree for tags and data selection, and scikit-learn's API for pipeline construction.

Getting started

1- Install ⬇️

pip install python-tide

2- Load and format data 🌲

To begin, load your time series data into a pandas DataFrame, ensuring the index is a DatetimeIndex:

import pandas as pd

df = pd.read_csv(
    "https://raw.githubusercontent.com/BuildingEnergySimulationTools/tide/main/tutorials/getting_started_ts.csv",
    parse_dates=True,
    index_col=0
)

Rename the columns using Tide's tagging system. The format is name__unit__bloc__sub_bloc, with tags separated by double underscores. The order of the tags matters; use "OTHER" as a placeholder for a level you want to skip. A column can carry one or several tags.

df.columns = ["Tin__°C__Building", "Text__°C__Outdoor", "Heat__W__Building"]
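
To make the naming convention concrete, here is a minimal sketch in plain Python that splits a column name into its tag levels. It is not Tide's implementation: the helper `parse_column`, the `TAG_LEVELS` tuple, and the choice to represent missing trailing tags as `None` are assumptions made for illustration only.

```python
# Tag levels in the order given by the naming format name__unit__bloc__sub_bloc.
TAG_LEVELS = ("name", "unit", "bloc", "sub_bloc")


def parse_column(column: str) -> dict:
    """Split a column name on double underscores and label each part.

    Hypothetical helper: missing trailing tags are returned as None.
    """
    parts = column.split("__")
    if len(parts) > len(TAG_LEVELS):
        raise ValueError(f"Too many tags in column name: {column!r}")
    padded = parts + [None] * (len(TAG_LEVELS) - len(parts))
    return dict(zip(TAG_LEVELS, padded))


columns = ["Tin__°C__Building", "Text__°C__Outdoor", "Heat__W__Building"]
parsed = {column: parse_column(column) for column in columns}

for column, tags in parsed.items():
    print(column, "->", tags)
```

Because the position of each tag determines its level, swapping two tags in a column name changes its meaning, which is why the order matters.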

A Plumber object helps with building pipelines and visualizing data:

from tide.plumbing import Plumber

plumber = Plumber(df)

Display the data organization as a tree:

plumber.show()

Select data using tags:

plumber.get_corrected_data("°C")
plumber.get_corrected_data("Building")
plumber.get_corrected_data("Tin")
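
As a rough mental model of which columns each request above refers to, the sketch below filters column names by tag in plain Python. The function `select_columns` is hypothetical and only matches a single exact tag; Tide's own selection relies on bigtree and may support richer requests.

```python
def select_columns(columns, tag):
    """Return the column names that carry `tag` at any level (hypothetical helper)."""
    return [column for column in columns if tag in column.split("__")]


columns = ["Tin__°C__Building", "Text__°C__Outdoor", "Heat__W__Building"]

for tag in ("°C", "Building", "Tin"):
    print(tag, "->", select_columns(columns, tag))
```

A unit tag such as "°C" picks both temperature columns, a bloc tag such as "Building" picks everything measured in the building, and a name tag such as "Tin" picks a single column.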

3- Visualizing data 📈

Show data availability:

plumber.plot_gaps_heatmap(time_step='d')

Plot time series with missing data highlighted:

fig = plumber.plot(plot_gaps=True)
fig.show()

4- Building and testing Pipelines 🛠️

Create a pipeline dictionary:

pipe_dict = {
    "step_1": [["Common_proc_1"], ["Common_proc_2", ["arg1", "arg2"]]],
    "step_2": {
        "selection_1": [["Proc_selection_1", {"arg": "arg_value"}]]
    }
}

Pipeline Rules:

- Describe a pipeline with a dictionary
- Keys name the pipeline steps, e.g. "step_1"
- Step values are either lists (applied to all columns) or dicts (applied to the columns selected by each key)
- Each processing object is listed as [class_name, arguments]
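
The rules above can be sketched as a small walker that flattens a pipeline dictionary into (step, selection, class name, arguments) tuples. This is an illustrative reading of the rules in plain Python, not Tide's parser; the function `describe_pipeline` and the "<all columns>" label are assumptions introduced here.

```python
def describe_pipeline(pipe_dict):
    """Flatten a pipeline dictionary into (step, selection, class_name, arguments) tuples."""
    rows = []
    for step_name, step in pipe_dict.items():
        if isinstance(step, list):
            # A list applies its processors to every column.
            groups = {"<all columns>": step}
        elif isinstance(step, dict):
            # A dict maps a tag selection to the processors for those columns.
            groups = step
        else:
            raise TypeError(f"Step {step_name!r} must be a list or a dict")
        for selection, processors in groups.items():
            for entry in processors:
                class_name = entry[0]
                # Arguments are optional: positional as a list, keyword as a dict.
                arguments = entry[1] if len(entry) > 1 else None
                rows.append((step_name, selection, class_name, arguments))
    return rows


pipe_dict = {
    "step_1": [["Common_proc_1"], ["Common_proc_2", ["arg1", "arg2"]]],
    "step_2": {"selection_1": [["Proc_selection_1", {"arg": "arg_value"}]]},
}

rows = describe_pipeline(pipe_dict)
for row in rows:
    print(row)
```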

Example Pipeline:

- Resample the data to 15-minute intervals
- Interpolate temperature gaps ≤ 3 hours
- Fill larger Tin gaps (3 hours to 3 days) with the autoregressive FillGapsAR processor, configured in the code below with the "Prophet" model

pipe_dict = {
    "resample_15min": [["Resample", ["15min"]]],
    "interpolate_temps": {
        "°C": [["Interpolate", {"gaps_lte": "3h"}]]
    },
    "ar_tin": {
        "Tin": [
            [
                "FillGapsAR",
                {
                    "model_name": "Prophet",
                    "resample_at_td": "1h",
                    "gaps_gte": "3h",
                    "gaps_lte": "3d"
                }
            ]
        ]
    }
}

plumber.pipe_dict = pipe_dict
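
Since the pipeline description contains only dictionaries, lists, and strings, it can be stored as JSON with the standard library. The sketch below writes the dictionary to a file and reads it back; the file name pipeline.json is a hypothetical choice, and a temporary directory is used so the example leaves nothing behind.

```python
import json
import tempfile
from pathlib import Path

pipe_dict = {
    "resample_15min": [["Resample", ["15min"]]],
    "interpolate_temps": {"°C": [["Interpolate", {"gaps_lte": "3h"}]]},
    "ar_tin": {
        "Tin": [
            [
                "FillGapsAR",
                {
                    "model_name": "Prophet",
                    "resample_at_td": "1h",
                    "gaps_gte": "3h",
                    "gaps_lte": "3d",
                },
            ]
        ]
    },
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "pipeline.json"  # hypothetical file name
    # ensure_ascii=False keeps tags such as "°C" readable in the file.
    path.write_text(
        json.dumps(pipe_dict, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    restored = json.loads(path.read_text(encoding="utf-8"))

print(restored == pipe_dict)
print(list(restored))
```

The restored dictionary keeps the steps in their original order and can be assigned to plumber.pipe_dict in the same way as the one defined inline.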

Get the full pipeline with the get_pipeline method:

plumber.get_pipeline(verbose=True)

Get the pipeline for specific columns:

plumber.get_pipeline(select="Building", verbose=True)

Visualize pipeline effects:

plumber.plot(
    steps=None,
    plot_gaps=True,
    steps_2=slice(None, "interpolate_temps"),
    plot_gaps_2=True,
    verbose=True
)

Step Arguments:

- None: no operation (identity)
- str: process up to the named step
- list[str]: perform the specified steps
- slice: process a slice of the pipeline
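
One way to picture these four argument types is as a selection over the ordered list of step names. The sketch below is an interpretation in plain Python, not Tide's code: the function `select_steps` is hypothetical, and it assumes that a named end point (in a str or a slice) is included in the selection, which the text above does not state.

```python
def select_steps(step_names, steps):
    """Return the step names selected by `steps` (hypothetical helper).

    Assumption: named end points are inclusive.
    """
    if steps is None:
        # Identity: no step is applied.
        return []
    if isinstance(steps, str):
        # Everything from the first step up to the named step.
        return step_names[: step_names.index(steps) + 1]
    if isinstance(steps, slice):
        start = 0 if steps.start is None else step_names.index(steps.start)
        stop = (
            len(step_names)
            if steps.stop is None
            else step_names.index(steps.stop) + 1
        )
        return step_names[start:stop]
    # Otherwise treat `steps` as an explicit list of step names.
    unknown = [name for name in steps if name not in step_names]
    if unknown:
        raise KeyError(f"Unknown steps: {unknown}")
    return list(steps)


step_names = ["resample_15min", "interpolate_temps", "ar_tin"]

print(select_steps(step_names, None))
print(select_steps(step_names, "interpolate_temps"))
print(select_steps(step_names, ["ar_tin"]))
print(select_steps(step_names, slice(None, "interpolate_temps")))
print(select_steps(step_names, slice(None)))
```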

Compare the full pipeline to the raw data:

plumber.plot(
    steps=None,
    plot_gaps=True,
    steps_2=slice(None),
    plot_gaps_2=True,
    verbose=True
)
