🚀 v1.3.0: Dynamic Columns
bmeares authored Oct 3, 2022
2 parents 0f46918 + ca574da commit 1d035a6
Showing 19 changed files with 1,059 additions and 529 deletions.
88 changes: 87 additions & 1 deletion CHANGELOG.md
@@ -1,8 +1,94 @@
# 🪵 Changelog

## 1.3.x Releases

This is the current release cycle, so stay tuned for future releases!

### v1.3.0: Dynamic Columns

**Improvements**

- **Syncing now handles dynamic columns.**
Syncing a pipe with new columns will trigger an `ALTER TABLE` query to append the columns to your table:

```python
import meerschaum as mrsm
pipe = mrsm.Pipe('foo', 'bar', instance='sql:memory')

pipe.sync([{'a': 1}])
print(pipe.get_data())
#    a
# 0  1

pipe.sync([{'b': 1}])
print(pipe.get_data())
#       a     b
# 0     1  <NA>
# 1  <NA>     1
```

If you've specified index columns, you can use this feature to fill in `NULL` values in your table:

```python
import meerschaum as mrsm
pipe = mrsm.Pipe(
    'foo', 'bar',
    columns = {'id': 'id_col'},
    instance = 'sql:memory',
)

pipe.sync([{'id_col': 1, 'a': 10.0}])
pipe.sync([{'id_col': 1, 'b': 20.0}])

print(pipe.get_data())
#    id_col     a     b
# 0       1  10.0  20.0
```
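
To make the mechanism concrete, here is a minimal sketch of the same idea using only the standard-library `sqlite3` module. It is not Meerschaum's implementation: the helper `sync_docs` and the table name `foo_bar` are hypothetical, and the update-then-insert logic is a simplification of a real upsert.

```python
import sqlite3


def sync_docs(conn, table, docs, id_col=None):
    """Hypothetical helper: add unseen columns, then insert or update rows."""
    if not docs:
        return

    # Collect incoming column names in first-seen order.
    incoming = list(dict.fromkeys(key for doc in docs for key in doc))

    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone() is not None

    if not exists:
        cols_sql = ", ".join(f'"{col}"' for col in incoming)
        conn.execute(f'CREATE TABLE "{table}" ({cols_sql})')
    else:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        for col in incoming:
            if col not in existing:
                conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{col}"')

    for doc in docs:
        updated = 0
        values = {k: v for k, v in doc.items() if k != id_col}
        if id_col in doc and values:
            # Fill in columns on the row that already has this index value.
            set_sql = ", ".join(f'"{k}" = ?' for k in values)
            updated = conn.execute(
                f'UPDATE "{table}" SET {set_sql} WHERE "{id_col}" = ?',
                (*values.values(), doc[id_col]),
            ).rowcount
        if not updated:
            # No matching row (or no index column): append a new row.
            cols_sql = ", ".join(f'"{k}"' for k in doc)
            marks = ", ".join("?" for _ in doc)
            conn.execute(
                f'INSERT INTO "{table}" ({cols_sql}) VALUES ({marks})',
                tuple(doc.values()),
            )
    conn.commit()


conn = sqlite3.connect(":memory:")
sync_docs(conn, "foo_bar", [{"id_col": 1, "a": 10.0}], id_col="id_col")
sync_docs(conn, "foo_bar", [{"id_col": 1, "b": 20.0}], id_col="id_col")
rows = conn.execute("SELECT id_col, a, b FROM foo_bar").fetchall()
print(rows)  # [(1, 10.0, 20.0)]
```

Without an index column, the second sync has nothing to match on, so the new document lands in its own row and the older row holds `NULL` for the new column.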

- **Add as many indices as you like.**
In addition to the special index column labels `datetime`, `id`, and `value`, the values of all keys within the `Pipe.columns` dictionary will be treated as indices when creating and updating tables:

```python
import meerschaum as mrsm
indices = {'micro': 'station', 'macro': 'country'}
pipe = mrsm.Pipe('demo', 'weather', columns=indices, instance='sql:memory')

docs = [{'station': 1, 'country': 'USA', 'temp_f': 80.6}]
pipe.sync(docs)

docs = [{'station': 1, 'country': 'USA', 'temp_c': 27.0}]
pipe.sync(docs)

print(pipe.get_data())
#    station country  temp_f  temp_c
# 0        1     USA    80.6    27.0
```

- **Added a default 60-second timeout for pipe attributes.**
All parameter properties (e.g. `Pipe.columns`, `Pipe.target`, `Pipe.dtypes`, etc.) will sync with the instance every 60 seconds. The in-memory attributes will be patched on top of the database values, so your unsaved state won't be lost (persist your state with `Pipe.edit()`). You can change the timeout duration with `mrsm edit config pipes` under the keys `attributes:local_cache_timeout_seconds`. To disable this caching behavior, set the value to `null`.

- **Added custom indices and Pandas data types to the Web UI.**
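
The attribute timeout described above can be pictured with a small standalone sketch. The class `CachedAttributes`, its `fetch` callable, and the injectable `clock` are all hypothetical names for illustration, the merge is a shallow one, and the `null` setting is not modelled here; Meerschaum's own logic may differ.

```python
import time


class CachedAttributes:
    """Hypothetical stand-in for a pipe's attribute cache (not Meerschaum's class)."""

    def __init__(self, fetch, timeout_seconds=60, clock=time.monotonic):
        self._fetch = fetch              # callable returning the stored attributes
        self._timeout = timeout_seconds  # seconds before the stored values are re-read
        self._clock = clock              # injectable so the timeout can be simulated
        self._local = {}                 # unsaved in-memory changes
        self._cached = None              # last values read from the store
        self._last_fetch = None

    def set(self, key, value):
        """Record an in-memory change without persisting it."""
        self._local[key] = value

    def get(self):
        """Return stored values with the in-memory changes patched on top."""
        now = self._clock()
        if self._cached is None or now - self._last_fetch >= self._timeout:
            self._cached = dict(self._fetch())
            self._last_fetch = now
        return {**self._cached, **self._local}


# Simulate the passage of time with a fake clock.
now = [0.0]
store = {"columns": {"id": "id_col"}}
attrs = CachedAttributes(lambda: store, timeout_seconds=60, clock=lambda: now[0])

attrs.set("target", "my_table")       # unsaved local state
first = attrs.get()                   # first read fetches from the store

store["dtypes"] = {"a": "float64"}    # the stored values change elsewhere
now[0] = 30.0
second = attrs.get()                  # still within the timeout: cached values

now[0] = 61.0
third = attrs.get()                   # timeout elapsed: stored values re-read
```

In this sketch the unsaved `target` value survives every refresh because local values are always applied last, which mirrors the "patched on top" behavior described in the entry.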

**Breaking Changes**

- **Removed `None` as default for uninitialized properties for pipes.**
Parameter properties like `Pipe.columns`, `Pipe.parameters`, etc. will now always return a dictionary, even if a pipe is not registered.
- **`Pipe.get_columns()` now sets `error` to `False` by default.**
Pipes are now mostly index-agnostic, so these checks are no longer needed. This downgrades errors in several functions to just warnings, e.g. `Pipe.get_sync_time()`.
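
As a rough illustration of what an `error` flag defaulting to `False` means for callers, the sketch below uses standalone functions rather than the real `Pipe` methods. Both `get_columns` (taking a plain dictionary) and `describe_sync_column` are hypothetical, and the warning text is invented for the example.

```python
import warnings


def get_columns(columns, *keys, error=False):
    """Hypothetical standalone lookup of index columns with an `error` flag."""
    names = []
    for key in keys:
        name = columns.get(key)
        if name is None and error:
            raise KeyError(f"Missing '{key}' column.")
        names.append(name)
    # Mirror the single-value convenience: one key returns one name.
    return names[0] if len(names) == 1 else tuple(names)


def describe_sync_column(columns):
    """Hypothetical caller that warns instead of failing on a missing index."""
    dt_col = get_columns(columns, "datetime")
    if dt_col is None:
        warnings.warn("No datetime column defined; skipping the sync time lookup.")
        return None
    return dt_col


# An empty dictionary is always safe to pass now that properties never return None.
print(get_columns({}, "datetime"))                                          # None
print(get_columns({"datetime": "ts", "id": "station"}, "datetime", "id"))   # ('ts', 'station')
```

Callers that still want the strict behavior can pass `error=True` explicitly.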

**Bugfixes**

- **Always quote tables that begin with underscores in Oracle.**
- **Always refresh the metadata when grabbing `sqlalchemy` tables for pipes to account for dynamic state.**

## 1.2.x Releases

-This is the current release cycle, so future features will be updated below.
+This series brought many industry-ready features, such as the `@make_connector` decorator, improvements to the virtual environment system, the environment variable `MRSM_PLUGINS_DIR`, and much more.

### v1.2.9

- **Added support for Windows junctions for virtual environments.**
This included many changes to fix functionality on Windows. For example, the addition of the `MRSM_PLUGINS_DIR` environment variable broke Meerschaum on Windows, because Windows requires administrator rights to create symlinks.

### v1.2.8

88 changes: 87 additions & 1 deletion docs/mkdocs/news/changelog.md
24 changes: 15 additions & 9 deletions meerschaum/api/dash/pipes.py
@@ -13,6 +13,7 @@
 from meerschaum.utils.typing import List, Optional, Dict, Any, Tuple, Union
 from meerschaum.utils.misc import string_to_dict
 from meerschaum.utils.packages import attempt_import, import_dcc, import_html
+from meerschaum.utils.sql import get_pd_type
 from meerschaum.api import endpoints, CHECK_UPDATE
 from meerschaum.api.dash import (
     dash_app, debug, _get_pipes
@@ -168,13 +169,8 @@ def accordion_items_from_pipe(
             html.Tr([html.Td("Location"), html.Td(f"{pipe.location_key}")]),
             html.Tr([html.Td("Instance"), html.Td(f"{pipe.instance_keys}")]),
         ]
-        if dt_name is not None:
-            overview_rows.append(html.Tr([html.Td("Datetime column"), html.Td(dt_name)]))
-        if id_name is not None:
-            overview_rows.append(html.Tr([html.Td("ID column"), html.Td(id_name)]))
-        if val_name is not None:
-            overview_rows.append(html.Tr([html.Td("Value column"), html.Td(val_name)]))
-
+        for col_key, col in pipe.columns.items():
+            overview_rows.append(html.Tr([html.Td(f"'{col_key}' Index"), html.Td(col)]))

         items_bodies['overview'] = dbc.Table(
             overview_header + [html.Tbody(overview_rows)],
@@ -214,10 +210,20 @@ def accordion_items_from_pipe(

     if 'columns' in active_items:
         try:
-            columns_header = [html.Thead(html.Tr([html.Th("Column"), html.Th("Type")]))]
+            columns_header = [html.Thead(html.Tr([
+                html.Th("Column"), html.Th("DB Type"), html.Th("PD Type")
+            ]))]
             columns_types = pipe.get_columns_types(debug=debug)
+            # combined_columns_types = {
+            #     col: {
+            #         'db': typ,
+            #         'pd': get_pd_type(typ)
+            #     } for col, typ in columns_types.items()
+            # }
             columns_rows = [
-                html.Tr([html.Td(col), html.Td(typ)]) for col, typ in columns_types.items()
+                html.Tr([
+                    html.Td(html.Pre(col)), html.Td(html.Pre(typ)), html.Td(html.Pre(get_pd_type(typ)))
+                ]) for col, typ in columns_types.items()
             ]
             columns_body = [html.Tbody(columns_rows)]
             columns_table = dbc.Table(columns_header + columns_body, bordered=False, hover=True)
3 changes: 3 additions & 0 deletions meerschaum/config/_default.py
@@ -119,6 +119,9 @@
             'id' : None,
         },
     },
+    'attributes' : {
+        'local_cache_timeout_seconds': 60,
+    },
 }
 default_plugins_config = {}
 default_experimental_config = {
2 changes: 1 addition & 1 deletion meerschaum/config/_version.py
@@ -2,4 +2,4 @@
 Specify the Meerschaum release version.
 """

-__version__ = "1.2.9"
+__version__ = "1.3.0"