✨ v2.7.2 Add bytes, enforce, allow autoincrementing datetime index, improve MSSQL indices.
v2.7.0 – v2.7.2
-
Introduce the `bytes` data type.
Instance connectors which support binary data (e.g. `SQLConnector`) may now take advantage of the `bytes` dtype. Other connectors (e.g. `ValkeyConnector`) may use `meerschaum.utils.dtypes.serialize_bytes()` to store binary data as a base64-encoded string.

    import meerschaum as mrsm

    pipe = mrsm.Pipe(
        'demo', 'bytes',
        instance='sql:memory',
        dtypes={'blob': 'bytes'},
    )
    pipe.sync([
        {'blob': b'hello, world!'},
    ])

    df = pipe.get_data()
    binary_data = df['blob'][0]
    print(binary_data.decode('utf-8'))
    # hello, world!

    from meerschaum.utils.dtypes import serialize_bytes, attempt_cast_to_bytes
    df['encoded'] = df['blob'].apply(serialize_bytes)
    df['decoded'] = df['encoded'].apply(attempt_cast_to_bytes)
    print(df)
    #                blob               encoded           decoded
    # 0  b'hello, world!'  aGVsbG8sIHdvcmxkIQ==  b'hello, world!'
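
For connectors without a native binary type, the base64 round trip can be sketched with the standard library alone. This is a minimal illustration of the encoding idea, not the implementation of `serialize_bytes()`; the helper names `to_base64_text` and `from_base64_text` are hypothetical.

```python
import base64


def to_base64_text(data: bytes) -> str:
    """Encode raw bytes as an ASCII-safe base64 string (hypothetical helper)."""
    return base64.b64encode(data).decode('utf-8')


def from_base64_text(text: str) -> bytes:
    """Decode a base64 string back into the original bytes (hypothetical helper)."""
    return base64.b64decode(text)


# Round trip: bytes -> text (safe to store in a string-only backend) -> bytes.
encoded = to_base64_text(b'hello, world!')
decoded = from_base64_text(encoded)
print(encoded)
print(decoded)
```

The encoded form is plain text, so it can be stored anywhere a string can, at the cost of roughly one third more storage than the raw bytes.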
-
Allow for pipes to use the same column for `datetime`, `primary`, and `autoincrement=True`.
Pipes may now use the same column as the `datetime` axis and `primary` with `autoincrement` set to `True`.

    import meerschaum as mrsm

    pipe = mrsm.Pipe(
        'demo', 'datetime_primary_key', 'autoincrement',
        instance='sql:local',
        columns={
            'datetime': 'Id',
            'primary': 'Id',
        },
        autoincrement=True,
    )
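
To show why one column can serve both roles, here is a rough sketch using the standard-library `sqlite3` module rather than Meerschaum itself. The table name `demo` and the `Amount` column are assumptions for illustration, and the DDL is not what Meerschaum generates. The database assigns increasing `Id` values, so the same column uniquely identifies rows and can be range-filtered like a time axis.

```python
import sqlite3

# In-memory database; the schema is illustrative only.
conn = sqlite3.connect(':memory:')
conn.execute(
    "CREATE TABLE demo ("
    "Id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "Amount TEXT)"
)

# Rows are inserted without an Id; the database assigns 1, 2, 3.
conn.executemany(
    "INSERT INTO demo (Amount) VALUES (?)",
    [('1.11',), ('2.22',), ('3.33',)],
)
conn.commit()

rows = conn.execute("SELECT Id, Amount FROM demo ORDER BY Id").fetchall()

# Bounding a query on Id works the same way as bounding on a datetime axis.
recent = conn.execute(
    "SELECT Id FROM demo WHERE Id >= ? ORDER BY Id", (2,)
).fetchall()

print(rows)
print(recent)
```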
-
Only join on `primary` when present.
When the `primary` index is set, the column is used as the primary joining index. This improves performance when syncing tables with a primary key.
-
Add the parameter `enforce`.
The parameter `enforce` (default `True`) toggles data type enforcement behavior. When `enforce` is `False`, incoming data will not be cast to the desired data types. For static datasets where the incoming data is always expected to be of the correct dtypes, it is recommended to set `enforce` to `False` and `static` to `True`.

    from decimal import Decimal
    import meerschaum as mrsm

    pipe = mrsm.Pipe(
        'demo', 'enforce',
        instance='sql:memory',
        enforce=False,
        static=True,
        autoincrement=True,
        columns={
            'primary': 'Id',
            'datetime': 'Id',
        },
        dtypes={
            'Id': 'int',
            'Amount': 'numeric',
        },
    )
    pipe.sync([
        {'Amount': Decimal('1.11')},
        {'Amount': Decimal('2.22')},
    ])

    df = pipe.get_data()
    print(df)
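
The difference between the two modes can be sketched in plain Python. This is a conceptual illustration under stated assumptions, not Meerschaum's enforcement code: the `CASTERS` mapping and the `prepare_rows` function are hypothetical names invented for this example.

```python
from decimal import Decimal
from typing import Any, Callable, Dict, List

# Hypothetical mapping of dtype names to casting functions (illustration only).
CASTERS: Dict[str, Callable[[Any], Any]] = {
    'int': int,
    'numeric': lambda value: Decimal(str(value)),
}


def prepare_rows(
    rows: List[Dict[str, Any]],
    dtypes: Dict[str, str],
    enforce: bool = True,
) -> List[Dict[str, Any]]:
    """Cast each value to its configured dtype, or pass rows through untouched."""
    if not enforce:
        # Trust the caller: skip the per-value casting work entirely.
        return rows
    return [
        {
            col: (CASTERS[dtypes[col]](val) if col in dtypes else val)
            for col, val in row.items()
        }
        for row in rows
    ]


dtypes = {'Id': 'int', 'Amount': 'numeric'}
incoming = [{'Id': '1', 'Amount': '1.11'}]

enforced = prepare_rows(incoming, dtypes, enforce=True)
unenforced = prepare_rows(incoming, dtypes, enforce=False)
print(enforced)
print(unenforced)
```

Skipping the cast saves work on every sync, which is why it suits datasets whose values already arrive with the correct types; with mistyped input, the unenforced path simply passes the wrong types along.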
-
Create the `datetime` axis as a clustered index for MSSQL, even when a `primary` index is specified.
Specifying a `datetime` and `primary` index will create a nonclustered `PRIMARY KEY`. Specifying the same column as both `datetime` and `primary` will create a clustered primary key (tip: this is useful when `autoincrement=True`).
-
Increase the default chunk interval to 43200 minutes.
New hypertables will use a default chunk interval of 30 days (43200 minutes).
-
Virtual environment bugfixes.
Existing virtual environment packages are backed up before re-initializing a virtual environment. This fixes the issue of disappearing dependencies.
-
Store `numeric` as `TEXT` for SQLite and DuckDB.
Due to limited precision, `numeric` columns are now stored as `TEXT`, then parsed into `Decimal` objects upon retrieval.
-
Show the Webterm by default when changing instances.
On the Web Console, changing the instance select will make the Webterm visible.
-
Improve dtype inference.