Releases: bmeares/Meerschaum
v1.5.10
v1.5.8 – v1.5.10
-
Infer JSON columns from the first non-null value.
When determining complex columns (dictionaries or lists), the first non-null value in the dataframe is checked rather than only the first row. This accounts for documents that contain variable keys in the same sync, e.g.:

    import meerschaum as mrsm
    pipe = mrsm.Pipe('a', 'b')
    pipe.sync([
        {'a': {'b': 1}},
        {'c': {'d': 2}},
    ])
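
As a rough illustration of the idea (not Meerschaum's actual implementation), the check can be pictured as scanning the documents for the first non-null value of each column and testing whether it is a dictionary or list. The helper name `find_json_columns` is hypothetical.

```python
from typing import Any, Dict, List

def find_json_columns(docs: List[Dict[str, Any]]) -> List[str]:
    """Return column names whose first non-null value is a dict or list.

    Hypothetical helper for illustration only.
    """
    first_values: Dict[str, Any] = {}
    for doc in docs:
        for col, value in doc.items():
            # Remember only the first non-null value seen for each column.
            if value is not None and col not in first_values:
                first_values[col] = value
    return [
        col for col, value in first_values.items()
        if isinstance(value, (dict, list))
    ]

docs = [
    {'a': {'b': 1}},
    {'c': {'d': 2}},
    {'e': None},
    {'e': 3},
]
json_columns = find_json_columns(docs)
print(json_columns)
```

Looking only at the first row would miss column `c` here, since it first appears in the second document.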
-
Fix a bug when reconstructing JSON columns.
When rebuilding JSON values after merging, a check is first performed to confirm that the value is in fact a string (sometimes `NULLS` slip in).
-
Increase the timeout when determining Python versions.
This fixes some difficult-to-reproduce bugs on Windows.
v1.5.7
-
Replace `ast.literal_eval()` with `json.loads()` when filtering JSON columns.
This patch replaces the use of `str` and `ast.literal_eval()` with `json.dumps()` and `json.loads()` to preserve accuracy.
-
Fix a subtle bug with subprocesses.
The function `run_python_package()` now handles environment passing better and raises a more verbose warning when something goes wrong.
-
Allow columns with `'create'` in the name.
A security measure previously disallowed certain keywords when sanitizing input. Now columns are allowed to contain certain keywords.
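
The release note on `json.loads()` above does not say which values lost accuracy. One plausible case, shown here as an assumption, is text that is valid JSON but not a valid Python literal, which `ast.literal_eval()` cannot parse:

```python
import ast
import json

value = {'a': None, 'ok': True, 'n': 1.5}

# JSON round trip: the serialized text parses back to an equal value.
as_json = json.dumps(value)
assert json.loads(as_json) == value

# JSON text is not a Python literal: `null` and `true` are unknown names,
# so ast.literal_eval() rejects it.
try:
    ast.literal_eval(as_json)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

print(as_json)
print(literal_eval_ok)
```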
v1.5.6
v1.5.3 – v1.5.6
-
Pipes now support syncing dictionaries and lists.
Complex columns (dicts or lists) will now be preserved:

    import meerschaum as mrsm
    pipe = mrsm.Pipe('a', 'b')
    pipe.sync([{'a': {'b': 1}}])
    df = pipe.get_data()
    print(df['a'][0])
    # {'b': 1}

You can also force strings to be parsed by setting the data type to `json`:

    import meerschaum as mrsm
    pipe = mrsm.Pipe(
        'foo', 'bar',
        columns = {'datetime': 'id'},
        dtypes = {'data': 'json', 'id': 'Int64'},
    )
    docs = [{'id': 1, 'data': '{"foo": "bar"}'}]
    pipe.sync(docs)
    df = pipe.get_data()
    print(df['data'][0])
    # {'foo': 'bar'}

For PostgreSQL-like databases (e.g. TimescaleDB), this is stored as `JSONB` under the hood. For all others, it's stored as the equivalent of `TEXT`.
-
Fixed determining the version when installing plugins.
Like the `required` list, the `__version__` string must be explicitly set in order for the correct version to be determined.
-
Automatically cast `postgres` to `postgresql`.
When a `SQLConnector` is built with a flavor of `postgres`, it will be automatically set to `postgresql`.
v1.5.2
v1.5.0 – v1.5.2
-
Pipes may now use integers for the `datetime` column.
If you use an auto-incrementing integer as your primary key, you may now use that column as your pipe's `datetime` column. Just specify the `dtype` as an `Int64`:

    import meerschaum as mrsm
    pipe = mrsm.Pipe(
        'foo', 'bar',
        instance = 'sql:memory',
        columns = {
            'datetime': 'id',
        },
        dtypes = {
            'id': 'Int64',
        },
    )
    pipe.sync([{'id': 1, 'foo': 'bar'}])
    pipe.sync([{'id': 2, 'foo': 'baz'}])

This applies the same incremental range filtering logic as is normally done on the datetime axis.
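
To give a feel for what range filtering on an integer axis means, here is a simplified sketch in plain Python. It assumes the filter bounds the comparison to the range spanned by the incoming rows and then drops rows that already exist; the helper `filter_new_rows` is hypothetical and is not how Meerschaum implements it internally.

```python
from typing import Any, Dict, List

def filter_new_rows(
    existing: List[Dict[str, Any]],
    incoming: List[Dict[str, Any]],
    axis: str = 'id',
) -> List[Dict[str, Any]]:
    """Return the incoming rows that are not already present in `existing`."""
    if not incoming:
        return []
    # Bound the comparison to the range covered by the incoming rows.
    begin = min(row[axis] for row in incoming)
    end = max(row[axis] for row in incoming)
    in_range = [row for row in existing if begin <= row[axis] <= end]
    return [row for row in incoming if row not in in_range]

existing = [{'id': 1, 'foo': 'bar'}]
incoming = [{'id': 1, 'foo': 'bar'}, {'id': 2, 'foo': 'baz'}]
new_rows = filter_new_rows(existing, incoming)
print(new_rows)
```

The same comparison works whether the axis holds timestamps or integers, since both are ordered.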
-
Allow for multiple plugins directories.
You may now set multiple directories for `MRSM_PLUGINS_DIR`. All of the plugins contained in each directory will be symlinked together into a single `plugins` namespace. To do this, just set `MRSM_PLUGINS_DIR` to a JSON-encoded list:

    export MRSM_PLUGINS_DIR='["./plugins_1", "./plugins_2"]'
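
For scripts that need to read the same variable, a value that may be either a single path or a JSON-encoded list can be parsed along these lines. This is a minimal sketch; `parse_plugins_dirs` is a hypothetical helper, and treating any value that starts with `[` as JSON is an assumption.

```python
import json
from pathlib import Path
from typing import List

def parse_plugins_dirs(raw: str) -> List[Path]:
    """Parse a single path or a JSON-encoded list of paths."""
    raw = raw.strip()
    # Assumption: a leading '[' means the value is a JSON-encoded list.
    if raw.startswith('['):
        return [Path(p) for p in json.loads(raw)]
    return [Path(raw)]

dirs = parse_plugins_dirs('["./plugins_1", "./plugins_2"]')
print([p.name for p in dirs])
```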
-
Better Windows support.
At long last, the color issues plaguing Windows users have finally been resolved. Additionally, support for background jobs has been fixed on Windows, though the daemonization library I use is pretty hacky and doesn't make for the smoothest experience. But at least it works now!
-
Fixed unsafe TAR extraction.
A PR about unsafe use of `tar.extractall()` brought this issue to light.
-
Fixed the blank logs bug in `show logs`.
Backtracking a couple lines before following the rest of the logs has been fixed.
-
Requirements may include brackets.
Python packages listed in a plugin's `requirements` list may now include brackets (e.g. `meerschaum[api]`).
-
Enforce 1000 row limit in `SQLConnector.to_sql()` for SQLite.
When inserting rows, the chunksize of 1000 is enforced for SQLite (was previously enforced only for reading).
-
Patch parameters from `--params` in `edit pipes` and `register pipes`.
When editing or registering pipes, the value of `--params` will now be patched into the pipe's parameters. This should be very helpful when scripting.
-
Fixed `edit users`.
This really should have been fixed a long time ago. The action `edit users` was broken due to a stray import left over from a major refactor.
-
Fixed a regex bug when cleaning up packages.
-
Removed `show gui` and `show modules`.
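
The TAR extraction note in this release does not describe the fix itself. A common guard against path traversal in archives, shown here as a general sketch rather than the project's code, is to verify that every member resolves inside the destination before extracting. The helper `safe_extract` is hypothetical, and this check alone does not cover every risk (e.g. symlink members).

```python
import io
import os
import tarfile
import tempfile

def safe_extract(tar: tarfile.TarFile, dest: str) -> None:
    """Extract `tar` into `dest`, refusing members that escape `dest`."""
    dest_root = os.path.realpath(dest)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(dest_root, member.name))
        # A member is unsafe if its resolved path lies outside the destination.
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise ValueError(f"Unsafe path in archive: {member.name}")
    tar.extractall(dest_root)

# Build an in-memory archive with a member that tries to escape the target.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode='w') as archive:
    payload = b'malicious'
    info = tarfile.TarInfo(name='../evil.txt')
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
buffer.seek(0)

rejected = False
with tempfile.TemporaryDirectory() as tmp_dir:
    with tarfile.open(fileobj=buffer, mode='r') as archive:
        try:
            safe_extract(archive, tmp_dir)
        except ValueError:
            rejected = True
print(rejected)
```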
v1.4.14
-
Added flag `temporary` to `Pipe` (and `--temporary`).
Pipes built with `temporary=True` will not create instance tables (`pipes`, `users`, and `plugins`) or be able to modify registration. This is particularly useful when creating pipes from existing tables when automatic registration is not desired.

    import meerschaum as mrsm
    import pandas as pd
    conn = mrsm.get_connector('sql:temp', uri='postgresql://user:pass@localhost:5432/db')

    ### Simulating an existing table.
    table_name = 'my_table'
    conn.to_sql(
        pd.DataFrame([{'id_column': 1, 'value': 1.0}]),
        name = table_name,
    )

    ### Create a temporary pipe with the existing table as its target.
    pipe = mrsm.Pipe(
        'foo', 'bar',
        target = table_name,
        temporary = True,
        instance = conn,
        columns = {
            'id': 'id_column',
        },
    )

    docs = [
        {
            "id_column": 1,
            "value": 123.456,
            "new_column": "hello, world!",
        },
    ]

    ### Existing table `my_table` is synced without creating other tables
    ### or affecting pipes' registration.
    pipe.sync(docs)
-
Fixed a potential security issue with public instance tables.
The API now refuses to sync or serve data if the target is a protected instance table (`pipes`, `users`, or `plugins`).
-
Added not-null check to `pipe.get_sync_time()`.
The `datetime` column should never contain null values, but just in case, `pipe.get_sync_time()` now passes a not-null check to `params` for the datetime column.
-
Removed prompt for `value` from `pipe.bootstrap()`.
The prompt for an optional `value` column has been removed from the bootstrapping wizard because `pipe.columns` is now largely used as a collection of indices rather than for its original purpose of meta-columns.
-
Pass `--debug` and other flags in `copy pipes`.
Command line flags are now passed to the new pipe when copying an existing pipe.
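
To see why the not-null check on `pipe.get_sync_time()` in this release matters, consider computing the latest value of a datetime column in plain Python. This is an illustrative sketch only; the helper `latest_sync_time` is hypothetical and works on in-memory rows rather than a database query.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

def latest_sync_time(rows: List[Dict[str, Any]], dt_col: str) -> Optional[datetime]:
    """Return the newest non-null value of `dt_col`, or None if there is none."""
    # Skip nulls so a stray missing value cannot break the comparison.
    values = [row[dt_col] for row in rows if row.get(dt_col) is not None]
    return max(values) if values else None

rows = [
    {'dt': datetime(2023, 1, 1)},
    {'dt': None},
    {'dt': datetime(2023, 1, 2)},
]
sync_time = latest_sync_time(rows, 'dt')
print(sync_time)
```

Without the filter, `max()` would raise a `TypeError` when comparing `None` with a `datetime`.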
v1.4.13
v1.4.12 – v1.4.13
-
Fixed an issue when syncing empty DataFrames (#95).
When syncing an empty list of documents, `Pipe.filter_existing()` would trigger pulling the entire table into memory. This patch adds a check for whether the dataframe is empty.
-
Allow the `datetime` column to be omitted in the `bootstrap` wizard.
Now that the `datetime` index is optional, the bootstrapping wizard allows users to skip this index.
-
Fixed a small issue when syncing to MySQL.
Due to the addition of MySQL 5.7 support in v1.4.11, a slight edge case arose which broke SQL definitions. This patch fixes MySQL behavior when a `WHERE` clause is present in the definition.
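
The empty-dataframe fix in this release amounts to an early-return guard. A minimal sketch of that pattern, using plain lists instead of dataframes and a hypothetical `fetch_existing` callback in place of the real database read:

```python
from typing import Any, Callable, Dict, List

Rows = List[Dict[str, Any]]

def filter_existing(incoming: Rows, fetch_existing: Callable[[], Rows]) -> Rows:
    """Return rows from `incoming` that are not already stored."""
    # Guard: with nothing to sync, skip the (potentially huge) fetch entirely.
    if not incoming:
        return []
    existing = fetch_existing()
    return [row for row in incoming if row not in existing]

fetch_calls = []

def fetch_all() -> Rows:
    """Stand-in for reading the whole table; records each call."""
    fetch_calls.append(1)
    return [{'id': 1}]

empty_result = filter_existing([], fetch_all)
calls_after_empty = len(fetch_calls)
new_rows = filter_existing([{'id': 1}, {'id': 2}], fetch_all)
print(empty_result, calls_after_empty, new_rows)
```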
v1.4.11
-
Add support for older versions of MySQL.
The `WITH` keyword for CTE blocks was not introduced until MySQL 8.0. This patch uses the older syntax for older versions of MySQL and MariaDB. MySQL 5.7 was added to the test suite.
-
Allow for any iterable in `items_str()`.
If an iterable other than a list is passed to `items_str()`, it will be converted to a list before the string is built:

    from meerschaum.utils.misc import items_str
    print(items_str({'apples': 1, 'bananas': 2}, quotes=False))
    # apples and bananas
-
Fixed an edge case with `datetime` set to `None`.
This patch will ignore the datetime index even if it was set explicitly to `None`.
-
Added `Pipe.children`.
To complement `Pipe.parents`, setting the parameters key `children` to a list of pipes' keys will be treated the same as `Pipe.parents`:

    import meerschaum as mrsm
    pipe = mrsm.Pipe(
        'a', 'b',
        parameters = {
            'children': [
                {
                    'connector': 'a',
                    'metric': 'b',
                    'location': 'c',
                },
            ]
        }
    )
    print(pipe.children)
    # [Pipe('a', 'b', 'c')]
-
Added support for `type:label` syntax in `mrsm.get_connector()`.
The factory function `mrsm.get_connector()` expects the type and label as two arguments, but this patch allows for passing a single string with both arguments:

    import meerschaum as mrsm
    print(mrsm.get_connector('sql:local'))
    # sql:local
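
Splitting such a combined string into its two parts might look like the following. This is a sketch under assumptions: `parse_connector_keys` is a hypothetical helper, and the fallback label `'main'` is assumed for illustration.

```python
from typing import Tuple

def parse_connector_keys(keys: str, default_label: str = 'main') -> Tuple[str, str]:
    """Split 'type:label' into (type, label), falling back to a default label."""
    # partition() splits on the first ':' only, so labels may contain colons.
    type_, _, label = keys.partition(':')
    return type_, (label or default_label)

print(parse_connector_keys('sql:local'))
print(parse_connector_keys('sql'))
```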
-
Fixed more edge case bugs.
For example, converting to `Int64` sometimes breaks with older versions of `pandas`. This patch adds a workaround.
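
Regarding the MySQL note in this release: the text does not show what the "older syntax" looks like. One common way to avoid `WITH` on servers that lack CTE support is a derived-table subquery, sketched below. The helper `wrap_definition` and the alias `src` are hypothetical, and the subquery form is an assumption about the approach.

```python
def wrap_definition(definition: str, supports_cte: bool, alias: str = 'src') -> str:
    """Wrap a pipe's SQL definition so further filters can be applied to it."""
    if supports_cte:
        # Newer servers: wrap the definition in a common table expression.
        return f"WITH {alias} AS (\n{definition}\n)\nSELECT * FROM {alias}"
    # Servers without CTE support: use a derived table instead.
    return f"SELECT * FROM (\n{definition}\n) AS {alias}"

definition = "SELECT id, val FROM my_table WHERE val > 0"
cte_query = wrap_definition(definition, supports_cte=True)
legacy_query = wrap_definition(definition, supports_cte=False)
print(cte_query)
print(legacy_query)
```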
v1.4.10
-
Fixed an issue with syncing background jobs.
The `--name` flag of background jobs was colliding with the `name` keyword argument of `SQLConnector.to_sql()`.
-
Fixed a datetime bounding issue when the `datetime` index is omitted.
If the minimum datetime value of the incoming dataframe cannot be determined, do not bound the `get_data()` request.
-
Keep existing parameters when registering plugin pipes.
When a pipe is registered with a plugin as its connector, the return value of the `register()` function will be patched with the existing in-memory parameters.
-
Fixed a data type syncing issue.
In cases where fetched data types do not match the data types in the pipe's table (e.g. automatic datetime columns), a bug has been patched to ensure the correct data types are enforced.
-
Added `Venv` to the root namespace.
Now you can access virtual environments directly from `mrsm`:

    import meerschaum as mrsm
    with mrsm.Venv('noaa'):
        import pandas as pd
v1.4.9
-
Fixed in-place syncs for aggregate queries.
In-place SQL syncs which use aggregation functions are now handled correctly. This version addresses differences in column types between backtrack and new data. For example, the following query will now be correctly synced:

    WITH days_src AS (
      SELECT *, DATE_TRUNC('day', "datetime") AS days
      FROM plugin_stress_test
    )
    SELECT days, AVG(val) AS avg_value
    FROM days_src
    GROUP BY days
-
Activate virtual environments for custom instance connectors.
All pipe methods now activate virtual environments for custom instance connectors.
-
Improved database connection performance.
Cold connections to a SQL database have been sped up by replacing `sqlalchemy_utils` with handwritten logic (JSON for PostgreSQL-like and SQLite).
-
Fixed an issue with virtual environment verification in a portable environment.
The portable build has been updated to Python 3.9.15, and this patch includes a check to determine the known `site-packages` path for a virtual environment of `None` instead of relying on the default user `site-packages` directory.
-
Fixed some environment warnings when starting the API
v1.4.8
v1.4.5 – v1.4.8
- Bugfixes and stability improvements.
These versions included several bugfixes, such as patching `--skip-check-existing` for in-place syncs and fixing the behavior of `--params` (`build_where()`).