27 Feb 06:20

bmeares

6ac40e0

v1.5.10

v1.5.8 – v1.5.10

Infer JSON columns from the first non-null value.
When determining complex columns (dictionaries or lists), the first non-null value of the dataframe is checked rather than the first row only. This accounts for documents which contain variable keys in the same sync, e.g.:
```
import meerschaum as mrsm
pipe = mrsm.Pipe('a', 'b')
pipe.sync([
    {'a': {'b': 1}},
    {'c': {'d': 2}},
])
```
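The idea of checking the first non-null value rather than the first row can be sketched in plain Python. This is only an illustration: `find_json_columns` is a hypothetical helper, not Meerschaum's implementation, and it works on a list of documents instead of a dataframe.

```python
from typing import Any, Dict, List

def find_json_columns(docs: List[Dict[str, Any]]) -> List[str]:
    """Return the keys whose first non-null value is a dict or list."""
    first_values: Dict[str, Any] = {}
    for doc in docs:
        for key, value in doc.items():
            # Remember only the first non-null value seen for each key.
            if key not in first_values and value is not None:
                first_values[key] = value
    return [
        key for key, value in first_values.items()
        if isinstance(value, (dict, list))
    ]

docs = [
    {'a': {'b': 1}},
    {'c': {'d': 2}},
]
print(find_json_columns(docs))
# ['a', 'c']
```

Looking only at the first row would miss `c` here, because that key does not appear until the second document.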
Fix a bug when reconstructing JSON columns.
When rebuilding JSON values after merging, each value is first checked to confirm that it is in fact a string (sometimes NULLs slip in).
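A minimal sketch of such a guard, assuming the serialized values are JSON strings. The helper name `rebuild_json_value` is hypothetical and is not taken from the Meerschaum codebase.

```python
import json
from typing import Any

def rebuild_json_value(value: Any) -> Any:
    """Parse JSON strings; pass through anything else (e.g. None) untouched."""
    if not isinstance(value, str):
        return value
    return json.loads(value)

print(rebuild_json_value('{"b": 1}'))
# {'b': 1}
print(rebuild_json_value(None))
# None
```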
Increase the timeout when determining Python versions.
This fixes some difficult-to-reproduce bugs on Windows.

13 Feb 23:17

bmeares

v1.5.7

ca90650

Replace ast.literal_eval() with json.loads() when filtering JSON columns.
This patch replaces the use of str and ast.literal_eval() with json.dumps() and json.loads() to preserve accuracy.
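The difference is easy to reproduce with the standard library alone. The snippet below is an illustration rather than the patched Meerschaum code: it shows that a JSON string round-trips through `json.loads()`, while `ast.literal_eval()` rejects JSON's lowercase `true` and `null`.

```python
import ast
import json

doc = {'ok': True, 'missing': None}

# json.dumps() writes true/null, which json.loads() reads back unchanged.
as_json = json.dumps(doc)
round_tripped = json.loads(as_json)

# ast.literal_eval() only understands Python literals, so JSON's
# lowercase true/null are rejected.
try:
    ast.literal_eval(as_json)
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False

print(as_json)
# {"ok": true, "missing": null}
print(round_tripped == doc, literal_eval_ok)
# True False
```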
Fix a subtle bug with subprocesses.
The function run_python_package() now better handles environment passing and raises a more verbose warning when something goes wrong.
Allow columns with 'create' in the name.
A security measure previously disallowed certain keywords when sanitizing input. Column names may now contain such keywords (e.g. 'create').

16 Jan 21:22

bmeares

v1.5.6

a3202fe

v1.5.6

v1.5.3 – v1.5.6

Pipes now support syncing dictionaries and lists.
Complex columns (dicts or lists) will now be preserved:

```
import meerschaum as mrsm
pipe = mrsm.Pipe('a', 'b')
pipe.sync([{'a': {'b': 1}}])
df = pipe.get_data()
print(df['a'][0])
# {'b': 1}
```

You can also force strings to be parsed by setting the data type to json:

```
import meerschaum as mrsm
pipe = mrsm.Pipe(
    'foo', 'bar',
    columns = {'datetime': 'id'},
    dtypes = {'data': 'json', 'id': 'Int64'},
)
docs = [{'id': 1, 'data': '{"foo": "bar"}'}]
pipe.sync(docs)
df = pipe.get_data()
print(df['data'][0])
# {'foo': 'bar'}
```

For PostgreSQL-like databases (e.g. TimescaleDB), this is stored as JSONB under the hood. For all others, it's stored as the equivalent of TEXT.

Fixed determining the version when installing plugins.
Like the required list, the __version__ string must be explicitly set in order for the correct version to be determined.
Automatically cast postgres to postgresql.
When a SQLConnector is built with a flavor of postgres, it will be automatically set to postgresql.

09 Jan 09:53

bmeares

v1.5.2

c773463

v1.5.2

v1.5.0 – v1.5.2

Pipes may now use integers for the datetime column.
If you use an auto-incrementing integer as your primary key, you may now use that column as your pipe's datetime column. Just specify the dtype as Int64:

```
import meerschaum as mrsm
pipe = mrsm.Pipe(
    'foo', 'bar',
    instance = 'sql:memory',
    columns = {
        'datetime': 'id',
    },
    dtypes = {
        'id': 'Int64',
    },
)
pipe.sync([{'id': 1, 'foo': 'bar'}])
pipe.sync([{'id': 2, 'foo': 'baz'}])
```

This applies the same incremental range filtering logic as is normally done on the datetime axis.

Allow for multiple plugins directories.
You may now set multiple directories for MRSM_PLUGINS_DIR. All of the plugins contained in each directory will be symlinked together into a single plugins namespace. To do this, just set MRSM_PLUGINS_DIR to a JSON-encoded list:
```
export MRSM_PLUGINS_DIR='["./plugins_1", "./plugins_2"]'
```
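To show what a JSON-encoded list means in practice, here is a rough sketch of how such a variable could be read. `parse_plugins_dirs` is a hypothetical helper written for this example; the fallback to a single plain path is an assumption, and Meerschaum's own parsing may differ.

```python
import json
import os
from pathlib import Path
from typing import List, Optional

def parse_plugins_dirs(raw: Optional[str]) -> List[Path]:
    """Interpret a value as a JSON list of paths, else as one plain path."""
    if not raw:
        return []
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    # Assumption: anything that is not a JSON list is a single directory.
    if not isinstance(parsed, list):
        parsed = [raw]
    return [Path(str(path)) for path in parsed]

print(parse_plugins_dirs('["./plugins_1", "./plugins_2"]'))
print(parse_plugins_dirs(os.environ.get('MRSM_PLUGINS_DIR')))
```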
Better Windows support.
At long last, the color issues plaguing Windows users have finally been resolved. Additionally, support for background jobs has been fixed on Windows, though the daemonization library I use is pretty hacky and doesn't make for the smoothest experience. But at least it works now!
Fixed unsafe TAR extraction.
A PR about unsafe use of tar.extractall() brought this issue to light.
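For context, the usual risk with `tar.extractall()` is an archive member whose name (such as `../evil.txt`) resolves outside the destination directory. The sketch below shows one common guard. It is not the fix that was merged into Meerschaum; `safe_extract` is a hypothetical name, and the check covers member names only, not symlink targets.

```python
import os
import tarfile

def safe_extract(tar: tarfile.TarFile, dest: str) -> None:
    """Extract `tar` into `dest`, refusing members that escape `dest`."""
    dest_root = os.path.realpath(dest)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(dest_root, member.name))
        # A member is acceptable only if its resolved path stays under dest_root.
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise ValueError(
                f"Refusing to extract '{member.name}' outside of '{dest_root}'."
            )
    tar.extractall(dest_root)
```

On Python 3.12 and newer, passing `filter='data'` to `extractall()` gives a stricter built-in check.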
Fixed the blank logs bug in show logs.
Backtracking a couple lines before following the rest of the logs has been fixed.
Requirements may include brackets.
Python packages listed in a plugin's requirements list may now include brackets (e.g. meerschaum[api]).
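As an illustration of what the bracket syntax carries, the following sketch splits a requirement string into its package name and extras with a regular expression. `split_requirement` is a hypothetical helper, not how Meerschaum parses requirements, and it ignores version specifiers.

```python
import re
from typing import List, Tuple

# Package name, optionally followed by a bracketed, comma-separated extras list.
_REQUIREMENT_PATTERN = re.compile(r'^\s*([A-Za-z0-9._-]+)\s*(?:\[([^\]]*)\])?')

def split_requirement(requirement: str) -> Tuple[str, List[str]]:
    """Return (package_name, extras) for strings like 'meerschaum[api]'."""
    match = _REQUIREMENT_PATTERN.match(requirement)
    if match is None:
        raise ValueError(f"Could not parse requirement: {requirement!r}")
    name, extras = match.group(1), match.group(2)
    extras_list = [
        extra.strip() for extra in (extras or '').split(',') if extra.strip()
    ]
    return name, extras_list

print(split_requirement('meerschaum[api]'))
# ('meerschaum', ['api'])
```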
Enforce 1000 row limit in SQLConnector.to_sql() for SQLite.
When inserting rows, the chunksize of 1000 is enforced for SQLite (was previously enforced only for reading).
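Chunked inserts can be sketched with the standard `sqlite3` module. This only illustrates writing rows in batches of 1000; it does not reproduce `SQLConnector.to_sql()`, and the table and helper names are made up for the example.

```python
import sqlite3
from typing import Iterator, List, Tuple

CHUNKSIZE = 1000

def chunked(rows: List[Tuple], size: int) -> Iterator[List[Tuple]]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE example (id INTEGER, val TEXT)')

rows = [(i, str(i)) for i in range(2500)]
num_chunks = 0
for chunk in chunked(rows, CHUNKSIZE):
    # Each executemany() call inserts at most CHUNKSIZE rows.
    conn.executemany('INSERT INTO example (id, val) VALUES (?, ?)', chunk)
    num_chunks += 1
conn.commit()

print(num_chunks)
# 3
```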
Patch parameters from --params in edit pipes and register pipes.
When editing or registering pipes, the value of --params will now be patched into the pipe's parameters. This should be very helpful when scripting.
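Patching here means merging the new keys into the existing parameters rather than replacing them wholesale. A minimal sketch of a recursive patch follows; `apply_patch` and the sample dictionaries are hypothetical, and Meerschaum's actual merge rules may differ.

```python
import copy
from typing import Any, Dict

def apply_patch(base: Dict[str, Any], patch: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `base` with `patch` merged in, recursing into dicts."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested dictionaries instead of overwriting them.
            result[key] = apply_patch(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result

parameters = {'columns': {'datetime': 'id'}, 'tags': ['a']}
params_flag = {'columns': {'id': 'station'}, 'fetch': {'backtrack_minutes': 60}}
print(apply_patch(parameters, params_flag))
```

Existing keys such as `columns.datetime` and `tags` survive, while the new keys are added alongside them.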
Fixed edit users.
This really should have been fixed a long time ago. The action edit users was broken due to a stray import left over from a major refactor.
Fixed a regex bug when cleaning up packages.
Removed show gui and show modules.

05 Dec 04:28

bmeares

v1.4.14

c3094a7

Added flag temporary to Pipe (and --temporary).
Pipes built with temporary=True will not create instance tables (pipes, users, and plugins) or be able to modify registration. This is particularly useful when creating pipes from existing tables when automatic registration is not desired.

```
import meerschaum as mrsm
import pandas as pd
conn = mrsm.get_connector('sql:temp', uri='postgresql://user:pass@localhost:5432/db')

### Simulating an existing table.
table_name = 'my_table'
conn.to_sql(
    pd.DataFrame([{'id_column': 1, 'value': 1.0}]),
    name = table_name,
)

### Create a temporary pipe with the existing table as its target.
pipe = mrsm.Pipe(
    'foo', 'bar',
    target = table_name,
    temporary = True,
    instance = conn,
    columns = {
        'id': 'id_column',
    },
)

docs = [
    {
        "id_column": 1,
        "value": 123.456,
        "new_column": "hello, world!",
    },
]

### Existing table `my_table` is synced without creating other tables
### or affecting pipes' registration.
pipe.sync(docs)
```

Fixed a potential security issue with public instance tables.
The API now refuses to sync or serve data if the target is a protected instance table (pipes, users, or plugins).
Added not-null check to pipe.get_sync_time().
The datetime column should never contain null values, but just in case, pipe.get_sync_time() now passes a not-null check to params for the datetime column.
Removed prompt for value from pipe.bootstrap().
The prompt for an optional value column has been removed from the bootstrapping wizard because pipe.columns is now largely used as a collection of indices rather than the original purpose of meta-columns.
Pass --debug and other flags in copy pipes.
Command line flags are now passed to the new pipe when copying an existing pipe.

22 Nov 01:56

bmeares

v1.4.13

edc4d23

v1.4.13

v1.4.12 – v1.4.13

Fixed an issue when syncing empty DataFrames (#95).
When syncing an empty list of documents, Pipe.filter_existing() would trigger pulling the entire table into memory. This patch adds a check for whether the dataframe is empty.
Allow the datetime column to be omitted in the bootstrap wizard.
Now that the datetime index is optional, the bootstrapping wizard allows users to skip this index.
Fixed a small issue when syncing to MySQL.
Due to the addition of MySQL 5.7 support in v1.4.11, a slight edge case arose which broke SQL definitions. This patch fixes MySQL behavior when a WHERE clause is present in the definition.

19 Nov 04:18

bmeares

v1.4.11

a9200a7

Add support for older versions of MySQL.
The WITH keyword for CTE blocks was not introduced until MySQL 8.0. This patch uses the older syntax for older versions of MySQL and MariaDB. MySQL 5.7 was added to the test suite.
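The two query shapes can be compared with a small sketch. Here the "older syntax" is assumed to be a derived table (a subquery in the FROM clause); `wrap_definition`, the alias `src`, and the `supports_cte` flag are hypothetical names for illustration and do not come from the Meerschaum source.

```python
def wrap_definition(definition: str, where: str, supports_cte: bool) -> str:
    """Wrap a SQL definition so that a WHERE clause can be applied to it."""
    definition = definition.strip().rstrip(';')
    if supports_cte:
        # MySQL 8.0+ and most other flavors accept a WITH block.
        return (
            f"WITH src AS (\n{definition}\n)\n"
            f"SELECT * FROM src\nWHERE {where}"
        )
    # Assumed fallback for versions without CTE support: a derived table.
    return f"SELECT * FROM (\n{definition}\n) AS src\nWHERE {where}"

print(wrap_definition("SELECT * FROM my_table", "id > 1", supports_cte=False))
```

Both forms select from the same definition; only the wrapping differs.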
Allow for any iterable in items_str().
If an iterable other than a list is passed to items_str(), it will convert to a list before building the string:
```
from meerschaum.utils.misc import items_str
print(items_str({'apples': 1, 'bananas': 2}, quotes=False))
# apples and bananas
```
Fixed an edge case with datetime set to None.
This patch will ignore the datetime index even if it was set explicitly to None.

Added Pipe.children.
To complement Pipe.parents, setting the parameters key children to a list of pipes' keys will be treated the same as Pipe.parents:

```
import meerschaum as mrsm
pipe = mrsm.Pipe(
    'a', 'b',
    parameters = {
        'children': [
            {
                'connector': 'a',
                'metric': 'b',
                'location': 'c',
            },
        ]
    }
)
print(pipe.children)
# [Pipe('a', 'b', 'c')]
```

Added support for type:label syntax in mrsm.get_connector().
The factory function mrsm.get_connector() expects the type and label as two arguments, but this patch allows for passing a single string with both arguments:
```
import meerschaum as mrsm
print(mrsm.get_connector('sql:local'))
# sql:local
```
Fixed more edge case bugs.
For example, converting to Int64 sometimes breaks with older versions of pandas. This patch adds a workaround.

18 Nov 03:45

bmeares

v1.4.10

656279f

Fixed an issue with syncing background jobs.
The --name flag of background jobs was colliding with the name keyword argument of SQLConnector.to_sql().
Fixed a datetime bounding issue when datetime index is omitted.
If the minimum datetime value of the incoming dataframe cannot be determined, do not bound the get_data() request.
Keep existing parameters when registering plugin pipes.
When a pipe is registered with a plugin as its connector, the return value of the register() function will be patched with the existing in-memory parameters.
Fixed a data type syncing issue.
In cases where fetched data types do not match the data types in the pipe's table (e.g. automatic datetime columns), a bug has been patched to ensure the correct data types are enforced.
Added Venv to the root namespace.
Now you can access virtual environments directly from mrsm:
```
import meerschaum as mrsm

with mrsm.Venv('noaa'):
    import pandas as pd
```

13 Nov 04:59

bmeares

v1.4.9

e3c3cf3

Fixed in-place syncs for aggregate queries.
In-place SQL syncs which use aggregation functions are now handled correctly. This version addresses differences in column types between backtrack and new data. For example, the following query will now be correctly synced:
```
WITH days_src AS (
  SELECT *, DATE_TRUNC('day', "datetime") AS days
  FROM plugin_stress_test
)
SELECT days, AVG(val) AS avg_value
FROM days_src
GROUP BY days
```
Activate virtual environments for custom instance connectors.
All pipe methods now activate virtual environments for custom instance connectors.
Improved database connection performance.
Cold connections to a SQL database have been sped up by replacing sqlalchemy_utils with handwritten logic (JSON for PostgreSQL-like and SQLite).
Fixed an issue with virtual environment verification in a portable environment.
The portable build has been updated to Python 3.9.15, and this patch includes a check to determine the known site-package path for a virtual environment of None instead of relying on the default user site-packages directory.
Fixed some environment warnings when starting the API.

04 Nov 04:21

bmeares

v1.4.8

e783609

v1.4.8

v1.4.5 – v1.4.8

Bugfixes and stability improvements.
These versions included several bugfixes, such as patching --skip-check-existing for in-place syncs and fixing the behavior of --params (build_where()).

Releases: bmeares/Meerschaum
