Releases: getdozer/dozer
v0.1.13
What's Changed
- fix: broken link by @chloeminkyung in #1268
- fixed config file link by @snork-alt in #1269
- Bump jsonwebtoken from 8.2.0 to 8.3.0 by @dependabot in #1265
- chore: rename admin types by @v3g42 in #1276
- chore: Hint `cache_max_map_size` config in error message by @chubei in #1275
- Bump clap from 4.1.8 to 4.1.11 by @dependabot in #1277
- test: Add ny taxi sample to e2e tests by @chubei in #1263
- refactor: Product Processor parallel join execution by @mediuminvader in #1180
- chore: handle errors properly by @v3g42 in #1279
- feat: Resolve sink errors by @karolisg in #1267
- feature: Handle grpc API errors gracefully by @karolisg in #1289
- test: `LocalStorage` basic test by @chubei in #1290
- chore: Log datafusion read error by @chubei in #1293
- Fix: fix cache benchmarking by @karolisg in #1294
- fix: `arrow::Interval` is not supported in data conversion but supported in type conversion by @chubei in #1295
- fix: Derived builder by @mediuminvader in #1297
- chore: Remove unused `id` in `PostgresConnector` by @chubei in #1298
- test: Add `DataReadyConnectorTest` and implement `LocalStorage` by @chubei in #1296
- fix: Fix data conversion issues in local storage by @karolisg in #1301
- chore: refactor SQL operations unit tests with proptest by @chloeminkyung in #1287
- test: Add sql integration tests by @mediuminvader in #1282
- chore: Print field name on schema mismatch by @chubei in #1300
- test: Add postgres connector sample to e2e tests by @chubei in #1278
- chore: Check map size before creating lmdb environment by @chubei in #1303
- fix: Use cdc type "nothing" when no PK defined by @karolisg in #1307
- chore: Ignore sql join sample integration test because of lmdb crash by @chubei in #1308
- test: Add postgres data type test by @chubei in #1299
- fix: postgres `BIT` is supported in type conversion but not value conversion by @chubei in #1305
- Prepare v0.1.13 by @mediuminvader in #1306
Full Changelog: v0.1.12...v0.1.13
v0.1.12
What's Changed
- feat: Asynchronous indexing by @chubei in #1206
- feat: Add schema support to postgres connector by @karolisg in #1070
- feat: SQL static typing validation by @snork-alt in #1208
- feat: Add object store validation by @hoangnh93 in #1140
- feat: Introduce sqllogictest for dozer by @xudong963 in #1214
- refactor: Finalize `Connector` interface by @chubei in #1233
- refactor: Average aggregator by @chloeminkyung in #1262
- test: Run e2e test locally on PR by @chubei in #1239
- fix: Lmdb environments should use the same map size by @chubei in #1216
- fix: Restore Docker build by @mediuminvader in #1218
- fix: broken e2e tests by various PRs by @chubei in #1235
- fix: Fix postgres validation with schemas usage by @karolisg in #1243
- fix: Don't panic if operation cannot be read from indexing thread. by @chubei in #1247
Full Changelog: v0.1.11...v0.1.12
v0.1.11
New Features
- feat: Switch to new cache if it has processed enough operations by @chubei in #1092
- feat: implement arrow format for grpc ingestion by @v3g42 in #1087
- feat: Add `LmdbMap` by @chubei in #1151
- feat: Add `LmdbMultimap` by @chubei in #1153
- feat: support deltalake connector by @xudong963 in #1067
- feat: `SET` operators `record_map` into `CountingBloomFilter` by @chloeminkyung in #1168
- feat: implement `HAVING` by @snork-alt in #1198
- feat: Window processor by @mediuminvader in #1175
- feat: Add operation log to cache (never cleaned up now) by @chubei in #1172
- feat: implement date extract by @v3g42 in #1178
Bug Fixes
- fix: Use same progress bar style (except for color) for source and sink by @chubei in #1100
- fix: updated download_binary by @chloeminkyung in #1107
- fix: Sink should only be built after all source checkpoints are checked by @chubei in #1112
- fix: Blue cache was counted with limit. Green cache didn't get picked up. by @chubei in #1123
- fix: Define unique temp folder for aggregation tests by @mediuminvader in #1124
- fix: GROUP BY without measures was handled incorrectly by @snork-alt in #1127
- fix: add postgres columns validations and connector test by @hoangnh93 in #1035
- fix: Ingestion via Arrow by @v3g42 in #1141
- fix: `load_multi_sources` test was not implemented correctly by @chubei in #1146
- fix: Remove table name property usage in deltalake connector by @karolisg in #1155
- fix: Fix compilation error introduced in #1158 by @chubei in #1183
- fix: Add sources names used in the query by @mediuminvader in #1190
- fix: PK was not automatically set when `GROUP BY` is used by @snork-alt in #1193
New Contributors
- @readall made their first contribution in #1122
- @hoangnh93 made their first contribution in #1035
Full Changelog: v0.1.10...v0.1.11
v0.1.10
New Features
- Blue/Green Cache swap
- Timestamp difference
- Improved transaction performance
- Introduce DAY_OF_WEEK() date function
- Extend SQL type support for math operations
- Show sources ingestion progress
- Parallelized ingestion of Postgres snapshot data
Bug Fixes
- Fix `timestamp`, `point`, `decimal` support for grpc
- Fix use of wildcard in select
- Fix convert from jsonb to dozer type
- Fix check of supported tables in postgres
- Clean dozer on start because blue/green cache is not switched correctly
- Fix columns order in schema
- Primary key field conversion from REST url
- Sort schemas by sources order in dozer config
Full Changelog: v0.1.9...v0.1.10
v0.1.9
Bug Fixes
- Fix the replication slot check
- Remove non-existent proto include
- Fix insert operation in Record Readers
Full Changelog: v0.1.8...v0.1.9
v0.1.8
New Features
- Support UNION and UNION ALL set operations
- Support Geographical Point type and Distance
- Return Record id and version in REST APIs, gRPC endpoints and gRPC events
- Support `decimal` and `binary` for gRPC
- Support $after expression in pagination
- Specify LmdbCacheManager and endpoint as an alias
- Generating the Graph of the Pipeline
- Expose Pipeline configuration
Bug Fixes
- Datafusion connector compilation
- Support mathematical operations for UINT
- `TypedService` didn't filter events based on endpoint
- Fix `ls` command of the Object Storage connector
- Add missing types.proto reference
- Use correct path builder for local storage
- Fix `eth-traces` e2e test
- Health check should not return `NotServing` when `dynamic` is `false`
- Config in e2e test `flags_dynamic_false`
- Fix `release.yml` indentation
- Add epsilon to distance test case
- Don't create default connector config
- Remove leaked in `eth_union` e2e case
- Eth blocks parse fails
- Remove `RwCache::insert_schema` so we don't read the db on every insert
- Invalid column error
- Fix snowflake table name validation
- Fix snowflake schema helper import
Full Changelog: v0.1.7...v0.1.8
v0.1.7
New Features
- New DataFusion Connector
- Simplified `Dag` interface
- All nodes use the same instant for commit timeout
- SQL transformations are common across sources
- `LmdbRoCache` and `LmdbRwCache` implement `Send` and `Sync`
- Stateful pipeline for Sources without Primary Key
- Separation of Dag execution and Dag construction
Bug fixes
- Fix record delete when duplicated words are in string field
- Fix usage of decimal with trailing zeros in secondary index
- Fix e2e tests and hanging epoch manager test
- Rephrase error when primary key is missing
- buildkite integration test for eth e2e case
Full Changelog: v0.1.6...v0.1.7
v0.1.6
New Features
- Support Wildcard expression in Select clause
- Expression parsing supports nested aggregations
- Implement Continue Ingestion and Stream creation in Snowflake
- Implement direct insert to cache pipeline
- Support Multi-way JOIN
- Improved DAG structure with `daggy`
- Automatically trim record history in `RecordWriter`
- Only persist checkpoint states of relevant sources.
- Add sql_date support to snowflake integration
- adding commit SHA for ECR image tag for release
- Change app config to support source name in api endpoints
Bug fixes
- Mapping of sqlite data type to dozer data type
- `CONCAT()` to support multiple arguments
- Unify typed service `query` and `on_event` record representation
- Implement record versioning in `RecordWriter` and `RecordReader`
- Fix minor snowflake connector issues
- Continue REPL execution after unknown command
- Forbid duplicated cte names
- Insert after delete problem in aggregation
- Use context to solve alias properly
- Fix snowflake connector
- Unable to clean untagged images
- Fix source with empty columns list validation
- `serde_str` is needed to bincode deserialize `Decimal`
- Don't commit when no source produces new op
- Fix stream exist check for tables
- Architecture diagram
- Fix snowflake stream consumption
- Ethereum end to end test
- `ApiEndpoint` is not serializing all fields
- Integration tests required `dozer-tests` image when using local runner
- Clear PK from Projection output
Full Changelog: v0.1.5...v0.1.6
v0.1.5
New Features
- Support CAST operator
- Support UInt aggregation
- Implement continuous ingestion for Ethereum
- Implement nested queries and CTE
- Support bpchar column type from Postgres
Bug fixes
- Fixed Ethereum E2E test
- Update primary key mismatch error message
- Sequence number reset if no data is sent from source
- Count APIs default to no limit
- Fixed on_event behavior
- Fixed Discord notification block
- Dozer-api test utils generating inconsistent schema and record
- Schema id was not copied to records before inserting
- Allow text fields being used in full text index
- Generate full text index for `String` fields
- Ignore unused sources for pipeline
- Copy schema id to records before letting cache handle it
- Fix sql parsing panic and statements being silently ignored
- Need `ignore_pipeline_branch_filters` to trigger when pushes are disabled
- OnEvent is not propagating data in some scenarios with postgres
- Don't pass commit message to BuildKite.
- Update issue template config with parent item
New Contributors
- @xudong963 made their first contribution in #686
Full Changelog: v0.1.4...v0.1.5
Development Release - dev
Bug Fixes
- Improve performance of date parsing (#2306)
- Remove `ApiEndpoint::name` to be consistent with other kinds of sinks (#2316)
- Log aerospike sink error and increase batch timeout (#2336)
- Dozer-core now propagates recoverable errors instead of panicking (#2348)
- Don't bail out if log manager can't be started (#2365)
- Read full message in aerospike tcp server (#2368)
- Fix table indexes usage in aerospike connector (#2370)
- Avoid sending repeated messages when error happens in Oracle log miner (#2373)
- Fix oracle connector update parser (#2386)
- `i8` -> `c_char` (#2410)
- Fix dummy sink performance output (#2411)
- Oracle source skips uncommitted operations that are before another commit (#2413)
- Oracle source replication dead loop (#2416)
- Oracle redo log can have NULL REDO_SQL (#2418)
- Aggregate transactions during replication to avoid losing part of long running transactions (#2423)
Features
- Enable postgres continue ingestion (#2290)
- Snowflake sink (#2266)
- Support different kinds of endpoints (#2294)
- Implement batch processing for stateless processors (#2319)
- Continue mysql ingestion (#2322)
- Webhook-connector (#2355)
- Clickhouse sink (#2353)
- Oracle connector snapshotting (#2359)
- Oracle replication (#2361)
- Aerospike ingestion (#2366)
- Use http1 server for aerospike (#2374)
- Dozer ui and dozer ui update (#2376)
- Stop ingestion after number of events in dummy sink (#2394)
- Print average latency in dummy sink (#2395)
- Allow aerospike sink to handle multiple tables (#2393)
- Support integer on oracle ingestion/sink (#2419)
Fix
- Revert oracle resumability and fix snapshot data insert (#2425)
Miscellaneous Tasks
- Move `last_checkpoint` from `Source::start` to `SourceFactory::build` (#2264)
- Add latency measuring code in dummy sink (#2318)
- Disable stateful output port (#2330)
- Adapt UI for multiple tables in 1 sink (#2408)
- Add dev container (#2412)
- Stop using larger runners (#2417)
- Add oracle instant client library to docker image (#2421)
- Clean up `dozer_sql::builder` (#2424)
- Remove clones and duplicated code in `dozer_sql::builder` (#2428)
- Update version number to 0.4.0 (#2432)
Refactor
- Change `OpIdentifier` to `RestartableState` (#2263)
- Asyncify `Source` (#2329)
- Make source state source wide instead of per table (#2344)
- Send source state and op id to sink (#2354)
- Make connector methods take unique reference (#2358)
- Remove dozer-lambda (#2380)
- Respect source transaction boundaries (#2335)
- Handle schema and types in aerospike (#2375)
- Remove dozer-api (#2388)
- Guarantee operation order inside a transaction (#2389)
- Remove dozer-cache (#2390)
Support
Contact us at https://getdozer.io