Is it possible to check which records have been updated when liveQuery runs? #1792
Currently it's only possible to observe the query results, and if they are encrypted, the encrypted items are what will be observed. Decryption will have to happen on all items whenever a new item is added or updated (unless you solve it with some caching outside Dexie). With the new cache/optimistic-updates functionality in 4.x, we could find a way to optimize this by offering some kind of hooks into the optimistic updates engine.
One side note: if using the option …
Ok thanks for the response, this WeakMap idea sounds similar to what I had in mind. I was thinking of storing the decrypted query results in a new variable, and then each time the Observable updates I would loop over all the records and check whether each record ID is included in the decrypted array. If it is, I do nothing; if it isn't, I decrypt the record and add it to the array. But I was worried this might be very inefficient (looping over 1000s of records to find 1 new one), especially if there are many records and they update every few seconds. I think I may run into CPU issues if I run this expensive loop every few seconds. Do you think this would be a problem? If …
Well, there might be ways to avoid looping over the records, but it's almost a no-op to loop over thousands of records just to check each one against a WeakMap. In a snippet I tested, I looped over one million records, checking each against a WeakMap: it took 11 milliseconds on my laptop, and that was with the WeakMap also filled with a million objects.
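The original snippet was lost from this thread; a minimal sketch of what such a benchmark might look like follows (the `records` array, record shape, and cached value are my own assumptions, not the original code):

```javascript
// Hypothetical reconstruction of the benchmark described above:
// fill a WeakMap with a million objects, then loop over all the
// records checking membership. WeakMap lookups are O(1), so the
// whole loop is cheap even at this scale.

const records = [];
const cache = new WeakMap();

for (let i = 0; i < 1_000_000; i++) {
  const rec = { id: i };
  records.push(rec);
  cache.set(rec, { decrypted: true }); // stand-in for a decrypted copy
}

let hits = 0;
const t0 = Date.now();
for (const rec of records) {
  if (cache.has(rec)) hits++;
}
const elapsed = Date.now() - t0;
console.log(`${hits} hits in ${elapsed} ms`);
```

Absolute timings will vary by machine, but the point stands: the membership check itself is negligible next to the decryption work it avoids.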
Ok interesting. I realized, though, that this method is not possible with the data structure of my DB, because checking whether the record exists is not good enough: my table contains an Array-of-Objects column (which contains the key/value that is encrypted) that can be updated. So even if a record already exists in my cloned and decrypted copy of the stored data, I would still want to replace the entire nested Array of Objects, because that would likely be quicker than trying to check which key/value in the Objects was updated. I could replace everything EXCEPT the encrypted field, because I know this field is immutable and cannot be updated. But if I am already replacing basically everything in the cloned decrypted copy of the DB except for the encrypted content, then I am basically just re-implementing the functionality of what the Observable does in the first place, right?

For this reason I am leaning towards changing how the app works and seeing if I can just decrypt and update all the records in the DB when the app loads, and then encrypt all the records in the DB again when the app closes. This will actually speed up the initial sync slightly by not having to encrypt the data at that point. And then I can use …

I think the only thing I need to figure out to make this work is whether it is possible to ensure that the encryption function runs and updates the records in the DB even if the user does not do a graceful and explicit logout step when exiting the app. If it is possible to run functions in the background when a user force-closes a web app by closing the browser tab/window, then I think this method can work well. The DB can be in a decrypted state while the app is running as long as it returns to its encrypted state when the app is closed. Running a …
I thought of a third solution to this problem, because after looking into running functions when the browser closes I found that there is not a very reliable way to do so (there is the …
It depends on your concurrency requirements, the size of the database, and the memory-consumption limits of small devices. If a user loads everything into memory in one tab, leaves it open, and opens the app again in another tab, would the second tab load older data? And the tabs might overwrite each other. Also, memory consumption could be a problem if you target mobile devices and the data can grow.

The good thing with liveQuery (with or without the 4.0 cache) is that it only observes what you query, not the entire database, so it has a smaller memory footprint than caching the entire database. Also, if you need to work with the data from a worker, you'd need a way to communicate that loaded data between the worker and the browser window. If relying on Dexie's liveQuery, you are always working against the realtime data from either the worker or the browser window.

My general advice is to not store large arrays of items in IndexedDB if you care about query performance and data consistency, but instead adopt a relational model where every item is stored in its own row, so it can be efficiently indexed and queried without loading large nested objects. IndexedDB is similar to a relational database in that way: it is slow at storing large nested data; specifically, when updating a single item of an array you have to update the entire object. It's better to flatten out the arrays into objects stored in tables, so that updating an object is a small operation. If you store the data in individual objects (and not in arrays on objects), the encryption performance should not be an issue (as of my humble guess!). Using …
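The flattening advice above can be sketched as a plain data transform (the `conversation`/`messages` shape here is a hypothetical example, not the actual schema from this thread): take a record with a nested array and emit one row per array item, keyed back to its parent.

```javascript
// Hypothetical nested record as it might be stored today:
const conversation = {
  id: "c1",
  title: "Trip planning",
  messages: [
    { id: "m1", iv: "iv1", message: "ciphertext-1" },
    { id: "m2", iv: "iv2", message: "ciphertext-2" },
  ],
};

// Flatten: one parent row plus one row per nested item,
// linked by a foreign-key-like field (conversationId).
function flatten(convo) {
  const { messages, ...parentRow } = convo;
  const messageRows = messages.map((m) => ({ ...m, conversationId: convo.id }));
  return { parentRow, messageRows };
}

const { parentRow, messageRows } = flatten(conversation);
// With rows like these in their own table, updating a single message
// rewrites one small row instead of the whole conversation object.
```

In Dexie terms, the flattened rows would live in their own table with `conversationId` as an indexed field, so a single message update no longer touches the parent record.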
Oh right, great point, I forgot about the concurrency issue... so yeah, the in-memory idea is not a solution.

Moving my Array-of-Objects column into its own table, with a relational ID field that I can use to join the data together after querying it from the DB, is something I considered when setting up my data structure. BUT I thought that it would be better to get the data into its final format before storing it in the DB, rather than having to join the data every time. But you are saying that with those Objects in their own Table and using …

I still don't really understand how the …

Also, it's worth pointing out that currently my app has the entire DB being used all the time. I thought it would be good to have all the data available, because it is a SPA and all the data can be visible within the SPA in one click; I was thinking this would lead to the app loading very fast. But now that I see how some of the data will need to be processed after querying the DB, I am thinking it might be smart to refactor and only load the data from the DB that is visible in the UI. This should reduce the impact of having to loop over large amounts of records from the DB and run the formatting functions on them. I also might not need to even join the data if I use this method. So I am thinking the following steps could be the best path forward: …
Sounds like a good plan! I'm not sure you'll need the optimization of only decrypting things once if your responses are few, but in case you have a view that shows 1000 items at the same time and they are updated frequently, you could optimize this using a WeakMap:

```ts
// Keep this as a global variable:
const decryptedItems = new WeakMap<Item, DecryptedItem | Promise<DecryptedItem>>();

const decryptedItemsObservable = liveQuery(async () => {
  const encryptedItems = await db.items.toArray();
  const result = encryptedItems.map(item => {
    let decryptedItem = decryptedItems.get(item);
    if (!decryptedItem) {
      decryptedItem = decrypt(item);
      decryptedItems.set(item, decryptedItem);
    }
    return decryptedItem;
  });
  // If decrypt() is a synchronous function:
  return result;
  // If decrypt() is an asynchronous function:
  return await Promise.all(result);
});
```

Note that if your decrypt function is async (as it would be if you use WebCrypto, but not if you use TweetNaCl or another userland crypto library), then the result would be an array of promises that you'd need to resolve using Promise.all(). You'd probably want to break all of this out into a helper function, though. It could be generic, and the WeakMap would work regardless of what type of items you are retrieving: you could put anything there and it will auto garbage-collect itself.

```ts
// With helper function
const observable = liveQuery(async () => {
  const specificItems = await db.items.where('foo').equals('bar').toArray();
  return await decryptHelper(specificItems);
});

async function decryptHelper(encryptedItems) {
  const result = encryptedItems.map(item => {
    let decryptedItem = decryptedItems.get(item);
    if (!decryptedItem) {
      decryptedItem = decrypt(item);
      decryptedItems.set(item, decryptedItem);
    }
    return decryptedItem;
  });
  // If decrypt() is a synchronous function:
  return result;
  // If decrypt() is an asynchronous function:
  return await Promise.all(result);
}
```
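To see the caching effect in isolation, here is a framework-free sketch (no Dexie involved; `fakeDecrypt` and the item shapes are stand-ins I made up) showing that the WeakMap guarantees each object is decrypted only once across repeated query rounds:

```javascript
// Stand-in for real decryption; counts how often it actually runs.
let decryptCalls = 0;
function fakeDecrypt(item) {
  decryptCalls++;
  return { ...item, plaintext: `decrypted:${item.secret}` };
}

const cache = new WeakMap();

function decryptHelper(encryptedItems) {
  return encryptedItems.map((item) => {
    let decrypted = cache.get(item);
    if (!decrypted) {
      decrypted = fakeDecrypt(item);
      cache.set(item, decrypted);
    }
    return decrypted;
  });
}

// Simulate two liveQuery emissions over the same object identities,
// plus one new item in the second round.
const a = { id: 1, secret: "aaa" };
const b = { id: 2, secret: "bbb" };
const round1 = decryptHelper([a, b]);
const c = { id: 3, secret: "ccc" };
const round2 = decryptHelper([a, b, c]);

console.log(decryptCalls); // → 3 (a and b decrypted once, c once)
```

One caveat worth noting: a WeakMap keyed on object identity only hits when the query returns the same object instances across emissions, which is what the 4.x cache mentioned earlier can provide; if every emission produces fresh objects, the cache never hits.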
Awesome, thanks again for the help! I have begun the refactor; I think it makes much more sense to flatten the data into multiple tables and be much more granular with my queries based on the current view. I will likely still try to optimize the decryption, even though it will be much less critical now that I am not running it on all the records all the time. :)
If I store the decrypted items in a Svelte Store and use …

```js
liveQuery(async () => {
  const query = await db.test.where('test').equals(true).toArray();
  svelteStorVar.update(async (v) => {
    const result = ...
    // loop over the query, check the current store Array values, and only
    // update and decrypt records that are not found in the Store
    return result;
    // the return value updates the Svelte Store
  });
  // nothing is returned from liveQuery
});
```

EDIT: Looks like I am getting a type error when trying to use an …
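One likely cause of that type error is passing an async callback to `update`, since a store's `update` expects a synchronous `(value) => value` function. A framework-free sketch of the alternative (using a tiny hand-rolled store in place of Svelte's `writable`, which shares the same `set`/`update`/`subscribe` shape; the decryption step is simulated) is to resolve the async work first and then commit with `set`:

```javascript
// Minimal stand-in for svelte/store's writable (same basic contract).
function writable(initial) {
  let value = initial;
  const subs = new Set();
  return {
    set(v) { value = v; subs.forEach((fn) => fn(value)); },
    update(fn) { this.set(fn(value)); }, // fn must be synchronous
    subscribe(fn) { fn(value); subs.add(fn); return () => subs.delete(fn); },
  };
}

const store = writable([]);

// Pretend this handles one liveQuery emission of encrypted items.
async function handleEmission(encryptedItems) {
  // Do the async work (decryption) *outside* update()…
  const decrypted = await Promise.all(
    encryptedItems.map(async (item) => ({ id: item.id, text: `plain:${item.id}` }))
  );
  // …then commit synchronously.
  store.set(decrypted);
}

let latest;
store.subscribe((v) => { latest = v; });

handleEmission([{ id: 1 }, { id: 2 }]).then(() => {
  console.log(latest.length); // → 2
});
```

The same pattern works with the real `writable` from `svelte/store`: await the decryption, then call `set` with the finished array.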
UPDATE: I am leaning towards taking your advice and not trying to optimise the query decryption, since after my refactor I think there should only be a few hundred records visible in the UI at any given time. They can update frequently, but I should test the UI first to see how it performs before over-optimising, I think... Checking that the values exist and updating all non-immutable columns for each record will itself take up some CPU, and might even end up costing about the same as running a decrypt function... (since I don't know which records and columns were updated, I need to run an update on every query anyway).

I think I will continue to keep the values in a Svelte Store though, so I can access them across components instead of using the return value from the …

So I think if I am not running this optimization check and am just formatting the data on each query, then I don't need to worry about changing the …
Here is what I ended up going with, if you are interested. I realized after testing that …

```js
let conversationsQuery = liveQuery(async () => {
  try {
    const convos = await db.conversations
      .where('blocked')
      .equals('false')
      .reverse()
      .sortBy('lastUpdate');
    const msgs = await db.messages.bulkGet(convos.map((c) => c.latestMessage));
    const convosFormatted = convos.map((c) => {
      const latestMessage = msgs.find((m) => m?.id === c.latestMessage);
      if (latestMessage) {
        const decryptedMsg = decrypt(latestMessage.iv, latestMessage.message);
        return { ...c, message: decryptedMsg };
      } else {
        return c;
      }
    });
    const result = await Promise.all(convosFormatted);
    conversations.set(result);
    return 'Conversations query complete.';
  } catch (error) {
    console.log(error);
    errorToast('Could not query conversations.');
  }
});

$: console.log($conversationsQuery);
```

UPDATE: My …

It's weird that it says the error is uncaught even though I have it in a …
It seems to somehow be an issue with storing the …

UPDATE: It seems to actually be an issue with calling …
I don't see what error you get. There's a section in the docs of liveQuery you should read, specifically about doing Promise.resolve() or Dexie.waitFor() around promises returned from WebCrypto calls. Other than that, if you don't find a solution, please create a standalone repro of the issue.
Sorry, I pasted the wrong link. It should be https://dexie.org/docs/liveQuery()#rules-for-the-querier-function
Say I use liveQuery to display records from my Dexie DB in my application, and for each record I need to run a decrypt function in order to display the results to a user. With 1000s of records you would want to run the decrypt function only once, when liveQuery first loads the data; then, when an update triggers the Observable to run again, you would only want to decrypt the newly added record. Is this type of functionality supported with Dexie? I looked at the liveQuery docs but I didn't see anything about knowing which records exactly are new when the event is triggered, and then being able to conditionally run functions on only those records. Thanks!