Mongoose plugin to enable sorting on encrypted fields
npm i mongoose-sort-encrypted-field
Suppose we have a user with an encrypted email. We just need to add the sortFieldName
option to that field.
const mongoose = require("mongoose");
const { encrypt, decrypt } = require("./encryption.js");
const { getModelWithSortEncryptedFieldsPlugin } = require("mongoose-sort-encrypted-field");
const userSchema = new mongoose.Schema({
email: {
type: String,
required: true,
unique: true,
set: encrypt,
get: decrypt,
sortFieldName: "emailSort",
},
});
const User = getModelWithSortEncryptedFieldsPlugin("User", userSchema, {
redisQueueClientOptions: { redis: "redis://localhost:6379" },
ignoreCases: true,
});
module.exports = User;
Then we can sort all records by email using the 'emailSort' field. For performance, we can create a MongoDB index on that field.
const sortedUsers = await User.find({}).sort({ emailSort: 1 }).exec();
Note: Values equal to null or undefined are treated as '' (empty strings), so that sorting stays correct when we sort on multiple fields.
Example:
// By default, MongoDB/JavaScript sorts these as follows
documents = [
{ firstName: "a", middleName: "", lastName: "b" },
{ firstName: "a", middleName: "b", lastName: "b" },
{ firstName: "a", middleName: "b", lastName: "b" },
{ firstName: "a", middleName: null, lastName: "b" },
{ firstName: "a", middleName: undefined, lastName: "b" },
];
// If we sort by full name with { $sort: { firstName: 1, middleName: 1, lastName: 1 } },
// the default behaviour gives the wrong order (full names of the documents above):
["ab", "abb", "abb", "ab", "ab"];
// So, in our plugin, we order them as follows
documents = [
{ firstName: "a", middleName: "", lastName: "b" },
{ firstName: "a", middleName: null, lastName: "b" },
{ firstName: "a", middleName: undefined, lastName: "b" },
{ firstName: "a", middleName: "b", lastName: "b" },
{ firstName: "a", middleName: "b", lastName: "b" },
];
// Now it is correct (full names of the documents above)
["ab", "ab", "ab", "abb", "abb"];
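The normalization described in the note can be sketched in plain JavaScript. This is only an illustration of the idea, not the plugin's internal code: `normalize` and `compareByFields` are hypothetical helpers introduced here, and the comparison is done in memory rather than through sort IDs.

```javascript
// Hypothetical helper: treat null and undefined as "" before comparing.
const normalize = (value) =>
  value === null || value === undefined ? "" : String(value);

// Hypothetical helper: compare two documents field by field, in the given order.
function compareByFields(a, b, fields) {
  for (const field of fields) {
    const left = normalize(a[field]);
    const right = normalize(b[field]);
    if (left < right) return -1;
    if (left > right) return 1;
  }
  return 0;
}

const sampleDocuments = [
  { firstName: "a", middleName: "b", lastName: "b" },
  { firstName: "a", middleName: null, lastName: "b" },
  { firstName: "a", middleName: "", lastName: "b" },
  { firstName: "a", middleName: undefined, lastName: "b" },
];

const nameFields = ["firstName", "middleName", "lastName"];
const sortedDocuments = [...sampleDocuments].sort((a, b) =>
  compareByFields(a, b, nameFields)
);

// Full names after normalization: the three "ab" entries come before "abb".
const fullNames = sortedDocuments.map((doc) =>
  nameFields.map((field) => normalize(doc[field])).join("")
);
console.log(fullNames);
```

With this normalization, documents whose middleName is '', null, or undefined compare as equal on that field, so the next field in the sort decides their order.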
- redisQueueClientOptions: RedisQueueClientOptions;
  default:
  {
    redis: new Redis(), // It can be an instance of [ioredis](https://www.npmjs.com/package/ioredis) or any value that we can pass to the ioredis constructor
    batchSize: 10,
    groupVisibilityTimeoutMs: 60000,
    pollingTimeoutMs: 10000,
    consumerCount: 3, // Balance consumerCount between the maximum number of fields we have to sort and the resource usage of multiple consumers
    redisKeyPrefix: "mongoose-sort-encrypted-field",
  }
  Any options that we can pass to the redis-ordered-queue constructor. The redis option can be an instance of ioredis or any value that we can pass to the ioredis constructor.
- noOfCharsForSortId?: number
  default: 50
  Number of bytes for the sort ID. A bigger number is mathematically better.
- noOfCharsToIncreaseOnSaturation?: number;
  default: 2
  Number of bytes to add on saturation. For example, for 04 and 05 there is no whole number between them, so an extra digit is appended to each; they become 040 and 050, and the average is 045. In the base 2^15 number system, a saturation like that is mathematically very unlikely.
- ignoreCases?: boolean;
  default: false
  Ignore letter case when sorting.
- silent?: boolean;
  default: false
  Flag to turn console info logs on or off.
- selectSortFields?: boolean;
  default: false
  Flag for the select option of the sort fields. If set to false, the sort fields are removed from any query result.
- revaluateAllThreshold?: number;
  default: 0.5
  If the number of documents without a sort ID divided by the total number of documents is less than this threshold, then it will get all values, sort them, and generate sort IDs for all of them at equal distances from 0 to 2^15. For example, if we have 3 documents and sort IDs can range from 00 to 20, then those documents will get the sort IDs 05, 10, and 15.
- revaluateAllCountThreshold?: number;
  default: 100
  If the total number of documents is less than this value, then it will regenerate the sort IDs in the same way as revaluateAllThreshold.
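The "equal distances" regeneration described for revaluateAllThreshold can be illustrated with a small base 10 sketch. It assumes a simplified ID range of 00 to 20, as in the example above; the plugin itself works in base 2^15, and `evenlySpacedSortIds` is a hypothetical name used only for this illustration.

```javascript
// Hypothetical helper: spread sort IDs evenly over [0, maxId] for values
// that are already sorted. Base 10 is used here only for readability.
function evenlySpacedSortIds(sortedValues, maxId, width) {
  const step = maxId / (sortedValues.length + 1);
  return sortedValues.map((value, index) => ({
    value,
    sortId: String(Math.floor(step * (index + 1))).padStart(width, "0"),
  }));
}

// Decrypted values (made-up sample data), sorted before IDs are assigned.
const decryptedEmails = ["carol@example.com", "alice@example.com", "bob@example.com"];
const sortedEmails = [...decryptedEmails].sort();

const assignedIds = evenlySpacedSortIds(sortedEmails, 20, 2);
console.log(assignedIds.map((entry) => `${entry.sortId} ${entry.value}`));
```

Spacing the IDs evenly leaves the largest possible gap between neighbours, which is what later insertions rely on when they average two adjacent IDs.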
We create a sort order ID, which is just a number in base 2^15. That is a huge number system compared to the base 10 number system. We find the position of a new value in the DB using binary search. For 1 lakh (100,000) documents, it queries and decrypts only about 18 documents (first + last + log(1 lakh)) to generate a sort ID. Once the neighbouring documents are known, the sort order ID itself is generated in O(1).
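The lookup cost can be illustrated with an in-memory sketch. This is not the plugin's code: the sorted array stands in for the collection ordered by sort ID, base64 stands in for real encryption, and `findInsertPosition` is a hypothetical helper that counts how many values it has to decrypt.

```javascript
// Stand-in "encryption" for illustration only; base64 is NOT encryption.
const encryptValue = (plain) => Buffer.from(plain, "utf8").toString("base64");
const decryptValue = (cipher) => Buffer.from(cipher, "base64").toString("utf8");

// Hypothetical helper: find where newValue belongs among values that are
// stored encrypted but kept in sorted order, counting decryptions.
function findInsertPosition(encryptedSorted, newValue) {
  let decryptCount = 0;
  const valueAt = (index) => {
    decryptCount += 1;
    return decryptValue(encryptedSorted[index]);
  };

  const lastIndex = encryptedSorted.length - 1;
  if (lastIndex < 0) return { position: 0, decryptCount };
  // Check the first and last documents before searching.
  if (newValue < valueAt(0)) return { position: 0, decryptCount };
  if (newValue >= valueAt(lastIndex)) return { position: lastIndex + 1, decryptCount };

  // Invariant: value at low <= newValue < value at high.
  let low = 0;
  let high = lastIndex;
  while (high - low > 1) {
    const mid = Math.floor((low + high) / 2);
    if (valueAt(mid) <= newValue) low = mid;
    else high = mid;
  }
  return { position: high, decryptCount };
}

// 100,000 sorted sample values: "000000", "000002", "000004", ...
const storedValues = Array.from({ length: 100000 }, (_, i) =>
  encryptValue(String(i * 2).padStart(6, "0"))
);

const lookup = findInsertPosition(storedValues, "123457");
console.log(lookup.position, lookup.decryptCount);
```

In this sketch the number of decryptions is 2 for the first and last values plus roughly log2(n) for the search, so it stays below 20 for 100,000 values.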
To generate a sort order ID, it only needs to know the previous and next sort IDs, and it averages those to get the current sort order ID. For example, in the base 10 system, if we need to insert between 03 and 07, then (03+07)/02 gives 05. For 04 and 05, there is no whole number between them, so it appends an extra digit to each; they become 040 and 050, and the average is 045. In the base 2^15 number system, a saturation like that is mathematically very unlikely.
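The averaging step, including the saturation case, can be sketched in base 10 as follows. This is a simplified illustration under stated assumptions, not the plugin's implementation: the plugin works in base 2^15, `midpointSortId` is a hypothetical name, both inputs are assumed to be digit strings of equal width, and `digitsToAppend` defaults to 1 only to match the worked example above.

```javascript
// Hypothetical helper: average two equal-width decimal sort IDs.
// On saturation (no whole number strictly between them), extra digits
// are appended to both IDs before averaging.
function midpointSortId(previousId, nextId, digitsToAppend = 1) {
  let previous = previousId;
  let next = nextId;

  if (BigInt(next) - BigInt(previous) < 2n) {
    const padding = "0".repeat(digitsToAppend);
    previous += padding;
    next += padding;
  }

  const middle = (BigInt(previous) + BigInt(next)) / 2n;
  // Keep leading zeros so IDs still compare correctly as strings.
  return middle.toString().padStart(previous.length, "0");
}

console.log(midpointSortId("03", "07")); // "05"
console.log(midpointSortId("04", "05")); // "045"
```

Because the IDs are compared as strings, "045" still sorts between "04" and "05", so appending digits does not disturb the existing order.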
It uses redis-ordered-queue to generate sort IDs. This means it processes only one document at a time, as the sort ID generation algorithm mathematically requires, even when we are running multiple instances of our service.