Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New inventory class #31

Merged
merged 25 commits into from
Jun 13, 2024
Merged

New inventory class #31

merged 25 commits into from
Jun 13, 2024

Conversation

Kelvinrr
Copy link
Collaborator

@Kelvinrr Kelvinrr commented Apr 26, 2024

Still needs tests and some clean up, up for initial review.

local tests shows massive speed improvements to kernel search, before using the old method was 20,000ms and using the new Inventory class was 4ms.

//! map of path to kernel pointers
std::vector<SharedKernel> loadedKernels;

//! json used to populate the loadedKernels
nlohmann::json kernels;
nlohmann::json m_kernels;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to be more consistent with these names 🙃

@@ -928,14 +927,14 @@ namespace SpiceQL {
string currFile = fileType;

//create a spice cell capable of containing all the objects in the kernel.
SPICEINT_CELL(currCell, 100);
SPICEINT_CELL(currCell, 200000);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to match ISIS, kept running into issues with large kernels e.g. smithed THEMIS kernels which have an insane amount of segments in them. Now that we are going to save the times off as binary, it almost doesn't matter if this is a waste of memory since it'll only run when creating the inventory.

@@ -990,6 +989,104 @@ namespace SpiceQL {
}


pair<double, double> getKernelStartStopTimes(string kpath) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to only storing start/stop windows for the entire kernel, instead of saving every single segment like ISIS does. afaik we only care if a particular kernel has coverage.

};


class Database {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably change this to InventoryImpl, or something more direct like that, as it might be mistaken for being related to the db files.

class Database;

class Inventory {
Database *m_impl;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hiding the implementation, this way we don't have to expose types like BTreeMap to python or other languages.

Comment on lines +221 to +235
auto start_upper_bound = time_indices->start_times.upper_bound(stop_time);
for(auto it = time_indices->start_times.begin() ;it != start_upper_bound; it++) {
start_time_kernels.insert(it->second);
}

// Get everything stopping after the start_time;
auto stop_lower_bound = time_indices->stop_times.lower_bound(start_time);
for(auto &it = stop_lower_bound;it != time_indices->stop_times.end(); it++) {
// if it's also in the start_time set, add it to the list
iterations++;

if (start_time_kernels.contains(it->second)) {
final_time_kernels.push_back(time_indices->index.at(it->second));
}
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indices didn't help as much as I would have hoped because of the nature of these time windows. In practice, it cuts the amount of comparisons such that you're only checking half the times or less. Which is still better, I think most of the speed up is from using mmap binary files for reading and better truncation of kernels to search before doing any time searches.

Comment on lines +81 to +82
Config config;
json json_kernels = getLatestKernels(config.get());
Copy link
Collaborator Author

@Kelvinrr Kelvinrr Apr 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, this code would be integrated into Config, but didn't have time to do that level of a rewrite. Considering this is more in line with the API I envisioned when this first started, I wonder if we can make the Config classes private in the interface. That is, not expose it in the headers and by extension the Python API.

@Kelvinrr Kelvinrr requested a review from acpaquette April 26, 2024 15:56
Copy link
Collaborator

@acpaquette acpaquette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks fine, from what I can tell the new Inventory and Database classes aren't used anywhere in the code yet. I could be missing something so please let me know otherwise.

I would also like to see tests for some of the functions added under utils, as well as more individual tests for the database class. I think the current Database class test depends on locally data and also doesn't make any test assertions

Database db(false);
db.write_database();
nlohmann::json kernels = db.search_kernels("lroc", {Kernel::Type::FK, Kernel::Type::SCLK, Kernel::Type::SPK, Kernel::Type::CK, Kernel::Type::PCK}, Kernel::Quality::RECONSTRUCTED, 723470470.1831709, 723470590.1831709);
cout << kernels << endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be expanded to check for some expected values?

@Kelvinrr Kelvinrr merged commit c1a7c84 into DOI-USGS:main Jun 13, 2024
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants