There is a nice open-source tool for analyzing compile time, and it's pretty easy to use.
It would be cool to have it run in CI, so we can track which changes introduce compile-time issues.
https://github.com/aras-p/ClangBuildAnalyzer
Needs to be built from source.
Then:
ClangBuildAnalyzer --all build_Release capture.bin
ClangBuildAnalyzer --analyze capture.bin
Should produce output like:
**** Templates that took longest to instantiate:
374211 ms: tt::tt_metal::operation::launch_op<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/decorators.hpp:268:13), std::vector<tt::tt_metal::Tensor>> (1261 times, avg 296 ms)
199458 ms: std::__function::__func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), std::allocator<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13)>, void (tt::tt_metal::IDevice *)>::__func (3915 times, avg 50 ms)
198045 ms: std::__function::__func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38), std::allocator<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38)>, void ()>::__func (3915 times, avg 50 ms)
175460 ms: std::make_shared<std::function<void (tt::tt_metal::IDevice *)>, (lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), void> (1305 times, avg 134 ms)
174609 ms: std::allocate_shared<std::function<void (tt::tt_metal::IDevice *)>, std::allocator<std::function<void (tt::tt_metal::IDevice *)>>, (lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), void> (1305 times, avg 133 ms)
171832 ms: std::__shared_ptr_emplace<std::function<void (tt::tt_metal::IDevice *)>, std::allocator<std::function<void (tt::tt_metal::IDevice *)>>>::__shared_ptr_emplace<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13)> (1305 times, avg 131 ms)
159520 ms: std::function<void (tt::tt_metal::IDevice *)>::function<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), void> (1305 times, avg 122 ms)
158751 ms: std::__function::__value_func<void (tt::tt_metal::IDevice *)>::__value_func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), void> (1305 times, avg 121 ms)
158372 ms: std::function<void ()>::function<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38), void> (1305 times, avg 121 ms)
157491 ms: std::__function::__value_func<void ()>::__value_func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38), void> (1305 times, avg 120 ms)
157363 ms: std::__function::__value_func<void (tt::tt_metal::IDevice *)>::__value_func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), std::allocator<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13)>> (1305 times, avg 120 ms)
156056 ms: std::__function::__value_func<void ()>::__value_func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38), std::allocator<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38)>> (1305 times, avg 119 ms)
120691 ms: std::__function::__alloc_func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38), std::allocator<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:243:38)>, void ()>::__alloc_func (3915 times, avg 30 ms)
120651 ms: std::__function::__alloc_func<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13), std::allocator<(lambda at /home/blozano/tt-metal/ttnn/cpp/ttnn/run_operation_inl.hpp:159:13)>, void (tt::tt_metal::IDevice *)>::__alloc_func (3915 times, avg 30 ms)
99497 ms: tt::tt_metal::operation::run<ttnn::operations::data_movement::ReshardDeviceOperation> (109 times, avg 912 ms)
99090 ms: tt::tt_metal::operation::DeviceOperation<>::DeviceOperation<ttnn::operations::data_movement::ReshardDeviceOperation &> (109 times, avg 909 ms)
96766 ms: std::make_tuple<const std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>> &, const unsigned int &> (487 times, avg 198 ms)
88778 ms: std::tuple<std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>>, unsigned int>::tuple<std::_And, 0> (487 times, avg 182 ms)
88096 ms: std::__tuple_impl<std::__tuple_indices<0, 1>, std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>>, unsigned int>::__tuple_impl<0UL, 1UL, std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>>, unsigned int, const std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>> &, const unsigned int &> (487 times, avg 180 ms)
87847 ms: nlohmann::basic_json<>::basic_json (478 times, avg 183 ms)
86858 ms: std::__tuple_leaf<0, std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>>>::__tuple_leaf<const std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>> &, void> (487 times, avg 178 ms)
86523 ms: std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>>::vector (488 times, avg 177 ms)
82930 ms: std::vector<std::pair<tt::umd::xy_pair, CoreRangeSet>>::__init_with_size<std::pair<tt::umd::xy_pair, CoreRangeSet> *, std::pair<tt::umd::xy_pair, CoreRangeSet> *> (487 times, avg 170 ms)
81463 ms: std::vector<std::shared_ptr<tt::tt_metal::IGraphProcessor>> (468 times, avg 174 ms)
76550 ms: fmt::format<const tt::tt_metal::DataType &> (133 times, avg 575 ms)
76101 ms: fmt::detail::value<fmt::context>::format_custom_arg<tt::tt_metal::DataType, fmt::formatter<tt::tt_metal::DataType>> (135 times, avg 563 ms)
75873 ms: fmt::formatter<tt::tt_metal::DataType>::format (135 times, avg 562 ms)
75497 ms: tt::stl::reflection::operator<<<tt::tt_metal::DataType> (143 times, avg 527 ms)
75092 ms: nlohmann::basic_json<>::parse<const char *> (78 times, avg 962 ms)
74174 ms: magic_enum::enum_name<tt::tt_metal::DataType, magic_enum::detail::enum_subtype::common> (137 times, avg 541 ms)
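
For the CI idea, one possible starting point is a small check that reads the text report and fails when the slowest template instantiation exceeds a budget. This is only a rough sketch, not an existing script in this repo: the function names `slowest_template_ms` and `check_template_budget`, the report file, and the budget value are all hypothetical, and it assumes the report keeps the `<N> ms: ...` line format shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a CI compile-time check; names are illustrative only.

# Read a ClangBuildAnalyzer text report on stdin and print the largest
# "<N> ms:" total in the "Templates that took longest to instantiate"
# section. Prints 0 if that section is missing or empty.
slowest_template_ms() {
  awk '
    # Every section starts with "**** "; remember whether it is the one we want.
    /^\*\*\*\* / { in_section = ($0 ~ /Templates that took longest to instantiate/); next }
    # Inside the section, entries look like "<N> ms: <symbol> (...)".
    in_section && $2 == "ms:" { if ($1 + 0 > max) max = $1 + 0 }
    END { print max + 0 }
  '
}

# Return success only if the slowest instantiation in REPORT_FILE is
# at or below BUDGET_MS (an assumed, project-chosen threshold).
check_template_budget() {
  local report_file=$1 budget_ms=$2 slowest
  slowest=$(slowest_template_ms < "$report_file")
  echo "slowest template instantiation: ${slowest} ms (budget ${budget_ms} ms)"
  [ "$slowest" -le "$budget_ms" ]
}
```

A CI job could redirect `ClangBuildAnalyzer --analyze capture.bin` into a file such as `report.txt` and then call `check_template_budget report.txt 400000`, where the budget is a made-up number to be tuned. Note that the capture step only has data to work with if the build was compiled with Clang's `-ftime-trace` flag.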