
Future work: Learning configurations using ML #30

Open
SankBad opened this issue Mar 25, 2021 · 0 comments
SankBad commented Mar 25, 2021

Modern operating systems incorporate thousands of configurations that users can set.
At least two broad categories of them directly affect application performance and could benefit substantially from ML approaches.

  1. There are many timing-related configurations in Linux, such as the frequency of interrupting a CPU core (for thread
    scheduling), the frequency of invoking background swapping (for memory paging), the frequency of flushing the buffer cache
    (for storage), and the sampling rate of the CPU clock frequency (for energy and performance). Setting these timing-related
    configurations is hard because of the tradeoffs associated with them. For example, frequent CPU interruption
    offers the opportunity to improve CPU utilization (with more aggressive thread scheduling) but can cause performance overhead (by preempting and context switching threads), which, in turn, reduces effective CPU utilization.
  2. There are many size-related configurations, such as the buffer cache size (for storage caching), the disk
    prefetching amount (for storage access), and the swap prefetching amount (for memory paging). Setting these configurations is hard, especially when there are tradeoffs among different sizes. For example, a larger buffer cache improves
    the performance of the storage system but reduces the memory available to user applications.
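As a concrete illustration (not part of the original proposal), many of the knobs above are exposed on Linux under `/proc/sys` and can be read or written like ordinary files. A minimal sketch, assuming a Linux host and root privileges for writes; the knob names shown (`vm.swappiness`, `vm.dirty_ratio`) are just examples:

```python
def sysctl_path(name):
    """Map a sysctl-style knob name (e.g. 'vm.swappiness') to its /proc/sys path."""
    return "/proc/sys/" + name.replace(".", "/")

def read_sysctl(name):
    """Read the current value of a kernel configuration knob (Linux only)."""
    with open(sysctl_path(name)) as f:
        return f.read().strip()

def write_sysctl(name, value):
    """Set a kernel configuration knob (Linux only, requires root)."""
    with open(sysctl_path(name), "w") as f:
        f.write(str(value))
```

An ML-driven tuner would sit on top of exactly this kind of interface: observe a workload metric, then adjust a knob such as `vm.dirty_ratio` and measure the effect.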

Many configurations of the above two types considerably affect application performance and other important
metrics such as energy cost. However, setting them has long involved heavy engineering and human effort: heuristics, trial and error, or offline
experiments. Moreover, once set, they are seldom changed. ML is a better fit for setting these OS configurations. A
good ML model trained on past workloads and OS/hardware environments can potentially outperform human-set configurations, and it can continue to generate new configurations dynamically to adapt to workload and environment changes. Reinforcement learning is a natural candidate for learning OS configurations.

Sources:

  1. https://cseweb.ucsd.edu/~yiying/LearnedOS-OSR19.pdf
  2. https://github.com/containers-ai/federatorai-operator
  3. https://arxiv.org/pdf/2005.14410.pdf