-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
many features and improvements #22
Conversation
aajkl
commented
Jul 1, 2024
- support ceph rbd
- support ovn
- support cloud-init
- merge eru agent
- many other improvements and bugfix
1. support ceph rbd 2. support ovn 3. support cloud-init 4. merge eru agent 5. many other improvements and bugfix
@@ -5,13 +5,13 @@ import ( | |||
|
|||
"github.com/BurntSushi/toml" | |||
|
|||
"github.com/projecteru2/yavirt/pkg/errors" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, why not wraps the cockroachdb/errors
within yavirt/pkg/errors
, then don't need to change anything from upper pkgs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using upstream package directly is enough and adding a thin wrapper is needless. actually I just want to unify the log and errors pkg of core,yavirt and some components developed by me.
labels[parts[0]] = parts[1] | ||
} | ||
go func() { | ||
// Core need to connect to the local grpc server, so sleep 30s here to wait local grpc server up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sending a ping packet to be more graceful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping what? local machine? sleeping 30s is enough here, it is more simple and works well
internal/eru/agent/agent.go
Outdated
logger.Error(ctx, err, "failed to add node") | ||
return | ||
} | ||
if e.Code() == 1031 && strings.Contains(e.Message(), "node already exists") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defines a var for (hardcoded) 1031
should be a good idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, I'll fix this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these code numbers are from core, I just don't want to import too many dependencies, but I just read code and it's ok to import core here.
internal/eru/agent/guest.go
Outdated
} | ||
} | ||
|
||
f1 := utils.CheckHTTP(ctx, g.ID, httpChecker, healthCheck.HTTPCode, timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about shortcut here for saving cpu/bandwidth
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is specified by user and when checker list is empty, checkHTTP
do nothing
func (e *MetricsCollector) Collect(ch chan<- prometheus.Metric) { | ||
for _, v := range e.wrkStatusCache.Items() { | ||
wrkStatus, _ := v.Object.(*types.WorkloadStatus) | ||
if !wrkStatus.Running { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to check wrkStatus
if is nil in the first
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a check here, actually wrkStatus never be nil here
logger.Warn(ctx, "service is not healthy") | ||
return | ||
} | ||
if err := m.store.CheckHealth(ctx); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about
m.mCol.coreHealthy.Store(m.store.CheckHealth(ctx) == nil)
for removing the if/else
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here I'll add a log message
tick := time.NewTicker(time.Duration(m.config.HealthCheck.Interval) * time.Second) | ||
defer tick.Stop() | ||
|
||
_ = utils.Pool.Submit(func() { m.checkAllWorkloads(ctx) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can remove _ =
part from the statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removing _=
will cause a lint issue
for { | ||
select { | ||
case <-tick.C: | ||
_ = utils.Pool.Submit(func() { m.checkAllWorkloads(ctx) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also can remove the _ =
part
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
internal/eru/agent/workload.go
Outdated
} | ||
|
||
for _, workloadID := range workloadIDs { | ||
ID := workloadID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can add comment to explain why we define a new var here.
(im not sure if is fixed the bug in golang higher version, if so, we can remove the new var)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is copied from agent, https://github.com/projecteru2/agent/blob/master/manager/workload/health_check.go#L41 I also don't know why defining a new var, I'll delete this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't be deleted because of https://go.dev/blog/loopvar-preview
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, just use for idx := range workloadIDs
and define a new var in loop, so it's not that strange.
internal/eru/agent/workload.go
Outdated
|
||
for _, workloadID := range workloadIDs { | ||
ID := workloadID | ||
_ = utils.Pool.Submit(func() { m.checkOneWorkload(ctx, ID) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dont need to _ =
part
internal/eru/agent/workload.go
Outdated
|
||
m.wrkStatusCache.Set(workloadStatus.ID, workloadStatus, 0) | ||
|
||
if err = m.setWorkloadStatus(ctx, workloadStatus); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to a brand-new err :=
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, this is also from agent
"github.com/projecteru2/yavirt/internal/utils" | ||
"github.com/projecteru2/yavirt/internal/utils/notify/bison" | ||
"github.com/samber/lo" | ||
"google.golang.org/grpc/status" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to group/separate pkgs from 3rd
and yavirt
internal/eru/recycle/recycle.go
Outdated
log.Debugf(ctx, "[recycle] notifier: %v", notifier) | ||
if notifier != nil { | ||
msgList := []string{ | ||
"<font color=#00CC33 size=10>delete dangling successfully </font>", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about defining a template snippet here?
|
||
coreIDs, err := fetchWorkloads() | ||
if err != nil { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to log the err
but not found
kind
continue | ||
} | ||
coreMap := map[string]struct{}{} | ||
for _, id := range coreIDs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we utilize the lo.Map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lo.Map returns slice and here is a map and i think for-loop is more intuitive here
} | ||
} | ||
|
||
func Setup(ctx context.Context, cfg *configs.Config, t *testing.T) (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to call this Setup
and pass the store instance from unit test files
internal/eru/resources/core.go
Outdated
defer cm.mu.Unlock() | ||
for resName, rawParams := range capacity { | ||
switch resName { | ||
case "cpumem": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strongly rec constants rather than literal in the switch/case
internal/eru/resources/core.go
Outdated
if ans != nil { | ||
return | ||
} | ||
cm.fetchResources() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to fetchResourcesWithLock
or FetchResources
(with lock implicitly)
i'd like to the former and
func (cm *CoreResourcesManager) GetCpumem() (ans *cpumemtypes.NodeResource) {
cm.mu.Lock()
defer cm.mu.Unlock()
...
}
only once
if err := json.Unmarshal([]byte(gpuNameJSON), &gpuNameMap); err != nil { | ||
return err | ||
} | ||
// merge name map from config to gpuNameMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lo.Map
would help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for-loop is very intuitive here, using functional style code here don't have any advantages.
DefaultGW CloudInitGateway `json:"-"` | ||
} | ||
|
||
func initTpls() (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about func init()
then we can remove all the if xxx == nil
from the func
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or sync.Once(initTpls)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like init
, I always run initialization code explicitly, sync.Once
seems good
fc7746f
to
28a6c2f
Compare
@aajkl can be merged after lint fixing |
@anrs done |
* many features and improvements 1. support ceph rbd 2. support ovn 3. support cloud-init 4. merge eru agent 5. many other improvements and bugfix * fix lint and ut issues * fix review issues * don't return when check imageHub failed * remove fmt.Printf