Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[curvefs] When the fs mount is mounted multiple times with different enableSumInDir switches, the xattr information recorded in the metadata will be inaccurate #2094

Closed
cw123 opened this issue Nov 21, 2022 · 17 comments
Assignees
Labels
bug Something isn't working Developer Activities good first issue Good for newcomers

Comments

@cw123
Copy link
Contributor

cw123 commented Nov 21, 2022

curvefs 有个特性,为了加快getfattr查询目录下的所有文件的length的长度之和,使用了一个enableSumInDir的特性,当开启这个开关的时候,每次在目录下创建、删除文件,修改文件的长度的时候,都会在目录的attr字段中更新相应的字段。可以在curvefs/src/client/fuse_client.cpp 搜索 enableSumInDir_ 查看对应的实现逻辑。

但是,如果一个fs,挂载了多次,且挂载的时候,有时开启这个特性,有时不开这个特性,就会造成文件的长度信息统计不准。需要想办法避免这种情况。怎么才能在开启这个特性的时候让这个值保持准确?或者,怎么才能在这个统计值不准的时候,不要使用这个值,而是通过遍历所有文件的方式统计数据,来避免拿到错误的数据?

Curvefs has a feature. In order to speed up getfattr to query the sum of the lengths of all files in the directory, a feature of enableSumInDirt is used. When this switch is turned on, every time a file is created or deleted in the directory, the length of the file is modified. At any time, the corresponding field will be updated in the attr field of the directory. You can search enableSumInDir_ in curvefs/src/client/fuse_client.cpp to view the corresponding implementation logic.

However, if an fs is mounted multiple times, and when mounting, sometimes this feature is enabled, and sometimes this feature is not enabled, the statistics of the length information of the file will be inaccurate. A way needs to be found to avoid this. How can I keep this value accurate when this feature is turned on? Or, how can we not use this value when the statistical value is inaccurate, but get the data by traversing all files to avoid getting wrong data?

curvefs/src/client/fuse_client.cpp

enableSumInDir_ = fsInfo_->enablesumindir() && !FLAGS_enableCto;
@tangwz
Copy link
Contributor

tangwz commented Nov 21, 2022

assign

@wuhongsong
Copy link
Contributor

Are you having some trouble? If so, please let us know so that we can communicate together @tangwz

@tangwz
Copy link
Contributor

tangwz commented Dec 1, 2022

@wuhongsong Sorry I'm too busy this week, I will try to finish it this week.

@wuhongsong
Copy link
Contributor

@tangwz keep going when you have time ? If you're still busy, you can do it later.

@ilixiaocui
Copy link
Contributor

cc @tangwz Is this issue still being followed up? Did you encounter any problems?

@tangwz
Copy link
Contributor

tangwz commented Jan 5, 2023

cc @tangwz Is this issue still being followed up? Did you encounter any problems?

I'm doing this these days. I'm reading the code and think how to do it.

@tangwz
Copy link
Contributor

tangwz commented Jan 10, 2023

@ilixiaocui How about if enableSumInDir_ is true in FuseOpInit(), we call CalOneLayerSumInfo() get and record the XATTRFBYTES?

@SeanHai
Copy link
Contributor

SeanHai commented Jan 10, 2023

@ilixiaocui How about if enableSumInDir_ is true in FuseOpInit(), we call CalOneLayerSumInfo() get and record the XATTRFBYTES?

CURVEFS_ERROR XattrManager::GetXattr(const char* name, std::string *value,

CalOneLayerSumInfo need calculate all files and directories under the current directory
CalAllLayerSumInfo need calculate all files and directories under the current directory and all subdirs

FastCalOneLayerSumInfo and FastCalAllLayerSumInfo only need get statistics from sumdirs' xattr, because every dir xattr have recorded one layer statistics.

You can read the design document: https://github.com/opencurve/curve/blob/master/docs/cn/curvefs_summary_xattr.md

@tangwz
Copy link
Contributor

tangwz commented Jan 10, 2023

So, we can

  1. Is enableSumInDir_ is true in FuseOpInit()?
  2. If enableSumInDir_ is true, callGetXattr()get theXATTRFBYTES ` and update it in the xattr.

How about this ?

@SeanHai
Copy link
Contributor

SeanHai commented Jan 10, 2023

So, we can

  1. Is enableSumInDir_ is true in FuseOpInit()?
  2. If enableSumInDir_ is true, callGetXattr()get theXATTRFBYTES ` and update it in the xattr.

How about this ?

I actually don't understand what your problem is.
The issue is about problem:

  1. You turn on 'enableSumInDir_' first, and the summary info will recorded in parent dir xattr after a while. You only need statistic all subdirs' xattr can get the all info under this dir.
  2. Then you turn off 'enableSumInDir_', the file number/size etc changed will not record in parent xattr anymore at this period.
  3. Finally you turn on 'enableSumInDir_', you get a wrong statistic info form all subdirs' xattr, because missed some changes when step 2(turn off 'enableSumInDir_').

@wuhongsong
Copy link
Contributor

@tangwz Keep talking and figure it out?

@tangwz
Copy link
Contributor

tangwz commented Feb 16, 2023

/unassign

@caoxianfei1
Copy link
Contributor

any one want to take it ?

@bit-dance
Copy link
Contributor

I want to try it.

@Cyber-SiKu
Copy link
Contributor

@bit-dance Are you still going on?

@bit-dance
Copy link
Contributor

Yes,and i am trying to test in my local fs but had some build questions and struggled in this step.

@zhanghuidinah
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Developer Activities good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

9 participants