Add attemptCount to client RPC log #18366

Open
wants to merge 1,048 commits into
base: main
from

Conversation

secfree
Contributor

@secfree secfree commented Nov 3, 2023

What changes are proposed in this pull request?

Add attemptCount to client RPC log.

Why are the changes needed?

Sometimes the time cost of an RPC recorded on the server side and on the client side differs by a large margin. In the case below, the call takes 54543 ms on the client side but only 29 ms on the server side. Adding attemptCount to the client's log can help explain the gap.

Client side

2023-11-02T14:37:59 WARN alluxio.client.file.FileSystemMasterClient ListStatus(path=/projects/..., options=loadMetadataType: ONCE
commonOptions {
  syncIntervalMs: 0
  ttl: -1
  ttlAction: DELETE
}
loadMetadataOnly: false
excludeMountInfo: false
) returned ArrayList{1 entries} in 54543 ms (>=10000 ms)

Server side

2023-11-02 14:37:59 INFO  AUDIT_LOG  succeeded=true   allowed=true ip=null cmd=listStatus  src=/projects/...  executionTimeUs=29392
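
A gap like this is what you would expect if the client retried the call and timed all attempts together, while the server only timed the one attempt that succeeded. The following is a minimal sketch of that idea, not Alluxio's client code: `RetryingRpc`, `Outcome`, and `toLogLine` are hypothetical names, and the log format is an assumption.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not Alluxio's client code: retries a call and counts attempts. */
final class RetryingRpc {
  /** Value returned by the call plus the bookkeeping needed for a log line. */
  static final class Outcome<T> {
    final T value;
    final int attemptCount;
    final long elapsedMs;

    Outcome(T value, int attemptCount, long elapsedMs) {
      this.value = value;
      this.attemptCount = attemptCount;
      this.elapsedMs = elapsedMs;
    }

    /** Formats a client-side log line that includes the attempt count (assumed format). */
    String toLogLine(String rpcName) {
      return String.format("%s returned %s in %d ms (attemptCount=%d)",
          rpcName, value, elapsedMs, attemptCount);
    }
  }

  /** Runs the call up to maxAttempts times; the elapsed time covers every attempt. */
  static <T> Outcome<T> call(Callable<T> rpc, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long startNs = System.nanoTime();
    Exception lastFailure = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        T value = rpc.call();
        long elapsedMs = (System.nanoTime() - startNs) / 1_000_000L;
        return new Outcome<>(value, attempt, elapsedMs);
      } catch (Exception e) {
        lastFailure = e; // remember the failure and try again
      }
    }
    throw lastFailure; // every attempt failed
  }
}
```

With the attempt count in the log line, a slow client-side call with `attemptCount=1` points at the network or the server, while a higher count points at retries.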

Does this PR introduce any user facing changes?

NO.

yyongycy and others added 30 commits August 14, 2023 03:23
### What changes are proposed in this pull request?

Remove freeWorker command

### Why are the changes needed?

Remove freeWorker command.

### Does this PR introduce any user facing changes?
NA
			pr-link: Alluxio#17970
			change-id: cid-c2bdcedcebd2c48997a74d494a5bc1bc90006306
### What changes are proposed in this pull request?

1. Improve filter setting in `dora/shaded/client/pom.xml` to eliminate some warnings in the shading process.
2. Remove duplicated code of filter setting in UFS. The shading filter in each UFS will inherit the shading plugin settings from `dora/underfs/pom.xml`, enabling us to reduce duplication by moving the common settings to `dora/underfs/pom.xml`.

### Why are the changes needed?

Code hygiene and readability

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#17824
			change-id: cid-1881e1faef2e5b11b5fd63f1c9533985e0bbaaf7
### What changes are proposed in this pull request?

Remove unused stressbench

### Why are the changes needed?

With the architecture after 300, a set of stress benchmarks is no longer relevant.

### Does this PR introduce any user facing changes?

Removed StressJobServiceBench, RpcBench, StressMasterBench, MaxFileBench, GetPinnedFileIdsBench, CompactionBench and FuseIOBench 
			pr-link: Alluxio#17981
			change-id: cid-cd179851f618b7cc44391649e3b38887a1d548f1
### What changes are proposed in this pull request?

Remove JNR-based FuseFileSystem

### Why are the changes needed?

JNI-based fuse is preferred since 2.3

### Does this PR introduce any user facing changes?

  1. `alluxio.fuse.jnifuse.enabled` is no longer needed

			pr-link: Alluxio#17975
			change-id: cid-20d36295fa5fd92fe2b06185eee4d1eca41cde9e
### What changes are proposed in this pull request?

Remove the countCommand

### Why are the changes needed?

Remove the countCommand.

### Does this PR introduce any user facing changes?
NA
			pr-link: Alluxio#17897
			change-id: cid-8a313d1d5193f8cdbdfa9e88fc7f64e4b68873a7
### What changes are proposed in this pull request?

Remove UfsCleaner and supporting code

### Why are the changes needed?

FileSystemMaster/BlockMaster functionalities are no longer needed. This is one step in removing them.

### Does this PR introduce any user facing changes?

Related property keys are removed

			pr-link: Alluxio#17954
			change-id: cid-f677dd6ac302017995b724a82849f9e474783698
### What changes are proposed in this pull request?

Addresses Alluxio#17522

### Why are the changes needed?

The CLI in the bash format has grown to be a giant monolith of bash code. One of the goals in the new main branch is to properly modularize various portions of the codebase and the set of bash scripts within the bin/ folder is one obvious counterexample to this goal.

### Does this PR introduce any user facing changes?

For developers, using/building the CLI will now require having golang 1.18+ installed. Note it will not affect running `mvn install` as part of the standard java compilation.

`bin/alluxio` script is completely changed from before and is 100% guaranteed to break backwards compatibility. Most previous commands are migrated to follow the newly defined guidelines, namely `bin/alluxio <service> <operation>`. The previous `bin/alluxio` script is temporarily renamed to `bin/alluxio-bash`, but it will not be maintained and will eventually be deleted.

`bin/alluxio-start.sh` and `bin/alluxio-stop.sh` redirect to the new `bin/alluxio process` subcommand. Backwards compatibility is preserved when calling these scripts and they should behave similarly as before. The previous versions of the two scripts are renamed with a `-bash` suffix, along with their supporting bash scripts. The supporting bash scripts will eventually be deleted.

List of common commands and their updated form:
- `collectInfo` -> `info collect`
- `format` -> `init format`
- `fsadmin report` -> `info report`
- `getConf` -> `conf get`
- `runTests` -> `exec basicIOTest`
- most `fs` commands are still under `fs`
			pr-link: Alluxio#17974
			change-id: cid-a62ccb70b78c0c48deda670d3be8b366a28b1171
### What changes are proposed in this pull request?

This PR rewrites the existing `StressWorkerBench` test adapting to the new Dora architecture. The major changes are:
1. Introduced a distribution concept to the test results. Each file read operation is measured as a data point, and we display the distribution of all operations in terms of the 0~100 percentiles. This describes the performance better than a single `throughput = total bytes / total time`. See more details in jiacheliu3#2
2. In 2.x, job workers find local workers by specifying `BlockLocationPolicy`. In 3.x, we use `WorkerLocationPolicy=LocalWorkerPolicy` instead. No config in `alluxio-site.properties` is required; this setting is embedded in the code.
3. Added a way for job workers to deterministically calculate which files to read. Instead of all workers reading one single file as in the 2.x StressWorkerBench, each worker can now read multiple files.
4. Some other minor changes were introduced to the args. For example, we used to allow a negative clusterLimit like `--cluster-limit -1`. This is now disallowed because it barely makes sense. Another removed arg is `--block-size`, because the Block API has been removed.
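
To illustrate item 1, here is a minimal sketch of reporting a percentile distribution next to a single throughput number. It is not the StressWorkerBench code: `LatencyDistribution` is a hypothetical class, and the nearest-rank percentile definition is an assumption, since the PR does not say which definition the benchmark uses.

```java
import java.util.Arrays;

/** Hypothetical sketch: summarizes per-operation samples as percentiles. */
final class LatencyDistribution {
  /** Nearest-rank percentile (p in 0..100) computed over a sorted copy of the samples. */
  static long percentile(long[] samples, int p) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (p < 0 || p > 100) {
      throw new IllegalArgumentException("p must be in 0..100");
    }
    long[] sorted = samples.clone();
    Arrays.sort(sorted);
    // rank = ceil(p / 100 * n), computed with integer arithmetic to avoid rounding error
    long rank = ((long) p * sorted.length + 99) / 100;
    int index = (int) Math.max(rank, 1) - 1; // p = 0 maps to the minimum
    return sorted[index];
  }

  /** The single summary number that the distribution is meant to complement. */
  static double throughput(long totalBytes, double totalSeconds) {
    return totalBytes / totalSeconds;
  }
}
```

A single throughput value hides slow outliers; comparing, say, the 50th and 99th percentiles of the same run makes them visible.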

Most existing arguments are kept as-is. We only support reading local workers at this moment. In a future PR, we will introduce a way to measure performance reading a remote worker in the cluster.

Special thanks to @twalluxio and @voddle for their contributions in this change.

### Why are the changes needed?

We need an equivalent stress testing in Dora.

### Does this PR introduce any user facing changes?

Command args are changed, see details in code.

			pr-link: Alluxio#17734
			change-id: cid-c95a0bcc00b596f96d0cca65e3cfc5088c5ae355
### What changes are proposed in this pull request?

Fix non-existent property `WORKER_TIERED_STORE_LEVEL_ALIAS` preventing worker from starting.

### Why are the changes needed?

The property key was removed in Alluxio#17948. 

See error stack

```
2023-08-15 12:04:46,377 WARN  [data-server-tcp-socket-worker-0](ChannelInitializer.java:97) - Failed to initialize a channel. Closing: [id: 0xe69bb935, L:/192.168.31.18:29997 - R:/192.168.31.18:52896]
java.lang.RuntimeException: No value set for configuration key alluxio.worker.tieredstore.level0.alias
        at alluxio.conf.InstancedConfiguration.get(InstancedConfiguration.java:108)
        at alluxio.conf.InstancedConfiguration.get(InstancedConfiguration.java:100)
        at alluxio.conf.InstancedConfiguration.getString(InstancedConfiguration.java:259)
        at alluxio.conf.Configuration.getString(Configuration.java:221)
        at alluxio.DefaultStorageTierAssoc.<init>(DefaultStorageTierAssoc.java:71)
        at alluxio.worker.netty.FileWriteHandler.<init>(FileWriteHandler.java:49)
        at alluxio.worker.netty.PipelineHandler.addBlockHandlerForDora(PipelineHandler.java:86)
        at alluxio.worker.netty.PipelineHandler.initChannel(PipelineHandler.java:73)
```

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#17988
			change-id: cid-23e029d94828f106e135250ff72098c5c5bffdd9
### What changes are proposed in this pull request?

Remove unused parameters pathConf.

### Why are the changes needed?

Remove unused parameters pathConf.

### Does this PR introduce any user facing changes?
NA

			pr-link: Alluxio#17990
			change-id: cid-2ce22476a63b3ddd06d02814f36dab6b4701a5a6
### What changes are proposed in this pull request?

Refactor command line options and parsing logic in Fuse.

### Why are the changes needed?

To make the control flow more extensible.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#17956
			change-id: cid-af9e2f2b613f8c72badae902a6f878483bdb0e0e
update the check-docs as well
			pr-link: Alluxio#17995
			change-id: cid-5bcfa56a7696a65e93b0de460585a7807c9f5f82
### What changes are proposed in this pull request?

Add testcontainers test support for relevant unit tests.

### Why are the changes needed?

Certain JUnit test suites, such as the etcd-related ones, need the testcontainers framework to be supported.

### Does this PR introduce any user facing changes?

N/A

			pr-link: Alluxio#17977
			change-id: cid-ba36ad7f742b1b5370da43b68574be0ff88ead04
`declare -A` is only available starting bash 4+ as pointed out by @twalluxio and @Kai-Zhang
change os/arch mappings in `bin/alluxio` into a list of tuples
			pr-link: Alluxio#17998
			change-id: cid-3e90bd74de0f4acecd961bb4d4c1537332dc7f6e
### What changes are proposed in this pull request?

Remove an unused class which is causing a problem.
```
2023-08-15 21:40:36,663 WARN  ChannelInitializer - Failed to initialize a channel. Closing: [id: 0xa460dfda, L:/172.31.93.123:29997 - R:/172.31.80.106:34390]
java.lang.RuntimeException: No value set for configuration key alluxio.worker.tieredstore.level0.alias
	at alluxio.conf.InstancedConfiguration.get(InstancedConfiguration.java:108)
	at alluxio.conf.InstancedConfiguration.get(InstancedConfiguration.java:100)
	at alluxio.conf.InstancedConfiguration.getString(InstancedConfiguration.java:259)
	at alluxio.conf.Configuration.getString(Configuration.java:221)
	at alluxio.DefaultStorageTierAssoc.<init>(DefaultStorageTierAssoc.java:71)
	at alluxio.worker.netty.FileWriteHandler.<init>(FileWriteHandler.java:49)
	at alluxio.emon.worker.netty.FileWriteHandlerEE.<init>(FileWriteHandlerEE.java:42)
	at alluxio.emon.worker.netty.PipelineHandlerEE.addBlockHandlerForDora(PipelineHandlerEE.java:113)
	at alluxio.emon.worker.netty.PipelineHandlerEE.initChannel(PipelineHandlerEE.java:100)
	at io.netty.channel.ChannelInitializer.initChannel(ChannelInitializer.java:129)
	at io.netty.channel.ChannelInitializer.handlerAdded(ChannelInitializer.java:112)
	at io.netty.channel.AbstractChannelHandlerContext.callHandlerAdded(AbstractChannelHandlerContext.java:1114)
	at io.netty.channel.DefaultChannelPipeline.callHandlerAdded0(DefaultChannelPipeline.java:609)
	at io.netty.channel.DefaultChannelPipeline.access$100(DefaultChannelPipeline.java:46)
	at io.netty.channel.DefaultChannelPipeline$PendingHandlerAddedTask.execute(DefaultChannelPipeline.java:1463)
	at io.netty.channel.DefaultChannelPipeline.callHandlerAddedForAllHandlers(DefaultChannelPipeline.java:1115)
	at io.netty.channel.DefaultChannelPipeline.invokeHandlerAddedIfNeeded(DefaultChannelPipeline.java:650)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:514)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:429)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:486)
	at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:174)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:167)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.base/java.lang.Thread.run(Thread.java:829)
```
### Why are the changes needed?

fix bug

### Does this PR introduce any user facing changes?
na

			pr-link: Alluxio#17997
			change-id: cid-4343001dd94f832a09bc9b3647b9e22fe6554c80
add check to ensure the go code in `cli/` compiles as part of the checks build
			pr-link: Alluxio#17996
			change-id: cid-aa0589a0d7c889af2b3abd0b30037a6769aac5a2
parity with the bash side change in Alluxio#17933
			pr-link: Alluxio#17994
			change-id: cid-85e50e9b00f4f03edf96abbccac99242d8284c5e
### What changes are proposed in this pull request?
The current version doesn't support must_cache and async_cache, so remove their usage in the code and mark them deprecated to prevent any use.

### Why are the changes needed?
This change focuses on removing the write type of MUST_CACHE and ASYNC_CACHE, and Alluxio#17980 would remove the master-side code about async persist.

### Does this PR introduce any user facing changes?
No

			pr-link: Alluxio#17963
			change-id: cid-14d40a217134beb8a5e7316b050af01bca18b65e
### What changes are proposed in this pull request?

Skip copy process if validation fails.
Also adding more info to the progress report.

### Why are the changes needed?

Feature request.

### Does this PR introduce any user facing changes?

na

			pr-link: Alluxio#17804
			change-id: cid-c0affd0be9609f9e43a3c901247d27ce17021e85
### What changes are proposed in this pull request?
running `build/cli/build-cli.sh -a` from tarball dry runs

### Why are the changes needed?
needed to fix dry runs

### Does this PR introduce any user facing changes?
no

			pr-link: Alluxio#18004
			change-id: cid-705e7c2d1eb77591400eeddf82d3e3aff6bbdc74
### What changes are proposed in this pull request?

Remove copyTo/FromLocal commands

### Why are the changes needed?

Remove copyTo/FromLocal commands.

### Does this PR introduce any user facing changes?

NA

			pr-link: Alluxio#18003
			change-id: cid-7a2832c9c4953edfa06716237672a45d715c4154
### What changes are proposed in this pull request?

COS lacks unit tests, so I have implemented COS unit tests modeled on the OSS unit tests and improved coverage to nearly 80%.

<img width="621" alt="Screenshot 2023-07-17 11 14 01" src="https://github.com/Alluxio/alluxio/assets/57146148/db5f20ae-806e-4508-be60-adc295ccb73a">

### Why are the changes needed?

Unit tests are important to Alluxio.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#17782
			change-id: cid-c69e4264bacccefc23bdaef399c9c660e2e1534e
### What changes are proposed in this pull request?

[DOCFIX] Add Doc of running alluxio with Tensorflow

### Why are the changes needed?

The dora branch currently lacks docs on how to run Alluxio with Tensorflow. This PR provides a doc about running Alluxio with Tensorflow.

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#18011
			change-id: cid-c9c8950f57f7f54a93d83415f2dfdac97c5b405a
directly call `fs cp` instead of the deleted `copyToLocal` and `copyFromLocal`
			pr-link: Alluxio#18013
			change-id: cid-30a30d0f76dd64b30b750c98e50383e0f6eaf29c
### What changes are proposed in this pull request?

Remove MetaMasterConfigClient initialization in the client.

### Why are the changes needed?

Remove MetaMasterConfigClient initialization in the client.

### Does this PR introduce any user facing changes?

NA

			pr-link: Alluxio#18012
			change-id: cid-3a82b8c04ad0d3dbca77cf35ee8e0b2bf8832b8c
### What changes are proposed in this pull request?

Prepare the worker for accessing multiple UFS's as the client requests

### Why are the changes needed?

Allow worker to access arbitrary UFS's in the future.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#17839
			change-id: cid-569ed16a888f2a576bdfd69a22cb587989ff3559
Additional `exec`,`fs`,`init`,`info` commands to golang CLI as part of Alluxio#17522

`bin/alluxio-bash runUfsIOTest` -> `bin/alluxio exec ufsIOTest`

`bin/alluxio-bash fs chgrp` -> `bin/alluxio fs chgrp`
`bin/alluxio-bash fs chmod` -> `bin/alluxio fs chmod`
`bin/alluxio-bash fs chown` -> `bin/alluxio fs chown`

`bin/alluxio-bash fsadmin metrics clear` -> `bin/alluxio init clearMetrics`
`bin/alluxio-bash clearCache` -> `bin/alluxio init clearOSCache`
`bin/alluxio-bash validateConf` -> `bin/alluxio init validate --type conf`
`bin/alluxio-bash validateEnv` -> `bin/alluxio init validate --type env`

`bin/alluxio-bash fsadmin doctor` -> `bin/alluxio info doctor`
`bin/alluxio-bash fsadmin nodes` -> `bin/alluxio info nodes`
			pr-link: Alluxio#18007
			change-id: cid-fcbc6fc230a4a3b8802bee748585469b2cd35309
### What changes are proposed in this pull request?

Update main nav menu to 3 tiers
Rename Basic Logging to Logging & update respective link paths
Pull Glossary to level 1

### Why are the changes needed?

restructuring menu 

### Does this PR introduce any user facing changes?

webui

			pr-link: Alluxio#17986
			change-id: cid-9d6ccd0b2a94f930b611adb46a4ace88a80acdc4
### What changes are proposed in this pull request?

Remove needsync commands.

### Why are the changes needed?

Remove needsync commands.

### Does this PR introduce any user facing changes?

NA

			pr-link: Alluxio#17962
			change-id: cid-38d2f6c311c5b56ddbd16d34ea8d8c2ac344efd9
YichuanSun and others added 28 commits October 23, 2023 03:32
### What changes are proposed in this pull request?

Remove UnsetTtlTest which is outdated now.

### Why are the changes needed?

Improve code quality.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18292
			change-id: cid-3de116636374ad12962793e7ea19ba1170870db7
### What changes are proposed in this pull request?

A worker ID can now be assumed by a different worker instance, whether on a different pod in k8s or on a different host machine for bare metal. Creating the entry on the persisted ring path /DHT/DefaultAlluxioCluster/AUTHORIZED/ should not bail if a different value is seen.

### Why are the changes needed?

To enable a worker to rejoin with the same worker ID but a different host or other WorkerInfo fields.

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#18275
			change-id: cid-51322e010e0d51ae4f81268c2bb607b568f08c46
### What changes are proposed in this pull request?

In the position reader, preload pages on workers to improve the cold read performance.

### Why are the changes needed?

To improve the cold read performance.

### Does this PR introduce any user facing changes?

N/A
			pr-link: Alluxio#18317
			change-id: cid-97e28711cf7f7b3ce60da346c737505a365d3238
Origin @dbw9580 

### What changes are proposed in this pull request?

Add a prefetch cache policy that does not reset the sliding window when a cache read misses. Fuse sometimes creates read requests that are out of order, and this helps the prefetch stay stable.


			pr-link: Alluxio#18318
			change-id: cid-b0a6331fac06bb743724e8b68005e257b89aa64d
Add getStatus RESTful API.

**Example:**
Get the specified directory/file information by the following request:
`curl -X GET http://localhost:28080/v1/info?path=/tpcds-data`

The response JSON looks like:
`[
  {
    "mType": "directory",
    "mName": "tpcds-data",
    "mPath": "/tpcds-data",
    "mUfsPath": "s3a://jiamingmai-test/tpcds-data",
    "mLastModificationTimeMs": 0,
    "mLength": 0,
    "mHumanReadableFileSize": "0B"
  }
]`

<img width="597" alt="image" src="https://github.com/Alluxio/alluxio/assets/6129818/b38d644b-11da-4206-937b-6d61fd6b3a6c">

			pr-link: Alluxio#18312
			change-id: cid-6e563a21372e9fe1867a36d9b311e246b49c459e
### What changes are proposed in this pull request?

This change mainly adds two fixes to the `MemoryPageStore`:
1. Add a noop implementation to commit() so it does nothing instead of throwing `UnsupportedOperationException`
2. Clears the cache on close explicitly
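
The two fixes can be sketched on a toy in-memory store. This is an illustration under assumptions, not the real `MemoryPageStore`: `InMemoryPageStore`, its method signatures, and the string page IDs are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical toy page store; the real MemoryPageStore API may differ. */
final class InMemoryPageStore implements AutoCloseable {
  private final Map<String, byte[]> mPages = new ConcurrentHashMap<>();

  /** Stores a defensive copy of the page. */
  void put(String pageId, byte[] page) {
    mPages.put(pageId, page.clone());
  }

  /** Returns a copy of the page, or null if it is not stored. */
  byte[] get(String pageId) {
    byte[] page = mPages.get(pageId);
    return page == null ? null : page.clone();
  }

  /** Fix 1: nothing needs persisting for memory-backed pages, so commit does nothing. */
  void commit(String fileId) {
    // intentionally a no-op instead of throwing UnsupportedOperationException
  }

  int size() {
    return mPages.size();
  }

  /** Fix 2: drop every page explicitly so the memory can be reclaimed. */
  @Override
  public void close() {
    mPages.clear();
  }
}
```

Making `commit` a no-op lets callers treat memory-backed and disk-backed stores uniformly, without special-casing the store type.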

			pr-link: Alluxio#18322
			change-id: cid-5e7656e1eca363ed2de40cd48025f464c41a2584
### What changes are proposed in this pull request?

No integration tests run at the moment; this PR fixes it.

### Why are the changes needed?

Fix bug.

### Does this PR introduce any user facing changes?

No.
			pr-link: Alluxio#18313
			change-id: cid-e3f71f5b817624c45b8a3fc2baae2b08b99cd109
### What changes are proposed in this pull request?

The `WORKER_FUSE_ENABLED` key is not used, so this test is outdated.

### Why are the changes needed?

Improve code quality.

### Does this PR introduce any user facing changes?
no.

			pr-link: Alluxio#18268
			change-id: cid-8acc857afc5fff384d65bff748ad1c6b07a13e87
### What changes are proposed in this pull request?

Delete invalid UnderFileSystemAlluxioTest for changed codebase.

### Why are the changes needed?

Improve code quality.

### Does this PR introduce any user facing changes?

No.
			pr-link: Alluxio#18277
			change-id: cid-8814028d7725c927c7973917f1da6d22e1f827db
### What changes are proposed in this pull request?
Update the worker API to support loading multiple replicas

### Why are the changes needed?
Part of the PR to support loading multiple replicas

### Does this PR introduce any user facing changes?
na

			pr-link: Alluxio#18296
			change-id: cid-0213f2aba669b7687ac42cf932cdcec911d397a4
### What changes are proposed in this pull request?

Add Rust toolchain to the Docker image used for CI.

### Why are the changes needed?

Allow CI to compile and run Rust code.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18319
			change-id: cid-6912fe3659bb0044a6980d9ac2f32f79101efc44
Add a regex pattern file filter for distributed load.

**Example:**
The following request allows us to load the files under `/test-load` directory with "hello" prefix:
`curl -X GET http://localhost:28080/v1/load?path=s3a://jiamingmai-test/test-load&opType=submit&verbose=true&fileFilterRegx=^hello.*`
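
As a rough sketch of what such a filter does, the snippet below keeps only names that match the pattern from the request above. `FileFilter` is a hypothetical class, and matching against the bare file name (rather than the full path) with a whole-string match is an assumption; the PR does not say how the server applies `fileFilterRegx`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch of a regex-based file filter. */
final class FileFilter {
  /** Returns the names that match the regex as a whole, in their original order. */
  static List<String> filter(List<String> fileNames, String regex) {
    Pattern pattern = Pattern.compile(regex);
    List<String> kept = new ArrayList<>();
    for (String name : fileNames) {
      if (pattern.matcher(name).matches()) {
        kept.add(name);
      }
    }
    return kept;
  }
}
```

When calling the endpoint from a shell, the URL should be quoted, because `&`, `^`, and `*` are otherwise interpreted by the shell.
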
			pr-link: Alluxio#18311
			change-id: cid-4ec2bfe58bfba413f6d2925f5b3937bd6f5c2eb1
### What changes are proposed in this pull request?

I have created a new file named DoraLsCommandIntegrationTest, which tests whether the 'ls' command runs correctly. I also revised the base class so that it can create byte files in Alluxio. In addition, I have added a new test named DoraMkdirCommandIntegrationTest, which tests the CLI 'mkdir' command.

### Why are the changes needed?

1. The new test does not have a function that can create files in Alluxio.
2. Add an integration test.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18325
			change-id: cid-067ec742deab39af294d089dec932b69c0362682
### What changes are proposed in this pull request?

Extract the logic to create `UfsBaseFileSystem` into `FileSystemContext`, for possible extension. This change is functionally a refactor that changes nothing.
			pr-link: Alluxio#18333
			change-id: cid-105be187f763f83680f64bd963646ce32eb58493
### What changes are proposed in this pull request?

resolves Alluxio#18324

Disclaimer: I might have monkey-typed this fix but I still do not know anything about buffer ref counting. This fix does NOT make me the owner of this state machine.

			pr-link: Alluxio#18323
			change-id: cid-eb5bde353c08d3d9bdd39da5b9caf13681bae495
### What changes are proposed in this pull request?

Remove the invalid LeaderCommandIntegrationTest.java; the command has been deleted.

### Why are the changes needed?

improve code quality.

### Does this PR introduce any user facing changes?

no.

			pr-link: Alluxio#18287
			change-id: cid-43a00b7ec27735262bfa45b7b37331389d82a881
### What changes are proposed in this pull request?
Remove NeedsSyncCommandIntegrationTest, whose command has been deleted.

### Why are the changes needed?

Improve code quality.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18288
			change-id: cid-0c6040e1804065067032384e6046c0d47ccf8312
### What changes are proposed in this pull request?

HelpCommandIntegrationTest works now.

### Why are the changes needed?

Improve code quality.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18293
			change-id: cid-ccb04b9ede68fae2342d6a25e443b85ccd8f990c
### What changes are proposed in this pull request?

Add rust spdk library and design structures for cache.

### Why are the changes needed?

For NVMe SSD cache requirement.

### Does this PR introduce any user facing changes?

No.

### Benchmark Result

<table style="text-align:center;">
<tbody>
  <tr>
    <th rowspan="2">block size</th>
    <th colspan="3">time consumption</th>
    <th rowspan="2">throughput</th>
  </tr>
  <tr>
    <th>millisecond</th>
    <th>microsecond</th>
    <th>nanosecond</th>
  </tr>
  <tr>
    <td>512B</td>
    <td>10ms</td>
    <td>10194us</td>
    <td>10194421ns</td>
    <td>0.048MB/s</td>
  </tr>
  <tr>
    <td>1KB</td>
    <td>13ms</td>
    <td>13472us</td>
    <td>13472304ns</td>
    <td>0.072MB/s</td>
  </tr>
  <tr>
    <td>4KB</td>
    <td>9ms</td>
    <td>9242us</td>
    <td>9242424ns</td>
    <td>0.423MB/s</td>
  </tr>
  <tr>
    <td>16KB</td>
    <td>8ms</td>
    <td>8585us</td>
    <td>8585361ns</td>
    <td>1.820MB/s</td>
  </tr>
  <tr>
    <td>64KB</td>
    <td>11ms</td>
    <td>11030us</td>
    <td>11030930ns</td>
    <td>5.666MB/s</td>
  </tr>
  <tr>
    <td>256KB</td>
    <td>15ms</td>
    <td>15962us</td>
    <td>15962353ns</td>
    <td>15.662MB/s</td>
  </tr>
  <tr>
    <td>1MB</td>
    <td>13ms</td>
    <td>13059us</td>
    <td>13059113ns</td>
    <td>76.575MB/s</td>
  </tr>
  <tr>
    <td><b>4MB</b></td>
    <td><b>28ms</b></td>
    <td><b>28930us</b></td>
    <td><b>28930274ns</b></td>
    <td><b>138.264MB/s</b></td>
  </tr>
  <tr>
    <td><b>16MB</b></td>
    <td><b>79ms</b></td>
    <td><b>79423us</b></td>
    <td><b>79423390ns</b></td>
    <td><b>201.452MB/s</b></td>
  </tr>
  <tr>
    <td>64MB</td>
    <td>308ms</td>
    <td>308856us</td>
    <td>308856745ns</td>
    <td>207.216MB/s</td>
  </tr>
  <tr>
    <td>256MB</td>
    <td>1218ms</td>
    <td>1218323us</td>
    <td>1218323252ns</td>
    <td>210.125MB/s</td>
  </tr>
  <tr>
    <td>1GB</td>
    <td>5056ms</td>
    <td>5056277us</td>
    <td>5056277683ns</td>
    <td>202.521MB/s</td>
  </tr>
</tbody>
</table>
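
The throughput column appears to follow from the other columns as bytes divided by elapsed time, with 1 MB taken as 2^20 bytes. This is inferred from the numbers in the table, not stated in the PR. A small sketch of that arithmetic, using a hypothetical `Throughput` helper:

```java
/** Hypothetical helper that recomputes the table's throughput column. */
final class Throughput {
  /** Throughput in MiB/s (2^20 bytes per MiB) for a transfer measured in nanoseconds. */
  static double mibPerSecond(long bytes, long elapsedNanos) {
    if (elapsedNanos <= 0) {
      throw new IllegalArgumentException("elapsedNanos must be positive");
    }
    double seconds = elapsedNanos / 1_000_000_000.0;
    return bytes / seconds / (1024.0 * 1024.0);
  }
}
```

For example, the 512B row (10194421 ns) and the 1GB row (5056277683 ns) come out at roughly 0.048 and 202.52 under this formula, in line with the table.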



			pr-link: Alluxio#18231
			change-id: cid-92ee56270bc5bb237ecc0df78c2974e1051bc543
### What changes are proposed in this pull request?

Update load job to adopt multi replicas

### Why are the changes needed?
part of PR to support loading multi replicas

### Does this PR introduce any user facing changes?
new load option `replicas`

			pr-link: Alluxio#18320
			change-id: cid-7d01ca19a28faf4c7773cbf5c355dd6cf070728f
### What changes are proposed in this pull request?

Fix the bug introduced by Alluxio#18332.
Change the time comparison used to decide whether stale client channels are inactive. A `LocalTime` object cannot correctly decide whether a client channel is inactive: adding or subtracting an offset from a `LocalTime` only changes the hour, minute, and second values and never affects the date. A `LocalDateTime` object is needed instead.
### Why are the changes needed?

The code uses the `LocalTime` class to determine whether a client channel is inactive. Adding or subtracting an offset from a `LocalTime` changes only the hour, minute, and second values and does not affect the date. In other words, the three-day certification cycle check should be based on date and time, not just time of day.

### Does this PR introduce any user facing changes?

None
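
The difference between the two classes can be shown with a small sketch. `StaleChannelCheck` and its method names are hypothetical and are not the code changed by this PR; only the `java.time` behavior is standard.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

/** Hypothetical sketch contrasting a time-of-day check with a date-and-time check. */
final class StaleChannelCheck {
  /**
   * Buggy variant: LocalTime wraps around midnight, so subtracting whole days
   * returns the same time of day and the comparison ignores the date entirely.
   */
  static boolean isStaleBuggy(LocalTime lastSeen, LocalTime now, Duration maxAge) {
    return lastSeen.isBefore(now.minus(maxAge));
  }

  /** Correct variant: LocalDateTime carries the date, so multi-day offsets work. */
  static boolean isStale(LocalDateTime lastSeen, LocalDateTime now, Duration maxAge) {
    return lastSeen.isBefore(now.minus(maxAge));
  }
}
```

With a three-day limit, a channel last seen one hour ago is reported stale by the buggy variant whenever its time of day is earlier than the current one, and a channel last seen almost four days ago can be reported fresh.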

			pr-link: Alluxio#18340
			change-id: cid-5b69e0c87d3bad8556ae27d491f3e0dc567378b9
### What changes are proposed in this pull request?

Fix a bug where, when the file size is less than 1 page, Alluxio considers all pages cached regardless of whether it has really cached them.
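
The PR does not describe the root cause. One common way a check like this goes wrong, shown here purely as an assumption, is computing the expected page count with floor division: a file smaller than one page then has zero expected pages, so "all pages cached" is trivially true. `PageMath` and its methods are hypothetical names.

```java
/** Hypothetical sketch of page-count arithmetic; not the code changed by this PR. */
final class PageMath {
  /** Floor division: reports 0 pages for any file smaller than one page. */
  static long pageCountFloor(long fileLength, long pageSize) {
    requirePositive(pageSize);
    return fileLength / pageSize;
  }

  /** Ceiling division: a partially filled last page still counts as a page. */
  static long pageCount(long fileLength, long pageSize) {
    requirePositive(pageSize);
    return (fileLength + pageSize - 1) / pageSize;
  }

  /** True when at least the expected number of pages is cached. */
  static boolean allPagesCached(long cachedPages, long expectedPages) {
    return cachedPages >= expectedPages;
  }

  private static void requirePositive(long pageSize) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive");
    }
  }
}
```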

### Why are the changes needed?

Bug fixing 

### Does this PR introduce any user facing changes?

N/A
			pr-link: Alluxio#18347
			change-id: cid-da4bdce6615e4b1d9777a98ab335bf36e503d102
### What changes are proposed in this pull request?
Add some metrics and change the registry to expose the metrics more accurately.
Also add the call point of the capacity-related metrics.

### Why are the changes needed?
Using the default registry exposes all the metrics, and some of them are not meaningful for all components.

### Does this PR introduce any user facing changes?
no

			pr-link: Alluxio#18350
			change-id: cid-646ee1f3e41171d2147df29fccaf5a3476b66033
### What changes are proposed in this pull request?

Please outline the changes and how this PR fixes the issue.

### Why are the changes needed?

Please clarify why the changes are needed. For instance,
  1. If you propose a new API, clarify the use case for a new API.
  2. If you fix a bug, describe the bug.

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
  1. change in user-facing APIs
  2. addition or removal of property keys
  3. webui

			pr-link: Alluxio#18352
			change-id: cid-b5f5b695a1c32a65ff3b303cb8227889bee5d81c
### What changes are proposed in this pull request?

1. If an object is created inside `PagedDoraWorker` constructor, extract that creation to before the constructor and use dependency injection to inject it to the worker object. This doesn't change any creation logic, just a refactor to better adapt to dependency injection flavor.
2. There is a circular dependency between `MetaManager` and `PagedDoraWorker`. This change removes that cycle. Now we create one, then create the other. Before, we create one and in the construction, we let `this` ref escape and create the other. Some methods are either moved or changed to `static`.
3. By adapting to dependency injection, we rely on `UfsManager` interface instead of `DoraUfsManager` implementation. Some method signatures are extracted to the interface level.
4. A few other small refactors to get rid of some downcasts and variable scope changes. Reasons are attached in comments on this PR.
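
The shape of items 1 to 3 can be sketched with toy classes. All `Sketch*` names below are hypothetical stand-ins and not the real `PagedDoraWorker`, `MetaManager`, or `UfsManager` signatures: the collaborators are built first, then passed into the worker's constructor, and the worker depends only on the interface.

```java
import java.util.Objects;

/** Hypothetical interface standing in for the UFS manager abstraction. */
interface SketchUfsManager {
  String describe();
}

/** Hypothetical concrete implementation; callers never need to downcast to it. */
final class SketchDoraUfsManager implements SketchUfsManager {
  @Override
  public String describe() {
    return "dora-ufs";
  }
}

/** Created before the worker, so no `this` reference escapes a constructor. */
final class SketchMetaManager {
  private final SketchUfsManager mUfs;

  SketchMetaManager(SketchUfsManager ufs) {
    mUfs = Objects.requireNonNull(ufs, "ufs");
  }

  String ufsDescription() {
    return mUfs.describe();
  }
}

/** The worker receives fully built collaborators instead of creating them itself. */
final class SketchWorker {
  private final SketchUfsManager mUfs;
  private final SketchMetaManager mMeta;

  SketchWorker(SketchUfsManager ufs, SketchMetaManager meta) {
    mUfs = Objects.requireNonNull(ufs, "ufs");
    mMeta = Objects.requireNonNull(meta, "meta");
  }

  SketchMetaManager metaManager() {
    return mMeta;
  }

  SketchUfsManager ufsManager() {
    return mUfs;
  }
}
```

Because the worker only sees the interface, a test can pass in a stub implementation without touching the worker class.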

### Why are the changes needed?

Improve code quality and extensibility.

### Does this PR introduce any user facing changes?

No. All refactor changes are small and equivalent to existing code. So nothing should break.

			pr-link: Alluxio#18181
			change-id: cid-4f9e9bc770b12253188bb541dd456ef3cd889c2b
### What changes are proposed in this pull request?

Add one `RemoteOnlyPolicy` implementation for testing. This is usable for reading all files from remote nodes, rather than the local node itself.

Generally, this policy keeps a thread-safe list of all workers. When a thread reads, the round-robin list returns all available workers after a rotation (moving the first element in the list to the end), and the first remote worker is chosen to read from.
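
The rotate-then-pick behavior described above can be sketched as follows. `RemoteOnlyPicker` is a hypothetical class, not the `RemoteOnlyPolicy` implementation; identifying workers by host name and using a synchronized method for thread safety are both assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of a remote-only, round-robin worker picker. */
final class RemoteOnlyPicker {
  private final List<String> mWorkers;
  private final String mLocalHost;

  RemoteOnlyPicker(List<String> workers, String localHost) {
    mWorkers = new ArrayList<>(workers); // private copy, guarded by the lock on `this`
    mLocalHost = localHost;
  }

  /** Rotates the list by one position, then returns the first worker that is not local. */
  synchronized Optional<String> pick() {
    if (mWorkers.isEmpty()) {
      return Optional.empty();
    }
    mWorkers.add(mWorkers.remove(0)); // move the first element to the end
    for (String worker : mWorkers) {
      if (!worker.equals(mLocalHost)) {
        return Optional.of(worker);
      }
    }
    return Optional.empty(); // only the local worker is available
  }
}
```

Rotating before each pick spreads successive reads across the remote workers instead of always hitting the first one in the list.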

We also added available options of `--mode` in StressWorkerBench to use the new remote only policy.

### Why are the changes needed?

The new policy is for internal testing where all test clients find the remote worker for IO. This policy should not be used in real deployments because if all clients find remote worker, overall throughput can be quite low due to bandwidth restrictions.

### Does this PR introduce any user facing changes?

No, RemoteOnlyPolicy should only be used in internal testing
			pr-link: Alluxio#18273
			change-id: cid-ce534382a1ebd86230296475f4e2d3c6dd862033
Fix the bug that the command line doesn't support the regex file filter
			pr-link: Alluxio#18359
			change-id: cid-c2ec9a5394a1ad776e31251a9c113f4115cb651d