Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add linear chunk format. #494

Merged
merged 35 commits into from
Feb 5, 2025

Conversation

Mili-ssm
Copy link
Contributor

@Mili-ssm Mili-ssm commented Jan 24, 2025

Description

Introduce a new linear chunk format and integrate zstd compression as a dependency. This enhances the compression capabilities of the project.

Just now Linear format is not integrated on any part of the project but we have the needed APIs (sames as Anvil format) to start using it.

Testing

It has its owns unit test (same as the Anvil format)
Load a linear format world to ensure the spec is well implemented

Checklist

  • Add an Advance configuration for Chunk File Format.
  • Ensure the linear format is correctly implemented loading an external world saved with the same format.

Related: #118

@suprohub
Copy link
Contributor

Can you use dynamic var instead of consts? Because plugins can save some stuff with this format.

@Mili-ssm
Copy link
Contributor Author

Mili-ssm commented Jan 27, 2025

Can you use dynamic var instead of consts? Because plugins can save some stuff with this format.

I guess there is no problem? but exactly about what constants are you talking? (some code ref or file/line_number would be great).

@Snowiiii
Copy link
Member

Mhh just tried to test but for some reason got it panicked saying me this
thread '<unnamed>' panicked at pumpkin-world/src/chunk/linear.rs:245:48: range end index 2061500416 out of range for slice of length 52582031.
I converted my anvil files (which worked in pumpkin) into linear and changed linear to the default chunk reader/writer

@Mili-ssm
Copy link
Contributor Author

Mhh just tried to test but for some reason got it panicked saying me this
thread '<unnamed>' panicked at pumpkin-world/src/chunk/linear.rs:245:48: range end index 2061500416 out of range for slice of length 52582031.
I converted my anvil files (which worked in pumpkin) into linear and changed linear to the default chunk reader/writer

with what software do you generate the save? (it is just thr last part i required, the real test hahaha, so this is perfect hehe)

@Snowiiii
Copy link
Member

Mhh just tried to test but for some reason got it panicked saying me this
thread '<unnamed>' panicked at pumpkin-world/src/chunk/linear.rs:245:48: range end index 2061500416 out of range for slice of length 52582031.
I converted my anvil files (which worked in pumpkin) into linear and changed linear to the default chunk reader/writer

with what software do you generate the save? (it is just thr last part i required, the real test hahaha, so this is perfect hehe)

I used linear's own convert script
./convert_region_files.py mca2linear my_region_folder my_linear_folder

@Mili-ssm
Copy link
Contributor Author

Mili-ssm commented Jan 27, 2025

Mhh just tried to test but for some reason got it panicked saying me this
thread '<unnamed>' panicked at pumpkin-world/src/chunk/linear.rs:245:48: range end index 2061500416 out of range for slice of length 52582031.
I converted my anvil files (which worked in pumpkin) into linear and changed linear to the default chunk reader/writer

with what software do you generate the save? (it is just thr last part i required, the real test hahaha, so this is perfect hehe)

I used linear's own convert script ./convert_region_files.py mca2linear my_region_folder my_linear_folder

I add the Advance Configuration ( features.toml ):

# if you choose "Linear" the compression algorithm 
# is ignored and ZStd will be used.

[chunk]
file_format = "Anvil" # or "Linear"

Also, i add some extra checks about the files to ensure is compatible (or otherwise get better info at least for debug). @Snowiiii if do you wanna try and give me some feedback about the new message you receive perfect, if not dont worry i will try tomorrow some tests by my self with the mc converter.

@Mili-ssm
Copy link
Contributor Author

Mili-ssm commented Jan 28, 2025

Okey guys, some good/bad news.

@Snowiiii i have to admit i dont yet try with the mca2linear but i think i get the issue (or any ways i found a great issue xd).

Pumpkin reads a lot in parallel, with anvil this is not a problem but with linear many chunks share file and its compress so you need time to read and write, this tend to create race bugs.

  1. Thread 1 is writing a file while Thread 2 is reading.
  2. The data must be compressed, so also rewritten.
  3. The Thread 2 found an EOF before it would or make an a wrong parse of the data.

For all of this there is some available fixes ideas but i want/need your thoughts.

  • A Queue system.
    To store pending chunks that need to be stored and every X time or X chunks flush everything in bulk (while you swap the queue to store new chunks).

  • A Map with many Muttex.
    To know if "you" (Thread X) can use the File Z with out any problem (many readers at a time is allowed, but if anyone is writer only the writer. something like the rustc borrow checker what a life, hahahahah).

  • Change the APIs.
    To manage this type of things inside the core and add the Bulk API to the ChunkWriter and ChunkReader. We can even, if the "one chunk at a time" function is usefull, do an a default implementation with a ChunkWriter.write_bulk(folder, &[chunk],&[*at]); or something like that.
    (This could be very aligned with the PR Separate chunk saving #470 what try to implement an internal chunk manager and independents writers/readers for the Disk Files if i didnt misunderstand it, please @suprohub give me your opinion)

Any idea or proposition is welcome 😄.

pd 1:
For my tests and for now i am using aliases (file name with a timestamp, then y write, and when is everything written i rename the files with the correct one but this bring many issues like the lose chunks on the thread races but at least less crashes.

pd 2:
good new because i think the spec is well implemented and address this type of stuff now probably save us many future headaches.

@Mili-ssm
Copy link
Contributor Author

Mili-ssm commented Jan 29, 2025

As for now the LinearFormat look working well ( still not tested the official implementation ).

But just for now it has perf issues caused by the de-allocation and write of chunks. As talked before Linear format will benefit from a some type of bulk/batch API for reading and for writing ( too much compressions and decompressions at a time )

As an alternative i think maybe we can have some background loop managing the writes and reads in some type of queue.
ie.

  • I ask for a chunk to be written.
  • The loop will check if we have enough Chunks waiting to be loaded (with an Advance Config) or if the time passed since the oldest request waiting is bigger than some threshold (an other Advance Config)
  • Finally start a writing or a reading of the file.

Anyways this is an early idea, first i want to lookup the Anvil format code to ensure how easy or hard would be fix this from the core and not from the format impl.( basically, why are we doing so much IO ops).

pd: For now i put some annoying dbg! logs to watch how many writings/reading are happening at a time.

@Mili-ssm
Copy link
Contributor Author

Mili-ssm commented Jan 29, 2025

The mca2linear converter was tested with the implementation (it had some bugs but now everything is okey).

The last issues is about performance (tend to kick the player when the server gets saturated from IO operation) but i want to fix this issues in the next PRs (improving the chunk management and improving the ChunkReader and ChunkWritter APIs)

@Mili-ssm Mili-ssm requested a review from Snowiiii January 29, 2025 23:29

pub const CHUNK_AREA: usize = 16 * 16;
pub const SUBCHUNK_VOLUME: usize = CHUNK_AREA * 16;
pub const SUBCHUNKS_COUNT: usize = WORLD_HEIGHT / 16;
pub const CHUNK_VOLUME: usize = CHUNK_AREA * WORLD_HEIGHT;

// Manejador global para múltiples archivos
static FILE_LOCK_MANAGER: LazyLock<FileLocksManager> = LazyLock::new(FileLocksManager::default);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but maybe we can use the File locking crate which we also use to lock the session.lock ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will check that lib and commit the changes or ask to clarify any doubt that i could have.

Copy link
Contributor Author

@Mili-ssm Mili-ssm Feb 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was reading about the lib and its says:

  • Is based on a POSIX spec. (that require the other program/process use the same spec if not it can be ignorable)
  • The windows version require open the file with write permission.
  • Apparently the POSIX spec suffer from write starvation contrary to Tokio RWLock or DashMap.
    Could happen that the writter never get the lock what in this case include accumulate thousands of threads trying to write that chunk and overheading the system.

I am happy if we want to make our own implementation with RWLock and make it global for all the crates or system. Or also make only inside the level struct.
( Its just the thing i made, may be with few improves, but is just a hash table so this dont require neither a lot of develop/LoCs nor complex stuff )

Both cases in my opinion preferable than the POSIX solution.

@Snowiiii
Copy link
Member

Snowiiii commented Feb 1, 2025

Overall I like your code and you did a great Job of using abstraction and not cluttering anything :D. I also really like that you write good comments and documentation for almost everything.

I see some unwraps which may can be replaced but besides of that Great Work @Mili-ssm 👍

@Mili-ssm Mili-ssm requested a review from Snowiiii February 2, 2025 20:45
@Mili-ssm Mili-ssm mentioned this pull request Feb 2, 2025
1 task
@Mili-ssm Mili-ssm mentioned this pull request Feb 3, 2025
1 task
@Snowiiii
Copy link
Member

Snowiiii commented Feb 5, 2025

Everything looks Awesome! Thank you so much for your Hard work @Mili-ssm. I like your Code style 👍. Would love to see more from You 😄

Thank you ❤️

@Snowiiii Snowiiii merged commit 12866bc into Pumpkin-MC:master Feb 5, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants