Instructions for training #106
I haven't gotten around to writing up good instructions for using MzingaTrainer, but here's what I sent the last person who asked me:

When I was first trying to find the best numbers, I generated a folder of various profiles and used the lifecycle command with the following script:

rm %~dp0trainer.log
MzingaTrainer.exe lifecycle -pp "%~dp0profiles" -lg 3 -lb 1 -ckc 50 -msp false -mpc 10 -mminm -1.0 -mmaxm 10000.0 -tmt 00:00:05 -btl 00:30:00 -pgc 100 -bbtl 12:00:00 -bsp true -mcb 8 -fpc false -gt "Base" | tee -a %~dp0trainer.log

Lifecycle makes the profiles battle each other for a cycle, then the "weakest" profiles are culled from the herd and the "strongest" are allowed to "mate", generating new profiles with numbers mixed from their parents. Mechanisms are in place to give profiles a provisional "childhood" period: they can't be culled and can't mate until they've fought sufficient battles.

It was a slow process, so eventually I switched to using autotrain. I exported the latest built-in numbers from MzingaTrainer using the exportai command, then tweaked the name, id, and filename of each (since autotraining overwrites the numbers, and ids are based on the version number). That way each could be considered a new profile (and not conflict with a normal exportai file).

Then I wrote a script to train each of these profiles (exportai generates one per game type) against its gametype:

@echo off
setlocal
for /L %%N in () do (
call :at %~dp000080000-000a-0000-0000-000000000000.xml "Base"
call :at %~dp000080001-000a-0000-0000-000000000000.xml "Base+M"
call :at %~dp000080002-000a-0000-0000-000000000000.xml "Base+L"
call :at %~dp000080003-000a-0000-0000-000000000000.xml "Base+ML"
call :at %~dp000080004-000a-0000-0000-000000000000.xml "Base+P"
call :at %~dp000080005-000a-0000-0000-000000000000.xml "Base+MP"
call :at %~dp000080006-000a-0000-0000-000000000000.xml "Base+LP"
call :at %~dp000080007-000a-0000-0000-000000000000.xml "Base+MLP"
timeout 30
)
goto :eof
:at
echo Starting %2
copy /y %1 %1.bak
%~dp0\MzingaTrainer.exe autotrain -tpp %1 -tmt 00:00:05 -btl 00:25:00 -mb 8 -mht 3 -tts 64 -gt %2
exit /b
goto :eof
endlocal

Then, after weeks of letting these run, I would test these new profiles using the battleroyale command against a pool of profiles that includes the exportai from previous versions of Mzinga. At the end of the battle royale (once per game type) I would use the mergetop command to get the top profiles from each gametype and generate one config file I could copy into Mzinga for the next release.

@echo off
setlocal
set OD=/path/where/autotrained/profiles/are
set LOG=%~dp0brawl.log
set MT=%~dp0MzingaTrainer.exe
set BACKUPS=%~dp0autotrained
set PROFILES=%~dp0profiles
rm %LOG%
rem for /L %%N in () do (
copy /Y %OD%\*.bak "%BACKUPS%\" | tee -a %LOG%
copy /Y %OD%\*.xml "%PROFILES%\" | tee -a %LOG%
%MT% battleroyale -pp "%PROFILES%" -tmt 00:00:05 -btl 00:05:00 -bbtl 1.00:00:00 -bsp true -mcb 6 -fpc true -agt true | tee -a %LOG%
%MT% mergetop -pp "%PROFILES%" | tee -a %LOG%
rem )
endlocal

You can see my script uses a command called "tee", which lets you direct the output from the trainer to both the console and a log file simultaneously. That way I could open the log file and review how things were going without having to scroll the limited console window, since these operations can take days or weeks depending on your parameters.
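
Note that "tee" is not built into cmd.exe, so the scripts above assume a Windows port of it is on the PATH. If that is not available, the same effect can be sketched in a few lines of Python. This is only an illustration, not part of Mzinga: the helper name `run_and_tee` is hypothetical, and it appends to the log in the same way `tee -a` does.

```python
import subprocess
import sys


def run_and_tee(cmd, log_path):
    """Run cmd, echoing each output line to the console and appending it to log_path.

    Hypothetical helper: mimics `cmd | tee -a log_path` for a single command.
    Returns the exit code of the command.
    """
    with open(log_path, "a", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold error output into the same stream
            text=True,
        )
        for line in proc.stdout:
            sys.stdout.write(line)  # show progress on the console
            log.write(line)         # keep a copy in the log file
            log.flush()             # so the log can be read while training runs
        return proc.wait()
```

With this in place, the last step of the script above could be run as `run_and_tee(["MzingaTrainer.exe", "mergetop", "-pp", "profiles"], "brawl.log")`, assuming the trainer executable and the profiles folder are in the current directory.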
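
For readers who want a mental model of the lifecycle command described at the start of this comment, here is a rough Python sketch of one cull-and-mate step. It is not MzingaTrainer's actual code. The `Profile` class, the numeric `rating`, the `min_battles` threshold, and the rule of copying each weight from a randomly chosen parent are all assumptions made for illustration; the only parts taken from the description are that the weakest profiles are culled, the strongest mate, and provisional profiles do neither.

```python
import random
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical stand-in for a trainer profile."""
    name: str
    weights: dict            # metric name -> number
    rating: float = 1200.0   # assumed strength score from past battles
    battles: int = 0         # battles fought so far


def mate(a, b, rng):
    """Create a child whose weights are mixed from both parents (assumed rule)."""
    weights = {k: rng.choice((a.weights[k], b.weights[k])) for k in a.weights}
    return Profile(name=f"{a.name}+{b.name}", weights=weights)


def cull_and_mate(profiles, cull_count, mate_count, min_battles, rng):
    """Return the next generation after one cull-and-mate step."""
    # Provisional profiles (too few battles) can be neither culled nor mated.
    eligible = [p for p in profiles if p.battles >= min_battles]
    ranked = sorted(eligible, key=lambda p: p.rating, reverse=True)

    parents = ranked[:mate_count]
    rest = ranked[mate_count:]
    culled = rest[max(0, len(rest) - cull_count):]  # weakest non-parents
    culled_ids = {id(p) for p in culled}

    survivors = [p for p in profiles if id(p) not in culled_ids]
    # Pair the parents off in rank order; an odd one out does not mate.
    children = [
        mate(parents[i], parents[i + 1], rng)
        for i in range(0, len(parents) - 1, 2)
    ]
    return survivors + children
```

Children start with zero battles, so they are provisional in the next step and get a chance to fight before they can be culled.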
Hi there,
As an avid Hive player and enthusiast I love and support what you are doing here. I've always wondered how AlphaZero would approach Hive ^^
I have played a few games against the Mzinga AI and it is rather weak at the moment. I was wondering whether you could break down how the training works, and the correct commands and procedure to use, in order to make the AI a little stronger. I have a 16-core processor and would love to train new engines for this.
I have tried to make it do what I want it to via the cmd but unfortunately I am not savvy enough to get it to work.