Minor Bug fixes, notebook examples, and Documentation of added functionality #139

Open
wants to merge 3 commits into
base: master

Youbadawy
Contributor

Hello all, I fixed the minor bugs I found while working with the current repository.

I will try to document this as well as I possibly can, as per @spencerR1992's request.

So allow me to explain the included functionality, as displayed in the two notebooks I created:

  • AI4Finance-Crypto-MultiCoin.ipynb
  • AI4Finance-Stock-MultiStock.ipynb

Let us begin with the entry point of the project, which holds all hyperparameters, secret keys, and per-user settings: the config file.

1- Configuration
from finrl.config.configuration import Configuration

The configuration module has two main functions/methods, which are used to extract a dictionary of the information needed across the RL project, such as

  • the coins/stocks focused on
  • the technical indicators looked at,
  • the exchange details and api keys
  • ... (and a lot more)

To retrieve the file within a notebook/script, we use the following method:
config = Configuration.from_files(["./notebooks/config.json"])
and to get, for example, the coins that will be traded, we use:
config.get("exchange").get("pair_whitelist")
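
Since the lookup above depends on the layout of config.json, here is a minimal sketch of what such a file could contain and how the nested lookup behaves. Only the `exchange` / `pair_whitelist` keys come from the description above; the `name` key, the pair values, and the use of plain `json` in place of `Configuration` are assumptions made purely for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical minimal config: only "exchange" -> "pair_whitelist" is taken
# from the description above, everything else is illustrative.
example_config = {
    "exchange": {
        "name": "binance",
        "pair_whitelist": ["ETH/BTC", "LTC/BTC"],
    }
}

with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(json.dumps(example_config, indent=2))

    # Plain-json stand-in for Configuration.from_files([...])
    config = json.loads(config_path.read_text())

# Giving the outer .get() a default avoids an AttributeError
# when the "exchange" section is missing from the file.
pairs = config.get("exchange", {}).get("pair_whitelist", [])
print(pairs)
```

Chaining `.get()` with defaults keeps the lookup safe on a partially filled config, at the cost of silently returning an empty list when the section is absent.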

Now, you don't necessarily have to open your config file every time you want to edit something, so I made two functions/methods specifically for this:

from finrl.config import setup_utils_configuration

ARGS_DOWNLOAD_DATA = {'config': ['./notebooks/config.json'], 'datadir': None, 
                      'user_data_dir': None, 'pairs': None, 'pairs_file': None, 
                      'days': 1825, 'timerange': None, 
                      'download_trades': False, 'exchange': 'binance', 
                      'timeframes': '1d', 'erase': False, 
                      'dataformat_ohlcv': None, 'dataformat_trades': None}

# Adds the ARGS values to the config for further use through config.
# Note: RunMode must also be imported; its import is not shown in this snippet.

config = setup_utils_configuration(ARGS_DOWNLOAD_DATA, RunMode.UTIL_EXCHANGE)
print(config.get("timeframes"))

The method takes in a dictionary and overrides the config values with whatever you specify in it.
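
To make the override behaviour concrete, here is a rough sketch of the idea, not the actual `setup_utils_configuration` implementation. The helper name `apply_overrides` is hypothetical, and it assumes that a value of `None` means "keep whatever the config file says".

```python
from copy import deepcopy


def apply_overrides(config: dict, args: dict) -> dict:
    """Hypothetical helper: return a copy of config with non-None args applied."""
    merged = deepcopy(config)
    for key, value in args.items():
        if value is not None:  # assumption: None means "keep the config value"
            merged[key] = value
    return merged


# Values as they might be read from config.json (illustrative only)
file_config = {"timeframes": "5m", "days": 30, "datadir": "./user_data/data"}
# Overrides in the style of ARGS_DOWNLOAD_DATA
args = {"timeframes": "1d", "days": 1825, "datadir": None}

config = apply_overrides(file_config, args)
print(config)
```

Returning a copy rather than mutating the input keeps the values loaded from disk available for comparison.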

Another example: instead of constantly researching the top-volume coins by hand, I made a quick method that pulls them from the exchange and writes them to your config file, using the code below.

CRYPTO

from finrl.tools.coin_search import *
import json

# Search the top coins by volume; `top` sets how many to return (e.g. 50, 100, 200).
# Default is 50.

coins = coinSearch("BTC", top=5)

# Add them to the config file's pair_whitelist
coins_to_json("./notebooks/config.json", coins)

STOCKS

from finrl.tools.coin_search import *

tickers = ["FUBO", "JMIA", "AMZN", "APPN", "LUV", "IBIO", "BB","TSLA", "BABA", "GOOG", "RLLCF", "NIO", "RIOT", "BNGO", "MVIS", "IDEX", "NOK", "PLTR", "TSNP", "CCIV", "JNJ", "JPM", "MA", "PG", "BAC", "PFE", "UPS", "WFC", "SNOW", "USB", "INTC", "EA", "AMD", "NDA", "PYPL"  ]
len(tickers)

stocks_to_json("./notebooks/config.json", tickers)
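
Both helpers above rewrite a list inside the config file. A minimal sketch of that idea follows, under the assumption that the file is plain JSON and the list lives under a section/key pair. `write_list_to_config` is a hypothetical name, not the function shipped in this PR.

```python
import json
import tempfile
from pathlib import Path


def write_list_to_config(config_file, section, key, values):
    """Hypothetical helper: replace config[section][key] with values and save."""
    path = Path(config_file)
    config = json.loads(path.read_text())
    # setdefault creates the section if the file does not have it yet
    config.setdefault(section, {})[key] = list(values)
    path.write_text(json.dumps(config, indent=4))
    return config


with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "config.json"
    config_file.write_text(json.dumps({"exchange": {"pair_whitelist": []}, "days": 365}))

    write_list_to_config(config_file, "exchange", "pair_whitelist", ["ETH/BTC", "XRP/BTC"])
    saved = json.loads(config_file.read_text())

print(saved["exchange"]["pair_whitelist"])
```

Only the targeted list changes; every other key in the file is written back untouched.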

2- User Data Directory

Machine learning and algorithmic trading rely on constant use of data, agents, and models, with a lot of back and forth between training and calculation. Data is therefore essential, and starting from clean data matters. Taking the 5-minute timeframe of the crypto market as an example, repeatedly downloading the same data is time-consuming and inefficient, so the data should be downloaded once and then reused.

To store this data properly, each project user starts by creating their own user directory using the following method:

from finrl.config.directory_operations import create_userdata_dir
create_userdata_dir("./user_data",create_dir=True)

where the directory being used is defined by config["user_data_dir"] or config["datadir"].
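
For readers unfamiliar with this pattern, the sketch below shows roughly what creating a user directory involves. It is not the code behind `create_userdata_dir`: the helper name `make_user_dir` and the single `data` subfolder are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def make_user_dir(directory, create_dir=False, subdirs=("data",)):
    """Hypothetical stand-in: ensure a user directory and its subfolders exist."""
    folder = Path(directory)
    if not folder.is_dir():
        if not create_dir:
            raise FileNotFoundError(f"{folder} does not exist; pass create_dir=True")
        folder.mkdir(parents=True)
    for name in subdirs:
        # exist_ok makes repeated calls harmless
        (folder / name).mkdir(exist_ok=True)
    return folder


with tempfile.TemporaryDirectory() as tmp:
    user_dir = make_user_dir(Path(tmp) / "user_data", create_dir=True)
    created = sorted(p.name for p in user_dir.iterdir())

print(created)
```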

3- Data Downloading

Now for the main input of your agent: the data. Here we choose which data to download into the user data directory specified above, using the methods below.

ARGS_DOWNLOAD_DATA = {'config': ['./notebooks/config.json'], 'datadir': None, 
                      'user_data_dir': None, 'pairs': None, 'pairs_file': None, 
                      'days': 365, 'timerange': None, 
                      'download_trades': False, 'exchange': 'binance', 
                      'timeframes': ['5m'], 'erase': False, 
                      'dataformat_ohlcv': None, 'dataformat_trades': None}

# Downloads data to our local data repository as dictated by our config; this can be overridden using 'datadir'.

start_download_cryptodata(ARGS_DOWNLOAD_DATA)

Here we specify which data we would like to download, which timeframes, and so on; anything left blank (None) is fetched automatically from your config.json file. Note that timeframes is a list only in this method, since several timeframes can be downloaded at once. This functionality is not available for the stock method.

ARGS_DOWNLOAD_DATA = {'config': ['./notebooks/config.json'], 'datadir': None, 
                      'user_data_dir': None, 'days': None, 'timerange': "20020101-20210101",
                      'timeframes': ['1d'], 'erase': False}


start_download_stockdata(ARGS_DOWNLOAD_DATA)

Also note that the amount of data to download can be specified in two ways: either a 'timerange': "20020101-20210101", or a number of days, 'days': 365, which goes back 365 days from today.
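
The two options can be sketched as a small date-window resolver. This is an illustration only: `resolve_window` is a hypothetical helper, and letting `timerange` take precedence when both values are set is an assumption, not confirmed behaviour of the download functions.

```python
from datetime import date, datetime, timedelta


def resolve_window(args, today=None):
    """Hypothetical helper: turn 'timerange' or 'days' into (start, end) dates."""
    today = today or date.today()
    timerange = args.get("timerange")
    if timerange:  # assumption: timerange wins if both are given
        start_text, end_text = timerange.split("-")
        start = datetime.strptime(start_text, "%Y%m%d").date()
        end = datetime.strptime(end_text, "%Y%m%d").date()
        return start, end
    days = args.get("days")
    if days:
        return today - timedelta(days=days), today
    raise ValueError("specify either 'timerange' or 'days'")


# A fixed "today" keeps the example reproducible
fixed_today = date(2021, 3, 1)
by_range = resolve_window({"timerange": "20020101-20210101", "days": None}, fixed_today)
by_days = resolve_window({"timerange": None, "days": 365}, fixed_today)
print(by_range)
print(by_days)
```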

4- Fetching the Data

Here is where we retrieve the saved data for use. This had an issue before: changing the location of the files, or where you save the user_data folder, caused problems. It is now fixed by doing a full directory walk to find all data and using only what is specified by your config["datadir"], config["exchange"], and config["user_data_dir"].

from finrl.data.fetchdata import FetchData

df = FetchData(config).fetch_data_stock()


from finrl.data.fetchdata import FetchData

df = FetchData(config).fetch_data_crypto()

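
The directory walk described in this section can be sketched as follows. This is not the `FetchData` implementation: the helper `find_data_files`, the `data/<exchange>/` folder layout, and the `PAIR-timeframe.json` file naming are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def find_data_files(user_data_dir, exchange, timeframe):
    """Hypothetical helper: walk user_data_dir, keep files for one exchange/timeframe."""
    root = Path(user_data_dir)
    matches = []
    for path in root.rglob(f"*-{timeframe}.json"):
        # the exchange name must appear as a folder below the root
        if exchange in path.relative_to(root).parts:
            matches.append(path)
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "user_data"
    for rel in ("data/binance/ETH_BTC-5m.json",
                "data/binance/ETH_BTC-1d.json",
                "data/kraken/ETH_BTC-5m.json"):
        file_path = root / rel
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.write_text("[]")

    found = [p.relative_to(root).as_posix()
             for p in find_data_files(root, "binance", "5m")]

print(found)
```

Because the walk starts from the configured root, moving the whole user_data folder does not break the lookup as long as the config points at the new location.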

All of this functionality is displayed in the two notebooks mentioned above.

What we currently need to do is integrate this functionality with the pre-existing code that you have written, for example:

  • the backtest_plot functionality should fetch data from the user directories mentioned in config instead of re-downloading it.
  • the config.py information should be included in the config.json file.
  • agents, models, and other preprocessed dataframes should be saved in the user_data directory.
  • Whatever else you guys think of; I'm sure there are a lot of possible improvements and ideas, given your experience and seniority in the project.

Looking forward to further criticism, advice, and ideas to improve this amazing project!

Thanks

@XiaoYangLiu-FinRL @BruceYanghy @spencerR1992 @Yonv1943

@YangletLiu
Contributor

@Youbadawy Can we divide it into multiple function pieces? Then, we can update each one.

@Youbadawy
Contributor Author

> @Youbadawy Can we divide it into multiple function pieces? Then, we can update each one.

Yes, definitely, @XiaoYangLiu-FinRL. I am currently focused on creating an environment, using operations research, that fits the stock market; that is why I added the proposed functionality.

I will get to dividing it into smaller pieces by the end of this week, hopefully!

@setar

setar commented Mar 2, 2021

> to retrieve the file within the notebook/script we use the following method
> config = Configuration.from_files(["./notebooks/config.json"])

The commits in this PR do not include an example of notebooks/config.json.

@setar

setar commented Mar 2, 2021

> start_download_stockdata(ARGS_DOWNLOAD_DATA)

I think the downloaded data should be compressed in storage (via gzip, for example).
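
As a rough illustration of this suggestion, the sketch below writes JSON data through gzip and reads it back using only the standard library. The file name and the candle values are made up, and nothing here reflects how the PR currently stores data.

```python
import gzip
import json
import tempfile
from pathlib import Path

# Made-up OHLCV row: timestamp, open, high, low, close, volume
candles = [[1614643200000, 0.031, 0.032, 0.030, 0.0315, 1250.0]]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "ETH_BTC-5m.json.gz"

    # "wt" opens the gzip stream in text mode so json can write to it directly
    with gzip.open(target, "wt", encoding="utf-8") as handle:
        json.dump(candles, handle)

    # "rt" transparently decompresses while reading
    with gzip.open(target, "rt", encoding="utf-8") as handle:
        restored = json.load(handle)

print(restored == candles)
```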

@spencerR1992
Contributor

@Youbadawy Thank you very much for this work!
Can you please add some tests for the data download code? Specifically for the changes you made to fetchdata.py?

Any new functionality like this should have testing!

If you need help with testing please let me know. It should be as simple as writing tests like the ones already in the repo, building the docker container and running ./docker/bin/test.sh to verify that your tests work.

This will ensure the reliability of the functionality and give us an integration test, so that we don't break it in the future.

4 participants