
td_seq perf improvement #774

Merged
1 commit merged into twopirllc:development on Mar 27, 2024
Conversation

@Rossco8 commented Mar 10, 2024

I'm not familiar with this particular indicator, but the slowness was caused by the use of the apply() function on the rolling window in calc_td().

Using the Pandas native functions cumsum() and where() should give the same results. The test data I used confirmed the output is identical, but I'm not sure whether additional test cases are needed.
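Roughly, the replacement relies on the standard cumsum()/where() idiom for counting consecutive True values. The snippet below is a simplified sketch of that idiom, not the actual calc_td() code:

import pandas as pd

def consecutive_count(cond: pd.Series) -> pd.Series:
    # Cumulative count of True values.
    csum = cond.cumsum()
    # Value of the cumulative count at the last False, carried forward;
    # subtracting it resets the count to zero whenever cond is False.
    reset = csum.where(~cond).ffill().fillna(0)
    return (csum - reset).astype(int)

# Example: TD-Sequential-style setup count (close vs close 4 bars earlier)
close = pd.Series([10, 11, 12, 13, 14, 13, 12, 15, 16, 17, 18, 19])
up = close > close.shift(4)
print(consecutive_count(up))  # 0 0 0 0 1 2 0 1 2 3 4 5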

Original performance as reported by cProfile

res = cProfile.run("momentum.td_seq(df['Close'])", sort='tottime')

2706808 function calls (2669110 primitive calls) in 2.399 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    30162    0.165    0.000    0.192    0.000 generic.py:6233(__finalize__)
602524/589996    0.137    0.000    0.268    0.000 {built-in method builtins.isinstance}
   109722    0.107    0.000    0.133    0.000 generic.py:42(_instancecheck)
    12602    0.093    0.000    0.452    0.000 managers.py:317(apply)
     2520    0.084    0.000    0.989    0.000 generic.py:10612(_where)
     5048    0.082    0.000    0.374    0.000 series.py:389(__init__)
    20080    0.066    0.000    0.154    0.000 series.py:664(_constructor_from_mgr)
    25129    0.046    0.000    0.054    0.000 generic.py:278(__init__)
     4962    0.046    0.000    0.134    0.000 array_ops.py:288(comparison_op)
    32695    0.037    0.000    0.084    0.000 generic.py:6298(__setattr__)

Updated code performance

4936 function calls (4867 primitive calls) in 0.014 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        2    0.001    0.001    0.001    0.001 algorithms.py:1339(diff)
        2    0.001    0.000    0.001    0.000 missing.py:968(_pad_2d)
       47    0.001    0.000    0.001    0.000 generic.py:6233(__finalize__)
       26    0.000    0.000    0.005    0.000 managers.py:317(apply)
        4    0.000    0.000    0.006    0.001 generic.py:10612(_where)
  999/977    0.000    0.000    0.001    0.000 {built-in method builtins.isinstance}
      6/4    0.000    0.000    0.002    0.000 blocks.py:1524(where)
        8    0.000    0.000    0.001    0.000 series.py:389(__init__)
      134    0.000    0.000    0.000    0.000 generic.py:42(_instancecheck)
        1    0.000    0.000    0.000    0.000 sre_parse.py:493(_parse)

@twopirllc twopirllc merged commit 872a18a into twopirllc:development Mar 27, 2024
1 check failed
@twopirllc (Owner) commented

@Rossco8

Thanks for the PR with the speed improvements. Updating it and a few other indicators and methods has drastically improved the overall speed of the library. 😎

On another note, TradingView (TV) has pulled down and closed-sourced all public DeMark indicators due to trademark™️ reasons. To get ahead of the issue, I modified it some and named it exhc, as we do not truly know what the correct source is and thus are not able to validate it.

Lastly, in the updated development branch, I incorporated a numba version of wma to speed it up as well, around the same time you submitted #776. I'm curious about the speed differences you find between the one I updated and the recent one you submitted.

KJ

@Rossco8 (Author) commented Apr 1, 2024

Nice work @twopirllc on exhc(), it looks great. I totally agree this approach distances the pandas-ta library from potential trademark issues.

To answer the question regarding the Numba version of wma() in your development branch, I did some test runs and compared the results to the version in my PR, and I don't think Numba is any real benefit in this instance. In my opinion, if there is a Numpy or Pandas native function for a calculation (e.g. convolve(), concatenate(), etc.), it should be preferred over using Numba. Numba can offer benefits when iterating over Numpy arrays, but the downsides of yet another dependency, larger container sizes and a slow first run while it JITs are not worth it when you compare the execution times.

In the test runs I did, I see consistent 0.001s calls to wma() in the Numba version when calling the function for the second, third, etc. time, but for the first execution, while the function JITs, I regularly saw 1.6s - 2.3s. I retested my proposed version and see consistent 0.002s and 0.001s (all tests on my M2 MacBook).
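For reference, the kind of NumPy-native WMA I have in mind is roughly the following (a sketch only; the linear-weight convention is assumed and may differ in detail from the library's wma()):

import numpy as np
import pandas as pd

def wma_convolve(close: pd.Series, length: int = 10) -> pd.Series:
    # Linearly increasing weights: the most recent bar gets the largest weight.
    weights = np.arange(1, length + 1, dtype=float)
    weights /= weights.sum()
    # "valid" convolution produces one value per full window; np.convolve flips
    # its second argument, so the weights are reversed to keep them ascending.
    values = np.convolve(close.to_numpy(dtype=float), weights[::-1], mode="valid")
    # Pad the warm-up period with NaN so the output aligns with the input index.
    out = np.concatenate([np.full(length - 1, np.nan), values])
    return pd.Series(out, index=close.index, name=f"WMA_{length}")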

@twopirllc (Owner) commented

@Rossco8

To answer the question regarding the Numba version of wma() in your development branch, I did some test runs and compared the results to the version in my PR, and I don't think Numba is any real benefit in this instance.

Yeah, probably not for wma (et al). It was just something I whipped up while working on ht_trendline since it was also using wma. I will implement the best calculation for the job in an upcoming push.

In my opinion, if there is a Numpy or Pandas native function for a calculation (e.g. convolve(), concatenate(), etc.), it should be preferred over using Numba.

Since incorporating numba, I decided that core calculations should be done with numpy, dense loop calculations with numba, and pandas functions when necessary.

Numba can offer benefits when iterating over Numpy arrays, but the downsides of yet another dependency, larger container sizes and a slow first run while it JITs are not worth it when you compare the execution times.

I hear you. I contemplated for a long time whether or not to require numba. Originally the library was meant to have few dependencies (just pandas and its dependents), but demands have required a bit more.

Undoubtedly requiring numba will increase container size, but I think the speed improvements alone outweigh those concerns. I wish there were more insight into user environments, but I assume, maybe incorrectly, that many users already have numba in their dependency chain. 🤷🏼‍♂️

Yeah, I am not thrilled with the initial jit compilation either, but it is a minor initial inconvenience. I just do a dry run with 200 bars to initialize and test that all the indicators ran. Numba does have Ahead-of-Time (AOT) compilation, which may become deprecated and does have some caveats, but unfortunately I have not had the opportunity to test it and determine its feasibility for inclusion. 😅
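For context, the dense-loop-plus-dry-run pattern looks roughly like this (a simplified sketch, not the exact wma calculation in development):

import numpy as np
from numba import njit

@njit(cache=True)
def nb_wma(x: np.ndarray, length: int) -> np.ndarray:
    # Dense loop: linearly increasing weights over each rolling window.
    n = x.size
    out = np.full(n, np.nan)
    denom = length * (length + 1) / 2.0
    for i in range(length - 1, n):
        s = 0.0
        for j in range(length):
            s += x[i - length + 1 + j] * (j + 1)
        out[i] = s / denom
    return out

# Dry run on ~200 bars so the one-off JIT cost is paid up front.
nb_wma(np.random.default_rng(0).standard_normal(200), 10)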

In the test runs I did, I see consistent 0.001s calls to wma() in the Numba version when calling the function for the second, third, etc. time, but for the first execution, while the function JITs, I regularly saw 1.6s - 2.3s. I retested my proposed version and see consistent 0.002s and 0.001s (all tests on my M2 MacBook).

Nice. How many bars did you test with? Did you test via a terminal or in a Notebook? I'm curious about the difference between my M1 and your M2.

@Rossco8 (Author) commented Apr 4, 2024

@twopirllc

I just checked Numba's size on disk; it is only 23MB, including the binary blobs of compiled code that were created when I was doing my testing. So maybe increased container size isn't an issue - what's another 23MB, hey?

The testing setup is very rudimentary, just a Python script. I added a new function wma2() to the overlap package alongside the Numba version called wma(), and called both using cProfile, e.g.

res = cProfile.run('overlap.wma(df["close"])', sort='tottime')
res = cProfile.run('overlap.wma2(df["close"])', sort='tottime')

I used both AirBnB EOD prices, which cover only 3 years, and also IBM, which has over 15K rows dating back to 1962. There was no real difference in the performance of the functions between the AirBnB data and IBM; both were basically the same speed.
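The harness itself was roughly along these lines (a sketch; the file name is a placeholder and wma2() exists only in my local test branch, not in the library):

import time
import pandas as pd
from pandas_ta import overlap

df = pd.read_csv("ibm_eod.csv", index_col="date", parse_dates=True)

for fn in (overlap.wma, overlap.wma2):   # wma2() is my local non-Numba variant
    for run in range(3):                 # run 0 includes any one-off JIT cost
        start = time.perf_counter()
        fn(df["close"])
        print(fn.__name__, run, f"{time.perf_counter() - start:.4f}s")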

@Rossco8 Rossco8 deleted the td_sequpdates branch April 4, 2024 22:06