Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicit dyn fsm compilation #2395

Draft
wants to merge 20 commits into
base: main
Choose a base branch
from
Draft

Conversation

parthsarkar17
Copy link
Contributor

Introduces an option to tdcc ( -x tdcc:infer-fsms ) that creates an explicit Calyx FSM construct for each dynamic FSM within a schedule. When compiled with the Verilog backend, each FSM construct compiles to a separate FSM module, which is then instantiated exactly once in the module for the Calyx component itself. The benefit of this modularization is that the Vivado toolchain can infer that an FSM has been created and optimize the hardware based on this knowledge.

To-do's before merging:

  • Verify that the FSM is indeed optimized by Vivado (we know it is inferred, but want to make sure the resources and/or clock period goes down)
  • Finish implementing the allocation of these new FSM constructs for par blocks. A small bug exists right now

- consolidated explicit fsm construction with inlined fsm construction
- added an option to explicitly construct fsm
- need to implement this for par blocks
- can most likely get rid of dyn-fsm-allocation now
@rachitnigam
Copy link
Contributor

Cool stuff! Incredibly dumb nit but is there a good reason to keep this logic within tdcc instead of separating it out into a different lowering pass that can just be run in a different pass pipeline? This would also make it easier to define an alias (all-with-fsm) that we can use to run all of the regression tests and make sure it works!

@parthsarkar17
Copy link
Contributor Author

That's a good point! The reason I put it in as a tdcc option is because the other pass I made (dfsm) had too much in common with tdcc, specifically the whole calculate_states portion. Maybe a solution would be to factor that code out, and make two different passes (namely tdcc and dfsm) that each call those functions? Because you're right, it's possible that we may eventually want two totally different lowering processes...

@rachitnigam
Copy link
Contributor

Right, I think that refactoring + separation makes sense to me. We do want separate flows so that we can differentially test them.

- defined a method to collect a list of states at which an assignment should be valid, based on the assignment's guard
- modified the state2assigns map to, for each such state, contain a copy of the assignment
- need to figure out a way to get rid of all interval portions of a static guard
- Built functionality to change a StaticTiming Guard into a Nothing Guard
- This was acceptable because the FSM construct maps FSM states to assignments that are active at those states. Therefore, we don't need extraneous guards that are implicitly being checked by virtue of the fact that they are placed in a certain spot at the FSM. In this case, the implicit guards are the intervals represnted by Guard::Info objects. I got rid of every Guard::Info leaf in a Guard tree, and was therefore able to turn a StaticTiming Guard into a Nothing Guard
- Need to implement Visitor trait for this new pass
- put all assignments into the fsm module
- i.e., they don't depend on state-indicators like they do in dynamic fsms... i probably need to change this
…ts, guarded by state-wires assigned to at the corresponding FSM state
…d to modify verilog backend to support merged states
- Any static<n> component will have exactly one FSM to control it
- This FSM should not write to comp[done], and should take exactly n cycles
- The need to take n cycles implies we cannot have distinct IDLE/CALC states. We must merge them
- Further, this n cycles requirement means we cannot write to comp[done], meaning control must be empty
- Getting rid of the FSMEnable node representing the FSM means that wire-inliner cannot drive fsm[start]
- Hence, we must do it ourselves for FSMs that are strictly static (i.e. don't maintain go-done interface)
- ignoring static-promoted components; this technically breaks their static<n> guarantee because we add a DONE state to assign fsm[done], which is another cycle
- this probably doesn't matter, but will be good to fix in the future because it's confusing
- still failing a few runt tests
…n fsm; solved some failing dynamic runts tests
- Added a  register to pulse  at state 0 for promoted static components
- still failing skid buffer benchmark; but something strange is going on there; can proceed with polybench
- Using TDCC-like method: will compile StaticRepeat bodies to FSMs using finish_repeat
- Will call construct_schedule on a static schedule: need to StaticFSMEnable node
- Will need to compile par blocks in finish_par: allocate an enable or an fsm_enable for each block
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants