A directed cascade stream connection from one Worker to another.

Construct one for each cascade edge in your design::

    CascadeFlow(producer_worker, consumer_worker)

Lowers to ``aie.cascade_flow(producer.tile, consumer.tile)`` after both
Workers are placed. The kernel functions are responsible for calling the
``put_mcd`` / ``get_scd`` intrinsics to write to and read from the cascade
stream; this object only declares the directed topology edge.
Hardware constraints (enforced by the underlying op verifier):

* Source and destination tiles must be cardinal-adjacent.
* Each compute tile has at most one cascade input (from N or W) and one
  cascade output (to S or E). Multiple cascade outputs from the same
  tile will fail at lowering, not at construction.
* ShimTiles and MemTiles do not have cascade interfaces.
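
Because these constraints are only reported at lowering, it can help to check a planned placement up front. The following is a minimal sketch, not part of this API: it assumes tiles are plain ``(col, row)`` tuples with row numbers increasing northward, and the helper names ``cascade_direction`` and ``check_cascade_edges`` are hypothetical. It does not model ShimTile or MemTile rows.

```python
from collections import Counter


def cascade_direction(src, dst):
    """Return "S" or "E" if dst is the south or east neighbor of src, else None.

    Assumption: tiles are (col, row) tuples and rows increase northward.
    """
    (src_col, src_row), (dst_col, dst_row) = src, dst
    if dst_col == src_col and dst_row == src_row - 1:
        return "S"
    if dst_row == src_row and dst_col == src_col + 1:
        return "E"
    return None


def check_cascade_edges(edges):
    """Return a list of problem descriptions for (src, dst) cascade edges."""
    problems = []
    for src, dst in edges:
        if cascade_direction(src, dst) is None:
            problems.append(f"{src} -> {dst}: destination is not a south or east neighbor")
    # Each tile may drive at most one cascade output and accept at most one input.
    for tile, count in Counter(src for src, _ in edges).items():
        if count > 1:
            problems.append(f"{tile}: {count} cascade outputs (at most one allowed)")
    for tile, count in Counter(dst for _, dst in edges).items():
        if count > 1:
            problems.append(f"{tile}: {count} cascade inputs (at most one allowed)")
    return problems
```

An empty list means the planned edges pass these simplified checks; the op verifier remains the authority on what is actually legal.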
Discovery: each newly constructed CascadeFlow registers itself on its
*source* Worker's ``_outgoing_cascades`` list. ``Program.resolve()``
walks the runtime's workers and resolves each worker's outgoing
cascades after placement, so there is no global registry and no drain step.
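
The self-registration pattern described above can be illustrated with a stripped-down sketch. The classes ``SketchWorker`` and ``SketchCascadeFlow`` and the function ``resolve_all`` are hypothetical stand-ins, not the real Worker, CascadeFlow, or ``Program.resolve()``; tiles are assumed to be plain tuples assigned by some placement step.

```python
class SketchWorker:
    """Hypothetical stand-in for a Worker: a name, a tile, and its outgoing edges."""

    def __init__(self, name):
        self.name = name
        self.tile = None  # filled in by placement (assumed here to be a tuple)
        self._outgoing_cascades = []


class SketchCascadeFlow:
    """Hypothetical stand-in for a cascade edge that registers on its source."""

    def __init__(self, src, dst):
        self.src = src
        self.dst = dst
        # Register on the source worker only; there is no global registry.
        src._outgoing_cascades.append(self)

    def resolve(self):
        """Return the (source tile, destination tile) pair once both are placed."""
        if self.src.tile is None or self.dst.tile is None:
            raise ValueError("both workers must be placed before resolving")
        return (self.src.tile, self.dst.tile)


def resolve_all(workers):
    """Walk the workers and resolve every outgoing cascade edge."""
    return [flow.resolve() for w in workers for flow in w._outgoing_cascades]
```

Since each edge lives on exactly one worker's list, walking the workers visits every edge once, which is why no separate collection or cleanup pass is needed in this sketch.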