CS294-48 10/22/2009 Scribe Notes

Continuation of pattern mining exercise:

Sarah Bird - DMA Controller
---------------------------
UTL breakdown: input commands -> (4 memory channels) -> (network) -> (timing control)
Timing and control broken down into control state machine + priority decision logic.
Memory channels broken down into command+control/memory channels/temporary storage.
A useful addition to the graphical notation would be a way to express multiple parallel and independent units. Shadowing is a good way to do this, and it also simplifies the notation of connections to multiple blocks.
Suggested improved breakdown: four independent memory channels with an arbiter to choose among them.
Summary of patterns: time multiplexing, multiple concurrent channels multiplexed onto one piece of hardware, possible coarse-grained threading.

Scott Beamer - Fulcrum 24 port 10 gig L2-L4 network switch chip
---------------------------------------------------------------
Only switches on L2 (it is a switch, not a router), but understands policies for L3/L4.
High level functional diagram breakdown: RX port logic x 24, management, frame control, scheduler, 24 x TX port logic, switch element data path (rapidarray memory)
UTL breakdown: 24 x receive ports -> (frame control, management, scheduler) -> 24 x transmit ports
Rapidarray memory takes up most of the chip and is connected to scheduler & rx/tx ports
Frame control decides which packet goes where; scheduler manages crossbar & port usage
Further breakdown of frame control: 24 ports -> Merge FIFO -> TCAMs for policy -> ARP table -> L2 vlan/stp table -> MAC table -> stacking tag table -> LAG QoS Triggers (drop or direct to proper output)
Summary of patterns: implementing a network using memory - all-to-all connection with buffer memory in the middle. Scheduler manages movement of data through memory & crossbars. Pipe and filter stream for processing stages.
Merging of frame headers from multiple discrete input ports into a single (or multiple) FIFO+pipeline for processing/scheduling.

Yunsup - Broadcom switching chip
--------------------------------
25 x ingress pipelines (input ports)
"All packets travel as cells along a very wide central bus"
Routing and switching is done on each ingress pipeline
L2/L3 determines destination ports
L4-L7 policy for when to drop packets, priority assignment, cell slicing
MMU details: internal bus connected to packet queues, SRAM packet buffer, SDRAM interface (connected to 2-64 MB of SDRAM) for overflow
Common pattern: takes time to figure out dependencies for scheduling, so you must buffer your payload while this takes place.
UTL decomposition: MAC addresses in and out, interface to PCI bus & SDRAM
PCI controller, ingress pipelines, MMU, egress pipelines all connected to a central bus
Ingress pipeline: inputs and outputs to/from data buffers. L2-3/L4-7 logic and control.
MMU, Egress pipelines, PCI interface, Bus ?
Bus: Where does the arbitration happen? Traditionally broken down into bus masters/bus slaves/bus controller (which decides who gets access to a port on a given cycle).
Summary of patterns: very similar to Scott's - difference is in switch implementation: bus vs. crossbar-memory-crossbar. Temporal vs. spatial multiplexing, fewer wider ports vs. more skinny ports.
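
Scribe addition (illustrative only): the bus controller above "decides who gets access to a port on a given cycle". The notes do not say which arbitration policy the Broadcom chip uses, so the sketch below assumes simple round-robin arbitration among bus masters. The names RoundRobinArbiter, grant, and simulate are hypothetical and do not come from any vendor documentation.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Toy bus controller: grants the shared bus to one master per cycle.

    Assumption: round-robin priority. The real chip's policy is not
    described in the notes.
    """

    def __init__(self, num_masters: int) -> None:
        if num_masters <= 0:
            raise ValueError("need at least one bus master")
        self.num_masters = num_masters
        # Start so that master 0 is the first one checked.
        self.last_grant = num_masters - 1

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the index of the master granted this cycle, or None if idle."""
        if len(requests) != self.num_masters:
            raise ValueError("one request flag per master is required")
        # Search starting just after the most recent winner, wrapping around.
        for offset in range(1, self.num_masters + 1):
            candidate = (self.last_grant + offset) % self.num_masters
            if requests[candidate]:
                self.last_grant = candidate
                return candidate
        return None


def simulate(requests_per_cycle: List[List[bool]]) -> List[Optional[int]]:
    """Run the arbiter over a list of per-cycle request vectors."""
    arbiter = RoundRobinArbiter(len(requests_per_cycle[0]))
    return [arbiter.grant(reqs) for reqs in requests_per_cycle]
```

Each call to grant models one bus cycle, which is the temporal multiplexing noted in the summary: many masters share one wide bus by taking turns, rather than each owning a narrow dedicated port.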
Chris Celio - student chip for undergrad electronics class, 3D rendering on an FPGA
-----------------------------------------------------------------------------------
Memory (triangle source) -> renderer -> display logic -> monitor
FSM (pipeline controller) interfaces to memory, renderer, display logic
FSM (game control logic) connects to renderer
Structural patterns: DLP, pipelining
Computational patterns: dense linear algebra, FSMs
Renderer breakdown: triangle transforms -> FSM (shader)
Transform can be broken down into multiple pipeline stages: triangle in -> transform -> triangle out
Screen & Z-buffer breakdown: renderer connected via network to 2 memories which are also connected to a VGA controller. One gets written to while the other one is being read (double buffering).

Things to think about: when is backpressure obvious or implied, and when should the controller be shown explicitly?

Progression of diagrams:
Highest level: UTL, where arrows indicate communication without specifying how it is implemented.
Lowest level: RTL implementation.
A diamond inside a unit implies synchronous control for a UTL leaf node.
----------------------------------------------------------------------
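
Scribe addition (illustrative only): a minimal software sketch of the double buffering described in the screen & Z-buffer breakdown, where the renderer writes one memory while the VGA controller reads the other. The class name DoubleBuffer, its methods, and the choice to clear the new back buffer on every swap are assumptions made for illustration; the notes do not describe how the student design handles the swap.

```python
from typing import List


class DoubleBuffer:
    """Two equal-sized memories: one is displayed while the other is drawn."""

    def __init__(self, size: int, clear_value: int = 0) -> None:
        self.clear_value = clear_value
        self._buffers = [[clear_value] * size, [clear_value] * size]
        self._front = 0  # index of the memory currently read for display

    def write(self, index: int, value: int) -> None:
        """Renderer side: writes always go to the back (hidden) memory."""
        self._buffers[1 - self._front][index] = value

    def read_front(self) -> List[int]:
        """Display side: returns a copy of the memory being scanned out."""
        return list(self._buffers[self._front])

    def swap(self) -> None:
        """Exchange roles, e.g. once per frame when rendering is finished.

        Assumption: the new back memory is cleared before the next frame.
        """
        self._front = 1 - self._front
        back = self._buffers[1 - self._front]
        for i in range(len(back)):
            back[i] = self.clear_value
```

The display side never observes a partially drawn frame, because writes only become visible after swap is called.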