



# Salsa – lpGBT interface : Feasibility and constraints

Irakli Mandjavidze

Irfu, CEA Saclay Gif-sur-Yvette, 91191 France

ePIC eDAQ WG 17/Oct/2024







- Interface with lpGBT
  - $\rightarrow$ Clocks
  - $\rightarrow$  Synchronous commands
  - $\rightarrow$  Data
- Backend FPGA (ROD, DAM) interface
- System
  - $\rightarrow$  MPGD readout options
- Backup
  - $\rightarrow$  Reminder : Salsa's direct interface with VTRX+





### lpGBT Interface

### Support of "traditional" prevalent interface





- A large number of heterogeneous external interface signals, specific to each function
  - $\rightarrow$ Clock\_diff\_in, SynCmd\_diff\_in
    - Synchronous command decoding options are in the backup
  - $\rightarrow$ SCL\_in, SDA\_io
    - ASICs on an FE board are configured in series : longer startup and recovery times
  - $\rightarrow$ Up to 4 Data\_diff\_out serial links
  - $\rightarrow$ Additional IOs such as Trigger\_diff\_in, TrigPrim\_diff\_out
- Best suited to an FEB with on-board companion intelligence
  - $\rightarrow$ for control & aggregation







#### • 256-channel FEB

- $\rightarrow$ Salsa receives the recovered clock and sync data from an lpGBT eLink group
- $\rightarrow$ Salsa sends physics, calibration and monitoring data over a number of lpGBT lines of the eLink group
- $\rightarrow$ Salsas are configured over a daisy-chained I2C interface from the lpGBT
- $\rightarrow$ The lpGBT provides a bidirectional interface between 4 Salsas and the remote FPGA on the RDO
- $\rightarrow$ VTRX+ is used with only one TX line
- $\rightarrow$ All ASICs are radiation hard

#### • 1024-channel RDO : common hardware with adaptation based on FireFly transceivers from Samtec

- A single 4-lane bidirectional FireFly is enough to serve 4 FEBs
- $\rightarrow$ Placed anywhere in a user-friendly area
  - No particular restrictions on power consumption, cooling infrastructure, radiation, magnetic field

## FEB based on Salsa's heterogeneous interface with lpGBT












### • Salsa can be configured to operate in the CERN clock environment

- $\rightarrow$ Via external pins
- $\rightarrow$ The Prisme PLL IP accepts an 80 MHz clock
- $\rightarrow$ Possible eLink input clock frequencies and the divider needed to reach 80 MHz :
  - 80 MHz : divider 1
  - 160 MHz : divider 2
  - 320 MHz : divider 4



 $\rightarrow$ Whatever the eLink clock frequency, the EIC clock is in phase with the eLink clock at every 5<sup>th</sup> EIC tick, i.e. every 50 ns

| Clock       | Frequency | Period   | Ticks per 50 ns |
|-------------|-----------|----------|-----------------|
| EIC clock   | 100 MHz   | 10 ns    | 5               |
| eLink clock | 40 MHz    | 25 ns    | 2               |
| eLink clock | 80 MHz    | 12.5 ns  | 4               |
| eLink clock | 160 MHz   | 6.25 ns  | 8               |
| eLink clock | 320 MHz   | 3.125 ns | 16              |

- 320 MHz eClock is the preferred option
  - $\rightarrow$  See the following
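The phase coincidence between the 100 MHz EIC clock and each candidate eLink clock can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names are ours, and it assumes ideal clocks that are aligned at t = 0.

```python
from fractions import Fraction
from math import gcd

EIC_MHZ = 100  # EIC clock frequency, from the slide


def coincidence_period_ns(f1_mhz: int, f2_mhz: int) -> Fraction:
    """Time between simultaneous edges of two ideal clocks aligned at t = 0.

    Both clocks tick together at the rate gcd(f1, f2), so the coincidence
    period is 1 / gcd(f1, f2), expressed here in ns.
    """
    return Fraction(1000, gcd(f1_mhz, f2_mhz))


def ticks_per_coincidence(f_mhz: int, other_mhz: int) -> Fraction:
    """Number of periods of the f_mhz clock between two coincidences."""
    return coincidence_period_ns(f_mhz, other_mhz) * f_mhz / 1000


for elink_mhz in (40, 80, 160, 320):
    period = coincidence_period_ns(EIC_MHZ, elink_mhz)
    print(f"eLink {elink_mhz} MHz: in phase every {period} ns, "
          f"{ticks_per_coincidence(EIC_MHZ, elink_mhz)} EIC ticks, "
          f"{ticks_per_coincidence(elink_mhz, EIC_MHZ)} eLink ticks")
```

For all four eLink frequencies the greatest common divisor with 100 MHz is 20 MHz, which is why the coincidence period is the same 50 ns in every case.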


### Synchronous commands



- Some (if not all) ePIC-wide commands should be in phase with the EIC clock
  - e.g. RevTick (BC0), Sync...
  - $\rightarrow$ They should be distributed on every 5<sup>th</sup> EIC clock tick, the one that is in phase with the eLink clock
    - Meaning there are always at least 50 ns between 2 EIC machine-synchronous commands
      - Up to 240 commands per EIC revolution should be enough



- Option of a 320 MHz eLink clock and a 320 Mbit/s downlink
  - $\rightarrow$ Allows decoding a large number of synchronous commands from the 8-bit wide received word
    - And recovering the base 40 MHz clock



Note : an additional sub-system-specific command can be sent within the same period of 5 EIC clocks
 $\rightarrow$ e.g. Tx link realign, fire internal pulser, calibrate, …
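The numbers behind the 320 Mbit/s option can be worked out in a few lines. This is a sketch of the arithmetic only; reading the two words per 50 ns as "one machine-synchronous command plus one sub-system command" is our interpretation of the note above, and the function name is hypothetical.

```python
from fractions import Fraction

DOWNLINK_MBPS = 320    # downlink bit rate, from the slide
BASE_CLOCK_MHZ = 40    # base clock recovered from the word boundary
ALIGN_PERIOD_NS = 50   # 5 EIC clock ticks of 10 ns each


def downlink_word_budget(bit_rate_mbps: int, word_rate_mhz: int,
                         align_period_ns: int) -> dict:
    """Word width, code space and words per alignment period."""
    bits_per_word = Fraction(bit_rate_mbps, word_rate_mhz)
    word_period_ns = Fraction(1000, word_rate_mhz)
    return {
        "bits_per_word": bits_per_word,
        "code_points_per_word": 2 ** int(bits_per_word),
        "words_per_align_period": Fraction(align_period_ns) / word_period_ns,
    }


budget = downlink_word_budget(DOWNLINK_MBPS, BASE_CLOCK_MHZ, ALIGN_PERIOD_NS)
print(budget)
```

With these inputs each 25 ns word is 8 bits wide, and two such words fit in one 50 ns alignment period.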



### Sampling



- With the lpGBT recovered clock, the reference clock frequency of Salsa's Prisme PLL is 80 MHz
  - $\rightarrow$ Divider 4 for the 320 MHz eLink clock
  - $\rightarrow$ The PLL produces a 1.6 GHz clock from the 80 MHz reference
  - $\rightarrow$ The clock synthesizer operates at 1.6 GHz / 2 = 800 MHz



- The sampling clock is derived by dividing the 800 MHz synthesizer clock
  - $\rightarrow$ 50 MHz : divider 16
    - Reminder : this frequency is the baseline when working with the 100 MHz distributed system clock
- Despite the 320 MHz distributed clock, it looks like sampling can still be done at 50 MHz

 $\rightarrow$ And samples can be tagged with timestamps derived from EIC revolution ticks, i.e. in "natural" EIC time units

Timing diagram : EIC, eLink and sampling clock periods

| Clock          | Frequency | Period   |
|----------------|-----------|----------|
| EIC clock      | 100 MHz   | 10 ns    |
| eLink clock    | 320 MHz   | 3.125 ns |
| Sampling clock | 50 MHz    | 20 ns    |

irakli.mandjavidze@cea.fr, Salsa - lpGBT interface, 17/Oct/2024
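The clock chain from the eLink clock to the sampling clock can be followed step by step. The sketch below uses only the dividers and frequencies quoted on this slide; the function and parameter names are ours, not Salsa register names.

```python
from fractions import Fraction

PLL_REFERENCE_MHZ = 80  # reference accepted by the Prisme PLL, from the slide


def sampling_clock_mhz(elink_mhz: int, input_divider: int,
                       pll_out_mhz: int = 1600, synth_prescale: int = 2,
                       sampling_divider: int = 16) -> Fraction:
    """Follow eLink clock -> PLL reference -> synthesizer -> sampling clock."""
    reference = Fraction(elink_mhz, input_divider)
    if reference != PLL_REFERENCE_MHZ:
        raise ValueError(f"PLL reference would be {reference} MHz, not 80 MHz")
    synthesizer = Fraction(pll_out_mhz, synth_prescale)  # 1.6 GHz / 2 = 800 MHz
    return synthesizer / sampling_divider                # 800 MHz / 16 = 50 MHz


for elink_mhz, divider in ((80, 1), (160, 2), (320, 4)):
    f_smp = sampling_clock_mhz(elink_mhz, divider)
    print(f"eLink {elink_mhz} MHz / {divider}: sampling at {f_smp} MHz, "
          f"period {Fraction(1000) / f_smp} ns")
```

The resulting 20 ns sampling period is exactly two 10 ns EIC clock ticks, which is what allows the samples to be timestamped in EIC time units.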



#### • Periodic "Sync" command generated when EIC and 40 MHz clocks are in phase

- $\rightarrow$  RevTick in this example
- $\rightarrow\,$  Allows phase alignment and monitoring of clocks within Salsa

#### • In this example Salsa is programmed to keep one sample before and after threshold crossing

 $\rightarrow$  Salsa data for the signal : Tstp(O-1) Smp(O-1) Smp(Origin) Smp(O+1) Smp(O+2) Smp(O+3)
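The selection of samples around a threshold crossing can be modelled in software as follows. This is a minimal sketch, not Salsa's actual zero-suppression logic: the function name, the use of the sample index as timestamp, and the handling of pulses at the edges of the buffer are all assumptions made for illustration.

```python
def extract_pulses(samples, threshold, pre=1, post=1):
    """Return (timestamp, kept_samples) for each run of samples above threshold.

    `pre` samples before the crossing and `post` samples after the signal
    falls back below threshold are kept. The timestamp is the index of the
    first kept sample, i.e. Tstp(O-1) for pre = 1.
    """
    pulses = []
    i, n = 0, len(samples)
    while i < n:
        if samples[i] > threshold:
            start = i                      # Origin: first sample above threshold
            while i < n and samples[i] > threshold:
                i += 1
            first = max(0, start - pre)    # clip at the start of the buffer
            last = min(n, i + post)        # clip at the end of the buffer
            pulses.append((first, samples[first:last]))
        else:
            i += 1
    return pulses


waveform = [0, 1, 0, 5, 9, 6, 1, 0, 0]
print(extract_pulses(waveform, threshold=3))
```

For this waveform the Origin is sample 3, and the kept window runs from O-1 to O+3, matching the data layout shown above. Pulses closer together than `pre + post` samples would share samples in this simple model.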



### Data Tx



- Salsa output bandwidth for ePIC : < 400 Mbit/s
  - $\rightarrow$ The direct VTRX+ interface provides a single 1 Gbit/s link for data communications
- lpGBT : operates with a FEC5 or FEC12 10 Gbit/s uplink
  - $\rightarrow$ More than adequate for the mild radiation environment of EIC
  - $\rightarrow$ FEB-RDO round-trip monitoring capability
- Salsa has 4 serial links with ~1 Gbit/s bandwidth
- 3 options to consider for connectivity with the lpGBT
  - $\rightarrow$ Four 320 Mbit/s links per ASIC
    - 16 links for all 4 ASICs out of 24 available
  - $\rightarrow$ Two 640 Mbit/s links per ASIC
    - 8 links for all 4 ASICs out of 12 available
  - $\rightarrow$ Single 1280 Mbit/s link per ASIC
    - 4 links for all 4 ASICs out of 6 available
    - Assuming this throughput is reliably supported by Salsa
- Current preferred option : single 1280 Mbit/s link
  - $\rightarrow$ Feasibility is being assessed
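The three connectivity options can be compared with a small link-budget calculation. The available eLink counts are the ones quoted above; the 1280 Mbit/s per-ASIC total is our reading of the fact that every option adds up to the same figure, and the variable names are illustrative.

```python
N_ASICS = 4              # Salsas per FEB, from the slide
PER_ASIC_MBPS = 1280     # 4 x 320 = 2 x 640 = 1 x 1280 Mbit/s

# (eLink rate in Mbit/s, uplink eLinks available at that rate), from the slide
OPTIONS = [(320, 24), (640, 12), (1280, 6)]


def link_budget(rate_mbps: int, available: int) -> dict:
    """Links used by all ASICs and links left over for one eLink rate."""
    links_per_asic = PER_ASIC_MBPS // rate_mbps
    used = links_per_asic * N_ASICS
    return {
        "rate_mbps": rate_mbps,
        "links_per_asic": links_per_asic,
        "links_used": used,
        "links_spare": available - used,
        "feb_throughput_mbps": used * rate_mbps,
    }


budgets = [link_budget(rate, available) for rate, available in OPTIONS]
for b in budgets:
    print(b)
```

All three options occupy two thirds of the available eLinks and deliver the same aggregate throughput; they differ only in the number of lines to route on the FEB.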







### RDO (DAM) Interface


### lpGBT – backend FPGA communication





- lpGBT-FPGA IP from CERN on the backend FPGA to communicate with the lpGBT on the FEB
  - $\rightarrow$ Four 8-bit @ 40 MHz synchronous commands transparently delivered to 4 Salsas over 4 eLink groups
    - This is one possibility that may allow parallel communication with the Salsas
      - Other possibilities (e.g. port mirroring) are not discussed
  - $\rightarrow$ Data from Salsa are 32-bit wide words => 128 bits for 4 Salsas out of 202 bits
  - $\rightarrow$ I2C configuration of Salsas over the lpGBT IC/EC channel

### Downlink from RDO / DAM FPGA to lpGBT and Salsa





- As for direct VTRX+ interface
- $\rightarrow$ If clever enough, one might use 1 bit per downlink word to carry asynchronous slow control commands : a 40 Mb/s channel
  - As for direct VTRX+ interface


### Uplink from Salsa and lpGBT to RDO / DAM FPGA



#### • Salsa

- $\rightarrow$ Produces 32-bit words @ 40 MHz
  - IDLE word when there is no user data to send
- $\rightarrow$ Data is serialized @ 1.28 Gb/s

#### • lpGBT

- 4 out of 6 1.28 Gbit/s links are used
- $\rightarrow$ Recovers 32-bit words from Salsas
- $\rightarrow$ Forms a 256-bit frame including IC/EC and other data
- $\rightarrow$ Sends the frames over the 10 Gbit/s link


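#### lpGBT-FPGA IP 10 Gbit/s uplink interface with FEC12

Figure : Salsa1 to Salsa4 feed the lpGBT; the uplink frame carries six 32-bit link fields, two of them marked unused, plus a round-trip label.

| Field      | Width  | Content           |
|------------|--------|-------------------|
| HDR(1:0)   | 2-bit  | Header            |
| IC(1:0)    | 2-bit  | IC data           |
| EC(1:0)    | 2-bit  | EC data           |
| L(201:192) | 10-bit | User data         |
| D(191:160) | 32-bit | User data, Link 5 |
| D(159:128) | 32-bit | User data, Link 4 |
| D(127:96)  | 32-bit | User data, Link 3 |
| D(95:64)   | 32-bit | User data, Link 2 |
| D(63:32)   | 32-bit | User data, Link 1 |
| D(31:0)    | 32-bit | User data, Link 0 |
| FEC(47:0)  | 48-bit | FEC               |

 $\rightarrow$ Output to FPGA fabric : 202-bit user data, 2-bit EC data, 2-bit IC data

 $\rightarrow$ 32-bit words from Salsas are recovered in the same order as they were sent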
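On the FPGA side, recovering the Salsa words amounts to slicing the 202-bit user data field. The sketch below models that slicing with Python integers. The field widths come from the frame layout of this slide; the assignment of Salsa n to Link n-1, with Links 4 and 5 unused, is an assumption made for illustration, as are the function names.

```python
LINK_BITS = 32
N_LINKS = 6
L_BITS = 10
USER_BITS = N_LINKS * LINK_BITS + L_BITS   # 192 + 10 = 202
WORD_MASK = (1 << LINK_BITS) - 1
L_MASK = (1 << L_BITS) - 1


def pack_user_data(link_words, l_field=0):
    """Build the 202-bit user data value: L(201:192) above D(191:0)."""
    if len(link_words) != N_LINKS:
        raise ValueError("expected one 32-bit word per link")
    value = (l_field & L_MASK) << (N_LINKS * LINK_BITS)
    for link, word in enumerate(link_words):
        value |= (word & WORD_MASK) << (link * LINK_BITS)
    return value


def unpack_user_data(value):
    """Split the user data value back into six link words and the L field."""
    words = [(value >> (link * LINK_BITS)) & WORD_MASK for link in range(N_LINKS)]
    l_field = (value >> (N_LINKS * LINK_BITS)) & L_MASK
    return words, l_field


# Assumed mapping: Salsa1..Salsa4 on Links 0..3, Links 4 and 5 unused (zero).
salsa_words = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
frame = pack_user_data(salsa_words + [0, 0], l_field=0x3FF)
words, l_field = unpack_user_data(frame)
print([hex(w) for w in words[:4]], hex(l_field))
```

Since each Salsa produces one 32-bit word per 40 MHz period and the frame is rebuilt at the same rate, the words come out in the order they were sent.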



### Configuration



- Use the lpGBT IC serial channel to produce I2C transactions with Salsas
  - $\rightarrow$ The IC (Internal Control) channel provides an 80 Mb/s line to program lpGBT registers
  - $\rightarrow$ A set of registers is dedicated to driving the I2C master port(s) of the lpGBT
  - $\rightarrow$ Various flavors of I2C read and write operations are supported
    - Single-byte reads or writes, multi-byte reads or writes, RdModWr, etc.



- FPGA logic should provide an implementation of the lpGBT IC protocol
  - $\rightarrow$ Frame-based read-write and read-only commands
    - Built up 2 bits at a time @ 40 MHz
- A somewhat long sequence of steps per unitary I2C transaction
  - $\rightarrow$ Configuration of sub-systems may take considerable time
- Same mechanism to control GPIO pins and environmental analog circuitry (DAC, ADC)
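A rough feel for the configuration time can be obtained from the IC channel rate alone. The sketch below computes only the wire time of IC frames at 2 bits per 40 MHz tick. The number of frames per I2C transaction and the frame length are hypothetical parameters, not lpGBT figures, and framing overhead as well as the time spent on the I2C bus itself are ignored.

```python
import math

IC_BITS_PER_TICK = 2   # IC channel carries 2 bits per 40 MHz tick (80 Mb/s)
TICK_NS = 25.0         # 40 MHz period


def ic_wire_time_us(n_frames: int, bytes_per_frame: int) -> float:
    """Lower bound on the time to shift n_frames over the IC channel."""
    bits = n_frames * bytes_per_frame * 8
    ticks = math.ceil(bits / IC_BITS_PER_TICK)
    return ticks * TICK_NS / 1000.0


# Hypothetical example: 5 IC frames of 10 bytes for one I2C transaction,
# repeated for 1000 register writes.
per_transaction_us = ic_wire_time_us(n_frames=5, bytes_per_frame=10)
print(per_transaction_us, "us per transaction,",
      1000 * per_transaction_us / 1000.0, "ms for 1000 transactions")
```

Because the Salsas share the lpGBT I2C master path, this time scales with the number of ASICs as well as with the number of registers.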



### ePIC backend FPGA responsibility



- Derive from the ePIC protocol a downstream link per Salsa
  - lpGBT option : 320 MHz clock and 320 Mbit/s downlink
  - Direct VTRX+ option : unified 1 Gbit/s link embedding the 100 MHz clock and synchronous commands
  - $\rightarrow$ Salsa will implement ~30 synchronous commands
    - The overlap with ePIC fast synchronous commands must be identified
    - The ePIC protocol must allow deriving the synchronous commands that will drive Salsa's state machine
      - e.g. start / stop data taking, Resync, RevTick (BC0), Calib, ResyncTxLink, etc.
- Receive and aggregate uplink data from Salsas
  - lpGBT option : 1.28 Gb/s uplink (link rate to be confirmed)
  - Direct VTRX+ option : 1 Gbit/s uplink
  - $\rightarrow$ Salsa will mix physics / calibration data with slow control / monitoring data
    - Packets of different kinds must be separated for on-line processing of physics data
    - Usually the firmware performs some integrity checks on the received data (*e.g.* synchronization)
    - Error recovery must be present
- Configuration and monitoring
  - lpGBT option : sequential configuration of Salsas via I2C transactions over the IC channel
  - Direct VTRX+ option : parallel configuration of Salsas via the 100 Mb/s config channel of the unified interface



### Summary



- With today's knowledge and the progress of Salsa development, an lpGBT-based FEB seems feasible
  - $\rightarrow$ 320 MHz / 320 Mb/s downlink
  - $\rightarrow$ 1.28 Gbit/s uplink to be confirmed
  - $\rightarrow$ I2C configuration
  - $\rightarrow$ 50 MHz sampling possible
- FEB on-board control and monitoring over lpGBT digital GPIO and analog DAC, ADC
  - $\rightarrow$ Direct VTRX+ option : requires extra functionality from Salsa or extra on-board circuitry
- FEB and RDO-adaptation circuitry become 10 Gbit/s grade
  - $\rightarrow$ FEB becomes high-density grade as well
  - $\rightarrow$ Direct VTRX+ option : 1 Gbit/s grade
- Power comparison between the two options will come later
  - $\rightarrow$ lpGBT overhead expected to be low
- Implementation
  - $\rightarrow$ lpGBT option : proven lpGBT-FPGA IP provided by CERN for deployment in the backend FPGA
  - $\rightarrow$ Direct VTRX+ option : Salsa-FPGA IP will be provided by the Salsa team

## 160K-ch MPGD readout configurations and component count



#### • FEB with direct Salsa-VTRX+ interface



#### • FEB with lpGBT-VTRX+ interface



#### • FEB with direct DAM interface







### Backup : Reminder of Salsa's unified backend interface



### "Unified" interface





- Single encoded RX line for Clock, SynCmd, Trigger, configuration and monitoring
  - $\rightarrow$  Minimal external interface: a single diff RX line + at least one diff TX line
    - Simplest case: only 4 pins (Rx\_p / Rx\_n + Tx\_p / Tx\_n) to communicate with the chip
    - Parallel configuration of ASICs possible : fast startup and recovery time
- Relatively complex initialization phase requiring collaboration from the remote partner (FPGA)
  - $\rightarrow$  Clock recovery phase followed by
  - $\rightarrow$  Data reception and transmission phase



Derived RX serial clock

- Periodic 0-to-1 transition for system clock recovery and jitter cleaning in the Prisme PLL IP
- Up to 32 synchronous commands at every system clock cycle
- Asynchronous slow control bit per system clock cycle
  - $\rightarrow$  100 Mb/s slow control link
- Note : the serialization-deserialization mechanism involves Reed Solomon FEC
  - $\rightarrow$  Capability to recover from up to 13 erroneous bits
    - Precious for robust downstream synchronization



### Data from Salsa over the serial TX link




• Fixed 40-bit length transmission units

- $\rightarrow$  32-bit user payload + 8-bit FEC
  - RS(15,13) Reed-Solomon is considered : it can correct one 4-bit symbol, i.e. up to 4 bit errors within that symbol
- Salsa packets transmitted word by word
  - $\rightarrow$  20% overhead due to FEC
- Use 40-bit parallel to serial converter
  - $\rightarrow$  Use single 40-bit RS(15,13) encoder
  - $\rightarrow$  Interleave four 40-bit encoded words
    - RS(15,13) uses 4-bit symbols
    - Interleave 4-bit by 4-bit
    - Recover from up to 13 consecutive bit errors
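The figure of 13 consecutive bits follows from the interleaving geometry, and can be checked without implementing the Reed-Solomon code itself. The sketch below interleaves four 40-bit units symbol by symbol and counts, for every burst position inside one 160-bit block, how many symbols of any single unit are hit. The function names are ours; the only property of the code that is used is that it corrects one 4-bit symbol per unit.

```python
SYMBOL_BITS = 4          # RS(15,13) symbol size, from the slide
SYMBOLS_PER_UNIT = 10    # 40-bit transmission unit = 10 symbols
DEPTH = 4                # four encoded units are interleaved
BLOCK_BITS = SYMBOL_BITS * SYMBOLS_PER_UNIT * DEPTH   # 160 bits on the wire


def interleave(units):
    """Emit symbol 0 of each unit, then symbol 1 of each unit, and so on."""
    return [units[u][s] for s in range(SYMBOLS_PER_UNIT) for u in range(DEPTH)]


def deinterleave(stream):
    """Inverse of interleave(): rebuild the four units from the wire order."""
    return [[stream[s * DEPTH + u] for s in range(SYMBOLS_PER_UNIT)]
            for u in range(DEPTH)]


def worst_symbols_hit(burst_bits):
    """Largest number of symbols of one unit touched by any burst position."""
    worst = 0
    for start in range(BLOCK_BITS - burst_bits + 1):
        hit = [set() for _ in range(DEPTH)]
        for bit in range(start, start + burst_bits):
            position = bit // SYMBOL_BITS            # symbol index on the wire
            hit[position % DEPTH].add(position // DEPTH)
        worst = max(worst, max(len(h) for h in hit))
    return worst


for burst in (12, 13, 14):
    print(f"{burst}-bit burst: at most {worst_symbols_hit(burst)} symbol(s) per unit")
```

A 13-bit burst spans at most four consecutive wire symbols, one per unit, so each unit sees a single correctable symbol error. A 14-bit burst can reach a fifth symbol and hit the same unit twice.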

#### Data packet sketch for illustration only

| 32-bit user data                   | 8-bit FEC |
|------------------------------------|-----------|
| Type: Sample data, Chip ID, length | FEC       |
| Cl                                 | FEC       |
| Flags, Sample 2, Sample 1          | FEC       |
| …                                  | FEC       |
| Flags, Sample N, Sample N-1        | FEC       |


#### • FEB

- $\rightarrow$ Salsa receives embedded clock / sync / async data over the unified RX interface
  - VTRX+ RX serial link to Salsas via a rad-hard fan-out ASIC
- $\rightarrow$ Salsa sends physics, monitoring and slow control data over a single TX line
  - One VTRX+ TX serial link per Salsa
- $\rightarrow$ All ASICs are radiation hard



**CERN VTRX+** 

- RDO : common hardware with adaptation based on COTS FireFly transceivers from Samtec
  - Four 4-lane bidirectional FireFly components are needed to serve 4 FEBs
  - $\rightarrow$  Placed anywhere in user friendly area
    - No particular restrictions on power consumption, cooling infrastructure, radiation, magnetic field