



#### Digitizer ASICs for LAPPD Readout

October 25, 2022 Luca Macchiarulo, Ph.D. Senior IC Design Architect - Nalu Scientific LLC

Work partially funded by US DOE SBIR Grants:

DE-SC0015231, DE-SC0017833, DE-SC0020457

NALU SCIENTIFIC - Approved for public release. Copyright © 2022 Nalu Scientific LLC. All rights reserved.



### WAVEFORM DIGITIZER SoCs FOR PRECISE TIME OF FLIGHT ESTIMATION





#### 1. Front-end Chips:

- Event based digitizer+DSP
- 4-32 channel scope on chip
- 1-15 GSa/s, 12-bit resolution
- Low SWaP-C
- User friendly: FW/SW tools

#### 2. Integration:

- SiPM
- PMT
- LAPPD
- Detector arrays



#### 3a. Main application:

- NP/HEP experiments
- Astro particle physics

#### **3b. Other applications:**

- Beam Diagnostics
- Plasma/fusion diagnostics
- Lidar
- PET imaging









## ABOUT NALU SCIENTIFIC

#### Fast Growing Startup in Honolulu, Hawai'i

- Located at the Manoa Innovation Center near U. of Hawaii
- 20 staff members, diverse background
- Access to advanced design tools
- Rapid prototyping and testing lab

#### **Technical Expertise**

- IC design: Analog + digital System-on-Chip (SoC)
- Hardware design: Complex multi-layer PCBs
- Firmware design: FPGAs, CPUs
- Software design: GUI, analysis, documentation

#### Scientific Expertise - NP/HEP subject matter experts

- Physicists (3x)
- Electronics for large scientific instruments

#### **Exclusive Distributor Agreement for North America**

- Sales of ASICs, eval boards
- Enhanced OEM opportunities



#### Nalu = 'wave' in native Hawaiian language




### **Current SoC-ASIC Projects**

| Project  | Sampling<br>Frequency<br>(GHz) | Input<br>BW<br>(GHz) | Buffer<br>Length<br>(Samples) | Number of<br>Channels | Timing<br>Resolution<br>(ps) | Available<br>Date |
|----------|--------------------------------|----------------------|-------------------------------|-----------------------|------------------------------|-------------------|
| ASoC     | 3-5                            | 0.8                  | 16k                           | 4                     | 35                           | Rev 3 avail       |
| HDSoC    | 1-3                            | 0.6                  | 2k                            | 64                    | 80-120                       | Rev 1 avail       |
| AARDVARC | 8-14                           | 2.5                  | 32k                           | 4                     | 10                           | Rev 3 avail       |
| AODS     | 1-2                            | 1                    | 8k                            | 1-4                   | 100-200                      | Rev 2 avail       |
| UDC      | 8-10                           | 1.5-2                | 4k                            | 16/32                 | 10                           | Rev 1 avail       |

- **ASoC**: Analog to digital converter System-on-Chip
- **HDSoC**: SiPM specialized readout chip with bias and control
  - Baseline readout for EIC hpDIRC
- **AARDVARC**: Variable rate readout chip for fast timing and low deadtime
- **AODS**: Low density digitizer with High Dynamic Range (HDR) option

Work funded by DOE SBIRs. University of Hawaii as subcontractor.







## Outline

- Waveform digitization architecture principles, pros and cons
- Reference specs for EIC-LAPPD readouts
- Selected Nalu ASIC features:
  - UDC
  - AARDVARCv3
  - AARDVARCv4
- Board design examples:
  - HIPeR (LAPPD Gen I)
  - DSA C10-8+ (8 channel)
  - DSA E10-96 (96 channel)
- Conclusions

# Advantages and disadvantages of a digitizing architecture

The choice of a digitizing architecture, as opposed to the much simpler discriminator based approach (TDC/CFD/TOT, etc.), has the comparative advantages of:

- 1. Better estimation of overall charge (effectively a Riemann sum of the samples) without the complex analog filtering/shaping that causes pile-up or baseline-wander issues -> charge estimation is very important for sub-pixel spatial resolution
- 2. Use of full waveform information for timing extraction (noise on a single point less important).
- 3. Robustness to overall wander and gain variations
- 4. Comparative independence with respect to aging (not a big issue for LAPPDs)
- 5. Robust methods for pile-up management can be implemented

The additional design complexity is high (a digitizer is much more complex than a single discriminator, as is the digital processing required), but modern technology can handle the additional requirements (especially digital requirements) quite gracefully. It is vitally important to control power.
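The charge estimate in point 1 can be illustrated with a minimal sketch. It assumes a 50 ohm load, a baseline taken from the first few samples, and a function name (`estimate_charge_pc`) that is purely illustrative and not part of any Nalu tool.

```python
import numpy as np

def estimate_charge_pc(samples_mv, sample_rate_gsps, baseline_samples=8, load_ohm=50.0):
    """Riemann-sum charge estimate from a digitized pulse.

    Assumptions (not from the slides): the first `baseline_samples` samples
    contain no signal, and the input is terminated on `load_ohm`.
    Units: mV * ns / ohm = pC.
    """
    v = np.asarray(samples_mv, dtype=float)
    baseline = v[:baseline_samples].mean()      # pedestal estimate
    dt_ns = 1.0 / sample_rate_gsps              # sample spacing in ns
    return (v - baseline).sum() * dt_ns / load_ohm

# Toy pulse: 4 mV high, 2 ns long, on a 1 mV pedestal, sampled at 5 GSa/s
pulse = [1.0] * 8 + [5.0] * 10 + [1.0] * 8
q_pc = estimate_charge_pc(pulse, sample_rate_gsps=5.0)
print(f"charge = {q_pc:.3f} pC")
```

Because the sum runs over every sample of the pulse, uncorrelated noise on individual samples partly averages out, which is the point made in the list above.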

## **Reference Specifications for EIC-LAPPD readout**

A sketch of "optimal" features using back-of-the-envelope calculations (based on waveform sampling):

- <u>Pixel size</u>
  - pitch down to 3.2mm -> max footprint of 1 channel/10mm^2 -> tough but right order of magnitude (see below)
- Gain stage
  - Assuming 10^6-10^7 gain, a pulse of ~2 ns length on 50 ohm -> ~4-40 mV
  - Safe with an amplifier gain of ~5-10 (14-20 dB)
  - Bandwidth: to preserve a rise time of 1 ns for timing (see below), BW > 0.35/tr -> > 350 MHz -> doable in 130 nm and better nodes
- <u>Power</u>
  - DIRC 72 HRPPDs -> ~72,000 channels max
  - System power: ~2kW -> 30mW/channel -> in the ballpark of existing chips
- <u>Timing resolution and sampling rate</u>?
  - Assuming 1 ns rise time
  - Assuming min signal peak (after gain) of 100mV
  - Desired electronics timing better than LAPPD/HRPPD 50 ps by a factor of 2 -> 25ps
  - Measurement better than 2.5 mV (100 mV/ns x 0.025 ns)
    - In reality, waveform sampling (next page example) can do better than this single-point estimate, depending on the number of samples on the edge: 1 ns @ 5 GSa/s -> 5 samples, so a noise (including conversion) of ~5 mV RMS is reasonable (see next page)
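The back-of-the-envelope numbers above can be reproduced with a few lines of arithmetic. This is a sketch only: the rectangular single-photoelectron pulse model and all function names are assumptions made for illustration.

```python
import math

E_CHARGE_C = 1.602e-19  # elementary charge in coulombs

def pulse_amplitude_mv(gain, width_ns=2.0, load_ohm=50.0):
    """Amplitude of a rectangular single-photoelectron pulse (assumed shape)."""
    current_a = gain * E_CHARGE_C / (width_ns * 1e-9)
    return current_a * load_ohm * 1e3

def gain_db(voltage_gain):
    """Voltage gain expressed in dB."""
    return 20.0 * math.log10(voltage_gain)

def min_bandwidth_mhz(rise_ns):
    """Rule of thumb BW > 0.35 / rise time."""
    return 0.35 / rise_ns * 1e3

def power_per_channel_mw(system_power_w, n_channels):
    return system_power_w / n_channels * 1e3

def voltage_resolution_mv(amplitude_mv, rise_ns, timing_ps):
    """Single-point voltage error equivalent to a given timing error on a linear edge."""
    return amplitude_mv / rise_ns * timing_ps * 1e-3

print(f"amplitude at 1e6 gain : {pulse_amplitude_mv(1e6):.1f} mV")
print(f"amplitude at 1e7 gain : {pulse_amplitude_mv(1e7):.1f} mV")
print(f"amplifier gain 5-10   : {gain_db(5):.0f}-{gain_db(10):.0f} dB")
print(f"minimum bandwidth     : {min_bandwidth_mhz(1.0):.0f} MHz")
print(f"power per channel     : {power_per_channel_mw(2000, 72000):.0f} mW")
print(f"voltage resolution    : {voltage_resolution_mv(100, 1.0, 25):.1f} mV")
```

The 2 kW / 72,000 channel budget comes out slightly below 30 mW per channel, consistent with the rounded figure quoted above.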

# Numerical example of time extraction advantage

- Simple model of signal: ideal linear rising edge, with additive noise
  - Amplitude = 100mV
  - Noise RMS= 5.0mV
  - Rise time = 1ns
- "Default" estimation of jitter: Noise/(dV/dt) = 5 mV / (100 mV / 1000 ps) = 50 ps
- Using linear least squares on 4 points sampled on the rise time to estimate the slope and constant for the rising edge, and calculating the midpoint crossing: ~25 ps (statistics on 5000 cases)
- Naive model: does not take into account noise correlation at non-infinite BW, and assumes an ideal edge, ideal sampling times, and identical pulse positions, but it shows the potential of the approach. Most of these effects can be simulated and modelled via MC. They can also be estimated on acquired waveforms.
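A minimal Monte Carlo sketch of this naive model follows. It assumes four evenly spaced samples centered on the edge and uncorrelated Gaussian noise; the function name and the sample placement are illustrative choices, not the setup actually used for the slide.

```python
import numpy as np

def edge_timing_jitter_ps(n_trials=5000, amplitude_mv=100.0, noise_mv=5.0,
                          rise_ns=1.0, n_samples=4, seed=0):
    """Compare single-point jitter with a least-squares fit of the rising edge.

    Assumed model: ideal linear edge, ideal sample times evenly spaced and
    centered on the edge, uncorrelated Gaussian noise on every sample.
    """
    rng = np.random.default_rng(seed)
    slope = amplitude_mv / rise_ns                          # mV/ns
    t = (np.arange(n_samples) + 0.5) * rise_ns / n_samples  # sample times in ns
    noisy = slope * t + rng.normal(0.0, noise_mv, size=(n_trials, n_samples))

    # One straight-line fit per trial (polyfit accepts a 2-D y, one column per trial)
    fit_slope, fit_offset = np.polyfit(t, noisy.T, 1)

    # Time at which each fitted line crosses half the amplitude
    t_cross_ns = (0.5 * amplitude_mv - fit_offset) / fit_slope

    single_point_ps = 1e3 * noise_mv / slope                # Noise / (dV/dt)
    fit_ps = 1e3 * np.std(t_cross_ns)
    return single_point_ps, fit_ps

single_ps, fit_ps = edge_timing_jitter_ps()
print(f"single-point estimate: {single_ps:.0f} ps, 4-point fit: {fit_ps:.1f} ps")
```

With the crossing point at the centroid of the samples, the fit error scales roughly as the single-point value divided by the square root of the number of samples, which is where the factor of two for four points comes from.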





#### Waveform digitizer architecture - single channel






### **Readout ASICs specifications**

|                             | UDC                                     | AARDVARC v3/v4                         |
|-----------------------------|-----------------------------------------|----------------------------------------|
| Channels                    | 16                                      | 4/8                                    |
| Sampling Rate               | Dual:<br>• <1 Gsps<br>• 7-11 Gsps       | • 7-11 Gsps                            |
| ABW                         | 1-2 GHz (under evaluation)              | 1.3 -> 2.0 GHz                         |
| Samples/channel             | 4096                                    | 32768/16384                            |
| On-chip digital storage     | Yes                                     | No                                     |
| ADC bits ( <u>not</u> ENOB) | 10                                      | 12                                     |
| Die size                    | ~6.4mm x ~4.5mm = 28.8 mm <sup>2</sup>  | 5.88mm x 4.98mm = 29.3 mm <sup>2</sup> |
| Package                     | QFN64 (9x9)                             | QFN100 (12x12) (pinout - 64 possible)  |
| Output IF                   | Serial or parallel, polled or streaming | Serial or parallel                     |
| Trigger out                 | Yes                                     | Yes                                    |
| Self Triggering             | No                                      | Yes                                    |
| Input amp                   | No                                      | Yes - buffer @>2GHz                    |



### UDC Theory of operation and usage

Designed for single shot/slow rate experiments in full slave mode

- 1. Setup: Using RxIn, the analog and digital parameters are written
- 2. Acquisition: data is continuously sampled and stored in round robin fashion in 32 windows of 64 samples
- 3. Digitization: after reception of an "Acquire" command:
  - a. Sampling stops
  - b. Digitizing starts from the oldest samples
  - c. Digitized samples are copied in parallel into the SRAMs
  - d. Steps b and c are repeated for all 63 successive windows, in order
  - e. The end of the operation is signaled by raising the "Data Ready" flag
- 4. **Readout**: Whenever the user is ready, a "Read Data" can be issued:
  - a. The *TxSelect* will packetize and send out data from all 16 channels via the serial interface *Tx*, using the TxClk
- 5. If new data is desired, a new Acquire request can be issued

Note: the chip never initiates an operation (digitization/data out) without an external command
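The acquisition and digitization steps above can be modelled as a ring of windows that is read out oldest-first once sampling stops. The sketch below is a toy software model only: the class `RingSampler`, its methods, and the default window count are illustrative assumptions and do not describe the chip's registers or firmware interface.

```python
import numpy as np

class RingSampler:
    """Toy model: round-robin window storage, oldest-first digitization."""

    def __init__(self, n_windows=32, window_len=64):
        self.n_windows = n_windows
        self.window_len = window_len
        self.buf = np.zeros((n_windows, window_len))
        self.write_idx = 0       # next window to be overwritten (the oldest one)
        self.sampling = True

    def sample_window(self, samples):
        """Store one window of samples; ignored while sampling is stopped."""
        if not self.sampling:
            return
        self.buf[self.write_idx] = samples
        self.write_idx = (self.write_idx + 1) % self.n_windows

    def acquire(self):
        """Stop sampling and return the record, starting from the oldest window."""
        self.sampling = False
        order = [(self.write_idx + k) % self.n_windows for k in range(self.n_windows)]
        return self.buf[order].ravel()

    def rearm(self):
        """Resume sampling after readout (a new acquisition can then be requested)."""
        self.sampling = True

# Write 40 windows of a ramp into a 32-window ring, then acquire
ring = RingSampler()
for w in range(40):
    ring.sample_window(np.arange(w * 64, (w + 1) * 64))
record = ring.acquire()
print(record[0], record[-1])   # first and last sample index kept in the record
```

Sampling is frozen from `acquire()` until `rearm()`, which mirrors the deadtime during digitization that this command-driven mode implies.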



#### **Rate Simulations**

- Includes serial interface locking and all register writing
- Correct behavior for one half of full system
- Assumes a very fast serial link (1 Gbps; tested to ~400 Mbps)
- <u>"Slow"</u> for multiple acquisitions/high data rate, but can be modified relatively easily (different/hybrid chip, see below) to target short signals (i.e. 64 samples) and/or partial acquisitions (fewer than 16 channels)
  - For example, a single-channel readout of a single window could require as little as ~13 us.





### **UDC Floorplan and Size**

- X values:
  - Xchan: ~1.5mm (x2)
  - Xdig = ~3.0 mm
- Ydie = 4.3 mm
- Xdie = 6.2 mm
- Total area: 26.6mm^2

Figure: UDC floorplan showing channels 00-15, the T.Gen/DACs block, and the XChan and Xdig dimensions.





### **UDC data examples**



Baseline noise and calibration



Fast photodiode signal captured by the UDC microchip (red) and by a fast oscilloscope (Tektronix DPO 5204B, 2 GHz, 10 GSa/s), showing a very good match on the overall shape and rising edge of the pulse.







Multipulse signal for timing evaluation



## Summary: UDC advantages and disadvantages

Advantages:

- Relatively dense: 16 channels in a 9x9 mm package, i.e. a footprint of ~5 mm^2 (2.25 mm x 2.25 mm) per pixel (not counting supporting electronics, i.e. amplifiers, regulators, clock tree, and routing constraints).
- Good timing performance (expected in the ~10ps range)
- "Moderate" power consumption (~30-35mW/channel)
- Simple serial interfacing (but point to point needed)
- On-chip digital storage

Disadvantages:

- Low rate: entire record needs to be read (easy change)
- **Deadtime**: no sampling during digitization
- Fairly complex clocking scheme (2-3 clocks needed)



### AARDVARCv3

Larger package (12x12)

Only 4 channels

Longer array (32k samples/channel)

Complex management mechanisms to increase rate/reduce data transfer:

- "Interval" centered by triggers
- Local ("self") triggering also combined between channels
- "ROI mode" select only windows with signal over threshold
- "Banking": permit simultaneous sampling and digitization





Micrograph of AARDVARC V3 in 130nm CMOS node. Overall dimensions are 5.88mm x 4.98mm = 29.3 mm<sup>2</sup>



## AARDVARC Theory of operation and usage

Designed for high rate experiments in independent streaming mode

- 1. **Setup**: Using RxIn, or parallel interface, the analog and digital parameters are written
- 2. Acquisition: data is continuously sampled and stored in round robin fashion in 256 (128 for rev4) windows of 64 samples
- 3. Triggering Digitization: using either an external trigger or self-trigger (possibly a combination of channels)
  - a. An interval around the trigger is selected
  - b. Windows in the interval are digitized (only "marked" windows if in ROI mode)
- 4. **Readout**: As soon as data is ready:
  - a. Data from the digitized channels and windows are sent via the serial interface *Tx*, using the TxClk
- 5. Back to 2. Wait for new trigger

Using the banking system, phases 2 and 3+4 can overlap (on different regions), thus permitting deadtime-free operation.

Note: it requires a collecting node always ready to receive data
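The ROI idea of digitizing only windows with signal over threshold can be sketched in software as follows. This is an offline illustration on an already-sampled waveform; the function name and the simple "any sample above threshold" marking rule are assumptions, not a description of the on-chip trigger logic.

```python
import numpy as np

def roi_windows(waveform, threshold, window_len=64):
    """Return the indices of windows containing at least one sample above threshold.

    Assumed marking rule for illustration: a window is "marked" if any of its
    samples exceeds `threshold`. Trailing samples that do not fill a whole
    window are ignored.
    """
    w = np.asarray(waveform, dtype=float)
    n_windows = len(w) // window_len
    windows = w[: n_windows * window_len].reshape(n_windows, window_len)
    return np.flatnonzero((windows > threshold).any(axis=1))

# Four windows of 64 samples with pulses in the second and fourth window
wave = np.zeros(256)
wave[70] = 10.0
wave[200] = 10.0
marked = roi_windows(wave, threshold=5.0)
print(marked, f"-> {len(marked) * 64} of {len(wave)} samples to read out")
```

Reading out only the marked windows is what reduces the data volume per event and therefore raises the sustainable event rate.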



#### AARDVARCv4

Improvements from v3:

- 8 channel device
- Input buffer > 2GHz:
  - Increase BW
  - Reduce "kickback" effects on sampling.
- Simpler clocking (internal PLL)
- Internal changes to improve data quality
- Extension of readout modes ("orthogonal" operations)
- Minor digital bugs corrected
- Bare chips just received from fabrication







## Summary: AARDVARC advantages and disadvantages

Advantages:

- Fully validated architecture (at least in rev. 3)
- Very good timing performance (validated in the ~10ps range)
- Simple serial interfacing (but point to point needed)
- Self-triggering and streaming capabilities
- Potential high rate (up to 150kHz single channel events/chip or full 8 channel readout at ~100kHz): minimum readout (64 samples or 128 samples in ROI mode)
- Long record length (16k/32k)
- On-board calibration (pedestal subtraction)
- Multibanking scheme permits deadtime-free operation below fixed (relatively high) rates
- PLL clocking scheme (only ref. clock needed in rev4)

Disadvantages:

- "High" power consumption (~70mW/channel) mainly due to long buffer length
- Larger footprint (2xUDC for rev4)



## "Ideal" LAPPD ROIC: UDC/AARDVARC Hybrid

From both:

- Good timing performance (expected in the ~10ps range)
- Simple serial interfacing (but point to point needed)

From UDC:

- Density: 16 channels in 9x9 package: possibly squeeze even more with smaller analog buffer
- "Moderate" power consumption (~30-35 mW/channel)
- On-chip digital storage (help decoupling readout)

From AARDVARC

- Potential high rate: minimum readout single window (single channel if ROI mode)
- Self-triggering and streaming capabilities
- On-board calibration (pedestal subtraction)
- Multibanking scheme permits deadtime-free operation below fixed (relatively high) rates
- PLL clocking scheme (only ref. clock needed) (in rev4)
- Input Amplification (modification of buffer to add moderate gain)

### **Board design and considerations**

- Board level integration is essential:
  - Fully harness high channel density at chip level
  - Handles power distribution and thermal control
  - Guarantees system level integration
  - Concentrates data
- Nalu experiences of integration with AARDVARC/UDC chips
  - HIPeR project (for Gen I LAPPD): 7 AARDVARCv3 in a "sparse" setup
  - DSA C10-8+: 2 AARDVARC with compact readout
  - DSA E10-96: 96 channel UDC-based readout



#### HIPeR

- Designed as part of a DoE SBIR
- Meant to be directly coupled to a Gen I LAPPD (same size as Gen I tile)
- Uses 14 AARDVARC to read 2 sides of the 28 strips.
- Individual amplifying stages per channel
- Finely matched time reference distribution
- Demonstrated fine spatial determination via ~20ps timing difference between pulses (on different asics)
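For a strip read out at both ends, the hit position along the strip follows from the arrival-time difference as x = v * dt / 2. The sketch below shows the conversion; the propagation velocity (half the speed of light) is a placeholder assumption, not a measured HIPeR or LAPPD value, and the function name is illustrative.

```python
C_MM_PER_PS = 0.299792458  # speed of light in mm/ps

def position_along_strip_mm(dt_ps, velocity_fraction=0.5):
    """Hit position relative to the strip centre from a two-ended readout.

    dt_ps is t_left - t_right. A hit displaced by x towards the right end
    arrives later on the left by 2x/v, so x = v * dt / 2.
    velocity_fraction (signal speed as a fraction of c) is a hypothetical value.
    """
    v_mm_per_ps = velocity_fraction * C_MM_PER_PS
    return 0.5 * v_mm_per_ps * dt_ps

# Position uncertainty corresponding to a 20 ps timing-difference resolution
print(f"{position_along_strip_mm(20.0):.2f} mm")
```

Under this assumed velocity, a timing difference resolved to about 20 ps corresponds to a position resolution on the millimetre scale along the strip.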









### **Product Description DSA C10-8+**

- **Product Name**: DSA C10-8+
- **Product Description**: 8 channel, 10 GSa/s digitizer with built-in amplifier
- **Dimensions**: ~6" x 4" x 1"
- **Gain**: 19.5 dB, replaceable
- **Bandwidth**: ~1.8 GHz
- **Digitizer Chip**: AARDVARC V3
- **Sample rate**: 9-14 GSa/s
- **Power**: USB
- **Data**: USB/UART, future ethernet
- **Integration**: Amps, chip, FPGA, clock, regulators, comm, FW, SW
- **Trigger**: Internal, External, Software





### **Product Description DSA E10-96**



- **Product Name**: DSA E10-96
- **Product Description**: 96 channel, 10 GSa/s digitizer
- **Dimensions**: ~12" x 4" x 1"
- **Bandwidth**: ~1.2 GHz
- **Digitizer Chip**: UDC V1
- **Sample rate**: 9-12 GSa/s
- **Power**: Terminal
- **Data**: USB 3/UART, future ethernet
- **Integration**: chip, FPGA, clock, regulators, comm, FW, SW
- **Trigger**: Internal, External, Software

#### 96 ch UFL breakout





#### Enclosure





PCB

### Software


- Easy-to-use interface
- Export mode
- Easy mode
- Toolbox for quick access
- Real time visualization
- Flexible number of channels on display
- CSV/binary export
- Calibration
- Data processing
- Data exporting



### Conclusions

- Waveform digitizing ROICs can be a robust and flexible solution for readout of high density LAPPD-based detectors requiring 10s of picosecond timing resolution and can provide a richer set of measurements than is typically possible with TDC-TOT approaches.
- Current versions of Nalu digitizers already cover most of the "strawman" LAPPD specs derived above.
- Small variations to support optimized LAPPD readout along with hybridization/fusion of proven features and capabilities should be possible with a moderate effort (i.e. with no new architecture development).
- Optimized <u>board integration</u> will be essential for fully leveraging the inherent capabilities of LAPPDs.
- Initial testing and R&D by developers can take advantage of <u>existing</u> Nalu board-level electronics and chips, which can work as realistic demonstrators for a subset of channels in benchtop testing and/or in environments such as beam tests, with fewer constraints (e.g. space, power, rate) than those imposed by typical deployments at scale in HEP/NP detectors. Both AARDVARC and UDC-based systems are currently available, along with engineering support for related items such as custom connectorization.
- We look forward to working with the community to develop a readout solution that meets the needs of the widest possible base of HEP/NP users.
- Finally, it is important to note that developments spurred by HEP/NP will also directly benefit highly important adjacent fields such as time-of-flight PET.

# Back up slides





#### Packaging - 9x9 mm QFN64 (0.5 mm pitch)

- Dark green: VDD 2.5 V
- Light green: VDD 1.2 V
- Orange: VSS
- Black: essential pins
- Purple: RF inputs
- Blue: LVDS pairs





#### AARDVARCv3 data



**Baseline noise and calibration** 

Please do not repost without permission. © 2022 Nalu Scientific LLC





Standard deviation of the distance between two peaks vs. number of events used for calibration (1k to 400k events, step size = 1k, randomly selected for each step)



#### **Timing calibration**




### **Software Features**

#### Straightforward readout configuration

- Tools for calibration generation and data correction
- Tools for data visualization and processing
- Automatically export data to disk while capturing

Captured data is stored with all calibration data and board parameters used for easy offline analysis.

Screenshot: capture panel of the acquisition software, with event and buffer limits, Start/Stop Acquisition buttons, readout channel selection (0-7), trigger settings (source, edge, interval), read-window settings (read windows, lookback, write after trigger), and per-channel trigger offset and level.

#### Capture Toolbox

Screenshot: calibration menu with the entries Trigger baseline, Threshold scan, Generate/Load/Save/Reset Pedestals, Load/Save/Reset Time calibration, ADC2mV calibration, Save/Load calibration, and DAC Sweep.

#### Calibration