tt-serdesphy - 240 Mbps SerDes PHY on SKY130

How I designed a single-lane Manchester-encoded SerDes PHY for TinyTapeout, covering the PLL, CDR, TX and RX datapaths, I2C CSR interface, and the challenges of building analog-digital mixed-signal circuits in an open-source PDK.

Overview

tt-serdesphy is a silicon-proven SerDes PHY submitted to TinyTapeout and fabricated on the SKY130 open-source PDK. The chip implements a single-lane differential serial link at 240 Mbps using Manchester encoding, a ring-oscillator PLL, a bang-bang CDR, dual FIFOs, and a PRBS-7 test engine - all controlled over I2C.

View the 3D GDS Layout - Interactive 3D viewer: step through metal layers and inspect individual cells of the fabricated chip.
Download Datasheet (PDF) - Full register map, AC timing specs, electrical characteristics, and loopback test circuit.
GitHub Repository - Verilog source, testbenches, and CI workflows for GDS generation, simulation, and FPGA validation.

What is TinyTapeout

TinyTapeout is an educational multi-project-wafer service that lets you submit a small digital or mixed-signal design and get it fabricated on a real chip. The platform handles the place-and-route, design rule checking, and tape-out logistics. Your design occupies a tile on a shared shuttle - for this project, a standard TinyTapeout tile with 8 input pins, 8 output pins, and 8 bidirectional pins.

Fitting a full SerDes PHY - PLL, CDR, serializer, deserializer, FIFOs, I2C slave, PRBS engine - into that pin budget is the central engineering constraint. Everything maps to the 24-pin interface.


Architecture - PCS and PMA

The design splits cleanly into two layers:

Layer                             Module                  Responsibility
PCS (Physical Coding Sublayer)    serdesphy_pcs.v         Data formatting, FIFOs, CSR registers, PRBS, clock domain crossing
PMA (Physical Medium Attachment)  serdesphy_pma (analog)  PLL, CDR, serializer, deserializer, CML driver, limiting amplifier

serdesphy_top.v instantiates one u_pcs and one u_pma and wires them together. The PCS runs in the 24 MHz domain; the PMA operates at 240 MHz for serial I/O.

tt-serdesphy system architecture diagram

Source Layout

src/
  serdesphy_top.v          top-level integration
  config.json              LibreLane configuration
  project.v                TinyTapeout wrapper

  digital/
    serdesphy_pcs.v        PCS top

    tx/
      serdesphy_word_assembler.v
      serdesphy_tx_fifo.v
      serdesphy_tx_data_mux.v
      serdesphy_manchester_encoder.v
      serdesphy_serializer_if.v
      serdesphy_prbs_generator.v
      serdesphy_tx_top.v

    rx/
      serdesphy_deserializer_if.v
      serdesphy_manchester_decoder.v
      serdesphy_rx_fifo.v
      serdesphy_word_disassembler.v
      serdesphy_prbs_checker.v
      serdesphy_rx_top.v

    csr/
      serdesphy_i2c_slave.v
      serdesphy_i2c_defines.v
      serdesphy_registerInterface.v
      serdesphy_serialInterface.v
      serdesphy_csr_top.v

    pll/
      serdesphy_pll_ctrl.v

    por/
    common/

  analog/                  PMA - PLL VCO, CDR, diff I/O

Part 1 - Clock Architecture

The chip runs from a single 24 MHz reference clock and derives all other clocks internally.

CLK_24M (24 MHz)
  |
  +---> PLL (10x ring oscillator) ---> CLK_240M_TX (240 MHz)
  |
  +---> All CSRs, FIFO control, word assembler/disassembler

CDR VCO output ---> CLK_240M_RX (240 MHz, recovered from RX data)

Three independent clock domains mean three sets of CDC (clock domain crossing) logic. The TX FIFO bridges CLK_24M writes to CLK_240M_TX reads. The RX FIFO bridges CLK_240M_RX writes to CLK_24M reads. This is where most of the subtle bugs live.

PLL Controller

serdesphy_pll_ctrl.v wraps the analog VCO with a digital lock detector and CSR interface. The PLL uses a four-state FSM:

State      Condition
UNLOCKED   Waiting for raw lock + VCO OK + CP OK
ACQUIRING  Counting 2400 consecutive valid cycles (~100 us)
LOCKED     Operational, monitoring for dropout
ERROR      Fault state, requires reset

The 100 us validation window (LOCK_COUNT_MAX = 2400 at 24 MHz) prevents false lock declarations from transient noise on the VCO control voltage. The PLL_LOCK output only asserts after all three conditions (raw lock, VCO in range, charge pump healthy) are simultaneously true for the full window.

PLL tuning is exposed through two CSR fields in PLL_CONFIG (0x04):

  • VCO_TRIM[3:0] - coarse frequency selection: 0x0 = 200 MHz, 0x8 = 240 MHz (nominal), 0xF = 400 MHz
  • CP_CURRENT[1:0] - charge pump current: 10 uA / 20 uA / 40 uA (default) / 80 uA

Part 2 - Transmit Datapath

Data enters the chip as 4-bit nibbles on TXD[3:0], gated by TX_VALID. The transmit path converts these to 240 Mbps differential output through six stages:

TXD[3:0] + TX_VALID
  --> Word Assembler   (2 nibbles -> 8-bit word, 2 CLK_24M cycles)
  --> TX FIFO          (8 words x 8 bits, 24 MHz write clock)
  --> Data Mux         (FIFO output or PRBS generator)
  --> Manchester Encoder (8 bits -> 16-bit biphase code)
  --> Serializer       (16 bits shifted out at 240 MHz, ~4.17 ns/bit)
  --> CML Driver       (differential, 100 ohm, 400-800 mV swing)

Word Assembler

serdesphy_word_assembler.v latches the first 4-bit nibble on one CLK_24M edge and the second on the next, combining them into an 8-bit word. The timing contract: each word needs two cycles with TX_VALID asserted, ideally back to back. If TX_VALID drops between the two nibbles, the assembler stalls and holds the partial word instead of corrupting it.
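The assembler's behavior can be sketched as a small Python model. This is only an illustration, not the RTL: the function name, the `(tx_valid, nibble)` tuple format, and especially the nibble order (high nibble first) are assumptions, since the text does not say which nibble arrives first.

```python
def assemble_words(cycles, high_nibble_first=True):
    """Model the nibble-to-byte assembler.

    `cycles` is one (tx_valid, nibble) pair per CLK_24M cycle.
    Nibble order is an assumption; flip `high_nibble_first` to change it.
    """
    words = []
    pending = None  # first nibble of the word in progress, if any
    for tx_valid, nibble in cycles:
        if not tx_valid:
            continue  # gap in TX_VALID: hold the partial word, latch nothing
        nibble &= 0xF
        if pending is None:
            pending = nibble
        else:
            if high_nibble_first:
                words.append((pending << 4) | nibble)
            else:
                words.append((nibble << 4) | pending)
            pending = None
    return words


# A gap between the two nibbles of the first word does not corrupt it.
stream = [(1, 0xA), (0, 0x0), (1, 0x5), (1, 0x3), (1, 0xC)]
print([hex(w) for w in assemble_words(stream)])  # ['0xa5', '0x3c']
```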

Manchester Encoder

serdesphy_manchester_encoder.v implements a four-state FSM (IDLE, ENCODING, READY, OUTPUT). The encoding rule:

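encoded_data_reg <= {
    input_data_reg[7] ? 2'b01 : 2'b10,   // 1 -> low-to-high, 0 -> high-to-low
    input_data_reg[6] ? 2'b01 : 2'b10,
    input_data_reg[5] ? 2'b01 : 2'b10,
    input_data_reg[4] ? 2'b01 : 2'b10,
    input_data_reg[3] ? 2'b01 : 2'b10,
    input_data_reg[2] ? 2'b01 : 2'b10,
    input_data_reg[1] ? 2'b01 : 2'b10,
    input_data_reg[0] ? 2'b01 : 2'b10    // LSB occupies the last symbol pair
};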

Each input bit becomes two output symbols - the transition always happens at the bit center. Manchester guarantees 100% transition density, which means the CDR on the receive side always has a timing edge to lock onto. It also means the symbol rate doubles: 240 Mbps serial rate carries 120 Mbps net data.

The encoder handshakes with the serializer via data_valid and serializer_ready, coordinating when the 16-bit encoded word is ready to shift out.
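To check the encoding rule by hand, here is a minimal Python model of the same mapping (1 -> 01, 0 -> 10, MSB first). The function name is made up for this sketch, and it covers only the combinational mapping, not the FSM or the handshake.

```python
def manchester_encode(byte):
    """Encode 8 data bits into 16 line symbols, MSB first: 1 -> 01, 0 -> 10."""
    encoded = 0
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        encoded = (encoded << 2) | (0b01 if bit else 0b10)
    return encoded


print(f"{manchester_encode(0xA5):016b}")  # 0110011010011001
```

Every encoded word contains exactly eight ones and eight zeros, which is the DC balance property in concrete form.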

TX FIFO

The 8-word-deep FIFO provides clock domain crossing from the 24 MHz word assembler to the 240 MHz serializer. Write clock: 24 MHz. Read clock: 240 MHz (CLK_240M_TX). On overflow (FIFO full, new write arriving), the FIFO asserts FIFO_ERR and discards the incoming data. FIFO_ERR is a sticky flag that clears when the STATUS register is read.

PRBS Generator

serdesphy_prbs_generator.v implements the PRBS-7 polynomial: x^7 + x^6 + 1. Output is 8 bits parallel, updating at 24 MHz. When TX_PRBS_EN is set in TX_CONFIG (0x01), the data mux bypasses the FIFO and feeds the PRBS sequence directly to the encoder. This is the standard bring-up test path.
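A software reference for the sequence is handy when checking captures from the link. The sketch below implements x^7 + x^6 + 1 as a 7-bit Fibonacci LFSR. The seed value and the MSB-first packing of bits into bytes are assumptions; the RTL may use a different seed or bit order, which would shift or reorder the same sequence.

```python
from itertools import islice


def prbs7_bits(seed=0x7F):
    """Yield the PRBS-7 bit stream for x^7 + x^6 + 1 (seed is an assumption)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("all-zero seed locks up the LFSR")
    while True:
        feedback = ((state >> 6) ^ (state >> 5)) & 1  # taps at x^7 and x^6
        state = ((state << 1) | feedback) & 0x7F
        yield feedback


def prbs7_bytes(seed=0x7F):
    """Pack the bit stream into bytes, MSB first (packing order is an assumption)."""
    bits = prbs7_bits(seed)
    while True:
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | next(bits)
        yield byte


one_period = list(islice(prbs7_bits(), 127))
print(sum(one_period))  # number of ones in one 127-bit period
```

Because 127 and 8 share no factors, the byte-wide output also repeats every 127 bytes.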


Part 3 - Receive Datapath

The receive path is the harder half. It runs a CDR to extract a clock from the incoming data, then deserializes and decodes in that recovered clock domain before crossing back to 24 MHz for output.

RXP/RXN (differential)
  --> Limiting Amplifier (10 mV sensitivity)
  --> CDR (bang-bang, Alexander architecture)
  --> Deserializer (16-bit shift register at 240 MHz recovered clock)
  --> Manchester Decoder (16 bits -> 8 bits)
  --> RX FIFO (CDC: CLK_240M_RX -> CLK_24M)
  --> Word Disassembler (8 bits -> two 4-bit nibbles over two cycles)
  --> RXD[3:0] + RX_VALID

Alignment FSM

serdesphy_rx_top.v runs an alignment FSM in parallel with the data path:

State               Behavior
ALIGN_STATE_SEARCH  Scanning for valid Manchester transitions
ALIGN_STATE_VERIFY  Confirming pattern stability over multiple words
ALIGN_STATE_LOCKED  Bit and word alignment confirmed, data valid

RX data is not considered valid until the alignment FSM reaches LOCKED and the CDR_LOCK status bit asserts. The rx_aligned signal gates downstream FIFO writes.

The RX controller FSM tracks a separate state: DISABLED, ALIGNING, ACQUIRING, ACTIVE, and ERROR. Errors from the decoder, FIFO, or serial path all push to the ERROR state and latch sticky flags.
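The idea behind the SEARCH state can be sketched in Python: a correctly aligned Manchester stream has a transition inside every symbol pair, while a stream that is off by one symbol soon shows a 00 or 11 pair. This is an illustrative model only; the helper names and the 16-pair window are assumptions, and word alignment is not modeled.

```python
def encode_bits(byte):
    """Manchester-encode one byte into 16 symbols, MSB first (1 -> 0,1; 0 -> 1,0)."""
    bits = []
    for i in range(7, -1, -1):
        bits += [0, 1] if (byte >> i) & 1 else [1, 0]
    return bits


def find_symbol_alignment(bits, pairs_to_check=16):
    """Return the symbol offset (0 or 1) where every pair has a transition, else None."""
    for offset in (0, 1):
        window = bits[offset:offset + 2 * pairs_to_check]
        pairs = list(zip(window[0::2], window[1::2]))
        if len(pairs) == pairs_to_check and all(a != b for a, b in pairs):
            return offset
    return None


def decode_bits(bits):
    """Decode aligned symbol pairs back to bytes; 00 and 11 pairs are errors."""
    out = []
    for i in range(0, len(bits) - 15, 16):
        byte = 0
        for a, b in zip(bits[i:i + 16:2], bits[i + 1:i + 16:2]):
            if a == b:
                raise ValueError("invalid Manchester pair")
            byte = (byte << 1) | b  # 01 -> 1, 10 -> 0
        out.append(byte)
    return out


stream = [0] + encode_bits(0xA5) + encode_bits(0x3C)  # one stray symbol in front
offset = find_symbol_alignment(stream)
print(offset, [hex(b) for b in decode_bits(stream[offset:])])  # 1 ['0xa5', '0x3c']
```

A constant payload (all ones or all zeros) encodes to a plain 0101... pattern that looks valid at either offset, which is one reason a separate verify step over several words is useful.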

CDR - Bang-Bang Phase Detector

The CDR uses an Alexander (bang-bang) phase detector. The detector compares three consecutive samples spanning one bit period - the previous data sample, the edge sample between bits, and the current data sample - and outputs a binary early/late decision. The loop filter integrates these decisions to drive the VCO control voltage.

CDR_GAIN in CDR_CONFIG (0x05) controls the loop bandwidth. Higher gain accelerates lock acquisition but reduces noise immunity and increases jitter. The datasheet recommends 0x3-0x6 as the operating range, with 0x4 as the default.

CDR lock time is 50-100 us. CDR_LOCK asserts when the phase error stays below 0.1 UI for more than 64 consecutive bits.
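The early/late decision of a textbook Alexander detector fits in a few lines. The sketch below takes two data samples and the edge sample between them. The function name and the sign convention (+1 = clock late, -1 = clock early) are assumptions, and the analog implementation on the chip is not modeled.

```python
def alexander_pd(prev_data, edge, curr_data):
    """Textbook Alexander decision from three consecutive samples.

    Returns -1 if the sampling clock is early, +1 if it is late,
    and 0 when there is no data transition to measure against.
    """
    if prev_data == curr_data:
        return 0  # no transition, no timing information
    # Edge sample still equals the old bit: it was taken before the
    # transition, so the clock is early. Otherwise the clock is late.
    return -1 if edge == prev_data else +1


for samples in [(0, 0, 1), (0, 1, 1), (1, 1, 1)]:
    print(samples, alexander_pd(*samples))
```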

PRBS Checker

serdesphy_prbs_checker.v compares the decoded byte stream against the expected PRBS-7 sequence and detects errors down to a single bit in each 8-bit word. The error counter saturates at 255 and resets via RX_ALIGN_RST in RX_CONFIG (0x02). PRBS_ERR in the STATUS register (0x06) is a sticky latch that clears on read.
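The counter and flag semantics can be modeled as follows. This is a sketch under assumptions: the class and method names are invented, and it counts individual bit errors, whereas the RTL may count erroneous words instead.

```python
class PrbsErrorMonitor:
    """Saturating error counter plus a sticky, clear-on-read error flag."""

    COUNT_MAX = 255

    def __init__(self):
        self.error_count = 0
        self.prbs_err = False

    def check_word(self, received, expected):
        """Compare one decoded byte with the expected PRBS byte."""
        bit_errors = bin((received ^ expected) & 0xFF).count("1")
        if bit_errors:
            self.prbs_err = True  # sticky until the status is read
            self.error_count = min(self.COUNT_MAX, self.error_count + bit_errors)
        return bit_errors

    def read_status_flag(self):
        """Model a STATUS read: return PRBS_ERR, then clear it."""
        flag = self.prbs_err
        self.prbs_err = False
        return flag

    def align_reset(self):
        """Model RX_ALIGN_RST: reset the error counter."""
        self.error_count = 0


mon = PrbsErrorMonitor()
mon.check_word(0xA5, 0xA5)      # clean word
mon.check_word(0xA4, 0xA5)      # single-bit error
for _ in range(100):
    mon.check_word(0x00, 0xFF)  # heavy errors drive the counter into saturation
print(mon.error_count, mon.read_status_flag(), mon.read_status_flag())  # 255 True False
```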


Part 4 - I2C CSR Interface

All configuration and status registers are accessible over I2C at slave address 0x42. The I2C slave is implemented in serdesphy_i2c_slave.v and operates up to 24 MHz SCL frequency.

Register Map

Address  Register      Key Fields
0x00     PHY_ENABLE    PHY_EN, ISO_EN
0x01     TX_CONFIG     TX_EN, TX_FIFO_EN, TX_PRBS_EN, TX_IDLE
0x02     RX_CONFIG     RX_EN, RX_FIFO_EN, RX_PRBS_CHK_EN, RX_ALIGN_RST
0x03     DATA_SELECT   TX_DATA_SEL (0=PRBS, 1=FIFO)
0x04     PLL_CONFIG    VCO_TRIM[3:0], CP_CURRENT[1:0], PLL_RST, PLL_BYPASS
0x05     CDR_CONFIG    CDR_GAIN[2:0], CDR_FAST_LOCK, CDR_RST
0x06     STATUS        PLL_LOCK, CDR_LOCK, FIFO flags, PRBS_ERR (read-only)
0x07     DEBUG_ENABLE  DBG_VCTRL, DBG_PD, DBG_FIFO

Initialization Sequence

Getting the link up requires this exact order:

1. Apply AVDD (3.3V), then DVDD (1.8V)
2. Assert RST_N low for >= 10 CLK_REF cycles
3. Write 0x01 to 0x00  (PHY_ENABLE: PHY_EN=1)
4. Write 0x00 to 0x04  (PLL_CONFIG: release PLL_RST)
5. Poll 0x06 STATUS[0] until PLL_LOCK = 1
6. Write 0x05 to 0x01  (TX_CONFIG: TX_EN + TX_PRBS_EN)
7. Write 0x00 to 0x03  (DATA_SELECT: PRBS source)
8. Write 0x00 to 0x05  (CDR_CONFIG: release CDR_RST)
9. Write 0x05 to 0x02  (RX_CONFIG: RX_EN + RX_PRBS_CHK_EN)
10. Poll 0x06 STATUS[1] until CDR_LOCK = 1
11. Monitor STATUS[6] (PRBS_ERR should stay 0)

Power supply sequencing matters: AVDD must be stable before DVDD because the analog bias circuits are isolated by ISO_EN (default 1 = isolated). PHY_EN must stay 0 until both supplies are stable.
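Steps 3 to 11 translate into a short host-side script. The sketch below is written against a hypothetical `bus` object with `write_reg(reg, value)` and `read_reg(reg)` methods; those names, the timeout, and the `FakeBus` stand-in are assumptions for illustration, not part of the project. Power sequencing and the reset pulse (steps 1 and 2) happen outside I2C and are not shown.

```python
import time

# Register addresses and values taken from the register map and sequence above.
PHY_ENABLE, TX_CONFIG, RX_CONFIG, DATA_SELECT = 0x00, 0x01, 0x02, 0x03
PLL_CONFIG, CDR_CONFIG, STATUS = 0x04, 0x05, 0x06


def poll_bit(bus, reg, bit, timeout_s=0.01):
    """Read `reg` until `bit` is set, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        if bus.read_reg(reg) & (1 << bit):
            return
        if time.monotonic() > deadline:
            raise TimeoutError(f"bit {bit} of register 0x{reg:02x} never set")


def bring_up(bus):
    """Run steps 3-11; return True if PRBS_ERR is clear after CDR lock."""
    bus.write_reg(PHY_ENABLE, 0x01)   # PHY_EN=1
    bus.write_reg(PLL_CONFIG, 0x00)   # release PLL_RST
    poll_bit(bus, STATUS, 0)          # wait for PLL_LOCK
    bus.write_reg(TX_CONFIG, 0x05)    # TX_EN + TX_PRBS_EN
    bus.write_reg(DATA_SELECT, 0x00)  # PRBS source
    bus.write_reg(CDR_CONFIG, 0x00)   # release CDR_RST
    bus.write_reg(RX_CONFIG, 0x05)    # RX_EN + RX_PRBS_CHK_EN
    poll_bit(bus, STATUS, 1)          # wait for CDR_LOCK
    return not (bus.read_reg(STATUS) & (1 << 6))  # PRBS_ERR should be 0


class FakeBus:
    """Stand-in for real hardware: logs writes and reports both locks as set."""

    def __init__(self):
        self.writes = []

    def write_reg(self, reg, value):
        self.writes.append((reg, value))

    def read_reg(self, reg):
        return 0x03 if reg == STATUS else 0x00


print(bring_up(FakeBus()))  # True
```

Since the sticky bits in STATUS clear on read, every poll also clears PRBS_ERR, so the final read only reports errors seen since the previous poll.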

I2C Transaction Format

Write:  START | 0x42 + W | ACK | REG_ADDR | ACK | DATA | ACK | STOP
Read:   START | 0x42 + W | ACK | REG_ADDR | ACK |
        START | 0x42 + R | ACK | DATA | NACK | STOP
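On the wire, the address and the R/W flag share the first byte. The sketch below builds the raw byte sequences for both transaction types, assuming 0x42 is a 7-bit address (so it is shifted left by one and the R/W bit goes in bit 0). The function names are invented for the example.

```python
I2C_ADDR = 0x42  # assumed to be a 7-bit address


def write_frame(reg, data):
    """Bytes sent between START and STOP for a register write."""
    return [(I2C_ADDR << 1) | 0, reg & 0xFF, data & 0xFF]


def read_frames(reg):
    """Bytes sent by the master for a register read: address phase, then repeated START."""
    set_pointer = [(I2C_ADDR << 1) | 0, reg & 0xFF]
    read_request = [(I2C_ADDR << 1) | 1]
    return set_pointer, read_request


print([hex(b) for b in write_frame(0x01, 0x05)])  # ['0x84', '0x1', '0x5']
print([[hex(b) for b in f] for f in read_frames(0x06)])  # [['0x84', '0x6'], ['0x85']]
```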

Part 5 - Debug Infrastructure

The chip includes a DBG_ANA output pin that can be mux-routed to one of three internal analog signals, controlled by DEBUG_ENABLE (0x07):

Bit        Source
DBG_VCTRL  VCO control voltage
DBG_PD     Phase detector output
DBG_FIFO   FIFO status signal

Only one source can be active at a time. This is essential for characterizing PLL lock behavior and CDR tracking on silicon - you can route the VCO control voltage to DBG_ANA and probe it with an oscilloscope to verify the control loop is settling correctly.


Challenges When Implementing Your Own SerDes

1. Clock domain crossing is where bugs hide

The TX and RX FIFOs each cross between the 24 MHz control domain and the 240 MHz serial domain. A naive register-stage CDC works if the data changes slowly relative to the destination clock, but for FIFOs you need gray-coded pointers or a proper handshake. Getting this wrong produces intermittent data corruption that is almost impossible to reproduce in simulation at the gate level.

The safe path: use gray-coded read/write pointers, synchronize them with two flip-flop stages in the destination domain, and prove the pointer arithmetic is correct before anything else. The FIFO full/empty logic based on incorrect synchronized pointers produces the most confusing waveforms you will ever debug.
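The pointer arithmetic is easy to prove exhaustively in software before touching RTL. The sketch below assumes the usual scheme for an 8-deep FIFO: 4-bit pointers (one extra wrap bit), gray-coded before crossing domains. The function names are invented, and the project's FIFOs may differ in detail.

```python
DEPTH = 8
PTR_BITS = 4  # log2(DEPTH) + 1 wrap bit
PTR_MASK = (1 << PTR_BITS) - 1


def bin_to_gray(b):
    return b ^ (b >> 1)


def gray_to_bin(g):
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b


def is_empty(wptr_gray, rptr_gray):
    """Empty when the pointers match exactly."""
    return wptr_gray == rptr_gray


def is_full(wptr_gray, rptr_gray):
    """Full when the pointers are DEPTH apart: top two gray bits differ, the rest match."""
    return wptr_gray == (rptr_gray ^ (0b11 << (PTR_BITS - 2)))


# Consecutive pointer values differ in exactly one bit, including the wrap,
# so a synchronizer that samples mid-change sees either the old or the new value.
for b in range(1 << PTR_BITS):
    nxt = (b + 1) & PTR_MASK
    assert bin((bin_to_gray(b) ^ bin_to_gray(nxt))).count("1") == 1

print(is_full(bin_to_gray(8), bin_to_gray(0)), is_empty(bin_to_gray(3), bin_to_gray(3)))  # True True
```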

2. Bang-bang CDRs have a fixed phase error floor

A bang-bang (binary) phase detector makes a hard early/late decision on every bit. Unlike a linear PD, it has no proportional output - it is always commanding maximum correction or nothing. This produces a limit cycle: the phase permanently oscillates by approximately one step of the VCO tuning resolution. That residual jitter is irreducible and sets your minimum phase error floor.

The practical consequence: CDR_GAIN is a trade-off knob, not a free variable. Set it high and lock is fast but jitter is large. Set it low and jitter is small but lock takes longer and the tracking bandwidth may be insufficient to follow a spread-spectrum clock source. For the 240 Mbps target with a bang-bang PD, the 5-10 UI RMS phase error spec in the datasheet reflects this fundamental limit.
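The trade-off shows up even in a toy model. The sketch below simulates an idealized first-order bang-bang loop: each bit, the phase error moves by a fixed step toward zero. The step size stands in for loop gain; there is no noise, loop latency, or frequency offset, and all names and numbers are illustrative rather than taken from the design.

```python
def simulate_bang_bang(initial_error_ui, step_ui, n_bits):
    """Return the phase error (in UI) after each bit for an ideal first-order loop."""
    phase = initial_error_ui
    history = []
    for _ in range(n_bits):
        decision = 1 if phase > 0 else -1  # hard early/late decision
        phase -= step_ui * decision        # full-size correction every bit
        history.append(phase)
    return history


def bits_to_lock(history, step_ui):
    """Index of the first bit where the error is within one step of zero."""
    return next(i for i, p in enumerate(history) if abs(p) <= step_ui + 1e-12)


for step in (0.05, 0.01):
    h = simulate_bang_bang(0.4, step, 500)
    residual = max(abs(p) for p in h[-50:])
    print(f"step={step} UI: lock after {bits_to_lock(h, step)} bits, residual ~{residual:.3f} UI")
```

The larger step reaches lock in far fewer bits, but the steady-state error keeps dithering by about one step instead of settling to zero.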

3. Manchester doubles your symbol rate budget

Manchester encoding guarantees DC balance and 100% transition density - both good properties. What it costs is a 2x symbol rate. To carry 120 Mbps net data you need a 240 Mbps serial channel. Every timing closure number in your PLL and serializer now applies to 4.17 ns bit periods (240 MHz), not 8.33 ns (120 MHz). On SKY130 that is a meaningful difference for ring-oscillator based designs.

If your target data rate is already limited by the PDK's maximum toggle frequency, Manchester is a bad choice. 8b/10b gives you DC balance and run-length control with only a 20% overhead.
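The numbers behind this comparison are simple to reproduce. The short calculation below uses only the rates quoted above; the 8b/10b line is a hypothetical alternative for the same 120 Mbps payload, not something the chip implements.

```python
LINE_RATE_MBPS = 240.0

bit_period_ns = 1e3 / LINE_RATE_MBPS                 # one serial symbol
manchester_net_mbps = LINE_RATE_MBPS / 2             # two symbols per data bit
word_time_ns = 16 * bit_period_ns                    # one encoded 16-symbol word

line_rate_8b10b_mbps = manchester_net_mbps * 10 / 8  # same payload with 8b/10b
bit_period_8b10b_ns = 1e3 / line_rate_8b10b_mbps

print(f"Manchester: {bit_period_ns:.2f} ns/symbol, {manchester_net_mbps:.0f} Mbps net, "
      f"{word_time_ns:.1f} ns per encoded word")
print(f"8b/10b for the same payload: {line_rate_8b10b_mbps:.0f} Mbps line rate, "
      f"{bit_period_8b10b_ns:.2f} ns/symbol")
```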

4. PLL lock detection needs hysteresis

The naive lock detector - "assert lock when the raw lock signal goes high" - produces a chatter condition where PLL_LOCK bounces at startup as the VCO control voltage settles. Any downstream logic that reacts to PLL_LOCK (like releasing the CDR reset) will misfire.

The fix is a counter-based lock detector with separate lock and unlock thresholds. The LOCK_COUNT_MAX = 2400 cycles (~100 us) in this design means the PLL has to stay locked continuously for 100 us before PLL_LOCK asserts. Dropout detection uses a shorter UNLOCK_COUNT = 240 (~10 us) to catch real loss-of-lock quickly without reacting to transients.
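The asymmetric thresholds can be modeled in a few lines. The sketch below is a simplified two-state version (locked or not) using the two constants from the text; the real controller has four states including ERROR, and here `raw_ok` stands for raw lock, VCO OK, and CP OK all being true in a given 24 MHz cycle. The class name is invented.

```python
LOCK_COUNT_MAX = 2400  # ~100 us at 24 MHz
UNLOCK_COUNT = 240     # ~10 us at 24 MHz


class LockDetector:
    """Counter-based lock detector with separate lock and unlock thresholds."""

    def __init__(self):
        self.locked = False
        self.count = 0

    def step(self, raw_ok):
        """Advance one reference clock cycle and return the filtered lock state."""
        if not self.locked:
            # Any bad cycle restarts the validation window.
            self.count = self.count + 1 if raw_ok else 0
            if self.count >= LOCK_COUNT_MAX:
                self.locked, self.count = True, 0
        else:
            # Any good cycle clears the dropout counter.
            self.count = 0 if raw_ok else self.count + 1
            if self.count >= UNLOCK_COUNT:
                self.locked, self.count = False, 0
        return self.locked


det = LockDetector()
for _ in range(LOCK_COUNT_MAX):
    det.step(True)
print(det.locked)  # True after 2400 consecutive good cycles
for _ in range(UNLOCK_COUNT - 1):
    det.step(False)
print(det.locked)  # still True: the glitch was shorter than the unlock threshold
```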

5. Supply sequencing and analog isolation

On SKY130, analog circuits powered by AVDD (3.3V) that are connected to a digital domain running DVDD (1.8V) need explicit isolation during power-up. If DVDD comes up before AVDD, the digital outputs can stress the analog inputs beyond their rated voltage.

ISO_EN in PHY_ENABLE (0x00) defaults to 1 (isolated). The initialization sequence relies on this: even if PHY_EN goes high early, the analog bias stays disconnected until ISO_EN is explicitly cleared. Do not skip the power sequencing step.

6. Differential routing and termination

The TXP/TXN and RXP/RXN pairs are CML (Current Mode Logic) with 100 ohm differential impedance. The loopback test described in the datasheet uses a 100 ohm differential resistor bridging TXP/TXN back to RXP/RXN. If your PCB trace length or termination is wrong, the received signal amplitude can fall below the 10 mV minimum sensitivity of the limiting amplifier, and the CDR will never lock regardless of software configuration.

The differential input sensitivity spec (10 mV minimum) applies at the input of the limiting amplifier. That amplifier has a finite bandwidth that rolls off at high frequency - a clean 10 mV at DC becomes a degraded signal at 240 MHz. The 400-800 mV output swing from the CML driver provides substantial margin when the loopback path is short and matched. On a real multi-chip link, add careful SI analysis.


Electrical Summary

Parameter                  Min    Typ    Max    Unit
Data rate                  -      240    -      Mbps
Reference clock            23.5   24.0   24.5   MHz
PLL lock time              -      8      10     us
CDR lock time              -      50     100    us
TX output swing (100 ohm)  400    600    800    mV_pp
RX sensitivity             10     -      -      mV_pp
Active supply current      -      12     18     mA
DVDD                       1.71   1.8    1.89   V
AVDD                       3.0    3.3    3.6    V

Full register descriptions, AC timing specs, and the loopback test circuit are in the datasheet linked above.


Tapeout Result

The GDS render above shows the fabricated layout. The design was submitted to TinyTapeout as project 0000 in the HDL track, targeting the SKY130 PDK.

View the 3D GDS Layout - Step through individual metal layers, zoom into cells, and explore the full physical implementation in the browser.
GitHub Repository - Full Verilog source, testbenches, and CI workflows (GDS generation, functional simulation, FPGA validation) are all open source.
Download Datasheet (PDF) - Complete electrical specifications, register descriptions, and application circuits.