tt-serdesphy - 240 Mbps SerDes PHY on SKY130
How I designed a single-lane Manchester-encoded SerDes PHY for TinyTapeout, covering the PLL, CDR, TX and RX datapaths, I2C CSR interface, and the challenges of building analog-digital mixed-signal circuits in an open-source PDK.

Overview
tt-serdesphy is a silicon-proven SerDes PHY submitted to TinyTapeout and fabricated on the SKY130 open-source PDK. The chip implements a single-lane differential serial link at 240 Mbps using Manchester encoding, a ring-oscillator PLL, a bang-bang CDR, dual FIFOs, and a PRBS-7 test engine - all controlled over I2C.
What is TinyTapeout
TinyTapeout is an educational multi-project-wafer service that lets you submit a small digital or mixed-signal design and get it fabricated on a real chip. The platform handles the place-and-route, design rule checking, and tape-out logistics. Your design occupies a tile on a shared shuttle - for this project, a standard TinyTapeout tile with 8 input pins, 8 output pins, and 8 bidirectional pins.
Fitting a full SerDes PHY - PLL, CDR, serializer, deserializer, FIFOs, I2C slave, PRBS engine - into that pin budget is the central engineering constraint. Everything maps to the 24-pin interface.
Architecture - PCS and PMA
The design splits cleanly into two layers:
| Layer | Module | Responsibility |
|---|---|---|
| PCS (Physical Coding Sublayer) | serdesphy_pcs.v | Data formatting, FIFOs, CSR registers, PRBS, clock domain crossing |
| PMA (Physical Medium Attachment) | serdesphy_pma (analog) | PLL, CDR, serializer, deserializer, CML driver, limiting amplifier |
serdesphy_top.v instantiates one u_pcs and one u_pma and wires them together. The PCS runs in the 24 MHz domain; the PMA operates at 240 MHz for serial I/O.
Source Layout
src/
  serdesphy_top.v    top-level integration
  config.json        LibreLane configuration
  project.v          TinyTapeout wrapper
  digital/
    serdesphy_pcs.v  PCS top
    tx/
      serdesphy_word_assembler.v
      serdesphy_tx_fifo.v
      serdesphy_tx_data_mux.v
      serdesphy_manchester_encoder.v
      serdesphy_serializer_if.v
      serdesphy_prbs_generator.v
      serdesphy_tx_top.v
    rx/
      serdesphy_deserializer_if.v
      serdesphy_manchester_decoder.v
      serdesphy_rx_fifo.v
      serdesphy_word_disassembler.v
      serdesphy_prbs_checker.v
      serdesphy_rx_top.v
    csr/
      serdesphy_i2c_slave.v
      serdesphy_i2c_defines.v
      serdesphy_registerInterface.v
      serdesphy_serialInterface.v
      serdesphy_csr_top.v
    pll/
      serdesphy_pll_ctrl.v
    por/
    common/
  analog/            PMA - PLL VCO, CDR, diff I/O
Part 1 - Clock Architecture
The chip runs from a single 24 MHz reference clock and derives all other clocks internally.
CLK_24M (24 MHz)
|
+---> PLL (10x ring oscillator) ---> CLK_240M_TX (240 MHz)
|
+---> All CSRs, FIFO control, word assembler/disassembler
CDR VCO output ---> CLK_240M_RX (240 MHz, recovered from RX data)
Three independent clock domains mean three sets of CDC (clock domain crossing) logic. The TX FIFO bridges CLK_24M writes to CLK_240M_TX reads. The RX FIFO bridges CLK_240M_RX writes to CLK_24M reads. This is where most of the subtle bugs live.
PLL Controller
serdesphy_pll_ctrl.v wraps the analog VCO with a digital lock detector and CSR interface. The PLL uses a four-state FSM:
| State | Condition |
|---|---|
| UNLOCKED | Waiting for raw lock + VCO OK + CP OK |
| ACQUIRING | Counting 2400 consecutive valid cycles (~100 us) |
| LOCKED | Operational, monitoring for dropout |
| ERROR | Fault state, requires reset |
The 100 us validation window (LOCK_COUNT_MAX = 2400 at 24 MHz) prevents false lock declarations from transient noise on the VCO control voltage. The PLL_LOCK output only asserts after all three conditions (raw lock, VCO in range, charge pump healthy) are simultaneously true for the full window.
PLL tuning is exposed through two CSR fields in PLL_CONFIG (0x04):
- VCO_TRIM[3:0] - coarse frequency selection: 0x0 = 200 MHz, 0x8 = 240 MHz (nominal), 0xF = 400 MHz
- CP_CURRENT[1:0] - charge pump current: 10 uA / 20 uA / 40 uA (default) / 80 uA
Part 2 - Transmit Datapath
Data enters the chip as 4-bit nibbles on TXD[3:0], gated by TX_VALID. The transmit path converts these to 240 Mbps differential output through six stages:
TXD[3:0] + TX_VALID
--> Word Assembler (2 nibbles -> 8-bit word, 2 CLK_24M cycles)
--> TX FIFO (8 words x 8 bits, 24 MHz write clock)
--> Data Mux (FIFO output or PRBS generator)
--> Manchester Encoder (8 bits -> 16-bit biphase code)
--> Serializer (16 bits shifted out at 240 MHz, ~4.17 ns/bit)
--> CML Driver (differential, 100 ohm, 400-800 mV swing)
Word Assembler
serdesphy_word_assembler.v latches the first 4-bit nibble on one CLK_24M edge and the second on the next, combining them into an 8-bit word. The timing contract is strict: TX_VALID must be held for two consecutive cycles per word. Gaps in TX_VALID stall the assembler without corrupting the partial word.
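The stall behavior is easy to check with a small behavioral model. The sketch below is Python rather than the project's Verilog, the class and method names are made up for illustration, and placing the first nibble in the upper half of the word is an assumption (the nibble order is not stated here).

```python
class WordAssembler:
    """Behavioral model: two 4-bit nibbles become one 8-bit word."""

    def __init__(self):
        self.pending = None  # first nibble of a partially assembled word

    def clock(self, txd, tx_valid):
        """Advance one CLK_24M cycle; return a completed word or None."""
        if not tx_valid:
            return None  # stall: the partial word is left untouched
        nibble = txd & 0xF
        if self.pending is None:
            self.pending = nibble
            return None
        # Assumption: the first nibble received is the high half of the word.
        word = (self.pending << 4) | nibble
        self.pending = None
        return word
```

Feeding 0xA, then a cycle with TX_VALID low, then 0x5 yields a single word 0xA5 on the third cycle.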
Manchester Encoder
serdesphy_manchester_encoder.v implements a four-state FSM (IDLE, ENCODING, READY, OUTPUT). The encoding rule:
encoded_data_reg <= {
input_data_reg[7] ? 2'b01 : 2'b10, // 1 -> low-to-high, 0 -> high-to-low
input_data_reg[6] ? 2'b01 : 2'b10,
// ... for all 8 bits
};
Each input bit becomes two output symbols - the transition always happens at the bit center. Manchester guarantees 100% transition density, which means the CDR on the receive side always has a timing edge to lock onto. It also means the symbol rate doubles: 240 Mbps serial rate carries 120 Mbps net data.
The encoder handshakes with the serializer via data_valid and serializer_ready, coordinating when the 16-bit encoded word is ready to shift out.
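As a quick cross-check of the encoding rule, here is a Python model of the same mapping (1 becomes 01, 0 becomes 10, MSB first). It is a reference sketch for comparing against simulation output, not a translation of the RTL; the function name is hypothetical.

```python
def manchester_encode(byte):
    """Encode 8 bits MSB-first into a 16-bit word: 1 -> 0b01, 0 -> 0b10."""
    encoded = 0
    for i in range(7, -1, -1):
        bit = (byte >> i) & 1
        encoded = (encoded << 2) | (0b01 if bit else 0b10)
    return encoded


def ones_count(word):
    """Number of set bits; always 8 for a valid 16-bit Manchester word."""
    return bin(word).count("1")
```

Every encoded word contains exactly eight ones and eight zeros, which is the DC balance property in concrete form.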
TX FIFO
The 8-word deep FIFO provides clock domain crossing from the 24 MHz word assembler to the 240 MHz serializer. Write clock: 24 MHz. Read clock: 240 MHz (CLK_240M_TX). On overflow (FIFO full, new write arriving), the FIFO asserts FIFO_ERR and discards the incoming data. This is a sticky flag that clears on read from the STATUS register.
PRBS Generator
serdesphy_prbs_generator.v implements the PRBS-7 polynomial: x^7 + x^6 + 1. Output is 8 bits parallel, updating at 24 MHz. When TX_PRBS_EN is set in TX_CONFIG (0x01), the data mux bypasses the FIFO and feeds the PRBS sequence directly to the encoder. This is the standard bring-up test path.
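A software reference for the sequence is handy when checking captured data. The sketch below is a Fibonacci LFSR for x^7 + x^6 + 1 in Python. The seed value, the tap convention, and the MSB-first packing into bytes are assumptions, so the bit order may need adjusting to match the RTL.

```python
def prbs7_bits(seed=0x7F):
    """Yield the PRBS-7 bit stream for x^7 + x^6 + 1 (seed must be non-zero)."""
    state = seed & 0x7F
    while True:
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # taps at stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        yield new_bit


def prbs7_bytes(count, seed=0x7F):
    """Pack the bit stream into `count` bytes, MSB first (assumed ordering)."""
    bits = prbs7_bits(seed)
    out = []
    for _ in range(count):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | next(bits)
        out.append(byte)
    return out
```

The bit stream repeats every 127 bits, so the byte stream repeats every 127 bytes (8 full periods).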
Part 3 - Receive Datapath
The receive path is the harder half. It runs a CDR to extract a clock from the incoming data, then deserializes and decodes in that recovered clock domain before crossing back to 24 MHz for output.
RXP/RXN (differential)
--> Limiting Amplifier (10 mV sensitivity)
--> CDR (bang-bang, Alexander architecture)
--> Deserializer (16-bit shift register at 240 MHz recovered clock)
--> Manchester Decoder (16 bits -> 8 bits)
--> RX FIFO (CDC: CLK_240M_RX -> CLK_24M)
--> Word Disassembler (8 bits -> two 4-bit nibbles over two cycles)
--> RXD[3:0] + RX_VALID
Alignment FSM
serdesphy_rx_top.v runs an alignment FSM in parallel with the data path:
| State | Behavior |
|---|---|
| ALIGN_STATE_SEARCH | Scanning for valid Manchester transitions |
| ALIGN_STATE_VERIFY | Confirming pattern stability over multiple words |
| ALIGN_STATE_LOCKED | Bit and word alignment confirmed, data valid |
RX data is not considered valid until the alignment FSM reaches LOCKED and the CDR_LOCK status bit asserts. The rx_aligned signal gates downstream FIFO writes.
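The search-then-verify idea can be sketched in software. The Python model below looks for the symbol-pair offset at which a run of consecutive pairs are all valid Manchester symbols, then decodes from that offset. It covers pair alignment only; how the RTL finds word boundaries is not described here, and the function names and the `verify_pairs` threshold are hypothetical.

```python
VALID_PAIRS = {(0, 1): 1, (1, 0): 0}  # same mapping as the encoder: 01 -> 1, 10 -> 0


def find_pair_phase(bits, verify_pairs=16):
    """Return the offset (0 or 1) at which `verify_pairs` consecutive pairs
    are all valid Manchester symbols, or None if neither offset qualifies."""
    for phase in (0, 1):
        window = bits[phase:phase + 2 * verify_pairs]
        if len(window) < 2 * verify_pairs:
            continue  # not enough data to verify this offset
        if all(pair in VALID_PAIRS for pair in zip(window[0::2], window[1::2])):
            return phase
    return None


def manchester_decode(bits, phase):
    """Decode pairs starting at `phase`; raise ValueError on a coding violation."""
    decoded = []
    for i in range(phase, len(bits) - 1, 2):
        pair = (bits[i], bits[i + 1])
        if pair not in VALID_PAIRS:
            raise ValueError(f"coding violation at bit {i}")
        decoded.append(VALID_PAIRS[pair])
    return decoded
```

One caveat of this simple search: a constant data pattern (all ones or all zeros) is a valid Manchester stream at both offsets, so verification only means something on data that contains bit changes.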
The RX controller FSM tracks a separate state: DISABLED, ALIGNING, ACQUIRING, ACTIVE, and ERROR. Errors from the decoder, FIFO, or serial path all push to the ERROR state and latch sticky flags.
CDR - Bang-Bang Phase Detector
The CDR uses an Alexander (bang-bang) phase detector. The detector samples the incoming bit stream at three points per bit period - setup, center, and hold - and outputs a binary early/late decision. The loop filter integrates these decisions to drive the VCO control voltage.
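For reference, the textbook Alexander decision rule fits in a few lines. In its usual form the three samples are two consecutive data samples plus the edge sample taken between them; how that maps onto the setup/center/hold naming above is an assumption, as is the sign convention (+1 for an early clock).

```python
def alexander_pd(prev_data, edge, curr_data):
    """Textbook Alexander phase detector decision.

    Returns +1 if the sampling clock is early, -1 if it is late,
    and 0 when there is no data transition (no timing information).
    """
    if prev_data == curr_data:
        return 0  # no transition between the two data samples
    # If the edge sample still matches the old bit, it was taken before
    # the transition, so the clock is early; otherwise it is late.
    return 1 if edge == prev_data else -1


def integrate(decisions, gain=1):
    """Minimal loop-filter stand-in: a running sum of the decisions."""
    total = 0
    for d in decisions:
        total += gain * d
    return total
```

With Manchester coding there is a transition in every bit cell, so the zero output never appears at the mid-bit edge of an aligned stream.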
CDR_GAIN in CDR_CONFIG (0x05) controls the loop bandwidth. Higher gain accelerates lock acquisition but reduces noise immunity and increases jitter. The datasheet recommends 0x3-0x6 as the operating range, with 0x4 as the default.
CDR lock time is 50-100 us. CDR_LOCK asserts when the phase error stays below 0.1 UI for more than 64 consecutive bits.
PRBS Checker
serdesphy_prbs_checker.v compares the decoded byte stream against the expected PRBS-7 sequence. Single-bit errors per 8-bit word are detected. The error counter saturates at 255 and resets via RX_ALIGN_RST in RX_CONFIG (0x02). PRBS_ERR in the STATUS register (0x06) is a sticky latch that clears on read.
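The counter and flag behavior can be modeled as below. This Python sketch assumes the counter counts errored words rather than individual bit errors, which the description leaves open; the class and method names are hypothetical.

```python
class PrbsChecker:
    """Behavioral model of the error counter and the sticky PRBS_ERR flag."""

    COUNT_MAX = 255

    def __init__(self):
        self.error_count = 0
        self.prbs_err = False  # sticky until STATUS is read

    def check(self, received, expected):
        """Compare one decoded byte against the expected PRBS byte."""
        if (received ^ expected) & 0xFF:
            # Assumption: one count per errored word, saturating at 255.
            self.error_count = min(self.error_count + 1, self.COUNT_MAX)
            self.prbs_err = True

    def read_status(self):
        """Return PRBS_ERR and clear it, mimicking clear-on-read."""
        flag = self.prbs_err
        self.prbs_err = False
        return flag

    def align_reset(self):
        """Model RX_ALIGN_RST clearing the error counter."""
        self.error_count = 0
```

Because the flag clears on read, a polling loop sees an error burst once and must accumulate it itself if it wants a running total.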
Part 4 - I2C CSR Interface
All configuration and status registers are accessible over I2C at slave address 0x42. The I2C slave is implemented in serdesphy_i2c_slave.v and operates up to 24 MHz SCL frequency.
Register Map
| Address | Register | Key Fields |
|---|---|---|
| 0x00 | PHY_ENABLE | PHY_EN, ISO_EN |
| 0x01 | TX_CONFIG | TX_EN, TX_FIFO_EN, TX_PRBS_EN, TX_IDLE |
| 0x02 | RX_CONFIG | RX_EN, RX_FIFO_EN, RX_PRBS_CHK_EN, RX_ALIGN_RST |
| 0x03 | DATA_SELECT | TX_DATA_SEL (0=PRBS, 1=FIFO) |
| 0x04 | PLL_CONFIG | VCO_TRIM[3:0], CP_CURRENT[1:0], PLL_RST, PLL_BYPASS |
| 0x05 | CDR_CONFIG | CDR_GAIN[2:0], CDR_FAST_LOCK, CDR_RST |
| 0x06 | STATUS | PLL_LOCK, CDR_LOCK, FIFO flags, PRBS_ERR (read-only) |
| 0x07 | DEBUG_ENABLE | DBG_VCTRL, DBG_PD, DBG_FIFO |
Initialization Sequence
Getting the link up requires this exact order:
1. Apply AVDD (3.3V), then DVDD (1.8V)
2. Assert RST_N low for >= 10 CLK_REF cycles
3. Write 0x01 to 0x00 (PHY_ENABLE: PHY_EN=1)
4. Write 0x00 to 0x04 (PLL_CONFIG: release PLL_RST)
5. Poll 0x06 STATUS[0] until PLL_LOCK = 1
6. Write 0x05 to 0x01 (TX_CONFIG: TX_EN + TX_PRBS_EN)
7. Write 0x00 to 0x03 (DATA_SELECT: PRBS source)
8. Write 0x00 to 0x05 (CDR_CONFIG: release CDR_RST)
9. Write 0x05 to 0x02 (RX_CONFIG: RX_EN + RX_PRBS_CHK_EN)
10. Poll 0x06 STATUS[1] until CDR_LOCK = 1
11. Monitor STATUS[6] (PRBS_ERR should stay 0)
Power supply sequencing matters: AVDD must be stable before DVDD because the analog bias circuits are isolated by ISO_EN (default 1 = isolated). PHY_EN must stay 0 until both supplies are stable.
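Steps 3 to 11 translate directly into host-side code. The Python sketch below takes two caller-supplied callables, `write_reg(addr, value)` and `read_reg(addr)`, which are hypothetical stand-ins for whatever I2C host library is in use; the poll limit is also an arbitrary choice. Power sequencing and reset (steps 1 and 2) happen in hardware and are not modeled.

```python
PHY_ENABLE, TX_CONFIG, RX_CONFIG, DATA_SELECT = 0x00, 0x01, 0x02, 0x03
PLL_CONFIG, CDR_CONFIG, STATUS = 0x04, 0x05, 0x06
PLL_LOCK_BIT, CDR_LOCK_BIT, PRBS_ERR_BIT = 0, 1, 6


def poll_status_bit(read_reg, bit, max_polls):
    """Poll STATUS until `bit` is set; raise TimeoutError if it never is."""
    for _ in range(max_polls):
        if (read_reg(STATUS) >> bit) & 1:
            return
    raise TimeoutError(f"STATUS[{bit}] did not assert")


def bring_up(write_reg, read_reg, max_polls=1000):
    """Run the PRBS bring-up sequence; return True if PRBS_ERR reads 0."""
    write_reg(PHY_ENABLE, 0x01)   # PHY_EN = 1
    write_reg(PLL_CONFIG, 0x00)   # release PLL_RST
    poll_status_bit(read_reg, PLL_LOCK_BIT, max_polls)
    write_reg(TX_CONFIG, 0x05)    # TX_EN + TX_PRBS_EN
    write_reg(DATA_SELECT, 0x00)  # PRBS source
    write_reg(CDR_CONFIG, 0x00)   # release CDR_RST
    write_reg(RX_CONFIG, 0x05)    # RX_EN + RX_PRBS_CHK_EN
    poll_status_bit(read_reg, CDR_LOCK_BIT, max_polls)
    prbs_err = (read_reg(STATUS) >> PRBS_ERR_BIT) & 1
    return prbs_err == 0
```

Since PRBS_ERR is clear-on-read, every STATUS poll in this loop also clears the flag, so the final read only reports errors that occurred after the previous poll.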
I2C Transaction Format
Write: START | 0x42 + W | ACK | REG_ADDR | ACK | DATA | ACK | STOP
Read: START | 0x42 + W | ACK | REG_ADDR | ACK |
START | 0x42 + R | ACK | DATA | NACK | STOP
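On the wire, the address and the R/W flag share one byte. The Python sketch below builds the byte sequences for both transactions, assuming 0x42 is the 7-bit address (so the address byte is 0x84 for write and 0x85 for read). The helper names are hypothetical and START/STOP conditions are left to the bus controller.

```python
I2C_ADDR = 0x42  # assumed to be the 7-bit slave address


def address_byte(read):
    """7-bit address shifted left by one, with the R/W flag in bit 0."""
    return (I2C_ADDR << 1) | (1 if read else 0)


def write_frame(reg, data):
    """Bytes sent by the controller for a single register write."""
    return [address_byte(False), reg & 0xFF, data & 0xFF]


def read_frames(reg):
    """Two phases of a register read: set the register pointer, then a
    repeated START with the read address (the slave returns one byte)."""
    return [address_byte(False), reg & 0xFF], [address_byte(True)]
```

For example, writing 0x01 to PHY_ENABLE puts the bytes 0x84, 0x00, 0x01 on the bus.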
Part 5 - Debug Infrastructure
The chip includes a DBG_ANA output pin that can be mux-routed to one of three internal analog signals, controlled by DEBUG_ENABLE (0x07):
| Bit | Source |
|---|---|
| DBG_VCTRL | VCO control voltage |
| DBG_PD | Phase detector output |
| DBG_FIFO | FIFO status signal |
Only one source can be active at a time. This is essential for characterizing PLL lock behavior and CDR tracking on silicon - you can probe the VCO Vtune pin directly with an oscilloscope to verify the control loop is settling correctly.
Challenges When Implementing Your Own SerDes
1. Clock domain crossing is where bugs hide
The TX and RX FIFOs each cross between the 24 MHz control domain and the 240 MHz serial domain. A naive register-stage CDC works if the data changes slowly relative to the destination clock, but for FIFOs you need gray-coded pointers or a proper handshake. Getting this wrong produces intermittent data corruption that is almost impossible to reproduce in simulation at the gate level.
The safe path: use gray-coded read/write pointers, synchronize them with two flip-flop stages in the destination domain, and prove the pointer arithmetic is correct before anything else. FIFO full/empty logic built on incorrectly synchronized pointers produces the most confusing waveforms you will ever debug.
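The pointer arithmetic is worth proving in software first. The Python sketch below uses the common scheme of one extra wrap bit on the pointer (4 bits for an 8-deep FIFO) and compares gray-coded pointers directly. Whether the project's FIFOs use this exact full/empty formulation is an assumption.

```python
DEPTH = 8
PTR_BITS = 4  # log2(DEPTH) plus one wrap bit


def bin_to_gray(n):
    """Binary to gray code: adjacent values differ in exactly one bit."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Inverse of bin_to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def fifo_empty(wr_gray_synced, rd_gray):
    """Empty when the pointers are identical."""
    return wr_gray_synced == rd_gray


def fifo_full(wr_gray, rd_gray_synced):
    """Full when the write pointer is exactly DEPTH ahead, which in gray
    code means the top two bits differ and the rest match."""
    return wr_gray == (rd_gray_synced ^ (0b11 << (PTR_BITS - 2)))
```

The one-bit-per-increment property is what makes the two flip-flop synchronizer safe: a pointer sampled mid-transition resolves to either the old or the new value, never to an unrelated one.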
2. Bang-bang CDRs have a fixed phase error floor
A bang-bang (binary) phase detector makes a hard early/late decision on every bit. Unlike a linear PD, it has no proportional output - it is always commanding maximum correction or nothing. This produces a limit cycle: the phase permanently oscillates by approximately one step of the VCO tuning resolution. That residual jitter is irreducible and sets your minimum phase error floor.
The practical consequence: CDR_GAIN is a trade-off knob, not a free variable. Set it high and lock is fast but jitter is large. Set it low and jitter is small but lock takes longer and the tracking bandwidth may be insufficient to follow a spread-spectrum clock source. For the 240 Mbps target with a bang-bang PD, the 5-10 UI RMS phase error spec in the datasheet reflects this fundamental limit.
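The trade-off shows up even in a toy model. The Python sketch below simulates a first-order bang-bang loop in which each decision moves the phase by a fixed step. Real CDR loops are typically second-order with latency, and the step sizes here are made-up illustrative values, not the chip's.

```python
def simulate_bang_bang(initial_error_ui, step_ui, n_bits):
    """Phase error trace (in UI) of a first-order bang-bang loop."""
    error = initial_error_ui
    trace = []
    for _ in range(n_bits):
        decision = 1 if error > 0 else -1  # hard early/late decision
        error -= decision * step_ui        # fixed-size correction
        trace.append(error)
    return trace


def settle_and_ripple(initial_error_ui, step_ui, n_bits=2000):
    """Return (bits until |error| <= step, peak |error| in steady state)."""
    trace = simulate_bang_bang(initial_error_ui, step_ui, n_bits)
    settle = next(i for i, e in enumerate(trace) if abs(e) <= step_ui)
    ripple = max(abs(e) for e in trace[n_bits // 2:])
    return settle, ripple
```

Comparing `settle_and_ripple(0.4, 0.01)` with `settle_and_ripple(0.4, 0.001)` shows the larger step settling about ten times sooner while leaving about ten times the residual ripple.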
3. Manchester doubles your symbol rate budget
Manchester encoding guarantees DC balance and 100% transition density - both good properties. What it costs is a 2x symbol rate. To carry 120 Mbps net data you need a 240 Mbps serial channel. Every timing closure number in your PLL and serializer now applies to 4.17 ns bit periods (240 MHz), not 8.33 ns (120 MHz). On SKY130 that is a meaningful difference for ring-oscillator based designs.
If your target data rate is already limited by the PDK's maximum toggle frequency, Manchester is a bad choice. 8b/10b gives you DC balance and run-length control with only a 20% overhead.
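The numbers above follow from two one-line formulas, shown here in Python so the comparison can be rerun for other line rates. The function names are hypothetical.

```python
def bit_period_ns(rate_mbps):
    """Bit period in nanoseconds for a line rate in Mbps."""
    return 1e3 / rate_mbps


def net_rate_mbps(line_rate_mbps, payload_bits, coded_bits):
    """Payload throughput for a code mapping payload_bits to coded_bits."""
    return line_rate_mbps * payload_bits / coded_bits


manchester_net = net_rate_mbps(240, 1, 2)     # 120 Mbps payload
eight_ten_net = net_rate_mbps(240, 8, 10)     # 192 Mbps payload
period_240 = bit_period_ns(240)               # about 4.17 ns
period_120 = bit_period_ns(120)               # about 8.33 ns
```

At the same 240 Mbps line rate, 8b/10b would carry 192 Mbps of payload against Manchester's 120 Mbps, at the cost of a more complex encoder and no guaranteed transition in every bit.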
4. PLL lock detection needs hysteresis
The naive lock detector - "assert lock when the raw lock signal goes high" - produces a chatter condition where PLL_LOCK bounces at startup as the VCO control voltage settles. Any downstream logic that reacts to PLL_LOCK (like releasing the CDR reset) will misfire.
The fix is a counter-based lock detector with separate lock and unlock thresholds. The LOCK_COUNT_MAX = 2400 cycles (~100 us) in this design means the PLL has to stay locked continuously for 100 us before PLL_LOCK asserts. Dropout detection uses a shorter UNLOCK_COUNT = 240 (~10 us) to catch real loss-of-lock quickly without reacting to transients.
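A behavioral model makes the hysteresis concrete. The Python sketch below collapses the three lock conditions into a single `raw_lock` input and leaves out the ERROR state, so it is a simplification of the FSM in serdesphy_pll_ctrl.v rather than a port of it. The class name is hypothetical.

```python
LOCK_COUNT_MAX = 2400  # ~100 us at 24 MHz
UNLOCK_COUNT = 240     # ~10 us at 24 MHz


class LockDetector:
    """Counter-based lock detector with separate lock/unlock thresholds."""

    def __init__(self):
        self.locked = False
        self.count = 0

    def clock(self, raw_lock):
        """Advance one CLK_24M cycle and return the filtered lock output."""
        if not self.locked:
            # Count consecutive good cycles; any bad cycle restarts the window.
            self.count = self.count + 1 if raw_lock else 0
            if self.count >= LOCK_COUNT_MAX:
                self.locked, self.count = True, 0
        else:
            # Count consecutive bad cycles; any good cycle restarts the count.
            self.count = self.count + 1 if not raw_lock else 0
            if self.count >= UNLOCK_COUNT:
                self.locked, self.count = False, 0
        return self.locked
```

A raw lock signal that chatters at startup never accumulates 2400 consecutive good cycles, so the filtered output stays low until the loop has actually settled.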
5. Supply sequencing and analog isolation
On SKY130, analog circuits powered by AVDD (3.3V) that are connected to a digital domain running DVDD (1.8V) need explicit isolation during power-up. If DVDD comes up before AVDD, the digital outputs can stress the analog inputs beyond their rated voltage.
ISO_EN in PHY_ENABLE (0x00) defaults to 1 (isolated). The initialization sequence relies on this: even if PHY_EN goes high early, the analog bias stays disconnected until ISO_EN is explicitly cleared. Do not skip the power sequencing step.
6. Differential routing and termination
The TXP/TXN and RXP/RXN pairs are CML (Current Mode Logic) with 100 ohm differential impedance. The loopback test described in the datasheet uses a 100 ohm differential resistor bridging TXP/TXN back to RXP/RXN. If your PCB trace length or termination is wrong, the received signal amplitude can fall below the 10 mV minimum sensitivity of the limiting amplifier, and the CDR will never lock regardless of software configuration.
The differential input sensitivity spec (10 mV minimum) applies after the limiting amplifier. That amplifier has a finite bandwidth that rolls off at high frequency - a clean 10 mV at DC becomes a degraded signal at 240 MHz. The 400-800 mV output swing from the CML driver provides substantial margin when the loopback path is short and matched. On a real multi-chip link, add careful SI analysis.
Electrical Summary
| Parameter | Min | Typ | Max | Unit |
|---|---|---|---|---|
| Data rate | | 240 | | Mbps |
| Reference clock | 23.5 | 24.0 | 24.5 | MHz |
| PLL lock time | | 8 | 10 | us |
| CDR lock time | | 50 | 100 | us |
| TX output swing (100 ohm) | 400 | 600 | 800 | mV_pp |
| RX sensitivity | 10 | | | mV_pp |
| Active supply current | | 12 | 18 | mA |
| DVDD | 1.71 | 1.8 | 1.89 | V |
| AVDD | 3.0 | 3.3 | 3.6 | V |
Full register descriptions, AC timing specs, and the loopback test circuit are in the datasheet linked above.
Tapeout Result
The GDS render above shows the fabricated layout. The design was submitted to TinyTapeout as project 0000 in the HDL track, targeting the SKY130 PDK.