7

How do you implement a hardware random number generator in an HDL (verilog)?

What options need to be considered?


This question is following the self-answer format. Addition answers and updates are encouraged.

Community
  • 1
  • 1
Morgan
  • 19,934
  • 8
  • 58
  • 84

3 Answers3

13

As noted in Morgan's answer this will only produce a single random bit. The number of bits in the LFSR only set how many values you get before the sequence repeats. If you want an N bit random number you have to run the LFSR for N cycles. However, if you want a new number every clock cycle the other option is to unroll the loop and predict what the number will be in N cycles. Repeating Morgan's example below, but to get a 5 bit number each cycle:

module fibonacci_lfsr_5bit(
  input clk,
  input rst_n,

  output reg [4:0] data
);

reg [4:0] data_next;

always @* begin
  data_next[4] = data[4]^data[1];
  data_next[3] = data[3]^data[0];
  data_next[2] = data[2]^data_next[4];
  data_next[1] = data[1]^data_next[3];
  data_next[0] = data[0]^data_next[2];
end

always @(posedge clk or negedge rst_n)
  if(!rst_n)
    data <= 5'h1f;
  else
    data <= data_next;

endmodule

Edit: Added a new version below which doesn't require you to do the math. Just put it in a loop and let the synthesis tool figure out the logic:

module fibonacci_lfsr_nbit
   #(parameter BITS = 5)
   (
    input             clk,
    input             rst_n,

    output reg [4:0] data
    );

   reg [4:0] data_next;
   always_comb begin
      data_next = data;
      repeat(BITS) begin
         data_next = {(data_next[4]^data_next[1]), data_next[4:1]};
      end
   end

   always_ff @(posedge clk or negedge reset) begin
      if(!rst_n)
         data <= 5'h1f;
      else
         data <= data_next;
      end
   end

endmodule

I would like to make the LFSR length parameterizable as well, but that is much more difficult since the feedback taps don't follow a simple pattern.

nguthrie
  • 2,615
  • 6
  • 26
  • 43
  • how to extend this unroll for n bits? lets say 8 bit. – noobcoder Apr 16 '20 at 22:10
  • Changing BITS will change how many steps are taken each clock cycle. However, if you want an 8 bit LFSR then you will have to figure out how to construct that. THis app note has a table which give the taps needed for various lengths: https://www.xilinx.com/support/documentation/application_notes/xapp210.pdf – nguthrie Apr 17 '20 at 10:37
11

This is a TRNG (True random number generator) that works on an FPGA. It is basically an LFSR type structure without the flip flops, so it is a combinatorial loop that runs continuously. The signal oscillates chaotically, when you combine several of these modules and XOR bits you get a truly random bit, since the jitter from each combines. The maximum clock rate you can run this at depends on your FPGA, you should test the randomness with a testing suite like diehard, dieharder, STS or TestU01.

These are called Galois Ring Oscillators(GARO). There are other TRNGs which use less power and area, but they are tricker to operate and write, usually relying on tuning delays to make a flipflop go metastable.

module GARO (input stop,clk, reset, output random);

(* OPTIMIZE="OFF" *)                    //stop *xilinx* tools optimizing this away
wire [31:1] stage /* synthesis keep */; //stop *altera* tools optimizing this away
reg meta1, meta2;

assign random = meta2;

always@(posedge clk or negedge reset)
if(!reset)
  begin
    meta1 <= 1'b0;
    meta2 <= 1'b0;
  end
else if(clk)
  begin
    meta1 <= stage[1];
    meta2 <= meta1;
  end

assign stage[1] = ~&{stage[2] ^ stage[1],stop};
assign stage[2] = !stage[3];
assign stage[3] = !stage[4] ^ stage[1];
assign stage[4] = !stage[5] ^ stage[1];
assign stage[5] = !stage[6] ^ stage[1];
assign stage[6] = !stage[7] ^ stage[1];
assign stage[7] = !stage[8];
assign stage[8] = !stage[9] ^ stage[1];
assign stage[9] = !stage[10] ^ stage[1];
assign stage[10] = !stage[11];
assign stage[11] = !stage[12];
assign stage[12] = !stage[13] ^ stage[1];
assign stage[13] = !stage[14];
assign stage[14] = !stage[15] ^ stage[1];
assign stage[15] = !stage[16] ^ stage[1];
assign stage[16] = !stage[17] ^ stage[1];
assign stage[17] = !stage[18];
assign stage[18] = !stage[19];
assign stage[19] = !stage[20] ^ stage[1];
assign stage[20] = !stage[21] ^ stage[1];
assign stage[21] = !stage[22];
assign stage[22] = !stage[23];
assign stage[23] = !stage[24];
assign stage[24] = !stage[25];
assign stage[25] = !stage[26];
assign stage[26] = !stage[27] ^ stage[1];
assign stage[27] = !stage[28];
assign stage[28] = !stage[29];
assign stage[29] = !stage[30];
assign stage[30] = !stage[31];
assign stage[31] = !stage[1];

endmodule
StanOverflow
  • 557
  • 1
  • 5
  • 21
  • 1
    A brilliant example of async TRNG for FPGA, would like to try it in real HW. However, there is a typo in the code: 'stage' should be defined as **'wire [31:1] stage;'**. For altera devices, this should look as **'wire [31:1] stage /* synthesis keep */;'** – lvd Jan 19 '15 at 13:08
  • How did you choose which bits to tap? – danyamachine Jul 02 '18 at 17:21
  • https://pdfs.semanticscholar.org/e5d5/0170e53b2a47c6c9406d24b8bbcb9924e70f.pdf I copied the polynomial on page 7, although I now see I missed the x^23 term – StanOverflow Jul 10 '18 at 15:23
  • Is this specific implementation tested in HW? If so: which one? Because I tried to implement it on a MACHXO3 and encountered the issue that afters some time the oscillation of stage[1] vanishes. If one uses stage[2] as output the ocillation persists but the synchronizer always outputs zero. Mostlikely because of multiple transistion happens during one clk cycle (3.02 MHz clock). So I am looking for hints how to solve it or alternative implementations. – Christian B. May 17 '20 at 16:41
  • 1
    @ChristianB. I tested it on a Virtex 5, I didn't carry out long duration tests maybe upto 30 minutes, maybe it would be a good idea to instantiate two, and swap between them, every X cycles, and hold the unused one in reset to allow it to cool down. – StanOverflow May 18 '20 at 18:11
  • The reason was much more embarrassing: the MACHXO3 comes with a GSR which is actually active low and the reset in my logic was active high. So the synchronizer was stucked for both levels of reset. Currently I am trying to check if the implementation of the MACHXO3 passes the diehard test. LT;DR: it works - I was just to dumb. – Christian B. May 19 '20 at 08:02
7

An LFSR is often the first port of call. Implementation is relatively simple, a shift register with a number of terms XORd together to create the feedback term.

When considering the implementation of the LFSR, the bit width of the random number and the repeatability of the number need to be considered. With N bits a Maximal LFSR will have (2**N) - 1 states. All zero state can not be used with out additional hardware.

An example 4 bit LFSR with taps a bit 0 and bit 4:

module fibonacci_lfsr(
  input  clk,
  input  rst_n,

  output [4:0] data
);

wire feedback = data[4] ^ data[1] ;

always @(posedge clk or negedge rst_n)
  if (~rst_n) 
    data <= 4'hf;
  else
    data <= {data[3:0], feedback} ;

endmodule

Choosing tap points and finding out the sequence length (number of numbers before it repeats) can be found from this table.

For example a sequence of 17,820,000, 30 bits wide could use taps of :

0x20000029 => bits "100000000000000000000000101001"   
0x2000005E => bits "100000000000000000000001011110"
0x20000089 => bits "100000000000000000000010001001"

The first would have a feedback term of:

feedback = data[29] ^ data[5] ^ data[3] ^ data[0];

If you are unsure of the order of the taps, remember that the MSB will always be a feedback point. The Last (tap) feedback point defines the effective length of the LFSR, after that it would just be a shift register and have no bearing on the feedback sequence.

If you needed a sequence of 69,273,666 you would have to implement a 31 bit LFSR and choose 30 bits for your random number.

LFSRs are a great way to create a 1-bit random number stream but if you are taking multiple consecutive bits that there is a correlation between values, it is the same number shifted plus dither bit. If the number is being used as a dither stream you may want to introduce a mapping layer, for example swap every other bit. Alternatively use an LFSR of different length or tap points for each bit.

Further Reading

Efficient Shift Registers, LFSR Counters, and Long Pseudo-Random Sequence Generators,
A Xilinx app note by Peter Alfke
.

Linear Feedback Shift Registers in Virtex Devices,
A Xilinx app note by Maria George and Peter Alfke
.

Morgan
  • 19,934
  • 8
  • 58
  • 84
  • Hi, Thanks for this detailed explanation. However I think in this line ``data <= {data[2:0], feedback} ;`` you meant to write ``data <= {data[3:0], feedback} ;``. Also since I'm posting, can you please state if it makes a difference if I shift left or right. I think here you are shifting left, any reason for that? Thank you – ipunished Jan 25 '13 at 20:24
  • 1
    Well spotted, your correct about the feedback, I was 1 bit short. The left/right shift dictates the order of the tap points. If shifting right, then data[0] MUST be a feed back point or you are shortening the repeat sequence. I always shift left, adding new data in to the LSB, the position the indicates how old the data is. For double flop meta-stability I do `reg [1:0] meta; always @(posedge clk ... meta[1:0] <= {meta[0], inp};` – Morgan Jan 25 '13 at 21:10
  • In using a 16-bit software LFSR to generate 8-bit values, I've found it useful to use a "rotate right" for one half and "rotate left" for the other; each output value is the xor of the two halves. That approach will generate 65,535 out of the 65,536 possible pairs of output bytes (by contrast, looking just at 8 bits of the LFSR would only generate 512 of the possible pairs). – supercat Dec 03 '13 at 23:32