u/ab____________a

Optimised Sbox Implementation for AES 128

Hi everyone. I have implemented a basic AES-128 in Verilog using Vivado.

My S-box is a 256-entry lookup table.

Encryption completes in 10 rounds, with one clock cycle per round.

In the SubBytes step there are 16 bytes, so it uses 16 S-box instances, and the key expansion step uses 4 more, for a total of 20 S-boxes. The S-boxes are therefore consuming a large share of the hardware resources.

I have heard that the S-box can be implemented using Galois field arithmetic in GF(2^8) and GF((2^4)^2). Can anyone please suggest a good resource for understanding this S-box implementation?

Does this really reduce the hardware?
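For reference, here is a minimal sketch of what computing the S-box "mathematically" means, assuming the standard AES definition: the multiplicative inverse in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1, followed by the affine transform. The module and function names are hypothetical. This is the plain GF(2^8) form and not the GF((2^4)^2) composite-field version, which rewrites the same inversion over a smaller subfield. The sketch is not area-optimised and makes no claim about beating the lookup table on any particular target.

```systemverilog
// Sketch only: AES S-box computed from its definition instead of a 256-entry table.
// Module and function names are hypothetical.
module sbox_gf256_sketch (
    input  logic [7:0] in_byte,
    output logic [7:0] out_byte
);
    // Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1.
    function automatic logic [7:0] gf_mul(input logic [7:0] a, input logic [7:0] b);
        logic [7:0] p;
        logic [7:0] aa;
        p  = 8'h00;
        aa = a;
        for (int i = 0; i < 8; i++) begin
            if (b[i]) p = p ^ aa;
            // Multiply aa by x and reduce when the top bit falls out.
            aa = aa[7] ? ({aa[6:0], 1'b0} ^ 8'h1B) : {aa[6:0], 1'b0};
        end
        return p;
    endfunction

    // Multiplicative inverse as a^254 by square-and-multiply; 0 maps to 0.
    function automatic logic [7:0] gf_inv(input logic [7:0] a);
        logic [7:0] r;
        r = 8'h01;
        for (int i = 7; i >= 0; i--) begin
            r = gf_mul(r, r);
            if (i != 0) r = gf_mul(r, a);   // exponent 254 = 8'b1111_1110
        end
        return r;
    endfunction

    logic [7:0] b;

    always_comb begin
        b = gf_inv(in_byte);
        // Affine transform: XOR of b with its left rotations by 1 to 4, then 0x63.
        out_byte = b ^ {b[6:0], b[7]} ^ {b[5:0], b[7:6]}
                     ^ {b[4:0], b[7:5]} ^ {b[3:0], b[7:4]} ^ 8'h63;
    end
endmodule
```

As a sanity check against the standard table, an input of 8'h00 should give 8'h63, 8'h01 should give 8'h7C, and 8'h53 should give 8'hED. A sketch like this is mainly useful as a reference model to compare a composite-field version against.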

Thank you

reddit.com
u/ab____________a — 10 hours ago
▲ 2 r/vlsi

Searching VLSI jobs for entry level

What is the best strategy for finding entry-level VLSI jobs in India?

I am only looking on LinkedIn, and I understand that is not enough. Suggestions, please.

Thank you

u/ab____________a — 1 day ago

Register file implementation in industry standard RISC-V designs

Hi everyone, I have a doubt. Can anyone please clear it up?

How is the register file implemented in industry-standard 5-stage pipelined RISC-V processors? Is it implemented with flip-flops (FFs) or SRAM? If it is implemented with SRAM, doesn't register reading take an extra clock cycle, since SRAM can only do synchronous reads?

Similarly, how are the instruction memory and data memory implemented?
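To make the FF option in the register file question concrete, here is a minimal sketch, assuming an RV32I-style file with 32 registers, x0 hardwired to zero, two read ports and one write port. The module and port names are hypothetical, and this illustrates the flip-flop option only; it is not a claim about how any particular commercial core is built.

```systemverilog
// Sketch only: flip-flop based register file with combinational (same-cycle) reads.
// Module and port names are hypothetical.
module regfile_ff_sketch #(
    parameter int XLEN = 32
)(
    input  logic            clk,
    input  logic            we,
    input  logic [4:0]      waddr,
    input  logic [XLEN-1:0] wdata,
    input  logic [4:0]      raddr1,
    input  logic [4:0]      raddr2,
    output logic [XLEN-1:0] rdata1,
    output logic [XLEN-1:0] rdata2
);
    // 32 entries; entry 0 is never written and never read (x0 is forced to zero).
    logic [XLEN-1:0] regs [32];

    // Synchronous write; writes to x0 are ignored.
    always_ff @(posedge clk) begin
        if (we && (waddr != 5'd0))
            regs[waddr] <= wdata;
    end

    // Combinational reads: data is available in the same cycle as the address.
    assign rdata1 = (raddr1 == 5'd0) ? '0 : regs[raddr1];
    assign rdata2 = (raddr2 == 5'd0) ? '0 : regs[raddr2];
endmodule
```

Because the reads are plain multiplexers over flip-flops, there is no extra read cycle here. Note that a value written in a given cycle only becomes readable in the next cycle, so a pipeline using this sketch would still need forwarding or a stall for a same-cycle write and read of the same register.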

u/ab____________a — 4 days ago
▲ 0 r/chipdesign+1 crossposts

Can anyone tell me whether TCS NQT is only for software roles or for VLSI hardware roles too?

Can anyone please tell me what the TCS NQT test is? Is it only for software roles, or for VLSI domain roles too?

If it covers VLSI roles, how do I apply for it?

Thank you

u/ab____________a — 4 days ago
▲ 2 r/FPGA

Barrel shifter

I wrote a barrel shifter with guard, round, and sticky bits, targeting the Xilinx ZedBoard (Zynq-7000) FPGA. The clock is 100 MHz (10 ns period). My design is failing timing. As my circuit is purely combinational and has no clock, I used a set_max_delay constraint of 10 ns from all inputs to all outputs. Can anyone please tell me whether it is possible to design a 24-bit barrel shifter with a delay of 10 ns, or is pipelining the only solution?

Thank you

I am attaching my code below.

module barrel_shifter_grs #(
    parameter WIDTH = 24
)(
    input  logic [WIDTH-1:0] data_in,
    input  logic [7:0]       shift,
    output logic [WIDTH-1:0] data_out,
    output logic             G,   // guard bit
    output logic             R,   // round bit
    output logic             S    // sticky bit
);

    always_comb begin
        // Defaults
        data_out = '0;
        G = 1'b0;
        R = 1'b0;
        S = 1'b0;

        if (shift >= WIDTH) begin
            // Everything is shifted out; only the sticky bit survives
            data_out = '0;
            G = 1'b0;
            R = 1'b0;
            S = |data_in;
        end
        else begin
            data_out = data_in >> shift[4:0];
            // First and second bits shifted out below the LSB
            G = (shift >= 1) ? data_in[shift - 1] : 1'b0;
            R = (shift >= 2) ? data_in[shift - 2] : 1'b0;
            // OR of all remaining shifted-out bits, data_in[shift-3:0]
            S = (shift >= 3)
                ? |(data_in & ({WIDTH{1'b1}} >> (WIDTH - (shift - 2))))
                : 1'b0;
        end
    end

endmodule
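For comparison, one common way to structure such a shifter is as a chain of log2 stages, each shifting by a fixed power of two, with the sticky bit accumulated stage by stage. Below is a minimal sketch, assuming the same ports as the module above and WIDTH no larger than 256, so that the stage count fits in the 8-bit shift input. The module name and internal signal names are hypothetical. It is written to give the same outputs as the code above, including the shift >= WIDTH case. Whether it changes the timing result is not something the sketch can promise, since the synthesis tool may already infer a similar structure.

```systemverilog
// Sketch only: log-stage right shifter with guard/round/sticky outputs.
// Module and internal signal names are hypothetical.
module barrel_shifter_grs_staged #(
    parameter int WIDTH = 24
)(
    input  logic [WIDTH-1:0] data_in,
    input  logic [7:0]       shift,
    output logic [WIDTH-1:0] data_out,
    output logic             G,
    output logic             R,
    output logic             S
);
    localparam int EXT_W  = WIDTH + 2;       // data plus guard and round positions
    localparam int STAGES = $clog2(WIDTH);   // 5 stages for WIDTH = 24

    logic [EXT_W-1:0] stage_v [STAGES+1];    // value after each stage
    logic             stage_s [STAGES+1];    // sticky accumulated so far

    assign stage_v[0] = {data_in, 2'b00};
    assign stage_s[0] = 1'b0;

    for (genvar k = 0; k < STAGES; k++) begin : g_stage
        localparam int AMT = 1 << k;         // this stage shifts by 1, 2, 4, ...
        assign stage_v[k+1] = shift[k] ? (stage_v[k] >> AMT) : stage_v[k];
        // Bits dropped off the bottom in this stage feed the sticky bit.
        assign stage_s[k+1] = stage_s[k] | (shift[k] & (|stage_v[k][AMT-1:0]));
    end

    always_comb begin
        if (shift >= WIDTH) begin
            // Same behaviour as the original module for large shifts.
            data_out = '0;
            G = 1'b0;
            R = 1'b0;
            S = |data_in;
        end
        else begin
            data_out = stage_v[STAGES][EXT_W-1:2];
            G = stage_v[STAGES][1];
            R = stage_v[STAGES][0];
            S = stage_s[STAGES];
        end
    end
endmodule
```

Each stage here is a 2:1 multiplexer per bit plus a small OR for the sticky path, which keeps the fixed shift amounts as wiring rather than variable indexing.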

u/ab____________a — 5 days ago
▲ 5 r/cryptography+1 crossposts

Help needed for optimising S box for AES 128 encryption

I have implemented a very basic AES-128 where each round takes 1 clock cycle, i.e. SubBytes, ShiftRows, MixColumns, and AddRoundKey all happen in a single cycle, for a total of 10 clock cycles for 10 rounds. My S-box is precomputed, i.e. a LUT holding all 256 entries. In each round, SubBytes uses 16 S-box instances, one for each of the 16 bytes, and a further 4 instances are used for key expansion.

I want to optimise my AES and came to know that it can be optimised by using GF((2^4)^2) instead of GF(2^8).

Can anyone please give a good source for understanding this approach? Any other suggestions for a better S-box implementation, or for the entire AES, are also welcome.

Thank you

u/ab____________a — 5 days ago

I have one year of experience as an RTL design engineer. I need a referral for this role. Can anyone please refer me?

Job id 76998

Thank you

u/ab____________a — 8 days ago

I wrote a very basic AES-128 implementation. I am using a precomputed S-box, with all the values hardcoded in a case statement. For AES-128, the number of rounds is 10. My design takes 1 clock cycle per round, 11 clock cycles in total to produce the ciphertext.

It uses 20 S-box instances: 16 for the SubBytes operation (one per byte of the 16-byte state) and 4 for key expansion.

Any suggestions to do it better?

Which approach is better: precomputing the S-box values or implementing it mathematically?

Is it better to reduce the number of S-box instances?

If I am doing synthesis using Yosys, which macro should I map the S-box to?

Any more suggestions to make the design better?

u/ab____________a — 12 days ago

Why didn't people question the TRS govt enough for not laying a single new metro line in their regime?

I don't understand their contributions to Hyderabad, at least for public transportation.

u/ab____________a — 17 days ago