u/Evening_Refuse_1893 — reddlx

Hello everyone!

I have some MAGs with gene predictions from Prokka, which I ran through KofamScan to assign KOs. Now I want to group those KO assignments into pathways, and KEGGDecoder looks like the right tool.

I have a question. The output from KofamScan looks like this:

    #  gene name     KO      thrshld  score  E-value  KO definition
    #  ------------  ------  -------  -----  -------  ---------------------
    *  PROKKA_00001  K00304  111.17   115.8  5.1e-34  sarcosine oxidase, subunit delta [EC:1.5.3.24 1.5.3.1]
       PROKKA_00001  K22085  112.40    85.3  9.2e-25  methylglutamate dehydrogenase subunit B [EC:1.5.99.5]

So, if I am not mistaken, I need to filter the hits by one of these columns (thrshld, score, or E-value) before processing further.

Which one should I use: thrshld, score, or E-value?
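
To make the question concrete, here is a minimal Python sketch of one possible filter: keep a hit only when its score is at or above that KO's thrshld value, which is what the leading `*` in the output above appears to mark. It assumes the whitespace-separated detail layout shown above and that a KO without a threshold shows `-` in that column. The names `Hit`, `parse_detail`, and `above_threshold` are made up for this example and are not part of KofamScan or KEGGDecoder.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Hit(NamedTuple):
    gene: str
    ko: str
    threshold: Optional[float]  # None when the column holds "-" (assumed)
    score: float
    evalue: float
    definition: str
    starred: bool  # True when the row began with "*"


def parse_detail(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse rows of the detail-style table shown above."""
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the two header lines
        starred = line.startswith("*")
        if starred:
            line = line[1:].strip()
        # The first five columns are single tokens; the rest is the definition.
        parts = line.split(None, 5)
        if len(parts) < 5:
            continue  # not a data row in the assumed layout
        gene, ko, thr, score, evalue = parts[:5]
        definition = parts[5] if len(parts) > 5 else ""
        threshold = None if thr == "-" else float(thr)
        yield Hit(gene, ko, threshold, float(score), float(evalue),
                  definition, starred)


def above_threshold(hits: Iterable[Hit]) -> List[Hit]:
    """Keep hits whose score reaches the per-KO threshold."""
    return [h for h in hits
            if h.threshold is not None and h.score >= h.threshold]


sample = """\
# gene name KO thrshld score E-value KO definition
#-------------------- ------ ------- ------ --------- ---------------------
* PROKKA_00001 K00304 111.17 115.8 5.1e-34 sarcosine oxidase, subunit delta [EC:1.5.3.24 1.5.3.1]
PROKKA_00001 K22085 112.40 85.3 9.2e-25 methylglutamate dehydrogenase subunit B [EC:1.5.99.5]
"""

kept = above_threshold(parse_detail(sample.splitlines()))
for h in kept:
    print(h.gene, h.ko, h.score)  # prints: PROKKA_00001 K00304 115.8
```

On the two rows above, this keeps only K00304 (115.8 >= 111.17) and drops K22085 (85.3 < 112.40). Whether score versus thrshld is the right criterion, rather than an E-value cutoff, is exactly what I am unsure about.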

Thanks! Maybe someone can suggest different tools or approaches. I would appreciate any help.

Best