The genome sequence of the soft-rot fungus Penicillium purpurogenum reveals a high gene dosage for lignocellulolytic enzymes


The high lignocellulolytic activity displayed by the soft-rot fungus Penicillium purpurogenum has made it a target for the study of novel lignocellulolytic enzymes. We have obtained a reference genome of 36.2 Mb of non-redundant sequence containing 11,057 protein-coding genes. The 49 largest scaffolds cover 90% of the assembly, and Core Eukaryotic Genes Mapping Approach (CEGMA) analysis indicates that our assembly captures almost all protein-coding genes. RNA-seq was performed, and 93.1% of the reads aligned to the assembled genome. These data, together with the independent sequencing of a set of genes encoding lignocellulose-degrading enzymes, validate the quality of the genome sequence. P. purpurogenum has more proteins with CAZy motifs, transcription factors and transporters than other sequenced Penicillia. These results demonstrate the great potential of this fungus for lignocellulolytic activity and the possible use of its enzymes in related industrial applications.
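
The statement that the 49 largest scaffolds cover 90% of the assembly corresponds to what is commonly called an L90 value. As an illustration only, the minimal Python sketch below shows how such a count can be derived from a list of scaffold lengths. The function name `lx` and the toy lengths are hypothetical and are not taken from the P. purpurogenum assembly or from the pipeline used in this work.

```python
def lx(lengths, fraction=0.9):
    """Return the smallest number of scaffolds, taken from largest to
    smallest, whose combined length reaches `fraction` of the assembly.

    `lengths` is an iterable of scaffold lengths (in bp); hypothetical helper.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the interval (0, 1]")

    target = fraction * sum(lengths)
    running = 0
    # Walk through scaffolds in descending order of length and stop as soon
    # as the cumulative length reaches the requested share of the assembly.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return count
    return len(lengths)


# Toy scaffold lengths (arbitrary units), chosen only for illustration.
toy_scaffolds = [40, 30, 15, 10, 3, 2]
l90 = lx(toy_scaffolds, 0.9)
l50 = lx(toy_scaffolds, 0.5)
```

With real data, the lengths would be read from the assembly FASTA file; for the assembly described here, such a calculation would be expected to return 49 at a fraction of 0.9.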