SecSGFaC - Profiles format



A profile is a set of files named "fami.p3" (where "i" is the family number, starting at 1), plus an additional file named "nameProfile.txt". All of these files are compressed into a single zip file called "nameProfile.zip".
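
As a rough illustration of this layout, the following Python sketch lists the members of a profile archive. The function name list_profile_members is hypothetical, and the sketch assumes that the *.p3 files are named exactly "fam" followed by the family number, and that the only .txt file in the archive is the index file.

```python
import re
import zipfile

# Assumed naming pattern: "fam" + family number + ".p3", e.g. fam1.p3
P3_NAME = re.compile(r"^fam(\d+)\.p3$")


def list_profile_members(zip_path):
    """Return (sorted family numbers, index file name or None) for a profile archive."""
    families = []
    index_name = None
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            base = name.rsplit("/", 1)[-1]  # ignore any directory prefix
            match = P3_NAME.match(base)
            if match:
                families.append(int(match.group(1)))
            elif base.endswith(".txt"):
                # Assumption: the only .txt member is the index file
                index_name = name
    return sorted(families), index_name
```

Since family numbers start at 1, a complete archive would be expected to yield the numbers 1, 2, ..., n with no gaps, which is easy to check on the returned list.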


Each *.p3 file contains a group of sequences (in FASTA format) that have been translated to secondary structure and had their signal peptide removed. The sequences in one file form a subfamily whose members align very well with one another. Together, the sequences in all of these files form the complete profile.

The alphabet used to encode these sequences is: {H,E,L,X,E,Z,*,A,B,C}
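
A minimal sketch of how a *.p3 file could be read and checked against this alphabet is shown below. The helper names read_fasta and invalid_symbols are hypothetical, and the sketch assumes that symbols are upper case and that header lines start with ">" as in standard FASTA.

```python
# Symbols listed in the format description; the listing repeats "E",
# which makes no difference once the symbols are stored in a set.
ALPHABET = set("HELXZ*ABC")


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def invalid_symbols(sequence):
    """Return the set of characters in `sequence` that are outside the alphabet."""
    return set(sequence) - ALPHABET
```

An empty set returned by invalid_symbols means that the sequence uses only symbols from the alphabet.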


The file "nameProfile.txt" contains one line per *.p3 file (subfamily) in the profile. Each line gives the file name (without extension), the number of sequences the file contains, and the cut-off value for that subfamily.
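
The three fields per line could be read with a sketch like the one below. It assumes that the fields are separated by whitespace and that the cut-off is a decimal number; neither is stated in this description, so adjust the splitting to the actual file. The names Subfamily and parse_index are hypothetical.

```python
from typing import NamedTuple


class Subfamily(NamedTuple):
    name: str         # file name without the .p3 extension
    n_sequences: int  # number of sequences in that file
    cutoff: float     # cut-off value for the subfamily (assumed numeric)


def parse_index(text):
    """Parse the index file; fields are assumed to be whitespace-separated."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
        name, count, cutoff = fields
        entries.append(Subfamily(name, int(count), float(cutoff)))
    return entries
```

The number of entries returned should equal the number of *.p3 files in the archive, which gives a simple consistency check for a profile.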

Example: