In the first CSV, one line is written for each PDB ID code. The line begins with the protein details: the PDB ID, the title of the PDB file, the protein description, the number of subunits, the subunit IDs (referred to as chains), and the number of residues in each subunit. It then indicates whether the entry is a complex. Next comes information about discarded ligands (elements in the blacklist that are bonded to the protein) and branched molecules, including their names, types, functions, and whether a covalent bond is present. Ligand information follows, with the same fields: name, type, functions, and whether a covalent bond is present. The final segment covers mutation information: the number of mutations, their locations, the identity percentage, and gaps.
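As an illustration of the one-line-per-PDB-ID layout, the following is a minimal sketch rather than the actual implementation. The column names, the `;` separator for multi-valued fields, and the helpers `summary_row` and `write_summary` are all assumptions, and the example entry contains placeholder values only.

```python
import csv
import io

# Hypothetical column names, in the order the text describes the fields;
# the real header may differ.
SUMMARY_COLUMNS = [
    "pdb_id", "title", "description", "n_subunits", "chains",
    "residues_per_chain", "is_complex", "discarded_ligands",
    "branched_molecules", "ligands", "n_mutations",
    "mutation_locations", "identity_percent", "gaps",
]


def summary_row(entry):
    """Flatten one entry dict into a single CSV row (one row per PDB ID)."""
    row = {}
    for col in SUMMARY_COLUMNS:
        value = entry.get(col, "")
        # Assumption: multi-valued fields (chains, residue counts, ...)
        # are joined with ';' so they fit in one cell.
        if isinstance(value, (list, tuple)):
            value = ";".join(str(v) for v in value)
        row[col] = value
    return row


def write_summary(entries):
    """Return the CSV text with a header and one line per entry."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SUMMARY_COLUMNS)
    writer.writeheader()
    for entry in entries:
        writer.writerow(summary_row(entry))
    return buffer.getvalue()


# Placeholder entry, not real data.
example = {
    "pdb_id": "XXXX",
    "title": "placeholder title",
    "n_subunits": 2,
    "chains": ["A", "B"],
    "residues_per_chain": [120, 118],
    "is_complex": True,
}
summary_text = write_summary([example])
```

Fields missing from an entry are written as empty cells, so every line keeps the same number of columns regardless of whether ligands or mutations are present.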
In the second CSV, one line is written for each entity bonded to a protein. Each line contains the ID of the protein, the bonded molecule, its name, type, and function, and whether it is covalently bonded and, if so, to which residue. If the entity is a glycosylation, that is recorded as well.
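Because this file holds one line per bonded entity, a protein with several bound molecules appears on several lines. The sketch below shows one way such a file could be read back and grouped by protein. The header names, the `yes`/`no` encoding, the helper functions, and the sample rows are hypothetical placeholders, not the actual output.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header and placeholder rows; the real file may differ.
ENTITY_CSV = """pdb_id,molecule_id,name,type,function,covalent,residue,glycosylation
XXXX,LIG1,example ligand,non-polymer,inhibitor,no,,no
XXXX,SUG1,example sugar,branched,,yes,ASN 42,yes
YYYY,LIG2,example cofactor,non-polymer,cofactor,no,,no
"""


def group_by_protein(csv_text):
    """Map each protein ID to the list of entity rows bonded to it."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["pdb_id"]].append(row)
    return dict(grouped)


def covalent_entities(rows):
    """Return (molecule ID, residue) pairs for covalently bonded entities."""
    return [
        (row["molecule_id"], row["residue"])
        for row in rows
        if row["covalent"] == "yes"
    ]


entities = group_by_protein(ENTITY_CSV)
```

Grouping on the protein ID is also what would let these lines be matched against the per-protein lines of the first CSV, assuming both files use the same identifier.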