SESCA: Structure-Based Empirical Spectrum Calculation Algorithm

Collaborations: Nykola Jones and Søren V. Hoffmann, ISA, Department of Physics & Astronomy, Aarhus University, DK

Funding: Alexander von Humboldt Foundation

Electronic circular dichroism (CD) spectroscopy is highly sensitive to changes in the backbone structure of proteins and does not require special labelling, crystallization, or high protein concentrations. It is therefore one of the first methods applied to characterize proteins, and it can be combined with other methods, such as stopped-flow techniques, to study the kinetics and reaction mechanisms of proteins. SESCA is a computational method for the rapid and accurate prediction of the CD spectra of protein models from their three-dimensional structure. SESCA predictions are based on two inputs: the secondary structure composition of the proposed protein model, and a set of pre-calculated basis spectra (a basis set). The basis spectra encode the average CD contribution of secondary structure elements and are derived from the known structures and CD spectra of an experimental reference protein set.
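The prediction step described above can be read as a weighted sum of basis spectra, with the secondary structure fractions as weights. The following is a minimal sketch under that reading. The three class names, the four wavelengths, the intensity values, and the function name `predict_cd_spectrum` are all invented for illustration and are not taken from the SESCA package.

```python
# Hypothetical basis set: one CD intensity per wavelength for each class.
# The numbers are invented for illustration, not real SESCA basis spectra.
WAVELENGTHS_NM = [190, 200, 210, 222]
BASIS_SPECTRA = {
    "helix": [60.0, 10.0, -30.0, -32.0],
    "sheet": [20.0, 5.0, -12.0, -8.0],
    "coil": [-25.0, -15.0, -4.0, 1.0],
}


def predict_cd_spectrum(fractions, basis=BASIS_SPECTRA):
    """Return the weighted sum of basis spectra for the given class fractions."""
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("secondary structure fractions must sum to 1")
    n_points = len(next(iter(basis.values())))
    spectrum = [0.0] * n_points
    for ss_class, weight in fractions.items():
        for i, intensity in enumerate(basis[ss_class]):
            spectrum[i] += weight * intensity
    return spectrum


# Example: a model that is 50% helix, 25% sheet, and 25% coil.
composition = {"helix": 0.5, "sheet": 0.25, "coil": 0.25}
predicted = predict_cd_spectrum(composition)
```

Here `predicted` holds one intensity per entry of `WAVELENGTHS_NM`. A real basis set would cover many more wavelengths and may use a different set of structure classes.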

SESCA calculations allow a direct comparison between the measured CD spectrum of a target protein and the predicted CD spectra of model structures or structural ensembles, which can be used to assess model quality. Although the CD spectrum of a typical globular protein can often be determined accurately from the secondary structure composition of its crystal structure, we applied several modifications to the original scheme to improve the prediction accuracy for short peptides and intrinsically disordered proteins. These modifications include scaling the spectrum intensity to account for normalization errors, basis spectra that account for the contribution of amino acid side chains, and the use of structural ensembles to account for protein flexibility and conformational heterogeneity.
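Two of the ideas above, ensemble averaging and intensity scaling, can be illustrated with a short sketch. It assumes that an ensemble spectrum is the plain average over conformations, that a single least-squares factor is applied to the measured spectrum, and that agreement is scored by root-mean-square deviation (RMSD). The function names and numbers are hypothetical, and the SESCA package may handle these steps differently.

```python
import math


def ensemble_average(spectra):
    """Average the predicted spectra over all conformations of an ensemble."""
    n_conformations = len(spectra)
    return [sum(values) / n_conformations for values in zip(*spectra)]


def best_scaling_factor(measured, predicted):
    """Least-squares factor a that minimises sum((a * measured - predicted)^2)."""
    denominator = sum(m * m for m in measured)
    return sum(m * p for m, p in zip(measured, predicted)) / denominator


def rmsd(measured, predicted, scale=1.0):
    """Root-mean-square deviation after scaling the measured spectrum."""
    squared = [(scale * m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Invented data: a measured spectrum and two predicted conformer spectra.
measured = [20.0, 4.0, -12.0, -16.0]
conformers = [
    [12.0, 1.0, -5.0, -9.0],
    [8.0, 3.0, -7.0, -7.0],
]

ensemble_spectrum = ensemble_average(conformers)
scale = best_scaling_factor(measured, ensemble_spectrum)
unscaled_deviation = rmsd(measured, ensemble_spectrum)
scaled_deviation = rmsd(measured, ensemble_spectrum, scale)
```

In this constructed example the measured spectrum is exactly twice the ensemble average, so scaling removes the deviation entirely. With real data, the remaining deviation after scaling reflects differences in spectral shape rather than in normalization.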


SESCA update on model validation

In our second publication (Nagy & Grubmüller, 2020a), we focus on the accuracy and precision of circular dichroism-based model validation methods with respect to the experimental noise of the measured CD spectrum. This study allows us to determine typical deviations between the secondary structure signal and the measured CD spectrum. The updated version of SESCA uses these typical CD deviations to estimate the expected secondary structure error of proposed protein models with higher precision.

Bayesian secondary structure estimation

The precision of SESCA is enhanced further by a Bayesian approach that determines the likelihood of possible secondary structures of a target protein based on its measured CD spectrum. This likelihood is determined from the joint probability distribution of the two major CD deviations, namely scaling errors and non-secondary-structure contributions. The Bayesian algorithm described in our third publication (Nagy & Grubmüller, 2020b) aids structural model validation by providing a more precise estimate of the protein secondary structure composition, and by allowing an easy comparison between the likelihoods of proposed structural models, based on the estimated CD deviations of their predicted CD spectra.
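The comparison of model likelihoods can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the two deviations are independent and Gaussian, with scaling factors centred on 1 and non-secondary-structure deviations centred on 0. The published method uses a joint probability distribution that this sketch does not reproduce, and the widths, model names, and function names here are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Probability density of a normal distribution."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def model_likelihood(scale, non_ss_deviation, scale_sigma=0.2, non_ss_sigma=2.0):
    """Toy likelihood: product of two independent Gaussians (an assumption)."""
    return (gaussian_pdf(scale, 1.0, scale_sigma)
            * gaussian_pdf(non_ss_deviation, 0.0, non_ss_sigma))


def posterior(models, prior=None):
    """Normalise prior times likelihood over a finite set of candidate models."""
    if prior is None:
        # Uniform prior over the candidates.
        prior = {name: 1.0 / len(models) for name in models}
    weights = {
        name: prior[name] * model_likelihood(scale, deviation)
        for name, (scale, deviation) in models.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Invented estimates per model: (scaling factor, non-SS deviation).
candidates = {"model_A": (1.05, 1.0), "model_B": (1.40, 3.5)}
probabilities = posterior(candidates)
```

With these invented inputs, `model_A` receives the larger posterior probability because both of its deviations lie closer to their assumed centres.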

Download

SESCA Software Package Zip File

Publications

Nagy, G.; Hoffmann, S. V.; Jones, N. C.; Grubmüller, H.: Reference Data Set for Circular Dichroism Spectroscopy Comprised of Validated Intrinsically Disordered Protein Models. Applied Spectroscopy 78 (9), pp. 897 - 911 (2024)
Nagy, G.; Grubmüller, H.: Implementation of a Bayesian secondary structure estimation method for the SESCA circular dichroism analysis package. Computer Physics Communications 266, 108022 (2021)
Nagy, G.; Grubmüller, H.: How accurate is circular dichroism-based model validation? European Biophysics Journal 49 (6), pp. 497 - 510 (2020)
Nagy, G.; Igaev, M.; Hoffmann, S. V.; Jones, N. C.; Grubmüller, H.: SESCA: Predicting circular dichroism spectra from protein molecular structures. Journal of Chemical Theory and Computation 15 (9), pp. 5087 - 5102 (2019)