A short note on data dynamics

I’ve just checked that at least 400 proteins in UniProt have had their sequence updated since the beginning of this year. The real number could be higher, but I work on a subset of ca. 1500 human proteins, so I probably see only a portion of the changes. So, instead of wrapping up the paper, I’m redoing the analysis to make sure the conclusions in the manuscript are still valid. Given that this has already happened for the second time this year, I sense two areas I might look into next year. One would be data dynamics. Instead of focusing on data stability, I hope to look at whether one can predict or assess if newly updated data can significantly change the conclusions of a particular experiment, without redoing the analysis from scratch (at a certain point computing becomes expensive). The second area would be the identification of manuscripts with obsolete conclusions, a good exercise for the text mining skills I’m just acquiring.
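
For anyone who wants to run a similar check, here is a minimal Python sketch of one way to spot updated sequences. It is not necessarily how the count above was obtained. It assumes you have kept two FASTA snapshots of the same protein set (one old, one new) and compares them record by record. The function names `parse_fasta` and `changed_sequences` and the toy identifiers are made up for illustration, and using the first token of the header as the identifier is a simplification.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Map each record's identifier (first token of its header) to its sequence."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Start a new record; keep only the first header token as the key.
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            # Sequence lines may be wrapped, so collect them piece by piece.
            records[current].append(line)
    return {acc: "".join(parts) for acc, parts in records.items()}


def changed_sequences(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Identifiers present in both snapshots whose sequences differ."""
    shared = old.keys() & new.keys()
    return sorted(acc for acc in shared if old[acc] != new[acc])


# Toy snapshots with hypothetical identifiers (not real accessions).
old_snapshot = ">A1 demo protein\nMKT\nAAA\n>B2\nMKV\n"
new_snapshot = ">A1 demo protein\nMKT\nAAG\n>B2\nMKV\n>C3\nMM\n"

changed = changed_sequences(parse_fasta(old_snapshot), parse_fasta(new_snapshot))
print(changed)
```

Records that exist in only one snapshot (like `C3` here) are deliberately ignored, since added or removed entries are a different kind of change from an updated sequence.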