What makes data real?

The beautiful images of galaxies, nebulas, and other astronomical objects produced by radio telescopes have been processed several times and colorized before we see them, but we still consider these images to be real and not synthetic.

So, what makes data real? Real data are data that have been generated by a process that is appropriately connected to real phenomena, where the terms “appropriately connected” and “real” are defined by the relevant research community. For example, we can say that an MRI image of the brain is real because it has been produced by a process that is appropriately connected to a real brain. However, sometimes MRI machines produce images that radiologists classify as (unreal) artifacts because they have been produced, for example, by the scanner itself or by the patient’s movements.

Referring to data as “real” does not necessarily entail a commitment to a physicalist notion of reality. Data could be about physical, chemical, biological, social, or psychological phenomena. For example, we would consider data concerning biodiversity, stock prices, suicidal ideation, or cultural taboos to be real data, even though the phenomena they refer to cannot be equated with specific physical objects. The data could be about things we cannot directly observe, such as electrons, quarks, entropy, or dark matter. What matters most is that the relevant scientific community considers the data to be about real phenomena.

Read more at PNAS (Proceedings of the National Academy of Sciences of the United States of America).