What happens when we use language, not as a communicative medium for humans, but as training data for an AI system? To answer this question, it’s helpful to clarify what we mean by data.
What Is Data?
As a data ethicist, I push back on definitions of data as “facts” because many things that are data are not factual (inferences, for starters). Instead, I think about Rob Kitchin’s definition of data as a representation of a phenomenon.
We can (and do!) turn a lot of things into data, and data has some unique characteristics that make it useful. Philosopher C. Thi Nguyen describes data’s power as a function of its universality and portability, as something we can measure, collect, and exchange. This comes at the expense of other things, such as context, and it devalues concepts that don’t fit neatly into being measured, collected, and exchanged. These are the limits of data, as Nguyen explains:
“We gain portability and aggregability at the price of context-sensitivity and nuance. What’s missing from data? Data is designed to be usable and comprehensible by very different people from very different contexts and backgrounds. So data collection procedures tend to filter out highly context-based understanding.”