The production of massive amounts of data as a result of the ongoing ‘Big Data’ revolution has transformed data analysis. The availability of analysis tools and decreasing storage costs, allied with a drive by business to combine these datasets with purchased and publicly available data, promises to bring insight and to monetize this new resource. This has led to an unprecedented amount of data about the personal attributes of individuals being collected, stored, and lost. This data is valuable for the evaluation of large populations, but there are a considerable number of drawbacks that information scientists and developers need to consider in order to use it ethically.
Here are just a few considerations to take into account before ripping open the predictive toolsets from your cloud provider:
1. Contextual Integrity
Data is gathered across different contexts, each with its own reasons and permissions for capture. Ensure that the data you capture is valid for its context and cannot be misused for other purposes. Mixing public and personal data can have unintended side effects. One example is sharing location data with other parties without consent: there are numerous examples of stalkers using applications to track others.
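One way to make this concrete is to tag each piece of data with the context it was captured in and the purposes that were consented to, then refuse any other use. The following is only a minimal sketch of that idea in Python; the names `Record`, `PurposeError`, `use`, and the purpose strings are hypothetical and chosen for illustration, not taken from any particular library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """A value tagged with its capture context and consented purposes."""
    value: str
    context: str  # where and why the data was captured (illustrative label)
    allowed_purposes: frozenset = field(default_factory=frozenset)


class PurposeError(PermissionError):
    """Raised when data is requested for a purpose it was not captured for."""


def use(record: Record, purpose: str) -> str:
    """Return the value only if the purpose matches the original consent."""
    if purpose not in record.allowed_purposes:
        raise PurposeError(
            f"data captured for {record.context!r} may not be used for {purpose!r}"
        )
    return record.value


# Hypothetical example: a location captured for route planning only.
location = Record(
    value="51.5,-0.12",
    context="navigation",
    allowed_purposes=frozenset({"route_planning"}),
)

route_input = use(location, "route_planning")  # permitted use

try:
    use(location, "share_with_contacts")  # a purpose that was never consented to
except PurposeError as err:
    refusal = str(err)
```

Keeping the permitted purposes alongside the value means the check travels with the data when datasets are merged, which is where context tends to get lost.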
2. History Aggregation
History is an important part of many efforts to defining…