Source of Data

Bad data source: A data source that is not reliable, original, comprehensive, current, and cited (ROCCC).

Good data source: A data source that is reliable, original, comprehensive, current, and cited (ROCCC).

Internal data: Data that lives within a company’s own systems.

External data: Data that lives and is generated outside of an organization.

Open data: Data that is available to the public.

First-party data: Data collected by an individual or group using their own resources.

Second-party data: Data collected by a group directly from its audience and then sold.

Third-party data: Data provided by outside sources that did not collect it directly.

Biases in Collecting Data

Bias: A conscious or subconscious preference in favor of or against a person, group of people, or thing.

Different biases

Confirmation bias: The tendency to search for or interpret information in a way that confirms pre-existing beliefs.

Data bias: When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction.

Observer bias/Experimenter bias: The tendency for different people to observe things differently.

Interpretation bias: The tendency to interpret ambiguous situations in a positive or negative way.

Sampling bias: Over-representing or under-representing certain members of a population as a result of working with a sample that is not representative of the population as a whole.

Population: In data analytics, all possible data values in a dataset.

Sample: In data analytics, a segment of a population that is chosen to represent the entire population.

Unbiased sampling: When the sample of the population being measured is representative of the population as a whole.
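
As a minimal sketch of the difference between a biased and an unbiased sample, the Python example below draws a simple random sample (one common way to aim for a representative sample) and compares it with taking the first rows of a sorted list. The population, the region labels, and the function names are hypothetical and chosen only for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k items without replacement; every item has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def share(items, value):
    """Fraction of items equal to value."""
    return items.count(value) / len(items)

# Hypothetical population: 1,000 customers, 30% of them in the "north" region,
# stored sorted by region.
population = ["north"] * 300 + ["south"] * 700

# Biased sample: the first 200 rows are all "north" because the list is sorted.
biased = population[:200]

# Random sample: each customer is equally likely to be picked.
sample = simple_random_sample(population, k=200, seed=42)

print(share(population, "north"))  # 0.3
print(share(biased, "north"))      # 1.0, over-represents "north"
print(share(sample, "north"))      # typically close to 0.3
```

Random selection does not guarantee a perfectly representative sample, but it removes the systematic skew that comes from picking rows by position or convenience.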

Data Privacy and Data Ethics

Data privacy: Preserving a data subject’s information any time a data transaction occurs.

Data ethics: Well-founded standards of right and wrong that dictate how data is collected, shared, and used.

Ethics: Well-founded standards of right and wrong that prescribe what humans ought to do, usually in terms of rights, obligations, benefits to society, fairness, or specific virtues.

Consent: The aspect of data ethics that presumes an individual’s right to know how and why their personal data will be used before agreeing to provide it.

Currency: The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions.

Openness: The aspect of data ethics that promotes the free access, usage, and sharing of data.

Ownership: The aspect of data ethics that presumes individuals own the raw data they provide and have primary control over its usage, processing, and sharing.

Transaction transparency: The aspect of data ethics that presumes all data-processing activities and algorithms should be explainable and understood by the individual who provides the data.

Data anonymization: The process of protecting people’s private or sensitive data by eliminating identifying information.
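
The simplest form of this idea can be sketched in Python as dropping identifying fields from a record before it is shared. The field names and the sample record below are hypothetical; which fields count as identifying depends on the dataset, and removing direct identifiers alone may not be enough if the remaining fields can be combined to re-identify someone.

```python
# Hypothetical set of fields treated as identifying for this example.
IDENTIFYING_FIELDS = {"name", "email", "phone"}

def anonymize(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record without the identifying fields."""
    return {
        key: value
        for key, value in record.items()
        if key not in identifying_fields
    }

# Hypothetical customer record.
record = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "phone": "555-0100",
    "age_band": "30-39",
    "purchase_total": 42.5,
}

print(anonymize(record))
# {'age_band': '30-39', 'purchase_total': 42.5}
```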

Data governance: A process for ensuring the formal management of a company’s data assets.

Data interoperability: The ability to integrate data from multiple sources and a key factor in the successful use of open data among companies and governments.
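
To illustrate what integrating data from multiple sources can involve, the Python sketch below combines two hypothetical datasets that describe the same cities but use different field names and inconsistent formatting. The sources, field names, and values are assumptions made up for this example; real integration work usually also needs agreed formats, units, and identifiers.

```python
def normalize_key(name):
    """Make city names comparable across sources (trim spaces, ignore case)."""
    return name.strip().lower()

def integrate(source_a, source_b):
    """Combine rows from both sources into one record per city."""
    merged = {}
    for row in source_a:
        key = normalize_key(row["city"])
        merged.setdefault(key, {})["population"] = row["population"]
    for row in source_b:
        key = normalize_key(row["City_Name"])
        merged.setdefault(key, {})["bike_stations"] = row["bike_stations"]
    return merged

# Hypothetical open datasets published by two different organizations.
source_a = [{"city": "Springfield", "population": 1000}]
source_b = [{"City_Name": " springfield ", "bike_stations": 12}]

print(integrate(source_a, source_b))
# {'springfield': {'population': 1000, 'bike_stations': 12}}
```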
