What Happens When Apps Collect Too Much User Data?


Name, email, phone number, ID number—these are all examples of user data that apps can gather freely. Depending on the allowed permissions, apps can also collect location data, contacts, SMS messages, browsing history, and even media files. While much of this data is necessary for the apps to function, many developers overreach, collecting data they don’t truly need.

This raises the question: how much data is too much?

TikTok’s case

A few years ago, TikTok was found to be reading users’ clipboard data, even when they weren’t actively pasting anything. The company claimed it was an "anti-spam feature" designed to detect repeated content being pasted.

That's good for them, but what about the privacy and security of all the users who have passwords, links, and other sensitive text stored on their clipboards?

This is a great example of an app collecting too much user data. Reading clipboard data is not core to the app's functionality, and TikTok didn’t have explicit consent from users to read this data.

An industry-wide problem

Unfortunately, this is not a unique case. And while it’s easy to point fingers at TikTok, Facebook, Instagram, and the big social media apps, they’re not the only culprits.

Data has become a modern-day currency, and the temptation for developers to collect more data than necessary is often too strong to resist.

In this article, we'll look at the legitimate reasons for collecting user data, the motivations behind excessive collection, why it's dangerous, and best practices for responsible data collection.

Remember, it’s no longer just a case of ethics. Now, there are laws and regulations governing how businesses collect and handle user data.

Legitimate reasons that apps need user data

It’s not uncommon for developers to justify excessive data collection with "improving the user experience." Still, a lot of apps have genuine reasons for collecting user data. These include:

Authentication and security

A name and email address are common data points used for authentication, which is critical for security. Additional measures such as two-factor authentication may require a user's phone number to send a verification code.

Apps may also need device information such as the model and operating system to build a database of trusted devices. They may also track login times and IP addresses to help notify the user of abnormal behavior, say a login attempt at an odd time from an unknown IP.
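As a sketch of how that last check might work, the toy function below flags a login that comes from an unseen IP address or at an unusual hour. The function name, trusted-IP set, and hour window are all illustrative assumptions, not any real app's logic:

```python
from datetime import datetime, timezone

# Hypothetical sketch: flag a login as suspicious if it comes from an IP
# we haven't seen before, or at an unusual hour for this user. In a real
# app, "trusted_ips" and "usual_hours" would come from stored history.

def is_suspicious_login(ip: str, login_time: datetime,
                        trusted_ips: set, usual_hours: range) -> bool:
    unknown_ip = ip not in trusted_ips
    odd_hour = login_time.hour not in usual_hours
    return unknown_ip or odd_hour

# Example: a user who normally logs in from one IP during the day
trusted = {"203.0.113.7"}
daytime = range(7, 23)  # 07:00-22:59

print(is_suspicious_login("203.0.113.7",
                          datetime(2024, 5, 1, 14, 0, tzinfo=timezone.utc),
                          trusted, daytime))   # False: known IP, normal hour
print(is_suspicious_login("198.51.100.9",
                          datetime(2024, 5, 1, 3, 0, tzinfo=timezone.utc),
                          trusted, daytime))   # True: unknown IP at 3 a.m.
```

Note that this check needs only coarse data (IP, timestamp), not the user's full location history, which is exactly the point of data minimization.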

User experience optimization

Any good app involves analytics-driven optimization to give users the best experience. This means tracking the digital customer journey and user behavior data, such as screen views, button clicks, time spent on different sections, and navigation patterns. The developer can then use this data, for instance, to determine the most useful features in the app and optimize them accordingly.

Troubleshooting performance issues

It’s normal for apps to collect technical performance data, such as crash reports, app load times, screen rendering speeds, and battery usage. This data is useful for fixing performance bugs and ensuring the app runs smoothly even without the customer launching support tickets. Similarly, businesses use call center analytics to track system performance, optimize response times, and ensure seamless customer interactions.

Compliance and legal requirements

Depending on the industry, apps can collect data to satisfy regulatory requirements. For instance, financial institutions like banks and crypto exchanges must adhere to KYC (Know Your Customer) and AML (Anti-Money Laundering) laws. This means collecting additional personal information such as social security numbers, addresses, and photos.

Reasons apps collect too much user data

Advertising and monetization

They say money is the root of all evil, and the saying holds true for data collection. Not all apps can rely on subscriptions to generate revenue. Some rely on ads.

The more user data an app collects, the more targeted ads it can push, which translates to more revenue. This is how data with a legitimate use for UX optimization ends up being weaponized to create user profiles for easy targeting.

Algorithm and AI training

This is a new problem. The rising demand for AI-powered solutions has increased the value of user data. App owners have a bigger incentive to engage in unconsented data collection. Meta, X, and LinkedIn have all been found training their AI models with user data without consent.

Staying ahead of competitors

The drive to stay ahead can lead some app owners to engage in unethical data collection practices—not just on their own users but also on those using competing products. Facebook, always the poster boy for unethical data practices, was found to be using its VPN app (Onavo Protect) to identify other apps its users were actively using.

Government and surveillance cooperation

It’s not uncommon for governments and law enforcement agencies to try to get app owners to secretly spy on their users. The UK government recently made headlines after demanding that Apple provide access to users’ encrypted cloud data, including private messages, media, and other files.

Data hoarding

Sometimes, the app developer doesn’t have any immediate use for the excess data. They just collect and store it thinking it will be useful later.

The dangers of collecting too much user data

What starts as a simple attempt to improve user experience can quickly spiral into serious privacy and security risks with negative implications for both the user and the app owner.

1. Privacy violation

The most obvious danger of collecting too much user data is invading people’s privacy. There’s a thin line between adding convenience to people’s lives through personalized ads and invading their privacy. And perhaps no story demonstrates this better than that of a Minnesota dad who learned about his teenage daughter’s pregnancy thanks to Target’s pregnancy prediction algorithm.

Furious that his daughter was receiving baby product coupons in the mail, the father stormed a Target store accusing the manager of encouraging teen pregnancy. As it turns out, the company’s predictive model had identified the daughter as expecting based on her purchases of items like unscented lotion, supplements, and cotton balls.

Do you see the problem? When apps collect too much data, they risk exposing deeply private details in ways the data owner didn’t consent to. And the worst part? People often don’t realize how much they’re revealing about themselves until the data comes back to haunt them.

2. Security breaches

The more data an app stores, the more of a target it becomes for hackers. Sadly, data security is not a top priority for most app owners. They’re more interested in leveraging the data to further their business.

Again, users have no idea of the inherent risk posed by a particular app.

Even if the app is breached, users won’t know their data has been exposed until either a) they fall victim to a related attack, say, hackers using the stolen data to impersonate them online, or b) they check whether their information has surfaced on the dark web.

Another problem with collecting too much user data is that it can defeat safeguards such as anonymization. In one study, a data set was stripped of personal identifiers like names and email addresses. Still, the researchers reverse-engineered the data set and re-identified individuals using quasi-identifiers such as age, gender, and marital status.
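To see why quasi-identifiers are dangerous, here is a tiny illustration with made-up records (not the study's actual data): even with names and emails stripped, the combination of age, gender, and marital status can single out individuals.

```python
# Illustrative sketch with fabricated records: a record whose combination
# of quasi-identifiers is shared by nobody else is uniquely identifiable,
# even though no name or email appears anywhere in the data.
from collections import Counter

records = [
    {"age": 34, "gender": "F", "marital": "single"},
    {"age": 34, "gender": "F", "marital": "married"},
    {"age": 34, "gender": "M", "marital": "single"},
    {"age": 51, "gender": "F", "marital": "single"},
]

# Count how many records share each (age, gender, marital) combination.
combos = Counter((r["age"], r["gender"], r["marital"]) for r in records)

# A combination that appears only once pinpoints exactly one person.
unique = [combo for combo, n in combos.items() if n == 1]
print(f"{len(unique)} of {len(records)} records are uniquely identifiable")
```

The fewer attributes an app stores per user, the more records share each combination, and the harder this kind of re-identification becomes.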

3. Bias and discrimination

Collecting too much data can amplify algorithmic bias, which is especially dangerous in areas like hiring, credit scoring, and law enforcement. This threat will only grow as people increasingly embrace AI-assisted analytics.

One of the most infamous cases of algorithmic bias came from Amazon's AI Hiring Tool, which was trained on past resumes to identify strong job candidates. The problem? Most of the previous applicants were men. As a result, the algorithm learned to discriminate against women, systematically downranking resumes that contained words like "women’s chess club" or references to female-led organizations.

On the same note, predictive policing algorithms have been criticized for unfairly targeting low-income and minority communities not because of an explicit intent to discriminate, but because of how historical crime data was used.

4. Legal troubles

Excessive data collection can have legal consequences. Multiple regulations have been created to govern how businesses should collect and handle user data. They vary by location, but the GDPR is the most comprehensive and far-reaching. The US has yet to develop a federal data protection regulation, but multiple states have established their own, starting with California (CCPA), Virginia (CDPA), Colorado (CPA), Utah (UCPA), and Connecticut (CTDPA).

Other laws and regulations to keep in mind include the UK Data Protection Act, the Children’s Online Privacy Protection Act (COPPA), and HIPAA which governs the collection and handling of sensitive medical data in the US.

Businesses that fail to comply with these regulations risk hefty fines, legal battles, and bans from operating in certain regions.

5. User trust erosion

Users have become more aware of data privacy. They continuously question the necessity of every piece of data they give to apps.

Once a user establishes that an app is tracking them too aggressively, misleading them about data usage, or sharing their information with third parties, trust is gone and they start looking for alternatives.

Meta’s WhatsApp had its moment in 2021 after users discovered that the app would start sharing data with Facebook. This news led to a mass exodus of users into competing apps, including Signal and Telegram, which positioned themselves as privacy-driven alternatives.

WhatsApp had enough users to absorb the reputational damage without significant revenue loss. Smaller apps, on the other hand, may not be as lucky.

6. The hidden cost for developers

Beyond the legal, ethical, and security risks, excessive data collection also comes with a hidden financial cost. Storing large amounts of user data requires more servers, security infrastructure, and compliance resources. Companies that hoard data indefinitely spend more on storage and security without any real benefit.

In 2018, Microsoft had to delete petabytes of old telemetry data because it was too costly and useless.

Ethical responsibilities of developers to prevent excess data collection

App developers have the biggest role in ensuring data collection is ethical and doesn’t cross user boundaries. Most of these responsibilities are grounded in ethics, but many are also legal obligations under the laws and regulations mentioned above.

Here are some ways developers can prevent excessive data collection while still ensuring great user experience:

1. Transparency and consent

In very clear terms, the app owner should inform users about the data collected, how it will be used, and whether it’s shared with third parties. There’s an old trick where companies try to bury dubious collection practices in long privacy policies they know users won’t read. Don’t try that. Also, don’t resort to legal and technical jargon so dense that even those who do read the policy can’t understand what they’re agreeing to.

The ethical approach is simple:

  • Use clear, straightforward language to explain data collection
  • Obtain explicit consent from users before collecting data
  • Request permissions only when necessary, and not all at once
  • Give users an easy way to opt out without making them dig through settings menus

2. Data minimization

Only collect data that is necessary for the app to function properly. If you don’t need it now, you probably won’t need it later. There’s no point in hoarding the data.

As a rule of thumb, developers need to answer these three questions honestly:

  • Do we really need this data for the app to function?
  • Are we storing the data longer than necessary?
  • What would the implications be for us and our customers if we suffered a data breach tomorrow?
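The storage question can be enforced mechanically with a retention policy. Here is a minimal sketch, assuming a 90-day window and a simple record shape (both arbitrary choices for illustration):

```python
# Hypothetical retention-policy sketch: delete records older than a fixed
# window instead of hoarding them indefinitely. The 90-day window and the
# record layout are illustrative assumptions, not a recommendation.
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=90)  # assumed retention window

def purge_expired(records: list, now: datetime) -> list:
    """Keep only records collected within the retention window."""
    return [r for r in records if now - r["collected_at"] <= RETENTION]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    {"id": 1, "collected_at": now - timedelta(days=10)},   # fresh: kept
    {"id": 2, "collected_at": now - timedelta(days=400)},  # stale: purged
]
print([r["id"] for r in purge_expired(records, now)])  # [1]
```

Running a job like this on a schedule turns "are we storing data longer than necessary?" from an honest-intentions question into an enforced guarantee.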

3. Data security

If you must collect sensitive customer data, you are responsible for protecting it. This starts with how you collect the data and continues with how you transmit and store it.

Some tips to secure user data include:

  • Depersonalizing the data, e.g. through aggregation or anonymized identifiers
  • Encrypting data both in transit and at rest
  • Implementing strong access control. SSO and multi-factor authentication are not nice-to-haves but critical security features for modern applications
  • Regularly backing up critical data, using solutions designed to back up Nutanix, VMware, or Microsoft 365 environments for quick recovery in case of data loss or cyber incidents
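As one concrete example of depersonalizing identifiers, the stdlib-only sketch below replaces an email address with a keyed hash (HMAC-SHA256), so analytics can still link a user's events without storing the raw address. The key shown is a placeholder; a real one would live in a secrets manager:

```python
# Sketch of one depersonalization technique: swap direct identifiers for
# keyed hashes so the same user always maps to the same opaque token, but
# the raw email never appears in the analytics store.
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-key-from-your-secrets-manager"  # placeholder

def pseudonymize(identifier: str) -> str:
    """Return a stable, non-reversible token for an identifier."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

token_a = pseudonymize("alice@example.com")
token_b = pseudonymize("alice@example.com")
assert token_a == token_b       # same user maps to the same token
assert "alice" not in token_a   # raw identifier never appears in the token
```

A keyed hash is preferable to a plain hash here: without the key, an attacker who obtains the tokens cannot cheaply brute-force them from a list of known email addresses.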

What if a data breach still occurs despite employing security best practices? Take responsibility. Notify everybody you believe has been affected. It’s an important part of retaining user trust. At least you were honest about it.

It also gives affected users a chance to take the necessary steps to protect themselves. Some businesses go the extra mile and pay for victims’ credit monitoring so they’re notified if there’s an attempt to steal their identity and commit fraud.

4. Give users control over their data

Developers have a key responsibility to allow users to manage, delete, and control their data. Unfortunately, many companies deliberately make this difficult by hiding account deletion options or requiring a series of frustrating steps to disable tracking.

The best thing to do is:

  • Build systems that allow users to delete their data permanently with a single action
  • Provide easy-to-access privacy settings that don’t require digging through multiple menus
  • Offer clear opt-out mechanisms for tracking and data sharing

5. Avoid manipulative practices

Some developers use deceptive designs to trick users into unknowingly giving their data. A classic example is pre-checked permission boxes that enroll users for tracking without their realizing it. Another is web apps that default to “Accept All” cookies, forcing users to click through multiple steps to reject tracking. The hope is that users won’t bother adjusting settings, and their data can be collected with minimal resistance.

LinkedIn tried it last year, but it backfired. The networking app was found to have automatically opted its users into AI-training data collection unless they manually opted out. The right way would have been to inform users of the data collection and have them opt in.

Conclusion

At a time when most of our lives are lived in the digital world, the data you collect could be a key differentiator. It’s, therefore, understandable why some app owners may resort to unethical practices to obtain the data. And it will only get worse now that we are ushering in the age of AI. Everybody is in a race to find data to train AI systems.

Amidst all this, it’s important to remember our ethical and moral responsibility. Don’t let short-term gains blind you to the negative implications of violating your users’ digital rights. Also, guess what’s more fulfilling than a high-income-generating app? A loyal, engaged community of users who trust you.

I’m interested to hear your thoughts on invasive data collection. Have you ever been surprised by how much data an app collects? Share your thoughts in the comments. If you found this post insightful, don’t forget to leave a like!

