The Digital Personal Data Protection Act, 2023 (DPDP) is India’s first comprehensive data-protection law. It governs the processing of digital personal data — any data about an identifiable individual that’s collected or held digitally — and that net is wider than most teams expect. The moment your analytics ties a behaviour to a person, you’re inside its scope. This guide is a practical map of what that means for product analytics. It is not legal advice: the Act passed in 2023, its operational Rules are still being finalised, and the details that matter most to you depend on your situation. Treat this as a starting point, then confirm with counsel.
Who’s who under DPDP
DPDP uses its own vocabulary, which maps closely to GDPR’s if you know that one:
- Data Principal — the individual the data is about (GDPR’s “data subject”).
- Data Fiduciary — whoever decides the purpose and means of processing (the “controller”). If you run the product and choose what to track, that’s you.
- Data Processor — anyone processing data on a Fiduciary’s behalf, like a hosted analytics vendor.
- Significant Data Fiduciary — a class the government can designate based on volume and sensitivity of data, with extra duties: a Data Protection Officer based in India, independent audits, and data-protection impact assessments.
The distinction matters for analytics: if you send events to a third-party cloud, that vendor is your Processor and you remain accountable as the Fiduciary. If you self-host, you keep the whole chain in-house.
Does your analytics even fall under it?
Often, yes. Web analytics that counts pageviews without identifiers sits closer to the edge, but product analytics exists precisely to connect actions to people — funnels, retention, per-user timelines. Once an event is linked to a logged-in user, an email, or a device bound to an identity, you’re processing personal data. The Act applies to processing within India, and also to processing outside India that’s connected to offering goods or services to people in India — so a foreign-hosted SaaS with Indian users isn’t automatically out of scope.
The most defensible event is the one you never needed to collect. Start from the questions you must answer, then capture only what those require.
Consent and notice come first
DPDP is built around consent. It must be free, specific, informed, unconditional, and unambiguous, with a clear affirmative action — no pre-ticked boxes, no bundling unrelated purposes. Before you ask, you give a plain-language notice describing what you’ll process and why. Crucially, a Data Principal can withdraw consent as easily as they gave it, and when they do, you stop and clean up.
For analytics, that translates into one hard requirement: don’t start collecting until you have a basis, and be able to stop on a dime. A consent-first SDK makes this practical — initialise with tracking denied so no listeners attach and no events fire, then opt in once the user agrees:
init(projectId, { apiKey, defaultTrackingConsent: 'denied' })
// only after the user agrees, via clear affirmative action:
optInTracking()
The Pug Web SDK works this way: with consent denied, manual track() and identify() calls are dropped rather than silently queued for later replay, and revoking consent tears the listeners back down. The Act does allow a short list of legitimate uses that don’t need consent, but behavioural analytics rarely fits them cleanly, so the safe default is consent.
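To make the "dropped, not queued" behaviour concrete, here is a minimal sketch of a consent gate. It is illustrative only and is not the Pug SDK's implementation: the ConsentGate class, its method names, and the send callback are all hypothetical.

```javascript
// Hypothetical consent gate: illustrative only, not the Pug SDK's internals.
class ConsentGate {
  constructor(send) {
    this.send = send;     // function that actually transmits an event
    this.granted = false; // denied by default, until the user opts in
  }

  optIn() {
    this.granted = true;
  }

  optOut() {
    this.granted = false;
  }

  // Returns true if the event was sent, false if it was dropped.
  track(name, props = {}) {
    if (!this.granted) return false; // dropped, never queued for replay
    this.send({ name, props });
    return true;
  }
}

// Usage: events before opt-in and after opt-out never reach the sender.
const sentEvents = [];
const gate = new ConsentGate((event) => sentEvents.push(event));
gate.track('page_view');                // dropped: no consent yet
gate.optIn();
gate.track('signup', { plan: 'free' }); // sent
gate.optOut();
gate.track('page_view');                // dropped again after withdrawal
```

Dropping instead of buffering is the deliberate choice here: a buffer that replays on opt-in would mean collection started before consent existed.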
Minimise, limit the purpose, and set retention
Three of the Act’s principles bite directly on how you instrument a product:
- Purpose limitation: process data only for the purpose you told the user about. Don’t quietly reuse analytics events to train a model or build ad profiles.
- Data minimisation: collect only what the purpose needs. Every extra trait on an event is extra risk.
- Storage limitation: keep data only as long as it’s needed, then erase it. Document your retention windows.
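The minimisation and storage-limitation principles above can be sketched as two small helpers. This is a rough illustration under assumed names: the TRACKING_PLAN contents, the event names, and the 90-day window used in the usage lines are hypothetical, and real retention periods depend on your stated purpose.

```javascript
// Hypothetical tracking plan: event name -> properties the stated purpose needs.
const TRACKING_PLAN = new Map([
  ['signup_completed', ['plan', 'referrer']],
  ['report_exported', ['format']],
]);

// Keep only allowlisted properties; events outside the plan return null.
function minimise(name, props) {
  const allowed = TRACKING_PLAN.get(name);
  if (!allowed) return null;
  const kept = {};
  for (const key of allowed) {
    if (Object.prototype.hasOwnProperty.call(props, key)) {
      kept[key] = props[key];
    }
  }
  return { name, props: kept };
}

// True when an event is older than the documented retention window.
function isExpired(eventTimeMs, nowMs, retentionDays) {
  const windowMs = retentionDays * 24 * 60 * 60 * 1000;
  return nowMs - eventTimeMs > windowMs;
}

// Usage: the email trait is not in the plan, so it is stripped before sending.
const minimised = minimise('signup_completed', { plan: 'pro', email: 'a@example.com' });
const DAY_MS = 24 * 60 * 60 * 1000;
const shouldPurge = isExpired(0, 91 * DAY_MS, 90); // event is 91 days old
```

An allowlist fails closed: a new trait added in code is not collected until someone adds it to the plan, which is the point at which the purpose gets reviewed.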
A good habit is to audit your tracking plan for personal and sensitive fields before they ship. Our free PII event auditor flags identifiers in an event schema in the browser, which is a fast first pass — though it’s a helper, not a compliance sign-off.
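As a rough idea of what a first-pass schema audit might look like, the sketch below flags property names that look like identifiers. It is not the PII event auditor mentioned above. The pattern list, function name, and schema shape are assumptions, and name matching alone will miss personal data hidden in free-text fields.

```javascript
// Illustrative name patterns only; a real audit needs a broader, reviewed list.
const SUSPECT_NAME_PATTERNS = [
  /e[-_]?mail/i,
  /phone|mobile/i,
  /aadhaar/i,
  /full[-_]?name|first[-_]?name|last[-_]?name/i,
  /address/i,
  /birth|dob/i,
];

// schema: { eventName: [propertyName, ...] }
// Returns "event.property" strings for every property name that matches a pattern.
function auditSchema(schema) {
  const findings = [];
  for (const [eventName, fields] of Object.entries(schema)) {
    for (const field of fields) {
      if (SUSPECT_NAME_PATTERNS.some((pattern) => pattern.test(field))) {
        findings.push(`${eventName}.${field}`);
      }
    }
  }
  return findings;
}

// Usage with a hypothetical tracking plan.
const auditFindings = auditSchema({
  signup_completed: ['plan', 'email', 'phone_number'],
  report_exported: ['format'],
});
```

Running a check like this in review or CI catches the obvious cases before they ship; it does not replace a human reading the plan.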
Children’s data is a hard line
DPDP treats anyone under 18 as a child and is unusually strict: processing a child’s data needs verifiable parental consent, and the Act prohibits tracking, behavioural monitoring, and targeted advertising directed at children. If your product may have under-18 users, behavioural analytics on them is not something you can switch on by default — it needs deliberate, separate handling, and in many cases simply shouldn’t run.
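One conservative way to encode "not on by default" is to treat unknown age the same as under 18. The sketch below assumes a hypothetical user object with age and consent fields. How you establish age or verify parental consent is a separate problem that this does not solve.

```javascript
// Conservative policy sketch: unknown age is treated like a child.
// The user shape ({ age, consent }) is an assumption for illustration.
function canRunBehaviouralAnalytics(user) {
  const isAdult = Number.isInteger(user.age) && user.age >= 18;
  return isAdult && user.consent === 'granted';
}

// Usage
const adultWithConsent = canRunBehaviouralAnalytics({ age: 30, consent: 'granted' });
const minorWithConsent = canRunBehaviouralAnalytics({ age: 17, consent: 'granted' });
const unknownAge = canRunBehaviouralAnalytics({ consent: 'granted' });
```

Only the first call returns true. Failing closed on a missing age means a gap in your data cannot quietly turn tracking on.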
Rights mean you must be able to delete
Data Principals get rights to access a summary of their data, to correction and completion, to erasure, and to grievance redressal. For an analytics stack, the erasure right is the one with teeth: you need to actually find and delete a person’s events on request, and also when consent is withdrawn or the purpose has been served. That’s far easier when you own the data store: with a self-hosted setup, the events live in your own database, where you can query and purge by identity. The SDK’s reset() clears identity on the client at logout, but honouring an erasure request is a server-side operation against data you hold.
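A server-side purge by identity might look like the sketch below. An in-memory array stands in for the events table. The user_id field, the events table name, and the function name are assumptions, not Pug's actual schema.

```javascript
// Hypothetical in-memory stand-in for an events table keyed by user_id.
function eraseUserEvents(events, userId) {
  const remaining = events.filter((event) => event.user_id !== userId);
  return { remaining, erased: events.length - remaining.length };
}

// Against a self-hosted ClickHouse, the same intent could be expressed as a
// delete mutation with a query parameter. Table and column names are assumed.
const ERASE_SQL = 'ALTER TABLE events DELETE WHERE user_id = {userId:String}';

// Usage
const storedEvents = [
  { user_id: 'u1', name: 'signup' },
  { user_id: 'u2', name: 'signup' },
  { user_id: 'u1', name: 'report_exported' },
];
const erasure = eraseUserEvents(storedEvents, 'u1');
```

ClickHouse mutations run asynchronously by default, so a real erasure workflow would likely also need to confirm completion and cover backups and any derived tables.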
Security and breach notification
Fiduciaries must take reasonable security safeguards to prevent breaches, and must notify both the Data Protection Board of India and affected Data Principals if one occurs. The Board can impose significant penalties: the Act’s schedule sets ceilings of up to ₹250 crore, and that top band applies to failing to take reasonable security safeguards. The practical takeaway for analytics: the less personal data you hold, and the more control you have over where it lives, the smaller your exposure.
Cross-border data and residency
DPDP takes a relatively open stance on cross-border transfers — they’re permitted except to countries the government specifically restricts (a negative-list approach) — but sector regulators may still impose localisation, and many Indian organisations prefer to keep data in-country regardless. The cleanest way to take the question off the table is to keep event data on infrastructure you control, in a region you choose. This is where self-hosting earns its keep: run the analytics backend yourself and events are written to your own ClickHouse, on your own servers, under your own access and retention rules.
A practical DPDP checklist for analytics
- Establish a basis: show a clear notice and collect consent before any tracking starts; default to off.
- Make withdrawal real: let users opt out as easily as they opted in, and stop collecting when they do.
- Minimise: capture only the events and traits your stated purpose needs — audit the plan for PII.
- Protect children: no behavioural tracking or targeted ads for under-18s, and get verifiable parental consent before processing a child’s data.
- Be able to delete: have a way to find and erase a person’s events on request.
- Set retention: keep data only as long as needed, and document why.
- Control residency: self-host, or pick a region and vendor you’re comfortable with.
- Plan for breaches: have safeguards and a notification path to the Board and affected users.
Where open source and self-hosting fit
Open source helps on two fronts that DPDP cares about: you can read exactly how data is handled, and you can self-host so it stays on servers you control, in India if you choose. Pug is AGPL-3.0 and self-hostable for exactly this reason — events stay in your own infrastructure, the consent-first SDK keeps collection off until a user agrees, and you hold the data store where access, retention, and deletion actually happen. None of that is automatic compliance, and none of it is legal advice. But data minimisation, consent-first collection, and self-hosting give you a strong, defensible foundation — the same one that helps with GDPR and privacy-first analytics more broadly.