You shipped the feature six weeks ago. The launch email went out. Engineering has moved on. The PM who owned it is two stories deep into something else. And someone asks the question that, by all rights, should have a clean answer:
"How's it doing?"
You open the dashboard. Adoption looks okay. Retention is... mixed? The chart goes up, then flattens. You honestly can't tell if this is what success was supposed to look like, because the target you set three months ago was "good adoption" and that's not a number.
Previously I argued that your analytics tool doesn't know what you shipped – it knows what users did, but not why you built it or what you expected to happen. That's the structural problem. But there's a second, operational problem that sits on top of it:
Most PMs don't actually know what to measure when a feature ships.
Not because we lack tools. We have Mixpanel and Amplitude and Pendo and Heap. We have dashboards. We have funnels. We have cohort views.
What we lack is a clear answer to two questions:
- What should I be measuring for this feature?
- When the data comes in, how do I read it?
The question "how's that feature performing?" assumes your team has already defined what "performing well" actually means. Most teams haven't. They have charts. They don't have judgments.
Let's fix that.
The metric problem
The default metric for every shipped feature is some version of "is anyone using it?" Adoption percentage. Daily active users. Total events fired.
These tell you the feature exists in the world. They don't tell you whether it's working.
Here's the test. Your feature is being used by 312 users this week, up from 187 last week. Is that good?
If your honest answer is "I don't know, it depends," you're in the same place as most PMs I talk to. The number is real. The judgment is missing.
A feature can have rising adoption and still be failing. Users try it, find it doesn't solve their problem, and don't come back. The chart goes up; the feature dies. By the time retention shows the failure, you've already moved on to the next thing.
The reverse is also true. A feature can have low adoption and still be successful – for the small audience it was meant for. The dashboard looks weak; the strategy is sound.
The metric without the context is noise.
The five metrics that actually answer the question
For every feature you ship, you should be tracking five things. Not all of them come from dashboards. Some are decisions you make in advance.
1. Adoption rate. The percentage of eligible users – the ones who could plausibly use this feature – who try it within the first 30 days. The denominator matters more than people think. Measuring against all users hides whether the feature is reaching its target audience.
2. Activation rate. The percentage of eligible users who hit the value moment, not just the first click. Activation is the difference between "tried the saved filter" and "saved a filter and came back to it." Without activation, adoption is a vanity metric.
3. Retention curve. The percentage of activated users who return in week 2, week 4, and week 8. The shape matters more than any single number. A retention curve that flattens above zero is a healthy feature. A retention curve that decays toward zero is a feature that didn't earn its place.
4. Depth of use. How much of the feature's functionality is actually being used. If you shipped a feature with five options and 95% of users only use one, you have a depth problem. The other four options are tax.
5. Outcome correlation. Whether the engagement you're seeing actually moves the metric the feature was built to move. Adoption can be high, retention can be solid, and the feature can still fail this test.
Three things to notice about this list. It contains no vanity metrics – no total event count, no raw DAU, no page views. It requires you to know what you were trying to do before the feature shipped. And it's not what most analytics dashboards default to. Which is why most teams don't measure these by default.
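To make the definitions concrete, here is a minimal Python sketch of how the first four metrics could be computed from sets of user IDs. The data, the variable names, and the three-option threshold for depth are all hypothetical. Activation is divided by eligible users to match the definition above, though some teams divide by users who tried the feature instead. Outcome correlation is left out because it depends on the specific metric your feature was built to move.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Hypothetical data for one feature (user IDs are made up).
eligible = {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8", "u9", "u10"}
tried = {"u1", "u2", "u3", "u4"}      # used the feature within the first 30 days
activated = {"u1", "u2"}              # reached the value moment
returned_by_week = {2: {"u1", "u2"}, 4: {"u1"}, 8: {"u1"}}
options_used = {"u1": 4, "u2": 1, "u3": 3, "u4": 1}  # distinct options per user

# 1. Adoption: eligible users who tried the feature.
adoption = rate(len(tried & eligible), len(eligible))

# 2. Activation: eligible users who reached the value moment.
activation = rate(len(activated & eligible), len(eligible))

# 3. Retention curve: share of activated users who returned in each week.
retention_curve = {
    week: rate(len(users & activated), len(activated))
    for week, users in returned_by_week.items()
}

# 4. Depth: share of users who used at least three options (assumed threshold).
deep_users = [user for user, count in options_used.items() if count >= 3]
depth = rate(len(deep_users), len(options_used))

print(adoption, activation, retention_curve, depth)
```

Note that every rate has an explicit denominator. Swapping `eligible` for "all users" would change the adoption number without changing anything about the feature.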
Set baselines before the feature ships
The single highest-leverage thing a PM can do for feature engagement measurement happens before the feature ships, not after.
It's this: write down what success looks like in numbers, against the original story.
Not "we expect good adoption." Numbers. Ranges. Targets.
- Adoption: we expect 15–25% of eligible users to try this within 30 days
- Activation: of those, 40–60% to hit the save-and-return moment
- Retention: week-4 retention of activated users above 35%
- Depth: 60%+ of users to use at least three of the five options
- Outcome: filter re-application time drops by 30–50%
You will be wrong about some of these. That's the point. The ranges are hypotheses. The numbers are what makes "is it working?" a question with an answer.
Three rules for setting baselines:
- Set ranges, not point estimates. "20% adoption" is a target you'll either miss by a hair or claim victory on by accident. "15–25% adoption" is a target you can read against.
- Write them down before launch. Pre-launch targets are honest. Post-hoc targets bend to whatever the data already shows.
- Add them to the story. The story is where the work originated. The targets belong with the story, not in a separate dashboard.
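One way to keep pre-launch targets readable is to store each one as a range and compare the observed value against it. The sketch below uses the example ranges above. The `TargetRange` class and the observed numbers are invented for illustration, and the bounds are treated as inclusive, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TargetRange:
    """A pre-launch target expressed as a range (hypothetical helper)."""
    low: float
    high: Optional[float] = None  # None means "at least `low`", with no upper bound

    def read(self, observed: float) -> str:
        """Classify an observed value as below, within, or above the range."""
        if observed < self.low:
            return "below"
        if self.high is not None and observed > self.high:
            return "above"
        return "within"


# Targets taken from the example list above, written as fractions.
targets = {
    "adoption": TargetRange(0.15, 0.25),
    "activation": TargetRange(0.40, 0.60),
    "week4_retention": TargetRange(0.35),
    "depth": TargetRange(0.60),
    "outcome_time_reduction": TargetRange(0.30, 0.50),
}

# Observed values are made up for the sake of the example.
observed = {
    "adoption": 0.12,
    "activation": 0.55,
    "week4_retention": 0.41,
    "depth": 0.48,
    "outcome_time_reduction": 0.33,
}

reads = {name: targets[name].read(value) for name, value in observed.items()}
for name, verdict in reads.items():
    print(f"{name}: {verdict}")
```

Because the targets are plain data, they can be written down before launch and stored next to the story rather than inferred later from whatever the dashboard shows.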
The 30/60/90 read
Once the feature is live, the data doesn't all arrive at the same time. Different metrics tell you different things at different points.
At 30 days, read the adoption shape. Is the curve rising, flattening, or already declining? Are you reaching the right audience? If adoption is below the low end of your range at day 30, the feature isn't reaching its target. The fix is rarely "more marketing." It's almost always "wrong audience, wrong moment, or wrong value framing."
At 60 days, read retention and depth. First-time use is mostly novelty. Week 4–8 retention tells you whether the feature actually solves a problem worth coming back for. Depth tells you whether users are finding the value or stopping at the obvious surface. Both are leading indicators of outcome.
At 90 days, read the outcome. This is where you find out whether engagement moved the metric the feature was built to move. The features that adopt well, retain well, and have no outcome impact are the most expensive features in your product. They feel like wins, but they aren't.
A note on patience: PMs who check engagement dashboards weekly and demand answers in two weeks tend to make worse decisions than PMs who check at 30/60/90 with discipline. You can't read a retention curve in week one. You can read it in week eight.
Working, iterate, kill
At 90 days, you owe the team a decision. Usually it’s one of three.
Working. The feature hits its adoption range, has a healthy retention curve, and shows measurable outcome impact. The decision is to maintain and consider expanding. Most PMs underweight this case – they keep tweaking a feature that's already working. Don't.
Iterate. The feature hits adoption but fails on retention or depth. Users try it and don't come back. This is the most common case, and it's the one most worth investing in. The diagnostic is: do you understand why they don't come back? If yes, iterate. If no, find out why before you iterate.
Kill. The feature fails adoption and there's a clear opportunity cost on the roadmap. This is the hardest call. The instinct is to give it more time, more marketing, more polish. Don't, if the data is honest. The cost of running a dying feature isn't just maintenance – it's the team's attention, support burden, and the implicit signal to users that this thing matters.
Kills are rare. Most teams iterate things they should kill. The 90-day discipline forces the conversation that "let's give it another quarter" would otherwise avoid.
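The three calls can be written down as a small decision function, which is mostly useful for exposing the cases the rules don't cover. This is a rough sketch, not a prescribed algorithm: the function name, the boolean inputs, the order in which the rules are checked, and the "discuss" fallback are all assumptions.

```python
def ninety_day_call(
    adoption_ok: bool,
    retention_ok: bool,
    depth_ok: bool,
    outcome_ok: bool,
    opportunity_cost: bool,
) -> str:
    """Map 90-day readings to a call (hypothetical helper).

    Each *_ok flag means the metric landed within its pre-launch range.
    `opportunity_cost` means the roadmap has a clearly better use of the effort.
    """
    # Working: adoption in range, healthy retention, measurable outcome impact.
    if adoption_ok and retention_ok and outcome_ok:
        return "working"
    # Iterate: users try the feature, but retention or depth falls short.
    if adoption_ok and not (retention_ok and depth_ok):
        return "iterate"
    # Kill: adoption fails and there is a clear opportunity cost.
    if not adoption_ok and opportunity_cost:
        return "kill"
    # Anything else is not covered by the three cases and needs a conversation.
    return "discuss"


print(ninety_day_call(True, True, True, True, False))
print(ninety_day_call(True, False, True, False, False))
print(ninety_day_call(False, False, False, False, True))
print(ninety_day_call(True, True, True, False, False))
```

The last call is the expensive case described earlier: a feature that adopts and retains well but shows no outcome impact. The function deliberately refuses to label it, because that judgment belongs to the team.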
Why most teams can't actually do this
Everything above is the right answer. Most teams can't do it.
The five metrics live in your analytics tool. The original targets live nowhere durable: in someone's head, maybe in a Confluence page, possibly in a Slack thread. The story that produced the work is in Jira and has been archived. The decisions about why this feature was built the way it was are scattered across three tools and two people, one of whom has left.
So when you sit down for the 90-day read, the right answer requires reassembling four things from four different places. Most PMs don't do this. They skip the read. The decision gets made implicitly by what comes onto the roadmap next.
The five-metric framework is correct. The reason it doesn't get used is structural: nothing connects the metric to the original intent.
What changes when stories carry their own targets
This is where the Atono argument lands.
The reason we built Living Stories isn't to make stories look prettier. It's to make the story the place where the original intent, the targets, the engagement signal, and the outcome data all live together – for the life of the feature, not just until ship.
When the targets you set pre-launch attach to the story, the 90-day read doesn't require reassembly. You open the story. The targets are there. The engagement data is there. The decision context – why you built it, what audience you were aiming at, what trade-offs you accepted – is there.
Summarized graphs that answer "how's this feature doing?" against your story don't have to guess what "doing well" means. The targets are explicit. The benchmark is your team's, not a generic one. The output is the actual judgment you would make if you had a free Monday to reassemble four tools.
This isn't a dashboard win. It's a decision-velocity win. You move from "let me look into that" to "here's where it stands" – for every shipped feature, every week.
Where to start
A practical sequence, regardless of tooling:
- Pick one feature shipping in the next 30 days. Write down the five metrics with ranges, before launch.
- Set a 90-day review on your calendar now.
- At 90 days, run the read. Make a working / iterate / kill call, out loud, with the team.
- After three of these reviews, audit how often the call you made was the right one in hindsight.
Do this for one quarter and the practice sticks. The hard part isn't the metrics. The hard part is committing to the read before the data exists.
For teams that want this connected to the work itself – targets attached to stories, signal flowing back to the story that produced it, showing you the results of what your team meant by "success" – Atono is built around exactly this. Get started free or book a 30-minute tour.
Shipping features is expensive. Not learning from them is worse. Measuring without judgment is the most expensive version of all.