Instacart has officially pulled the plug on its controversial AI-driven pricing experiments, effectively deprecating the Eversight dynamic pricing API for consumer-facing grocery prices. The decision comes after a Consumer Reports investigation revealed that the platform was running live A/B tests that charged users different prices for identical goods, with gaps of up to 23% that depended on which test group a shopper was assigned to, not on market conditions.
What's New
On December 22, 2025, Instacart announced it would immediately terminate all item-level price testing using Eversight technology. The company stated that "retailers will no longer be able to use Eversight technology to run item price tests on Instacart."
The new "same store, same price" policy means that if two families shop for the same items, at the same time, from the same store location, they will now see identical prices. Period.
This rollback follows mounting pressure from consumer advocacy groups and an FTC probe into the company's pricing practices. Notably, this comes just a week after Instacart agreed to a $60 million settlement with the FTC over misleading promotions and undisclosed fees.
Key Features of the Deprecated System
Understanding what Instacart was actually doing helps explain why this became such a flashpoint:
- Randomized price buckets: The Eversight system would randomly assign users to different price groups. At a Minnesota Target, researchers found customers split into seven distinct price groups, paying anywhere from $81.24 to $86.78 for identical carts.
- Wide variance testing: The Groundwork Collaborative found the average price gap between test groups was approximately 13%, with extremes reaching 23%.
- Broad retailer adoption: Tests were observed across major chains including Albertsons, Costco, Kroger, Safeway, Sprouts, and Target—affecting nearly three-quarters of items analyzed.
- Behavioral data integration: While Instacart denied using personal data for core pricing, the company acknowledged that brands using Eversight's "Offer Innovation" tool could use behavioral metrics like "new-to-brand" flags to differentiate promotions.
According to Fox Business reporting on the Groundwork study, the potential cost to heavily affected users was estimated at $1,200 per year.
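To make the bucket mechanics and the reported numbers concrete, here is a minimal Python sketch. The hash-based `assign_bucket` function is a generic A/B bucketing pattern and an assumption for illustration only; nothing in the reporting describes how Eversight actually assigned shoppers to groups. The only real figures are the two cart totals from the Minnesota Target example.

```python
import hashlib


def assign_bucket(user_id: str, experiment: str, n_buckets: int) -> int:
    """Deterministically map a user to one of n_buckets test groups.

    Hypothetical illustration of common A/B bucketing, not Eversight's method.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % n_buckets


def relative_spread(prices: list[float]) -> float:
    """Gap between the highest and lowest price, as a fraction of the lowest."""
    low, high = min(prices), max(prices)
    return (high - low) / low


# Lowest and highest cart totals reported for the Minnesota Target example.
cart_spread = relative_spread([81.24, 86.78])
print(f"{cart_spread:.1%}")  # 6.8%
```

The 6.8% figure is a cart-level spread, so it is not directly comparable to the item-level gaps of 13% and 23% cited above.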
For Developers: The Architecture Lessons
This isn't just a PR story—it's a critical case study in platform architecture decisions. Here's what developers and platform architects should take away:
The "Third Rail" Problem
A/B testing UI elements, recommendation algorithms, and even promotional offers is standard practice. But applying opaque variance to core pricing logic in essential goods marketplaces crosses a line that consumers—and regulators—won't accept. The distinction matters:
- Acceptable: Testing whether a "Buy 2, Save 10%" badge increases conversions
- Problematic: Testing whether User A will pay $4.79 while User B pays $3.99 for the same eggs
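One way to enforce that distinction is to reject price-mutating experiments at configuration time, before they ever reach users. The sketch below is a hypothetical guard: the `Surface` enum, the `ESSENTIAL_CATEGORIES` set, and `validate_experiment` are illustrative names, not part of any real experimentation framework.

```python
from dataclasses import dataclass
from enum import Enum


class Surface(Enum):
    """What an experiment is allowed to vary (hypothetical taxonomy)."""
    BADGE = "badge"
    RANKING = "ranking"
    PROMOTION = "promotion"
    BASE_PRICE = "base_price"


# Assumed list of categories treated as essential goods.
ESSENTIAL_CATEGORIES = {"grocery", "pharmacy", "baby"}


@dataclass(frozen=True)
class Experiment:
    name: str
    surface: Surface
    category: str


def validate_experiment(exp: Experiment) -> None:
    """Raise if the experiment would vary base prices on essential goods."""
    if exp.surface is Surface.BASE_PRICE and exp.category in ESSENTIAL_CATEGORIES:
        raise ValueError(
            f"{exp.name}: user-level base price tests are not allowed "
            f"for essential category '{exp.category}'"
        )


# A badge test passes; a per-user egg price test would raise ValueError.
validate_experiment(Experiment("save-10-badge", Surface.BADGE, "grocery"))
```

Putting the rule in config validation means the constraint holds regardless of what any downstream pricing model recommends.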
Transparency as Architecture
If you're building pricing systems, consider baking transparency into your architecture:
- Reason codes: Every price should carry metadata explaining why it was set, such as "Inventory-clearance discount," "Competition match," or "Seasonal adjustment"
- Hard constraints layer: Implement non-negotiable rules (max % change per window, category caps, fairness rules) that override model outputs
- Audit logs: Maintain per-transaction logs of input signals, model used, recommended price, and any overrides
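A minimal sketch of how these three pieces might fit together follows. All names here (`PriceDecision`, `apply_constraints`, `MAX_CHANGE_PCT`, the reason code strings) are assumptions for illustration, and the 5% cap is an arbitrary placeholder, not a recommended value.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

MAX_CHANGE_PCT = 0.05  # hypothetical cap on price movement per window


@dataclass(frozen=True)
class PriceDecision:
    sku: str
    previous_price: float
    recommended_price: float  # raw model output
    final_price: float        # after hard constraints
    reason_code: str
    model: str
    overridden: bool          # True when a constraint changed the model output


def apply_constraints(previous: float, recommended: float,
                      max_change_pct: float = MAX_CHANGE_PCT) -> float:
    """Clamp the recommended price to within max_change_pct of the previous price."""
    floor = previous * (1 - max_change_pct)
    ceiling = previous * (1 + max_change_pct)
    return round(min(max(recommended, floor), ceiling), 2)


def decide(sku: str, previous: float, recommended: float,
           reason_code: str, model: str) -> PriceDecision:
    final = apply_constraints(previous, recommended)
    return PriceDecision(sku, previous, recommended, final, reason_code, model,
                         overridden=(final != round(recommended, 2)))


def audit_record(decision: PriceDecision) -> str:
    """Serialize one decision as a JSON log line with a UTC timestamp."""
    record = asdict(decision)
    record["logged_at"] = datetime.now(timezone.utc).isoformat()
    return json.dumps(record, sort_keys=True)


decision = decide("eggs-12ct", 3.99, 4.79, "SEASONAL_ADJUSTMENT", "demand-model-v1")
print(decision.final_price, decision.overridden)  # 4.19 True
```

Here the model asked for $4.79, the constraint layer capped the move at 5% above $3.99, and the audit record captures both numbers along with the reason code.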
The API Deprecation Pattern
For teams maintaining similar pricing APIs, note how Instacart handled the deprecation: immediate termination paired with a clear policy statement. Once trust is broken, a gradual sunset period tends to prolong the damage, so a clean break is usually the safer choice.
Comparison: Dynamic Pricing Approaches
| Approach | Transparency | Regulatory Risk | Use Case |
|---|---|---|---|
| Rule-based pricing | High | Low | Essential goods, regulated markets |
| Segmented list pricing | Medium | Low | B2B, regional variations |
| Demand-based (Uber-style) | Medium | Medium | Services, clear supply constraints |
| User-level A/B testing | Low | High | Avoid for essential goods |
| Personalized pricing | Very Low | Very High | Requires explicit disclosure |
The key insight: demand-based dynamic pricing (like Uber surge pricing) is generally accepted because the reason for price changes is visible and consistent across users. What Instacart was doing—random user-level variance with no visible justification—is categorically different.
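The difference can be expressed as a property of the pricing function's signature: if the function only receives market state, two users in the same store at the same time cannot see different prices. The sketch below assumes a hypothetical `MarketState` record and a simple capped multiplier; it is not a description of how Uber or Instacart compute prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketState:
    store_id: str
    sku: str
    time_bucket: str     # e.g. "2025-12-22T18:00"
    demand_ratio: float  # demand relative to capacity; 1.0 means balanced


def market_price(list_price: float, state: MarketState,
                 cap: float = 1.5) -> tuple[float, str]:
    """Price depends only on market state and returns a visible reason."""
    multiplier = min(max(state.demand_ratio, 1.0), cap)
    reason = "HIGH_DEMAND" if multiplier > 1.0 else "LIST_PRICE"
    return round(list_price * multiplier, 2), reason


def quote_for_user(user_id: str, list_price: float, state: MarketState):
    # user_id is accepted for logging or delivery, but deliberately
    # never passed into the pricing function.
    return market_price(list_price, state)


state = MarketState("store-42", "eggs-12ct", "2025-12-22T18:00", demand_ratio=1.2)
assert quote_for_user("alice", 10.00, state) == quote_for_user("bob", 10.00, state)
```

The final assertion is the "same store, same price" invariant written as a test, which is a cheap check to keep in a pricing service's test suite.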
Getting Started: Building Transparent Pricing Systems
If you're building or refactoring a marketplace pricing system, here's a practical starting point:
- Document pricing principles first: Before writing code, establish non-negotiables ("no discrimination by protected attributes," "caps on variance," "disclosure requirements")
- Separate decisioning from delivery: Your pricing API should be stateless, returning the price plus metadata such as the rule or model used and any confidence bands
- Implement human-in-the-loop for sensitive categories: Let category managers override or approve model suggestions for essential items
- Build in fairness monitoring: Automatically flag when pricing outcomes show differential impact by user segment
- Use time buckets: Avoid per-request price changes; show "valid until" timestamps and smooth changes with max daily deltas
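Two of these steps, time-bucketed quotes and fairness monitoring, can be sketched in a few lines of Python. The 24-hour bucket, the 2% threshold, and the function names below are all assumptions chosen for illustration; real values would come from the pricing principles documented in the first step.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

BUCKET = timedelta(hours=24)  # hypothetical pricing window
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def valid_until(now: datetime, bucket: timedelta = BUCKET) -> datetime:
    """Return the end of the fixed pricing window that contains `now`."""
    elapsed_buckets = (now - EPOCH) // bucket  # timedelta // timedelta gives an int
    return EPOCH + (elapsed_buckets + 1) * bucket


def quote(sku: str, price: float, rule: str, now: datetime) -> dict:
    """Stateless quote: price plus the metadata a client needs to explain it."""
    return {
        "sku": sku,
        "price": price,
        "rule": rule,
        "valid_until": valid_until(now).isoformat(),
    }


def flag_segment_gap(paid_by_segment: dict[str, list[float]],
                     threshold: float = 0.02) -> bool:
    """Flag when average prices paid differ across segments by more than threshold."""
    averages = [mean(prices) for prices in paid_by_segment.values()]
    low, high = min(averages), max(averages)
    return (high - low) / low > threshold


now = datetime(2025, 12, 22, 15, 30, tzinfo=timezone.utc)
print(quote("eggs-12ct", 3.99, "LIST_PRICE", now)["valid_until"])
# 2025-12-23T00:00:00+00:00
```

A monitor like `flag_segment_gap` would have fired on the egg example earlier in this article: averages of $4.79 and $3.99 differ by about 20%, far above the 2% placeholder threshold.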
For reference architectures, Competera's documentation and Grid Dynamics' technical guides offer solid starting points for building explainable pricing systems.
Verdict
Instacart's Eversight deprecation marks a watershed moment for algorithmic pricing in consumer marketplaces. The message is clear: black-box optimization that treats users as experimental subjects rather than customers will face backlash—both from consumers and regulators.
For developers, this isn't about avoiding dynamic pricing entirely. It's about recognizing that pricing architecture is trust architecture. The systems we build encode values, and when those values prioritize extraction over transparency, the technical debt isn't just in the codebase—it's in the customer relationship.
Instacart acquired Eversight in 2022 with ambitions to revolutionize grocery pricing. Three years later, it has shut down the item price tests that the technology enabled. The acquisition's most visible consumer-facing use is effectively shelved because the implementation failed the most basic test: would you be comfortable if your customers knew exactly how this worked?
If the answer is no, you probably shouldn't build it.