Overview
This spike investigated whether South African grocery retailers expose public APIs for barcode-based product lookup or catalog search with pricing. The goal was to determine which retailers can serve as data sources for the Pantry App's barcode scan feature, enabling users to scan a product barcode and retrieve its name, price, and retailer availability.
Five retailers were probed (Woolworths, Checkers/Sixty60, Pick n Pay, SPAR, Makro) along with two international barcode databases (Open Food Facts, UPC Item DB). The investigation also covered third-party aggregators and community scraping projects.
Deliverables
| ID | Title | Effort | Status | Owner |
| --- | --- | --- | --- | --- |
| BD-1 | Probe Woolworths product API | 2h | Done | Backend Dev |
| BD-2 | Probe Checkers/Sixty60 API | 2h | Blocked (WAF) | Backend Dev |
| BD-3 | Probe Pick n Pay API | 2h | Blocked (SPA) | Backend Dev |
| BD-4 | Probe SPAR API | 1h | No API exists | Backend Dev |
| BD-5 | Test Open Food Facts for SA barcodes | 1h | Low coverage | Backend Dev |
| BD-6 | Test UPC Item DB for SA barcodes | 1h | Zero coverage | Backend Dev |
| BD-7 | Build Woolworths search service script | 2h | Done | Backend Dev |
| BD-8 | Build Open Food Facts lookup service script | 1h | Done | Backend Dev |
| BD-9 | Research third-party aggregators (Trundler) | 1h | Done | Backend Dev |
| BD-10 | Document findings and recommendations | 2h | Done | Business Analyst |
| BD-11 | Create spike report | 1h | Done | Business Analyst |
Retailer Results
| Retailer | Barcode Lookup | Catalog Search | Price (ZAR) | Auth | Confidence |
| --- | --- | --- | --- | --- | --- |
| Woolworths | No | Yes | Yes | None | High |
| Checkers/Sixty60 | No | Blocked | N/A | WAF + Auth | Low |
| Pick n Pay | No | Blocked | N/A | SPA Auth | Low |
| SPAR | No | No | No | N/A | None |
| Open Food Facts | Partial | Yes | No | None | Low |
| UPC Item DB | No | Yes | No | None | Low |
Key Decisions
- Primary data source: Woolworths via Constructor.io. This is the only retailer with a working, unauthenticated search API that returns ZAR pricing. It will serve as the primary product catalog and price source for MVP.
- Barcode resolution: local cache + user input. Since no retailer supports barcode-to-product lookup, we will build a local barcode mapping table in Supabase. Users scan a barcode, and if the mapping is unknown, they enter the product name once. Future scans of the same barcode resolve instantly.
- Seed data from Open Food Facts. Pre-populate the barcode cache with ~8,800 SA products from Open Food Facts. This gives partial coverage without manual entry for common items.
- Defer multi-retailer price comparison. Trundler API (paid) covers Woolworths, PnP, and Makro. This feature should be evaluated in Sprint 2+ once the core barcode flow is validated.
- Abstract retailer integrations behind a service layer. Since API access may change at any time, the architecture must allow swapping or adding retailer backends without changing the scan flow.
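The service-layer decision above could look roughly like the following. This is a minimal sketch, not the spike's implementation: `RetailerBackend`, `PriceQuote`, `StaticBackend`, and `PriceService` are hypothetical names, and the hard-coded backend stands in for a real retailer integration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class PriceQuote:
    retailer: str
    product_name: str
    price_zar: float


class RetailerBackend(Protocol):
    """Hypothetical interface that each retailer integration would implement."""

    name: str

    def search_price(self, product_name: str) -> Optional[PriceQuote]: ...


class StaticBackend:
    """Stand-in backend with hard-coded data, used here only for illustration."""

    def __init__(self, name: str, prices: Dict[str, float]) -> None:
        self.name = name
        self._prices = prices

    def search_price(self, product_name: str) -> Optional[PriceQuote]:
        price = self._prices.get(product_name.lower())
        if price is None:
            return None
        return PriceQuote(self.name, product_name, price)


class PriceService:
    """The scan flow talks only to this class, never to a retailer directly."""

    def __init__(self, backends: List[RetailerBackend]) -> None:
        self._backends = list(backends)

    def quotes(self, product_name: str) -> List[PriceQuote]:
        results: List[PriceQuote] = []
        for backend in self._backends:
            try:
                quote = backend.search_price(product_name)
            except Exception:
                # A failing retailer is skipped so it cannot break the scan flow.
                continue
            if quote is not None:
                results.append(quote)
        return results
```

Adding or removing a retailer then means changing the list passed to `PriceService`, while the scan flow keeps calling `quotes()` unchanged.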
Technical Notes
Woolworths Constructor.io Integration
The Woolworths search API is powered by Constructor.io. Two endpoints are available:
- Search: https://ac.cnstrc.com/search/{query}?key=key_SsbVHddjxFcZQ9uI&section=Products
- Autocomplete: https://ac.cnstrc.com/autocomplete/{query}?key=key_SsbVHddjxFcZQ9uI
Response data includes: product name, brand, ratings, images, and price fields (p10, p30, p60). No authentication is required. The API key is embedded in the Woolworths public frontend.
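As an illustration of the search endpoint described above, the sketch below builds a request URL and picks out the price fields from one result. The helper names `build_search_url` and `extract_prices` are hypothetical, and the assumption that the price fields sit together in a flat dictionary per result was not verified against the live response; the meaning of p10, p30, and p60 is likewise not assumed here.

```python
from urllib.parse import quote, urlencode

SEARCH_BASE = "https://ac.cnstrc.com/search"
# Price field names as listed in this report; their semantics are not assumed.
PRICE_FIELDS = ("p10", "p30", "p60")


def build_search_url(query: str, api_key: str, section: str = "Products") -> str:
    """Build the search URL; the query goes in the path, so it is percent-encoded."""
    path_query = quote(query, safe="")
    params = urlencode({"key": api_key, "section": section})
    return f"{SEARCH_BASE}/{path_query}?{params}"


def extract_prices(item_data: dict) -> dict:
    """Return only the price fields present in one result's data (assumed flat)."""
    return {field: item_data[field] for field in PRICE_FIELDS if field in item_data}
```

Passing the key as a parameter rather than hard-coding it keeps the key in configuration, which makes it easier to react if the key changes.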
Proposed Data Model
barcodes
- id (uuid)
- ean (text, unique) -- EAN-13 barcode
- product_name (text) -- canonical product name
- brand (text, nullable)
- category (text, nullable)
- source (text) -- "user", "openfoodfacts", "manual"
- created_at, updated_at
prices
- id (uuid)
- barcode_id (fk)
- retailer (text) -- "woolworths", "pnp", etc.
- price_zar (numeric)
- unit (text, nullable) -- "each", "per kg", "per litre"
- fetched_at (timestamptz)
- source (text) -- "constructor_io", "trundler", "manual"
price_history
- id (uuid)
- barcode_id (fk)
- retailer (text)
- price_zar (numeric)
- recorded_at (timestamptz)
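The proposed model can be tried out locally with a sketch like the one below. It uses an in-memory SQLite database purely as a stand-in for Supabase (Postgres), so the column types are approximations (text for uuid and timestamptz). The NOT NULL constraints are an assumption inferred from which fields the model marks as nullable, and `create_db` and `add_barcode` are hypothetical helpers.

```python
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE barcodes (
    id TEXT PRIMARY KEY,
    ean TEXT NOT NULL UNIQUE,          -- EAN-13 barcode
    product_name TEXT NOT NULL,
    brand TEXT,
    category TEXT,
    source TEXT NOT NULL,              -- "user", "openfoodfacts", "manual"
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE prices (
    id TEXT PRIMARY KEY,
    barcode_id TEXT NOT NULL REFERENCES barcodes(id),
    retailer TEXT NOT NULL,
    price_zar NUMERIC NOT NULL,
    unit TEXT,
    fetched_at TEXT NOT NULL,
    source TEXT NOT NULL               -- "constructor_io", "trundler", "manual"
);
CREATE TABLE price_history (
    id TEXT PRIMARY KEY,
    barcode_id TEXT NOT NULL REFERENCES barcodes(id),
    retailer TEXT NOT NULL,
    price_zar NUMERIC NOT NULL,
    recorded_at TEXT NOT NULL
);
"""


def create_db() -> sqlite3.Connection:
    """Create the three tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_barcode(conn: sqlite3.Connection, ean: str, product_name: str, source: str) -> str:
    """Insert one barcode mapping and return its generated id."""
    barcode_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO barcodes (id, ean, product_name, source) VALUES (?, ?, ?, ?)",
        (barcode_id, ean, product_name, source),
    )
    return barcode_id
```

The unique constraint on ean is what lets a second scan of a known barcode resolve to a single mapping; inserting the same ean twice raises `sqlite3.IntegrityError`.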
Barcode Scan Flow (Proposed)
- Camera captures barcode image
- Client-side barcode detection (ZXing or QuaggaJS) extracts EAN-13
- Query local barcodes table for product name
- If found: search Woolworths API for current price, display to user
- If not found: prompt user for product name, save mapping, then fetch price
- Cache price in prices table with 24-hour TTL
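The steps above, from the decoded EAN-13 onward, can be sketched as follows. This is an illustration under stated assumptions: in-memory dictionaries stand in for the barcodes and prices tables, `fetch_price` and `ask_user` are hypothetical callables supplied by the caller, and checking the cached price before calling the retailer is one interpretation of the 24-hour TTL. The check-digit validation uses the standard EAN-13 weighting (1 and 3 alternating over the first 12 digits).

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional, Tuple

PRICE_TTL = timedelta(hours=24)


def is_valid_ean13(code: str) -> bool:
    """Check length, digits, and the EAN-13 check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    # Weights alternate 1, 3, 1, 3, ... across the first 12 digits.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


class ScanFlow:
    def __init__(
        self,
        fetch_price: Callable[[str], Optional[float]],
        ask_user: Callable[[str], str],
        now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
    ) -> None:
        self._fetch_price = fetch_price
        self._ask_user = ask_user
        self._now = now
        self.barcodes: Dict[str, str] = {}                   # ean -> product name
        self.prices: Dict[str, Tuple[float, datetime]] = {}  # ean -> (price, fetched_at)

    def scan(self, ean: str) -> Tuple[str, Optional[float]]:
        if not is_valid_ean13(ean):
            raise ValueError(f"not a valid EAN-13: {ean!r}")
        name = self.barcodes.get(ean)
        if name is None:
            # Unknown barcode: ask once, then remember the mapping.
            name = self._ask_user(ean)
            self.barcodes[ean] = name
        now = self._now()
        cached = self.prices.get(ean)
        if cached is not None and now - cached[1] < PRICE_TTL:
            return name, cached[0]
        price = self._fetch_price(name)
        if price is not None:
            self.prices[ean] = (price, now)
        return name, price
```

Injecting the clock and the two callables keeps the flow testable without a camera, a network call, or a real database.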
Outcomes
What We Learned
- No SA retailer offers a public barcode-to-product API. Barcode resolution must be built in-house.
- Woolworths is the only viable free data source for product search and ZAR pricing.
- Checkers and Pick n Pay have internal APIs, but these are not publicly reachable: Checkers' is behind a WAF plus authentication, and Pick n Pay's is behind an authenticated SPA frontend.
- Open Food Facts can seed a barcode cache (~8,800 SA products) but lacks pricing data.
- Trundler is the best paid option for multi-retailer price comparison (Woolworths, PnP, Makro).
- Community projects (shoprite-miner, get-my-groceries) confirm that scraping is possible but fragile.
Recommendation Summary
For MVP, use Woolworths Constructor.io as the primary price source. Build a local barcode mapping cache seeded with Open Food Facts data. Let users contribute barcode-to-product mappings via the scan flow. Plan for Trundler integration in Sprint 2+ for multi-retailer comparison.
Open Issues
- ToS Risk (High): Using the Constructor.io API key may violate Woolworths' Terms of Service. The key is public but not officially documented for third-party use. Mitigation: consult legal; consider approaching Woolworths for a partnership.
- API Stability (Medium): The API key or endpoint structure could change without notice. Mitigation: abstract behind a service layer; implement health checks and alerting.
- Rate Limiting (Medium): No rate limits were observed during the spike, but they may be introduced at any time. Mitigation: cache responses with a 24-hour TTL; batch requests where possible.
- Barcode Coverage Gap (Medium): Open Food Facts covers only ~8,800 SA products. Most barcodes will initially require manual user input. Mitigation: the cache grows over time as users scan and confirm products.
- Price Accuracy (Low): API prices reflect online/web prices, which may differ from in-store. Mitigation: label prices as "estimated online price" in the UI.
- Trundler Budget (Low): Multi-retailer comparison via Trundler requires a paid plan. Pricing TBD. Mitigation: defer to Sprint 2+; validate demand for multi-retailer comparison first.