This seems like a tough build vs buy sell. For a lot of (most?) companies, the search/recommendation system isn't necessarily optimized for the customer's search. Instead, it's a way to maximize revenue via preferred placement or injected ads. This almost always leads to a gigantic if/else chain of bespoke, business-analyst-driven decisions for the marketplace.
How are you going to allow folks to influence the system? Or do you see your system integrated behind their pseudo-recommendation engine?
tullie 154 days ago [-]
The build vs buy decision does come up, but like you mentioned, the product direction of Shaped is to be primitives for search and recommendation, so that users who want to build can use Shaped to build quicker (e.g. integrated behind their pseudo-recommendation engine). In truth we have multiple abstractions in Shaped, allowing more technical teams to integrate like this, or less technical ones to have more of an end-to-end integration experience.
The other related market trend we think about here: recommendation is going through a similar journey to what search did 10 years ago. Search at some point was more build-leaning, but over time the technology became democratized and then companies like Elastic and Algolia had offerings that pushed search to lean towards buy. We're seeing recommendations going through the same revolution now that the technologies and system design (e.g. 4-stage recommenders) are more solidified. It's the data that makes these systems unique between companies, not the infrastructure or algorithms.
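For readers unfamiliar with the term, here is a minimal sketch of a four-stage recommender, assuming the commonly used stage names (retrieval, filtering, scoring, ordering). The `Item` fields, the affinity weights and every function name are hypothetical illustrations, not Shaped's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    category: str
    popularity: float  # hypothetical precomputed signal in [0, 1]
    in_stock: bool


def retrieve(catalog, user_categories, limit=100):
    """Stage 1: cheap candidate generation from the full catalog."""
    candidates = [it for it in catalog if it.category in user_categories]
    return candidates[:limit]


def filter_candidates(candidates, seen_ids):
    """Stage 2: drop items that business rules say must not be shown."""
    return [it for it in candidates if it.in_stock and it.item_id not in seen_ids]


def score(candidates, category_affinity):
    """Stage 3: attach a relevance score to each surviving candidate."""
    return [
        (it, category_affinity.get(it.category, 0.0) * it.popularity)
        for it in candidates
    ]


def order(scored, k):
    """Stage 4: final ordering; here simply the top-k by score."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [it.item_id for it, _ in ranked[:k]]


def recommend(catalog, user_categories, seen_ids, category_affinity, k=2):
    """Run the four stages in sequence."""
    candidates = retrieve(catalog, user_categories)
    candidates = filter_candidates(candidates, seen_ids)
    return order(score(candidates, category_affinity), k)


catalog = [
    Item("a", "shoes", 0.9, True),
    Item("b", "shoes", 0.5, True),
    Item("c", "hats", 0.8, True),
    Item("d", "hats", 0.95, False),
    Item("e", "bags", 0.99, True),
]
result = recommend(catalog, {"shoes", "hats"}, {"b"}, {"shoes": 1.0, "hats": 0.5})
print(result)
```

In a real system each stage would be backed by its own infrastructure (for example an approximate nearest-neighbour index for retrieval and a trained model for scoring); the point of the sketch is only the shape of the pipeline.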
rwb_1912 154 days ago [-]
Dan here. On bespoke business decisions—these are handled through SQL and model configs. This involves deciding what items to filter, how to set the objective function (what to optimize the model for), and controlling exploration and diversity in the results.
Setting the objective function is often the most challenging. Different teams may prioritize different objectives and often it requires balancing multiple at once! For instance, how does a company think about the types of user engagement and long-term metrics like retention? A model optimized for clicks might be worse for retention in some cases, but not in others. Ultimately, we A/B test to find out. Surprises and counter-intuitive results are common!
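To make the trade-off concrete, here is a toy sketch of blending two objectives with a weighted sum. The item names, the predicted values and the weights are made up for illustration, and a weighted sum is just one simple way to combine objectives, not necessarily what Shaped does.

```python
def blended_score(predictions, weights):
    """Weighted sum of per-objective predictions for one item."""
    return sum(weights[name] * predictions[name] for name in weights)


def rank(items, weights):
    """Return item ids sorted by blended score, best first."""
    return sorted(
        items,
        key=lambda item_id: blended_score(items[item_id], weights),
        reverse=True,
    )


# Hypothetical model outputs: probability of a click and of the user
# still being retained some time later.
items = {
    "clickbait": {"click": 0.9, "retention": 0.1},
    "evergreen": {"click": 0.4, "retention": 0.8},
}

clicks_only = rank(items, {"click": 1.0, "retention": 0.0})
balanced = rank(items, {"click": 0.5, "retention": 0.5})
print(clicks_only, balanced)
```

With clicks alone the click-heavy item wins; with equal weights the order flips. Which weighting is actually better for the business is the part that needs an A/B test.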
authorfly 154 days ago [-]
Can you tell me what industry your viewpoint is from? My viewpoint from another industry is also about maximizing revenue - but if/else statements have no part, it's data-derived.
esafak 154 days ago [-]
A company doing that doesn't understand LTV.
AnujNayyar 154 days ago [-]
Congratulations on the launch. We've weighed up Algolia, in-house, Typesense etc., so I would have been very keen to know more, but asking us to integrate before knowing the pricing is a tough sell.
Would highly recommend having at least an estimated pricing calculator so we can determine if it's worth our time to install.
tullie 154 days ago [-]
Thank you!
Would love to chat, we've had several customers come over from Algolia and they've seen significant uplift. I can share more if you want to message me at tullie@shaped.ai.
Our pricing is competitive with Algolia's to give you an idea there. We really wanted to get a pricing calculator done before this post but ran out of time. Keep an eye out over the next month for it to come up!
password4321 153 days ago [-]
> pricing is competitive with Algolia's
Not sure if good or bad based on HN complaints about Algolia's pricing
Congrats (from a fellow Melbournian) on the launch. I used to work at Coles and Catch leading their online businesses; search and product recommendations were a big part of it. We had tens of thousands of SKUs. It's harder when trying to do it with dynamic inventory locations and quantities (Coles had over 850 online stores that my team managed). We were looking at Algolia but it wasn't quite there yet (back then). I don't think anyone has solved for that as yet (I left Coles >5 years ago). Curious to hear how you would approach it.
These days I'm the founder of a circular economy marketplace for South Asian ethnic clothing and items - PurvX. The current search is terrible (due to the low-code platform it's on), will be keeping Shaped on my radar when we re-platform.
philip1209 154 days ago [-]
How do you measure quality? And, can users game that quality?
I think that's the hardest thing on any recommendation or search system. It's really hard to do without using money as a neutral measure of value. And, without a good measure of quality - it's unclear that the system is optimizing the right metrics (without cannibalizing others).
tullie 154 days ago [-]
Thanks for the first question!
We run online A/B tests to objectively measure quality against our ranking algorithms and other baselines. As you mentioned, it's crucial that the measure of quality chosen for these tests is fair and correlates with the topline business objective. E.g. if you just evaluate clicks then the system will show click-baity content and overall perform worse.
To handle this, we make it really easy to define different objectives and experiment with how they change results. So although we don't claim to solve the issue directly, we believe that if users can quickly experiment with different proxy objectives, they'll be able to find the one that correlates with their topline objective quicker.
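As a rough illustration of how such an A/B comparison can be read, here is a sketch of a standard two-sided two-proportion z-test using only the Python standard library. The thread does not say which statistical method Shaped uses; the counts below are invented, and "conversion" stands for whatever topline event is chosen (ideally something closer to retention than to raw clicks).

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of control (A) and treatment (B).

    Returns the z statistic and the two-sided p-value under the
    pooled-variance normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 1000 users per arm.
z, p_value = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(f"z={z:.2f}, p={p_value:.3f}")
```

The normal approximation is only reasonable with enough conversions in each arm, and a significant lift on a proxy metric still says nothing if the proxy does not track the topline objective.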
jvans 154 days ago [-]
How do you personalize to the specific signals of the product, do they ingest into your infrastructure? What happens if a customer discovers a bug in a feature they're ingesting, how do they have control of retrains/pinning model versions? Who handles monitoring, the customer or your service?
tullie 154 days ago [-]
Yes when integrating Shaped you connect up the data sources needed to ingest: interactions, items and users. The Shaped interface then allows you to select which exact fields should be used for creating a Shaped model. We provide a full SQL interface to do this, which gives a lot of flexibility.
Our dashboard provides monitoring to help understand what data is ingested and view data quality over time. We expect customers to monitor this but also have alerts on our side and jump in to help customers if we see anything unexpected.
The dashboard also shows training metrics over time (how well does the model predict the test set after each retrain?) and online attribution metrics (how well does the model optimize the chosen objective?).
Customers can disable retraining if they want (which is essentially pinning the model version to current). We can do model version rollbacks on our side if we see an issue or if requested, but it's not a self-serve feature yet. Because we've made it easy to create or fork a Shaped model, we've seen customers often create several models as fall-backs that rely on more static data sources or are checkpoints of a good state.
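On the client side, the fall-back pattern described above might look roughly like the sketch below. The rankers are plain hypothetical callables standing in for calls to different models; nothing here is the Shaped SDK or its API.

```python
from typing import Callable, List, Sequence, Tuple

# A ranker takes a user id and returns a list of item ids.
Ranker = Callable[[str], List[str]]


def rank_with_fallbacks(
    user_id: str, rankers: Sequence[Tuple[str, Ranker]]
) -> Tuple[str, List[str]]:
    """Try each ranker in order; use the first that returns results."""
    for name, ranker in rankers:
        try:
            results = ranker(user_id)
        except Exception:
            # Treat any failure as "this model is unavailable" and move on.
            continue
        if results:
            return name, results
    return "none", []


def primary_model(user_id: str) -> List[str]:
    # Hypothetical: the live model is broken by a bad ingested feature.
    raise RuntimeError("model unavailable")


def checkpoint_model(user_id: str) -> List[str]:
    # Hypothetical: a pinned model trained on a known-good data snapshot.
    return ["item_1", "item_2"]


used, items = rank_with_fallbacks(
    "user_42", [("primary", primary_model), ("checkpoint", checkpoint_model)]
)
print(used, items)
```

In production you would likely log which ranker served each request, so a silent fall-back does not hide an ongoing problem with the primary model.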
astronautas 154 days ago [-]
How does this compare to Vespa? If the key difficulty in scaling search is infra as you say, Vespa is an interesting alternative.
tullie 154 days ago [-]
Compared to Vespa, we're much easier to get set up on. A big part of this is that we have real-time and batch connectors to all leading CDPs and data warehouses. E.g. if you're on Amplitude it takes < 10 mins to stream data directly to Shaped and start seeing initial results.
Being quicker to set up also means it's quicker to build and experiment with new use-cases. So you can start with a feed ranking use-case the first week and then move to an email recommendation use-case the next week.
In terms of actual performance and results, we've never gone head-to-head in an A/B test so I'm not sure of the specifics there, honestly!
astronautas 154 days ago [-]
Thanks, so it's connectors, nice differentiators. Seamless integrations are harder than it seems.
sidcool 154 days ago [-]
What are the underlying ML models? Open source or custom trained?
tullie 154 days ago [-]
We have a library with about 100 algorithms which you can choose from or by default we automatically choose based on your objective.
The majority of them are open source models we've forked and improved. Just as an example, we integrated gSASRec last week: https://github.com/asash/gSASRec-pytorch, and added a couple of improvements on scale and the ability to use language and image features. We use LLMs for the encoding of unstructured data, and we host these ourselves, although OpenAI and Gemini are used for error message parsing and intelligent type inference, things not on the real-time path.
The performance for us was best when we evaluated a couple of options, both in terms of scale and latency. I also like the arrow/dataframe interface. We use arrow everywhere else at Shaped so it was a natural integration.
hajrice 154 days ago [-]
How does it compare to Algolia?
tullie 154 days ago [-]
The short answer is: we're better at recommendations and personalization and lean towards more technical teams (e.g. even with data/ML experience). They're better at traditional search and, these days, lean towards less technical teams.
Yes play.shaped.ai! We just opened that up in a gateless way for this post. Let me know what you think. I should also mention that these demo models are on our cold-tier so that it doesn't break things, in production there's a big speed up.
deepskyai 154 days ago [-]
Congrats Dan and Tullie - and the rest of the team. Great to see AUSTRALIA and particularly Melbourne (formerly known as the most liveable city in the world) represented.
Is there anything different now compared to what you released ~18 months ago? Or just launching on HN now?
tullie 154 days ago [-]
Australia represent! Although we're based in NYC we still are a mostly Aus/international team over here, it's great!
The biggest change is some of the less sexy stuff, like scale and security. E.g. we're now able to scale to 100M+ MAU companies with 100M+ items, and we have a completely tenant isolated architecture, with security as a top priority.
We've also made the platform more configurable at lower levels, and we've found that people like choosing their own models and experimenting rather than just relying on our system.
Finally, we launched search only a couple of months ago and are currently heavily focused on building a best-in-class experience there.
gk1 154 days ago [-]
Congrats (from Pinecone) on the launch! The e-commerce and media recommendation space desperately needs an AI-based solution without the lead-filled baggage of legacy search or recommender systems.
> 100M+ Users
I assume you mean 100M+ end-users have interacted with a site or product that uses your technology. The way it's phrased sounds like you're saying Shaped itself has 100M+ users which of course it doesn't. Consider replacing that with "100M+ interactions" or something.
tullie 154 days ago [-]
Thank you! Would love to catch up sometime assuming you're in NYC with the rest of the Pinecone team!
Yes by 100M+ users we definitely mean end-users, wasn't intentional to mislead so thanks for flagging -- we'll update.
More info here: https://docs.shaped.ai/docs/overview/model-library
Longer answer is in our blog post about it: https://www.shaped.ai/blog/shaped-vs-algolia-recommend :)