The AI Diaries: week 2

Reading Time: 6 mins

3 Nov 2025

By Ben Newsome

Here at Octopus Ventures, we continue to back traditional B2B software businesses, but it’s becoming clearer that founders who don’t at least use AI in their day-to-day operations are missing a trick.

On their own, the gains offered by AI present a compelling case for its integration into business workflows. But there’s more to it. Adopting the technology now, building familiarity with it across all levels of your business, and empowering employees to experiment and discover new opportunities and efficiencies is a future-facing play. Still, knowing how and where to start isn’t always easy.

As part of our AI Diaries series, we’ve asked a couple of the companies we support to track their journey with this revolutionary technology. Patrick Van Deven is CEO at Octopus Ventures-backed VaultSpeed. They’re using a range of tools, including ChatGPT, to support the customer lifecycle, training it on internal content to create a single source of truth. Oliver Crowe is the technical product manager at Octopus Ventures-backed Flock, who are seeking to use Claude, connected to their other tools through the Model Context Protocol (MCP), to consolidate the enormous amounts of customer data previously distributed across a large tech stack and harness the insights it generates.

You can catch up with their successes and challenges in Week One, or read on to discover what Week Two had in store.

Oliver Crowe, Technical Product Manager, Flock

Week two has been really productive and we’ve had some major wins.

As an insurer, we get a lot of claims based on third-party allegations. Using the MCP with Claude, we have set up a workflow that queries our telemetry data to find out whether a vehicle really was where the third party claimed it to be, at the time and on the date of the incident.

This week, we had a great example and disproved an allegation. Using the MCP, we found a vehicle to be 200 miles away from the alleged incident, uncovering a fraudulent, vexatious claim. Now we’re looking to productionise features like this, using AI to query the telemetry database to reduce the number of fraudulent claims.
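For readers who want to picture the mechanics, the location check described here comes down to comparing where telemetry puts a vehicle at the incident time with where the allegation puts it. The following is a minimal Python sketch of that comparison, not Flock's actual workflow: the `Ping` record, the function names and the idea of taking the nearest reading in time are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Ping:
    """One hypothetical telemetry reading for a vehicle."""
    at: datetime
    lat: float
    lon: float


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def nearest_ping(pings, incident_time):
    """Return the reading whose timestamp is closest to the incident time."""
    return min(pings, key=lambda p: abs((p.at - incident_time).total_seconds()))


def distance_from_alleged_location(pings, incident_time, alleged_lat, alleged_lon):
    """Miles between the vehicle's nearest-in-time reading and the alleged spot."""
    ping = nearest_ping(pings, incident_time)
    return haversine_miles(ping.lat, ping.lon, alleged_lat, alleged_lon)
```

A large distance on its own proves little if the nearest reading is hours away from the incident, so a real check would presumably also look at the time gap before drawing any conclusion.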

We’ve made some other discoveries. One ‘Aha’ moment this week came via transcripts from customer calls or internal meetings. We discovered that being really clear on ‘Me’ versus ‘Them’ when it comes to transcripts creates powerful opportunities, by clearly identifying whether it’s our customer success team (‘Me’) talking, or a customer with their real pain points (‘Them’).

It’s been interesting to use this as an opportunity for internal coaching. Because we know who ‘Me’ is, we can upload these transcripts back into Claude and ask it to coach ‘Me’ as an award-winning, successful, customer success manager. It picks up on what individuals are doing well, and what needs improvement, and offers feedback: a great step forward for the team.
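As an illustration of the ‘Me’ versus ‘Them’ idea, here is a small Python sketch that splits a labelled transcript by speaker and builds a coaching prompt from it. It assumes each line starts with `Me:` or `Them:`, which is a hypothetical format; the function names and the prompt wording are also assumptions, loosely echoing the description above rather than reproducing Flock's setup.

```python
def split_by_speaker(transcript):
    """Group utterances under 'Me' and 'Them' from lines like 'Me: hello'."""
    turns = {"Me": [], "Them": []}
    for line in transcript.splitlines():
        # Split on the first colon only, so times like 10:30 stay intact.
        speaker, sep, utterance = line.partition(":")
        speaker = speaker.strip()
        if sep and speaker in turns:
            turns[speaker].append(utterance.strip())
    return turns


def coaching_prompt(transcript):
    """Build a prompt asking for feedback on the 'Me' side of the call."""
    turns = split_by_speaker(transcript)
    me_lines = "\n".join(f"- {u}" for u in turns["Me"])
    them_lines = "\n".join(f"- {u}" for u in turns["Them"])
    return (
        "You are an award-winning customer success manager acting as a coach.\n"
        "Give feedback on what 'Me' did well and what needs improvement.\n\n"
        f"What 'Me' (our team) said:\n{me_lines}\n\n"
        f"What 'Them' (the customer) said:\n{them_lines}\n"
    )
```

Keeping the two sides separate also makes it easy to reuse the ‘Them’ lines on their own, for example when collecting customer pain points.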

Our main challenge is still around prompting. Even with MCPs and the wide integrations we’ve got, with lots of different data points, if we don’t prompt properly we get less focussed outputs, as well as some hallucinations. Prompting properly means being clear on the outcome, as well as on which user the prompt is acting for.

Clear definitions matter too. In the insurance world there’s quite a lot of technical language, with different words being used in different use cases. So, within the project context, having precise definitions has been critical for a good response. From project creation, with all the context, to prompt engineering, there’s a lot to understand and I’d say the learning curve is pretty steep.

Because we have so many different tools integrated as part of the MCP, the sheer number of opportunities to go after represents a challenge in itself – it can be almost overwhelming. That’s why each week, as we develop our use of AI, we set very distinct goals as to exactly what the output is and what we want to achieve.

This week, for example, we’ve been trying to get a weekly Slack update, based on all customer sentiment from the past week. We’re giving this priority over trying to progress with other opportunities: having these focussed goals has made our use of the tools much more effective.

Patrick Van Deven, CEO, VaultSpeed

One of the problems I’ve had this week, with rolling out ChatGPT across my whole go-to-market team, is that we’ve had some hallucination issues. The AI has been hallucinating product features that we don’t have.

We’re paying a lot of attention to our retrieval-augmented generation (RAG), the process by which we customise our GPT assistant, making sure it refers to a relevant set of internal documents before generating a response. We’ve been carefully curating the documents we input, but still, I’ve found some of our colleagues ploughing ahead a little too enthusiastically, prompting and creating blog posts and statements that need a bit of supervision.
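To make the RAG idea concrete, here is a deliberately simplified Python sketch: pick the internal documents most relevant to a question, then put only those in front of the model along with an instruction not to go beyond them. It is an illustration under stated assumptions, not VaultSpeed's setup. Relevance here is plain word overlap, whereas production systems typically use embeddings and a vector store, and the document names and prompt wording are hypothetical.

```python
import re


def tokenize(text):
    """Lower-case a text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=2):
    """Return up to top_k (name, text) pairs sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents.items(),
        key=lambda item: len(question_words & tokenize(item[1])),
        reverse=True,
    )
    # Drop documents with no overlap at all rather than padding the context.
    return [(name, text) for name, text in ranked[:top_k]
            if question_words & tokenize(text)]


def build_prompt(question, documents, top_k=2):
    """Assemble a prompt that restricts the answer to the retrieved documents."""
    context = "\n".join(f"[{name}] {text}"
                        for name, text in retrieve(question, documents, top_k))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )
```

The explicit "say you don't know" instruction is one common way to discourage invented product features, though it reduces rather than removes the need for human review.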

Leadership is important here. It’s my job as CEO to be clear, to reinforce the message that we need to stay in line with our positioning, and to make sure everyone understands it. It’s also important to instil a culture of healthy scepticism, and make it clear that it’s not alright to be lazy: you can’t simply post anything GPT says, because sometimes it’s plain wrong.

Product management is creating another challenge. We now have functional experts who can use vibe coding, prototyping software purely through AI prompts, with no conventional coding. We use tools like Lovable, Replit and Cursor to prototype future product features – even whole modules and a whole new solution.

They can go quite far: they can really build an actual, working prototype. The challenge is that previously, they might have done a mock-up, with some Figma screens, but it was easy to tell instantly that this wasn’t an app yet.

Now, they almost compete with our engineering team, building at pace the things they think we need to compete in the market. The question is, alright, are we selling what we have on the shelf, or are we selling something that was vibe coded last weekend, because the pre-sales team felt that our tool should do something a bit different or move in a new direction?

Not that this is wrong. I think we’re bringing velocity in a space where it’s much needed. But the question I have is, how – and when – do we bring this into engineering? We sell to large enterprises, so it needs to ship as proper, production-grade software. I really don’t know.

I’m watching this, I’m controlling it, trying to make sense of it and guardrail it, but we’re learning a new way of working between product management and software engineering. The frontier, the limit between the two, is really blurring.

Actually producing code has become a commodity. That raises multiple questions. What is intellectual property if code can be regenerated? What is the role of design and product management if functional experts can vibe code the whole, or a critical part, of a working software solution? And how long is this opportunity window going to remain open for us if anyone can code at that speed?

While these questions resonate across the whole software industry, this week I’ve been ruminating specifically on their impact on product management.