Foundational Ontologies in Palantir Foundry

Feb 24, 2024

Unlocking the Knowledge Encoded in Data

“Hiding within those mounds of data is the knowledge that could change the life of a patient, or change the world.”
— Atul Butte

It's often been said that there are graphs in your data. Anyone engaged with the Neo4j community has undoubtedly heard this saying at least once. Anyone who works with data long enough, and with the right model, will discover these hidden relationships: for example, a link between revenue and weather, or a way that social media trends can predict stock market movements. These insights are not mere coincidences; they arise from the intricate web of connections within our datasets. This realization is where the concept of foundational ontologies in platforms like Palantir Foundry comes into play. They serve as the bedrock for uncovering and understanding these complex relationships, enabling organizations to predict the future state of the systems they operate.

Ontologies are in and of themselves a form of artificial intelligence. They encapsulate human knowledge and understanding about the natural world in a structured, machine-processable format. This structured knowledge, especially in knowledge graphs, enables AI systems to exhibit intelligent behaviors, such as those observed in sophisticated recommendation engines. The intelligence of ontologies stems from their ability to model the complex web of relationships within data in a way that mirrors human reasoning, allowing machines to make inferences, recognize patterns, and predict outcomes in a manner that feels remarkably human-like. By encoding both the entities and the semantic relationships between them, ontologies provide a rich tapestry of information beyond simple data storage, embodying a deep understanding of the domain they represent.

Modeling ontologies that accurately reflect their real-world counterparts is a complex process. Uncovering relationships between datasets within a system, such as a business environment that spans structured, semi-structured, and unstructured data, requires a multifaceted approach. The process begins with a clear definition of the problem or objective, which guides the focus of the analysis. Meticulous data collection follows, ensuring that data from various sources and formats is accurately aggregated. The preparation phase then cleans and transforms the data to remove inconsistencies and errors, which is crucial for the accuracy of subsequent analyses. Finally, techniques such as clustering, association rules, decision trees, regression analysis, and time series analysis are applied to the prepared datasets to discover hidden patterns and relationships across the data spectrum.
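As a minimal sketch of the analysis step, the snippet below computes a Pearson correlation between two prepared series, in the spirit of the revenue and weather example. The function name and the toy numbers are illustrative assumptions, not data or code from Foundry; a real analysis would run on cleaned production datasets and would likely use one of the richer techniques listed above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-variation of the two series around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up toy values, for illustration only (not real measurements).
daily_temperature_c = [12, 15, 19, 23, 27]
daily_revenue = [1000, 1100, 1350, 1500, 1800]

r = pearson(daily_temperature_c, daily_revenue)
print(f"correlation between temperature and revenue: {r:.2f}")
```

A coefficient near 1 or -1 flags a candidate relationship worth modeling in the ontology; it does not establish causation on its own.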

Interpreting the results of these analyses is critical to the data modeling process. Understanding the dynamics and interconnections between datasets enables businesses to make informed decisions based on how the real world actually behaves. The results can also inform our logical models, which are often encoded with bias. Applying insights derived from data mining across the data landscape ensures we balance our models against our preconceived notions about the behavior of complex systems.

By leveraging the encoded relationships and rules within ontologies, AI systems can understand context, preferences, and nuances in a way that aligns with human thought processes, making decisions or recommendations based on a nuanced understanding of complex interdependencies. This capability transforms ontologies into a cornerstone of artificial intelligence, bridging the gap between raw data and meaningful insights by infusing AI with a scaffold of human knowledge.

Integrating ontologies into AI systems, particularly agents, provides a framework that allows AI to take actions that affect the real world. Ontologies supply a semantic layer that enhances data understanding and supports complex reasoning processes. For AI agents, this means an ability to comprehend the semantics of the data: understanding not just the data itself but also its context and relationships. This understanding is crucial for agents tasked with decision-making or executing remediation strategies. With a structured, ontology-driven view of the data, agents can infer the potential outcomes of various actions within the system, considering direct effects as well as indirect and emergent behaviors. This capability is invaluable in dynamic and complex environments where the interplay between elements can significantly impact the system's overall behavior.

Palantir Foundry was purpose-built to bring this framework to life. At its core, Foundry is an ontological modeling system that includes all the data integration, processing, modeling, and action layers required to build this digital representation of the natural world in which AI agents can operate. This allows us to use empirical and rational methods to model our systems in a single platform. This modeling process culminates in an ontology that encodes not only the state of the system but also its relationships and behaviors. The ontology layer itself includes the following capabilities that AI agents can leverage:

Simulations and What-if Analysis — Foundry's Vertex allows you to run simulations and ask "what-if" questions.
Actions — Foundry includes serverless functions and APIs that can encode complex business logic and behaviors.
Orchestration — Foundry's writeback and webhook technology allows you to orchestrate external systems.
Agents — AIP Logic is a visual interface for building AI agents that leverage the Ontology to perform tasks that can affect the real world using actions and orchestrations.

Using Foundry, we can produce Foundational Ontologies that power businesses in a given sector or sub-sector. These ready-made ontologies, modeled by subject matter experts, remove cost and complexity for organizations and catapult them from the starting line into the AI race. A Foundational Ontology is, by definition, an AI system and includes a marginal workforce in the form of agents. For example, imagine a trivial, notional Foundational Ontology for a consumer online retail business. In this ontology, we would include a model of how the key business metrics relate to one another.

These metrics would be derived from datasets whose data exhibit the following relationships:

Marketing Spend to Reach: Increased marketing spend increases the reach.
Reach to Conversions: Greater reach leads to more conversions.
Conversions to ARPU (Average Revenue Per User): More conversions can increase ARPU when newly converted users generate revenue.
New Logos to MRR (Monthly Recurring Revenue): New customers increase MRR.
ARPU to MRR: Higher ARPU generally results in higher MRR.
MRR to Churn: Higher MRR might lead to a lower churn rate if customers are satisfied, or it could be that reducing churn contributes to maintaining or increasing MRR.
Churn to LTV (Lifetime Value): Lower churn rates lead to higher LTV as customers stay longer and contribute more revenue over time.
LTV to Engagement: Higher LTV may result from better engagement or vice versa.
LTV to NPS: Higher NPS correlates with higher LTV.
CAC (Customer Acquisition Cost) to ROE (Return on Equity): CAC affects ROE; typically, a lower CAC improves ROE.
CAC to New Logos: The number of new customers acquired is directly related to CAC.
CAC to MRR: CAC influences MRR; efficient customer acquisition may improve MRR.
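One way to make this list machine-processable is to treat each relationship as a directed edge between metrics. The sketch below is plain Python, not Foundry's Ontology API; the names `EDGES`, `build_graph`, and `downstream` are hypothetical, and the edge directions simply follow the "X to Y" wording of the list above.

```python
from collections import deque

# Directed edges (source -> target) transcribed from the relationship list.
EDGES = [
    ("Marketing Spend", "Reach"),
    ("Reach", "Conversions"),
    ("Conversions", "ARPU"),
    ("New Logos", "MRR"),
    ("ARPU", "MRR"),
    ("MRR", "Churn"),
    ("Churn", "LTV"),
    ("LTV", "Engagement"),
    ("LTV", "NPS"),
    ("CAC", "ROE"),
    ("CAC", "New Logos"),
    ("CAC", "MRR"),
]


def build_graph(edges):
    """Build an adjacency list; every metric gets a key, even leaf metrics."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])
    return graph


def downstream(graph, start):
    """Return every metric reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


GRAPH = build_graph(EDGES)
print(downstream(GRAPH, "CAC"))
# ['ROE', 'New Logos', 'MRR', 'Churn', 'LTV', 'Engagement', 'NPS']
```

Even this toy graph shows why the structure matters: a change in CAC touches seven other metrics, most of them indirectly.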

This Foundational Ontology allows operators to ask questions like:

How does the Customer Acquisition Cost (CAC) affect a customer's Lifetime Value (LTV)?
How does increasing the Average Revenue Per User (ARPU) impact the Monthly Recurring Revenue (MRR) and Churn rates?
What is the correlation between Net Promoter Score (NPS) and Customer Engagement, and how does this relationship ultimately influence Churn and LTV?
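As a rough illustration of how such a question could be answered structurally, the sketch below finds the shortest chain of influence between two metrics. It is an assumption-laden toy: the `INFLUENCES` dictionary is transcribed from the relationship list earlier in this article, `influence_path` is a hypothetical helper, and none of it is Foundry's actual query interface.

```python
from collections import deque

# Directed "influences" edges taken from the relationship list in this article.
INFLUENCES = {
    "Marketing Spend": ["Reach"],
    "Reach": ["Conversions"],
    "Conversions": ["ARPU"],
    "New Logos": ["MRR"],
    "ARPU": ["MRR"],
    "MRR": ["Churn"],
    "Churn": ["LTV"],
    "LTV": ["Engagement", "NPS"],
    "CAC": ["ROE", "New Logos", "MRR"],
}


def influence_path(graph, source, target):
    """Return the shortest chain of metrics from source to target, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


print(influence_path(INFLUENCES, "CAC", "LTV"))
# ['CAC', 'MRR', 'Churn', 'LTV']
print(influence_path(INFLUENCES, "ARPU", "Churn"))
# ['ARPU', 'MRR', 'Churn']
```

With only the edges listed in this article, `influence_path(INFLUENCES, "NPS", "Churn")` returns `None`, because NPS has no outgoing relationship in the list. Answering the third question would require modeling additional relationships, such as one between NPS and Engagement.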

These questions can be encoded into agents that continually look for insights and take actions in response to the synthesized answers.

The underlying models these agents use to reason can also be fine-tuned in Foundry using something as simple as QA pairs like the following:

Q: What do I do if engagement is declining on the website homepage?
A: Run an engagement analysis using the analytics reporting tool to see the trend over the past several days. Then, perform a root cause analysis using the results of the SEO and SKU optimization tools. Also, consider active experiments using the experiments report. Once a correlation is found, send a notification to the "engagement" channel using the Slack notification tool.
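QA pairs like this are commonly stored one JSON object per line (JSON Lines) before being handed to a fine-tuning job. The sketch below shows that generic layout; the `prompt` and `completion` keys and the `to_jsonl` helper are assumptions for illustration, not a documented Foundry format.

```python
import json

# The QA pair from the text above. The "prompt"/"completion" keys are an
# assumed, generic layout, not a documented Foundry schema.
QA_PAIRS = [
    {
        "prompt": "What do I do if engagement is declining on the website homepage?",
        "completion": (
            "Run an engagement analysis using the analytics reporting tool to see "
            "the trend over the past several days. Then, perform a root cause "
            "analysis using the results of the SEO and SKU optimization tools. "
            "Also, consider active experiments using the experiments report. "
            'Once a correlation is found, send a notification to the "engagement" '
            "channel using the Slack notification tool."
        ),
    },
]


def to_jsonl(pairs):
    """Serialize QA pairs as JSON Lines: one JSON object per line."""
    lines = []
    for pair in pairs:
        if not pair.get("prompt") or not pair.get("completion"):
            raise ValueError("each pair needs a non-empty prompt and completion")
        lines.append(json.dumps(pair, ensure_ascii=False))
    return "\n".join(lines) + "\n"


JSONL_TEXT = to_jsonl(QA_PAIRS)
print(JSONL_TEXT)
```

Note that the answer names tools (analytics reporting, SEO and SKU optimization, experiments report, Slack notification), so the pair teaches the model which actions to reach for as well as what to say.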

We can ship our Foundational Ontology with ready-made data transformation pipelines, MLOps components (including feedback loops), and operations applications like health checks and audit logs! This is done using Foundry's Marketplace application, which packages and distributes all the components needed to hydrate and use a Foundational Ontology. Companies can further leverage Foundry's built-in AI solutions for integrating source data, such as HyperAuto and Pipeline Builder, to make the hydration process even easier.

Of course, having access to the subject matter experts and data required to create a Foundational Ontology is a real challenge. This challenge is an opportunity for the world's largest consulting firms, which have collected decades of data and insights from the largest companies across every industry. Innovative firms will realize that traditional management consulting will be replaced by these Foundational Ontologies that can continually deliver insights and measure outcomes leveraging finely tuned AI systems operating on top of a ready-made Foundational Ontology.

The world's largest organizations are stuck at the starting line of the AI race. They are stuck in the dead end of shallow use cases that do not unlock business value. Foundational Ontologies delivered through technologies like Palantir Foundry are the solution.

Written by Dorian Smiley