🔌 Applying Circuit Simulation and AI to Grid Interconnection and Resiliency #023
Interconnection and Grid Resiliency are critical to decarbonizing the grid
TLDR: Against a backdrop of more extreme weather events and increasing demand for electricity, the aging grid needs to rapidly decarbonize while maintaining reliability. I interview David Bromberg of Pearl Street Technologies and Mish Thadani of Rhizome to learn more about interconnection and grid resilience.
Unless you’re an energy nerd, you only ever think about the grid when the power goes out or when you spot utility linemen cutting tree branches in your neighborhood. As consumers, the fact that we never need to think about electricity is an amazing feat of modern technology. I know my phone is going to start charging when I plug it in and the lights will turn on when I flick the switch. Unfortunately, this magical abstraction that pipes power everywhere is now at risk due to a confluence of factors.
What is the grid?
The grid is an abstraction made up of multiple stakeholders who collectively work to ensure reliable electricity delivery. The US power grid consists of three fragmented interconnections (Eastern, Western, and ERCOT), which operate largely independently of one another. The next layer consists of 12 different transmission planning regions that coordinate, control, and monitor the grid across many states. This fragmentation furthers an us-vs.-them mentality and limits the ability to plan in a centralized and coordinated manner. Furthermore, in the US, there are over 1,600 electric utility companies that maintain the lines, wires, poles, and transformers delivering electricity to homes and buildings.
Brian Potter of Construction Physics sums it up well: “Today, the electrical grid has over 500,000 miles of high-voltage transmission lines and more than 5 million miles of lower voltage distribution lines, which supply power from nearly 6,000 large power plants. Together, this system supplies more than 4 trillion kilowatt-hours of electricity to the US each year. The extent of it has led the US electrical grid to be called ‘the largest machine in the world.’”
The grid is being pulled in different directions
Like any complex large-scale infrastructure, managing the grid has never been easy, so why now in particular? The easiest way to explain what’s going on right now is with a meme:
The grid is aging and runs on old software
The majority of today’s grid was built in the 1960s and 1970s. With a projected 50-to-80-year lifespan, the time to replace large amounts of grid infrastructure is upon us. In some cases, we’re really pushing our luck. The Whiting hydroelectric power plant was built in 1891 and is still running. The deadliest wildfire in California history, the 2018 Camp Fire, was caused by a PG&E transmission line built in 1921.
Not only is the physical grid aging, the software that utilities rely on is also falling behind. Software built for a simpler, centralized grid has struggled with increasing decentralization and complexity as intermittent generation comes online and energy demand rises.
Extreme weather events are becoming more common and more destructive
By using billions of dollars in damages as a proxy, the NOAA shows that things are getting worse.
The implications of extreme weather events on the grid are quite severe. In 2021, winter storm Uri caused 70% of Texans to lose power, 50% to lose water, and cost upwards of $130B in economic damages. As global warming worsens and extreme weather events become normalized, the need for reliable electricity grows. On a 107°F summer day in Arizona, the utility shut off 72-year-old Stephanie Pullman’s power due to an unpaid electricity bill. Her body was found days later. More extreme weather also means more health complications. A [study of the Virginia healthcare system](https://www.americanprogress.org/article/the-health-care-costs-of-extreme-heat/) found that extreme heat adds approximately $1B in health care costs every summer. As extreme weather events intensify, maintaining a healthy power grid will become even more challenging and even more important for keeping our communities safe and healthy.
Utility-scale renewables and distributed energy resources are creating complex logjams in the interconnection queue
Times were simpler before renewables. Prior to the proliferation of solar and wind, generation and distribution was straightforward. Fossil-fueled power plants could be turned off and on based on demand fluctuations and there were far fewer nodes on the electricity generation map. Nowadays, with utility-scale solar, wind, and storage facilities, grid balancing is far more complex. Due to the variable nature and sheer scale, planners have to think through many more permutations of grid scenarios to ensure reliability.
Other distributed energy resources like rooftop solar, EVs, and heat pumps change the paradigm of how the grid has historically operated. Power used to only be generated in large central locations that flowed into the distribution system via high voltage power lines. Those power plants still exist, but now there are a lot more smaller decentralized nodes of both generation and demand. In the case of rooftop solar, net metering policies allow homeowners to sell back excess electricity to the grid. This new behavior of bidirectional flow is something that the actual grid and grid simulation software isn’t prepared to handle today with full fidelity.
This brings us to interconnection, the required process by which renewable project developers request to be connected to the grid prior to construction. The average time spent in the queue was under two years in 2008 but has since increased to five years as electrification has ramped up. The interconnection queue is currently slammed with requests: there are about 2,000 gigawatts of generation potential waiting to be connected to an existing capacity of 1,250 gigawatts.
Here’s an example of what could happen: In 2017, the Oceti Sakowin Power Authority, a project developer based in South Dakota, put down a $2.5M deposit, thinking that would be the total costs associated with interconnection. Five years later, the regional grid operator finally came back with the new $48M price tag for all the additional required transmission upgrades. They were only given three weeks to come up with the extra $45.5M and understandably dropped out of the interconnection process.
Unfortunately, this story is not uncommon. As a result of the unexpected fees and long delays, only 14% of solar and 20% of wind projects are actually completed.
Demand for electricity is increasing and demand profiles are changing
For the next decade, demand for electricity will increase by an average of 3.7% per year as gas guzzlers switch to EVs and fossil fuel furnaces shift to heat pumps, among other electrification trends. Not only do grid operators have to retire fossil fuel power plants while adding new renewable projects, they also have to ensure net generation capacity continues to meet demand.
Also, consumer behavior will shift due to electrification in terms of both time and geography. Taking EVs as an example, the curve of electricity demand throughout the day will shift as drivers plug into L2 chargers for overnight fueling. On a geographical basis, charging depots and warehouse districts will shape demand as fleets electrify.
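To put that growth rate in perspective, here is a quick back-of-the-envelope calculation (assuming, for simplicity, a constant 3.7% compound annual growth rate over the full decade):

```python
# Compound the projected 3.7%/year demand growth over a decade.
# Assumption: a constant compound growth rate, which is a simplification
# of the actual year-by-year projections.
growth_rate = 0.037
years = 10

multiplier = (1 + growth_rate) ** years
print(f"Demand multiplier after {years} years: {multiplier:.2f}x")  # ~1.44x
```

In other words, a 3.7% annual growth rate compounds to roughly 44% more electricity demand by the end of the decade, all of which has to be met while fossil plants are retiring.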
Introducing Pearl Street and Rhizome
With the context out of the way, we can finally get into the real meat of it.
Pearl Street Technologies builds interconnection software to safely and reliably add new utility-scale renewable projects to the grid. For grid operators and utilities, their software helps to accelerate modeling and simulation studies which are frequently the bottleneck for the interconnection queue. For project developers, Pearl Street’s scenario analysis and risk assessment capabilities offer certainty throughout the interconnection lifecycle. In building for both grid operators and project developers, they aim for consistency so that everybody is on the same playing field.
Rhizome helps electric utilities assess what their climate risks are in precise terms and assign dollar values to that risk. They’re currently focused on the distribution part of the grid. By ingesting datasets on historical events from the utility, combined with past weather forecasts, they’re able to simulate what would’ve happened if additional resiliency investments were made on time. By incorporating climate models of varying time horizons and geographical scale, they’re able to backcast climate risk and offer grid upgrade options across a variety of scenarios. Since grid resilience plans need to be approved by regulators, Rhizome provides detailed cost-benefit analysis so that they can justify their proposals.
The reason why I decided to write about interconnection and grid resilience in the same issue may not be immediately obvious. Rather than view Pearl Street and Rhizome as two siloed solutions, I see them both converge at the same problem. Antiquated grid software hasn’t been able to keep up with interconnection studies or grid resilience planning. A common phrase in climate is “physical problems require physical solutions”. Well, here’s an example of two physical problems existing because of bad legacy software. While Pearl Street is focused on interconnection and Rhizome is focused on resilience, I see some similarities. At a high level, they’re both solving complex grid problems through simulation — something that software is really good at. It’s expensive to test anything that’s asset-heavy in the physical world. It’s also not wise to A/B test the grid because that could be pretty catastrophic.
Simulating the grid with circuit design simulation software
David from Pearl Street: Pearl Street's background comes from simulation and design tools for computer chips. A computer chip is physically small, but under the hood, they are massively complex electronic systems with tens of billions of components that have extremely complicated physics. The software platforms that are used to design, simulate, and optimize those chips work really well because there's been decades of investment in R&D.
Companies like Intel, Apple, or Qualcomm can't just go out and build a new line of chips unless they are confident that those circuits will work when they come out of the fabrication facility. This incentivized the development of powerful software for designing and simulating computer chips.
When we started the research at Carnegie Mellon that would later lead to the founding of Pearl Street, we approached this from an academic perspective of taking algorithms from computer chip simulation and applying them to power grid simulation. We wondered, “What if we tried to model and simulate the power grid like every other circuit on the planet?” Over time, we were able to show powerful advancements in grid and transmission planning, leading to the development of our first product, SUGAR™, an automation tool for power grid analysis and generation interconnection (GI) studies.
Simulating asset fragility using historical outages and weather forecasts
Mish from Rhizome: We ingest datasets from the utilities to characterize their assets and what component types they have across the system. If we can get any maintenance and inspection records (which many of the large utilities have), then those are also extremely helpful for understanding the current state of infrastructure. We also take a historical record of their outages to help understand when those assets have failed as a result of some sort of extreme weather stressor.
So we merge that with a variety of high-resolution historical weather datasets to understand the causal nature of how utility assets fail. The drivers are primarily geographic, so vegetation is a really important one. Understanding the distance to power lines, tree species, and the height of the canopy is crucial.
note from Matt: just over 50% of all power outages are caused by vegetation combined with wind, storms, hurricanes, etc. When overhead power lines make contact with vegetation, it creates a spark and can cause a wildfire.
We also look at the topography and terrain. In some cases, soil conditions might also be somewhat informative. And it's all to construct bottom-up machine learning algorithms that take these datasets and identify patterns of failure. We refer to this concept as asset fragility. We're aiming to quantify asset fragility for a variety of extreme weather threats and then couple that with climate projection data. These are global climate models that are dynamically downscaled to project what that risk profile and asset fragility are going to look like in the future.
What we turn around is a probability metric for every lateral segment of a utility’s distribution system. If you think about the breakdown of distribution, starting with the substation, you have a main backbone feeder, usually a three-phase line. Then you have single-phase lines, which are what you picture as the power lines going down your neighborhood street; those are laterals. We can get a probability of failure for each lateral segment because that's how utilities make investment decisions on the grid. Once we have that probability, we then layer on a consequence: if a given asset were to fail, what does it mean in terms of power outages and the anticipated restoration costs in labor and materials? That's how we get to a financial metric of risk at the granularity of the lateral level.
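The risk metric Mish describes boils down to probability times dollar consequence for each lateral. Here is a minimal sketch of that calculation; all names, cost figures, and probabilities are illustrative assumptions, not Rhizome's actual model:

```python
# Sketch of the lateral-level risk calculation: annual risk in dollars
# equals the probability of weather-driven failure times the dollar
# consequence of that failure. All numbers are hypothetical.

from dataclasses import dataclass


@dataclass
class Lateral:
    segment_id: str
    p_failure: float          # annual probability of weather-driven failure
    customers: int            # customers served by this lateral
    restoration_cost: float   # expected labor + materials cost if it fails ($)


# Assumed dollar value of a customer outage; a real study would derive this.
COST_PER_CUSTOMER_OUTAGE = 250.0


def annual_risk_dollars(lat: Lateral) -> float:
    """Risk = P(failure) x (restoration cost + outage cost to customers)."""
    consequence = lat.restoration_cost + lat.customers * COST_PER_CUSTOMER_OUTAGE
    return lat.p_failure * consequence


laterals = [
    Lateral("LAT-001", p_failure=0.04, customers=120, restoration_cost=30_000),
    Lateral("LAT-002", p_failure=0.01, customers=800, restoration_cost=55_000),
]

# Rank laterals by annualized risk to prioritize hardening investments.
for lat in sorted(laterals, key=annual_risk_dollars, reverse=True):
    print(lat.segment_id, round(annual_risk_dollars(lat), 2))
```

Note that a low-probability lateral can still rank first when it serves many customers, which is exactly why the consequence layer matters as much as the fragility model.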
AI for the grid
Mish from Rhizome: Over the past 20 years, utilities have been implementing systems to house information related to their assets and to capture outage information on their systems. These have been deployed by major software companies like Oracle, Siemens, and Schneider Electric over the last 10 to 20 years. The trouble is largely around data quality and completeness, and we run into that quite a bit. If you have incomplete data, then having a large enough sample size is something we look for. Machine learning is well suited to weeding out variables that might not make sense and adjusting for inconsistencies in datasets, which makes it a decent technology to use under the circumstances of unclean data. But then we want to enrich that dataset as much as possible. So we’ve been applying some generative AI techniques when there are missing assets to approximate the asset component type, age, and condition.
A cascading series of assumptions in interconnection
David from Pearl Street: The interconnection study of a specific cluster, let’s say a 2022 cluster, still involves all of the generators that came before it. So the 2021 cluster, 2020 cluster, etc. could still be under active study, meaning they're not real generators on the grid. They're just generators that are further along in the interconnection study process, but are still hypothetical.
This is where things get complicated. Not only do you have to build the model for the 2022 cluster, but you also have to keep track of the prior years’ clusters because all of the grid upgrades that are associated with projects in these clusters are hypothetical. Projects from these clusters could still withdraw or change capacity and service type, resulting in changes to the 2022 cluster’s base model and cost allocations.
This type of analysis is very complex and can take weeks-to-months of engineering time to get to a solved state. Our software was developed to automate these types of time-consuming processes, providing robust scenario analysis given user-defined assumptions, such as the aforementioned changes.
Matt: It's like a cascading series or a waterfall of assumptions. One assumption rests on another assumption.
Game Theory applied to interconnection clusters
David from Pearl Street: Developers, at least to some extent, are competing with other developers in the same interconnection cluster to get connected at low cost. But not every project matters equally: projects you aren't sharing potential costs with matter much less.
If your project is sharing a potential grid upgrade with another project, that matters a lot because if they withdraw, maybe the constraint goes away and you don't have to pay anything anymore if you stay in. Or maybe they withdraw and now the constraint is still there, and instead, more of the cost is allocated to you. This gets even more complicated when it's not just two projects, but five or ten projects sharing costs for a specific set of grid upgrades.
Our Interconnect platform allows developers and other stakeholders to strategize the actions they should take to secure the best outcomes. With interconnection, this looks like a developer using Interconnect to evaluate different cost profiles based on scenarios such as withdrawals in current or prior-year clusters, and capacity or service changes.
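The withdrawal dynamics above can be made concrete with a toy scenario. The sketch below assumes a simple pro-rata-by-capacity allocation rule for a shared upgrade; real cluster studies use tariff-specific allocation methods, and all project names, capacities, and costs here are hypothetical:

```python
# Toy scenario analysis for shared interconnection upgrade costs.
# Assumption: the upgrade cost is split pro rata by project capacity (MW),
# a simplification of actual tariff-defined cost allocation rules.

def allocate_upgrade_cost(upgrade_cost: float,
                          projects: dict[str, float]) -> dict[str, float]:
    """Split one upgrade's cost across the remaining projects by MW share."""
    total_mw = sum(projects.values())
    return {name: upgrade_cost * mw / total_mw for name, mw in projects.items()}


cluster = {"Solar A": 200.0, "Wind B": 300.0, "Storage C": 100.0}  # MW
upgrade = 60_000_000.0  # shared transmission upgrade cost ($)

baseline = allocate_upgrade_cost(upgrade, cluster)

# Scenario: Wind B withdraws, but the upgrade is still needed, so the same
# cost is re-allocated across fewer projects and everyone's share rises.
remaining = {k: v for k, v in cluster.items() if k != "Wind B"}
after_withdrawal = allocate_upgrade_cost(upgrade, remaining)
```

In this toy case Solar A's share doubles from $20M to $40M when Wind B drops out, which illustrates why developers care so much about who else is in the cluster and how likely they are to stay.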
Speculative projects clogging up the interconnection queue
Matt: You previously said, “If you're not getting your study results for years, that's a lot of risk and uncertainty. So to hedge your bet as a developer, you may submit multiple positions into the interconnection queue in the hopes that one of them pays off.” This reminded me of when I was in Bangalore, India and had to request multiple Ubers and Olas just to get one car due to mapping and marketplace issues. What’re the implications of multiple requests in the interconnection queue per project?
David from Pearl Street: That’s one of the big issues from the grid operator and utility side of things, which they refer to as speculative projects. Speculative projects are projects that the developer has submitted into the interconnection queue with possibly no intention of actually building them, or little hope they can be developed. The problem with that is by adding more volume to the queue, you're increasing the time and complexity of the studies which just adds more uncertainty. If you're a developer, you don't know if that other project is speculative or not and maybe you're sharing costs with it. It's part of this vicious cycle where developers may hedge their bets by having multiple queue positions which just makes the studies take longer. It can get very messy.
Backcasting on historical utility outages and weather events
Mish from Rhizome: Every time we do an implementation with a utility, it always starts with a backcast analysis. This gives us the ability to create models to understand the performance of investments they've made historically. By using a double-sided machine learning technique combined with a pre- and post-analysis on investments, we can identify where the utility has benefited from its investments. Consequently, if they made an investment in a part of the grid prior to a storm hitting, we can actually simulate those storms and anticipate what the reduction in risk would have been. The backcast analysis is fundamental to getting to that baseline level of climate risk across the system, and then the climate projections come in afterward to project forward what that risk profile looks like.
Matt: How do you take the backcasting and demonstrate value starting from the present and then forecasting into the future? Because you can't really A/B test the real world, right?
Mish from Rhizome: Effectively, what we do is a validation of the historical backcast. We have validation metrics for how well our model anticipates the fragility of an asset as a function of the extreme weather that occurs. Say you have a given level of wind gusts over a period of time; you can anticipate, within a certain degree of confidence, which overhead lines are likely to be impacted. Then climate modeling is important to understanding what future extreme weather events are going to look like and how likely they are. So there are two types of probabilities at play when looking to the future.
First, there's the probability of an extreme weather event like the concept of a 1 in 100-year storm or a 1 in 10-year storm. Those profiles are changing, which is why it's important to model the probabilities of intense extreme weather events occurring.
Second, assuming that you have the probabilities that these events will occur in the future, you can calculate the likelihood that it'll cause failure in the system. Since we've already done the backcast analysis, we actually already have an idea within a certain confidence of what the results are likely to be.
So it's probabilistic by nature, but we are able to forecast from a validated asset fragility model and then also a model for extreme weather that can be coupled onto it to further understand probability of failure. We look at it as an annual risk metric — so what's the probability that this asset will fail in any given year? Then that changes from year to year moving forward.
Validation of climate modeling
Matt: How do you think about the reliability of the climate models and the weather forecasting that you're leaning on? Do you take an indexed approach where you're not purely betting on one source? How do you think about this from an upstream data perspective?
Mish from Rhizome: We look for the best weather datasets [across weather stations, NOAA, or local universities] that are available in the specific region we're working within. We validate everything. We're trying to get to the best predictive model using the datasets that represent the actual conditions. When you look at climate and weather datasets, not only do you have to ensure that you're getting good resolution both temporally and spatially, you also have to make sure you're correcting for any biases that may exist within a specific climate model. We prefer to have at least six months and up to three years of historical data on both weather and climate with real-time observed conditions.
We're looking at it from a historical perspective, which we have good data on and can always check against ground truth. We're also looking at climate projections at the multi-decade scale. For the multi-decade climate projections, there is a lot of uncertainty the further out you go. However, since we've been collecting data related to climate projections on temperature, pressure, wind, etc., we can start validating recent years against what we've actually been observing.
The importance of good workflows — it’s not just about AI and fancy climate models
Mish from Rhizome: We started this company with the planning tools format because we wanted to build a product that utilities are really going to enjoy using and stick with. It's not all about the modeling. Obviously the modeling is extremely important, but what's tough for utilities right now is all of the multi-team collaboration and all of the time sunk with the limited tool sets they have to create billion-dollar capital plans annually. These are annual plans that require progress tracking, reporting, collaboration, and data ingestion. We've built a platform that increases the efficiency of those teams while providing them with a better modeling solution to make better decisions.
So in our view, we're a decision intelligence tool. Decision intelligence is only as good as the information and the quality of that information, as well as the ability to integrate that information into actual decision making. We're trying to satisfy that entire decision intelligence framework to ultimately help the grid become more resilient.
Overlap between Interconnection and Climate Resilience
Mish from Rhizome: There is this concept that the more intermittent renewables that you bring onboard, the lower resilience you have. It's highly dependent on the characteristics of the grid conditions. The grid has been stress-tested as time goes on with longer spells of heat waves or the Texas freeze for instance. It's convenient to blame one resource or another for those resilience challenges.
I believe you have to do the scenario modeling to understand when, and under what conditions, resources are going to be responsive. The intermittency of solar and wind will, in some instances, contribute to lower resilience than firm generation under certain circumstances. But when you have a cold event, wind tends to be more productive and longer lasting. With solar, especially during extreme heat days, the coincident peak of solar along with temperature is not entirely perfect, but it's really close. So having solar available on those days of extreme heat can actually add to resiliency. But what's going to truly, flexibly increase the resilience of the system is storage. There's no one-size-fits-all technology to improve resilience. Rather, you have to model each part of the grid and identify what that is.
How optimistic are you about the grid transition?
David from Pearl Street: What gives me optimism is the fact that the DOE, FERC, grid operators, utilities, transmission businesses, and renewable energy developers are all aligned in the need for transmission build out. There's been some real progress made in the past couple years.
A great example is MISO and SPP coordinating on their JTIQ effort. It's basically looking at the seams between those two regional operators and figuring out what new transmission is needed to support generator interconnection requests where the two systems meet. MISO is also conducting its own long range transmission planning and looking at potential needs of their footprint independent of their interconnection studies.
There's alignment that this needs to be done and it's not just talking heads saying we have to do this. The actual stakeholders agree and they've made real progress on getting it done. So, yeah, I'd say cautiously optimistic.