Episode 3: Rock Schindler, CEO of SDRefinery AI on creating information from unstructured data

80-90% of all data within an insurance company is unstructured data (audio, video, notes, photos, etc.). This means insurers are running their businesses on less than 20% of all data within the firm. SDRefinery is looking to change all of that. SDRefinery AI is turning unstructured data into useful information. Listen in as Rock and I discuss how they do it and what the ramifications are for insurers.

Connect With:
Rock Schindler
SD Refinery AI Homepage

Musical Credits:
Shadows by David Cutter Music:
https://davidcuttermusic.com
https://soundcloud.com/dcuttermusic
Free Download / Stream:
https://bit.ly/shadows-david-cutter
Music promoted by Audio Library:
https://youtu.be/qiBHOiEl9EI

Video Credits:
Intro Stock Footage by Videvo:

Transcript

Nicholas Lamparelli
That's a bad five.

That should go on the blooper reel.

And we're back. The Coverager Podcast.

Thinking of calling it The Hot Seat, trying to ask the tough questions. My guest this week is Rock Schindler. Rock is the founder and CEO of SD Refinery AI. SD Refinery is refining data one word at a time. SD Refinery is using AI to mine unstructured text data found in all operations of an insurance carrier and turning that into usable information. You are a serious person for agreeing to be interviewed on a Saturday afternoon. Hey Rock, how you doing?

Rock
I'm doing great, Nick. Thanks for having me. It's a privilege to be here and I'm glad to take the time.

Nicholas Lamparelli
Yeah. So you're on the hot seat now. I'm going to start peppering you with questions. Where'd the name SD Refinery come from? I'm assuming SD stands for sentence data.

Rock
That's absolutely correct. And we, in fact, recognized the need to distinguish sentence data from other unstructured data. I think maybe we can get into that a little bit deeper now.

Nicholas Lamparelli
If you want. Let's, for the audience's sake, describe structured and unstructured data, and then segue your way back into SD Refinery.

Rock
Sure. So what we found in the industry is a lot of confusion around data. And what we recognize is that people understand metadata, which is a name and address; that's simple to get your head around. And then the industry relies heavily on what we call semi-structured data: a loss cause, a cause of injury. Those are codes that people assign based on reading information, and then systems use that code for reporting and for different purposes. And what we recognized is that the industry really didn't have a good way to understand unstructured sentence data. And unstructured data has grown to include video, voice, and pictures along with sentences. And as you may know, I think it's 80 to 90% of the data inside insurance companies, as well as other companies, that is unstructured. In fact, when they make the projection of 80 to 90%, they're talking about all industries. And I think we all recognize that the insurance industry has more written documentation and a lot more unstructured data than a lot of other industries. So 80 to 90% could be low. But the reality is, again, for us, we recognized that calling it unstructured sentence data was a way to bring clarity to what we're actually talking about. And so the SD in our title, SD Refinery, does in fact represent sentence data. And then the refinery component relates to the fact that what we're doing is enriching information into something that's very valuable. We recognize unstructured sentence data is a treasure trove of knowledge about what's happening with your business and with your customers. And so the concept of a sentence data refinery is one that made a lot of sense, not only for the insurance industry, but for a lot of other industries as well.

Nicholas Lamparelli
Yeah. I just can't go on to the next question without really trying to re-emphasize and understand this. You're saying 80 to 90%, and maybe that's on the low side, of all the data that's in an insurance company is unstructured.

Rock
That's absolutely right. And if you go query this on our good friend, Dr. Google, you'll find there's a ton of analysis and research that supports that number. And it's growing at just an incredible rate. If you look at the insurance industry, even by the most conservative measurement, billions of pages of unstructured data are being added daily. And then you put that in motion. And you think about the fact that, as human beings, the average person reads about 300 words a minute. You're a smart guy, you can probably do 400. But even with that, okay,

Nicholas Lamparelli
The fact is I've been on the same book for like a week, so I'm not sure about that.

Rock
So the fact of the matter is that, as human beings, we can't keep up with the amount of unstructured sentence data that's coming into the industry.

Nicholas Lamparelli
Yeah. The ramifications, though, are massive, because, you know, the promise of Insurtech, when people talk about data, predictive models, AI, a lot of it or most of the promise is around, you know, stuff you can do math with. Right? So it's the structured data. And what you're saying is that all of that promise is basically just scratching the surface, just dealing with the top, you know, 10, maybe 20% of the data that's in the insurance company, and the rest is just almost ignored.

Rock
Exactly. That's exactly right. And as I oftentimes say to people, if you understood risk perfectly, you could design the perfect system and know exactly what information to capture. But as we all know, and as all of us who work in the industry have experienced, risk is imperfect. It's nebulous, it's abstract, and it's continually changing. And we don't know the risk that's happening today, nor do we know what's going to happen tomorrow. And the ability to look at the unstructured sentence data gives us an unprecedented ability to understand and react to risk in a way that we never could before.

Nicholas Lamparelli
Yeah. We're going to get into that a little bit more. I want to rewind a little bit back to the beginning. Where did the idea come from? Can you give us the backstory?

Rock
Yeah, great question. So I started my career as an auditor with Peat Marwick. It was Peat Marwick Mitchell back in the day, and now it's KPMG. And we were taking samples of records from a huge population. We would take a small sample of 50 to 100 records, you know, out of 10,000 records, and then we would read it. And then we would cast judgment on the population based on that sample. And I always hated that. I thought it was terribly inefficient. And then I got recruited to work in the reinsurance broking industry in 1994, after Walter Schuetze, the head of the SEC, figured out that reinsurance was a lot more about banking than it was about risk transfer. And so there was a metamorphosis going on within the industry that made it much more of a quantitative, calculated process of moving risk from the primary to the secondary market. And I got involved in that movement. But I saw the exact same thing. Reinsurers would come in to do an audit, they would take a sample of records, and they would use that sample to cast a judgment, and it was just horribly inefficient. I could give you a bunch of stories where you could argue that the process led them to an incorrect conclusion. And so in the early 2000s, I was exposed to a technology that had been created by a group of five linguists from the University of Utah who had patented their technology to look at what we call sentence data. It's relatively easy to identify parts of speech. You can go to the internet and get a part-of-speech tagger, and it's simple: the nouns, the verbs, the prepositions. What they had patented was their ability to identify nouns that were the initiators of an action versus the recipients of an action. And that really becomes critical, because people write in a lot of different ways.
But the first time I saw that, I realized it could be a game changer for the insurance industry, because you could apply it to an entire population, and you could start doing some things that you otherwise couldn't do. And that for me was the start of the journey. I've spent 14 years on this, and SD Refinery, which I'm working with now, is my third entity, the third version, and it's the one that's got it all right, that's put together all the pieces in the right way. And that's exactly what we're doing now. The essence of it is being able to look at an entire population of records and measure things in a way that you never could before. And I can drill down into that some more.

Nicholas Lamparelli
Yeah, that's where I was going with that. So by all means, go ahead and talk about the type of information you can glean from your technology.

Rock
Well, that's by far the most challenging thing that the industry faces. And I want to take one step back, though, if I could, and challenge you a little bit, because if you think of where the industry is going, the insurance industry has figured out that they don't want to hear about a new technology. They could care less about a new technology. They want to figure out what the new technology can do for them. And if you go back and look at the Zurich North America innovation championship, we were fortunate enough to be selected as a finalist, but they had 1,358 applicants for their championship, which is, I think, 300% more than what they had last year. The reality is they're inundated with technology, as is every other carrier. So guess what, they don't want to hear about a new technology, because they could care less. What they want to know about is a use case and a way that they can make their business better. And what we've done more recently is really moved away from talking about technology, even away from talking about AI, because that does not matter. You can use a lot of different lingo: AI, machine learning, natural language processing, deep learning, on and on and on. And guess what, it doesn't matter. What matters is what data are you starting with, and what answers are you providing? And that's exactly where we're going. Now if you look at the answers that we're providing, for example, we will pull out what we call the events and activities. And Nick, when you held up a book earlier, you're reading that book.

In any written documentation, you can think of it as events and activities. Events happen, actions take place: events and activities. And so what we're doing is looking at any body of written information and identifying the critical operational events and activities. And once you do that, you set up an enormous amount of opportunity, for example, setting a milestone date for when an adjuster talked to a claimant, and then you can measure that across the entire population. But where I want to go with this whole concept, on the question that you asked a couple of minutes ago, what can we do with it, is the right place to start. So for anybody that is listening to this, okay, the right place to start is to ask yourself, what are the problems that you're facing today? And what are you doing to address those problems? And if you peel back the core operations of any insurance company, if I'm an underwriter, if I'm an adjuster, if I'm an executive managing that department, I have to rely on audits that I'm doing monthly, quarterly, semi-annually, for a lot of people annually if they're lucky, where, they will admit, I'm looking at three to five files out of maybe 150 to 250 that they might be managing. And I'm using that audit to cast a judgment on whether or not they're performing the things I've asked them to do. What we're doing is dramatically changing that, saying, okay, one of the criteria I'm looking for, to go back to where I was a moment ago, is how quickly did the adjuster confirm a contact with a claimant. So now what we can do is we can measure that time, we can establish an average for the population, and we can establish the highs and the lows. And if I'm going to go look at files to audit, well, guess what, I'm going to go look at all the people that are sucking, all the people that are taking way too long to contact a claimant, and figure out what's going on. So we're clearly bringing information back in a quantitative way.
It all starts, though, with being able to identify those events and activities going on with an insured, with a claim file, with an adjuster.
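
The measurement described above (time from claim opening to first claimant contact, averaged over the whole population, with the slowest files pulled for audit) can be sketched in a few lines of Python. This is only an illustration: the record layout, the field names, and the "more than twice the average" audit threshold are all assumptions, and the milestone dates are presumed to have already been extracted from the adjuster notes.

```python
from datetime import datetime
from statistics import mean

# Hypothetical milestone records, one per claim. In practice these dates would
# come out of the extraction step; here they are made-up sample values.
claims = [
    {"claim_id": "C-001", "opened": "2020-03-02", "first_contact": "2020-03-03"},
    {"claim_id": "C-002", "opened": "2020-03-02", "first_contact": "2020-03-04"},
    {"claim_id": "C-003", "opened": "2020-03-05", "first_contact": "2020-03-06"},
    {"claim_id": "C-004", "opened": "2020-03-01", "first_contact": "2020-03-13"},
]


def days_to_contact(claim):
    """Return the number of days between claim opening and first claimant contact."""
    opened = datetime.strptime(claim["opened"], "%Y-%m-%d")
    contacted = datetime.strptime(claim["first_contact"], "%Y-%m-%d")
    return (contacted - opened).days


def contact_summary(claims, slow_factor=2.0):
    """Summarize contact lag across all claims and list the ones to audit first.

    slow_factor is an assumed threshold: a claim is flagged when its lag
    exceeds slow_factor times the population average.
    """
    lags = {c["claim_id"]: days_to_contact(c) for c in claims}
    average = mean(lags.values())
    audit_first = sorted(cid for cid, lag in lags.items() if lag > slow_factor * average)
    return {
        "average": average,
        "fastest": min(lags.values()),
        "slowest": max(lags.values()),
        "audit_first": audit_first,
    }


summary = contact_summary(claims)
print(summary)
```

The point of the sketch is the shift Rock describes: every file is measured, and the audit sample is chosen from the outliers rather than picked at random.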

Nicholas Lamparelli
Yeah. So explain the difference. I think you have, but let's go a little bit further for those in the audience that are listening that may still not understand what exactly is happening here. How is it different than traditional character recognition software that's, you know, taking unstructured files, for instance, and just saying, well, we just read it and we just put it into a digital format? Can you give us an example of the additional enrichment that occurs through your process?

Rock
Yeah, great question. And I have to acknowledge, okay, I have worked in the industry for 34 years, and I'm about to tell you something I was oblivious to until I was on the phone with a carrier a couple of weeks ago and they shared this with me. Specifically, when the industry talks about OCR, what they're talking about is being able to pull information off of a document. So from a PDF or a scanned image, what they're doing is pulling the characters off of the document and pulling them into a system. And the right analogy, in my mind, relates to food. Okay, OCR is food preparation; what we're doing is food consumption. And we can build on that a little bit. But to go back to what I just learned a couple of weeks ago, this carrier has about 8,000 people, and they have anywhere from 10 to 15% of their people, so anywhere from 800 to 1,200 people, dedicated to dealing with incoming documents, trying to figure out: what is in the documents? Where is it supposed to go? How do we label it? And how do we make sure it gets to the right place? It's just mind-boggling to me that carriers are having to spend that amount of time and energy making sure the incoming information is consumable and getting to the right place, because it's coming in on a document. We've done a number of projects like this, and what we're doing is pulling information off the document. So a first step has to be to get the characters into a table, and that can be thought of as the OCR, the optical character recognition process. But think of it this way: if you spent all your time preparing food and never eating it, you would die of hunger. Right? So the food preparation is the equivalent of the OCR, and it needs to occur. But then in the second process, what we're doing is saying, okay, what is the written information telling you?
And for example, what we've been doing is pulling information off a medical record about when a claimant has a confirmed comorbidity or is being treated for a comorbidity, being treated for diabetes, being treated for anxiety or depression, versus when a claimant has been tested but has been proven negative, so there isn't that condition. Or another one is surgery: when has a claimant had a surgery, or when is a claimant being scheduled for a surgery? And that's what we're doing with our technology. The key thing that we're doing is distinguishing the context: it happened, did not happen, or may happen. And so when we pull out those critical events and activities, we're distinguishing, okay, the surgery may happen, so it's a future event, versus it's a past event. And that has significant implications for what an adjuster is or is not going to do, for example, with case management or follow-up or with reserves, all sorts of implications for how they're adjudicating the claim.
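
To make the "happened, did not happen, or may happen" distinction concrete, here is a toy Python sketch. It is not SD Refinery's method, which the conversation describes as linguistic technology; it swaps in a deliberately naive keyword lookup, and the cue lists, labels, and function name are all assumptions chosen for illustration.

```python
import re

# Assumed cue words (illustrative only, far from complete).
NEGATION = re.compile(r"\b(no|not|denies|negative for|ruled out|without)\b", re.IGNORECASE)
FUTURE = re.compile(r"\b(scheduled|will|planned|pending|may|recommended)\b", re.IGNORECASE)


def classify_event(sentence, event_term):
    """Label how an event term appears in one sentence.

    Returns "did_not_happen", "may_happen", or "happened", or None when the
    sentence does not mention the event term at all.
    """
    if event_term.lower() not in sentence.lower():
        return None
    if NEGATION.search(sentence):
        return "did_not_happen"
    if FUTURE.search(sentence):
        return "may_happen"
    return "happened"


notes = [
    ("Claimant underwent knee surgery on 3/4.", "surgery"),
    ("Claimant is scheduled for knee surgery next month.", "surgery"),
    ("Tested for diabetes; results negative for diabetes.", "diabetes"),
    ("Claimant is being treated for diabetes.", "diabetes"),
]

for sentence, term in notes:
    print(term, "->", classify_event(sentence, term))
```

A keyword lookup like this breaks quickly on real adjuster notes and medical records (double negatives, jargon, cues that belong to a different event in the same sentence), which is the complexity Rock returns to later in the buy-versus-build discussion.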

Nicholas Lamparelli
As you're talking about that, it reminds me of filtering and sorting structured data in Excel, where, you know, in a complex workflow, there are a lot of decisions that need to be made. And a lot of times you spend an enormous amount of effort just saying, well, the workload's big, how can we filter and sort this? So, for example, for letters and correspondence that come in via fax or via, you know, USPS, can we immediately split this out: this goes to claims, and this goes to underwriting? And then for the ones that go to underwriting, can we read it? And for the ones that go to claims: this one is a death benefit, this one is a PD, this one goes to auto. There's probably a tremendous amount of effort, as you were just talking about with all the work and the employee counts required to do that. I bet a lot of it is just sorting and filtering, and just getting it finally down to the expert who's going to adjudicate the final piece of it. And your technology sounds like it does a lot in those decision nodes, filtering and sorting. Like, hey, let us just quickly read it and we'll tell you. We can funnel these things off quickly, so you don't have to have such a massive staff just to read documents.

Rock
Yeah, so that's absolutely true. So I guess, to stay with our food analogy, it's a Saturday, we can get away with it, that might be whipping up some guacamole, because that's a pretty quick, efficient thing that we can do. But to go back to those applications on incoming documents, oftentimes what these carriers are running into is they don't know whether it's original content or a duplicate. And we've had a couple of situations where we were doing pilot contract work for carriers where they sent us documents, and we went through and pulled out all the duplicates, and they came back with a red face and said, well, we didn't even realize they were duplicates. Can you do that for us on a regular basis? Which of course we can. Yeah. So it's like, if anyone has gone through a carrier, I'd bet, as you sit in these individual siloed departments, you see how much work is duplicative. You know, how much work there is where there's just no good process to handle it. So that's...

Nicholas Lamparelli
Rock, that's probably occurring in, like, every department in an insurance carrier.

Rock
Absolutely. You're absolutely right. And it speaks to the inefficiency of the industry. As long as I have worked in the industry, I've seen the industry really get frustrated with its inability to have the right information at the right place and time. I can go back 34 years and probably count on one hand the number of companies I encountered that felt really good about their system, that said, okay, we've got exactly what we need to make the right decisions. It just doesn't happen. And the reason is that back office systems were never designed to capture and understand the judgment that people are applying to an underwriting file or to a claim file. All their judgment is captured in their sentence data. All the documentation that they're creating gives us the perfect window into what they're doing and when they're doing it. And that tells us if they're applying good judgment or bad judgment. And ultimately, as you know, Nick, because you've been in the industry a long time, good judgment produces good results, and bad judgment produces bad results. And so the quicker we can figure out who's applying the right judgment at the right time, the better we're able to put those people in the right spots, and then also go back to the people that are making poor judgments at the wrong time and apply corrective action to them.

Nicholas Lamparelli
Yeah, that's one.

Go ahead. I want to add to that, because it's still bewildering to me that potentially 80 to 90% of all the data in an insurance company is being ignored. So I would add, to poor judgment and good judgment, no judgment, which has all sorts of ramifications, right? This is the second time this week I'm going to throw this lyric out, the Rush lyric: if you choose not to decide, you still have made a choice. Right? And by having all this data sitting there and not doing anything with it, I would think, more likely than not, that's going to end up on the poor judgment side of the ledger, where bad things are going to end up happening because you didn't do anything with that data that was just sitting there. It's still bewildering to me that all of that data is there and we're focusing on the 10 or 20% to manage a trillion-dollar industry.

Rock
I completely agree with you, and we would kind of respectfully submit that it's almost like you have analysis fatigue on the structured data. The semi-structured data is there, and we've got all these great tools to analyze data, so we analyze the same data over and over. And guess what, there are a lot of great things coming out of the new analysis tools. But the fact of the matter is, by ignoring 80 to 90% of your data, you're still going to end up in the wrong place a lot of times, unintentionally. Yeah. And you could argue, Nick, that it's not being ignored, because you're tapping into it when you do a file audit. Okay, so you're touching a sliver. But here's a good analogy for you. When I started in business, back with Peat Marwick, it was just when personal computers were coming into being, and we used to use 10-key calculators to add, subtract, multiply, and divide. Shortly after the personal computer was introduced, the electronic spreadsheet, Lotus 1-2-3, VisiCalc, was introduced, and over a 10-year period of time, spreadsheets came to transform the way the industry worked with numbers, right? They turned manual clerical processes into thought-based problem-solving activities, because the spreadsheet was a platform from which you could do all these complex calculations. And so what I would say now is that any company that isn't using a tool like what we're offering is doing the equivalent of using a 10-key calculator to do a quantitative analysis rather than using the incredibly powerful platform to say, I can see the entire population in a way that I never could before, I can identify my critical events and activities, and I can much more quickly figure out where good judgment is happening and where poor judgment is.

Nicholas Lamparelli
Yeah. Okay, let's assume that someone's listening to this, Rock, and we've kind of hit a nerve with them, and they're completely in agreement with us. I want to try to have a segment of this, going forward, on buy versus build. Right? I want you to explain to someone that's listening, who's going to run back to their office on Monday and try to create an initiative to build something like this, why they shouldn't.

Rock
Yeah, it's a great question. And what I would submit, Nick, is that if you look at where we're trying to go right now, we are trying to give people simplicity. And that's what we're trying to show executives. What I would say is that behind that simplicity is a tremendous amount of complexity. And if you look at the number of people that are using the sort of approach and the technology that we are, there aren't very many, if any. In fact, a number of carriers have confirmed that we are exclusive in the way we're trying to tackle this problem. And the reason we're exclusive is because it's very complex. That's the reason, I would kind of humbly submit, that it's taken the number of years that it has to get us refined to the point now where we can give people something simple. In fact, I had a mentor one time early in my career who was helping me with problem solving. He said, Rock, you will encounter a lot of people in your career that are really smart. And the way to figure out the people that are really smart is that they're the ones that make it simple. And the ones that try to make it complex, you want to be careful with them, because they're not nearly as smart as they think they are. And I think that's kind of where we are with this sort of technology. There's so much complexity behind dealing with language and the way people write, all the jargon, all the nuances that go with it. And if you go back to kind of the 2005, 2006 timeframe, this is an interesting data point for you, because it's kind of when something called text analytics came about. And you can still find people that talk about text analytics. I never liked that term. I thought it was confusing. It's hard to say, it's hard to explain. In the early days, I would spend my precious 20 to 30 minutes with a prospect trying to explain text analytics, and you got nowhere.
But in those early days, people started focusing on what's called customer sentiment, and customer sentiment is really easy to figure out, respectfully. But even today, it's a billion-dollar industry where companies are providing sentiment analysis. And in those early days, though, there were some high-profile RFPs where the winner of the RFP was then selected to come in and negotiate a service contract, and ultimately they could not replicate the work that they did during the proof of concept. And that's a really critical problem, if you can't replicate your work. One of the things that we've done with our platform is we've built in 100% transparency. We can always show people how we got from point A to point B. And so when you take on that process of building something, and I will tell you, I've encountered many people that have tried and failed miserably at doing this. Those are people that are some of my best contacts, people that are continually looking for opportunities to help us bring solutions to them, because they know they've tried it, and they realize that they can't get to where they want to go. So for someone weighing buy versus build, that option to build is always going to be out there. But again, where I would go with it, respectfully, is that it's a dangerous place to go. We used to talk about it as being oceans of data, and unless you know where you're trying to go, if you get out on a ship in that ocean, you can quickly get lost and lose your way. And I think that's exactly what the industry has learned.

Nicholas Lamparelli
My guess is if I have this as a repeated question with all of my guests, the vast majority of the time the answer will be: don't build it, it's more complex and more time-consuming than you can possibly think it is. And like you said, it probably takes skill and art to make it look simple, but behind it, it's really complex. And that's part of the point of why I want to keep asking this question: to make sure that the, you know, senior leaders at carriers really think through, do I want a one- or two-year project? And how expensive is that going to be for them to even learn the basics, like the stuff that you learned that's now old hat, the easy stuff? They have to start from scratch and learn that right from the beginning, and it's probably a lot more complex than they think it is.

Rock
Yeah. So in the spirit of Jim Collins and Good to Great, what I would respectfully submit is that our hedgehog is our ability to contextually extract information and apply that across a big population. And we can be the best in the world at doing that. And that's especially true in the insurance industry. I've had a lot of wrestling matches with my software architect and my data analyst, and the ongoing struggle that we've had is about, on the back end, how do we package our information? Because it's very easy for people to spin up a really sexy-looking dashboard, something that has really cool graphics. That's easier than it's ever been before. And for us, what we're doing is that really critical core work. Where I always went with my software architect and data scientist is that we cannot be the best in the world at creating dashboards and all those kind of sexy things. We can in fact be, and we are, and I say that humbly, I think, the best in the world at being able to identify and extract critical operational events and activities, from which you can do all sorts of things that you never could before. And that creates a sea change in the way that management can manage risk in insurance: manage underwriters, risk services, adjusters, and, most importantly, their relationship with their insured customers and agents.

Nicholas Lamparelli
That's a good way to segue into the final question, which is: should we have interested anyone listening to this, who thinks this might be something that could help them in their business, what does an engagement with SDI look like? Does it require a big investment in hardware, software, organization? Talk about how you've been engaging.

Rock
Yeah, that's a great question. So we are what we call a product as a service, meaning that we are a cloud-based platform. And the way we engage with clients is on a subscription. So what we're looking for is that monthly subscription. And we would say it's relatively easy to engage. We don't require them to implement any hardware or software. And I like to go back to the Excel spreadsheet analogy: just like you download quantitative information or numbers to put into a spreadsheet, we're talking about the same thing here. Yes, we are going to have to work with the folks in IT to export or download the sentence data, wherever it's stored, whether it's in documents, whether it's in a database table, wherever it is. We're going to download it, and that's the same thing that an executive would do if they were pulling information out to do an analysis in an Excel spreadsheet. So in that context, then, we can provide information back in the way of a flash report or an event activity scorecard, or something that's even more detailed, a file that they might pull into an existing table to which they could expose their AI tool, their predictive model tool, whatever they want. So for us, if you think about it, Nick, here's a good way to talk about it. You know, we're not about the front end. We don't care where the sentence data originates; it can come from multiple disparate sources. Okay. And likewise, we don't really care about where it's at on the back end, because again, we can serve it up as a flash report or an event activity scorecard that can get served up, you know, through an email trigger or through a web browser. We're really about that core, and I haven't said this before in our session here today, but it's cleaning, tagging, and extracting. That's really what it's about. And we're doing that in our environment.
And we don't want to get technical. What we want to do is give operational executives the ability to get information that they care about. And we're doing that in that middle piece, not about the front end, not about the back end. We can accommodate whatever a client has there. We're about executing in that middle. That's where our secret sauce is. That's our hedgehog.

Nicholas Lamparelli
Fantastic way to end it. I've learned a lot, which is part of why I want to do this. You know, as we discussed before we even started, part of the way I structure my questions is so I can try to understand more about this. And so now I walk away with that 80 to 90% that's potentially not being used. You imagine some sort of crisis within the company, where in that 80 or 90% is useful information that someone couldn't quite get to because it wasn't in a spreadsheet or wasn't in a database. So, let's do that. Thank you so much. Thank you for taking time out of your Saturday.

Rock
Yeah. Nick, one more point. You mentioned crisis. Okay. So go back to what happens when a crisis hits. If results get really bad, what do people do? They go back to the underwriting files, they go back to the claim files, they go back to the risk service report, and they say, what the hell did we miss? What are we missing? Why are we getting obliterated?

Nicholas Lamparelli
Right, let's not miss it.

Like, yeah, going forward. Yeah.

I completely get that.

Rock
Yeah. But Nick, thank you for your interest in learning. I appreciate your appetite for learning. I appreciate your interest in what we're doing. And thank you for this opportunity to talk. I appreciate that.

Nicholas Lamparelli
Thank you for sitting in the hot seat.

I think that's gonna be the new name.

There you go. Okay. Take care, Nick. Bye, everyone.