Data Science masterclass with Shifra Isaacs

In this masterclass, Shifra Isaacs, Developer Relations Advocate at Ascend.io, draws on her experience as a data scientist, technical writer, and support lead to provide fresh insights for FP&A and finance professionals.

In this episode:

  • Data science vs business analytics 
  • Pulling data not yet able to be modeled 
  • Python for Excel 
  • The right models for risk scoring, variance analysis and forecasting 
  • Replicating a process with a new tool using AI
  • How we can survive in an AI-first world

Connect with Shifra Isaacs on LinkedIn

Full blog post and transcript

Glenn Hopper:

Welcome to FP&A Today, I’m your host, Glenn Hopper. Today’s guest is Shifra Isaacs, a developer relations advocate at Ascend.io, where she empowers data engineers through automation, education and communication. Her background spans data science roles at ECT and JPMorgan Chase, analytics at PROS, and technical support at Sigma Computing. Shifra is also a skilled technical writer, having created content for Crash Course and DataLemur, where she’s known for making complex data concepts accessible and engaging. Whether she’s building models, writing code, or teaching others, Shifra brings a passion for data and a gift for demystifying it. We’re excited to have her on the show to help introduce the world of data science to FP&A professionals and explore how finance teams can tap into these tools in the era of AI. Shifra, welcome.

Shifra Isaacs:

Wow. Thank you so much, Glenn, for that absolutely glowing introduction. It’s really cool to be here. Thanks for having me. My roots are actually in finance, which we can talk a little bit about. So it’s cool to be getting back to those roots a little bit.

Glenn Hopper:

So listeners of the show know that I keep veering towards data science and data and analytics. I think of the Venn diagram of FP&A and data science: there’s a lot of crossover, but there are differences. And I keep pushing the idea that with generative AI, we’re gonna run out of excuses to not become data scientists because

Shifra Isaacs:

Yeah,

Glenn Hopper:

If you don’t have to learn Python, you don’t have to learn SQL, you can do it in natural language. It’s making it more accessible. So that’s why you’re on the show, and I really appreciate you coming on. I did not know when I asked you that your roots are in finance. So before we dive into the planned questions, tell me a little bit about that.

Shifra Isaacs:

Yeah, absolutely. So I got into Rutgers University School of Arts and Sciences, and also Rutgers University Business School, when I was applying to colleges. And I’m a pretty indecisive person, which is something that I’ve been working on the last couple of years. I decided to pick the school that had the fewest majors so that I wouldn’t be stuck <laugh> between a hundred different majors to choose from. The business school had six, and the Arts and Sciences school had like 150 or something. So I went for the business school, and that means that I was really immersed in finance, accounting, supply chain management, and all these areas of the Rutgers Business School majors. I took classes, I did internships. I eventually worked in AI and data science at JPMorgan Chase for an internship. So I feel like that’s probably the biggest intersection between our worlds here, which we can definitely get into. And I think it really helps to understand debits and credits and the other side of a transaction when you’re talking about, you know, the cost of data infra, things like that. It really helps to speak the language of your finance stakeholders as a data scientist or analyst. So there are a lot of valuable tools for each side to learn from the other.

Glenn Hopper:

Yeah, and I think having spent time in another domain shifts your brain a little bit toward real-world examples. You saw the kind of data that was being used in finance, and maybe that feeds into your data science knowledge as well. So let’s see, you started in business. How did you get to business analytics?

Shifra Isaacs:

Yeah, so business analytics was a major at Rutgers, or it still is, rather, and I actually got into it through a very unorthodox path. I was super interested in music, and there was a data science society club that had just started at my school, led by someone in the business analytics major called Sina Anon. She’s awesome. You can look her up. I think she’s a cool founder now. Back in the day she was hosting a seminar workshop with a guy who worked at Spotify, a data scientist. And I was really much more going for the Spotify piece than the data science piece, ’cause I originally wanted to work in music. But when I heard him talking about statistics and modeling and solving business problems, it resonated with me unexpectedly, for a different reason: I took statistics in high school, and I really liked the idea that statistics is kind of the closest thing we have to truth, and that we don’t prove things in science. We kind of disprove things and work from there. And once I knew that this was a career path that I was like one step away from pursuing, I was like, this seems really cool. I should go for this and see where it takes me.

Glenn Hopper:

That’s interesting because one of the first exposures I had to data science was a study one of my professors had done using stylometry. I’m not sure of the pronunciation, but it’s a statistical method, and he used it to analyze Beatles lyrics to determine which lyrics were written by John Lennon and which were written by Paul McCartney. That was one of the first studies I saw, going through and analyzing the lyrics. And I was so green in data science that I couldn’t even fathom how that was done, but it was an interesting way to come into data science, through the music.

Shifra Isaacs:

Yeah, that’s super cool. I have to look that up after this. I’ve also done like basic lyric analysis, but that sounds like a whole set of different methods. It sounds really cool.

Glenn Hopper:

Alright, well, let’s get into your background and why I’ve asked you to come on the show today. Since finishing school, you’ve moved around to a couple of different jobs. So you’re at Ascend.io now. Tell me about what you’re doing there.

Shifra Isaacs:

Yeah, absolutely. So Ascend.io is the unified data engineering and agentic data engineering platform. We actually just had our big launch this week, where we launched all of our cool AI and agentic features to really make AI native to data engineering, rather than just slapping a chatbot in the sidebar like a lot of companies do. We’ve done a lot more than that. And what I do at Ascend is pretty much developer relations, or DevRel for short. That position really sits at the intersection of marketing, engineering, and product. So I’m building product narratives, I’m building developer community and building trust with that community, and then also working on engineering efforts, maintaining a ton of documentation, and really getting to touch a lot of different areas because the company is 17 people.

Glenn Hopper:

Wow. Yeah. So that’s a cool time to be at a company, when they’re that small and you can move across the board and wear a lot of different hats. Some of my best learning, better than I got in business school, was being in the startup space. So, super cool. It’s interesting that you have these kind of crossover skills, because in FP&A there are requirements for a lot of different skills too. I know you have a technical writing background, and technical writing kind of goes along with what we do in FP&A: it’s great if you can build a wonderful model and do all sorts of forecasting and analysis on data, but if you can’t convey that story, it falls short. When I started my career in the stone age, we didn’t really talk about that, but technical writing actually goes really well with data science and FP&A.

Shifra Isaacs:

Totally. And in the data space, we call it data storytelling, where you can, again, build the coolest machine learning model in the world or do the best SQL analysis of all time, but if you can’t convey it and get executive-level buy-in from people who are not necessarily familiar with all the technical details of what you’re doing, then that’s not gonna bring value. That’s not gonna go anywhere. One of my favorite data scientists, Tina Wong, who used to work at Meta, always says that as a data scientist, you’re basically a lawyer defending your project and why it matters. And I’m sure that FP&A people and people in finance need to do the same thing all the time.

Glenn Hopper:

Yeah, it’s funny. It’s not just defending your analysis, it’s also asking the why questions: ask why, why again, why again, to keep digging deeper. It’s the same kind of motivation, it’s just using different tools. I mean, I get it, you know, if you went and got an MBA or a master’s in finance or a master’s in accounting, you’ve already chosen your path and you’ve got your domain expertise that you’re working on. And then to hear, well, now you have to learn to write Python and write SQL. Actually, more and more FP&A people can write SQL queries, because that’s the way we get to so much of our data. I still love Excel, and a lot of FP&A teams really rely on Excel and traditional driver models. But I want to start exposing them to more, and I wanna talk about classical machine learning and all that before we dive into generative AI. So maybe in a sentence or two, tell us why finance professionals should care about data science methods, maybe now more than ever.

Shifra Isaacs:

Totally. So I would say that a lot of what I’ve heard about financial modeling and forecasting specifically is pretty much watered-down data science. It’s like if you could learn data science in a week instead of a year: you would be learning how to predict based on the last three months of sales rather than doing a full-fledged time series model with a fancy method like ARIMA, which stands for autoregressive integrated moving average. Basically, you’re going halfway toward data science without full sending it. So you might as well learn the full reason, and again, the why behind what you’re doing. Why does this work? And how can I fully immerse myself in getting the best forecast possible? So there’s that. And then honestly, for people building their career, things like quant analysis are some of the highest-paying jobs. If you wanna level up your career and get to this really well-paid intersection of data science and finance, there’s pretty much no reason to not take the further step.
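To make the contrast concrete, here is a minimal Python sketch of the two ends of that spectrum. It is an illustration only: the sales figures are made up, and the second function fits just the autoregressive ("AR") piece of ARIMA by plain least squares on lagged values. A full ARIMA model also includes differencing and moving-average terms and would normally be fitted with a dedicated statistics library.

```python
from statistics import mean


def last_three_month_forecast(sales):
    """Naive baseline: next month equals the average of the last three months."""
    return mean(sales[-3:])


def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by least squares on (previous, current) pairs."""
    previous = series[:-1]
    current = series[1:]
    prev_bar, curr_bar = mean(previous), mean(current)
    sxx = sum((p - prev_bar) ** 2 for p in previous)
    sxy = sum((p - prev_bar) * (c - curr_bar) for p, c in zip(previous, current))
    phi = sxy / sxx
    c = curr_bar - phi * prev_bar
    return c, phi


def ar1_forecast(series):
    """One-step-ahead forecast from the fitted AR(1) relationship."""
    c, phi = fit_ar1(series)
    return c + phi * series[-1]


# Hypothetical monthly sales with a steady upward trend.
monthly_sales = [100, 104, 109, 113, 118, 122, 127, 131]
print("Three-month average forecast:", last_three_month_forecast(monthly_sales))
print("AR(1) forecast:", ar1_forecast(monthly_sales))
```

On a trending series like this one, the three-month average lags behind the trend, while even the simple autoregressive fit projects the next value forward from the most recent one.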

Glenn Hopper:

Yeah. Because you’re already down the path and thinking a certain way. And it’s not like Excel is a super intuitive tool. I mean, you have to learn Excel. So if you took the time to learn that, take the step further. And, you know, R is a great gateway drug, right? R is a little bit easier than <inaudible>, so start there. You can do ARIMA in Excel, it’s just painful. It’s going through and deseasonalizing and detrending and doing all that stuff and then doing it back over, and it’s not practical. There are a lot of other drivers beyond trend analysis or time series analysis for a forecast, but that’s always a great baseline. And then you go from there and you start playing around in Excel with drivers.

And it’s funny, because in finance, a lot of times those drivers are more someone’s opinion than they are based on anything. What I always say with data science is, it lets you go from that opinion or that hunch to a hypothesis, something that you could prove, and getting to that data-driven sort of mindset. <laugh> On this show, they hear me talk about this all the time. But from your perspective, imagine you’re briefing a senior FP&A director. Say this is someone who’s never coded, and he or she is telling you what they do, and you are trying to explain to them what data science is and how it’s different from what they may see as classic business analytics.

Shifra Isaacs:

Hmm. Yeah, that’s a really interesting question. And before I answer, I just need to quickly say, I would absolutely love to see your Excel ARIMA model sometime. I didn’t know that could be done, and I would love to see it. It’s been a minute since I touched time series. Getting back to your question, what is the difference between data science and business analytics? I think that the line can be really blurry sometimes. Typically business analytics can be like the end product, right? So maybe you built a machine learning model, which is totally in the realm of data science, but now it needs to be in a dashboard, and now it’s analytics again. So the line can definitely be fuzzy, especially with job titles. But the biggest difference for me is descriptive versus predictive. I would say that analytics is very descriptive, in the sense that a big part of your job as a data analyst is to explain a trend: sales dipped this month, tell me why. And now you’re crafting narrative, you’re storytelling, you’re understanding the business, and the math is just a tool that you use to do that. Whereas with data science, a lot of your job is building models, predicting relationships, predicting categories. So basically, I would pull the lever from descriptive to predictive.

Glenn Hopper:

Yeah. And then of course, there’s the super advanced case, if you really are a data-driven shop. I do see in finance that there will be separate teams of data scientists, and then there’s FP&A, and FP&A is a customer of the data scientists, so they get the data from them. But if you have a shop where everybody’s working together, you go from descriptive to predictive to prescriptive, where it’s: here’s the look at our past data, here’s the prediction of where it’s going, and here’s what we can do to change the future. That’s through KPIs and finding the right levers to pull and all that. And that’s sort of the dream, that’s the end of the rainbow for the full digital transformation to data-driven decision making.

There are huge companies working on this, and not a lot of companies are really there; it’s probably more an aspirational state than a realistic one for most companies. Mm-hmm <affirmative>. Yeah. At big companies that do have data science and FP&A teams, well, I mean, in FP&A we do go get data ourselves, but a lot of times the data and information we get comes from data scientists. We don’t go query Snowflake and get information out. So we’re at the mercy of what they’re able to pull. And if we haven’t worked in that world, a typical data science lifecycle is a little bit different than what we do in FP&A. So the ingest, clean, model, deploy, monitor cycle: could you walk through what that is from a data science perspective? Because even if we’re not in Python, it’s an interesting thing to visualize and go through, because a lot of the same steps happen in FP&A.

Shifra Isaacs:

Yeah, totally. And this is something that data analysts will even do in Excel for the first couple of steps. So the ingest stage really depends on where you are, but essentially you’re pulling data that is not yet ready to be modeled from some source. This could be super raw if you’re a data engineer, for example, and you’re pulling some unstructured data that then needs to be put into a table. For an FP&A analyst, it’s probably just getting a data set in an Excel file, a CSV, or something, and this is the raw data that you’re starting out with. Then that data needs to be cleaned or reformatted, depending on how you use it. And this gets turned up to 11 for data science, because we have this system called encoding, which I’ll just touch on very briefly here. For example, if you have a true-or-false, binary sort of column, you’re not encoding that for a machine as the word “true” and the word “false.”

What you’ll probably do is turn that into a numeric flag that’s zero for false and one for true. But you need algorithms to do that, and the more categorical options you have, the more complicated that can become, depending on the analysis you’re going for. So for an FP&A analyst, you’re cleaning the data to make it human readable, and for a data scientist, you’re cleaning it to make it machine readable for your algorithms. That’s pretty much the cleaning piece. The next piece would be preparing the data for modeling. Sometimes this can include pulling down the dimensions, so taking really complicated data and projecting it onto simpler dimensions so that it can run faster. It can include scaling your data. If your data goes from like one to a billion, you might wanna scale those relationships down to a range like one to a hundred to get it more optimized performance-wise.
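A minimal Python sketch of the encoding and scaling steps described above. The function names and sample values are hypothetical, chosen only for illustration; in practice these steps are usually handled by a data-preparation library rather than written by hand.

```python
def encode_flag(values):
    """Turn a true/false column into a numeric flag: 1 for true, 0 for false."""
    return [1 if value else 0 for value in values]


def one_hot(values):
    """Encode a categorical column as one 0/1 column per category (sorted order)."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


def min_max_scale(values, low=1.0, high=100.0):
    """Linearly rescale numbers into the range [low, high]."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # Every value is identical, so there is no spread to rescale.
        return [low for _ in values]
    return [low + (value - smallest) * (high - low) / (largest - smallest)
            for value in values]


is_recurring = [True, False, True]
region = ["EMEA", "AMER", "EMEA"]
revenue = [1, 250_000, 1_000_000_000]

print(encode_flag(is_recurring))
print(one_hot(region))
print(min_max_scale(revenue))
```

The one-hot step shows why more categories means more complexity: each additional category adds another column the model has to handle.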

And then you’re gonna actually choose your model. We call this phase model selection, because you need to figure out which models you’re actually running. The best way to do this is to choose a baseline model. So for example, for a regression analysis between variables X and Y and understanding their relationship, you’ll pick a very simple linear regression or a multiple linear regression, where it’s just drawing a line through those data points. But you might land on something much more complex, like random forest or extreme gradient boosting, which we affectionately call XGBoost. So there’s a whole process of even deciding what the hell kind of model am I gonna actually use that’s best suited for my use case and what I need to get out of this. From there, once you decide on those models and you have a comparison to say, oh, the baseline accuracy was 60%, but the final model was 80%, and we know that we have this 20-point gain across those stages, you’re gonna wanna deploy those models and keep them running on some cadence, maybe every week or something for a weekly deliverable.
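The baseline-versus-candidate comparison can be sketched in a few lines of Python. To stay dependency-free, this sketch shifts everything one step down from the conversation: the baseline here is "always predict the average," and the candidate is a simple linear regression, whereas the discussion above treats linear regression as the baseline and something like XGBoost as the candidate. The data and names are made up.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline: ignore x and always predict the average of the training targets."""
    average = mean(ys)
    return lambda x: average


def fit_line(xs, ys):
    """Candidate: simple linear regression fitted by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mean_abs_error(model, xs, ys):
    """Average absolute gap between predictions and actuals on held-out data."""
    return mean(abs(model(x) - y) for x, y in zip(xs, ys))


# Hypothetical data: number of customers (x) and sales in thousands (y).
train_x, train_y = [10, 20, 30, 40, 50], [25, 41, 62, 79, 101]
test_x, test_y = [60, 70], [119, 142]

candidates = {"mean baseline": fit_mean, "linear regression": fit_line}
scores = {name: mean_abs_error(fit(train_x, train_y), test_x, test_y)
          for name, fit in candidates.items()}
best = min(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: mean absolute error = {score:.1f}")
print("Selected model:", best)
```

The point of the harness is the comparison itself: every candidate is scored the same way on data it was not trained on, so the gain over the baseline is a measured number rather than an opinion.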

And then you’re gonna wanna monitor that to make sure that you’re, you’re keeping aware of things like data drift, schema drift, um, or other kinds of issues where either your input data is changing, the use case is changing, there’s some kind of bug in the model, and that you have essentially automated alerts set up so that you can be fixing those issues because maintaining is the hardest part of the cycle. And I think this is something people don’t know. That building is all exciting and fun, maintaining and getting paged at three o’clock in the morning to fix a problem, not so much fun. So you really wanna set yourself up for success with a lot of guardrails on the deployment stage. Um, I know that was a lot. Did you wanna double click on any of that?
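A minimal sketch of the kind of automated checks that monitoring can start from, in Python. The column names, sample values, and the three-standard-deviation threshold are all assumptions made for illustration; production monitoring typically uses more robust statistical tests and an alerting system.

```python
from statistics import mean, stdev


def schema_drift(expected_columns, incoming_columns):
    """Report columns that disappeared from, or newly appeared in, the input data."""
    expected, incoming = set(expected_columns), set(incoming_columns)
    return {
        "missing": sorted(expected - incoming),
        "unexpected": sorted(incoming - expected),
    }


def mean_shift_alert(baseline, current, threshold=3.0):
    """Flag data drift when the current mean sits more than `threshold`
    baseline standard deviations away from the baseline mean."""
    spread = stdev(baseline)
    if spread == 0:
        return mean(current) != mean(baseline)
    return abs(mean(current) - mean(baseline)) / spread > threshold


# Hypothetical weekly check before re-running a deployed model.
print(schema_drift(["date", "amount", "region"], ["date", "amount", "currency"]))
print(mean_shift_alert([100, 102, 98, 101, 99], [150, 160, 155]))
```

Running checks like these before each scheduled model run turns a 3 a.m. surprise into an alert that fires as soon as the input data changes shape or scale.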

Glenn Hopper:

Well, yes. So a couple of things dawned on me when you were saying that, and this has never occurred to me before. Building machine learning models and monitoring them, watching the drift, is second nature, but I think about how many models over the years I’ve built in Excel, and these models get passed around. So you’ve got different people making

Shifra Isaacs:

Yeah.

Glenn Hopper:

And there’s changes. They break all the time. Every time it touches sales and marketing, it falls apart.

Shifra Isaacs:

Mm-hmm <affirmative>. And it’s a static file.

Glenn Hopper:

Yes. And it’s not even spoken. It’s not like, oh, monitor the model, because it’s being emailed around and all that, so it’s in this fluid state. And it’s funny. When you build a machine learning model, you know the data changes and all that, and you watch for drift. But you don’t think about drift in Excel; it’s just sort of accepted. It’s like, well, let me go find where the XLOOKUP was broken <laugh>, you know, go dig through and trace back my formulas and see why it went wrong here. That’s a real problem in Excel, but we just sort of deal with it and go find where the formula’s broken, or rebuild it.

Shifra Isaacs:

<inaudible>.

Glenn Hopper:

Yeah. Yeah.

Shifra Isaacs:

<laugh> Yeah, it’s wild. And I think it’s really just because Excel is not built for software engineering workflows. In software engineering, we have a term that people might not know, CI/CD, which is continuous integration, continuous deployment. It’s really this overarching principle of, like, the work is never done. Maintenance is the brunt of the work. There’s what we call unit tests for everything. So every time you add a new piece of code to your code repository, it’s being thoroughly tested before it can even make it in, and being tested from different angles in different ways. And unfortunately, this is just something that’s not as widespread in the finance community, I think, for things like static Excel sheets. And I would love to hear from you, if you’re willing to share, and let me interview you for a moment.
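As an illustration of what a unit test could look like for finance work, here is a small Python sketch that checks whether detail rows tie to a control total. The function names, the rows, and the control total are hypothetical; the idea is only that the tie-out becomes an automated check rather than a manual hunt.

```python
def total_by_department(rows):
    """Sum amounts (in whole cents, to avoid rounding issues) per department."""
    totals = {}
    for department, amount_cents in rows:
        totals[department] = totals.get(department, 0) + amount_cents
    return totals


def check_ties_to_control_total(rows, control_total_cents):
    """Raise an error if the detail rows do not sum to the control total."""
    detail_total = sum(total_by_department(rows).values())
    if detail_total != control_total_cents:
        raise AssertionError(
            f"Detail total {detail_total} does not tie to "
            f"control total {control_total_cents}"
        )


def test_department_totals_tie_to_ledger():
    """A unit test: known input rows must tie to a known control total."""
    rows = [("Sales", 120_000), ("Marketing", 80_000), ("Sales", 30_000)]
    check_ties_to_control_total(rows, 230_000)


test_department_totals_tie_to_ledger()
print("Totals tie.")
```

In a CI/CD setup, a test runner such as pytest would collect functions named `test_*` from test files and run them on every change, so a broken roll-up is caught before the numbers reach a report.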

Glenn Hopper:

Yeah.

Shifra Isaacs:

How do you think that finance can kind of take that mindset? Because I think it would be really helpful to not have to manually look to see where these numbers tie

Glenn Hopper:

FP&A Today is brought to you by Datarails, the world’s number one FP&A solution. Datarails is the artificial intelligence-powered financial planning and analysis platform built for Excel users. That’s right, you can stay in Excel, but instead of facing hell for every budget, month-end close, or forecast, you can enjoy a paradise of data consolidation, advanced visualization, reporting, and AI capabilities, plus game-changing insights, giving you instant answers and your story created in seconds. Find out why more than a thousand finance teams use Datarails to uncover their company’s real story. Don’t replace Excel, embrace Excel. Learn more at datarails.com.

We’re not completely dependent on Excel like we used to be, but it is where we start. And it is, especially in a year like this where you’re doing a lot of, uh, you know, redoing forecasts because who knows what’s gonna happen with tariffs from one minute to the next. And, uh, you know, what, what could happen to supply chain and um, are we gonna bomb Greenland or whatever, <laugh>. But it’s almost like the models are getting rebuilt. But then, I don’t know, more and more and more is done in, in forecasting tools. And when it is in forecasting tools, there is sort of a tendency to think, oh, it’s set it and forget it. There’s not a mentality around, well, because it’s so deterministic, it just seems like, well, it worked last quarter, why would it not work this quarter? And there’s also maybe, you know, think about the number of features you’re using in a machine learning model versus the number of drivers you might have in a, in a forecasting model.

I mean, maybe they are more set, but the drift in this case is really just formulas getting broken, or assumptions changing significantly enough, or you didn’t build it with the right assumptions. And the other crazy thing is it’s not like you have a database of features and observations to go from; you’re rethinking the model. So if you built the model with only five drivers, and it turns out there have been significant changes and now there are 11 drivers, well, now you just have to rebuild the whole model, whether you’re in Excel or in another system. So really, I think we’re doing ourselves a disservice. I mean, my whole career has been, if there’s gonna be some kind of mundane thing that I have to do every day or every month or every quarter, or even every year, I’ll spend a bunch of time upfront to automate it so that I don’t have to <laugh> do that exact same thing again.

But it’s very hard to do that if you’re only working in Excel, which I always think of as a two-dimensional representation of the world. Whereas if you have all these features and you’re building out a true machine learning model, you have a lot more flexibility with it. And it’s a lot easier to blanket-change things across the board if you’re just going through and adding Python to it, because in Excel you’re so tied to the original set of features that you used. It’s not very flexible to just add new ones or to change between them.

Shifra Isaacs:

Totally. And I was thinking about this when Python for Google Sheets came out, or was it Python for Excel came out

Glenn Hopper:

For Excel,

Shifra Isaacs:

And I was like, this helps a little bit, but it’s like you said, it doesn’t really give you the full three dimensional representation of like what a regular repository of Python files can really open you up to.

Glenn Hopper:

Yeah, and I’ll tell you what I think Python for Excel is: it’s an interim state until they finally nail Copilot and have Python native to it, because when Copilot is fully integrated, it’s gonna rely on that Python. A lot of people who write Python do spend some time in Excel, but not many people who spend most of their time in Excel write Python. So it’s a very small group that’s gonna be using both. I really think it’s just an interim step as they move towards fully integrating Copilot, which they’re a long way from now, but you can see the end game there.

Shifra Isaacs:

Yeah, that’s a really interesting point. I’m not super up to date with all the Microsoft integration, so it’s good to know what’s going on there.

Glenn Hopper:

So another thing that you said that I’ve latched onto was when you were talking about embeddings. I have not read the paper yet, so I probably shouldn’t even bring it up, but there was a paper out of Cornell called Harnessing the Universal Geometry of Embeddings. The interesting finding was that all language models are converging on the same, like, platonic ultimate form, that same universal geometry of meaning. So the researchers were able to translate between any model’s embeddings without seeing the original text. It’s pretty amazing, and it almost has philosophical dimensions to it, where these embeddings that were created differently all sort of converged in the same vector space. It’s like there’s a single Rosetta stone. I don’t know, I’m curious to read the paper, and I think I’ve got my weekend reading set out for me. But when you said embeddings, I just remembered I read that on the plane coming back today. Pretty interesting.

Shifra Isaacs:

Totally. I don’t even remember saying embeddings. I guess I blacked out a little bit when I was going through that. But that’s a really cool paper that I need to check out. And I think that this universal mathematical representation does have big philosophical implications, because we have translation models between very different kinds of languages, like English and Mandarin Chinese, and how do we universalize semantic meaning across them? It’s with numbers, and it’s crazy, and yeah, it’s awesome. <laugh>

Glenn Hopper:

Another thing that I’ve realized, and this was sort of an unlock for me when I was early in learning about machine learning: it seemed so vague, and I couldn’t grasp it until I realized that machine learning really does three things, being regression, classification, and clustering. And so all this learning from data, when you think about it like that, I mean, regression, you know, in time series analysis or whatever, is just prediction. Well, actually, you know what, I ramble on about this a lot. So I’m gonna turn it over to you and ask you to give us kind of the 90-second tour of those, and then maybe your thoughts on which of the model families map most naturally to forecasting and variance analysis and risk scoring, things we do in FP&A.

Shifra Isaacs:

Totally. So I really like the little overview you gave. The three categories of basic, classical machine learning are regression, clustering, and classification, and they really answer three different types of questions. So the best thing you can do when you’re learning machine learning is just think about the kinds of problems you wanna solve and then figure out which kind of method will help you. For regression, you’re asking what is the relationship between X and Y, two different things. And the interesting thing about time series is that it typically involves autoregression. That’s the AR in ARIMA. So instead of predicting Y from X, you’re predicting X from X, which is what makes it so interesting and why it’s like a totally different method, where you’re not using ordinary least squares, you’re using all these other kinds of, like, Calc II types of series terms.

So that’s a total aside, but just to bring it back, um, regression is answering what is the numeric relationship between X and Y such that when I get 10 more customers, how much will sales go up by? That’s an example of a regression question. And the answer to a regression question is always a number. So that’s a, that’s a good little pro tip for people. Um, and then when it comes to classification, you’re talking about categories. So is this person going to default on their loan or not? I’m gonna predict yes or no. And the answer to a classification question, a binary question with two options, is typically a yes, no, true, false, red, blue, that type of question. You can get more options. So like, I once built a model, my first ever machine learning model was a multinomial classification between six different loan buckets.

So that was kind of an interesting random thing that I did. Um, and it gets more and more complicated the more options you have at the end of the day. So the answer to a classification question is a category. Then the final group, clustering, is all about grouping data points into natural segments based on some kind of similarity. And that similarity can be like, oh, if I put some central points on a board, which one is it the closest to? Or if I compare two words, if I compare Glenn and Shifra, how many letters do they actually have in common? I think we might have none. So we would have like a zero similarity score by most, uh, string similarity metrics. And you can kind of see how you’d pull these levers, change parameters, which algorithm am I actually gonna use. And you can see how many opportunities there are to really customize your analysis.

Getting into the relationship between each of these and FP&A: for something like regression, that’s where forecasting comes in. And that’s where we talked about forecasting based on the previous forecasting numbers, which would be a time series analysis. That’s what you see with investments, people looking at, you know, the change in a stock price over time, cost projections, or the example I gave, which is like, if this thing happens, then how will sales or some other variable be impacted? That pretty much covers basic FP&A use cases for regression. Moving on to classification, you would go into the risk modeling that I mentioned. So you’re scoring credit risk. You’re saying, Hey, can I afford to give this person a credit card? Can I trust them to pay it back? Mortgage predictions of, is this person gonna default on their loan, payment delinquency, things like that. And then with clustering for FP&A use cases, it might be useful for variance analysis. I’m not super familiar actually with clustering use cases for FP&A, but I would love to hear if you’ve been involved in that type of use case before.
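The letter-overlap comparison of the two names mentioned earlier in this answer can be sketched in a few lines. This uses Jaccard similarity on the sets of letters, which is only one of many possible string similarity metrics; the function name `letter_overlap` is an illustrative choice, not something from the conversation.

```python
def letter_overlap(a: str, b: str) -> float:
    """Jaccard similarity on the sets of letters in two strings (case-insensitive)."""
    set_a = {ch for ch in a.lower() if ch.isalpha()}
    set_b = {ch for ch in b.lower() if ch.isalpha()}
    union = set_a | set_b
    if not union:
        return 0.0  # two empty strings: treat as no similarity
    return len(set_a & set_b) / len(union)


print(letter_overlap("Glenn", "Shifra"))  # 0.0, the names share no letters
print(letter_overlap("Glenn", "Glen"))    # 1.0, identical letter sets
```

Swapping in a different metric (edit distance, for example) is one of the "levers" that changes how a clustering result comes out.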

Glenn Hopper:

So to me, the coolest thing about clustering is think about churn prediction or cohorts in, in an e-commerce. So you would cluster customers by, okay, these are the customers who bought the first time in January, these are the ones that bought in February and into their cohort. That’s, that’s a group. But then it could be these are, you know, whatever demographic information, if if you’re direct to consumer or whatever, you know, this is a, uh, female 25 to 40 whatever, and you know, lives in this, uh, zip code, estimated household income, whatever, like think of a marketing persona. We just default to sort of the human understanding of clustering. But in machine learning with clustering, if, and I think about this ’cause I started my career in telecom, like there’s all kinds of similarities that customers cluster together that aren’t based on any rules we put on ’em, but on behavior or on what’s happened to them or whatever that we wouldn’t even pick up on.

If you can run a clustering algorithm and see that, oh, customers who bought, you know, who signed up for service in November of 22, have they, they got the immediate price raise and they also got two others. And then we had the big network outage and there’s things that this, like certain cohorts that we wouldn’t have identified. So I think for churn prediction or for customer segmentation in marketing, and I know sales and marketing has been using this for years, but also as fp and a gets more, I mean, there’s starting to be more crossover. Where it was interesting to me that I think starting out in finance and fp and a, when I did, you know, I kind of saw myself and my team as the original business analysts, but something happened with the early days of machine learning where finance was just, well, I don’t have big data, I just got the general ledger.

So if I have three years of data that’s only three marches, that’s not exactly big data. So what do I do with that? But then over time, as fp and a has started to embed with other groups the same way data scientists do and work with other groups, then we’re factoring in more information into our forecast. Whereas sales and marketing early on, especially in e-commerce or SaaS companies or anywhere where you had that much customer data, they kind of jumped ahead for a while and use of it. So now by fp and a being embedded with other groups and kind of working with sales and marketing and having these teams that it’s not just, you know, rev ops and fp and a, but there’s a lot of crossover between them. So that’s kind of what I’m seeing in clustering. And it is cool. And that’s kind of an eye-opener to people too, because it shows, you know, yeah, AI, machine learning or machines are very good at pattern recognition patterns that we wouldn’t see finding correlations that we wouldn’t see. And I’m terrible about p hacking, um, if I’m trying to figure out something for a forecast and trying and trying to find, uh, correlations that we might not have seen naturally and all that. And I think that machine learning lets us, us do that. Totally.

Shifra Isaacs:

And I think it’s worth mentioning, just talking about, uh, machine learning and intelligence here, the difference between how we sort of prescribe to these models. So when you have a classification or regression model, in a typical use case, what you’re doing is you’re gonna label all the data and say, like, for example, if this is a credit risk model, then I’m going to label all my training data and say, okay, uh, this person defaulted and this person didn’t, and now how do we predict from that? What we do is we take a subset, usually 20%, and we test on that data to prove that our model’s working. Um, but with clustering, it’s very different. We’re not labeling that data, we’re literally just giving the machine learning model whatever dimensions we have in most cases and saying, Hey, you tell me what’s related. And that’s why it’s so powerful. That’s why it feels so magical, because even compared to these other typical machine learning use cases, it’s very autonomous and you’re not really telling it what to think.
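The 20% holdout described above can be sketched with the standard library alone. The labeled records here are synthetic, and the helper name `train_test_split` is chosen for illustration (it mirrors a common convention but is written from scratch here).

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Synthetic labeled data: (credit_utilization, defaulted). Purely illustrative.
labeled = [(i / 100, i % 4 == 0) for i in range(100)]

train, test = train_test_split(labeled)
print(len(train), len(test))  # 80 20
```

The model is fit on `train` only; `test` stays unseen until evaluation. A clustering workflow would skip the label column entirely.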

Glenn Hopper:

Yeah, I mean, it always surprises me when you start seeing those clusters and you don’t, especially when you get it really dialed in, like is it three clusters? Is it five clusters or whatever, when it starts to like really make sense. And, uh, yeah, that’s, that’s, it does feel like magic. It feels like seeing the matrix

Shifra Isaacs:

<laugh>. Totally. You’re you’re taking the red pill.

Glenn Hopper:

Yeah. Yeah. <laugh>, I know there, there’ve been drag and drop tools, machine learning tools for years, and I, I was a big user of RapidMiner until it got bought a couple years ago, and I, I got lazy, uh, with <laugh> with writing Python. Even with the drag and drop tools, you couldn’t really, if you, if you didn’t understand the basics of data science, it’s like handing someone who’s never taken finance or accounting, uh, financial statements and asking them to make sense of ’em. It’s like they could, you know, get a general idea, but they’re not gonna know the right questions to ask or where to look or what, you know, how, how things stand out. So it was very, it was hard, I guess to sell finance professionals on, man, if you, if you would learn Python, R, SQL, you know, if you, if you could get these chops and, and build some really cool models for your team, you know, you, you’d have this superpower and they were like, I, I’m having a hard enough time, you know, doing what I’m doing in a day.

I don’t have the time to learn a new language. There’s sort of two questions because I was starting to go down one road, but before I asked that question, I guess I’m gonna interrupt myself. <laugh>, you know, we’re seeing how, how good generative AI is at writing code. And so developers, you know, there’s like, well are, is AI going to replace everything I do? And I, I’ve talked to, um, some of my daughter’s friends in, in college who were studying computer science and they sort of have the sense of why am I even doing this, doing this if I’m gonna be replaced by bots? Do finance professionals need to learn to code or does anyone need to learn to code? Where do you think we are in the world of generative AI writing all of our code on, you know, sort of vibe coding prompts?

Shifra Isaacs:

Totally, very apt, timely question. So I wanna caveat by saying that we don’t really know fully where this is going and where it is now might be very different than where it is in the next year, the next five years. ’cause we’re accelerating at a crazy speed. I’m gonna echo the, the statements of Zach Wilson, who is the most popular data engineering creator on LinkedIn. Um, he quit his 500 K Airbnb data engineering job to teach the whole community. It’s very cool. He’s a cool guy and he posted on LinkedIn this week that he believes that stakeholder management is what is protecting your tech job from ai. So I think that finance has a lot of that. Product managers have a lot of that where your job is to sync with stakeholders, make sure projects are running, clarify business logic and business needs, and make sure that work is aligning with those needs.

And I think that is a great place to be, to protect yourself from, um, AI automation. So I would say in terms of learning programming and building models, it’s more important to have knowledge of the theory of how these models work and understanding how to read the outputs and see like, oh, is this model a piece of garbage or can we actually use this? So at this point in time, it’s really important to understand the types of models that finance people need to be aware of, whether that’s forecasting, risk modeling, et cetera. And then being able to read the outputs of those models. What I will say is it’s very hard to get deep knowledge on that without practice. So I would say that you should be learning whatever you need and doing projects, whatever you need to be able to have that conversation, to be able to manufacture consent and get buy-in with your team. And just think about it that way. Think about it from a perspective of driving value and building stakeholder management skills.

Glenn Hopper:

Did you ever use DataRobot?

Shifra Isaacs:

I’ve never heard of that.

Glenn Hopper:

Okay. So it was super cool. I don’t, I don’t know what they’re doing now, but it was a super cool, very expensive drag and drop, just badass machine learning platform. And I was with a group that had access to it, and you’d have these people who had no idea what they were doing, no idea how to differentiate between any sort of machine learning model, because one thing DataRobot would do is you’d put in the data and, um, tell it, you know, I wanna predict whatever variable you want or whatever, and you could let it pick the model. So people would dump data into this, to them what was a completely, you know, opaque black box and just <laugh>, um, and then take the results and present them. It’s like, how, how are you gonna present that with any meaning when you don’t know the right question, to your point?

Shifra Isaacs:

Yeah.

Glenn Hopper:

Like if you don’t know what a confusion matrix is or if you don’t know how to measure accuracy or precision recall, F1 score, you know, all the stuff, ways that you determine a model is efficient, then you’re, you’re really dangerous and, and shouldn’t be using the model if you, if you can’t explain or understand, uh, you know where it’s right and where it’s wrong. So it’s kind of like, I wouldn’t give, you know, a first year junior sales person the financial statements and ask him to analyze them and then take his results without question. So that’s, that’s yeah, the line of, of turning this stuff over to a model when you don’t understand it.
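For readers who have not met these metrics, here is a small sketch that builds a binary confusion matrix and derives accuracy, precision, recall, and F1 from it. The labels are invented (1 = defaulted, 0 = did not default), and the function names are illustrative.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(actual, predicted):
    """Return accuracy, precision, recall and F1, guarding against division by zero."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented example: 4 actual defaults, 6 non-defaults.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(confusion_counts(actual, predicted))      # (3, 1, 1, 5)
print(classification_metrics(actual, predicted))
```

Accuracy alone can look fine on imbalanced data, which is why precision and recall are usually read alongside it.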

Shifra Isaacs:

Totally. And I feel like it’s a good time to, to share a quick cautionary tale that maybe your audience can benefit from. Um, I worked with a team once, I’m not gonna name names, not gonna say where, it was a team of business managers who wanted this sort of black box magic data project. And this was the time, like when ChatGPT 3 had just come out and like AI wasn’t, uh, able to replace a lot of functions the way that it does now. And these people said to me, we want to estimate the wallet size for our mid-size, like, medium business customers to see how much they can afford to pay for our tools. And I said, okay, cool. What’s the data that you have? And they said, well, we have data from companies we’ve classified as small and companies we’ve classified as extra large. We would like you to build two models and take the average between them <laugh>. And I was like, are you joking me <laugh>? And this is exactly the type of trap that you don’t wanna fall into as a finance professional because you wanna look like you know what you’re talking about, you wanna do the research and you want to work with a team of data scientists that respect you. You need to approach these types of problems in good faith and know that this analysis is completely infeasible and that you’re just, you’re vastly overreaching in terms of what we call the relevant range of a regression problem.
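One way to picture the "relevant range" problem: before trusting a prediction, check whether the input falls inside the range of data the model was actually fit on. The sketch below is a hypothetical guard, not the analysis from the story; the revenue figures, segment names, and the function `covering_segments` are all invented.

```python
# Hypothetical annual revenue (in $ millions) for the two segments that have data.
segments = {
    "small": [1.0, 2.5, 4.0, 8.0],
    "extra_large": [500.0, 750.0, 1200.0],
}


def covering_segments(x, segments):
    """Return names of segments whose observed min..max range contains x."""
    return [
        name
        for name, values in segments.items()
        if min(values) <= x <= max(values)
    ]


mid_size_revenue = 50.0  # hypothetical mid-size customer
covered = covering_segments(mid_size_revenue, segments)
if not covered:
    print(f"{mid_size_revenue} is outside every observed range; "
          "a prediction here would be extrapolation.")
```

Averaging two models does not create data for the gap between them; the mid-size customer is still outside what either model has seen.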

Glenn Hopper:

Yeah, yeah. And I, I work with a lot of companies on plans on how to roll out generative AI, and I talked to a guy who had, he’s not a coder, he is an FP&A analyst, and he had this report he had to do every day, and it was about, it took him about an hour. It was consolidating data from two billing systems. They were going through some kind of billing system integration. He was gonna have to do this every day for months and months, till the end of the year when the new billing systems were gonna be integrated. And this was in the early days, so they weren’t as good as they are now, but he spent something like 14 or 16 hours for whatever reason, bouncing back and forth between ChatGPT and Claude and writing scripts so that he could automate this whole workflow.

And he finally did it. And so, yeah, I guess it, you know, it took him two full days of work, but if that was something he was gonna have to do every week for months and months, he found the ROI and I think about, I wonder how long that took him. And then it’s like, when I first started writing SQL queries, it would, you know, I’d leave off a comma or something, it would take me, I could stare at it for hours and not <laugh> ended up not being very efficient. But ultimately you learn enough to be dangerous, but you also learn enough to understand what’s happening with the code. So if you use code interpreter in chat GPT now, yeah, you can click on it and write it and it’s all well commented code and everything, but you can’t, you don’t understand what’s happening in the for loop or whatever. So I’m wondering at this point, I mean, I don’t think they have to be hardcore coders and, you know, work in a, it’s not like we’re gonna go work in a production environment, but the ability to write worksheets, I don’t know. I actually, I’m trying to answer my own question. How about I ask you the guest <laugh>, what do you, I mean, what do you think if, if somebody’s not working in data science, but they’re adjacent to it, is it worth them learning the coding basics?

Shifra Isaacs:

Software engineering is a very different skillset from data scientists. I always say that data scientists use Python the way that high school students use calculators.

Glenn Hopper:

<laugh>, yes. Yep, yep.

Shifra Isaacs:

Like we use, we use data science to make number do thing or make computer do math, whatever, but we are not expected to have deep knowledge of computer science algorithms and software engineering, uh, skills. So I would say like the basics of data analytics are really valuable for finance people, especially like you were saying with finance people doing their own SQL, like, okay, now you don’t have to Slack somebody to see how many records there are in your table. You can select count star from your table all by yourself. Um, and that’s great. But in terms of programming itself, I would say programming is really just a tool for you. It’s not worth learning, like, deep software engineering algorithms, dynamic programming, asynchronous programming, all these things. What is worthwhile is understanding how to do your basic Excel workflows in SQL and Python. If you’re doing a SUMPRODUCT in Excel, know how to do that in Python. So learn how to use Python interchangeably with whatever you’re already doing.
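Both examples from this answer fit in one short sketch: counting the rows in a table with SQL, and reproducing Excel's SUMPRODUCT in Python. It uses Python's built-in `sqlite3` module with an in-memory database; the `invoices` table and its numbers are invented for illustration.

```python
import sqlite3

# In-memory database with a made-up invoices table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (units INTEGER, unit_price INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(10, 3), (4, 25), (7, 12)],
)

# "Select count star": how many records are in the table?
row_count = conn.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]

# SUMPRODUCT equivalent: multiply the two columns row by row, then add up.
rows = conn.execute("SELECT units, unit_price FROM invoices").fetchall()
sumproduct = sum(units * price for units, price in rows)
conn.close()

print(row_count, sumproduct)  # 3 214
```

In Excel the second step would be `=SUMPRODUCT(A2:A4, B2:B4)`; the Python line does the same pairwise multiply-and-sum.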

Glenn Hopper:

Yeah. Uh, actually that’s smart. That, yeah, I like, I like that. Let’s go a step further. Think of like a lightweight, something that someone could do with generative AI. Either I don’t, you know, maybe they are, are writing code that they’re gonna do a, um, a Colab project or something, or that they’re, they’re just gonna do it within a ChatGPT, uh, conversation, but kind of a, a workflow that an FP&A analyst could, um, could try, like just something they would normally do in Excel, put it into generative AI or write code and, and do it in a, uh, in a workbook. Um, something like, uh, predicting next month cash burn, um, or, uh, you know, forecast or, or budget analysis. Any, can you think of, of something like that that would be a good sort of gateway entry for an FP&A person to try out using, doing some data science with generative AI?

Shifra Isaacs:

Yeah, totally. And I just wanna quickly call out, there is a new tool that’s pretty decent called the Data Science Agent in Google Colab. I’m pretty sure it’s free to use. And it’s a data science tailored AI agent that you can work with directly in the type of environment that a data scientist would work in. It’s very friendly for beginners, so wanted to call that out there, that it’s a great tool for finance people who might wanna start on this type of project. I think cash burn, which you talked about, is a really good basis for this. And something I really recommend to my students is multimodal learning. And what I mean by that is doing the same thing in a familiar medium and a new medium. So I’d recommend building this, uh, some kind of cash burn regression model, which is maybe predicting next month’s cash burn, using historical data, doing that in Excel with something like the Data Analysis ToolPak or even Solver I think you can use for that.

Um, where you’re loading the historical data, you’re enabling the ToolPak, and then you’re running the regression model, setting your correct cells as the dependent and independent variables, reading the outputs, and then replicating that analysis in Python. And what you can do is say, here’s all the stuff I did in Excel, now Data Science Agent in Google Colab, now turn this into Python, and ask it to comment and explain to you every step of the way. Because the whole point is you’re not learning something new. You’re learning how to do something you’re familiar with in a new way, and that’s a lot less scary and it’s, there’s a lot less friction kind of holding you back. I think
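As a rough picture of what the Python half of that exercise might look like, here is a minimal sketch of a straight-line fit of cash burn against month number, followed by a one-month-ahead forecast. It is written in plain Python so every step is visible; the burn figures are invented, and `fit_line` is an illustrative helper, not output from any particular tool.

```python
# Hypothetical monthly cash burn in $ thousands; month 1 is the oldest.
months = [1, 2, 3, 4, 5, 6]
burn = [120.0, 125.0, 131.0, 134.0, 140.0, 146.0]


def fit_line(x, y):
    """Simple linear regression y = intercept + slope * x via least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov / var                       # average change in burn per month
    intercept = mean_y - slope * mean_x     # fitted burn at month 0
    return intercept, slope


intercept, slope = fit_line(months, burn)
next_month = months[-1] + 1
forecast = intercept + slope * next_month
print(f"burn grows about {slope:.2f}k per month; "
      f"month {next_month} forecast is about {forecast:.1f}k")
```

The independent variable (month) and dependent variable (burn) play the same roles as the X and Y ranges selected in Excel's regression dialog, so the slope and intercept can be compared directly against the spreadsheet output.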

Glenn Hopper:

That’s actually a really good idea because one, it puts your mind in where you’re thinking about the workflow of what you do. So you’re what, whether you’ve coded or not, you’re gonna follow a logical workflow that is probably gonna be how you would lay it out in coding. And I’m, I think back to when I first started Python, I, somebody gave me a book, you know, how to learn Python in a day. And that was, so there’s like, okay, that’s, I’m not really gonna learn today, but you’re reading a book rather than being, you know, interacting with something. And I think that that’s learning, you know, not full on computer science, but learning how to sort of interact and do stuff with Python right now if you could in real time, yes, do it and interact with chat GPT or Gemini or whatever. That’s a, that’s a super cool way to break into it.

Shifra Isaacs:

Totally. And one more thing to add onto that is we have this concept in, um, in engineering of pseudocode, which people also might not have heard of. And this is where we basically just write all the steps of a process down. And the idea is you could give that to different engineers, you could give that to somebody who codes in SQL, somebody who codes in Rust, somebody who codes in Java. They could all write the same program that does the same thing. So if you can build whatever model that you’re doing in Excel for forecasting or cash burn prediction or whatever, and just write down all the steps you did and then map those to Python, now you’re thinking like an engineer, because syntax, the grammar of Python, it’s just grammar. You need to know the process that you’re doing.

Glenn Hopper:

Yep. Yep. Love it. So for an FP&A person who’s right now, they’re Excel only, and if they’re interested in starting to, uh, dabble with data science, what do you think comes first? And I know my answer, I’ll tell you mine after you say yours. Do they dive into statistics more deeply? Do they learn SQL, do they get some basic Python, or is there some reason that gen AI shortcuts some of that learning curve?

Shifra Isaacs:

So what comes first is always whatever you’re already doing and doing that with a new medium. So if your job is to, uh, I don’t know, group some data and make an Excel dashboard for your manager that shows like, okay, these are all of the things that are OpEx and these are all the things that are CapEx for this week, learn how to do that in SQL GROUP BY. Do that first. So the first thing is whatever you’re already doing in a new medium, um, and gen AI should answer questions along the way. Gen AI is your personal tutor that never gets sick of your BS. Yeah. And will always answer questions thoughtfully and make you feel like a genius. So, um, that’s a great tool at that stage. Once you’re more comfortable in that new medium, you wanna do a new project. And then once you’ve done something, then I would say learn statistics fundamentals, learn how to get around a pandas or Polars DataFrame in Python. But it starts with just literally replicating a process you already know with a new tool. Like I said, that’s how I think it should start. Yeah.
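The OpEx versus CapEx rollup mentioned above maps directly onto a SQL GROUP BY. Here is a minimal sketch using Python's built-in `sqlite3` module; the `expenses` table, its columns, and the amounts are all invented for illustration.

```python
import sqlite3

# In-memory database with a made-up list of this week's expenses.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expenses (description TEXT, category TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("Cloud hosting", "OpEx", 1200),
        ("Office rent", "OpEx", 500),
        ("Laptops", "CapEx", 3000),
        ("Server rack", "CapEx", 2000),
    ],
)

# One row per category with the summed amount, like a pivot table in Excel.
query = "SELECT category, SUM(amount) FROM expenses GROUP BY category"
totals = dict(conn.execute(query).fetchall())
conn.close()

print(totals)
```

The same grouping in a spreadsheet would be a pivot table or a pair of SUMIF formulas, which makes it easy to check the SQL result against numbers you already trust.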

Glenn Hopper:

So I think in another universe, I was gonna be a college professor, um, and may still, maybe that’ll be my retirement. Maybe I’ll be an adjunct professor somewhere. But, um, because I, I always think, and I, I think it’s because it’s the way that I came to it, because when I was in school, we didn’t, I mean there, I guess when I went back and got my analytic analytics certificate, then, um, I had r but I was first studying finance every, we’re doing everything in Excel, and it was learning those basic statistics concepts and um, but that really opened my eyes. It’s very different building a statistical model and the way you pick features and statistics than the way you would. And, but I thought that that was so informative because I, I can remember before I was A CFO when I was doing someone else’s bidding, and they would tell me what to use as drivers and assumptions and all that. And I just thought, but you’re just making that up, man, what do you literally,

Shifra Isaacs:

I’m so glad to hear you say that, by the way, <laugh>,

Glenn Hopper:

It’s just statistics. Because I locked into that so early before I even had access or understanding, um, to all these other tools. It was, it just, it kind of changed my, my way of thinking. But to your, your point is much more practical. It’s much more, you’re solving a problem. You’re not just going off to some ivory tower of education. You’re doing something that you’re doing anyway and your brain already thinks that way. So I think that’s probably a, a more, a quicker path to actually doing something meaningful

Shifra Isaacs:

Yeah, and you need to be able to check your work because again, the, the problem with AI is people not being able to decide if the output is correct,

Glenn Hopper:

<laugh>. Yeah.

Shifra Isaacs:

So the only way, the only way to know that is if you already know the domain and you already know what the answer should be. And so if you’re doing a process that you design that you’re already quite familiar with, you’re gonna, you’re gonna fare a lot better with AI as your teacher. One other thing I wanted to say on that is just to call out how statistics makes you think differently. You definitely should be checking all your assumptions and like being a scientist in that way. But I will say that data scientists almost never follow scientific processes or rigorous statistical modeling. And just to give a background for people who are probably not familiar with this, even something as simple as like a linear regression where, how does X predict Y, there’s a bunch of assumptions about like, okay, your source data should be this shape in the distribution and it should have no outliers and it should have all these things, but when you’re in the real world, this is the only method you have. So even if you’re not necessarily hitting all those benchmarks and hitting all those assumptions, there’s nothing else you could do. So you then will end up going ahead and just making the best of it with the data that you have. But the, the, the rigorous thinking is still very helpful. <laugh>. Yeah,

Glenn Hopper:

<laugh>. So it’s funny now you said that and it, I, I thought about something as simple as, um, EDA and you know, in finance you’re not throwing out anything. You don’t throw out an outlier because it’s an actual data point or whatever. You know, if you’re using actual dollar items and not, you know, um, and you don’t, you can’t impute if there’s a, you know, if there’s an NA in a, in a data set, you don’t impute and just put in the median or whatever because that’s, it means dollars or whatever. So I, it’s funny thinking about the, the subtle differences here, um, between the, between the two. But that’s only, I mean, that’s only with financial statement data. Obviously. If you’re clustering and doing other, um, types of machine learning modeling that support the work that you do, um, you follow the data science principles. All right. I think two more questions ’cause we’re running out of time and uh, I could, I could talk about this stuff all day, but we, uh, <laugh>, if people are listening to us on the morning drive, we don’t wanna have ’em sit in the car and uh, and be late to work. <laugh>,

Shifra Isaacs:

Go to work, have a great day. <laugh>

Glenn Hopper:

As automation and AI take over more and more of kind of the monotonous and, and more mindless tasks, that tasks that aren’t gonna be replaced or the people who aren’t gonna be replaced are the people who understand what’s happening here and how data science works. So I’m always advocating for like, everyone should learn data science basics and I’m wondering what you think on that because, you know, so there’s a difference between someone who completely, you know, immersed in it, studied it, that’s your area of domain expertise versus as sideline people, I mean, how much is a finance person or a sales and marketing person or an ops person really gonna be a data scientist if they didn’t fully commit to the a, a degree in it or whatever. So, I mean, what do you think? Should we, should we say, well that’s a whole other domain, I don’t need to understand it. Or is there some level that we should understand what’s the workforce gonna look like in the future with this and would it be valuable for people even though they’re not working as, as a data specialist to know these skills?

Shifra Isaacs:

Yeah, I feel like there’s a lot of questions in that. Again, not super sure how AI’s gonna advance over the next one to five years. I think what I’d like to touch on is the level of basic literacy that I think people should have if they have the time and money bandwidth to do so. And for me that starts with statistical literacy. So we’re talking stats one, distributions, outliers, basic hypothesis tests. And this relates to politics, philosophy, your beliefs. As a human, I want you to be able to read a scientific study and be able to read the abstract and the conclusion and know what the hell’s going on. That matters to me. I think people need to be literate so that they can form their own opinions. So I think statistical literacy is the beginning and then now AI literacy is, is really important where I want people to know that ChatGPT is not a massive black box and that it’s math calculating the next most likely thing that it thinks you want it to say.

And that’s really important. There are a lot of one-pagers where you can learn basic statistics. Uh, transformers, which is the architecture behind LLMs like ChatGPT, uh, read a one-pager on what are transformers and why are they not magic? Because it’s important for you to know that. It’s important for you to know that the AI that’s rejecting your insurance claim is not magic <laugh> and you should challenge it. Yeah. Right. So <laugh>, I think that, uh, Stats 101 and AI literacy are the biggest things. When it comes to machine learning, I would say that not everyone needs to know it, but what I will say is that for people who work in the corporate world, data analysts and data scientists, we are your decision support, right? So it’s our job to bring data and experiments to you and say, Hey, here’s what we found. Please go make a decision now. And I would say that whatever your teams are using to support your decisions as a CFO, as an executive, as a head of FP&A, as a leader, that’s where you should get basic literacy on. So I guess to summarize what I said is Stats 101 literacy, so you can read scientific papers and form your own opinions about the world, AI literacy so you can survive in the AI era. And then just domain literacy for the decisions that you’re making in your organization.

Glenn Hopper:

Love it. Love it. Shifra, you are wise beyond your years. I’m old enough I can say stuff like that and maybe that <laugh>, I don’t know. I sound like a pedantic old man. I don’t know. I had a bunch more questions on here. We are running, running close to being out of time. We ask everybody: what is something most people don’t know about you?

Shifra Isaacs:

I got an A on a piano recital when I was a kid.

Glenn Hopper:

<laugh>. And, and do you still play piano <laugh>?

Shifra Isaacs:

A little bit. I, my main instruments now are like vocals and bass. I haven’t picked up my bass in like two months or something, but this is a good reminder to get back on that. Music is awesome.

Glenn Hopper:

Alright. Alright. And this one we ask everyone, and I don’t know why we’re not, uh, logging the answers here, but, um, you’re in Python, you’re, you’re, you’re doing work there, but we all end up kind of starting with Excel. That’s always my starting point anyway, but we ask every guest, what is your favorite Excel function and why?

Shifra Isaacs:

The only one that’s really coming to mind for me is SUMPRODUCT. I really like SUMPRODUCT because it’s matrix multiplication and it, without it, you’d have to do so many steps in Excel. And I, I like things that save you a lot of thinking and steps. It’s a good one. And then there’s also one in Google Sheets that I’ll say, which is the Google stocks I think is the formula where it gives you the live stock price. So like, oh yeah, for example, if you’re a tech bro, yeah, it’s really cool. So if you’re a tech bro working in Silicon Valley, like a lot of my friends and you wanna know how much money you make at any given moment with your equity in your total compensation package, that will keep it up to date for you, which is pretty cool.

Glenn Hopper:

Super cool. Before I let you go, um, I know you’re doing a lot, um, sort of an educational space and it’s, um, you are pretty active on LinkedIn. So if our, if our listeners want to connect with you and and learn more, um, where, how can they reach you?

Shifra Isaacs:

Yeah, totally. So you can definitely find me on LinkedIn slash shifra dash Isaacs and that’s where I post a lot of educational content, content about what I’m doing with Ascend, community building with some of my LinkedIn influencer friends. And yeah, you can find me there, my email’s there if you’re interested in collaborating, talking, thinking about the future of AI, data and finance, that’s where I’ll be.

Glenn Hopper:

Cool. Cool. Shifra, thank you so much for coming on the show. Yeah,

Shifra Isaacs:

Thank you for having me.