Salesforce & SQL: Get More from Your CRM Data Using the Tools You Love

Salesforce is one of your company’s core data sources, so why is it so difficult to work with the data? This 30-minute webinar explains how you can take back your CRM data and explore it in completely new ways.


David: To start out, I wanted to give a brief overview what the current world of reporting and analyzing Salesforce data really looks like. After talking to a lot of companies that use Salesforce about how they analyze their Salesforce data, what we see is that time and time again really three main methods of doing this reporting and analytics come up. The method that we hear by far the most often here is that companies are actually doing all of their Salesforce reporting inside of Salesforce's built-in reporting UI. Now, if the companies tell us that they aren't using Salesforce's built-in reporting functionality, what they often tell us is that they're actually exporting all of their data from Salesforce into spreadsheets and then doing all that reporting analytics in spreadsheets. And then third most often what he hear is, and obviously the one that we tend to favor the most here, is that companies are finding a way to send that data from Salesforce to a centralized data warehouse and then doing all of their reporting and analytics on top of this data warehouse.

Now, obviously we've heard other edge cases of how people are doing this, and we would actually love to get a better sense of how all of you are currently analyzing your Salesforce data, so we're going to put out a quick poll here to see what you guys are doing most often and how you guys are doing your analytics in Salesforce right now. Feel free to respond to that, and we'll take a look at the results here. Oh yeah, the results are flying in exactly the way I would have expected them to here. Let's see what we have so far here. Even what we've seen so far, this almost...Let me see. When we talk to most of the people that use Salesforce, the majority of the people that use Salesforce currently really depend on that internal reporting UI inside of Salesforce, or they're just exporting their data outside of Salesforce into Excel, into Google Drive, etc.

That's great, and since we hear these first two options the most often, I really wanted to briefly dive into these first two messages and get a better idea of what the Bennefits and the limitations are of these approaches. Lt's start out with Salesforce's built-in reporting tools. As many of you probably already know since a lot of you seem to be using them, on top of the standard CRM functionality that Salesforce offers, Salesforce actually also provides out-of-the-box reporting functionality, and what we've noticed is that for really really early-stage companies, this reporting functionality tends to be perfectly sufficient for their needs at the time.

For example, things like displaying opportunities by quarter or showing the current sales pipeline are actually really simple to do in Salesforce's UI, but what we see happen is that more often than not these companies quickly start to mature and they start asking these more and more complex questions of their data, and then you'll see that there's a point in where Salesforce's built-in reporting functionality really starts to show its limitations, and Benn and I will actually give some specific examples of these kind of limitations later on in the webinar.

The thing to keep in mind is that Salesforce actually openly admits that it's very much a CRM tool first and a reporting tool second. What I mean by that is that the main value proposition of Salesforce, it's really not to help you answer these complex business questions. It's just to act as a really robust CRM tool. So again, what happens more often than not is that the consumers of the Salesforce data very quickly mature, and they start asking these really complex questions about their business that Salesforce's built-in reporting functionality really just isn't built to handle.

So, if we can't answer the questions we want inside of Salesforce, then the obvious solution to do that is just to bring the data outside of Salesforce. Makes complete sense. And this actually goes hand in hand with the method that we see people use second most often, and obviously what you guys seem to use second most often or first most often, which is just exporting that data outside of Salesforce into spreadsheets and then doing all of your reporting analytics out of spreadsheets. Now, while this method certainly does improve on the analytics capability that Salesforce offers, it does tend to come with its own set of limitations, and we actually like to categorize these limitations here into three broad categories, scalability, organization, and automation.

From a scalability standpoint, I'm sure that many of you experienced that after accumulate a large amount of data or you're trying to perform a large amount of functions in Excel, analytics performance really starts to crumble within Excel, and you notice that operations that would be extremely efficient or would scale super easily in more of a data warehouse environment can really bring your computer to its knees when you're trying to do it in an Excel environment. What we've noticed is that this is really apparent when we talk to customers who are extremely accustomed to using this method of doing it, and suddenly they hit a point where their data volume or the complexity of what they're trying to do really scales beyond what's appropriate for a non-database environment.

Aside from this scalability issue, there's also an inherent organization issue to doing this out of Excel, and what we see happen more often than not is that with more and more people performing these regular data exports and more and more analysts doing work in silo on these documents, it becomes a lot easier to become really unorganized and lose that single source of truth within the company. This is because it's actually really hard to maintain that master version of the data, especially when multiple people are working on the same files or the same project. Technology like Google Drive, for example, has definitely tried to improve on this, and they've definitely helped in that collaboration aspect, but again, there's still a scale ability issue as far as the analytics you can do in Excel or Google Drive.

Last but certainly not least, running analysis out of Excel can really be a massive time sink. We've experienced it here a few times, and we've certainly talked to prospects that have told us that the majority of their job is actually just to export data from Salesforce into Excel and prepare it for analysis. To top this kind of thing off, they often say that this process sometimes takes up to a whole day to perform, and they have to do this multiple times a week. Now, I don't know about you guys, but personally this sounds like an absolute nightmare scenario to me.

What we've noticed is that this seemingly extremely repeatable task is often left unautomated and handled by a human for seemingly no reason other than the fact that the technology to automate it really just hasn't been implemented. To summarize here, I think it makes sense to categorize the limitations that we've seen so far into two main categories: reporting capabilities and silo data. As I described earlier, Salesforce's main value prop really isn't to be a robust reporting analytics tool. It's to be a robust CRM tool, and this fact becomes more and more evident as your company really starts asking these more complex questions of your data.

As I explained earlier, a common method that we see people take here to make up for this limitation is actually just to bring that data outside of Salesforce into Excel and then do all of the reporting analytics out of Excel, but again, as we saw earlier, Excel really does come with its own set of scalability, organization, and automation issues. Something we really haven't touched that much on yet that's arguably, in our opinion, one of the most, if not the most, limiting factor of them all is that when your data lives exclusively in Salesforce, you really lose the ability to tie that data to the rest of your company's data sources. And again, as companies start to mature and they start asking these complex questions aimed at better understanding things like customer acquisition and user behavior over time, answering these kinds of questions becomes really dependent on being able to tie all of your multiple disparate data sources together, and when your data infrastructure doesn't allow you to do this, it doesn't allow you to tie these disparate data sources together, you really fail to obtain that bigger insight that you're looking for.

So what exactly is the right way to do this? What's the best way to set yourself up so that you're prepared to run these more complex analytics on your Salesforce data? As many of probably have guessed by now, the obvious first step here is to consolidate all of your Salesforce data into a data warehouse so that you can really start to begin to harness the power of SQL on top of that data in an environment that's very analysis friendly, but it's important to remember here that putting your Salesforce data into a data warehouse not only opens up a new world of analytic capabilities, it also allows you to solve that silo data problem by centralizing all of your other data sources you have into that same data warehouse. Not only does this allow you to begin to tie all of those data sources together to power those more complex analytics, it also allows you to have single source of truth for all the data that your business relies on.

So the obvious next question becomes, how I do I set this data pipeline up for my business, and the honest answer here is that plenty of companies have actually done this just by building a custom data pipeline internally, but the problem that we see happen is more often than not the amount of effort involved and the amount of engineering resources involved is often hugely underestimated.

For example, Braintree revealed that they actually devoted a team of four engineers for six months to build their data pipeline, and then even after the pipeline was actually built they still had to keep a two-person engineering team at all times just to maintain and extend that project. I know that Benn actually has some firsthand experience with trying to build to a data pipeline internally. Benn, do you want to speak to that a little bit?

Benn: Yeah, absolutely. So like David said, this is something that we've actually experienced firsthand. At the company I was at prior to Mode, which was a large web software startup, we went to great lengths to make sure that we could pull our Salesforce data into a SQL database like this, and it took us roughly the same amount of time that it took Braintree. It took about six months for us to build a stable pipeline to our central data warehouse, and once it was built we also had to dedicate a full-time engineer to maintain it and slowly upgrade it and slowly adapt to the changing API at Salesforce.

But even that I think actually underestimates the cost. We had to dedicate a lot of time to monitoring it, to making sure that the connections were up and that they were always live, and because we had these troubles with it initially, analysts actually grew to distrust the data some, and we always were double-checking it to make sure that the data was accurate. We put a lot of extra time and energy into the actual analysis just because we weren't entirely confident in the pipeline, and then occasionally when things actually did go down there were always these big fire drills that took over a lot of the engineering team's time to make sure we got those pipelines back up, because so much of the business depended on it.

It was a hugely valuable integration for us and it was a cost that we gladly paid, but it was a very steep cost, and this was all just for one integration as well. Soon after we built the integration with Salesforce, we realized there was a lot of value in having our data like that in a central warehouse, and so we started to want to pull over data from email marketing tools and things like that, and we actually, because the cost of building one pipeline was so high and because the dealing with different APIs from different tools is often very different, we actually weren't able to put together the resources to build all the integrations we wanted. It was the sort of thing where the tool was extremely valuable, but it was just so expensive to build in-house that we weren't able to build the full capacity of it that we wanted. But like I said, it was an extremely valuable tool despite the very high cost that we paid to build it.

David: All of the issue that Benn just mentioned were basically the exact issues that we set out to solve when we went to build Pipeline. Just to reiterate here, what pipeline actually does is it takes any number of data sources and enables all of that data to continuously flow into a single Amazon Redshift data warehouse. Right now at this point in pipeline's development period, we are extremely focused on expanding our base of integrations that we offer, so if you see an integration, or if you have one or you need one that you actually don't see in front of you today, please feel free to let us know, and if you want to learn more about again our RJMetric's Pipeline or Mode, please stick around for the demos at the end.

Now that we've gone over exactly how to get your Salesforce data into an SQL-ready and an analytics-friendly environment, Benn's actually going to start this section by showing you how you can really start to harness the power of SQL to improve your Salesforce reporting, and the way that Benn's going to show you how to do this is by layering Mode directly on top of your Redshift data warehouse.

As a little aside, we here at RJ actually love using Mode internally for our analytics, mostly due to the ability to really rapidly move from SQL query directly to visualization, and then on top of that just to have that blank canvas of really being able to come up with the perfect visualization you need and then expand that analysis outwards to the company for further collaboration and further sharing. With that, I'm going to pass it over to Benn, and he's going to showcase some examples of how using Mode on top of a data warehouse that contains your Salesforce data can really start to expand your analytics capabilities. Benn.

Benn: Thanks, David. To get started, before I show you the way to build some of this Mode, I want to start with something you're probably familiar with, which is the first method that David went over in the three methods of analyzing Salesforce data, and that's using Salesforce reporting. I want to use Salesforce reporting here to answer a couple simple questions. One of them is very basic. It's just determine how much business closed each month. That's an obviously valuable question for us to know.

The second one though, if we add a little bit of a wrinkle, is let's try to figure out how much time it took for each of those deals to close so we could see if our sales cycle is growing longer or if our sales cycle is getting shorter. I want to say of all the deals that closed in September, how many closed that the opportunities were started that month, how many opportunities were started in the previous month, how many started two months ago, and so on. This obviously has implications for financial projections and things like sales resource allocation to better understand our sales cycle and how it's evolving. Let me answer this first question. We can use Salesforce for this, and it's pretty straightforward.

Here's an example of a sample Salesforce report we built on some sample data that shows the number of opportunities closed by month that's new business, and here you can see below it the simple filters that we applied to the sample data to show where the stage of the opportunity is closed, one, and the type is new business. Very straightforward. This is a simple thing for us to create, and I'm sure many of you already have charts like this in your Salesforce things, but say you want to use RJMetrics Pipeline and Mode to recreate the same chart. How hard is that to actually do?

Turns out with the data provided by RJMetrics it actually only requires six lines of SQL, and you can actually translate the Salesforce report directly into the SQL query that's written here, and in some ways the Salesforce report probably actually generates the SQL query behind the scenes for you. So here you can see this is we're querying the, if you're familiar with SQL, we're querying the SF, so the Salesforce opportunity table, where stage name is closed one and the type is new business, and we're grouping it by month.

This is very simple. It's very basic, and if we add a chart on top off this, which is what Mode allows you to do, then you can see we have the exact same report as the Salesforce report already generate, so we have the number of deals closed per month. And here again because this is in Mode and this is existing on a live report that's sitting on top of RJMetric's data that's always updated, if we want the latest results we can just click the refresh button in the upper right-hand corner and always get the latest results.

In this case we can use Mode to largely replicate this Salesforce report with one small added side Bennefit of a lot of people throughout the business don't actually have Salesforce accounts, because you've now moved the data outside of Salesforce. You've in many ways broken down the silo that limits access for a lot of people to that data, and now can provide them the same information that's in Salesforce in tools that are external to that that they don't have to pay for Salesforce license to use.

So like I said, now let's try to make this a bit more interesting and add an additional question to this data, and let's try to answer, how do we actually cohort this data so we can understand how long it took for each of these deals to close? In Salesforce this actually gets very complicated. This is not something I've actually ever been able to figure out how to do using Sales force. Maybe there are some folks out there who are Salesforce wizards and really good at the Salesforce reporting tools and can actually do this, but this is something we've actually never seen replicated in Salesforce, and part of the reason is, like David mentioned, the Salesforce reporting tool just isn't robust enough to answer these question, and you can't do things like computed fields, and I can't calculate the amount of time between an opportunity created date and an opportunity closed date.

So at this point we could do this in Excel. We could pretty easily export all of our opportunities into Excel, do some math on the dates, and ground them by the amount of months, and then build the same, sort of cohorted chart, but then the data's no longer live. It requires these manual updates, like David was talking about earlier, and it becomes a very involved process. Alternatively, if you're using RJ Metrics Pipeline and Mode, you can actually create this cohort with one additional line of SQL.

So the query on the left is the query that we saw before that's looking at the number of opportunities per month and the query on the right, which adds this one second line which has a date diff function which calculates the difference between two dates, calculates the age of that opportunity. With this one additional line of SQL, you've now been able to figure out the cohorts that these opportunities actually have.

So now if we put another chart on top of that, we can use some of the more advanced charting functionality that Mode provides and then we can now create this stacked bar chart that has these gradients showing the length of time it took for each of these deals to close, and so very quickly with one line of SQL and a quick update to a chart, we can now answer this much more complicated question because we have access to the raw data that underlies it.

With this chart we can see that in October we had a lot fewer deals closed. It had a short sale cycle, but some more deals closed in the one-month window rather than in the two-month window. This starts to help us provide some insight about our sales team and about how our sales process is going all with just one additional line of SQL, and as before this report is always live, it's always updated. Anybody in the organization can see it rather than having to rely on someone pulling data out of Salesforce, exporting it to Excel, and sending them the latest version once a week or something like that.

This is just one example of a case when using SQL and using raw data provides a lot more opportunities to answer questions than using reporting directly in Salesforce. There's a few things here that you can do that you can't do in Salesforce. One of them, much like this, is you can calculate metrics that Salesforce can't, things like figuring out how long it takes for an opportunity to travel between stages. Say you want to understand, how long does it take for our average opportunity to go from in contact to contract negotiation?

With SQL that's actually very easy. In Salesforce you have build very complicated reports to figure that out. Similarly, say you want to understand the actual close probabilities of accounts at different stages. That's also kind of difficult in Salesforce, but in SQL it's very easy to say of all the opportunities ever reached this stage, which percentage of them closed, and that helps you really refine your financial projections and really better understand whether or not your sales team is actually hitting the targets they need to hit and if they're actually reaching the opportunity stages they need to reach to meet those metrics.

Another problem actually that working with SQL provides is that you can work around messy data. Often if you're using Salesforce reporting, everything in there has to be taken as truth. In the report I showed before where we had limited opportunities that were new business, we had to rely on those opportunities actually being tagged correctly, but as I'm sure anybody who's worked with Salesforce data knows, Salesforce data can get very messy. Accounts can be entered twice, you can have duplicate accounts, new sales folks can mislabel accounts as new business when it's actually an upsell, and lots of other data entry problems emerge, as with any tool it requires a lot of manual entry.

When you're working with Salesforce, again, you don't really have a way to work around this. You have to rely on the data actually being accurate, But in SQL you can work around these problems by adjusting your queries to show only the first opportunity created with a particular account regardless of the business type, so that ensures that it's new business rather than an accidental new business entry, or you can look at distinct accounts rather than all accounts, which will help you weed out duplicates in case duplicate accounts get created.

So in that process you can not only build a better report, but you can also start to understand if your Salesforce data is messy and start to clean it up so that the entire sales team has a much more accurate picture of how the sales process is actually evolving. David has a few other examples of instances when they've used this Salesforce reporting and helped add to what the functionality is that Sales force [inaudible 00:21:29] out of the box. So David, turn it over to you to provide a couple of these examples.

David: Thanks, Benn. Just to piggyback on top of what Benn's saying here, I actually wanted to show a few examples of something that we're doing with our existing Salesforce data that just would be really challenging to do inside of Salesforce. The chart you're actually seeing in front of you right now is a real chart that we use internally to really track top-of-the-funnel sales activities for our account development representatives. For us, the number of calls held in a day is actually a really critical leading indicator for the deeper funnel sales performance the following week, so for us it's really important for our sales team leads to make sure that we're always improving on this front on a daily basis and make sure that week over week we're improving.

This chart right here, it allows us to observe how our daily performance differs week over week in these top-of-funnel sales activities and it allows us to really quickly identify shortcomings as they happen, and then we can act on them really before it becomes too late. And again, an example of a relatively simple analysis, but very insightful, that just really wouldn't have been possible to build in Salesforce's native reporting UI.

One more quick example, again, still using 100% Salesforce data at this point, but now what we're doing here is we're actually layering goal data over top of the actual data coming from Salesforce. This additional layer of information really makes this chart extremely valuable to our sales leaders here, because it allows them to quickly get that insight of if we're constantly hitting our daily goals or not. But these are really examples that are still dealing with strictly Salesforce data, and we really haven't touched yet on the power of actually using the Salesforce data in a data warehouse and starting to tie it with your other data sources, and to start a little bit more on that I'm going to actually pass it back to Benn here.

Benn: Thanks, David. In the list of Bennefits of using SQL over directly in Salesforce, one of them was excluded actually, and that was one of the biggest ones that David just mentioned, that you can combine data from multiple sources. Rather than just looking exclusively at Salesforce data, you can look at Salesforce data from...and combine it with your product data, your usage data of how people are actually using your application and website, or we can combine it with other external services, for instance a help desk data or payments data.

I want to go through a couple examples that we actually use here of combining our Salesforce data with these external data sources. They both revolve around the central question of we want to make sure that our sales team, when they reach out to prospects, are talking to the right people. We want to make sure that we're always talking to our potential champions rather than someone who may have not had as great an experience with Mode. So how do we do this?

Here's an example of a query that we used that actually builds this report for us. This query, as you can see, is a bit more complicated than the previous one, but it's actually a fairly simple query, and especially simple relative to the complexities that typically go into trying to answer these questions. The first third of this query you can see that we're referring to something in sales touches, and this is the Salesforce activities data where sales folks logged called and emails, and it looks at for customer emails, how much are we actually contacting each of those customers, so how much do we call a particular customer, how much do we email them for an account that we've called Octan here?

The second third of this query is the part called product activity. It uses production data that we have from our app, so it's actually looking at application data of Mode and looking at, how often are people logging in. We're using the same email address of our users and then seeing, how often are they logging in, and when was the last time they actually logged into the product? Then the third part of this joins these two things together. It joins the product activity with sales touches on a user's email address.

The result of that is a table like this. This is a very simple report. The output is very basic but it's also very powerful. In this table we can look at for this account that we're looking at, which we can call Octan, we can see that the top two users of Mode, Samantha and Jessica, are actually people that our sales team have reached out to. They've talked to them a few times each.

This is great. This shows our sales team is doing a really nice job of finding the champions and making sure that we're talking to folks that really love our product. But if we look a little further down on the list, we can see that Christopher is actually the person you talk to the most, and he's about tenth on this list. That's not necessarily wrong. Christopher may be a decision maker or somebody else who's our sales contact but not necessarily the person using the product every day, but there's these gaps in between Christopher and the two folks at the top that may provide some opportunities. For instance, James, the third highest person, we haven't ever talked to, and he's the third most active person on Mode. So this might be a great opportunity for our sales team to contact James or our marketing team to contact James and really build another champion at this account for us.

Similarly, Matthew looks like he might be a champion, but we can see from the last-seen date here that he hasn't actually logged in in over a month, and so this potentially raises other questions as well. It may show that we should reach out to Matthew to re-engage him, or it may show that he's not a great champion and we should perhaps move further down the list when we're looking for other folks to contact.

So this report can be very helpful for these sorts of things, and really helpful for our sales team, but in some ways it's actually a little bit limited, because this is for one account. Our sales team is making calls to lots of prospects every day. We want to make sure that they're always looking at this information for the prospect they're calling and they're not having to ask someone to build this report for them. Mode provides a way for you to be able to do this.

Within the Mode interface, within the queries that you create, you can add these parameter fields that allows anybody to update the queries without actually having to edit the SQL. So in this case, when someone on the sales team comes to this report, they can click the refresh button to open up this field and then they can type in the account name of the account they're about to call. And this way, what was once this very basic static report becomes a very dynamic tool that can be used by the rest of the business to quickly understand how prospects are performing and who the sales team should be contacting anytime they're making a call.

But we can actually go one step further with this and make this integration with Salesforce even tighter and easier for the rest of our business. This is the URL to this report. It's just a basic URL, but Mode allows you to add parameters within the URL, and so if I add these two parameters within the URL where we define the account name directly in the URL, now when I visit that page it will automatically update through a port to that specified account. So if I go to this URL, it will update this report with the latest data and fill in the parameter with the account name, with this Octan name. This may seem like a fairly basic feature, but it's really powerful if you actually integrate it back into tools where the rest of your business lives.

So we've done this here at Mode and integrated reports like these directly into Salesforce. This way, when the sales team goes to look at an account in Salesforce, which is where they're living every day, they can look on this most active users field that we've created,. It's a custom field in Salesforce. We've created that field such that it dynamically populates that URL with the account name directly in the URL. That way, the sales team can just click on one link and immediately have the most up-to-date data on who the prospects are that they should be talking to within that account. It's a very powerful tool to tighten the circle between Salesforce and really close the loop between Salesforce data being pulled over by RJ Metrics, analyzed in Mode, and then that Mode report being available directly in Salesforce. The sales team never has to actually leave the tool that they're comfortable in, but we're actually cycling through these other tools and providing them all the Bennefits of these other tools with them being able to live in the environment they're most comfortable.

And again, this is just one example of other things you can do by combining data sets in a tool like RJMetrics. There's lots of other things you can actually do to help your sales team by combining data sets from different data sources. For instance, you may be looking to reach out to people as they become very active to make sure you're getting the iron while it's hot. In these cases, you can build reports where you can see who's the most active user in the last week, who's your sales team's talked to that may have been a recent champion or may have been someone who is just now starting to really love your product and still in that honeymoon phase when they're a real opportunity to talk to.

Second, you may want to look for other indicators of champions. Product usage is one way to look at it, but there are other things you could find that might indicate someone who's had a great experience with your product or someone actually hasn't. One way that we do this is we combine usage of support tickets. We want to see who has had a really great experience with our support team, who's had support tickets closed very quickly, who's rated our support very highly. Those might be people who have a very positive experience with Mode and might be great people for our sales team to talk to.

But on the flip side, we might actually want to avoid talking to people that have submitted a lot of support tickets that have complained about a lot of things or seem to not actually have as good an experience as other folks. Those may be people either we can reach out to try to make sure that we're giving them as best experience we can, or if we're trying to get people on a sales call and we want to have all of our champions on the call, they may be people we don't want to actually reach out to.

But all of these things can really help inform our sales team to make sure that they're not just talking to the person that the prospect says they should be talking to, or they're not just talking to whoever it is that they found first, but they're talking to the person who actually will provide the most value for that conversation. And again, I finally want to turn it over to David once more, who has some examples of other ways that they've actually combined data like this within RJMetrics.

David: Great. Thanks, Benn. So this is actually a really great example of the kinds of things you can start doing once you start to consolidate all of your data sources together. This is actually something that we recently wrote a blog post about on Mode blog, and what you're seeing here is a comprehensive acquisition and onboarding funnel for our pipeline product.

To give you background, we really wanted the ability to track a single prospect from the first time they visit our website to the moment that they become a paying customer, and we wanted to be able to see which steps between these two points were the steps that were acting as bottlenecks in this process, the problem being the data for each one of these individual steps actually lives in a variety of different data sources, including a Redshift database via Snowplow, Salesforce Pardot, and various MySQL and [inaudible 00:32:40] databases. Luckily, Pipeline allowed us to consolidate all of those data sources into a single Redshift data warehouse and write a query that can produce this beautiful chart you see in front of you, and now we can actually really quickly get the insight into exactly where users are falling off in this process and we can watch how this funnel changes over time.

Another really good example of something we recently did that's actually really important to us internally that would not be possible without a data pipeline tool is to be able to look at our self-service score over time. I'm sure that many of you probably already know this, but there's a really great measure of how effective your help center is in getting users to service themselves instead of filing support tickets, and it's calculated as the ratio of unique people visiting your help center content and attempting to help themselves to the unique number of people that are actually just filing support tickets, the problem here being that the data to determine the number of people that are visiting our help center and trying to help themselves actually lives in a Redshift database via Snowplow, and the data to determine the number of people that are filing support tickets lives in Zendesk.

So again, this is a great example of how Pipeline allowed us to quickly consolidate these two data sources into the same data warehouse, and now we're able to have a report that monitors this KPI over time. The examples that I just went over and Benn went over are honestly only scratching the surface of the kinds of stuff that you can do once you start to consolidate your Salesforce, your help center, your web analytics, or even your production databases into a single data warehouse. Luckily, we're entering a time where technology like Pipeline and technology like Mode makes this process easier than ever to set up.

With that, that's all we have for you right now, so at this time we're going to open it up for Q&A. I'm going to see if we have any questions here. It looks like the questions are already starting to flood in here. Let's see what we have. It looks like someone has asked, "Can Pipeline connect to Postgres, NetSuite, and MySQL?" As of right now, we actively support MySQL as a database integration. Postgres and NetSuite are actively in development. We actually are soon reaching the end of our development cycle with those, so they are surely coming soon as database integrations. Let's see here. Benn, I'm going to pass this one to you. It looks like someone's asking, "What are the advantages of using SQL over other programming languages?"

Benn: SQL is very much designed for asking questions of data, for extracting data from databases and manipulating that data in ways that's most Benneficial for answering questions. There are certainly other powerful languages out there for performing analysis, things like R and Python can be powerful tools for that, but often for a different audience. Those types of tools have a steeper learning curve and are often used by data scientist types, where SQL is a much more accessible language that is actually fairly straightforward to learn, and one of the things we've seen from a lot of our customers is because Mode makes that SQL accessible a lot of people who are sort of on the edges of knowing a little bit of SQL but not fully comfortable with it actually take a lot of time to dig in to it and learn, because they can use Mode to answer a lot of their own questions.

So it's largely because it's just the ideal way to interact with data in a database, and really any analysis begins with SQL. If you're using something like R and Python, you always have to pull the data out of a database to begin with, and that always involves writing SQL. So it's very foundational for performing analysis like this, and actually for many of our customers, and some of the largest companies that are out there, companies like Facebook and Linkedin and the leading tech companies in Silicon Valley who you'd expect to have the most advanced data scientists in the world using the most advanced tools, we've talked to a lot of their analysts, and SQL is the meat and potatoes of what they do.

So for folks actually looking to learn SQL, Mode has a resource. We have a tool called SQL School. If you go to SQLschool.modeanalyticscom, it's a tutorial that walks through the basics of SQL and how to use SQL for analysis. A lot of the SQL tutorials out there are designed for things like database administrators and for how to create databases. This tutorial is designed for you have an existing database and you want to ask questions of the data in it, what skills do you need to actually be able to do that?

David: Great. Thanks Benn. We got a question here saying, " Can you elaborate on the connecting of various databases to identify the bottleneck for a customer conversion cycle?" Yeah, absolutely. I'm actually going to hop back to that slide so we can dig in a little more. So basically what we did here, and we outlined the data sources that these various steps come from in this chart, is what I went over is saying the critical conversion points for us in our acquisition and onboarding funnel, the problem that we face when trying to analyze them all together like this is that they live in a large variety of different databases, and I'll run you through each one right now.

For example, the actual visiting of our website, the visiting of our sign-up page, we track that data via Snowplow, and that gets dumped into a Redshift data warehouse, but then the form submission, someone actually raising their hand and submitting a form saying I'd like to sign up, that data is collected through Pardot, and then the various steps after that, like the client making an account, connecting their data sources, connecting a Redshift database, replicating data, all of the data points that tell us if those events happened, live in a variety of different MySQL and Postgres databases.

So if we really wanted to get this comprehensive view like you see in front of you of this whole cycle and see exactly where the conversion points and the drop-off points are, we really needed to get them in to one singular place to start querying them, and the point I was trying to make is that this process has historically been extremely difficult, but luckily now with a data pipeline tool like Pipeline this becomes extremely easy. We can easily get all these data sources into one singular place and simply write a Mode query that gives us a beautiful output of a chart like this, and again, we can continue to watch this over time now that we have it in front of our eyes. I'm going to look for another question here. Looks like someone's asking, I'm going to toss it over to you, Benn, "What is your favorite CRM data analysis that you've ever worked on with a client?"

Benn: That's a tough question. I think the most interesting things have come from one of the things I mentioned in the slide when I was talking about other examples, which was figuring out how long it takes for deals to progress from one part of the pipeline to another, from one part of the sales stage to another. When people initially see okay, we have five sales stages, we have prospect, in contact, contract negotiation, security review, and then close one, it's often easy to think about those as each stage is kind of you're going from one to two to three to four to five, or each one requires the same amount of effort to get from one stage to the next or the same amount of time. I think in this analysis we actually found that going from, I don't remember the exact example, the exact details, but going from in negotiation to security review was the sort of thing that often took a day, and it was one phone call in a day, but going from security review to close one took ten phone calls in two weeks, or going from prospect to negotiation took even longer.

It really helped realign the sales team around these are the parts of our sales process that are the ones we need to focus on, and rather than having a team that's designed for dealing with each stage equally, we should have teams that are designed for really focusing on the ones where we're seeing the bottleneck. That was one of the things that I think it was you kind of have these assumptions of oh, the sales funnel is a nice smooth thing, when in reality it's this very jagged, and constricts and widens and narrows at various points that you wouldn't necessarily expect, and so just by looking at a simple question like that we actually could find some of those cases where that was the case.

David: Great. Thanks Benn. It looks like we have one more question here. The question is "Could you please say some words about statistical modeling. Is there any possibility to use OurProject or SAS?"

That is a great question and one I love answering, because the honest answer to that question is that Pipeline really is only the data consolidation part. You can really do anything you want on top of Pipeline as long as you can connect to it and run some level of a SQL-based query on top of it. For example, internally what we've done is we've actively used R on top of our pipeline data to run statistical models. So again, just to reiterate there, really anything that you would like to do that can connect to Pipeline can be run on top of Pipeline again as long as you can connect to it and run some level of SQL-based querying on top of it. Benn, I don't know if you want to speak to anything regarding R or SAS here.

Bennn: I think that R or SAS feels like Python. Those tools have a great use case, especially if you're doing the more advanced modeling like was referred to in this question, and there are places where SQL is great. SQL is great at interacting with a database, it's great at doing a lot of the counting and averaging and stuff like that that's the meat of a lot of analysis. It's not great at building these models, and I think that you have different languages for different things, and that's one of the places where R and SAS can really shine. At Mode currently it's a SQL-only tool, but we're very aware of these places where people are looking for other tools to really expand on their analysis, and that's something that I think we'll be looking for things in the future there where we can actually help those use cases out as well. SQL, like I said, it's the heart of analysis for most analysts, but it's not the end, and that's a thing we're very much aware of.

David: Great. Looks like we got one last question that came in here. It looks like the question is "What's the best way to deal with messy data in Salesforce?" Going into a little bit in to what the issues are with their Salesforce. So basically, what we've seen,and I honestly believe that what you'll see is the majority of people have messy data coming from Salesforce. It's certainly not a problem limited to anyone, and again, just going over it, the beauty of just being able to use raw SQL on top of this data, it...the capabilities of what you're able to do to clean this data as well.

We have personally, I know speaking from experience, what we have happen is we actually have scheduled SQL runs that happen once every day that essentially does transformations necessary to clean and shape our Salesforce data in a way that makes it more ready, more prepared for analysis, so again, just the ability to run this raw SQL on top of it really opens up a lot of capabilities for cleaning and transforming, etc. So I think that that is all the questions we have for right now.

So at that point I'm going to conclude the Q&A, but before we jump right into the demos I wanted to announce our cupcake winner. This is certainly the most exciting part in my opinion. So our cupcake winner today is John Michael Valet. John, please send us an email at, and we will get those sent out to you as soon as possible. With that, I'm going to jump in to a quick Pipeline demo here, so let me start to share my screen.

Everyone should be able to see my screen right now. I'm going to take you through Pipeline. This is the main Pipeline UI right here. What you can see, we got all systems go here. These are really just little status indicators telling us that everything is good to go. Green means good. You have an overall one right here telling us that our integrations and our data warehouse are both good to go, but for further granularity here we actually have it broken down to see if your integrations are good or if your data warehouse is good. These could be things like connection failures, etc. So right here again we are all systems go. Good to go.

Right here it tells us how many rows we replicated per month this month, and here are our active integrations right here. Here are the integrations that we currently have active. Again, little status indicators here telling us that we're all good. You can see I have all these integrations. Here I have my My SQL one. I'm just going to click into my MySQL integration right here and take you for a little tour. Here's another little monitoring capability. You can see how many rows you replicated per day for this specific connection, and you can also see just about how much of the data coming from this connection is responsible for all of your data this current month. So again, here are my databases inside of the MySQL integration.

I can come in here, I can set different settings for this integration. For example, I can set the frequency at which this integration replicates. I can set it down to a minute. I can set it up to 24 hours or anything in between, and I can click into each database, and you can see here that I can see all of my tables in the given database over on the right here. I can set each one to sync if I want. This will take me to a table settings page. I can choose incremental replication, which really just means that Pipeline's only going to replicate new or updated data every time it tries to replicate the data, or I can switch to full table replication, which means that Pipeline's going to replicate the entire content of the table every single time it goes to replicate the data.

I'll leave that as is, and again, I can even go further inside of a table here, and I can even sync the individual columns that I need. So a very very large area of control over what you need to replicate here. I'm going to go back to the dashboard right now, back to our main integrations page. You can see that if I actually want to add a different integration here I can click the add integration button. Here are all our current live integrations we have. You can see that it actually will also tell me which ones I currently have active, so I have two AdWords active on my SQL, and connecting other ones is very simple as well.

For example, if I wanted to connect Google Ecommerce, I would simply click on it, and right here it would just ask me to sign in with my Google Analytics account. At that point it's totally done. It connects right away, and the data starts flowing immediately. Same with any of these. For example, I'll bring up Zendesk. Zendesk just asked me to name my integration, give you the site prefix and a quick login and API token, and you are good to go. That data will start flowing immediately into the data warehouse.

Going back here back to the dashboard, any warehouse settings we need to look at, I can go here up to the top right, click on warehouse settings, and again, it just shows us our data warehouse settings right here. A little testing area to tell you that we're currently trying to establish a connection, and then right there it will tell you we're good to go. That's really all there is to it. Pipeline is an extremely simple...but extremely simple tool. At this point I'm going to actually going to turn it over back to Benn, and Benn's going to run you through a quick product demo of Mode.

Benn: Great. Let's see. Let me show my screen here. So this is what Mode looks like. This is the homepage for Mode when you log in. You're looking at the list of all the reports that have been created in this organization. I'm in a demo organization here, but for your company you could see all of the reports that analysts are creating within your organization so that everybody's always on the same page. You can sort your reports by different things at the top.

You want to stall reports, you can do things like that, so you can always find the things that you return to frequently.

We also have things like an activity feed, where you can see what's actually happening within your report, within your organization. Here are some of the things that have happened recently. We can see that 12 minutes ago Chris Davis published a report called Demo. So it provides visibility into the analytics work that's happening across your organization so you can understand all the things that are being created for folks, for both you and for other people in your company.

To show you what an actual report looks like, I'm just going to search for a quick one here, go to an example report. So here's a report. This is what you saw screenshots of in the presentation. You can see reports have charts, they can include tables. It's whatever it is you want to actually add into your report to create it.

Say you're looking at this report. If I wanted to update it, all I would need to do is click this refresh button, and in this case this data is static, so nothing will actually change, but I can run the report, and now I have the latest version, as you can see. It was run just a few seconds ago. If I wanted to see the raw data underneath it, say the report just showed a chart and not a table, I can always see the raw data.

If I want to see the query that generates it, I can click on the query tab and see the query that actually produces this report. This, like I mentioned before, is one of the things that really helped expose SQL to a lot of people, a lot of our customers and a lot of these people who are familiar with it but not comfortable writing it. They would see reports like this, see the query, and be like, "Oh, I want to show this by week. It's pretty basic for me to just change this one thing and that's always what happens," so people become much more comfortable with SQL because they become exposed to it in this way. This report page is actually also an HTML. We provide access to the underlying HTML in this report, so you can create whatever custom visualizations, whatever custom styling to your reports that you want.

In this case, I can click on the HTML tab to see the HTML that's actually powering it. This is a very basic report. This is stock options that we provide with a chart and table and some header descriptions, but if you wanted to add something much more interesting or much more complicated, I can actually do that. So here's an example of a custom visualization that we've created using a heat map, and so we wrote it in an external library that builds a nice heat map like this that we can put on top of our data if that's how we want to present it. So what actually underlies this? How do you create these? I'll actually prior to that show you one other report with parameters, which is what I showed you before about being able to enter in account numbers.

So in this case this report has a parameter that gives business users and anybody who's not comfortable writing SQL the ability to change the data, manipulate the data without actually having to edit the SQL. In this case, if I wanted to refresh it I could pick an event, and so say I wanted to instead of showing a login event show some message event. I can change that parameter, click refresh results, and the query will now update with that new parameter in it.

Queries actually have a lot of flexibility. A lot of folks use them for entering things like date ranges and account numbers and things like that, but you can actually have them programmatically generate your SQL queries and do much more advanced things if you'd like. What actually goes into creating this report? If I click on edit, I can see the editor that builds it. So here's the SQL query, here's the result it'll return from it. Say I wanted to edit it, and like I said, maybe I just want to change this day to week. I can very easily do that and rerun the report and now see the new version of that report with the new data.

If I want to see what it looks like, I can now click on the report tab, see the new report. Say I want to edit it a little bit, like maybe get rid of this table and add a new chart. Lets' say we add a bar chart and format these dates so that they look different and maybe change them so they're formatted with a different style. I can now add this chart into my report. My report now has a couple charts in it. If I wanted to I can now click preview and see what it would look like if I want to share it with the rest of my organization, and so now my report is updated with the new data and the new charts.

It can also provides access to the history of the reports too if you want to always return to old analysis, if you want to share a report but know you can get back to it. We have a historical runs here that you can always see the versions of the runs that have happened in the past so you never actually lose any old data that you shared out before.

Finally, I want to show the process of actually connecting data. Say you're using something like RJMetrics, and how hard is it to actually integrate Mode within RJ Metrics? To do that is actually very simple. If I go to the homepage and click on this add data button. I want to connect a database. Now, Mode connects to a lot of different databases.

It connects to Redshift obviously, which is what RJ Metrics is powered by, but we also can connect to things like Azure, things like BigQuery, Snowflake and then other on-prin [SP] databases that are a lot of times used for more advanced SQL languages, things like Impala, Hive, Presto. Mode also connects to those as well if you're using those databases.

To show you a quick example of how it connects, say I want to connect to my RJMetrics database. I would just click connect on Redshift. I'll be taken to this form. In this case, I have a form pre-populated here. This is the same form pre-populated with some of my Redshift credentials. I can click connect, and then you're ready to go. Your database is now connected. I can click New Query and immediately start querying it. In this case, say I just wanted to write a very simple query that was select one. I can just run that, and here's the query that's running against that database. That's really all it takes to integrate with RJMetrics is filling out this simple form, and you're good to go. So that's Mode, and with that I think we can turn it back over to the slides. I don't know how to do.

David: Great. Thanks Benn. Awesome. That really wraps up our event for today. I hope everyone learned a lot. If you would like for us to get in touch with you either about Mode or RJMetrics Pipeline, please let us know here. We'll have the right person follow up with you after the show, and again, thank you so much, everyone, for joining us. Have a great day.