How to Use Data to Drive Your Content Strategy & SEO
- May 20, 2020
- 53:17 Watch
Learn how you can use different types of data to organize and produce audience- and SEO-pleasing content. We cover how to use everything from Google Analytics, Google Search Console, backlinks and your page copy to inform your IA and produce better content.
Transcript
Jen Boland: Hello. My name is Jen Boland and I am a senior SEO lead and content strategist at the Beaconfire RED division of AFG (now Allegiance Group) and the mail person just came, which we anticipated happening during this call.
TJ Peeler: I’m TJ Peeler. I’m a senior UX designer and content strategist at the Beaconfire division of AFG as well.
If you’re here for how to use data to drive your content strategy and SEO, thank you for using your lunchtime or giving us an hour of your day to listen to our thoughts on data and how data can really help improve your content. When we’re talking about content strategy and SEO, what we’re really saying is that we can use data to improve your content. Content is the most important thing on your website. Content can include HTML pages, PDFs, videos, podcasts, publications, it can be imagery, infographics. So really it’s the meat and potatoes on your website. No one is really coming to your website to just check out your design or just stop by or see your fancy widgets.
They’re really there because they’ve got a question or a task that they are trying to accomplish, and they think that your content will have the answer for them. So that’s our point of view for this talk, basically we’re starting from that assumption. And I think the advantage of using data, probably, I’m not sure we need to persuade you as to why to use data because you’re probably all here cause you’re like, Oh, that sounds like a really good idea. Data can help you make these decisions, take it out of a vacuum and can really help you form consensus among stakeholders too.
I’m sure you guys are probably all bought into data can be persuasive. You can do testing, show reasons for things as opposed to just opinions in a room, arguing back and forth over something. That’s why we like to use data. A little bit more about what we’re going to talk about today.
We’re going to start with some definitions and then we’re going to introduce you to one of our clients that we are using as a case study, The American Physical Therapy Association, the problem that they came to us with and how we thought we could help them.
And then Jen is going to talk a little bit about modern SEO and how that’s changed the internet landscape, how people’s behavior on the internet has changed, and how you can see that in SEO. And then we’re really going to dive into what y’all are here for, which is how to use data to write your content, how to use data to audit your content, meaning trim it back, and how to use data to organize your content.
So bear with us through some landscape setting as we bring you all along for the data and the content part. Okay. So what do content strategy, SEO, and data mean? Content strategy is the planning, creation, and delivery of good content that helps your users accomplish their goals or helps your business accomplish its goals.
The key thing with good content and good content strategy is that it’s connecting the user’s goals with your business goals.
Jen Boland: And then we want to talk a little bit about SEO. SEO is the practice of increasing both the quantity and more importantly, the quality of traffic that comes to your website through organic search engine results, typically Google, but also Bing and Yahoo and all of the other players.
But I really want to focus here on the word quality because not every website visit is equal. You want to write content that’s going to bring the right users to your site. Not just any user to your site. It’s a waste of your time to develop content that brings unqualified people to your website.
We’ll talk about how we think through this process of how do you get the right people.
So I just want to quickly talk about the two different kinds of data that we like to use to help inform this process. The first is the quantitative data. This is the data that you’re going to get right from Google Analytics. Most of us are familiar with Search Console and some other tools.
These data points have numbers associated with them and they can tell you how often something happened. But they can’t tell you the why, which is why we need the qualitative data. The qualitative data tends to be the why does this happen? Why did that happen? It helps us really understand what our customers are thinking.
It helps us understand why we might be seeing a drop-off in our numbers that we can’t really account for when we look at the site ourselves. And together, the quantitative data helps pinpoint where there might be problems, and the qualitative data helps give us the information that we need to actually solve our problems.
TJ Peeler: We are going to be talking about how we’re using data to inform writing content, auditing the content, which is basically looking at all the content that exists and making decisions about whether it’s good or bad content. Is it helpful or not? Do you need it really? And then organizing the content.
But first we want to do a little overview of one of our clients that we’re using as a case study for this talk, The American Physical Therapy Association. We call them APTA. That’s their acronym. So APTA is a membership-driven association.
They support physical therapists, physical therapy assistants and students. They have an interesting membership organization because most members are required to join as a school requirement. Their big thing is showing member value so that people continue to renew after they graduate from PT or PTA school and have massive amounts of student debt.
So APTA came to us with this issue, this problem, that’s not unusual. They had thousands of pages on over a dozen different sites. They have all these customer support tickets. There’s no overlying governance strategy for when content gets posted and when content doesn’t get posted. They had a very reactive method of posting content where you have a few vocal members asking for something, and they would just throw something up on the website, and this grew organically.
We see this all the time with websites. You start off with this contained sphere of content and then 10, 12 years later it looks like an unwieldy octopus monster thing. There’s a lot of required content, politically necessary content.
There’s tons of different microsites. There’re actually five different sites for clinician resources. So a lot of PTs are working in clinical settings with clients and patients and they’ve got five different websites for all their resources. The end result, and we hear this all the time, is no one can find anything on our website.
And that is always a symptom of a problem. What’s underneath that? Is there too much content? Is it not the right content? What’s happening there? Is it the organization? What is it?
It’s usually some combination of all of that. So APTA knew that they were probably going to end up cutting back some of this content. They came to us aware that they had a bit of a content overflow problem. Their goals for the redesign were to increase membership and renewals.
And as I said before, people join them in school. And the real thing is trying to keep them engaged after they graduate with all the student debt and keep them renewing their membership. Keep showing them the value of their membership. At the same time, they were undergoing this big brand redesign.
They were coming to us having done a fair bit of research and work that we don’t normally see clients coming to us with. Where they had new themes that they wanted to organize their content around. And they had some very good brand research that we’ll talk about in a little bit too.
And then they also wanted to make top tasks easier. And we’ll talk more about that.
So in working with APTA, we had some initial strategy decisions that were based on their goals. There was a big decision that content should serve the interests of 99% of members. Again, APTA gets that 1% of very vocal members who want something very specific and they think everybody’s going to love it and want to know all about it.
So they’re really trying to serve most people, not everybody. They want the content to be up-to-date and they want the content to be on brand. We knew that we would be needing to do a substantial content audit and look at trimming back some of the content. So we call this trimming back the ROT, which is the redundant, out-of-date, and trivial content.
And we’re going to talk a lot about how we used data to make those decisions. Because again, remember they had 13,800 pages, so that’s a ton of content to manually review. So we wanted to automate at least an initial pass of doing that review. But I’m jumping ahead a little there. Sorry. The other decision was to scale back the use of microsites and try to pull as many sites back into the main site as possible, to make a more cohesive user experience unless there was a very clear business reason that some sites needed to be separate.
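As a rough illustration of what an automated first pass over a large page inventory could look like, here is a minimal Python sketch. It is not the process used for APTA. The sample pages, the field names, and the thresholds (two years of age, 50 pageviews, duplicate titles as a sign of redundancy) are all assumptions chosen for the example, and anything it flags would still need a human review.

```python
from datetime import date

# Hypothetical page inventory. A real audit would load this from a crawl
# export joined with analytics data.
PAGES = [
    {"url": "/membership/benefits", "title": "Membership Benefits",
     "pageviews": 12000, "last_modified": date(2019, 11, 1)},
    {"url": "/news/2009-annual-picnic", "title": "Annual Picnic 2009",
     "pageviews": 3, "last_modified": date(2009, 6, 15)},
    {"url": "/benefits-old", "title": "Membership Benefits",
     "pageviews": 40, "last_modified": date(2014, 2, 10)},
]


def flag_rot(pages, today, max_age_days=730, min_pageviews=50):
    """Return {url: [reasons]} for pages that look redundant, out of date, or trivial.

    The thresholds are illustrative assumptions, not recommended values.
    """
    # Count how many pages share each (normalized) title.
    title_counts = {}
    for page in pages:
        key = page["title"].strip().lower()
        title_counts[key] = title_counts.get(key, 0) + 1

    flagged = {}
    for page in pages:
        reasons = []
        if title_counts[page["title"].strip().lower()] > 1:
            reasons.append("redundant")      # another page has the same title
        if (today - page["last_modified"]).days > max_age_days:
            reasons.append("out-of-date")    # not touched in roughly two years
        if page["pageviews"] < min_pageviews:
            reasons.append("trivial")        # almost nobody visits it
        if reasons:
            flagged[page["url"]] = reasons
    return flagged


ROT_CANDIDATES = flag_rot(PAGES, today=date(2020, 5, 20))
for url, reasons in ROT_CANDIDATES.items():
    print(url, reasons)
```

Note that the healthy benefits page is also flagged as redundant because an old copy shares its title. That is intentional in this sketch: the script only narrows down what a person should look at, and the person decides which copy to keep.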
And then they wanted to build off their brand strategy. They came to us with this very good research that they had done on why people want to be members, what’s the value, this really good qualitative and quantitative data research they had.
So to exemplify APTA’s problem, we created a persona of Jordan. This is an example of someone and their struggles with the website. As we’ve mentioned before, people join in school, then they graduate and they have a huge amount of student debt.
Managing their finances and trying to trim their expenses is really important to them. And APTA has a lot of good resources on that for them. APTA also offers the ability for people to specialize their career in different sections of physical therapy. A lot of people are very interested in growing their career as a specialist.
And then why does Jordan love PT? We just always want to keep in mind people’s motivations. She wants to help people. So we want to keep that in mind while we’re thinking through the content and the experience of the website.
Jen, back to you.
Jen Boland: All right. So now we’re going to have a little talk about what I call the modern state of SEO. And I have this infographic that I cannot take credit for, but it is published by Moz, which is a major SEO company. I really love the simplicity of the graphic and how well it explains what is really important in terms of search.
Most of you are probably familiar with Maslow’s Hierarchy. This is Mozlow’s Hierarchy of SEO Needs. For so many years, we were really focused on optimizing SEO at the top of this pyramid and doing these things at the very top. But the reality is that these things at the bottom are far more important.
The first thing you need to do is make sure that your site is crawlable. In most cases, people don’t actually have a problem with that, although you can accidentally turn something on that could make your site uncrawlable. So once we’ve checked that box, that there’s no problem with crawlability, we really move into what this whole entire talk is about, which is creating compelling content that is keyword optimized. I almost always work on these two things together. It’s figuring out how do I create a page that solves my user’s problem. They’ve typed something in; they’ve said, “I need information on X.” My job is to then make sure that we write something that answers that question fully and completely. And that is really how you win at SEO in the modern world.
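One common way a site becomes accidentally uncrawlable is a robots.txt rule that blocks everything. The sketch below uses Python’s standard-library `urllib.robotparser` to check whether a URL is allowed under a given robots.txt. The robots.txt contents and the example.org URLs are made up for illustration, and Python’s parser may not interpret every rule exactly the way Google’s crawler does, so treat this as a quick sanity check rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt files. The first blocks the whole site, which is
# the kind of setting that can be switched on by accident.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DRAFTS = "User-agent: *\nDisallow: /drafts/\n"

print(is_crawlable(BLOCK_ALL, "https://www.example.org/membership"))
print(is_crawlable(BLOCK_DRAFTS, "https://www.example.org/membership"))
print(is_crawlable(BLOCK_DRAFTS, "https://www.example.org/drafts/new-page"))
```

A robots.txt rule is only one possible cause. This sketch does not look at other things that can keep a page out of search, such as a noindex tag on the page itself.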
Really what’s important about how you write that compelling content is what the search intent is behind a user searching for that term. That is the why behind the search. What am I really trying to do when I searched for this thing?
And then once you understand this intent, you can start to come up with ideas and use the various keyword research tools to figure out what are the exact words that I need to weave into my content to help answer the question and create the content that people are actually looking for. And the thing to keep in mind, too, is that Google has spent billions of dollars on its search algorithm. Maybe even trillions of dollars. We don’t really even know. But far more than any of us in our individual organizations could ever spend thinking about search. Every single day they actually make changes to their algorithm to try to surface the very best pages that meet the searcher’s intent.
And so as we write better content, our content is going to rank better in search because it answers these questions and it is the information that people are actually looking for.
Just to reiterate the biggest influence that we can have on SEO, and really ultimately on a user’s experience, is to create compelling content that meets the search intent.
Another way you can think of search intent is what is the user’s goal? What are they trying to accomplish in that task?
The other thing that is really important, or it’s just a new idea and a new way of thinking about things, is that SEO research is user research. So as TJ said it’s how do I make sure that I’m helping the user with the task?
Doing that keyword research and understanding all the different kinds of things people search for together helps you understand more of what that user is looking for. That can be the user research if you can’t do an actual survey.
TJ Peeler: And as Jen was just saying, Google spends trillions of dollars trying to get the right content to the right person based on their search term, their search intent, their problem. So we can leverage that. Basically we just want to steal Google’s research and Google’s brain, right? Because that is very valuable, cheap, fast user research that you can use.
Jen Boland: And then what’s even more important is that, if you have Search Console enabled, and that’s very easy to do, you can see how well every single page on your website ranks in search, what keywords it ranks for, and its search position: whether you’re in the first position and you’re winning, or you’re on the 50th page of Google where you’re probably not getting any clicks or maybe just one or two here and there. If you know that you’re on the first page of Google, ideally in the top five, you can be pretty confident that page is not only good for search, but that page is also meeting all of your other website visitors’ needs. And so we can use it as a proxy for user satisfaction.
High ranking pages are also our best pages on our website in general.
TJ Peeler: Yeah. So a really key concept here is that when you’re working on the bottom of the funnel, that bottom of Mozlow’s triangle, trying to create compelling content, SEO is not about trying to trick Google into having your page rank so that you can get clicks so that your ads can be seen.
It’s really about leveraging the information that Google has on people’s behaviors to create the best content for your users. So this is a lens that we use. Also it’s great to rank number one on a search term, as Jen said, you win. But it’s not just about winning. It’s about leveraging this data that Google has.
Jen Boland: To give you even a little bit more perspective on how SEO has changed throughout the years, before it was all about the keyword and you wanted one page for one keyword, and having the short information that was very specific to the keyword was really the way to win. But as SEO has evolved, and as Google’s algorithms have evolved, this concept of content and a fully answered question is where things have gone. Take a page about fruit, and this isn’t even the very best example: we want to make sure that on our “what is fruit” page we actually talk about what fruit is and what it’s not, as well as the different types of fruit. Whereas for so many years, we just created pages that were landing pages of a whole bunch of other options without ever providing the context for how these things connected together.
TJ Peeler: And I think this change is also the result of people’s online behaviors changing, dial up being not a thing anymore. It’s a result of broadband and faster internet speeds and how people consume content, mobile cell phones, how comfortable people are scrolling now.
This is a way of viewing a very large behavior change in society, around how people consume content on the internet.
Jen Boland: So just to reiterate, this was a study done, I don’t know, probably two, three years ago at this point in time, but this was actually a correlation of how many words a piece of content had and how well it ranked in search. And you can see that all of the pages that ranked in the first, second, third, even to the 10th position, all had over 2,000 words and the best pages were upwards of 2,400 words.
I don’t want to paint a picture that this is a hard and fast rule that you have to write 2,400 words to rank well, in search. What this simply means is that these longer pages tend to be more informative. And therefore those are the pages that are ranking better in search. And so making sure that you have completely answered the question, that you’ve completely satisfied the user intent, that is how you rank well in search.
And typically that’s going to require more than the 300 to 500 words that we were all used to writing previously. That is not to say that you can’t ever rank well, with a 500-word response that is clear and to the point, but just in general, make sure that you have completely answered the question.
As we are writing this longer content, it really is important to think about the style in which we write. And we really recommend this method called pyramid writing. It helps us present information to users as they’re reading our page. So we’re front-loading the article with the most important information, and then moving into the next most important information, with the headlines being front-loaded. Each paragraph is even front-loaded with the most important takeaway from that particular paragraph, so that users can quickly skim and find and dial into the exact piece of information that they’re looking for with respect to your topic. Only about 28% of people actually read any given page on a website. So helping users be able to skim and find what they’re looking for to get the information that they need to complete their task is really important. And this style of writing called pyramid writing really helps users do that.
The last thing that is really important, is doing the keyword research to understand what are the words that my users are looking for when they are actually trying to find this content. We have some examples later in this talk, but just making sure that we do that keyword research and that we’re using the right words.
Google is great with synonyms, but if you can use the right words, that not only helps Google, but that also helps the user think, “Oh, I’m definitely on the right page because this is the information that I’m looking for” when they see those words on your page.
TJ Peeler: I think a good example that we see with this frequently is people in organizations like to make up special branded terms to indicate something that there is a normal word for. Like people want to make up a special campaign for something that is like balloons, they want to call it something like festival animal art, and nope, it does not help anyone find that content. People are looking for information about balloon parties, call it balloon parties.
Jen Boland: Now we’re talking about how we’re going to actually use this data to write our content. And first, we’re going to talk a little bit about some of the tools that we use that can help you actually get to this data. And I already hinted at Google Search Console. Please, if you haven’t already installed Google Search Console on your website, look up domain authentication; it’s actually better than doing it the old property way. Just as a quick FYI, if you’re taking any notes here today. But Google Search Console allows us to see every single page and every single keyword that our pages surface for in Google search results. And because Google has somewhere between 95 and 97% of search market share, this is really the authoritative data on search. The first is keyword impressions, or even just how often that page had an impression. So we see both of those values within Search Console. That’s just how many times that page was surfaced. And then what is your position?
If you’re trying to rank for your brand, you’re hopefully in position number one for your brand name. If you’re trying to rank for something related to the cause that you support, hopefully you’re in the top, if not number one, for those types of queries. So that’s just your search position, how well you rank for the search term. And then the click-through rate. Google also tells you how many clicks that page or that keyword got, and then it computes a click-through rate of how often that was clicked. And usually your click-through rate is dependent on how well your page title and your meta description match the search intent that the user was trying to fulfill.
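To make the relationship between impressions, clicks, position, and click-through rate concrete, here is a small Python sketch. The rows are invented sample data shaped like a Search Console export, and the cutoffs used to call a snippet “weak” (top-five position but under 5% click-through) are assumptions for illustration only.

```python
# Hypothetical rows shaped like a Search Console performance export.
ROWS = [
    {"query": "apta membership", "clicks": 120, "impressions": 400, "position": 1.2},
    {"query": "physical therapy discounts", "clicks": 5, "impressions": 500, "position": 3.8},
    {"query": "pt continuing education", "clicks": 2, "impressions": 900, "position": 42.0},
]


def click_through_rate(clicks: int, impressions: int) -> float:
    """Clicks divided by impressions, or 0.0 when there were no impressions."""
    return clicks / impressions if impressions else 0.0


def weak_snippets(rows, max_position=5.0, max_ctr=0.05):
    """Queries that already rank well but are rarely clicked.

    These are candidates for a better page title and meta description.
    Both thresholds are illustrative assumptions.
    """
    return [
        row["query"]
        for row in rows
        if row["position"] <= max_position
        and click_through_rate(row["clicks"], row["impressions"]) < max_ctr
    ]


print(weak_snippets(ROWS))
```

The third query is left out on purpose: at position 42 the page is not being seen, so the title and meta description are not the first thing to fix there.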
When you search Google, you see the results. I don’t know about you, a lot of times I click on the first result, but sometimes I’ll click on the second, the third, the fifth, if there’s something in that meta description that really resonates with me or possibly it’s just a website that I trust more or whatever. So it’s just making sure that I have a strong title and meta description once I rank for the content. But if you don’t have the content to begin with, the title and the meta description don’t really matter. If you’re on the 80th page of Google, it doesn’t really matter that you have a great title and meta description because no one’s going to see it.
So to speak, if a tree falls in a forest, does it really fall? And then finally, it’s using survey and qualitative data to understand why people are searching. It could be as simple as just asking your mom. But ideally maybe you’re actually targeting your audience and you’re doing a survey and you’re understanding what is positive, what’s missing from a page.
TJ Peeler: And a lot of times that survey data exists someplace else in your organization. Sometimes we’re working in one silo of IT or digital communications and there’s another department in your organization like membership or something else that’s already got this research.
It doesn’t have to be digitally specific research. It just needs to be research on why people donate, become members, follow your content.
One of the things we’re starting to do more now with web builds is optimize key pages that are very important for conversion or that are top-traffic pages. So that’s what this whole thing is about: how do we use data to optimize that content?
This is APTA’s old site. This is their membership benefits page. We used a lot of different data to look at optimizing this page. We looked at where people were coming from, where people were going. We looked at keyword data. We looked at qualitative research that APTA had from their brand work, to look at how we were going to reorganize this page.
I just want to show you what it looked like to begin with. And then let’s talk a little bit about some of the research that we found. As I mentioned before, APTA actually came to us with this amazing research foundation that most people don’t come in with. They had a McKinley survey of their members and potential members because they were undergoing this big brand redesign at the same time.
One of the things we saw in this research was that the discounts were very important to members in terms of deciding to renew. And that makes sense when you think about Jordan’s persona, she’s joined in grad school, she’s graduated with a huge amount of student loans and membership is expensive.
So it’s really a cost-benefit analysis for her: is this going to provide enough benefit to make it worth it when I need to be conserving all my dollars? The other thing that we saw, and this seems like an obvious takeaway, but I think it’s a good one to keep in mind, is the difference between members and non-members. Members are in the left column and non-members are in the right column. There was a pretty big difference between whether they thought there was a value in being a member or not. That makes sense. But I think the takeaway there is that we need to clearly explain the value of membership to non-members and remind members of the value of their membership all over the place.
Then we did some keyword research. As Jen was saying, we got this from Google Search Console, and we wanted to see what are people searching for when they come to the membership benefits page? I grayed out some things just to look at some trends on this. The thing that popped out to us is that discounts was a big thing, and not just APTA discounts, but specific discounts. Discounts on Brooks, discounts on Asics. Specific discounts were very important, and that matches up with what we were seeing in that qualitative survey data too.
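The kind of trend spotting described above, where discount-related searches stand out, can be roughed out in a few lines of Python. The queries, impression counts, and theme keywords below are hypothetical stand-ins for a real Search Console export, and simple substring matching is an assumption that will miss synonyms and misspellings.

```python
from collections import Counter

# Hypothetical (query, impressions) pairs for a single page.
QUERIES = [
    ("apta member discounts", 300),
    ("apta asics discount", 120),
    ("apta brooks discount", 90),
    ("apta membership cost", 200),
    ("apta member benefits", 150),
]

# Hypothetical themes, each with the substrings that count as a match.
THEMES = {
    "discounts": ("discount",),
    "cost": ("cost", "price", "dues"),
    "benefits": ("benefit",),
}


def impressions_by_theme(queries, themes):
    """Sum impressions for every theme whose keywords appear in a query."""
    totals = Counter()
    for query, impressions in queries:
        text = query.lower()
        for theme, needles in themes.items():
            if any(needle in text for needle in needles):
                totals[theme] += impressions
    return totals


THEME_TOTALS = impressions_by_theme(QUERIES, THEMES)
for theme, total in THEME_TOTALS.most_common():
    print(theme, total)
```

Sorting themes by impressions gives a quick, rough picture of what people expect from the page, which can then be compared against survey findings.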
We like to take all that data. And then we use something that is called the core content model. And I want to give a big shout out to Carrie Hane because she introduced us to this model. And then we made a few adjustments and adapted it, but she’s been pushing this for years and we’re jumping on her bandwagon.
This is a way of looking at optimizing the content on a specific page. One of the things it does is give us a framework for putting in all this data and the keyword research. If that page exists on your website now, we want to look at where people are coming from and where people are going on the site.
And then we want to look at trying to outline a holistic answer that meets the search intent and answers the user’s questions. So as Jen was explaining before, I think the big change that we’re seeing on the internet is that these longer form holistic answers that are scannable are very successful for people.
People like them and Google ranks them highly because people like them. A lot of what we’re doing is pulling information from the six different pages on fruit. We want to pull all that information back together on one holistic page. So this is a method for helping do that.
The other thing that this helps us do, is content-first design. Content-first design is when you, sounds obvious or self-explanatory, but you do the content first and then you design around the content. So that tends to be a very successful way of building websites. Otherwise you end up with here are your options and you just have to fit these square pegs in round holes sometimes.
We did the core content method on the membership benefits page. One of the things we wanted to do was push up the discount content. So on the old APTA site, the discount information was at the very bottom of the page. We wanted to push that up to the top of the page and we wanted to use hyperlinks. Hyperlinks are a great way of calling attention to scannable content. People scan the hyperlinks. So you want to make sure that you have good hyperlinks, don’t ever use “click here”. You want to use what the terms are that people are searching for. And again, try not to use your branded term for it. In this example, we’ve got hyperlinks on the discount on Asics, the discount on Brooks, very specific discounts because that’s what we saw people were searching for in the data.
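As a small how-to for the hyperlink advice above, this Python sketch scans a page’s HTML for links whose text is generic rather than descriptive. It uses the standard-library `html.parser`. The sample HTML is invented, and apart from “click here,” the list of generic phrases is an assumption added for the example.

```python
from html.parser import HTMLParser

# "click here" comes from the talk. The other phrases are assumed examples.
GENERIC_LINK_TEXT = {"click here", "here", "read more", "learn more", "more"}


class LinkTextCollector(HTMLParser):
    """Collect the visible text of every <a> element in a document."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self._parts = []
        self.link_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._parts = []

    def handle_data(self, data):
        if self._in_link:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Join the pieces and collapse runs of whitespace.
            self.link_texts.append(" ".join("".join(self._parts).split()))
            self._in_link = False


def generic_links(html: str):
    """Return link texts that say nothing about where the link goes."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    return [text for text in collector.link_texts
            if text.lower() in GENERIC_LINK_TEXT]


SAMPLE = (
    '<p>Members save on shoes. '
    '<a href="/discounts/asics">Discount on Asics</a> and '
    '<a href="/discounts">click here</a> for more.</p>'
)
print(generic_links(SAMPLE))
```

Any link this flags is a candidate for rewriting with the words people are actually searching for.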
From this, we’re building the wireframe around this. So the wireframe is not the design, it’s the black and white layout of the content. It’s more for saying this content goes here, this is a button, this is a square, that kind of stuff. This is the order of the content on the page.
And this tends to be much better if you’re working with real content than if you’re making up lorem ipsum or you’re working with your old content. Sometimes you have to work with your old content; we can’t do this for every page on your website. It would just take far too long, but it is really good, if you guys are thinking about doing a redesign, to start thinking about your content before you get into the meat of the redesign process. Or you can do this without a redesign too; you don’t even need to have this as a redesign. You can do this on your own, because I’m sure everyone has so much extra time, but this is very valuable time to spend.
In that survey data, we saw that non-members were not seeing a value in joining APTA. So we wanted to bang people over the head with the value of being a member. We wanted to have promotions or talk about it all throughout the site.
Not just on the membership benefits page. Don’t wait until people are at that point where they’re probably considering joining or renewing. But take it out of that and put it all throughout the user experience on the website. So anytime a member is taking advantage of a benefit, we want to remind them, “Hey, you get to do that because you are a member.” Don’t forget to remind them that they should be interested in renewing next year.
This is another example of a page that we rewrote because it was a very important page for APTA. APTA has a program where they offer certifications. So one of the things we were talking about with Jordan was that she was interested in specializing to further her career in geriatrics.
APTA offers, I think, 12 to 14 specializations in, like, sports, oncology, women’s health, geriatrics, that kind of stuff. And this is a way of physical therapists being able to specialize in different parts of their career. So this is their old homepage for the site. It’s still the homepage for the site.
It’s not great, but I want to show you where we started from. And then going back again to that McKinley survey data, one of the things we saw was that continuing education or enhancing your career was a big value for people in deciding to join an organization. So we supplemented this with some keyword research.
Jen actually did this on Moz because we didn’t have access to Google Search Console for this site. You can pull stuff from different tools. One of the things we can see is that these first two results, “results physical therapy” and “specialized physical therapy,” are probably not qualified traffic. Results is the name of a physical therapy franchise (not a hundred percent sure it’s a franchise), and “specialized physical therapy” is probably people looking for specialists. You want to look at this with a critical eye. Don’t just absorb this and say, “We should have ‘results’ be the top word in this.”
Jen Boland: This is actually what search intent is: looking at this data, thinking about it, and trying to understand what someone who’s searching for this is actually looking for. This highlights really well the concept of using search intent to make sure that you are trying to surface your content for the right keywords, not just any keyword.
TJ Peeler: When Jen’s talking about the qualified versus unqualified traffic, I often think about this as who is your audience? So our audience is not the general public looking for physical therapists, our audience is physical therapists who are looking to further their career. And so we’re trying to determine from the search words, who is searching for what.
One of the things we saw here was that specialties and specialists were being searched for more. And the word that we were using on the old site on the homepage a lot was specialization, and specialization was not showing up in any of these keyword searches for the page. So, that’s a smaller change in terms of, as Jen was saying, trying to use the keywords that people are searching for. Google’s pretty smart at being able to interpret the differences between a couple of those.
So, this is a little bit of a smaller example here. But we did just want to go ahead and try to use specialist and specialties over specialization.
Jen Boland: And I think that actually comes down less to trying to rank for the right word in Google, and more for making sure that your content is scannable and that when the user actually arrives at the website, that the page resonates with them, because it uses the words that they’re looking for when they’re looking for information about that topic.
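As a rough illustration of the specialist versus specialization comparison, the sketch below adds up search volume behind each word variant in a keyword export. The keyword rows and volumes are invented placeholders, not the Moz data shown in the talk, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical keyword export: (query, monthly search volume).
# These rows are placeholders, not the figures shown in the talk.
keywords = [
    ("physical therapy specialist certification", 320),
    ("pt specialties", 210),
    ("board certified clinical specialist", 170),
    ("physical therapy specialties list", 140),
]

# Word variants to compare against each other.
variants = ("specialist", "specialties", "specialization")


def volume_by_variant(rows, terms):
    """Sum search volume for every query that contains a term as a whole word."""
    totals = Counter({term: 0 for term in terms})
    for query, volume in rows:
        words = set(query.lower().split())
        for term in terms:
            if term in words:
                totals[term] += volume
    return totals


totals = volume_by_variant(keywords, variants)
for term, volume in totals.most_common():
    print(f"{term}: {volume}")
```

A variant that ends up with a total of zero is a candidate for swapping out on the page, subject to the same critical read of intent described above.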
TJ Peeler: Again, we did this core content outline because the homepage is pretty critical for APTA. Not only is the specialization very important for PTs to further their career, and that’s a big reason for people being a member, but it’s a good business program for APTA, too. So in the beginning we talked about trying to keep microsites together, unless there was a very distinct need for them to be separate. This one, because of the way medical board conventions work, this one had to be a separate website and had to have a distinct board. It has to be seen as a different entity.
And it is a financial generator, so it was important for business reasons and important for their audience reasons, for their members. So we went through this core content model where we talk about what people are trying to do on this page. Who’s coming, how are they getting here? Where are they going next? And then we start pulling together this outline of the who, what, why, where, when, how, and cost. I always put cost in parentheses there. This is a screenshot of the outline that we put together.
But one of the things I wanted to point out with this is that you can see that this came from six different webpages. So again, going back to what Jen was talking about with the fruit example of having six different pages being stitched into one comprehensive page.
We want to pull that information back together, and this is something we do on almost every website; this is not unusual. Again, 10 or 15 years ago it was all about putting out 300 to 500 words, two to three times a week, and that was your digital strategy. And now things have changed.
People’s behaviors have changed. You can see that in Google’s search results. And we’re trying to stitch those pages back together to form a holistic page that is scannable for people to use. So I did the outline of the content and then we use that to draft the content. So one of the things I want to point out here is that just because we do the outline and who, what, where, when, why, how, those are not the headlines on your page. That’s just helping us form a holistic answer. The headlines on your page can be completely different. The headline is how to get a specialist certification. Do you want to become one? Do you want to maintain one? It can be completely different from the outline.
The outline helps you bring all that information together from those different pages. And then you start drafting the content. I’m not saying you have to have six different headlines. And sometimes they don’t even apply: the “who” is often PTs and that’s it.
So you don’t have to include that in your content outline. And then again, we were doing content-first design with this. So starting to outline that content, drafting the content. And then we ended up with this wireframe, the black and white layout of where the content is going to go on the page.
This is just watching how this changes from the outline, to the content, to the wireframe of the page.
All right. Back over to Jen.
Jen Boland: All right. So now we’re going to talk a little bit about using our data to audit our content. And so we’re going to talk a little bit first about some of the tools that we use to get various pieces of data.
TJ Peeler: Let me just mention here that when we’re doing content audits, the content inventory is pulling together all the information, and the audit is looking at the data and making decisions on it. We usually do two levels of audit. We do a high-level audit to just figure out the lay of the land and what’s getting traffic and what isn’t; it’s super essential to know what the top 20 pages on your website are that people are coming to. And then we do this very detailed dive into what’s working, what’s not working, and what should be cut later.
Jen Boland: One of the tools that we use to actually collect a lot of the data about the page is Screaming Frog. Screaming Frog is known as an SEO scraper in the SEO world. What it does is allow us to actually find the URL of every single page you have on the website, and then basically extract any piece of information on that page that we want to extract. By default it extracts things like the title, the meta description, the word count. Is it a 404? Is it a redirect? Is it a good page? It brings in your H1s. So it really brings in a lot of data by default, but then we can also set it up to bring in more content.
Like for example, we sometimes scrape the entire content off of the page, which we did for APTA. And we’ll talk a little bit more about how we use that later on. All of you probably have Google Analytics installed on your website. So making sure that we know what the pageviews are of various pages, that we know if that pageview is associated with a goal completion, that we know if that page has a lot of entrances, which means it gets new people to the website. So just getting that key data from Google Analytics matched up with all of your pages. And then finally, the last thing that we think is a really important metric to look at is the backlinks. And we use either Moz or Semrush.
There are other tools out there. Those just happened to be the two that we typically use. And this helps us get a list of all the pages that have backlinks. And backlinks are really important because not only do backlinks help Google know what are the best pages, because for example, if the New York Times is linking to you, they’re going to say, Oh, the New York Times, that site has a lot of authority and we trust it.
And therefore, if they’re linking to you, this specific page must be important. And so knowing what pages have backlinks helps you understand what pages on your website other people think are important. It’s just another check that we can add to our list of items in our audit to make sure that we are identifying the key content on your website.
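To make the "matched up with all of your pages" step concrete, here is a minimal sketch that joins a crawl export, an analytics export, and a backlink export on URL. The column names and sample rows are assumptions for illustration; real Screaming Frog, Google Analytics, and Moz or Semrush exports use their own headers.

```python
import csv
import io

# Hypothetical exports. Column names and values are placeholders;
# real tool exports will have different headers.
crawl_csv = """url,title,word_count,status
/a,Page A,850,200
/b,Page B,120,200
/c,Page C,0,404
"""
analytics_csv = """url,pageviews,entrances
/a,1200,900
/b,40,5
"""
backlinks_csv = """url,backlinks
/b,3
"""


def read_by_url(text):
    """Parse CSV text into a dict keyed by the 'url' column."""
    return {row["url"]: row for row in csv.DictReader(io.StringIO(text))}


def build_inventory(crawl, analytics, backlinks):
    """Start from the crawl (every known page) and attach the other metrics."""
    inventory = []
    for url, page in crawl.items():
        stats = analytics.get(url, {})
        links = backlinks.get(url, {})
        inventory.append({
            "url": url,
            "title": page["title"],
            "word_count": int(page["word_count"]),
            "status": int(page["status"]),
            # Pages missing from an export are treated as zero.
            "pageviews": int(stats.get("pageviews", 0)),
            "entrances": int(stats.get("entrances", 0)),
            "backlinks": int(links.get("backlinks", 0)),
        })
    return inventory


inventory = build_inventory(
    read_by_url(crawl_csv), read_by_url(analytics_csv), read_by_url(backlinks_csv)
)
for row in inventory:
    print(row)
```

Starting from the crawl rather than from analytics keeps pages with no recorded traffic in the inventory, which is the content an audit most needs to see.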
When we ran this, here is what we found. This is looking at the different quartiles of pageviews. And so we found that the 75% quartile, so it’s confusing, but it’s the top 25% of pages, had an average of 691 pageviews a year. The middle quartile had 68 pageviews a year.
And then the bottom 25% of the content had an average of 14 pageviews. And this was in a given year. So clearly by looking at this, we could say this bottom 25% is probably an opportunity to be cut. And we can go through a little bit more about where we set our thresholds and how we move through this data.
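A distribution check like this one can be sketched in a few lines. The version below ranks pages by yearly pageviews and averages the top quarter, the middle half, and the bottom quarter. Treating the "middle" group as the middle 50% is an assumption, and the sample numbers are placeholders rather than APTA's data.

```python
from statistics import mean


def bucket_averages(pageviews):
    """Average yearly pageviews for the top 25%, middle 50%, and bottom 25% of pages."""
    if len(pageviews) < 4:
        raise ValueError("need at least four pages to form quartile buckets")
    ranked = sorted(pageviews, reverse=True)
    quarter = len(ranked) // 4
    top = ranked[:quarter]
    middle = ranked[quarter:len(ranked) - quarter]
    bottom = ranked[len(ranked) - quarter:]
    return {
        "top 25%": mean(top),
        "middle 50%": mean(middle),
        "bottom 25%": mean(bottom),
    }


# Placeholder yearly pageviews for eight pages.
sample = [900, 500, 120, 80, 60, 40, 20, 10]
averages = bucket_averages(sample)
for bucket, value in averages.items():
    print(f"{bucket}: {value}")
```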
TJ Peeler: But this is not unusual to see. When we run a content inventory this happens when the site grows organically for years, that there’s all this old leftover content that is out of date and nobody is looking at on the website. So that’s one of the reasons that doing a content audit is amazing and a great side project to do in all your spare time.
It’s a great thing to do. It doesn’t have to be tied to a redesign at all. Cleaning out your content makes your content more findable and better.
Jen Boland: Here are some other pieces of data that we wanted to pull in as part of our content audit. So one, we wanted to know if this page had been updated in the last three years. And while we weren’t able to get those data in general, we were actually able to scrape it off from the front end of the website.
Basically it’s like using a CSS selector and then just saying, Hey, Screaming Frog, anywhere you see this CSS selector, return back this piece of information, which got us the data. Then we wanted to look at pageviews, and I will be honest with you. We actually came up with this 500 pageviews a year before we ran the distribution analysis, but it did end up being interesting how close the 691 number and the 500 number turned out to be. We just made a decision that anything with more than 500 pageviews a year, we were automatically going to keep; those pages were on our keep list. And then on our first pass, we decided that anything that had even just one backlink needed to at least be manually reviewed.
And so it made our initial pass as a page to be kept, because backlinks are important and there typically is a reason why someone’s linking to that content. And we want to make sure that we preserve that content in one way, shape, or form.
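For readers without Screaming Frog, the idea of pulling a last-updated date off the front end by selector can be approximated with Python's standard library. This is a stand-in sketch, not Screaming Frog's own mechanism: it matches on a single class name rather than a full CSS selector, and the `last-updated` class and sample markup are hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text inside the first element carrying a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0       # greater than 0 while inside the target element
        self.done = False    # True once the target element has closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        else:
            classes = (dict(attrs).get("class") or "").split()
            if self.target_class in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text(html, target_class):
    """Return the whitespace-normalized text of the matching element, or None."""
    parser = ClassTextExtractor(target_class)
    parser.feed(html)
    text = " ".join("".join(parser.parts).split())
    return text or None


# Hypothetical page markup.
page = (
    "<article><h1>Title</h1>"
    '<p class="meta last-updated">Last updated: <time>March 3, 2017</time></p>'
    "</article>"
)
print(extract_text(page, "last-updated"))
```

Run over every crawled page, the returned string can be parsed into a date and added as a column in the inventory.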
TJ Peeler: And so one of the things we did was we talked to APTA to come up with what were going to be the business rules for the first pass. And then we just put in a formula in Excel and reviewed 12,000 pages by copying and pasting that formula to say In or Out, which was a really nice way of quickly getting a starting threshold for if you keep these pages based on this criteria, this is how many pages you’re keeping.
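The same first-pass logic that was pasted down a spreadsheet column can be written as a small function. The 500-pageview threshold and the one-backlink rule come from the talk; how the three-year update rule was combined with them is not stated, so treating a recent update as a reason to keep is an assumption here, as are the function and label names.

```python
from datetime import date

PAGEVIEW_THRESHOLD = 500  # from the talk: more than 500 pageviews a year is an automatic keep
MAX_AGE_DAYS = 3 * 365    # from the talk: "updated in the last three years"


def first_pass(pageviews, backlinks, last_updated, today):
    """Return a first-pass In/Out decision for one page."""
    if pageviews > PAGEVIEW_THRESHOLD:
        return "In"
    if backlinks >= 1:
        # Any backlink keeps the page for now, pending a manual look.
        return "In (manual review)"
    # Assumption: a recently updated page also survives the first pass.
    if last_updated is not None and (today - last_updated).days <= MAX_AGE_DAYS:
        return "In"
    return "Out"


today = date(2020, 5, 20)
print(first_pass(691, 0, None, today))             # high traffic
print(first_pass(14, 1, None, today))              # low traffic, one backlink
print(first_pass(14, 0, date(2019, 1, 1), today))  # low traffic, recently updated
print(first_pass(14, 0, date(2015, 1, 1), today))  # low traffic, stale, no links
```

Because the rules live in one place, changing a threshold and re-running shows immediately how many pages the new criteria would keep.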
Jen Boland: After we applied these rules, we identified 36% of the content that we would keep, and found that we were probably fairly safe actually deleting 64% of the content on the site. So if you think about it, a site shrinking by roughly two thirds really helps with findability on a website,
if you don’t have to sort through so much more data. Also, it sends signals to Google that, Hey, we don’t have a lot of this content bloat. A lot of times, if you have multiple pages that rank for the same keyword, Google doesn’t know that one page might be more up to date or have the latest information. And therefore those two pages can fight against each other in the search results, causing both pages to actually be ranked lower in the search algorithm.
TJ Peeler: It confuses users when they see two pages on a very similar topic. Which one should they be looking at?
Jen Boland: And if the date is really tiny at the bottom of the page, they’re not even seeing that until they go to the very bottom of the page. It’s about making sure that we’re getting them to the right, most up-to-date content so that they can make the right business decisions or treatment decisions based on the latest evidence.
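One way to spot pages competing for the same keyword is to group ranking data by keyword and list any keyword with more than one URL. The sketch below assumes a simple (keyword, URL) export from a rank-tracking or search-performance tool; the rows and paths are hypothetical.

```python
from collections import defaultdict

# Hypothetical ranking export: (keyword, URL that ranks for it).
rankings = [
    ("specialist certification", "/specialist-certification"),
    ("specialist certification", "/news/2012/certification-update"),
    ("pt residency", "/residency"),
]


def competing_pages(rows):
    """Return keywords that more than one page ranks for, with those pages."""
    pages_by_keyword = defaultdict(set)
    for keyword, url in rows:
        pages_by_keyword[keyword.lower()].add(url)
    return {
        keyword: sorted(urls)
        for keyword, urls in pages_by_keyword.items()
        if len(urls) > 1
    }


conflicts = competing_pages(rankings)
for keyword, urls in conflicts.items():
    print(keyword, "->", ", ".join(urls))
```

Each keyword it reports is a prompt for a human decision: merge the pages, redirect the older one, or differentiate them.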
One of the really great things is that after applying this analysis, we were actually able to look at the pageviews associated with that 64% of content. And I don’t like to call it a one-to-one ratio because we’re going to use some 301 redirects and it’s not a perfect assessment, but we estimated that even with them deleting 64% of their content that they would only lose somewhere between two and 10% of their traffic. And that was definitely considered an acceptable loss for them. Having their content on brand, having this pruned content, having fewer pages to manage was just a win. And unless maybe you’re an ad-driven website where a page view is how you make all your money, an acceptable traffic loss of two to 10% was okay for their business.
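The traffic estimate described here amounts to comparing the share of pages marked for removal with the share of pageviews those pages receive. A minimal sketch, using invented numbers and a hypothetical `decision` field:

```python
def traffic_at_risk(pages):
    """Return (share of pages cut, share of pageviews on those pages)."""
    cut = [p for p in pages if p["decision"] == "Out"]
    total_views = sum(p["pageviews"] for p in pages)
    cut_views = sum(p["pageviews"] for p in cut)
    share_pages = len(cut) / len(pages) if pages else 0.0
    share_views = cut_views / total_views if total_views else 0.0
    return share_pages, share_views


# Placeholder inventory rows, not APTA's data.
pages = [
    {"url": "/a", "pageviews": 900, "decision": "In"},
    {"url": "/b", "pageviews": 80, "decision": "In"},
    {"url": "/c", "pageviews": 12, "decision": "Out"},
    {"url": "/d", "pageviews": 5, "decision": "Out"},
    {"url": "/e", "pageviews": 3, "decision": "Out"},
]

share_pages, share_views = traffic_at_risk(pages)
print(f"pages cut: {share_pages:.0%}, pageviews on cut pages: {share_views:.0%}")
```

The pageview share is an upper bound on the loss, since 301 redirects send some of those visits on to the pages that remain.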
TJ Peeler: So we are a little bit running behind. So I’m going to skip over this content inventory slide. I think I can summarize it really quickly to say this is a ton of data to look through. And Jen and I are working on ways to make it not look so overwhelming and horrifying to people.
And then we also have just a little bit on essential fields that we always want to put in content inventories. This is our recommendation for what we would always put in content inventories. Last updated date, word count is really helpful for finding content that is trivial, too short.
There’s always going to be some pages that are trivial. Like your privacy policy, depending on your organization, your privacy policy might be way too long, it’s sometimes very short and that’s okay. It’s still a very important page. The analytics data and the Moz data that Jen was talking about.
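Word count can be turned into a quick flag for thin pages, with an exemption list for pages that are short but still important. The threshold of 150 words and the exempt paths below are assumptions chosen for illustration, not values from the talk.

```python
# Hypothetical threshold and exemption list; adjust both per site.
THIN_WORD_COUNT = 150
EXEMPT_URLS = {"/privacy-policy", "/contact"}


def flag_thin_pages(pages, threshold=THIN_WORD_COUNT, exempt=EXEMPT_URLS):
    """Return URLs whose word count is below the threshold and not exempt."""
    return [
        p["url"]
        for p in pages
        if p["word_count"] < threshold and p["url"] not in exempt
    ]


# Placeholder inventory rows with the essential fields named above.
inventory = [
    {"url": "/privacy-policy", "word_count": 90, "last_updated": "2019-06-01",
     "pageviews": 40, "backlinks": 0},
    {"url": "/news/old-announcement", "word_count": 60, "last_updated": "2012-02-14",
     "pageviews": 3, "backlinks": 0},
    {"url": "/specialist-certification", "word_count": 1200, "last_updated": "2020-01-10",
     "pageviews": 900, "backlinks": 12},
]

thin = flag_thin_pages(inventory)
print(thin)
```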
Now we’re talking about actually using data to organize the content. I think we’re going to try to blow through this pretty quickly.
Jen Boland: I’m going to say that you’re going to be able to read this yourself and we’ll actually send you the video or the slide deck after so you can read this, but basically we use a whole bunch of tools to get this really important information about a page that we’re then able to use to help organize the content.
TJ Peeler: So one of the things we did with APTA is, we pulled their breadcrumbs so that we could figure out their current navigation. And then we looked at evaluating their navigation based on how many pageviews and number of pages there.
This is very common to see. Websites used to be organized by content type. We are moving to organize them by topic structure, because that’s better for user experience and that’s how people think. But you could see under news and publications, they have most of the content is under this news and publications content type structure.
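Evaluating navigation from breadcrumbs can be sketched as a roll-up: take the first crumb after Home as the section, then count pages and total pageviews per section. The breadcrumb format, separator, and sample rows are assumptions.

```python
from collections import defaultdict

# Hypothetical scraped breadcrumbs with yearly pageviews.
pages = [
    ("Home > News & Publications > 2014 Press Release", 12),
    ("Home > News & Publications > Magazine Article", 300),
    ("Home > Careers & Education > Specialist Certification", 900),
    ("Home", 5000),
]


def rollup_by_section(rows, separator=">"):
    """Count pages and sum pageviews per top-level navigation section."""
    totals = defaultdict(lambda: {"pages": 0, "pageviews": 0})
    for breadcrumb, pageviews in rows:
        crumbs = [c.strip() for c in breadcrumb.split(separator)]
        # The crumb right after "Home" is treated as the section.
        section = crumbs[1] if len(crumbs) > 1 else "(home)"
        totals[section]["pages"] += 1
        totals[section]["pageviews"] += pageviews
    return dict(totals)


sections = rollup_by_section(pages)
for name, stats in sections.items():
    print(f"{name}: {stats['pages']} pages, {stats['pageviews']} pageviews")
```

A section with many pages but few pageviews is the pattern described above for content grouped by type rather than by topic.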
We wanted to break that out. We were incorporating some microsites into the main site, so we broke out a couple of different sections because we would be growing the amount of content underneath that section. This is a screenshot of the old navigation and the new navigation.
They weren’t drastically different. They weren’t exactly tweaks, but they weren’t a crazy departure from their navigation, which is a good thing because people don’t like change. So don’t introduce change for change sake.
We also did some testing with this navigation to ask people about top tasks and see if they could complete them based on the new structure. The good thing we found was that basically all of our proposed new navigations were better than the current navigation, which is the red bar on this slide.
We learned a couple of things that made us finesse and adjust a few things, but in general it was validating that we were in the right direction.
Jen Boland: And then also we mentioned how we were moving to this topically based content organization versus the content-type-based organization. So one of the things that we had to do was apply our taxonomy to a lot of these news and article pages that did not have that taxonomy applied to them. So I was able to use some natural language processing tools within Python to actually understand what the page was about and find the various tags within the page, and then present a spreadsheet to the client that they were then able to upload into the CMS with all the appropriate tags for the piece of content, so that they did not have to manage the tagging of these thousands of pages of content manually. Also, just as a side note, we were able to do short descriptions using a summarization technique. The summaries weren’t all perfect. This is not the be-all and end-all; Python’s never going to write better than a user. But they did not have the time to write these short descriptions, and so we were able to just use the summarizer to write descriptions for a couple of thousand pages that they were then able to upload. So it was just like one less thing that they had to….
TJ Peeler: They had their subject matter experts review the short descriptions and then upload them. I think reviewing a draft is a lot easier than trying to write something from scratch.
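The talk does not say which natural language processing tools or summarizer were used, so the sketch below is a deliberately simple stand-in: it suggests tags by counting taxonomy terms in the page text and drafts a short description by picking the sentence with the highest average word frequency. The taxonomy, trigger terms, and thresholds are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical taxonomy: tag name -> terms that suggest the tag.
taxonomy = {
    "Specialist Certification": ["specialist", "certification", "board-certified"],
    "Payment": ["medicare", "reimbursement", "billing"],
}


def suggest_tags(text, tags, min_hits=2):
    """Suggest every tag whose trigger terms appear at least min_hits times."""
    counts = Counter(re.findall(r"[a-z][a-z\-]*", text.lower()))
    return [
        tag for tag, terms in tags.items()
        if sum(counts[term] for term in terms) >= min_hits
    ]


def summarize(text, max_sentences=1):
    """Pick the sentence(s) whose words are most frequent in the whole text."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 3)

    def score(sentence):
        tokens = [w for w in re.findall(r"[a-z]+", sentence.lower()) if len(w) > 3]
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    best = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Keep the chosen sentences in their original order.
    return " ".join(s for s in sentences if s in best)


page_text = (
    "Board-certified specialist certification recognizes advanced clinical knowledge. "
    "Physical therapists apply for specialist certification after meeting practice "
    "requirements. Fees vary."
)

tags = suggest_tags(page_text, taxonomy)
summary = summarize(page_text)
print(tags)
print(summary)
```

As in the talk, output like this is a draft for subject matter experts to review before upload, not a replacement for their judgment.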
You can organize content in your navigation using your taxonomy and also doing the layout on the page. So we use this information about what content people were looking for on the page to decide how we would reorganize the content.
So again, we want to prioritize the content that people are looking for the most on the page and put that higher up on the page.
So that was breezing through how to organize your content in five minutes or less. We’ve got to the end of the talk and we just wanted to leave you with a couple of key takeaways in our last two minutes.
Jen Boland: Just remember that you can use this keyword research as user research to understand how people are using and consuming your page and how they’re finding it. Use the keyword research or Google Analytics data and the qualitative data you have to help you write the page, and also to design it, to make sure that you’re creating a page that resonates with your users. And then also use formulas and scripts; an example is taking the first pass at the content audit of which content should be kept and deleted, which is really important just because doing that manually would have been too time consuming. And then finally, use qualitative data, where you can, to test your assumptions and iterate from there. I’d love to say we always get it right on the first try, but sometimes we don’t.
Thank you for joining us for this speed session on content and data and how to improve your content. I am so sorry that we are completely out of time for questions, but if you guys have questions, please reach out to us on email. We’re happy to email, talk about your questions, or hop on the phone.
Thank you. Thanks.