What Google knows - Part 2


Tell us your URL and we'll tell you what Google sees.

In Part 1 of this article series, we took a closer look at what Google knows about a website - but what its owner or operator doesn't. Quite a gap, because how can you improve if the information is missing? By taking advantage of data-driven analysis.

In Part 2, we'll present the analysis options in a bit more detail - this time through Google's technical glasses. Let's go.

Site Survey and Site Technical 

These are two technical products, both of which help you get an overview of your website.

The algorithm behind Site Survey evaluates the general characteristics of a web presence (a small code sketch follows the list). These include:

  • The measurement of the meta tags

  • The collection and measurement of the site's images

  • The links to other pages

  • The navigation depth

  • The top 10 categories

  • etc.
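What such a collection can look like in code: the following Python sketch gathers meta tags, images and links from a single page. It's only an illustration of the kind of data involved - the actual Site Survey algorithm goes well beyond this - and it assumes the third-party packages requests and beautifulsoup4.

```python
# A rough illustration of the kind of per-page data Site Survey collects.
# The real algorithm is proprietary; this only shows the idea.
import requests
from bs4 import BeautifulSoup

def survey_page(url: str) -> dict:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return {
        "title": soup.title.string if soup.title else None,
        "meta": {m.get("name"): m.get("content")
                 for m in soup.find_all("meta") if m.get("name")},
        "images": [img.get("src") for img in soup.find_all("img")],
        "links": [a.get("href") for a in soup.find_all("a", href=True)],
    }

print(survey_page("https://example.com"))  # placeholder URL
```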

The algorithm behind Site Technical examines a website's technical characteristics in detail (again, a small sketch follows the list), such as:

  • Latency and loading time

  • Page size and scope

  • The number of available web pages

  • Intact pages and those that lead to an error message

  • etc.
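As a rough illustration of these checks, here is a minimal Python sketch that records latency, page size and the HTTP status (intact vs. 404) for a handful of URLs. The URLs are placeholders, and the real Site Technical evaluation is more comprehensive.

```python
# Minimal Site Technical-style checks: latency, page size, intact vs. broken.
import requests

def check_page(url: str) -> dict:
    resp = requests.get(url, timeout=10)
    return {
        "url": url,
        "status": resp.status_code,            # 200 = intact, 404 = broken
        "latency_s": resp.elapsed.total_seconds(),
        "size_kb": len(resp.content) / 1024,
    }

for url in ["https://example.com/", "https://example.com/missing"]:
    print(check_page(url))
```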

What are the benefits? 

For example, if you have a new site that loads quickly at launch, you can have it checked periodically to see whether that speed is maintained. Especially when content keeps being added to a site, its volume grows - and with it, potentially, the loading time.

Websites that have grown over the years, for example, may have pages that lead to a 404 error message - not a good sign from Google's point of view, because a web presence should consist of intact pages; if it doesn't, its rating drops.

Examining the meta tags (title & description), for example, shows how Google sorts your website thematically - and whatever isn't named there can't be sorted by Google, and accordingly can't be ranked either.

In addition, there is further important information that can be collected about your own site. And an algorithm-based evaluation has another benefit for every interested party: it is fast and doesn't take months.


Site Flow - the flow of information

This analysis product is also based on an algorithm. The structure and architecture of a website are evaluated and recorded in the form of a so-called Sankey diagram. The algorithm calculates how frequently each level occurs and derives the flow of information from it. One such level can be a landing page, for example.

An example: a company offers various services. Ideally, these services should be visible to a website visitor at the latest in step 2 after entering the site - whether or not this is the case can be read from the Sankey.

A Sankey diagram is a method of visualizing quantity flows: each individual flow is drawn in proportion to its volume.
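For illustration: with the plotly library, such a diagram can be sketched in a few lines. The levels and visitor counts below are made up; they only show how flows are encoded as source, target and value.

```python
# A minimal Sankey sketch with plotly: hypothetical click flows from the
# home page into deeper levels, each drawn in proportion to its volume.
import plotly.graph_objects as go

labels = ["Home", "Services", "Blog", "Contact", "Service A", "Service B"]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(
        source=[0, 0, 0, 1, 1],       # indices into labels: flow origin
        target=[1, 2, 3, 4, 5],       # flow destination
        value=[120, 60, 30, 80, 40],  # hypothetical visitor counts
    ),
))
fig.show()
```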

Why is this important? People visit a website with different intentions - and for a company, each visitor should ideally be encouraged, while browsing the site, to buy something, request information material or contact the company. This is directly related to the structure of the page: if information cannot be found quickly, the probability that the page is abandoned is very high. Ergo: no desired action. The reason: the structure and architecture of the information.

Note: the sheer amount of flowing information is not the same as the information that can, or should, lead to an action.

Site Keyword - Content 

Keywords are an important part of every site and still not to be neglected in terms of SEO. Here, various aspects are analyzed - all based on algorithms. The textual content of every page of a website is evaluated and sorted into a relevance scheme. Sources include the titles & descriptions, the extracted image & link descriptions, and the text lines of each individual page.

A piece of text is classified as relevant if its frequency of occurrence is significantly lower than the total number of pages. In other words, a keyword should be used repeatedly, but each time in a different context - five pages of the same content won't do. For the relevance calculation, a gradient method well known and widely used in text analysis is applied.
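The article doesn't spell out the exact calculation, but inverse document frequency (IDF), a standard weighting in text analysis, captures the same idea and serves as a minimal illustration: a keyword that appears on far fewer pages than the site has overall receives a high weight, while one repeated everywhere scores near zero.

```python
# Relevance idea illustrated with IDF (not necessarily the article's method):
# rare-across-pages keywords score high, ubiquitous ones score near zero.
import math

def idf(keyword: str, pages: list[str]) -> float:
    n = len(pages)                                       # total pages
    df = sum(keyword in page.lower() for page in pages)  # pages containing it
    return math.log(n / df) if df else 0.0

pages = ["web design services", "seo audit services", "contact us"]
print(idf("seo", pages))       # rare across pages -> higher relevance
print(idf("services", pages))  # frequent -> lower relevance
```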

The selected and analyzed keywords can also be related to one another. This makes it possible to evaluate whether content builds relevant information networks around the respective keywords. Keywords should not stand alone, but always appear together - because only then does a text become appealing to a reader.

For this form of analysis, we use various methods, such as computational linguistics, data mining methods for word groups, and also a content similarity approach.
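As a small example of the content-similarity part (one of several possible approaches): pairwise cosine similarity over TF-IDF vectors, here via scikit-learn, shows which pages cover related keyword contexts. The page texts are invented for the demo.

```python
# Content similarity between pages via TF-IDF vectors + cosine similarity.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

pages = [
    "web design and responsive layout services",
    "seo audit and keyword analysis services",
    "our team, history and office dog",
]
vectors = TfidfVectorizer().fit_transform(pages)
print(cosine_similarity(vectors).round(2))  # 3x3 page-similarity matrix
```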

Site Social - The Relationship of the Website to Social Media

If a company has accounts on social networks such as Instagram, Facebook or YouTube in addition to its website, content from the website is typically shared on those profiles. The algorithm extracts the social media links from each page and evaluates their relevance. Again, something is considered relevant if its frequency is significantly lower than the total number of web pages.
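A rough sketch of the extraction step - the relevance scoring itself is part of the product - could look like this in Python, using beautifulsoup4 and a hypothetical list of social hosts:

```python
# Extract social media links from a page's HTML (extraction step only).
from bs4 import BeautifulSoup

SOCIAL_HOSTS = ("instagram.com", "facebook.com", "youtube.com")

def social_links(html: str) -> list[str]:
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)
            if any(host in a["href"] for host in SOCIAL_HOSTS)]

html = '<a href="https://www.instagram.com/flanke7">IG</a> <a href="/blog">Blog</a>'
print(social_links(html))  # ['https://www.instagram.com/flanke7']
```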

Google checks all web pages again and again - not by hand or by employees, of course; the pages are crawled automatically. From a website's relationship to social media, Google can tell whether the company has a content marketing and/or social media strategy, and this analysis also shows Meta's (formerly Facebook's) view of the website. These classifications have a significant influence on the page's evaluation and ranking.

Site Performance - How fast is fast 

Speed is critical - but how fast is fast, and which parts of my site are how fast? With a performance audit, we can extract this information from the site and identify the weak points.

Here you can find out more about the performance of the site, following on from Site Technical: 

  • Speed Index: This measures how quickly a site's content is visible to the user. 

  • First Contentful Paint: This measures the time when content is first displayed to the user.

  • Largest Contentful Paint: This measures the time when the largest content is displayed. This corresponds to the time until the main content is visible to the user. 

  • Time to Interactive: This measures how long it takes for a page to become fully interactive. 

  • Total Blocking Time: This sums, for every loading segment between First Contentful Paint and Time to Interactive that runs longer than 50 milliseconds, the portion beyond that threshold (see the sketch after this list).

  • Cumulative Layout Shift: This registers the movement of page elements during loading, such as a banner appearing or the page "jumping".
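To make the Total Blocking Time definition concrete, here is a toy calculation: for every main-thread task between First Contentful Paint and Time to Interactive that runs longer than 50 milliseconds, only the portion beyond the threshold counts as blocking. All timings below are invented.

```python
# Toy Total Blocking Time calculation with invented task timings.
FCP_MS, TTI_MS, THRESHOLD_MS = 1000, 5000, 50

# (start_ms, duration_ms) of main-thread tasks; hypothetical values
tasks = [(1200, 180), (2500, 40), (3100, 95)]

tbt = sum(duration - THRESHOLD_MS
          for start, duration in tasks
          if FCP_MS <= start <= TTI_MS and duration > THRESHOLD_MS)
print(f"Total Blocking Time: {tbt} ms")  # (180-50) + (95-50) = 175 ms
```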

These performance factors are important for two reasons: Google's rating, and the user's experience on the page.

Google rates fast, well-performing pages better, and users stay on pages longer if the experience isn't disturbed by, for example, long loading times. The latter also increases the chance that the user takes an action.


Site Readability - How do we actually write 

The algorithm takes an inventory of the readability of selected texts and URLs. Which ones are analyzed can be decided individually.

Readability is a metric that determines how easily visitors can read and, above all, understand the text of a website. In addition to the selection of font size, spacing and colors, the writing style in particular is very important. 

Visitors to a website often don't have much time and aren't willing to concentrate hard. Information must be found quickly and, above all, understood quickly. If this rule is ignored, other marketing measures have very little chance of success.

The method was developed by the Austrian Rudolf Flesch: readability is calculated with the help of the Flesch Index, a formula based on the number of words, the number of syllables and other language-related parameters.

The Flesch Index scores a text with a value between 0 and 100 and assigns that value to a difficulty level. Values between 0 and 20 are classified as scientific articles; very simple texts with values between 80 and 100 have the difficulty level of an advertising slogan.
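As a minimal sketch, the classic English-language Flesch Reading Ease formula can be computed in a few lines. Two caveats: German texts use the adapted Amstad variant with different constants, and the syllable counter below is a crude vowel-group heuristic that only serves to show the principle.

```python
# Classic English Flesch Reading Ease: 206.835 - 1.015*(words/sentences)
#                                              - 84.6*(syllables/words)
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count groups of consecutive vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / sentences)
            - 84.6 * (syllables / len(words)))

text = "Readability depends on sentence length and syllable count."
print(round(flesch_reading_ease(text), 1))  # ~18.9: the "difficult" band
```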

Why is this important? Anyone who knows their target audience knows what they want to hear or read. And even where specialized articles are needed, they should be placed in a separate area of the page, because readers need time for them - when visiting a page, it's all about consuming information as quickly and easily as possible.

So: how, and in what way, are my texts read and perceived? That's what Site Readability tells me.

Summary 

Spending the day in someone else's shoes - an experience that holds great fascination for many. And what would it be like to see your own site through Google's eyes? Absolutely exciting, isn't it? Technically feasible, quickly implemented, valuable for decision-making.

With the help of these analysis products, we can quickly provide a lot of information and answers - and if further questions arise, further analyses are conceivable and possible.


If you have any questions or suggestions, please feel free to contact us.
