Ever since Google announced its decision to hide search referral keywords for organic traffic, digital marketers from around the world have been coming up with ways to tackle the (not provided) search data in Google Analytics.

Many of these methods aim to make sense of the (not provided) data, but none of them is designed to estimate which keywords are actually contained in the (not provided) data population.

One method that comes close uses a custom filter to determine which landing pages attract the (not provided) traffic. Another uses a custom filter that combines the organic search keyword position with the landing page.

A reasonable solution is to estimate the unknown value indirectly from what is already known. Here is a scientific method that demonstrates the idea. Imagine you have two tubes of protein solution: one with a known concentration and one with an unknown concentration. You also have a chemical that changes the color of the solution: the more protein it contains, the darker the color becomes, and if no protein is present, it stays clear.

Since you have a tube with a known concentration, you can create a standard curve by varying the amount of protein present in the solution and recording how dark each sample becomes. When you plot it on a graph, it will look like the following.


[Figure: standard curve for estimating the unknown protein concentration]


Then you can take a certain amount of the protein solution with the unknown concentration and see how dark it gets. Knowing the darkness of the unknown solution, you can read its concentration off the standard curve.
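To make the standard-curve idea concrete, here is a minimal Python sketch. The function name and the readings are hypothetical and are not taken from any real experiment. It also assumes that the darkness readings rise steadily with concentration, so that straight-line interpolation between neighbouring points is a fair approximation.

```python
def estimate_concentration(standard_curve, darkness):
    """Estimate a concentration from a darkness reading by linear interpolation.

    standard_curve: list of (concentration, darkness) pairs measured from the
    tube with the known concentration. Darkness values are assumed to be
    strictly increasing with concentration (an assumption of this sketch).
    """
    # Sort the calibration points by darkness so neighbours bracket a reading.
    points = sorted(standard_curve, key=lambda p: p[1])
    for (c0, d0), (c1, d1) in zip(points, points[1:]):
        if d0 <= darkness <= d1:
            fraction = (darkness - d0) / (d1 - d0)
            return c0 + fraction * (c1 - c0)
    raise ValueError("darkness is outside the range of the standard curve")


# Hypothetical calibration readings: (concentration, darkness)
curve = [(0.0, 0.0), (0.5, 0.25), (1.0, 0.5), (2.0, 1.0)]

# Darkness measured for the tube with the unknown concentration (made up)
unknown = estimate_concentration(curve, 0.75)
print(unknown)
```

A reading that falls outside the calibrated range raises an error rather than being extrapolated, since the curve says nothing about values it was never measured for.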

Now back to our search traffic.

Our known factors:

– A specific landing page

– The % of traffic brought in by each of the known keywords

Assuming the known-keyword traffic and the (not provided) traffic share the same conditions (the same distribution of users' browsers, geographic locations, languages, and so on), we can estimate how much of the (not provided) traffic to this landing page was brought in by each of the known keywords.

For example, let's say two keywords (keyword #1 and keyword #2) are bringing users to landing page A. Keyword #1 brings in 20% of the known organic search traffic and keyword #2 brings in 80%.

Next, we determine the amount of (not provided) traffic to landing page A (let's say it is 100 visits). Out of these 100 visits, 20 visits (20%) were likely brought in by keyword #1 and 80 visits (80%) were likely brought in by keyword #2.
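The same arithmetic can be sketched in a few lines of Python. The function name and the input structure are hypothetical. The sketch assumes you have already exported, for one landing page, the visit counts of each known keyword and the number of (not provided) visits. It also relies on the assumption stated above, that the (not provided) traffic follows the same keyword distribution as the known traffic.

```python
def allocate_not_provided(known_keyword_visits, not_provided_visits):
    """Split (not provided) visits across known keywords in proportion
    to each keyword's share of the known organic traffic.

    known_keyword_visits: dict mapping keyword -> visits for one landing page.
    not_provided_visits: number of (not provided) visits to the same page.
    """
    total_known = sum(known_keyword_visits.values())
    if total_known == 0:
        raise ValueError("no known keyword traffic to base the estimate on")
    return {
        keyword: not_provided_visits * visits / total_known
        for keyword, visits in known_keyword_visits.items()
    }


# Numbers from the example: keyword #1 has 20% and keyword #2 has 80%
# of the known organic traffic to landing page A.
known = {"keyword #1": 20, "keyword #2": 80}
estimate = allocate_not_provided(known, 100)
print(estimate)
```

Only the ratio between the known keyword counts matters, so the inputs can be raw visit counts or percentages.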

This estimation method is not perfect and relies on several assumptions, but it at least gives a quantifiable number and removes some of the uncertainty around the (not provided) data.


