Talk:Observer bias

Statistics Low‑importance

	This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Low	This article has been rated as Low-importance on the importance scale.

Science Low‑importance

	This article is within the scope of WikiProject Science, a collaborative effort to improve the coverage of Science on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
Low	This article has been rated as Low-importance on the project's importance scale.

Examples of cognitive bias - wrong article?

I have removed the following content, as it seems to be relevant not to this article but to cognitive bias. -- Piotrus at Hanyang | reply here 07:21, 16 March 2024 (UTC)

Examples of cognitive biases include:

Anchoring – a cognitive bias that causes humans to place too much reliance on the initial pieces of information they are provided with for a topic. This causes a skew in judgement and prevents humans and observers from updating their plans and predictions as appropriate.
Bandwagon effect – the tendency for people to "jump on the bandwagon" with certain behaviours and attitudes, meaning that they adopt particular ways of doing things based on what others are doing.
Bias blind spot – the tendency for people to recognize the impact of bias on others and their judgements, while simultaneously failing to acknowledge and recognize the impact that their own biases have on their own judgement.
Confirmation bias – the tendency for people to look for, interpret, and recall information in such a way that their preconceived beliefs and values are affirmed.
Guilt and innocence by association bias – the tendency for people to hold an assumption that individuals within a group share similar characteristics and behaviours, including those that would hail them as innocent or guilty.
Halo effect – the tendency for the positive impressions and beliefs in one area around a person, brand, company, product or the like to influence an observer's opinions or feelings in other unrelated areas.
Framing effect – the tendency for people to form conclusions and opinions based on whether the pertinent information is provided to them with positive or negative connotations.
Recency effect – the tendency for more recent pieces of information, ideas, or arguments to be remembered more clearly than those that preceded.

Interesting example

Hi!

I found the following example of statistical bias, which I find interesting mainly because of its unexpected outcome. As a teacher of mathematics and physics, I would like to use it in class and also add it to the examples section of this article. This is how I would present it here:

Observer bias may also stem from cultural characteristics of the observer. Suppose that the number of tourists per month was counted in a village. The outcome, including a linear regression of the yearly number of tourists, is presented in the following graph:

The villagers noticed a decline in the number of tourists over the years. In 2013 they organised three concerts by a local music group in the hope of increasing the number of tourists. They concluded that the concerts had an effect: the number of tourists in the concert months was rather high, and the total for 2013 was higher than in the years directly before and lies outside the 95% interval of the linear model. The R² value is also quite high in this case (0.81), which indicates a rather strong correlation.

Now suppose that the outcome were as follows:

Here we have exactly the same data, except that they are shifted by four months. In this case we cannot statistically reject the null hypothesis that the concerts had no effect on the total number of tourists for the year. The differences could be due to statistical fluctuation. There is, however, a value in 2012 that lies slightly below the 95% limit and might need an explanation. The R² value is slightly higher (0.91).

In the first case, the year 2013 contains two abrupt "highs" within less than one year. Of course the months in which the group played have the highest numbers of tourists, but this is equally true in the second case. The only difference is the timing of these unstable maximal values ("highs"). Because the "highs" are unstable and abrupt, we should, where possible, use intervals that include only one "high" in order to draw valid conclusions. The conclusion is therefore that, with these data, we cannot reject the null hypothesis in either case; that is, we cannot statistically contradict the hypothesis that the group had no effect on the total number of tourists. In the second case this is obvious. In the first case we have a statistical bias, because two abrupt "highs" fall into one 12-month period (the year 2013), although it is possible to choose 12-month periods with just one "high" each.

The most probable conclusion is rather that tourists who would have come anyway preferred to come in the concert months. Thus a cultural characteristic (here: the definition of where a year begins and ends) can affect the outcome of a statistical analysis.
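The interval check described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: it takes "the 95% interval" to mean the 95% prediction interval of a straight-line fit that includes the year being tested, it uses the rounded yearly totals from the two tables further down, and the helper name outside_pi is made up for this sketch.

```r
# Yearly totals (in thousands), copied from the "total" columns of the two tables.
case1 <- data.frame(
  year  = 1994:2014,
  total = c(10.51, 10.71, 10.24, 10.04, 10.11, 10.08, 10.01, 10.07, 9.53,
            9.66, 9.62, 9.66, 9.31, 9.32, 9.26, 9.25, 9.13, 8.92, 8.63,
            9.64, 8.76)
)
case2 <- data.frame(
  year  = 1995:2014,
  total = c(10.57, 10.41, 10.07, 10.22, 10.01, 10.05, 9.94, 9.78, 9.60,
            9.61, 9.66, 9.53, 9.21, 9.32, 9.15, 9.22, 9.15, 8.63, 9.18,
            9.04)
)

# Hypothetical helper (not part of the original code): fit a straight line
# and return the years whose total lies outside the prediction interval.
outside_pi <- function(d, level = 0.95) {
  fit <- lm(total ~ year, data = d)
  pi  <- predict(fit, newdata = d, interval = "prediction", level = level)
  d$year[d$total < pi[, "lwr"] | d$total > pi[, "upr"]]
}

print(outside_pi(case1))  # years flagged with calendar-year totals
print(outside_pi(case2))  # years flagged with the shifted 12-month totals
```

Comparing the two printed vectors shows which years each way of cutting the 12-month periods would flag as unusual.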

My questions are:

Is the example (and its conclusions) correct?
If so, do you find it interesting enough to use here, in this article about observer bias (or perhaps in your class, if you teach)?

Here are the data sets for the first and the second case, in case you want to test the outcomes.

year	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec	total (x1000)
1994	934	895	889	838	861	822	822	855	864	943	948	836	10.51
1995	1010	915	847	791	847	853	808	903	861	984	1040	850	10.71
1996	928	861	811	772	833	811	805	856	825	947	969	819	10.24
1997	875	808	830	780	797	802	780	836	827	921	910	871	10.04
1998	1038	882	827	794	799	788	766	794	790	890	928	812	10.11
1999	906	818	818	790	818	807	796	834	834	889	927	845	10.08
2000	960	856	795	773	790	768	790	823	817	904	909	822	10.01
2001	909	855	806	768	800	768	773	806	805	908	1016	854	10.07
2002	864	794	794	735	746	735	724	805	792	846	915	782	9.53
2003	862	803	782	733	776	749	760	798	791	913	919	770	9.66
2004	866	797	775	712	765	743	754	807	759	865	891	886	9.62
2005	923	812	785	717	754	743	733	791	764	884	921	832	9.66
2006	895	827	754	717	727	727	717	764	752	824	840	767	9.31
2007	798	747	767	705	767	747	726	772	744	826	888	831	9.32
2008	888	795	754	703	729	698	718	744	746	807	883	792	9.26
2009	827	766	726	680	736	726	695	761	755	845	930	805	9.25
2010	825	775	750	685	695	700	690	760	741	815	870	825	9.13
2011	934	771	687	657	736	687	687	736	694	782	816	733	8.92
2012	762	718	699	650	679	689	670	738	721	765	808	731	8.63
2013	828	1021	862	721	687	673	653	707	790	983	983	727	9.64
2014	751	703	698	636	674	679	689	727	734	821	854	792	8.76

year	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec	total (x1000)
1995	864	943	948	836	1010	915	847	791	847	853	808	903	10.57
1996	861	984	1040	850	928	861	811	772	833	811	805	856	10.41
1997	825	947	969	819	875	808	830	780	797	802	780	836	10.07
1998	827	921	910	871	1038	882	827	794	799	788	766	794	10.22
1999	790	890	928	812	906	818	818	790	818	807	796	834	10.01
2000	834	889	927	845	960	856	795	773	790	768	790	823	10.05
2001	817	904	909	822	909	855	806	768	800	768	773	806	9.94
2002	805	908	1016	854	864	794	794	735	746	735	724	805	9.78
2003	792	846	915	782	862	803	782	733	776	749	760	798	9.60
2004	791	913	919	770	866	797	775	712	765	743	754	807	9.61
2005	759	865	891	886	923	812	785	717	754	743	733	791	9.66
2006	764	884	921	832	895	827	754	717	727	727	717	764	9.53
2007	752	824	840	767	798	747	767	705	767	747	726	772	9.21
2008	744	826	888	831	888	795	754	703	729	698	718	744	9.32
2009	746	807	883	792	827	766	726	680	736	726	695	761	9.15
2010	755	845	930	805	825	775	750	685	695	700	690	760	9.22
2011	741	815	870	825	934	771	687	657	736	687	687	736	9.15
2012	694	782	816	733	762	718	699	650	679	689	670	738	8.63
2013	721	765	808	731	828	1021	862	721	687	673	653	707	9.18
2014	790	983	983	727	751	703	698	636	674	679	689	727	9.04
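The relationship between the two tables can be checked directly: the rows of the second table appear to be the same monthly series cut into 12-month periods that start in September instead of January. The following R sketch recomputes the totals around 2013 from the monthly counts of the first table; the helper name window_total is made up for this illustration.

```r
# Monthly counts for 2012-2014, copied from the first table (Jan..Dec per row).
monthly <- c(
  762,  718, 699, 650, 679, 689, 670, 738, 721, 765, 808, 731,  # 2012
  828, 1021, 862, 721, 687, 673, 653, 707, 790, 983, 983, 727,  # 2013
  751,  703, 698, 636, 674, 679, 689, 727, 734, 821, 854, 792   # 2014
)

# Hypothetical helper: sum of 12 consecutive months starting at index `start`
# (index 1 is January 2012).
window_total <- function(x, start) sum(x[start:(start + 11)])

calendar_2013 <- window_total(monthly, 13)  # Jan 2013 to Dec 2013
sep12_aug13   <- window_total(monthly, 9)   # Sep 2012 to Aug 2013
sep13_aug14   <- window_total(monthly, 21)  # Sep 2013 to Aug 2014

print(c(calendar_2013, sep12_aug13, sep13_aug14) / 1000)
```

The calendar-year window puts the February peak and the October/November peaks into one period, while the September-based windows place them in two different periods.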

R was used for the linear regression diagrams (based on the first and last columns of the table), and OpenOffice for presenting the data in diagrams. The R code follows:


# Requires the readxl package. The workbook holds one of the tables above,
# with the last column header shortened to "total".
library(readxl)
rs1  <- read_excel("Documents/rs1.xlsx")
rs1f <- data.frame(rs1)

# Straight-line trend of the yearly totals
modelA <- lm(total ~ year, data = rs1)
a <- length(rs1$year)
# Evaluate at the years the model was fitted on; using 1..a here would
# extrapolate far outside the data if the year column holds calendar years.
yearValues <- rs1$year
Apredict   <- predict(modelA, data.frame(year = yearValues))
Apredictf  <- data.frame(Apredict)

# 95% confidence band of the fit, approximated by a quadratic in year
ConfInSwed <- data.frame(predict(modelA, interval = "confidence"))
ConfInSwed$year  <- rs1$year
ConfInSwed$year2 <- ConfInSwed$year^2
bandYears   <- data.frame(year = yearValues, year2 = yearValues^2)
modelUpCI   <- lm(upr ~ year + year2, data = ConfInSwed)
LineUpCI    <- predict(modelUpCI, bandYears)
modelDownCI <- lm(lwr ~ year + year2, data = ConfInSwed)
LineDownCI  <- predict(modelDownCI, bandYears)

# Percentage deviation of each total from the fitted line
rs1f$pred <- Apredictf$Apredict
rs1f <- transform(rs1f, PercPred = 100 * (total - pred) / pred)
rs1f$downCI <- LineDownCI
rs1f$upCI   <- LineUpCI
# Percentage deviation from the upper bound (PercDownCI) and from the
# lower bound (PercUpCI) of the confidence band
rs1f <- transform(rs1f, PercDownCI = 100 * (total - upCI) / upCI)
rs1f <- transform(rs1f, PercUpCI = 100 * (total - downCI) / downCI)

# 95% prediction band, approximated the same way
PredInSwed <- data.frame(predict(modelA, interval = "prediction"))
PredInSwed$year  <- rs1$year
PredInSwed$year2 <- PredInSwed$year^2
modelUpPI   <- lm(upr ~ year + year2, data = PredInSwed)
LineUpPI    <- predict(modelUpPI, bandYears)
modelDownPI <- lm(lwr ~ year + year2, data = PredInSwed)
LineDownPI  <- predict(modelDownPI, bandYears)
rs1f$downPI <- LineDownPI
rs1f$upPI   <- LineUpPI
# Percentage deviation from the upper bound (PercDownPI) and from the
# lower bound (PercUpPI) of the prediction band
rs1f <- transform(rs1f, PercDownPI = 100 * (total - upPI) / upPI)
rs1f <- transform(rs1f, PercUpPI = 100 * (total - downPI) / downPI)
# Difference between the widths of the prediction and confidence bands
rs1f <- transform(rs1f, DifPIMCI = (upPI - downPI) - (upCI - downCI))

# Label the x axis with calendar years ending in 2014
yearList  <- seq(2015 - a, 2014, 1)
rs1$year  <- yearList
rs1f$year <- yearList

# Data points, fitted line, confidence band (dashed), prediction band (dotted)
plot(rs1$year, rs1$total, xlab = "Year", ylab = "Tourists")
lines(yearList, Apredict, col = 2, lwd = 2)
lines(yearList, LineUpCI, col = 2, lwd = 3, lty = 2)
lines(yearList, LineDownCI, col = 2, lwd = 3, lty = 2)
lines(yearList, LineUpPI, col = 2, lwd = 2, lty = 3)
lines(yearList, LineDownPI, col = 2, lwd = 2, lty = 3)

write.table(rs1f, col.names = NA)
summary(modelA)

Before running the R code, save the tables one after another as rs1.xlsx in your Documents folder, with "(x1000)" removed from the header of the last column.
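As an alternative to editing the header by hand, the tab-separated table text could be read directly and the column renamed in R. The sketch below is only one possible way to do it and is not part of the code above; it uses the first two rows of the first table as sample input.

```r
# First two rows of the first table, tab-separated as posted above
# (shortened here for illustration).
txt <- paste(
  "year\tJan\tFeb\tMar\tApr\tMay\tJun\tJul\tAug\tSep\tOct\tNov\tDec\ttotal (x1000)",
  "1994\t934\t895\t889\t838\t861\t822\t822\t855\t864\t943\t948\t836\t10.51",
  "1995\t1010\t915\t847\t791\t847\t853\t808\t903\t861\t984\t1040\t850\t10.71",
  sep = "\n"
)

# check.names = FALSE keeps the header "total (x1000)" unchanged on import
rs1 <- read.delim(text = txt, check.names = FALSE)

# Drop the " (x1000)" suffix so that the column can be used as `total`
names(rs1) <- sub(" \\(x1000\\)$", "", names(rs1))

print(names(rs1))
```

The resulting data frame has the `year` and `total` columns that the regression code expects.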

Thanks in advance for your advice! Yomomo (talk) 11:32, 16 July 2025 (UTC)

Wikipedia is not for 'self research'. It should only include such an analysis if it was published in a trusted source (e.g., a peer-reviewed journal). Tal Galili (talk) 13:46, 12 January 2026 (UTC)

Hello Tal! Thanks for the answer and for your time. I am a teacher and would like to know whether the example is right. You already wrote to me in an e-mail that it might be right. It is, however, not "self research". It is just an example that uses already existing knowledge, like the one in Cross-multiplication#Double rule of three or like most of the diagrams in linear regression. As far as I know, and as the aforementioned examples show, such examples exist in many mathematics articles. The most important thing for me, though, is whether it is correct. If you have some time, please look at it once more and write to me with some more certainty :-) whether it is right. If you still think this is "new knowledge", and this is also the opinion of others in the wiki statistics group, then it can be removed. But still, if it is correct and you are a professor teaching statistics, it might also be an interesting example for you to use... Greetings! Yomomo (talk) 20:50, 12 January 2026 (UTC)