By Matida Jallow
The Center for Policy Research and Strategic Studies (CepRass) has commissioned a political Opinion Poll (OP) to track public perceptions ahead of the much-discussed upcoming election. One portion of the poll surveys the voting intentions of the electorate. The findings, which many Gambians doubt, revealed that the largest share of the electorate (about 40%) is still undecided, while 29% reported that they will cast their marble for the NPP, followed by 13% for UDP, 5% for APRC, and 4% each for GDC and PDOIS. The intention to vote for CA is 1%, and for GMC and NRP it is less than 1%.
A close review of the methodology section of the survey reveals that the methodology is not only lacking in detail but also unclear on many important issues related to sampling. This and other considerations related to the sampling method threaten the internal and external validity of the research and the reliability of the results.
In research methodology, internal validity reflects the extent to which a given study makes it possible to eliminate alternative explanations for the results. It depends in part on whether the participants are chosen at random or in a manner that makes them representative of the population from which they are drawn. Many factors influence the internal validity of a study, including the sampling method adopted by the researchers and events that occur over time, such as a change in the political landscape that influences how study participants feel and respond.
External validity, on the other hand, refers to how generalizable the findings are: do they apply to other people, settings, situations, and time periods? This is determined by many factors, including sample features, where some feature of the particular sample is wholly or partially responsible for the effect and so limits the generalizability of the findings, and selection bias, which describes differences between groups in a study that may relate to the independent variable.
Given the above-mentioned considerations, the manner in which the study sample was selected suggests selection bias, which poses a serious threat to the internal and external validity of the results.
According to CepRass, the study adopted the recent Integrated Household Survey (IHS) with telephone numbers as the frame, which comprises 14,191 households from 8 local government areas (LGAs) within 48 districts in The Gambia. It added that at each stratum (LGA), proportional sampling was used to select a representative sample of districts. In the final stage, it continues, a random number of households were selected from each district. Thus, 969 respondents were selected from 34 districts for the opinion poll, and the team of 12 enumerators was provided airtime on the various cellular networks (Africel, Qcell, and Gamcel).
The above description of the sampling method is not only unsatisfactory, but it also raises a wider range of questions about missing information on the selected sample.
First: why was the Integrated Household Survey (IHS) chosen among the available methods? The IHS is one of the two major household surveys, alongside the Multiple Indicator Cluster Survey (MICS), that are regularly conducted by the Government of The Gambia through GBoS. Readers would therefore want to know the reasons for preferring this particular method over others. The IHS comprised only 14,191 households. This raises the question of what criteria were adopted to choose those 14,191 households across the country, and what proportion of the total number of households in the country they make up, so that one can determine the extent to which they genuinely represent the entire population.
Furthermore, IHS data improve the availability of information on gender, specific population groups, age cohorts, and other socio-economic characteristics of the population, such as educational attainment and occupation. While this information offers invaluable insight into the voting attitudes and political affiliations of groups in society, it was missing from the report, leaving readers wondering about the socio-economic background of the sample, which can influence one’s political orientation.
Second: the report indicates that telephone numbers of households were used as the frame, which comprises 14,191 households from 8 local government areas within 48 districts in The Gambia. As a result, 969 respondents from 34 districts were selected for the opinion poll, and the team of 12 enumerators was provided airtime on the various cellular networks (Africel, Qcell, and Gamcel). However, the study did not provide details of the 34 districts that were selected out of the 48, or of the villages and towns from which the 969 households were selected. Given that the popularity of political parties and their support bases are linked not only to local government areas but also to districts, villages, and towns, the absence of this information reduces the trustworthiness and reliability of the result as an indication of who could win the next election.
Additionally, LGAs and districts in The Gambia are heterogeneous, and their populations vary hugely. For example, Brikama alone has more than 250,000 voters, compared to Banjul, which has only 22,000. The biggest question, then, is the efficiency of the sample drawn from each LGA and district. The efficiency of a sample of a given size is a function of only one thing: the degree of heterogeneity, or variance, of the population from which it is drawn. This was not provided in the report either.
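The link between heterogeneity and precision can be illustrated with a minimal sketch. It assumes simple random sampling, which is not the multistage design the poll describes, and uses the standard formula for the standard error of a proportion; the function name is hypothetical. For a fixed sample of 969, the more evenly split (more heterogeneous) the population, the wider the margin of error.

```python
import math

def proportion_standard_error(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)

n = 969  # number of respondents reported in the poll

# p * (1 - p) is the variance of a yes/no answer; it peaks at p = 0.5,
# i.e. when the population is most heterogeneous on that question.
for p in (0.05, 0.29, 0.50):
    se = proportion_standard_error(p, n)
    print(f"share={p:.2f}  variance={p * (1 - p):.4f}  "
          f"95% margin of error={1.96 * se:.4f}")
```

Under these assumptions, the margin of error for a 50% share is roughly three percentage points, and it would be larger for any single LGA or district, where far fewer than 969 respondents were interviewed.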
Related to the efficiency of the sample from each LGA and district is whether the probability of selecting units in each LGA and district was proportional to its size. Probability proportional to size (PPS) is a sampling procedure under which the probability of a unit being selected is proportional to its size, giving larger clusters a greater probability of selection and smaller clusters a lower probability. This is not illustrated in the report either.
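As a rough illustration of probability proportional to size, the sketch below uses only the two voter counts quoted above for Brikama and Banjul. The function names, the with-replacement draw, and the fixed seed are assumptions made for the example; nothing here reflects how CepRass actually selected its sample.

```python
import random

def pps_probabilities(sizes):
    """Selection probability of each unit, proportional to its size."""
    total = sum(sizes.values())
    return {unit: size / total for unit, size in sizes.items()}

def pps_draw(sizes, k, seed=0):
    """Draw k units with replacement, with probability proportional to size."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    units = list(sizes)
    weights = [sizes[u] for u in units]
    return rng.choices(units, weights=weights, k=k)

# Approximate voter counts quoted in the article; other LGAs are omitted.
sizes = {"Brikama": 250_000, "Banjul": 22_000}

probs = pps_probabilities(sizes)
draws = pps_draw(sizes, k=1000)
for unit in sizes:
    print(f"{unit}: probability={probs[unit]:.3f}, "
          f"selected {draws.count(unit)} times out of 1000")
```

With these two areas alone, Brikama would be expected to supply about nine in ten selections, which is the kind of allocation detail a reader would need in order to judge the sample.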
Added to this, the report never mentioned the design effect (D). The design effect is a coefficient that reflects how the sampling design affects the computation of significance levels compared to simple random sampling. A design effect of 1.0 means the sampling design is equivalent to simple random sampling. A design effect greater than 1.0 means the sampling design reduces the precision of estimates compared to simple random sampling (cluster sampling, for instance, reduces precision). A design effect less than 1.0 means the sampling design increases precision compared to simple random sampling (stratified sampling, for instance, increases precision). Again, the poll falls short of providing the information needed to determine the extent to which the sample selected in each LGA and district is representative of the overall population of that LGA or district, upon which the reliability of the results should be judged.
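To make the role of the design effect concrete, here is a minimal sketch of how D changes the effective sample size and the margin of error. The design effect values used below are purely hypothetical, since the report gives none, and the function names are illustrative.

```python
import math

def effective_sample_size(n: int, deff: float) -> float:
    """Size of the simple random sample that would give the same precision."""
    return n / deff

def margin_of_error(p: float, n: int, deff: float = 1.0, z: float = 1.96) -> float:
    """95% margin of error for a proportion, inflated by the design effect."""
    return z * math.sqrt(deff * p * (1.0 - p) / n)

n = 969    # respondents reported in the poll
p = 0.29   # reported share intending to vote for the NPP

# Hypothetical design effects: below 1 (stratification gain),
# exactly 1 (simple random sampling), above 1 (clustering loss).
for deff in (0.8, 1.0, 2.0):
    print(f"D={deff:.1f}  effective n={effective_sample_size(n, deff):.0f}  "
          f"margin of error={margin_of_error(p, n, deff):.4f}")
```

If the true design effect were 2.0, for instance, the 969 interviews would carry only as much information as about 485 interviews from a simple random sample.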
Third: the report indicates that telephone numbers were used as the frame and that the team of 12 enumerators was provided airtime on the various cellular networks (Africel, Qcell, and Gamcel). What is not clear is the basis upon which the mobile numbers of participants were selected. And why were the cellular networks included in the study limited to Africel, Qcell, and Gamcel, while Comium, one of the biggest networks in the country in terms of subscribers, was excluded? The fact that not all Gambians use Qcell, Gamcel, or Africel poses a threat to the external validity of the result, since those with Comium numbers did not have an equal chance of being included in the study. More importantly, the report did not give the percentage of respondents using Africel, Gamcel, or Qcell. Because subscription to a particular mobile network can suggest a person's socio-economic background (for example, subscription to Gamcel compared to Comium), this puts the reliability of the poll's results under threat.
Finally, the report indicates that data collection started on 27th July 2021 and ended on 10th August 2021. This period does not account for the latest political developments, which will have an impact on the respondents' voting intentions. These developments include the controversial alliance between NPP and APRC, the introduction of Essa Faal into the political arena, and the sudden surge in the popularity of UDP, which was demonstrated in the party's recent political rallies. Given these factors, the results of the poll cannot be an indication of the possible winner of the upcoming election.
In conclusion, it is a tedious task to reach out to all 900,000 voters; however, the chosen sample should represent the population for the results to be reliable. Sampling strategies are directly related to internal and external validity. The choices researchers make in selecting sampling frames and sampling participants need to be clearly articulated, which the CepRass report failed to do. This has introduced a variety of biases into the findings that reduce the external validity of the sample. Sampling-related biases have made the results of the poll inconclusive and too unreliable to be an indication of the possible winner of the coming December election. Unless CepRass addresses these biases, the results of its poll can only be valid for those who participated in the survey, which is only 969 households across the country.