Scholarly Publishing vs. Predatory Publishing: Google Scholar

This guide helps authors determine whether a journal is scholarly or predatory, an important distinction for rank and tenure applications.

Google Scholar

Limitations and criticism
Quality – Some searchers consider Google Scholar to be of comparable quality and utility to commercial databases.[24][25] These reviews recognize that its "cited by" feature in particular poses serious competition to Scopus and Web of Science. An early study from 2007, limited to the biomedical field, found citation information in Google Scholar to be "sometimes inadequate, and less often updated".[26] The coverage of Google Scholar may vary by discipline compared to other general databases.[27]

Lack of screening for quality – Google Scholar strives to include as many journals as possible, including predatory journals, which "have polluted the global scientific record with pseudo-science, a record that Google Scholar dutifully and perhaps blindly includes in its central index."[28]

Coverage – Google Scholar does not publish a list of journals crawled or publishers included, and the frequency of its updates is uncertain. Bibliometric evidence suggests that Google Scholar’s coverage of the sciences and social sciences is competitive with other academic databases; however, as of 2017, Scholar’s coverage of the arts and humanities had not been investigated empirically, and its utility for disciplines in these fields remains ambiguous.[29] Especially early on, some publishers did not allow Scholar to crawl their journals. Elsevier journals have been included since mid-2007, when Elsevier began to make most of its ScienceDirect content available to Google Scholar and Google's web search.[30] As of February 2008, the absentees still included the most recent years of the American Chemical Society journals. It is therefore impossible to know how current or exhaustive searches in Google Scholar are, although a recent study[4] estimates that Google Scholar can find almost 90% (approximately 100 million) of all scholarly documents on the Web written in English. Large-scale longitudinal studies have found that between 40% and 60% of scientific articles are available in full text via Google Scholar links.[31]

Matthew effect – Google Scholar puts high weight on citation counts in its ranking algorithm and has therefore been criticized for strengthening the Matthew effect:[22] highly cited papers appear in top positions and so gain more citations, while new papers rarely appear in top positions, get less attention from Google Scholar users, and hence receive fewer citations.
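
The feedback loop described above can be illustrated with a toy model. This is only a sketch under strong assumptions, not Google Scholar's actual ranking algorithm, which is not public: results are ranked purely by citation count, users only see a fixed number of top results, and each visible paper gains exactly one citation per round. The function name, the `visible_slots` parameter, and the paper titles are all hypothetical.

```python
def simulate_ranking_feedback(initial_citations, rounds, visible_slots=2):
    """Toy model: each round, the top `visible_slots` papers gain one citation.

    Assumption: ranking depends on citation count alone, which is a
    simplification made for illustration only.
    """
    counts = dict(initial_citations)
    for _ in range(rounds):
        # Rank by citation count, highest first; break ties by title
        # so that the simulation is deterministic.
        ranked = sorted(counts, key=lambda title: (-counts[title], title))
        for title in ranked[:visible_slots]:
            counts[title] += 1
    return counts


# Hypothetical papers: two established ones and two new ones.
papers = {"old_a": 50, "old_b": 30, "new_c": 2, "new_d": 0}
result = simulate_ranking_feedback(papers, rounds=10)
print(result)
```

In this model the two established papers receive every new citation and the two new papers receive none, so the gap between them widens in each round.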

Google Scholar effect – This is the tendency of some researchers to pick and cite works appearing in the top results on Google Scholar, regardless of their contribution to the citing publication, because they automatically assume these works are credible and believe that editors, reviewers, and readers expect to see these citations.[32]

Incorrect field detection – Google Scholar has problems identifying publications on the arXiv preprint server correctly. Punctuation characters in titles produce wrong search results, and authors are assigned to the wrong papers, which leads to erroneous additional search results. Some search results are even returned for no apparent reason.[33][34]

Vulnerability to spam – Google Scholar is vulnerable to spam.[35][36] Researchers from the University of California, Berkeley and Otto-von-Guericke University Magdeburg demonstrated that citation counts on Google Scholar can be manipulated and that complete nonsense articles created with SCIgen were indexed by Google Scholar.[37] They concluded that citation counts from Google Scholar should be used with care, especially when calculating performance metrics such as the h-index or impact factor. Google Scholar started computing an h-index in 2012 with the advent of individual Scholar pages. Several downstream packages, such as Harzing's Publish or Perish, also use its data.[38] The practicality of manipulating h-index calculators by spoofing Google Scholar was demonstrated in 2010 by Cyril Labbé of Joseph Fourier University, who managed to rank "Ike Antkare" ahead of Albert Einstein by means of a large set of SCIgen-produced documents citing each other (effectively an academic link farm).[39]
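
To see why a set of documents citing each other inflates this metric, recall that an author's h-index is the largest number h such that h of their papers have at least h citations each. The sketch below computes it from a plain list of citation counts. The function name and both data sets are hypothetical and chosen only for illustration; they are not the figures from the experiments cited above, and the code does not reproduce how Google Scholar gathers its counts.

```python
def h_index(citation_counts):
    """Return the largest h such that at least h papers have >= h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical author with five genuine papers.
genuine = [10, 8, 5, 4, 3]

# Hypothetical link farm: 100 generated documents, each cited by the other 99.
link_farm = [99] * 100

print(h_index(genuine))
print(h_index(link_farm))
```

Because the h-index depends only on the counts, any index that accepts the generated documents and their mutual citations will report a high value for the fictitious author.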

Inability to Shepardize case law – As of 2010, Google Scholar was not able to Shepardize case law (that is, trace later decisions citing a case to check whether it is still good law), as Lexis can.

Google Scholar Citations