A little step sideways from small-angle scattering for this week’s post. As you are probably aware by now, I sometimes use the LaN weblog to crystallize ideas into something resembling a coherent story. This needs to be done now, as I am preparing another presentation (due late January), one that covers a topic tangential to my usual repertoire: a little overview of bad tidings in science.
Let me start by quoting the (slightly flowery) talk abstract:
The pressures on young scientists are enormous. After years of education, they emerge to find themselves scrambling for jobs in an unforgiving world. Some scientists survive. Some make it out through perseverance and hard work, some are in the right place at the right time, yet others take less illustrious approaches as they discover the system can be gamed.
The system can be gamed. There are ways to reduce the pressure and follow an easier path towards perceived “success”, and several scientists have been spotted following this road (with unknown numbers still following it, wittingly or unwittingly). Through highlighting such examples in this talk, we can get a glimpse of what risks lie ahead for the pressured and unwary scientist on a slippery slope.
The reasons for this gameability of science lie in the values of our society. We can point out a few easy culprits. We can point to the enormous pressures to succeed coupled with a skewed understanding of success. To the propensity of administrative staff to elevate metrics beyond their application limits. To the push from the public to make complex science palatable, and demand quick, polarized answers to increasingly intricate questions. To the efforts from progressively obsolete for-profit publishers struggling to highlight their relevance by peddling inadequate impact factors. Examples will be shown, unfortunate details will be brought to the surface.
This talk has it all. A warning message, cringe-worthy examples, and real-life experience from a young old fart drawing experience from so many more. Science is in trouble, and it is starting to show. Something is rotten in the state of science.
Not my field of expertise, as I am sure you would agree. This topic is more often covered by the likes of David Colquhoun, Michael Eisen, Stephen Curry, Philip Moriarty, and others. However, I am getting interested in it for a variety of reasons:
- A long-time sufferer of self-doubt and impostor syndrome (hi academia!), I obsessively check for evidence of my impact on science. H-index, journal impact factors, number of citations to my papers, ResearchGate “score”, number of downloads of my papers, Twitter followers, number of grants (none so far), … the list goes on. But the more often one looks at these numbers and compares them with the numbers of others, the more depressed one gets at one’s performance. Occasionally, however, I start to wonder if it is really me being bad at science, or if there is a problem with the numbers themselves.
- Messages on Twitter are reinforcing the idea that it may just be the numbers (a relieving thought). From these sources, it appears there is a significant over-reliance on numbers and numeric evaluation metrics in assigning all kinds of things: funding, job security, prestigious positions, and so on. Such pressure to have the right numbers naturally leads to cheating, or gaming, epitomized by Goodhart’s law (but also Campbell’s law and Lucas’ critique): “When a measure becomes a target, it ceases to be a good measure”.
- The experience and knowledge gained from being involved in the stripy nanoparticle saga also leaves me wondering. How on earth can such a vitiated, poorly substantiated version of science lead to such prolific writing in high-impact journals, prestigious academic positions and plentiful grants? Why has it become so incredibly tough to do critical work, why are we at every step feeling hindered in actively discussing or calling out such poor work?
And then you start seeing it more and more, epitomized by groups “surfing” the grant waves along popular topics, whose output comprises an impressive number of very quick but poorly researched papers.
This behaviour is encouraged by universities and institutes ostensibly having no nobler goal than a higher ranking (in whatever obscure system they rank highest), and basing their hiring and firing policies on that [ex1, ex2, ex3]. This trend towards the quantification of people (typically with a single number) naturally causes worried researchers to try to stay employed by boosting their metric (and can be lethal to those who don’t). Whatever kludges are proposed to improve the metrics, you are still trying to encompass an entire, complex academic with simplistic measures. Hands up, all who have seen quantifications similar to the one used by Imperial College London (source and discussion):
Multiply the impact factor of the journal by the author position weight, and divide by the number of authors. The author position weight is 5 for the first and last author, 3 for the second author, 2 for the third author and 1 for any other position.
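For concreteness, that weighting scheme fits in a few lines of Python. The function name and arguments below are my own invention for illustration; the weights are exactly as quoted:

```python
def icl_paper_score(jif, author_position, n_authors):
    """Sketch of the quoted Imperial College weighting (illustrative only).

    jif: impact factor of the journal the paper appeared in
    author_position: 1-based position of the assessed author
    n_authors: total number of authors on the paper
    """
    # Position weight: 5 for first and last, 3 for second, 2 for third, 1 otherwise.
    if author_position in (1, n_authors):
        weight = 5
    elif author_position == 2:
        weight = 3
    elif author_position == 3:
        weight = 2
    else:
        weight = 1
    return jif * weight / n_authors
```

Note how, for a fixed JIF, the score plummets as collaborators are added: a middle author on a ten-author paper in a JIF-10 journal scores a single point, the same as a solo first author in a JIF-1 journal.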
My institute uses something similar, but assigns double points if you publish in our institute’s own journal. Make of that what you will. None of these, by the by, lets one score points for arguably important scientific outreach through blogs or videos; most eschew book contributions too, and they thereby actively discourage such activities. Most metrics stay necessarily confined to weighting the number of publications, number of citations and grant income in one way or another.
Likewise, journals, trying to stay relevant in light of advances in alternative science dissemination outlets, are pushing terrible metrics (journal impact factors, JIFs) and blaming open access (um, no) for problems that have arisen. Apropos, many problems have arisen due to the increased importance of the metrics they themselves promote. Their push of journal impact factors leads or forces researchers to waste time chasing that metric instead of disseminating for the sake of science.
Just to recap briefly, one problem with the impact factor is that it does not reflect very well that which it is supposed to represent: a very large fraction of papers in a given journal are cited less than its JIF (due to the arithmetic mean being a poor statistical measure for the skewed Bradford distribution it tries to encompass). Secondly, the JIF is used to assess the “quality” of the researchers publishing in those journals (see the example quoted earlier). Lastly, the JIF correlates strongly only with the number of retractions, and poorly predicts how many citations a paper will receive.
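A toy example makes the first point concrete. The citation counts below are made up, but shaped like a typical journal: a couple of highly cited papers drag the arithmetic mean (the “JIF”) far above what a typical paper achieves.

```python
# Made-up citation counts for ten papers in a hypothetical journal.
citations = [0, 0, 1, 1, 2, 2, 3, 4, 20, 67]

jif = sum(citations) / len(citations)             # arithmetic mean: 10.0
typical = sorted(citations)[len(citations) // 2]  # median paper: 2 citations
below_jif = sum(1 for c in citations if c < jif)  # papers cited less than the "JIF"
```

Here eight of the ten papers are cited less than the journal’s “impact factor” of 10, while the typical (median) paper collects just 2 citations.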
Researchers are found sexing up their topics, or twisting facts and figures, just to aim their paper at that higher-IF journal. Researchers not publishing when the research is done, but when it is time for another paper. Publishing duplicate papers to boost the publication record.
Editors and reviewers are found boosting their H-indices or JIFs by asking for their own off-topic papers to be cited (it appears that the H-index is surprisingly well described as 0.54 times the square root of one’s total number of citations). Similar gaming appears through citation rings and through soliciting or accepting ghost authorships (more here).
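That empirical relation (Hirsch’s own rule of thumb) is trivial to write down; the function below merely illustrates the claim, not an endorsement of the metric:

```python
import math

def approx_h_index(total_citations):
    """Empirical approximation: h is roughly 0.54 * sqrt(total citations).

    Any extra citation to any paper nudges the expected h-index upward,
    which is precisely what makes off-topic citation requests attractive.
    """
    return 0.54 * math.sqrt(total_citations)
```

A researcher with 10,000 total citations would thus be expected to sit at an h-index around 54; doubling the citation count only lifts it to about 76, hence the constant temptation for quick citation wins.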
All this leads to an involuntary shift of priorities of the scientist from research first and foremost, to publishing and grant-finding first (and research in their spare time).
So it seems apparent that science has strayed from its original path. To highlight what that was, skip to 5:33 in this video (previously highlighted in this post) and watch what GE had to say about it in the 1950s:
The core message for me from that video is the following:
“one must realize […] that the truly great discoveries of science — the fundamental breakthroughs — have most often resulted from scientific research that had no specific or practical goal. The scientist who is left free to pursue the truths of nature, no matter where his research may lead, and who is free to fail, time and time again, without criticism, may in the end come to a finding. This finding may become valuable immediately, eventually, or not at all.”
Science is the search for the nature of things. And sure, some scientists are more successful at finding things than others may be. But science is the search, not the result. It is the process of exploring old and new ideas as rigorously and critically as necessary, and then to document the findings of that exploration. The key to quality in science is not the publication but the scientific attitude and rigor, and it is that which is most difficult to quantify.
We can get a sense of the degree of rigor and attitude of a scientist through personal communication and by careful study of their notes and protocols. What we cannot do is assess these qualities through the proxies of the journals they may eventually publish in, through the number of citations their works garner, or by counting the money they spent each year. We should also consider whether it is even worthwhile to spend the time and money on assessments of researchers. Sure, it is a slight waste of money to support an unfortunate researcher, but can you really claim to foretell the unfortunates from those on the verge of “success”?
We seem to be taking another, altogether more dangerous approach: we are on the path to redefining “good science” in terms more easily quantified and judged by bean counters. We are moving to a bastardized version where quality is synonymous with retweets, citations (essentially a more official retweet), impact factors and self-reinforcing grant expenditure. This way, science is rapidly becoming a caricature of itself, not dissimilar from the version portrayed by Aaron Diaz. If we let science be redefined in these terms, then yes, it can be quantified, assessed, evaluated, ranked. And you had better make sure your metric is up to scratch.
But that is not the science I signed up for.
Thank you very much for the response, Jennifer. The more I researched it, the more I found similar thoughts in a wide variety of other places (hence the plethora of links in the article).
I like your insight on the “abstracted monetary unit” (AMU?), but I wonder how one would go about spending them. Perhaps 50 AMUs for a 100k grant? But again, this just causes groups to perfect the mining of AMUs. Good for the mining industry, not good if you want to find out how nature works.
Perhaps we need to retreat back to our ivory towers and just do research ignoring the “market” influences… And fire the administrators on the way back up :).
“But that is not the science I signed up for.”
Neither is it mine. Actually, it is not science at all.
Well done for publishing this, Brian. Thanks, also, to Philip Moriarty for drawing attention to this post.
The new objectives of university management seem to be (1) income and (2) “no nobler goal than a higher ranking”. Science/knowledge does not, itself, come into it. Some universities are secure enough in their ranking to concentrate exclusively on (1). I doubt any are rich enough to give priority to (2). Others want REF, for example, to deliver both.
I once had impostor syndrome. It faded when I found I’d done original research that made a difference. I’d had ideas, then discovered things, of interest to others. Scientists are usually motivated by the desire for recognition, which is a reward. For true scientists, just recognition is the only sort that counts. Self-doubt persists, as it should, being the source of honest self-criticism.
I think I know the real impostors. They are immune to self-criticism, and prone to destroy critics. They give up trying to advance knowledge, and enter the management wonderland where the pay, prospects and working conditions are much better, and where the appearance (which is all that matters) of research achievement (if it still matters) can be purchased as undeserved recognition. The trap they fall into is to imagine that they own others, whose work therefore becomes theirs. Think of it as a modern form of feudalism.
Perpetuating such a system is the only purpose I can think of for Research Assessment and the REF.
As someone who has been editing research and other publications in a variety of fields and teaching scientific writing for a number of years now, I am thrilled to see this debate break out in the open. The eruption of commercial marketing and advertising techniques into scientific discourse is doing a great disservice to many fields across the board. I want to be challenged every day with a new idea, not a new infographic. No marketing advice can replace a truly expert evaluation by someone who is a peer. Exchanges between experts on a particular topic are the key to what makes the ideas presented better. Innovation and impact do not happen until after an idea is tested and adopted by others. How many highly touted EU research projects end up being simply empty websites three years down the line?
The ‘valuation by metrics’ approach is in essence creating a kind of abstracted monetary unit. The “bean counters” are confusing scientific value with the metrics wealth of those individuals. If we are going to go down this path, then we should be able to lose value and invest our beans too, not simply keep amassing metrics-coins like some out-dated industrial production curve. This system is creating a bubble economy where many fields are simply running in place because they keep short-term investing in the same persons (or groups).
This is not the science I signed up for either. Research takes time to do well.
Thanks, John, for that insightful comment.
I think mild impostor syndrome can be helpful at times as it lets you double-check results and cross-reference statements. However, it should not be driven by “rank”.
Following up on your last paragraph, one has to wonder how this metrics-driven system came to be so dominant in science. I suspect that at some point, research was tagged as being “inefficient”. Such a label then causes funding agencies, governments and universities to state they are going to “improve efficiency”. However, the only way to demonstrate measurable success in this is to start quantifying science. Now that’s a tough cookie.
How to quantify the efficiency of science? The trouble is that science and scientists have only very few measurables. As far as I can tell, those are citations, grants and dissemination (presentations, books, papers, of which only the latter has a quantifiable response). Any metric has to be derived from these measurables and therefore emphasizes only these aspects.
Quantification also allows a certain degree of distance between management and the people they *ahem* “manage”. That allows for easy, clean-hands laying off of people simply by pointing at their metric. And thus starts the drive to improve one’s metric.
It is often said that science should be self-correcting, and I suspect that this effect was imagined to be strong enough to counter the negative side-effects of such metrification. However, as we have seen, the self-correcting system is very fragile. It is no longer in one’s interest to correct the literature (slow, much resistance), so poor work stays in the literature and is built upon. The self-correcting system is thus usurped by the metrification.
What can be done? Well, I think one way is to limit the number of publications one can publish to, for example, a maximum of two per year. Grants should be disbursed to institutions based on their size and divided equally within, freeing scientists from grant applications (which should be admin’s job, in my opinion). Impact factors should not count for anything, freeing up time and allowing for more in-depth articles (do you remember those 50-page works from the mid-1900s?). Open peer review should be encouraged, as it is the critical look essential to science.
Wishful thinking, perhaps, but an idea for the future?
Thank you, Brian. I agree with all, except, perhaps, limiting number of publications, but this is a minor point.
In the UK there has been a frenzy of what is claimed as “research assessment”, and a national report on the outcome, termed “REF”, is due on December 18.
In my submitted comment, I made the final words “research assessment and REF” a hypertext link, but WordPress seems to have eliminated it. “Website” works as hypertext, so for this comment it links to the lost URI, namely:
In the UK, this started in the 1980s with what turned out (as one might expect) to be a bunch of tactical initiatives aimed at distributing funds according to a formula rather than history, and at making research more relevant to the economy. The result is that we destroyed much of the framework for technical training, which took place in the Polytechnics (since converted to standard universities), and introduced the RAE (now REF), which savaged a lot of practical research.
We now have a coterie of folk pushing “excellence” without being able to define it at a level that would be sufficient for an undergraduate essay; panels that hire without actually asking questions (real questions) about the research papers of applicants; and the mantras “NCS, IF” in various forms.
These things had a use, but they are past their sell-by date and we should move on, a little wiser.
I suspect that politicians and the Civil Service will be overtaken by events. Few engage with social media, fewer understand the implications of open access, open data and citizen science, or of fascinating experiments in education, such as the University Technical Colleges in the UK, where high school students (y10-y13) are operating like undergraduates on a hybrid 1st year and 3rd year degree course!
I feel I’ve been lucky in not having to deal extensively with metrics to date. Prior to my faculty position, my postdoctoral hiring was done by people in my own field. This removed the necessity for a one number assessment of my abilities.
Possibly the desire for a magic number appears when trying to compare across disciplines (as is needed for faculty hires). Yet this is of course where such numbers become daft: in astrophysics, the student is typically first author. This means that the metric (as dictated by the Imperial formula above) would drop as a researcher becomes more senior, presuming they are doing their job right. So you have the choice between respect from your community (letting the student go first, as is protocol) and securing your job, where such a choice will affect your score.
(I’ve only been in this situation once and I told the person who questioned my recent lack of first author papers to stuff it. But I doubt I’ll have that luxury a second time!)
I wrote a comment on your article about the nanoparticles, but decided not to publish it. Since you re-open this topic, I would like to comment now.
What is the real value nowadays of any paper?
The bottom line is that, since we are all under pressure, with most of us on time-limited contracts, we have to publish in the amount of time our contract provides. Either we do boring, predictable science that can be finished in a year or six months before moving on to the next job / topic, or we blow up our papers. Instead of one conclusive paper, one writes three, breaking all results up to increase the number of first authorships.
A shame for those researchers working on a more difficult topic, for example this bizarre stripy nanoparticle paper. How could one justify letting a PhD student work on such a topic when the publication alone takes years? They will leave empty-handed.
Or what happens if one works in a multidisciplinary field where tasks include sample preparation, intensive programming for data analysis and device control, and construction of complicated sample environments, all at the same time?
To push our impact factor, we advertise our papers to our colleagues, hoping they will be so kind as to cite us, although the paper is not the best one in the field.
Another way to influence our numbers.
Is the problem that the “system” generates too many scientists? PhD students and master students are cheap, so why invest in a permanent position or in a longer post-doc?
I have no suggestion to solve the problems in the science business because there is an endless supply of students not understanding that they are in fact exploited by professors etc.
Note, a similar argument is being made in some European countries, claiming that permanent positions would only invite bad researchers to stay. But are all professors with permanent jobs choirboys? I seriously doubt it.
I don’t like to blow up my work, I won’t lie, and I won’t twist figures or facts or manipulate my data until I see what I hoped for, even if it is not there. I would check for reproducibility. The drawback is that I am considered slow and lacking in high-impact papers. The future is uncertain, and my passion for science will probably be killed by this paper system.
I really do question whether the number of publications, their impact factor, and the number of first authorships really reflect the quality of a scientist and their work. Rather, this system severely erodes the ethics of scientific research (see the stripy nanoparticle story).
I think the hyperlink works in the last post, but just in case, I hid it amongst the links in the article as well.
The idea of limiting (first-author) publications comes from something someone once said to me: “if you publish more than two first-author papers a year, you haven’t thought enough about your work”. I tend to agree with this; good research takes time to gestate, and even if you think you have the answer, it is probably best to sleep on it another night.
I heard about the REF many months ago already, and it is honestly frightening to see how much effort and time is wasted on filling in forms and assessing papers. For once, I am happy to be rank-and-file, so I do not get asked to do this. Once I win a Nobel prize, I will do as Richard Feynman did and say “to hell with all this committee work, I have decided to be irresponsible”.
Dear Dave, thank you very much for commenting and giving us a brief history of the origins of research assessment in the UK. It is indeed a powerful meme to try and “improve the quality and efficiency of science”, but I think it is practically impossible to do so through assessments and funding assignments. I think the move towards public acceptance may also have removed some support for fundamental research.
What has made a big improvement in science is the internet, blogging, open access and peer review forums (at least until the educated trolls show up). This allows for much faster collaboration, dissemination and feedback. Can that improvement be measured, however? I suspect the answer is “not objectively”.
As you say, I hope we can move on from this approach driven by abhorrent metrics. As I mentioned in the reply to Jennifer, perhaps a return to the proverbial “ivory towers”? I suspect, however, that for many managers, you’ll have to pry the metrics from their cold, dead hands.
Dear M. Bernard, Thanks for the comment.
The value of papers is the same as it always has been: referable dissemination of findings. However, it has certainly reduced in importance since the advent of alternative methods of communication. On LaN, I often post small findings or investigations, too small to make a paper out of, but still of interest to the community. Occasionally, I can combine some of these into a large paper. In the ideal world, I would publish but one large paper per one or two years (remember the big, comprehensive papers from the mid-1900s?). This approach would work for bigger, more blue-sky research as well, especially if the support for blogging by scientists improves.
Realistically, however, the situation for many scientists is closer to what you describe. We need to publish every small bit we find just to keep the rate up, and blow up the significance of our findings to pass the “of interest to the community” criterion. As you say, we can improve our own metrics by heavy promotion of our papers, hoping they will be cited more. The problem with this is that your own promotion must be louder than everyone else’s.
We are generating too many scientists too, but that is a whole other complicated problem. There simply does not seem to be sufficient public support to increase the research institute size proportionally to the influx.
The fear of permanent positions is also something I do not understand. Permanence frees up time and removes time constraints, allowing more focus on research. There seems to be such a disproportionate fear of catching a “bad apple” that it has resulted in a near-abandonment of permanence altogether. However, in my opinion, there always has been a constant fraction of bad apples. Putting systems in place to “catch” these places everyone in an atmosphere of distrust. Such an atmosphere breeds bad apples.
As for your last point, no, I don’t think IF, citations and first authorships are the hallmark of a good scientist. Unless you change the definition of “good”.
Thank you, Brian and others. It seems we agree.
The “bad apples” were once merely a waste of resources. I fear they are now in charge, know they are unable to appreciate the science itself, and so fall back on ersatz “measures” to decide who gets resources, and who keeps their jobs. By the way, tenure was abolished in the UK some time in the 1990s. No-one has job security, while bad apples ruthlessly guard their immunity from criticism.
Scientists welcome criticism, of course, knowing that there is no progress without it.
Hi Brian– following on from our twitter discussion, I got thinking about this and started writing a short essay. Fingers crossed it makes sense and all that!
The Business of Science
We strive in the scientific endeavour to explore the world and find out the secrets of life and the universe. Some of us are motivated in the purest sense, uncovering everything and anything that enters their vision. Others are motivated by a desire to enable the progression of society. Many more blend these two aspirations, with intense passion and unwavering curiosity.
As our knowledge expands, the level of training and discussion necessarily increases. This drives progress with specific constraints.
The nature of the consensus of many basic scientific facts, basic in their fundamental contribution to our knowledge but complex in the understanding involved, requires training and collaboration to shape the abilities of the next generation of innovators and re-train the existing population. In the UK this has driven us towards intensive PhD study followed by further postdoctoral posts and fellowships where new approaches are explored and techniques are shared amongst laboratories and institutions. For an experimentalist, these opportunities enable talented individuals to harness equipment and resource that would be unaffordable for a lab starting up. For a theorist, sharing ideas, new approaches and algorithms makes new challenges tractable. Fitting together the jigsaw of new talent, long-game projects, and rewarding excellence involves drawing together scientists from a wide array of backgrounds and managing them effectively to promote the birth of new discoveries, challenging the status quo, and taking part in the grander scientific endeavour.
There will of course be brilliant individuals who can survive with a pencil and paper, but there are likely few lone bright stars who make effective and competitive contributions in isolation. The requirement to build bigger machines and use more complex mathematical and theoretical tools ensures that many more advances are made by large teams in innovative and competitive environments, where new ideas rise under peer challenge and intense discussion. This can be seen in the rise of many-authored publications, centres of excellence, and the popularity of large international conferences. All of these provide scientists with opportunities to collaborate, share insight and discuss new ideas. Embargoes by publishers for ‘top tier’ publications stifle pre-publication critique, hamper discussion, create unnecessary hype and slow our march forward.
Unfortunately science does not exist in a vacuum, and we must justify the time, resource and interest in our calling. This justification competes at the highest level in society with areas such as health, security, and the day-to-day running of entire countries. For scientists this translates into funding applications, impact statements, and writing the highest-quality papers from their quest to unpick nature’s gears and cogs. We must also instil the public with our passions, drawing their focus towards the need for sustained investment and opportunities for science to progress. By some, this may be viewed as a burden and a distraction, necessitating assessment frameworks, impact statements, public engagement and the like. Yet it can also be a significant and exciting motivation, providing inspiration and grand challenges, drawing individuals together to solve problems, generating new technology and innovation, growing the economy, and opening up more resource for science to progress.
Where resource is limited and we have many talented and able scientists, how do we qualify our need for resource and justify ourselves to our funders – typically the taxpaying public, charities and industry? Furthermore, in a competitive market where there is more science to be done than resource available, how do we ensure that the system enables the most progress possible, and that we can justify the resource, effort and time expended? This is further complicated by the diverse range of fields currently explored: from splitting atoms and chasing the most fundamental physics, to developing new materials that enable cheaper transport of people, goods and ideas, to innovative healthcare that increases the quality of life for society.
Where does this leave us as scientists? How do we balance our time and effort? What is an effective way for science to progress and for us, as individuals to remain optimistic about our futures and our thirst for new ideas and innovations?
Resources are limited, and our use of time, money and facilities is precious. A strong proposal system, with peer review, ensures that experiments are well thought out and do not duplicate existing work and thereby waste resource. Ranking and assessment of proposals is fraught with risk, so a panel of peers increases conservatism but mitigates bias and makes life fairer for all. With an increased number of applications, given the growth of new institutes and centres of excellence, there has to be a level of management and reductionism to make the process more efficient and to bring proposal writing and award closer together; crystallisation of ideas and data reduction by the panel are therefore inevitable. Once an award is made, the taxpayer needs to ensure that the money is well spent, and it is only fair to all applicants that the grant is reviewed, with periodic inspection and discussion of deliverables and progress. Clearly, ideas, opportunities and objectives will change over time as fields evolve, so a degree of reasonable flexibility is essential.
Clusters of excellence require sustained funding. These clusters draw good students in and ensure that the whole scientific endeavour is competitive and that the best ideas are being worked on by the best people. This requires management, and not all scientists make good managers. Furthermore, as progress between and within fields is not a continuous march forward, evaluation has to be careful; this requires careful selection of good managers, as well as a culture of trust and sharing, creating a blended team of managers and creators.
Sustained funding without review comes at a cost: it may stifle the creation of new groupings and therefore dampen progress for all. It also hinders academic mobility, whereby fresh ideas from one field are rapidly transplanted into another. Periodic review, with a dialogue between reviewers and reviewed offering constructive criticism and support, is therefore essential. To increase transparency, facilitate openness and reduce the administrative burden, this necessitates reporting and standardised evaluation. At times this will get things wrong, and it may become gamed over time, as individuals chase metrics at the expense of core science. Maintaining a healthy dialogue, and ensuring that peer review is part of the management process, must be rewarded at all times.
Hi, I’m no scientist, I’m an engineer by training, but I do find this conversation very interesting. From my corporate perspective, it seems to me that the problem stems from government funding. Private companies tend to steer their research towards practical things that they can sell. If the research shows their products are dangerous, they will kill the research. Government administrators are no different. They will steer their funded researchers towards social policy that they can “sell”. If the research shows that the current administration’s policies are dangerous or misguided, they will kill the research. When you are funded by government grants, you are participating primarily in social engineering; the science is simply an excuse for the primary agenda of societal change. I think it may be that if there’s anything more destructive than capitalism, it’s socialism.
I think you are correct, Paul. Except to say that universities have, mostly, public funding rather than “government” funding. Government’s idea of “social engineering” is to spend other people’s money on things that it hopes will deliver votes for its own candidates. Whether the objective is capitalist or socialist, killing curiosity-led research is a serious obstacle to discovery.
And the bane of many universities is managers’ claims that they can predict future funding streams, outside of which nothing must be allowed to proceed. They have no idea what these streams are, of course, but it is enough for them to be able to claim that they know.