Let me just flag up, for those of you who aren’t aware of my day job, that I’m the Chief Editor of Nature Chemistry; best I put that at the top of this post considering the subject matter…

***See updates at the bottom of the post***

Many thanks to a reader of this blog for forwarding to me an email that they received from Angewandte Chemie thanking them for being one of their ‘outstanding referees’. I applaud such referee-recognition efforts; more journals should do this.

What caught my eye, however, is that the final paragraph of the e-mail mentions statistics related to Angewandte Chemie (and one other journal; can you guess which one?), backed up with 6 figures. First are a couple of graphs showing where submissions to Angewandte Chemie come from (for 2001-2015) and the geographical breakdown of referees (for 2008-2015). It appears that, in 2015, more submissions came from East Asia (somewhat unhelpfully not defined) than any other region and that more referee reports came from Germany than any other country. I don’t feel that I can reproduce the actual graphs here because I did not make them and nor have I sought permission to reproduce them (hey kids, we call this part of the blog post ‘foreshadowing’).

After discussing submissions and referees, the final paragraph goes on to discuss citation metrics in a somewhat contradictory fashion; I’m going to quote from the e-mail now and assume this won’t get me into any hot water (legal or otherwise):

“Too many scientists are obsessed with metrics these days. Figures 3-6 in the attachment demonstrate that Angewandte Chemie publishes indeed many highly cited papers (one only has to look at absolute numbers).”

Which is a bit weird, no? Hey everyone, too many people are obsessed with metrics these days, so here are MORE metrics for you to look at. Hmm.

Now, you know what, I am going to reproduce figures 3 and 4 from the e-mail that Angewandte Chemie sent to its outstanding referees. The reason that I feel on pretty firm ground doing this (despite Wiley’s history of going after bloggers who have reproduced figures from their journals) is that figures 3 and 4 are, with only minor modifications, mine. They first appeared on this blog in a post about citation distributions in chemistry journals.

Here’s figure 3 from the e-mail (top) and my original (bottom).


And here’s figure 4 from the e-mail (top) and my original (bottom).


Pretty similar huh? Same colours, same fonts… – well, that’s because they are clearly the original images from my blog post.

Did Angewandte Chemie ask my permission to reproduce these figures? No.

Did Angewandte Chemie acknowledge in the e-mail that they sent to their referees where these figures came from? I don’t think so (at least not as far as I can tell; perhaps the figures contain hyperlinks, but in the flat version forwarded to me, there is nothing that appears to attribute the figures to me).

What Angewandte Chemie have done, however, is to add logos for GDCh, Angewandte Chemie and Wiley-VCH to the bottom of each figure. Stay classy Angewandte, stay classy. You are now the Daily Mail of chemistry publishing. You couldn’t even be bothered to get the data yourself and re-plot it!

To be fair, they have invested a huge amount of time and effort to add a title and a footnote to each figure, but I’m not sure that means you can ignore where you got the actual figure from.

Now, I get the impression that Angewandte Chemie is unhappy with my citation-distribution blog post. ‘Why?’, I hear you ask. Well, we also get treated to figures 5 and 6. Figures 5 and 6 from the e-mail that they sent to their referees are not my figures, so I will not reproduce them here (take note, Angewandte). I will, however, describe them to you. Figure 5 basically puts the data from figures 3 and 4 onto a single graph, but retains the y-axis scale from figure 4. What this does, of course, is put the Nature Chemistry data in the noise. This figure very clearly proves that Angewandte Chemie publishes more papers than Nature Chemistry; I’m pretty sure there are simpler ways to plot this, however.

Now, figure 6. Figure 6 takes the portion of figure 5 that looks at the citation range from 40-100+ citations (except, of course, they simply use the label ‘100’, not ‘100+’ like I did on my graphs… c’mon Angewandte, attention-to-detail folks!). Now, what this graph shows is that when you consider citations in 2014 to papers published in 2012 and 2013, Angewandte Chemie has many more that have accumulated 40-100+ citations than Nature Chemistry. This is not surprising. Angewandte Chemie publishes many very good papers and many highly cited papers. And considering that in 2012-2013 it published almost 20 times as many papers (reviews and research papers) as Nature Chemistry, of course it’s going to have more.

The purpose of my original blog post was, for the most part, to examine citation distributions – it was a follow-up to another blog post that looked in detail at what contributed to Nature Chemistry’s 2014 impact factor. Sure, I can see how the post could come across as a citation pissing-contest, but the only specific comparisons I make in the text of the post are somewhat superficial and are between Angewandte Chemie and the Journal of the American Chemical Society (JACS) and between Angewandte and Chemical Science. I explicitly pointed out that the y-axis is very different across the charts I plotted for 6 different journals. I did offer a way in which you could compare journals if you wished to, by considering the percentage of published items with a given number of citations. The only way to make any comparisons is to normalize for quantity (and even then it might not be all that meaningful).

Anyway, I find it interesting that Angewandte Chemie compared itself to Nature Chemistry in its e-mail to its referees. I thought I would be helpful and do the comparison for them to JACS; the graphs are below. In terms of published items, the 2012-2013 counts are 4572 (Angewandte) and 5939 (JACS), so roughly similar – well, much more similar than Nature Chemistry and Angewandte for a start.

Here’s my version of figure 5 from Angewandte’s e-mail to its referees (only with JACS data instead of Nature Chemistry):


And here’s my version of figure 6 from Angewandte’s e-mail to its referees (only with JACS data instead of Nature Chemistry):


Well, look at that, JACS publishes more highly cited papers than Angewandte Chemie! They also publish more papers in total. Read into that what you will.

So Angewandte, if you do fancy using these graphs in your next e-mail to your referees, don’t put your logo on them and do acknowledge where you got them from – but I do give you my permission to re-use them for that purpose.

With regard to your unauthorized and unattributed re-use of my figures in the e-mail that you have already sent, I kindly ask that you send a follow-up e-mail to everyone that received it noting where the images came from, including a link to the original blog post.

I don’t know if your use of my images in an e-mail that went to hundreds of people counts as ‘publishing’ them, but regardless, it seems only fair to give credit where credit is due – just like you are doing with your outstanding referees.

UPDATE 8:45 am, 15 March: @angew_chem have apologized on Twitter and have promised to follow up with the referees; I thank them for getting back to me so promptly and for doing what I asked.


UPDATE 3:45 pm, 15 March: I’ve had a very nice e-mail from Angewandte Chemie apologizing for their oversight and they also shared with me the draft of the follow-up e-mail that they intend to send to the referees and asked me for any comments I had concerning it. I take it back, they are not the Daily Mail of chemistry publishing after all.

UPDATE 3:35 pm, 17 March: As pointed out in the comments below, and also confirmed in e-mails from others, Angewandte has indeed followed up with the referees as they said they would. I thank them for doing this so quickly and without any fuss whatsoever.

We need to talk about Twitter…

As part of Materials Week at the University of Warwick, I was asked to talk about social media and how it is used by scientists (and of course I threw in a bit about how journals use it too). Because it is what I know best, the majority of my talk focused on Twitter, with a side of Tumblr, Facebook, YouTube and blogging. And I figured that the best way to find out the reasons why scientists use Twitter was to ask them, and so…

The response was great (and was kindly Storified by @CrimsonAlkemist). I’m certainly not the first to throw this question out there on Twitter; I know others have done it before and people have also blogged about why they use Twitter too (for example, see this post by @Alexis_Verger). Please do point to other such posts in the comments as well – I imagine there are plenty of others out there.

Anyway, the presentation that I threw together can be found by clicking on the image below; it’s essentially a series of screenshots that I talked around. I think the talk needs refining somewhat, but this is a good place to start and I’ll hone it from here if I give it again in the future… if you wish to use any of the slides yourself, please feel free to do so.


Imperfect impact

The problems with impact factors are well known – I could give you a long list of things to read that explain why, but just start with this blog post from Stephen Curry and go from there.

I have a slide that I use in my talks that sums up one particular problem – that the impact factor (IF) of any given journal tells you absolutely nothing about any given article in that journal. For example, the current IF of Organometallics is just over 4, whereas Nature’s is more than 10 times that at just over 41. But does that mean that every Nature paper is 10 times ‘better’ than every Organometallics paper? (Answer: of course not! – and how on Earth would you measure ‘better’ anyway?). It also doesn’t guarantee that a particular Nature paper will have received more citations than any given Organometallics paper (after all, a wide distribution of citations makes up an IF). Considering the perverse incentives in science, however, I wonder how many people would rather have on their CV an Organometallics paper that has received 50 citations in a year instead of a Nature paper that has garnered only 10 in the same period of time?

Anyway, I digress. The slide I have looks at things from a different point of view. Wouldn’t it be interesting if you could take exactly the same paper and publish it at roughly the same time in a bunch of different journals? Take your fancy-metal-catalyzed-cross-coupling-based synthesis of tenurepleaseamycin and submit it to (and have it published in) Angewandte, JACS, Nature Chem, Science, JOC, Tet Lett and Doklady Chemistry and then sit back and see how the citations roll in. Of course, it’s the same paper – it’s not a better paper in one journal than another, so it will get cited roughly equally in all journals, right? Well, all you can really do is speculate, because if you did try to do exactly that you’d end up really annoying some chemistry-journal editors and you might not get the paper published anywhere (well, I can think of a few places that would probably still take it, but discretion is the better part of valour and all that).

Well, never fear! The experiment has been done. Although it wasn’t an experiment, it wasn’t done for the purpose of comparing citations in different journals and it’s happened more than once. It turns out that in medical publishing, editorials/white papers occasionally get published in more than one journal. So, say hello to ‘Clinical Trial Registration – Looking Back and Moving Ahead’. A few years back, I looked at the citations this paper had received in a range of different journals and the IFs of those journals – the slide from my talk with all of the data on it is shown below.


There’s a pretty good correlation between the number of citations that this identical paper received in each journal and the IFs of those journals. Of course, perhaps more people read the New England Journal of Medicine than the Medical Journal of Australia and so a wider audience will likely mean a wider potential-citation pool. Whatever the reasons (and it’s not all that difficult to come up with others), the slide shows how silly it is to assume that the IF of a journal has any bearing on how good any particular paper in that journal is. As I have said before, the only way to figure out if a paper is any good is to actually read the damn thing – the name (or IF) of the journal in which a paper is published should never act as a proxy for how awesome (or not) a paper is.

So, as well as pointing out one specific flaw in the IF, this slide also lets me make a joke about how the correlation would be even better if it wasn’t for some (imaginary, I hasten to add) Croatian citation ring… I apologize if I have offended any Croatian doctors who happen to be reading this… but the joke usually gets a laugh.

Chemistry journal citation distributions

Over at my day job, I recently looked at the distribution of citations that 2012 and 2013 Nature Chemistry papers (Articles, Reviews and Perspectives) received in 2014 – essentially the citations that are used to calculate the 2014 impact factor of the journal. I would recommend having a read of that post before ploughing through this one. I’ve now done the analysis for five other general chemistry journals, just to see how they all stack up. In each case, the data is from Web of Science (All Databases) and is refined by document types ‘Article’ and ‘Review’. In the Sceptical Chymist post I also did the calculation for Nature Chemistry after removing the Review articles from the data, but haven’t done that here.
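For anyone who wants to try this sort of thing on their own export, here is a minimal Python sketch of the two calculations involved: the mean number of citations per item (a simplified stand-in for the impact factor, whose official numerator and denominator are defined a little differently) and the per-citation-count tallies behind the bar charts, with everything at or above 100 pooled into a ‘100+’ bin. The function names and the sample numbers are made up for illustration; they are not real data for any journal.

```python
from collections import Counter


def mean_citations_per_item(citations):
    """Mean citations per published item (a simplified, impact-factor-style average)."""
    return sum(citations) / len(citations)


def citation_histogram(citations, cap=100):
    """Count items at each citation value, pooling everything >= cap into a 'cap+' bin."""
    counts = Counter(min(c, cap) for c in citations)
    labels = [str(n) for n in range(cap)] + [f"{cap}+"]
    return {label: counts.get(n, 0) for n, label in enumerate(labels)}


# Hypothetical citation counts for ten papers (NOT real journal data).
sample = [0, 0, 1, 2, 2, 3, 5, 8, 40, 139]

print(mean_citations_per_item(sample))  # 20.0, even though 8 of the 10 papers have fewer than 10
hist = citation_histogram(sample)
print(hist["0"], hist["2"], hist["100+"])  # 2 2 1
```

Even in this toy example, one heavily cited paper drags the mean far above what a typical paper receives, which is why the shape of the distribution is worth looking at.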

So, here is what Nature Chemistry looks like:


And here’s JACS, Angewandte Chemie (the International Edition), Chemical Science, Chem Comm and Chem Eur J (note that because of the wildly different volume of content across the 6 journals, the scale on the y-axis changes quite significantly – as does the smoothness of the distribution; also, for Chem Comm and Chem Eur J, I have included magnified sections of the later portions of the distributions):






One way that you can compare journals that publish vastly different numbers of papers is to look at the percentage of published items that have more than a given number of citations. For example, each journal has 100% of papers with 0 or more citations, but what does the percentage drop to when you consider papers with 1 or more citations? If 5% of a journal’s papers have 0 citations in 2014, then the second point plotted on the graph would appear at 95% (i.e., 95% of papers would have one or more citations). If you do this analysis for the 6 journals above, this is what you find:

Percentage of papers with n or more citations for each of the 6 journals.

If you stack these graphs on top of one another, you can then compare (for the most part) across the 6 journals:

Percentage of papers with n or more citations, with the 6 journals overlaid.

It’s interesting to note that JACS compares favourably to Angewandte, even though Angewandte publishes far more review-type articles, and also note how Chemical Science is not all that far behind Angewandte when you do this sort of analysis.
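The ‘n or more citations’ calculation described above is easy to reproduce. Here is a minimal Python sketch, assuming you have a plain list of per-paper citation counts; the function name and the sample list are hypothetical, chosen so that 5% of the papers have zero citations, as in the worked example in the text.

```python
def percent_with_n_or_more(citations, max_n):
    """For n = 0..max_n, the percentage of papers with at least n citations."""
    total = len(citations)
    return [
        100.0 * sum(1 for c in citations if c >= n) / total
        for n in range(max_n + 1)
    ]


# Hypothetical journal with 20 papers, one of which (5%) is uncited.
sample = [0] + [1] * 9 + [2] * 6 + [5] * 4

curve = percent_with_n_or_more(sample, max_n=6)
print(curve[0])  # 100.0 (every paper has 0 or more citations)
print(curve[1])  # 95.0  (the second point on the graph)
```

Because every curve starts at 100% regardless of how many papers a journal publishes, the curves for different journals can be laid on top of one another.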

Back to the future (of chemistry publishing)

So, here’s my obligatory Back-to-the-Future Day post and, because it is me doing this, it’s obviously about chemistry publishing. I figured I’d compare one issue of a journal published in 1985, with an issue published in 2015. Because the last time I looked at chemistry publications over a particular period of time I chose JACS, I thought I’d do Angewandte Chemie (the English edition) this time so that my friends over at Wiley don’t feel all left out. So, I looked at the October issue from 1985 (yes folks, there was only one issue of Angewandte each month in those prehistoric times) and compared it with the October 26th issue from 2015 (which is 5 days from now – and that seems appropriate considering the time-travel inspired nature of this post).

I just looked at the ‘Communications’ section of the issue in each case (that’s 27 papers from the 1985 issue and 51 papers from the 2015 issue) and this is what I found from these – admittedly tiny – samples:

1. Papers now have more authors on them than 30 years ago. The average (mean) for a paper in the 1985 issue was 3.07 authors, whereas it is more than double that in the 2015 issue at 6.37 authors per paper (the medians are 3 and 6, respectively).

2. Papers are now longer than they were 30 years ago. The average page extent for a paper in the 1985 issue was 2.15 pages, whereas it is now more than double that in the 2015 issue at 4.86 pages per paper (that’s just based on page ranges; not full printed pages in the journal). For comparison, the medians are 2 and 5, respectively.

***UPDATE – see comment below from @fluorogrol and my replies***

3. The geographical spread of corresponding authors is much greater now than it was 30 years ago. In 1985, German authors dominated Angewandte Chemie, but that’s not true anymore it seems – just look at the charts below.

Breakdown of geographical location of corresponding authors in Angewandte Chemie.

As I mentioned above, these are really small samples so do take the analysis with a pinch of salt. That said, @fxcoudert has looked at these trends in more depth in the past and I highly recommend that you go and check out these two blog posts here and here.

I don’t know if this counts as #OldTimeChem or #FutureTimeChem (or perhaps a bit of both), but anyway, this is my little bit for #RealTimeChem week 2015!

All your base are belong to JACS

This is a follow-up post to yesterday’s that looked at word clouds made up from the titles of JACS papers from the last 115 years.

Jake Yeston commented on Twitter about the lack of catalysis-based words in the clouds. This is something that also caught my eye and I’ve now had a chance to dig a little deeper into this.

The way the word clouds work (the ones you can make using Wordle at any rate) is by counting exact copies of the same word and then scaling the size of the word in the cloud in proportion to the number of times it appears in the input text. So, if you look closely at the word clouds from yesterday’s post, you will see ‘reaction’ and ‘reactions’ both appearing in the same word cloud. Similarly, acid and acids, complex and complexes, study and studies, and so on. Wordle also does not separate hyphenated words, so you will see things like ‘gas-phase’ and ‘electron-transfer’.
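To make the exact-match behaviour concrete, here is a minimal Python sketch that mimics this kind of counting. It is only an approximation of what is described above, not Wordle’s actual code: it lowercases each title and splits on whitespace, so hyphenated terms stay intact and singular/plural forms are counted separately. The titles are invented for illustration.

```python
from collections import Counter


def count_title_words(titles):
    """Count exact (lowercased) whitespace-separated words across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(title.lower().split())
    return counts


# Hypothetical titles (not taken from JACS).
titles = [
    "Palladium-Catalyzed Reactions of Acids",
    "A Gas-Phase Reaction",
    "Pd-Catalyzed Reaction of an Acid",
]

counts = count_title_words(titles)
print(counts["reaction"], counts["reactions"])               # 2 1
print(counts["palladium-catalyzed"], counts["pd-catalyzed"])  # 1 1
print(counts["catalyzed"])                                    # 0
```

Note that ‘catalyzed’ on its own scores zero here, even though two of the three titles are about catalysis.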

What does this mean for catalysis? Well, I started looking through the titles for the 2010-2014 data and found all of the following words (and there are probably other variants that I missed):

anticatalysis, autocatalysis, autocatalytic, biocatalysts, biocatalytic, catalase, catalysis, catalyst, catalytic, catalytically, catalyze, catalyzed, catalyzes, catalyzing, cocatalysis, cocatalytic, cocatalyzed, electrocatalysis, electrocatalyst, electrocatalysts, electrocatalytic, electrocatalyze, multicatalytic, nanocatalysts, organocatalysts, organocatalytic, photocatalysis, photocatalyst, photocatalysts, photocatalytic, precatalyst

This means that catalysis is being spread quite thin and not being lumped together as a single entry in the word clouds. But it gets worse. In the 2010-2014 cloud, if you look carefully you can find ‘palladium-catalyzed’… and remember what I said above about Wordle not separating hyphenated words? Not only is ‘palladium-catalyzed’ counted separately from ‘palladium’ and ‘catalyzed’, but also separately from things like ‘Pd-catalyzed’ too. And obviously you get lots of different ‘X-catalyzed’ terms, such as ‘gold-catalyzed’, ‘Rh-catalyzed’, ‘copper-catalyzed’, and so on. There’s an awful lot of catalysis going on, it just isn’t adequately captured in the word clouds. On the other hand, consider the word ‘synthesis’ — sure, it might lose some of its count to ‘synthetic’, but that’s about it; there aren’t anywhere near as many derivatives of ‘synthesis’ as there are of ‘catalysis’.

To get a sense of how much catalysis (in any and all of its guises) has been published in JACS down the years, I went back to the lists of titles and then searched for ‘catal’ as a fragment. For comparison, I did the same for ‘synth’ and what I found is plotted below.


In the 2000s, ‘catal’ words were almost level with ‘synth’ words, and by the end of the current decade, it looks very much like they will be in the lead. Is this the decline of synthesis?
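A fragment search of this kind can be sketched in a few lines of Python. This is a minimal illustration that assumes the count is of individual words containing the fragment (counting titles instead would be an equally reasonable reading); the function name and the titles are hypothetical.

```python
def count_fragment(titles, fragment):
    """Number of words, across all titles, that contain the given fragment."""
    fragment = fragment.lower()
    return sum(
        fragment in word
        for title in titles
        for word in title.lower().split()
    )


# Hypothetical titles (not taken from JACS).
titles = [
    "Gold-Catalyzed Synthesis of Photocatalysts",
    "Total Synthesis of a Natural Product",
    "Organocatalytic Methods",
]

print(count_fragment(titles, "catal"))  # 3
print(count_fragment(titles, "synth"))  # 2
```

Unlike the exact-match word cloud, this catches ‘gold-catalyzed’, ‘photocatalysts’ and ‘organocatalytic’ in one go.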

Now, as I pointed out in yesterday’s post, it seems as though chemists really have something for acid and acids. Those words dominate the clouds in the early-to-mid part of the 20th century. On Twitter, Cafer Yavuz suggested that ‘base’ and ‘basic’ might be excluded as part of the set of common words, but I don’t think that is the case. Wanting to get a sense of acid vs base, I repeated the ‘catal’/‘synth’ analysis for these words. The results are plotted below:


The analysis is not perfect, partly because ‘base’ and ‘basic’ can have different meanings (more so than acid and acidic), and ‘base’ is also a fragment of ‘based’ which might be adding to its total. Nevertheless, something interesting appears to be happening. When it comes to acids and bases, it seems that the balance of power (in JACS at least) is shifting — where acids once ruled supreme, bases took the crown in the 2000s and seem to be consolidating their position in the current decade.
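The ‘based’ problem is easy to demonstrate, and one possible way around it (not something done for the plot above) is to match whole words rather than fragments. Here is a minimal Python sketch using the standard `re` module; the helper names and the titles are hypothetical.

```python
import re


def count_fragment(titles, fragment):
    """Number of words, across all titles, that contain the given fragment."""
    return sum(
        fragment in word
        for title in titles
        for word in title.lower().split()
    )


def count_whole_words(titles, words):
    """Number of whole-word matches for any of the given words (case-insensitive)."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    return sum(len(pattern.findall(title)) for title in titles)


# Hypothetical titles (not taken from JACS).
titles = [
    "A Copper-Based Catalyst",
    "Reactions of a Strong Base",
    "Basic Properties of Amines",
    "Acid-Base Equilibria",
]

print(count_fragment(titles, "base"))               # 3 (includes 'copper-based')
print(count_whole_words(titles, ["base", "bases"]))  # 2 (excludes 'copper-based')
```

Whole-word matching still cannot tell a chemical base from a database, of course, so the different-meanings caveat stands either way.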

If you have any questions about the analysis (or other things you want me to look for in the titles), just leave a comment or drop me a line on Twitter. Similarly, if you want the raw data, drop me a line by e-mail; I’m happy to share.

115 years of JACS titles

When Nature Chemistry celebrated its 5th anniversary last year, we put together a word cloud (using Wordle) featuring the 150 words that appeared most often in the titles of the papers we had published up to that point. That was a collection of just under 600 papers, but a clear winner did emerge — ‘synthesis’ was the word used in titles more than any other (excluding some common words such as ‘from’, ‘by’, ‘to’, ‘with’, ‘and’, ‘so’, ‘on’…). It seems that a large part of chemistry is still very much about making things, and that reminds me of one of my favourite chemistry quotes:

‘la chimie crée son objet’ (chemistry creates its object) — Marcellin Berthelot, 1860.

The Nature Chemistry title-word cloud was not based on a particularly large data set, however, and is also from a very recent period. I wondered if the titles of chemistry papers have changed much over time, and so I decided to look to a journal with a lot more history. I wanted it to be a general chemistry journal to ensure there was no intrinsic bias towards words associated with a particular sub-field within chemistry and so I turned to the Journal of the American Chemical Society (JACS).

The date range I chose is somewhat arbitrary, but round numbers have a certain appeal and so I started at 1900 and worked my way up to 2014, the most recent complete year of JACS papers. This amounted to a little over 168,000 article titles and just shy of 2,000,000 words in total. I may well do more analysis in time, but first of all I decided to break down the data into decades (including a half-decade of 2010-2014 to cover the most recent papers) and look at the most popular 150 words for titles in each given period (excluding the same common words as we did when analysing the titles of Nature Chemistry papers).

Note that the size of each word corresponds to the number of times it appears in titles in that period — the larger it is, the more it is used. I have not combined words with the same root and nor have I combined singular and plural versions of the same word. I have made everything lowercase for the sake of simplicity though (otherwise ‘Synthesis’ appears as a separate entry to ‘synthesis’). Also, the number of papers published varies a lot between decades, so comparing the sizes of words between different clouds is meaningless.
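For the curious, the per-decade counting described above could be sketched roughly as follows in Python. This is a minimal illustration under stated assumptions, not the actual workflow (the clouds themselves were made with Wordle): the input is assumed to be a list of (year, title) pairs, the common-word list contains only the examples quoted earlier plus a few obvious additions of my own, and the sample records are invented.

```python
from collections import Counter, defaultdict

# 'from', 'by', 'to', 'with', 'and', 'so', 'on' are quoted in the post;
# 'of', 'the', 'a', 'in' are assumed additions for this sketch.
COMMON_WORDS = {"from", "by", "to", "with", "and", "so", "on", "of", "the", "a", "in"}


def top_words_by_decade(records, top_n=150):
    """Map each decade (e.g. 1900) to its top_n (word, count) pairs."""
    by_decade = defaultdict(Counter)
    for year, title in records:
        decade = (year // 10) * 10
        # Lowercase so that 'Synthesis' and 'synthesis' are one entry.
        words = [w for w in title.lower().split() if w not in COMMON_WORDS]
        by_decade[decade].update(words)
    return {d: c.most_common(top_n) for d, c in sorted(by_decade.items())}


# Hypothetical records (not real JACS titles).
records = [
    (1903, "The Determination of Sulphur in Oil"),
    (1907, "Determination of Sugar"),
    (2012, "Synthesis of Graphene via Catalysis"),
    (2013, "Total Synthesis of a Natural Product"),
]

result = top_words_by_decade(records)
print(result[1900][0])  # ('determination', 2)
print(result[2010][0])  # ('synthesis', 2)
```

Grouping with `(year // 10) * 10` puts 2010-2014 into a ‘2010’ bucket, which matches the half-decade used for the most recent papers.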

This is what I found:



So, chemists at the start of the 20th century (yes, I know the century started on January 1st, 1901, but just go with it) were a determined bunch who liked to study milk, oil, wheat, sugar and urine — perhaps not all at the same time. Also, note the presence of a decent-sized ‘sulphur’. Yes, sulphur, with a ‘ph’. And remember, this is JACS, with all its American-ness. There’s not a hint of a ‘sulf’ to be found in JACS titles in this decade!



Still a healthy dose of determination, but also a lot of acid. And now ‘sulphur’ has become ‘sulfur’ — in fact, there are 143 ‘sulf’-based words and only 17 ‘sulph’ ones in titles from this decade.



Acid still looms large, but a lot of derivatives and compounds now too. Note that there is a lot more preparation than there is synthesis.



Seriously, what is it with chemists and acid? Compounds and derivatives remain popular and it seems as though synthesis is catching up a little with preparation.



The age of synthesis is upon us. And note the appearance of the word ‘spectra’ too. Also, ‘esters’, what’s going on there?



Synthesis remains dominant, but words such as ‘kinetics’ and ‘mechanism’ are growing larger, suggesting that there is an increasing drive to understand reactions as well. And ‘stereochemistry’ rears its head in the cloud for the first time.



Synthesis is not quite as prominent in the 1960s, but still a popular word in the titles of JACS papers. A new (and quite prominent) entry is ‘resonance’, along with ‘magnetic’, and note that both ‘nuclear’ and ‘proton’ are there too, reflecting the growing use of NMR as a technique to characterize chemical compounds. Another notable entry: ‘carbonium’ (the old name for carbocations), which was an active area of research at this time.



Chemists’ fascination with acid finally seems to be wearing off somewhat. And ‘complexes’ is now much more prominent. I suspect that this is a result of host–guest chemistry really taking off in the 1970s and the word ‘complex’ being associated with many more things than just traditional metal-coordination compounds.



There’s a fairly sizeable entry for ‘total’, and the vast majority of the time it is used in the context of ‘total synthesis’ – and ‘synthesis’ itself dominates once more. Also note that the popularity of the word ‘via’ is increasing and both ‘novel’ and ‘new’ are well used (‘new’ seems to be a fairly constant presence in titles throughout the decades).



There’s still an awful lot of synthesis going on.



Nanotubes and nanoparticles make an appearance in the top 150 for the first time — nano comes of age? Other notable first-time entries (although small) are ‘supramolecular’, ‘self-assembly’ and ‘quantum’; I’m a little surprised it took so long.



Synthesis remains at the top, but look at the topics creeping into the top 150. ‘Metal–organic’ and ‘framework’ herald the growing popularity of MOFs and it’s easy to miss, but there is also a little innocuous ‘graphene’ creeping into the picture at the bottom. ‘C–H’ is growing in size too, which is usually found in titles in the context of C–H activation. And finally, chemists’ love of ‘via’ is sealed!

To summarize, here are the top-ten words for each period:


(EDIT added June 3rd: I forgot to mention when I first posted this that for the top-ten lists I did combine simple singular and plural versions of the same word, so ‘reaction’ is actually ‘reaction’ and ‘reactions’ combined. Same goes for study/studies, complex/complexes, acid/acids and some of the others. What I did not do, however, is go beyond that and combine words that share the same root, so ‘synthesis’ and ‘synthetic’ have not been counted together and nor have ‘molecule’ and ‘molecular’, for example.)
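Combining simple singular and plural forms, as described in the note above, might look something like the following Python sketch. It is only one possible approach and an assumption on my part about the details: a plural is folded into its singular only when that singular also appears in the counts, and nothing is done about shared roots. The function name and the counts are hypothetical.

```python
from collections import Counter


def merge_simple_plurals(counts):
    """Fold '-s', '-es' and '-ies' plurals into a singular that is also present."""
    merged = Counter()
    for word, n in counts.items():
        singular = word
        if word.endswith("ies") and word[:-3] + "y" in counts:
            singular = word[:-3] + "y"   # studies -> study
        elif word.endswith("es") and word[:-2] in counts:
            singular = word[:-2]         # complexes -> complex
        elif word.endswith("s") and word[:-1] in counts:
            singular = word[:-1]         # reactions -> reaction
        merged[singular] += n
    return merged


# Hypothetical counts (not real JACS data).
counts = Counter({
    "reaction": 5, "reactions": 3,
    "study": 2, "studies": 4,
    "complex": 1, "complexes": 6,
    "synthesis": 7, "synthetic": 2,
})

merged = merge_simple_plurals(counts)
print(merged["reaction"], merged["study"], merged["complex"])  # 8 6 7
print(merged["synthesis"], merged["synthetic"])                # 7 2
```

As in the top-ten lists, ‘synthesis’ and ‘synthetic’ stay separate because they share a root but are not a singular/plural pair.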

Just to give you a sense of scale, if you don’t exclude the really common words, the top-20 words for the last full decade (2000-2009) are shown below (and remember that the words are scaled relative to the number of times they appear – the larger the word, the more times it appears in JACS titles).


So, the most common word in JACS titles is probably ‘of’ or, more meaningfully, ‘synthesis’.

(EDIT added June 3rd: there’s now a follow-up post, with some cautionary notes about word clouds and how they can miss some concepts…)

