Monday, April 1, 2013

Do people really still badmouth DOE?

In this week's C&EN, Rick Mullin writes about Design of Experiments in the chemical and pharmaceutical industry and gives a helpful quick history of the concept and the field. Towards the end, the article veers into an interesting set of comments about chemists and chemical engineers:
Drug companies also have identified the need to gain better control of their development and manufacturing process. The FDA guidance not only encourages the application of statistical tools such as DOE, according to Julia O’Neill, director of engineering at Merck & Co.’s West Point, Pa., facility, it is forcing companies to use them. “There is a new expectation that manufacturers will be able to demonstrate they are using statistical methods for monitoring results and monitoring processes to ensure they remain in a validated state,” O’Neill says. “That drives an expectation back upstream that statistical methods will also be used during the design and qualification stages.” 
O’Neill, a chemical engineer who came to Merck from the chemical maker Rohm and Haas eight years ago, acknowledges that, prior to the recent push from FDA, the chemical industry had made more headway than the pharmaceutical industry on DOE. In doing so it also paved the way for the drug industry. “There is a whole set of tools just waiting to be taken off the shelf and used to achieve similar goals in pharma,” O’Neill says. 
But the cultural divide between chemical engineers and chemists may be harder to cross. “I have heard people say that statistics are for people who don’t know the science,” she says. “I totally disagree.” 
Some chief executive officers say they are anxious to close the divide. “Concepts like statistical design of experiment have gained a lot of ground in recent years, but for mainstream chemists they are somewhat anti-intuitive,” says Siegfried CEO Rudolf Hanko. “They are 180 degrees against what you learn in grad school, where you learn to look for a cause-consequence relationship and the fundamental rule is you only change one parameter at a time.” 
In an industrial setting, where the focus is on efficiency, changing one parameter at a time takes too long, Hanko says. “With DOE, you create a hyperdimensional space from which you determine the highest optimum of whatever parameter you are looking at. The fact that this is against the nature or at least the education of most technical people has led to a situation where the uptake was very slow in the industry.”
I have a bit of a suspicion that there is some level of strawmanning going on here. I think most chemists recognize that DOE is a very powerful tool, especially for situations in which all the interactions between different variables are not fully understood. I think that most chemists also intuitively understand that one-variable-at-a-time is not the most efficient way of running experiments. I would like to hear specific examples of resistance to DOE before I believe that there's a real cultural divide.

(Also, I would like to hear from DOE supporters about areas where they believe that this tool would not be helpful. I think that would be important, too.)
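
For anyone who wants to see the one-variable-at-a-time point in numbers, here is a minimal sketch. Everything in it is an assumption for illustration: the response function `simulated_yield`, its coefficients, and the two factors (call them temperature and concentration, in coded -1/+1 units) are made up and come from neither the article nor any real process. It compares the effects estimated from a two-level full factorial (4 runs) with what you get from changing one variable at a time from a baseline (3 runs) when the two factors interact.

```python
from itertools import product


def simulated_yield(t, c):
    """Hypothetical response in coded units (-1/+1); coefficients are invented."""
    return 60 + 5 * t + 3 * c + 4 * t * c


def full_factorial(levels=(-1, 1), n_factors=2):
    """All combinations of the given levels for n_factors factors."""
    return list(product(levels, repeat=n_factors))


def factorial_effects(runs, response):
    """Main effects and the two-factor interaction from a 2x2 factorial.

    Each effect is the mean response at the +1 setting minus the mean
    response at the -1 setting of the corresponding contrast column.
    """
    ys = [response(t, c) for t, c in runs]
    half = len(runs) / 2
    return {
        "T": sum(t * y for (t, c), y in zip(runs, ys)) / half,
        "C": sum(c * y for (t, c), y in zip(runs, ys)) / half,
        "TxC": sum(t * c * y for (t, c), y in zip(runs, ys)) / half,
    }


def ovat_effects(response, baseline=(-1, -1)):
    """One-variable-at-a-time: move each factor to +1 while holding the other at baseline."""
    t0, c0 = baseline
    y0 = response(t0, c0)
    return {
        "T": response(1, c0) - y0,
        "C": response(t0, 1) - y0,
    }


if __name__ == "__main__":
    runs = full_factorial()
    print("factorial:", factorial_effects(runs, simulated_yield))
    print("OVAT:     ", ovat_effects(simulated_yield))
```

With these invented coefficients, the factorial recovers both main effects and the interaction, while the one-at-a-time runs only see each effect at the baseline setting of the other factor and have no way of estimating the interaction at all. Whether that matters for a real reaction depends on how strong the interactions actually are, which is exactly the question under debate.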

12 comments:

  1. I am a chemist who has used DoE for decades and see it increasing in utility. Probably the main advance I have seen is that the programs to design and execute DoEs are much easier to apply without extensive experience. I think the basic problem remains training and exposure. DoE is integral to most ChemE education, while the majority of chemists get zero introduction until they are in industry, where ChemEs then present it as a panacea for all process understanding. Like many tools, DoE can be very useful if applied appropriately, but it can also mislead or give a false sense of security if not run or interpreted cautiously. Chemists often argue that it only provides answers "we already knew", and ChemEs can get caught up in a "you can't understand without statistics" defense. Although chemists realize that interactions between variables can have an impact, they typically find they can gain sufficient ability to control reactions by experimentation without matrices. Chemists admittedly are much more liable to reach firm conclusions on very limited data sets, which makes ChemEs uncomfortable. ChemEs can make wonderful graphics that proclaim they can predict everything without knowing the hidden reaction mechanisms that keep chemists awake at night. At the same time, DoE, if done well, can provide insights that would not be so obvious otherwise, and especially these days it is something the FDA expects as part of development.

  2. DOE! I have a hard time convincing students that experiments should be repeated (let alone reproduced).

    1. I remember Stuart Hunter teaching us many years ago and quipping, "You know why chemists don't repeat experiments? They don't know what to do with the second number."

  3. I hate them. They give you little or no insight into underlying principles that could be learned and used elsewhere - except to run another DOE. For instance, in a pressure-sensitive adhesive formulation DOE, you may see that adding more tackifier increases the tack. So marketing wants more tack? Then add more tackifier...until you suddenly see a decrease in tack. Now what?

    The problem is that the underlying physics drive the results, not an artificial concept like tack. When you see that the tackifier is lowering the plateau modulus (hence more tack) but also increasing the Tg, you realize that you can overdo it. Raise the Tg too much and you have a tack-free material, plateau modulus be damned.

    But a DOE will never educate you about this. It will only give directions based on what the inputs were. Garbage in, garbage out. There is a good reason you never see a DOE in Nature, JACS, or pretty much any scientific journal. You don't learn anything fundamental from them. And as soon as we give up our focus on the fundamentals, we are all done.

    1. The problem is not DOE; the problem is building a correct model. If a response changes direction, it just means you did not account for all the factors. In that sense, yes, a garbage model will give garbage predictions.

    2. Yes, you can gather that and that is correct. But the bigger point is that you don't LEARN anything from one DOE that you can apply to another situation - except to run another DOE. That's not how science has progressed so far, and it is not about to change.

  4. DOE in Science and JACS:

    Harper and Sigman, Science 2011, 333, 1875.
    Harper, Vilardi, and Sigman, JACS 2013, 135, 2482.

  5. I have used DOE several times and I believe I couldn't have gotten the same results without it. I also heard the sentence "statistics is for people who don't know the science" (in that particular example it was more about The Statistician).

    I think people are "scared" of statistics and think they know better (although it might take more work and time).

  6. I recently had to justify tossing out the DOE for my project, against pushback from engineers. The bosses wanted a one-pot reaction, combining 3 steps that were separate for good reason. DOE identified maybe 30 probably-relevant variables. The problem was that each run is extremely costly in terms of analysis and time, so we could afford maybe 25 runs total for R&D. In this case, a traditional iterative approach based on chemical reasoning is far more productive, because you simply can't explore that hyperdimensional space in a few shots. I pushed for more small scale work, with observations from those trials informing the design of the scale-up. You cannot rely on DOE to analyze your results for you, and trust that running it will lead you in the direction you want. Systematically exploring bad chemical space will not yield good results, even if it does keep the bosses happy.

    I found a decent way to explain this in engineering-speak. I told them I would select an experiment off their DOE list to run. Afterwards, I would generate a new DOE list using the data I had just obtained. The ChemE guys had the security blanket DOE telling them their reactors weren't being run by a madman, and I exercised a fair bit of hand-wavy intuition in nailing down decent conditions.

  7. DoE is wonderful for homing in on optimal conditions for a process you already understand well. It is not so useful when you are just starting out. Why? Because you will set up an elaborate DoE, only to find that one of the solvents you chose is a complete failure and causes your material to precipitate once one of your reagents is added, or that any temperature over X causes your catalyst to crap out, or that one of the concentrations you were targeting is too viscous to mix properly after the reaction gets going. You end up wasting half or more of the experiments if you follow through.

    Once you have decent conditions sorted out, using DoE to pump that yield from 70% to 90% is awesome. There is no better way. But getting from 0% to 70% is the work of one-off, learn-from-what-went-wrong thinking.

  8. DOE pops up in the blogosphere every once in a while. I remember seeing it over in the pipeline some time ago and getting curious. Can anyone recommend any easy-to-use software for this? By that I don't mean plain Maple, Mathematica, or other crap where it'll take you a while to pick up the programming syntax...

    1. JMP is great for DOE.

      I use it for process engineering but found it was also great in biology if you can run something high throughput. Lots of cellular stuff tends to be a black box from a molecular bio standpoint, and DOE can turn up any gaps that you need to look at more closely: chaperone folding, cofactor recycling, posttranslational modifications. Sometimes a step in the process is slower or more important than you think.
