Statistical testing is hard. Some of the common dangers are well known: repeat a study twenty times at the usual 5% significance level and there's a good chance you'll see a significant result at least once, even if no real effect exists. The Replication Crisis was the outcome of such abuses of statistical testing in scientific research.
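To put a number on that "good chance": under the null hypothesis a p-value is uniformly distributed, so the probability of at least one significant result in twenty independent tests at alpha = 0.05 is 1 - 0.95^20, about 64%. A minimal simulation sketching this (the parameters are just the ones from the example above):

```python
import random

ALPHA = 0.05      # conventional significance threshold
N_STUDIES = 20    # number of repeated studies, as in the example above
N_TRIALS = 100_000

def study_is_significant() -> bool:
    # Under the null hypothesis a p-value is uniform on [0, 1],
    # so a null study can be simulated as a single uniform draw.
    return random.random() < ALPHA

# Count how often a batch of 20 null studies yields at least one "hit".
hits = sum(
    any(study_is_significant() for _ in range(N_STUDIES))
    for _ in range(N_TRIALS)
)
# Analytically: 1 - (1 - 0.05)**20 ≈ 0.64
print(f"P(at least one false positive) ≈ {hits / N_TRIALS:.2f}")
```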
In "Why Most Published Research Findings Are False", John Ioannidis digs deeper into why so many studies fail to replicate and draws several corollaries about statistical testing:
1. The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.
2. The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.
3. The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.
4. The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
5. The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
If you're familiar with statistical testing, none of the above will seem particularly surprising, but it's interesting to see how much the validity of research findings can vary with the characteristics of the field. For example, effect sizes in nutrition science tend to be quite small, which, by the second corollary, suggests that most research findings in that field are false.
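Corollaries 1 and 2 fall out of a simple model in the paper: the probability that a claimed finding is actually true (its positive predictive value) is PPV = (1 - β)R / (R - βR + α), where R is the pre-study odds that the tested relationship is real and 1 - β is the study's power. Small studies and small effects both mean low power, which drags the PPV down. A quick sketch of the formula, with made-up numbers purely for illustration:

```python
def ppv(power: float, alpha: float, r: float) -> float:
    """Positive predictive value of a claimed finding, following the
    model in Ioannidis (2005): PPV = (1-beta)R / (R - beta*R + alpha),
    where R is the pre-study odds and power = 1 - beta."""
    return power * r / (power * r + alpha)

# A well-powered study of a plausible hypothesis vs. an underpowered
# study of a small effect (illustrative numbers, not from the paper).
print(ppv(power=0.80, alpha=0.05, r=0.5))   # ≈ 0.89 -- probably true
print(ppv(power=0.20, alpha=0.05, r=0.1))   # ≈ 0.29 -- probably false
```

With plausible pre-study odds and decent power, a positive finding is probably true; with the low power typical of small effects, a "significant" result is more likely to be false than true.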