A fundamental precept of the scientific method is reproducibility of methods and results, and there is growing concern over the failure to reproduce significant results. Family dogs have become a favoured species in comparative cognition research, but they may be subject to cognitive differences arising from genetic (breeding lines) or cultural (e.g. preferred training methods) differences. Such variation is of concern as it affects the validity and generalisability of experimental results. Despite its importance, this problem has not been specifically addressed to date. Therefore, we aimed to test the influence of three factors on reproducibility: testing site (proximal environment), breed and sex (phenotype). The same experimenter tested the cognitive performance of more than 200 dogs in four experiments. Additionally, dogs' performance was tested in an obedience task administered by the owner. Breed of dog and testing site were found to influence the level of performance only mildly, and only in a means-end experiment and the obedience task. Our findings demonstrate that by applying the same test protocols on sufficiently large samples, the reported phenomena in these cognitive tests can be reproduced, but slight differences in performance levels can occur between different samples. Accordingly, we recommend the utilisation of well-described protocols supported by video examples of the whole experimental procedure. Findings should focus on the main outcome variables of the experiments, rather than speculating about the general importance of small or secondary performance outcomes, which are more susceptible to random or local noise.