Who the fuck came up with introducing a different bounding box format for the TensorFlow Object Detection API? Everybody would expect an output like [x_min, y_min, x_max, y_max] to describe a bounding box, but hardly anybody would expect [y_min, x_min, y_max, x_max]. Changes like these should be documented properly. Yes, there are some hints on the TF Hub pages, but they are no excuse for such nonsense!
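
For anyone who runs into the same thing, here is a minimal sketch of how the reordering could be done in plain Python. The helper name `yxyx_to_xyxy` and its parameters are made up for illustration and are not part of the TensorFlow API. The optional scaling assumes the boxes are normalized to [0, 1], which is what the detection models I have seen return, so check that against your own model's output first.

```python
def yxyx_to_xyxy(box, image_width=None, image_height=None):
    """Reorder [y_min, x_min, y_max, x_max] into [x_min, y_min, x_max, y_max].

    Hypothetical helper, not a TensorFlow function. If both image_width and
    image_height are given, the box is assumed to be normalized to [0, 1]
    and is scaled to pixel coordinates.
    """
    y_min, x_min, y_max, x_max = box
    if image_width is not None and image_height is not None:
        # Assumption: normalized coordinates, so scale x by width, y by height.
        x_min, x_max = x_min * image_width, x_max * image_width
        y_min, y_max = y_min * image_height, y_max * image_height
    return [x_min, y_min, x_max, y_max]


# Reorder only, coordinates stay as they are.
print(yxyx_to_xyxy([0.1, 0.2, 0.5, 0.8]))
# -> [0.2, 0.1, 0.8, 0.5]

# Reorder and scale to pixels for a 640x480 image.
print(yxyx_to_xyxy([0.25, 0.5, 0.75, 1.0], image_width=640, image_height=480))
# -> [320.0, 120.0, 640.0, 360.0]
```

Doing the swap once, right where the model output comes in, keeps the rest of the code on the [x_min, y_min, x_max, y_max] order everybody expects.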