Apple interview question

Explain how you would reduce latency for your model, and why you would choose integer quantization over the alternatives.
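
For anyone preparing for this question, it may help to have a concrete picture of what int quantization does. The following is a minimal sketch in plain Python, assuming symmetric per-tensor int8 quantization; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical and not taken from any particular framework.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integer codes in [-127, 127]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # The largest magnitude maps to 127; use a scale of 1.0 for an all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range used here.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

print(codes)     # integer codes that would be stored in place of the floats
print(restored)  # approximate reconstruction of the original weights
```

Each weight is stored in 8 bits instead of 32, so the tensor takes a quarter of the memory, and the rounding error per value is at most half of `scale`. Whether this turns into lower latency depends on whether the target hardware and runtime have fast integer kernels, which is a point worth raising in an answer.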