You are an ML engineer at a mobile gaming company. A data scientist on your team recently trained a TensorFlow model, and you are responsible for deploying this model into a mobile application. You discover that the inference latency of the current model doesn’t meet production requirements. You need to reduce the inference time by 50%, and you are willing to accept a small decrease in model accuracy in order to reach the latency requirement. Without training a new model, which model optimization technique for reducing latency should you try first?

A. Weight pruning