Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer, Journal Year: 2024, Volume and Issue: 23(3), P. 569 - 582
Published: June 18, 2024
Artificial intelligence at the edge can help solve complex tasks faced by various sectors such as automotive, healthcare, and surveillance. However, challenged by the lack of computational power on edge devices, artificial intelligence models are forced to adapt. Many researchers have developed and quantified model compression approaches over the years to tackle this problem. However, not many have considered the overhead of on-device compression, even though model compression can take a considerable amount of time. With this added metric, we provide a more complete view of the efficiency of model compression at the edge. The objective of this research is to identify the benefits of compression methods and their tradeoff between size and latency reduction versus accuracy loss, as well as compression time on edge devices. In this work, a quantitative method is used to analyze and rank three common approaches to model compression: post-training quantization, unstructured pruning, and knowledge distillation, on the basis of accuracy, latency, model size, and compression time overhead. We conclude that the top-ranked of the three methods offers up to an 11.4x size reduction and a 78.67% speed-up, with a moderate accuracy loss.
Language: English