Deep Learning Model Compression Techniques Performance on Edge Devices

Rakandhiya Daanii Rachmanto,

Ahmad Naufal Labiib Nabhaan,

Arief Setyanto

et al.

Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer, Journal year: 2024, Issue: 23(3), pp. 569-582

Published: June 18, 2024

Artificial intelligence at the edge can help solve complex tasks faced by various sectors such as automotive, healthcare, and surveillance. However, challenged by the lack of computational power on edge devices, artificial intelligence models are forced to adapt. Many model compression approaches have been developed and quantified over the years to tackle this problem, yet few have considered the overhead of on-device compression, even though compression can take a considerable amount of time. With this added metric, we provide a more complete view of efficiency at the edge. The objective of this research is to identify the benefit of compression methods and their tradeoff between size and latency reduction versus accuracy loss, as well as compression time on edge devices. In this work, a quantitative method is used to analyze and rank three common compression approaches, post-training quantization, unstructured pruning, and knowledge distillation, on the basis of accuracy, latency, and compression overhead. We concluded which method performs best overall, with a potential of up to 11.4x size reduction and a 78.67% speed-up at a moderate accuracy loss.
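For illustration only (not taken from the paper, which does not specify its framework or settings), the minimal sketch below shows what one of the three compared techniques, post-training quantization, typically looks like in a TensorFlow/Keras workflow; the model object and the representative-data generator are assumed placeholders.

# Hedged sketch: generic post-training quantization with TensorFlow Lite.
# Assumptions (not from the paper): a trained Keras model `model` and a
# `representative_data()` generator yielding calibration input batches.
import tensorflow as tf

def quantize_model(model, representative_data):
    """Convert a trained Keras model into a quantized TFLite flatbuffer."""
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]    # enable quantization
    converter.representative_dataset = representative_data  # calibration samples
    tflite_bytes = converter.convert()
    with open("model_quant.tflite", "wb") as f:
        f.write(tflite_bytes)
    return tflite_bytes

Quantization of this kind is a common choice for edge targets because integer kernels shrink both model size and inference latency, which is why the paper weighs it against unstructured pruning and knowledge distillation on accuracy, latency, and on-device compression overhead.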

Language: English

Cited by: 0