Cited by HERMES: Homomorphic Encryption over Residual Number System for Multi-level EvaluationS

SHARP: A Short-Word Hierarchical Accelerator for Robust and Practical Fully Homomorphic Encryption

Jongmin Kim, Sangpyo Kim, Jaewan Choi et al.

Published: June 16, 2023

Fully homomorphic encryption (FHE) is an emerging cryptographic technology that guarantees the privacy of sensitive user data by enabling direct computations on encrypted data. Despite the security benefits of this approach, FHE is associated with prohibitively high levels of computational and memory overhead, preventing its widespread use in real-world services. Numerous domain-specific hardware designs have been proposed to address this issue, but most of them use excessive amounts of chip area and power, leaving room for further improvements in terms of practicality.

Language: English

Secure Outsourced Matrix Multiplication with Fully Homomorphic Encryption

Lin Zhu, Qiang-Sheng Hua, Yi Chen et al.

Lecture Notes in Computer Science, Journal Year: 2024, Volume and Issue: unknown, P. 249-269

Published: Jan. 1, 2024

Language: English

CiFlow: Dataflow Analysis and Optimization of Key Switching for Homomorphic Encryption

Negar Neda, Austin Ebel, Benedict Reynwar et al.

Published: May 5, 2024

Language: English

Cinnamon: A Framework for Scale-Out Encrypted AI

Siddharth Jayashankar, Edward S. Chen, Tom Tang et al.

Published: Feb. 3, 2025

Language: English

FAST: FPGA Acceleration of Fully Homomorphic Encryption with Efficient Bootstrapping

Zhihan Xu, Tian Ye, Rajgopal Kannan et al.

Published: Feb. 26, 2025

Bootstrapping is a critical operation in Fully Homomorphic Encryption (FHE) for privacy-preserving computation. Due to its significant computational overhead, accelerating bootstrapping is crucial for practical FHE applications involving deep evaluation circuits. In this paper, we introduce FAST, an FPGA-based accelerator for efficient bootstrapping. We propose novel datapath optimizations for two key operations in bootstrapping: homomorphic linear transformation (HLT) and polynomial evaluation. Our memory-efficient datapath designed for HLT significantly reduces off-chip ciphertext access. We also speed up the polynomial evaluation process by reducing the number of required HE operations. We conduct an in-depth analysis of the Advanced Bootstrapping Algorithm (ABA) to highlight its advantages. FAST is the first accelerator to support ABA, demonstrating a speedup. In addition, we develop a versatile permutation circuit to handle diverse permutation patterns in FHE, achieving high throughput and resource utilization. Compared with state-of-the-art (SOTA) GPU and FPGA designs, FAST achieves 8.84× and 5.89× speedups in bootstrapping, respectively. As illustrative examples of applications, we show that FAST delivers over 20× speedup in logistic regression training compared with the SOTA GPU implementation and outperforms the SOTA FPGA design by 1.43× in ResNet-20 inference.

Language: English

HEngine: A High Performance Optimization Framework on a GPU for Homomorphic Encryption

Jinghao Zhao, Hongwei Yang, Meng Hao et al.

ACM Transactions on Architecture and Code Optimization, Journal Year: 2025, Volume and Issue: unknown

Published: April 28, 2025

Homomorphic encryption (HE) represents a technology that allows for direct computation on encrypted data without requiring decryption. However, the substantial computational complexity and significant latency associated with HE have impeded its broader adoption in practical applications. To address these challenges, we propose a GPU-based acceleration framework, namely HEngine, tailored for homomorphic encryption tasks. Specifically, we first propose a warp shuffle-based optimization method for two key phases, i.e., the inverse Chinese Remainder Theorem (ICRT) and the number theoretic transformation (NTT), to mitigate the synchronization overhead in homomorphic encryption. Secondly, we fuse the NTT kernel and the inner product kernel to mitigate the imbalance between memory access and computation. Thirdly, considering the potential difference in the amount of tasks of users in the real world, we design different encoding methods for small batch and large batch inference to improve efficiency. Finally, experiments demonstrate that our proposed framework achieves a 218× speedup in homomorphic multiplication compared with the CPU-based SEAL library. In addition, for convolutional neural network inference with shallow structures, HEngine achieves amortized performance at the millisecond level and the sub-millisecond level for small batch and large batch data, respectively. For deeper structures (i.e., ResNet-20), HEngine achieves second-level inference.

Language: English

LP-HENN: fully homomorphic encryption accelerator with high energy efficiency

Mingzhe Zhang, Lei Chen, Shengyu Fan et al.

Cybersecurity, Journal Year: 2025, Volume and Issue: 8(1)

Published: May 30, 2025

Fully homomorphic encryption (FHE) enables direct computation on encrypted data without decryption, ensuring privacy in cloud computing scenarios and preventing the leakage of sensitive information. However, the computational overhead of HE typically exceeds that of plaintext computation by 4 to 5 orders of magnitude, while the energy consumption is 6 orders of magnitude higher. These substantial performance overheads significantly hinder the widespread adoption of FHE. This paper proposes LP-HENN, a novel low-power and energy-efficient FHE accelerator architecture that leverages a RISC-V vector coprocessor and ReRAM crossbar arrays. LP-HENN targets power-constrained application scenarios such as edge devices, aiming to provide highly energy-efficient acceleration support for FHE applications. LP-HENN uses the collaborative work of the RISC-V processor and the ReRAM crossbars, employing optimization strategies to achieve full pipelining and minimize memory access. Furthermore, this paper proposes a parameter selection model for early-stage design, which achieves an optimal balance between performance and energy consumption through multiple parameters. Experimental results show that, for the FHE-based convolutional neural network (HE-CNN) inference application, LP-HENN achieves 31.82× and 11920.56× improvement in performance and energy efficiency, respectively, compared with the CPU. Compared with FxHENN, a state-of-the-art FPGA-based accelerator with high energy efficiency, LP-HENN achieves 2.36× and 10.04× improvement in performance and energy efficiency, respectively. The performance of LP-HENN is comparable to F1, an ASIC accelerator, while featuring a low-power design suitable for edge computing.

Language: English

Evaluating Homomorphic Operations on a Real-World Processing-In-Memory System

Harshita Gupta, Mayank Kabra, Juan Gómez-Luna et al.

Published: Oct. 1, 2023

Computing on encrypted data is a promising approach to reduce security and privacy risks, with homomorphic encryption serving as a facilitator in achieving this goal. In this work, we accelerate homomorphic operations using the Processing-in-Memory (PIM) paradigm to mitigate the large memory capacity and frequent data movement requirements. Using a real-world PIM system, we accelerate the Brakerski-Fan-Vercauteren (BFV) scheme for homomorphic addition and multiplication. We evaluate the PIM implementations of these homomorphic operations with statistical workloads (arithmetic mean, variance, linear regression) and compare them to CPU and GPU implementations. Our results demonstrate a 50–100× speedup with a real PIM system (UPMEM) over the CPU and 2–15× over the GPU in vector addition. For vector multiplication, the real PIM system outperforms the CPU by 40–50×. However, it lags 10–15× behind the GPU due to the lack of native sufficiently wide multiplication support in the evaluated first-generation real PIM system. For mean, variance, and linear regression, the real PIM system performance improvements vary between 30× and 300× over the CPU and between 10× and 30× over the GPU, uncovering real PIM system trade-offs in terms of scalability of homomorphic operations for varying amounts of data. We plan to make our implementation open-source in the future.

Language: English

16.1 A 2.7-to-13.3μJ/boot/slot Flexible RNS-CKKS Processor in 28nm CMOS Technology for FHE-Based Privacy-Preserving Computing

Hyunhoon Lee, Hyeokjun Kwon, Youngjoo Lee et al.

2022 IEEE International Solid-State Circuits Conference (ISSCC), Journal Year: 2024, Volume and Issue: unknown, P. 296-298

Published: Feb. 18, 2024

Fully homomorphic encryption (FHE) has been gaining significant attention as a privacy-preserving solution for emerging server systems with critical information, which allows the server to perform various primitive computations on encrypted data from clients without decrypting the original messages, as shown in Fig. 16.1.1 [3–6]. Among the FHE candidates based on the ring learning with error (RLWE) problem, the recent CKKS approach using the residue number system (RNS) is regarded as the most promising method [4–6]. For an n-slot user message, the RNS-CKKS scheme constructs (l+1) N-degree polynomials to save the processing complexity with low-resolution coefficients bounded by small prime moduli $q_{i}$ $(0 \leq i \leq l)$. For unlimited operations, however, the bootstrapping step requires significant computing costs, mainly caused by the key-switch (KS) functions, including NTT/iNTT, base conversion (BConv), and other modular operators. For different applications, moreover, the KS is characterized by different parameters like the bit-width of primes, the ciphertext level (l), and the special primes (a), resulting in the tradeoffs shown in Fig. 16.1.1. Hence, it is desirable to develop a cost-efficient flexible RNS-CKKS accelerator for practical FHE systems; existing works only report outdated parameter sets [4], architecture-level estimations [5], or energy-inefficient realizations. This paper presents an integrated high-efficiency RNS-CKKS processor to meet all the demands. Based on a programmable core with dedicated instructions, as shown in Fig. 16.1.1, we develop novel design-level optimizations to save energy consumption and reduce latency: 1) inter-/intra-set scheduling and 2) cost-reduced engines. Implemented in 28nm CMOS, the processor shows energy efficiencies of $2.7$ to $13.3\,\mu\mathrm{J}$/boot/slot for different log N values and outperforms existing works with comparable parameters.

Language: English

A Framework for Generating Accelerators for Homomorphic Encryption Operations on FPGAs

Yang Yang, Rajgopal Kannan, Viktor K. Prasanna et al.

Published: July 24, 2024

Language: English
