Data Reduction Architecture for Next Generation Storage Servers
Published in IEEE CAL'17, HPCA'19, MICRO'19
This was my PhD thesis. It resulted in three papers: two at top conferences (HPCA and MICRO) and one in a top journal (IEEE CAL). It received strong recognition from industry and was later used as part of commercial products.
Brief Description: We are in the era of big data, with large-scale deployments of big data processing, machine learning, and other data-intensive applications. To meet the performance and capacity demands, architects are designing large, high-performance SSD arrays. Because SSDs are expensive, storage servers usually apply inline data reduction techniques to reduce the amount of data written. However, we observe that popular software-based data reduction architectures create a severe CPU bottleneck and cannot deliver enough performance to exploit next-generation ultra-fast SSD arrays. Existing hardware solutions are also inefficient at high throughput or with large SSD arrays. In this project, we propose a novel architecture that accelerates data reduction on an FPGA array and scales efficiently to Tbps-scale performance using our novel hardware-software co-optimizations.
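As a rough illustration of what inline data reduction does on the write path, here is a minimal software sketch. The description above does not name the specific techniques, so this assumes chunk-level deduplication followed by compression; the 4 KiB chunk size, the SHA-256 fingerprints, the zlib compressor, and the InlineReducer class are all hypothetical choices made for the example. It is not the FPGA-based architecture proposed in the thesis.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size in bytes (hypothetical choice)


class InlineReducer:
    """Toy inline data reduction: deduplicate chunks, then compress unique ones."""

    def __init__(self):
        self.store = {}          # fingerprint -> compressed unique chunk
        self.logical_bytes = 0   # bytes the client asked to write
        self.physical_bytes = 0  # bytes actually kept after reduction

    def write(self, data: bytes) -> list:
        """Reduce `data` and return the fingerprints needed to read it back."""
        recipe = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            # Fingerprint the chunk so duplicates can be detected by lookup.
            fp = hashlib.sha256(chunk).hexdigest()
            self.logical_bytes += len(chunk)
            if fp not in self.store:
                # Unique chunk: compress it and keep a single copy.
                compressed = zlib.compress(chunk)
                self.store[fp] = compressed
                self.physical_bytes += len(compressed)
            recipe.append(fp)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Rebuild the original data from its list of fingerprints."""
        return b"".join(zlib.decompress(self.store[fp]) for fp in recipe)


# Usage: three identical chunks plus one different chunk.
reducer = InlineReducer()
payload = (b"A" * CHUNK_SIZE) * 3 + (b"B" * CHUNK_SIZE)
recipe = reducer.write(payload)
restored = reducer.read(recipe)
print(reducer.logical_bytes, reducer.physical_bytes, len(reducer.store))
```

In this sketch every chunk is hashed on the CPU before anything is stored, and every unique chunk is also compressed. That per-chunk work is the kind of cost that grows with write throughput in a software-based design.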