ieeexplore.ieee.org/document/10946718
Preview meta tags from the ieeexplore.ieee.org website.

Search Engine Appearance
SparseWeaver: Converting Sparse Operations as Dense Operations on GPUs for Graph Workloads
Thanks to their scalable parallel processing capability, GPUs are promising computing resources for graph processing, in which identical operations are applied to a large number of edges and vertices. However, the sparsity and skewness of real-world graphs cause imbalanced workloads across GPU threads within the same warp, thus impeding efficient processing on the GPU. To mitigate this workload imbalance problem, existing works propose workload balancing hardware and software schemes. However, these solutions often suffer from additional memory overhead or increased computations and communication overheads during inter-warp and intra-warp synchronization. This work proposes a new hardware-software collaborative graph processing framework, SparseWeaver, that converts sparse operations in graph processing into dense operations using graph topology and makes the workloads balanced across GPU threads. Based on the analysis of common patterns in software schemes, we propose Weaver, a new lightweight GPU functional unit microarchitecture that fully leverages the benefits of the GPU architecture and exploits memory access locality. We prototype SparseWeaver on the open-source RISC-V Vortex GPU and demonstrate 2.36 times faster execution time compared to state-of-the-art schemes while incurring a low area overhead of 0.045% from increased dedicated logic registers.
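The warp-level imbalance the abstract describes can be illustrated with a small sketch. This is not SparseWeaver's Weaver unit or any code from the paper; it is a generic comparison, assuming a CSR graph layout, of a vertex-per-thread mapping against an edge-per-thread mapping that recovers each edge's source vertex by binary search, which is one common software balancing scheme. The function names, the tiny warp size, and the example graph are all hypothetical.

```python
from bisect import bisect_right

WARP_SIZE = 4  # assumed tiny warp for illustration; real GPUs commonly use 32


def warp_steps_vertex_mapped(row_offsets, warp_size=WARP_SIZE):
    """One vertex per thread: each warp runs as long as its highest-degree vertex."""
    degrees = [row_offsets[i + 1] - row_offsets[i]
               for i in range(len(row_offsets) - 1)]
    steps = 0
    for start in range(0, len(degrees), warp_size):
        steps += max(degrees[start:start + warp_size])
    return steps


def warp_steps_edge_mapped(row_offsets, warp_size=WARP_SIZE):
    """One edge per thread: all lanes are busy except in the last partial warp."""
    num_edges = row_offsets[-1]
    return -(-num_edges // warp_size)  # ceiling division


def edge_to_source(row_offsets, edge_id):
    """Recover the source vertex of an edge by binary search over CSR offsets."""
    return bisect_right(row_offsets, edge_id) - 1


# Hypothetical skewed graph: vertex 0 has 8 edges, the rest have 0 or 1.
row_offsets = [0, 8, 9, 10, 10, 11, 12, 12, 12]
vertex_steps = warp_steps_vertex_mapped(row_offsets)
edge_steps = warp_steps_edge_mapped(row_offsets)
```

The edge-mapped variant needs fewer warp steps on skewed graphs, but every thread pays for a binary search to find its source vertex. That per-thread lookup is an example of the extra computation the abstract attributes to existing software schemes.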
General Meta Tags (12)
- title: SparseWeaver: Converting Sparse Operations as Dense Operations on GPUs for Graph Workloads | IEEE Conference Publication | IEEE Xplore
- google-site-verification: qibYCgIKpiVF_VVjPYutgStwKn-0-KBB6Gw4Fc57FZg
- Description: Thanks to their scalable parallel processing capability, GPUs are promising computing resources for graph processing, in which identical operations are applied
- Content-Type: text/html; charset=utf-8
- viewport: width=device-width, initial-scale=1.0
Open Graph Meta Tags (3)
- og:image: https://ieeexplore.ieee.org/assets/img/ieee_logo_smedia_200X200.png
- og:title: SparseWeaver: Converting Sparse Operations as Dense Operations on GPUs for Graph Workloads
- og:description: Thanks to their scalable parallel processing capability, GPUs are promising computing resources for graph processing, in which identical operations are applied to a large number of edges and vertices. However, the sparsity and skewness of real-world graphs cause imbalanced workloads across GPU threads within the same warp, thus impeding efficient processing on the GPU. To mitigate this workload imbalance problem, existing works propose workload balancing hardware and software schemes. However, these solutions often suffer from additional memory overhead or increased computations and communication overheads during inter-warp and intra-warp synchronization. This work proposes a new hardware-software collaborative graph processing framework, SparseWeaver, that converts sparse operations in graph processing into dense operations using graph topology and makes the workloads balanced across GPU threads. Based on the analysis of common patterns in software schemes, we propose Weaver, a new lightweight GPU functional unit microarchitecture that fully leverages the benefits of the GPU architecture and exploits memory access locality. We prototype SparseWeaver on the open-source RISC-V Vortex GPU and demonstrate 2.36 times faster execution time compared to state-of-the-art schemes while incurring a low area overhead of 0.045% from increased dedicated logic registers.
Twitter Meta Tags (1)
- twitter:card: summary
Link Tags (9)
- canonical: https://ieeexplore.ieee.org/document/10946718
- icon: /assets/img/favicon.ico
- stylesheet: https://ieeexplore.ieee.org/assets/css/osano-cookie-consent-xplore.css
- stylesheet: /assets/css/simplePassMeter.min.css?cv=20250701_00000
- stylesheet: /assets/dist/ng-new/styles.css?cv=20250701_00000
Links (17)
- http://www.ieee.org/about/help/security_privacy.html
- http://www.ieee.org/web/aboutus/whatis/policies/p9-26.html
- https://ieeexplore.ieee.org/Xplorehelp
- https://ieeexplore.ieee.org/Xplorehelp/overview-of-ieee-xplore/about-ieee-xplore
- https://ieeexplore.ieee.org/Xplorehelp/overview-of-ieee-xplore/accessibility-statement