Added Base.similar methods for CuSparseMatrixCOO and BSR #3114

Open
rainerrodrigues wants to merge 4 commits into JuliaGPU:master from rainerrodrigues:add-sparse-similar

@rainerrodrigues

This PR adds the missing Base.similar methods for CuSparseMatrixCOO and CuSparseMatrixBSR, allowing them to fall back gracefully without converting to dense CPU arrays.

Fixes #3061
Fixes #3055
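
For readers unfamiliar with the pattern, here is a minimal CPU-only sketch of what a sparsity-preserving `similar` method can look like. `ToyCOO` and its field names are hypothetical stand-ins invented for illustration; this is not the CUDA.jl type and not the code in this PR. It assumes the desired behaviour is to keep the index arrays and allocate fresh, uninitialized storage for the values.

```julia
# Hypothetical COO container on the CPU; NOT CuSparseMatrixCOO.
struct ToyCOO{Tv,Ti<:Integer} <: AbstractMatrix{Tv}
    rowInd::Vector{Ti}   # row index of each stored entry
    colInd::Vector{Ti}   # column index of each stored entry
    nzVal::Vector{Tv}    # stored values
    dims::NTuple{2,Int}  # matrix dimensions
end

Base.size(A::ToyCOO) = A.dims

# Scalar lookup, defined only so the toy type satisfies the AbstractMatrix interface.
function Base.getindex(A::ToyCOO{Tv}, i::Integer, j::Integer) where {Tv}
    acc = zero(Tv)
    for k in eachindex(A.nzVal)
        if A.rowInd[k] == i && A.colInd[k] == j
            acc += A.nzVal[k]
        end
    end
    return acc
end

# Same sparsity pattern and element type; values are left uninitialized.
Base.similar(A::ToyCOO{Tv,Ti}) where {Tv,Ti} =
    ToyCOO{Tv,Ti}(copy(A.rowInd), copy(A.colInd), similar(A.nzVal), A.dims)

# Same sparsity pattern, different element type S.
Base.similar(A::ToyCOO{Tv,Ti}, ::Type{S}) where {Tv,Ti,S} =
    ToyCOO{S,Ti}(copy(A.rowInd), copy(A.colInd), similar(A.nzVal, S), A.dims)

A = ToyCOO{Float32,Int32}(Int32[1, 2], Int32[1, 3], Float32[1.5, 2.5], (2, 3))
B = similar(A)           # still a ToyCOO{Float32,Int32}
C = similar(A, Float64)  # ToyCOO{Float64,Int32} with the same pattern
```

Without such methods, the generic `AbstractArray` definition of `similar` is used for a type like this and returns a dense array, which is the kind of fallback the description above refers to.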

Comment thread lib/cusparse/src/array.jl Outdated
@kshyatt
Member

kshyatt commented Apr 21, 2026

Also, can some tests be added?

Contributor

@github-actions github-actions Bot left a comment

CUDA.jl Benchmarks

Benchmark suite Current: 845e83c Previous: d08923d Ratio
array/accumulate/Float32/1d 101140.5 ns 101201.5 ns 1.00
array/accumulate/Float32/dims=1 76322 ns 77253 ns 0.99
array/accumulate/Float32/dims=1L 1584657 ns 1586384.5 ns 1.00
array/accumulate/Float32/dims=2 143544 ns 144050 ns 1.00
array/accumulate/Float32/dims=2L 657740 ns 658315 ns 1.00
array/accumulate/Int64/1d 118704 ns 118535 ns 1.00
array/accumulate/Int64/dims=1 80145 ns 80398 ns 1.00
array/accumulate/Int64/dims=1L 1705722 ns 1695368 ns 1.01
array/accumulate/Int64/dims=2 156343 ns 156400.5 ns 1.00
array/accumulate/Int64/dims=2L 961909 ns 962477 ns 1.00
array/broadcast 20273 ns 20549 ns 0.99
array/construct 1273.6 ns 1244.7 ns 1.02
array/copy 17792 ns 18346 ns 0.97
array/copyto!/cpu_to_gpu 213574 ns 217009 ns 0.98
array/copyto!/gpu_to_cpu 281602 ns 283646 ns 0.99
array/copyto!/gpu_to_gpu 10655 ns 10940 ns 0.97
array/iteration/findall/bool 134322 ns 135015 ns 0.99
array/iteration/findall/int 149464 ns 150942 ns 0.99
array/iteration/findfirst/bool 80978 ns 81514 ns 0.99
array/iteration/findfirst/int 83157 ns 84056 ns 0.99
array/iteration/findmin/1d 84753 ns 87756 ns 0.97
array/iteration/findmin/2d 117102 ns 117744 ns 0.99
array/iteration/logical 197029.5 ns 201505 ns 0.98
array/iteration/scalar 64912 ns 67810 ns 0.96
array/permutedims/2d 52533 ns 52558 ns 1.00
array/permutedims/3d 52825.5 ns 52631 ns 1.00
array/permutedims/4d 51771 ns 51253 ns 1.01
array/random/rand/Float32 12720 ns 12853 ns 0.99
array/random/rand/Int64 25014 ns 25414 ns 0.98
array/random/rand!/Float32 9253 ns 8376.666666666666 ns 1.10
array/random/rand!/Int64 21467 ns 21965 ns 0.98
array/random/randn/Float32 40917.5 ns 38474 ns 1.06
array/random/randn!/Float32 25853 ns 30963 ns 0.83
array/reductions/mapreduce/Float32/1d 34170 ns 34333 ns 1.00
array/reductions/mapreduce/Float32/dims=1 39205.5 ns 40570 ns 0.97
array/reductions/mapreduce/Float32/dims=1L 51195.5 ns 51354 ns 1.00
array/reductions/mapreduce/Float32/dims=2 56360 ns 56637 ns 1.00
array/reductions/mapreduce/Float32/dims=2L 69013 ns 69577 ns 0.99
array/reductions/mapreduce/Int64/1d 41911 ns 42670 ns 0.98
array/reductions/mapreduce/Int64/dims=1 42090 ns 43498 ns 0.97
array/reductions/mapreduce/Int64/dims=1L 87036 ns 87200 ns 1.00
array/reductions/mapreduce/Int64/dims=2 59145 ns 59546 ns 0.99
array/reductions/mapreduce/Int64/dims=2L 84434 ns 84737 ns 1.00
array/reductions/reduce/Float32/1d 34387 ns 34657.5 ns 0.99
array/reductions/reduce/Float32/dims=1 40647 ns 39996.5 ns 1.02
array/reductions/reduce/Float32/dims=1L 51397.5 ns 51457 ns 1.00
array/reductions/reduce/Float32/dims=2 56481 ns 56754 ns 1.00
array/reductions/reduce/Float32/dims=2L 69363.5 ns 70015.5 ns 0.99
array/reductions/reduce/Int64/1d 41982 ns 42974.5 ns 0.98
array/reductions/reduce/Int64/dims=1 50345.5 ns 42327 ns 1.19
array/reductions/reduce/Int64/dims=1L 86951 ns 87167 ns 1.00
array/reductions/reduce/Int64/dims=2 59596 ns 59657 ns 1.00
array/reductions/reduce/Int64/dims=2L 84767 ns 84515 ns 1.00
array/reverse/1d 17643 ns 17882 ns 0.99
array/reverse/1dL 68225.5 ns 68439 ns 1.00
array/reverse/1dL_inplace 65570 ns 65793.5 ns 1.00
array/reverse/1d_inplace 10158.5 ns 8645.333333333334 ns 1.18
array/reverse/2d 20688 ns 21177 ns 0.98
array/reverse/2dL 72784 ns 73131 ns 1.00
array/reverse/2dL_inplace 65691 ns 65813 ns 1.00
array/reverse/2d_inplace 10372 ns 9973 ns 1.04
array/sorting/1d 2733819 ns 2735906 ns 1.00
array/sorting/2d 1068081 ns 1068705.5 ns 1.00
array/sorting/by 3302391 ns 3304477 ns 1.00
cuda/synchronization/context/auto 1162.1 ns 1153.8 ns 1.01
cuda/synchronization/context/blocking 930.5714285714286 ns 920.219512195122 ns 1.01
cuda/synchronization/context/nonblocking 7707.299999999999 ns 7049.8 ns 1.09
cuda/synchronization/stream/auto 989.2105263157895 ns 1045.5 ns 0.95
cuda/synchronization/stream/blocking 807.7676767676768 ns 845.6486486486486 ns 0.96
cuda/synchronization/stream/nonblocking 7204.8 ns 7261.5 ns 0.99
integration/byval/reference 143774 ns 143708 ns 1.00
integration/byval/slices=1 145530 ns 145668 ns 1.00
integration/byval/slices=2 284392 ns 284436 ns 1.00
integration/byval/slices=3 423048 ns 422883 ns 1.00
integration/cudadevrt 102357 ns 102385 ns 1.00
integration/volumerhs 23462010.5 ns 23480818 ns 1.00
kernel/indexing 12997 ns 13249 ns 0.98
kernel/indexing_checked 13782 ns 14040 ns 0.98
kernel/launch 2047.111111111111 ns 2194.8888888888887 ns 0.93
kernel/occupancy 705.3356164383562 ns 704.5137931034483 ns 1.00
kernel/rand 14118 ns 16709 ns 0.84
latency/import 3837187980.5 ns 3825894453 ns 1.00
latency/precompile 4600227064.5 ns 4600509263 ns 1.00
latency/ttfp 4404423502.5 ns 4395653546.5 ns 1.00

This comment was automatically generated by workflow using github-action-benchmark.

Comment thread lib/cusparse/src/array.jl
Author

@rainerrodrigues rainerrodrigues left a comment

@kshyatt Hi, can you check if this is suitable and extensive enough for testing?

Development

Successfully merging this pull request may close these issues.

[CUSPARSE] Missing appropriate similar methods
Missing sparse array methods for CuSparseMatrixCOO and CuSparseMatrixBSR

2 participants