Replace specific loops with Boolean masking and vectorized assignments #8045

mhucka wants to merge 4 commits into quantumlib:main
Conversation
This amounts to a simple bit of micro-optimization. With the help of Gemini, I looked for cases where some simple loops could be rewritten to use NumPy's `eye()` function where appropriate. There were a few cases, and that's what this PR is about.
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #8045      +/-   ##
==========================================
- Coverage   99.63%   99.63%   -0.01%
==========================================
  Files        1110     1110
  Lines       99795    99787       -8
==========================================
- Hits        99435    99425      -10
- Misses        360      362       +2
==========================================
```
```diff
  matrix = np.copy(matrix)
- for i in range(min(matrix.shape)):
-     matrix[i, i] = 0
+ matrix[np.eye(*matrix.shape, dtype=np.bool)] = 0
```
This changes the assignment cost from O(N) to O(N^2) and allocates an extra N^2-sized temporary matrix. Note that `matrix` gets copied one more time in `all_near_zero`.

Here is a version that does away with one matrix copy with respect to the initial code:

```python
absolute = np.abs(matrix)
np.fill_diagonal(absolute, 0)
return tolerance.near_zero(np.max(absolute, initial=0), atol=atol)
```

And here are timings for the original, PR, and suggestion:

```
In [1]: a = np.eye(2048, dtype=float)

In [2]: %timeit is_diagonal(a)
17.9 ms ± 113 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [3]: %timeit is_diagonal1(a)
18.4 ms ± 21.8 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit is_diagonal2(a)
8.9 ms ± 38.2 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
```
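For reference, the three variants being timed can be sketched as self-contained functions. The names `is_diagonal_loop` / `is_diagonal_mask` / `is_diagonal_filldiag` are illustrative, the real code delegates the tolerance test to `tolerance.near_zero` (approximated here with a plain `atol` comparison), and the PR's `np.bool` is written as the builtin `bool` because that alias was removed in NumPy 1.24:

```python
import numpy as np

ATOL = 1e-8  # illustrative tolerance; the real code takes atol as a parameter


def is_diagonal_loop(matrix):
    # original: zero the diagonal with a Python loop, then test the rest
    m = np.copy(matrix)
    for i in range(min(m.shape)):
        m[i, i] = 0
    return bool(np.all(np.abs(m) <= ATOL))


def is_diagonal_mask(matrix):
    # PR version: the Boolean eye mask allocates an N^2 temporary
    m = np.copy(matrix)
    m[np.eye(*m.shape, dtype=bool)] = 0
    return bool(np.all(np.abs(m) <= ATOL))


def is_diagonal_filldiag(matrix):
    # suggested version: one copy via np.abs, O(N) diagonal clear
    absolute = np.abs(matrix)
    np.fill_diagonal(absolute, 0)
    return bool(np.max(absolute, initial=0) <= ATOL)


a = np.eye(4)
assert is_diagonal_loop(a) and is_diagonal_mask(a) and is_diagonal_filldiag(a)
b = np.ones((4, 4))
assert not (is_diagonal_loop(b) or is_diagonal_mask(b) or is_diagonal_filldiag(b))
```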
```diff
- for i in range(width):
-     result[(i + shift) % width, i] = 1
- return result
+ return np.roll(np.eye(width), shift, axis=0)
```
NVM, this is a test helper function, so its performance does not matter much.

This has to copy N^2 values (compared to N assignments before) and allocate 2 arrays instead of 1. It is about 6 times slower than the original. Here is a better way of vectorizing the Python loop:
```diff
- return np.roll(np.eye(width), shift, axis=0)
+ result = np.zeros(width * width)
+ shift = shift % width if width else 0
+ # lower diagonal starting at the first column
+ result[shift * width :: width + 1].fill(1)
+ # upper diagonal starting at the first row
+ if shift:
+     result[width - shift : shift * width : width + 1].fill(1)
+ return result.reshape((width, width))
```
Again, timings for the original, PR, and suggestion:

```
%timeit shift_matrix(8, 3)
1.46 μs ± 66.1 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)  # original
8.13 μs ± 305 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)     # PR
1.1 μs ± 26.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)   # suggestion

%timeit shift_matrix(1024, 3)
475 μs ± 4.82 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)       # original
3.69 ms ± 8.34 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)        # PR
298 μs ± 1.85 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)       # suggestion
```
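To see that the strided-slice version produces the same matrix as the loop and the `np.roll` form, the three variants can be sketched as standalone functions (the `_loop` / `_roll` / `_strided` names are illustrative, not the names in the codebase):

```python
import numpy as np


def shift_matrix_loop(width, shift):
    # original: N single-element assignments
    result = np.zeros((width, width))
    for i in range(width):
        result[(i + shift) % width, i] = 1
    return result


def shift_matrix_roll(width, shift):
    # PR version: copies N^2 values and allocates two arrays
    return np.roll(np.eye(width), shift, axis=0)


def shift_matrix_strided(width, shift):
    # suggested version: fill the two wrapped diagonals via strided slices
    result = np.zeros(width * width)
    shift = shift % width if width else 0
    # lower diagonal starting at the first column
    result[shift * width :: width + 1].fill(1)
    # upper diagonal starting at the first row
    if shift:
        result[width - shift : shift * width : width + 1].fill(1)
    return result.reshape((width, width))


for w, s in [(1, 0), (4, 0), (8, 3), (5, 7)]:
    expected = shift_matrix_loop(w, s)
    assert np.array_equal(shift_matrix_roll(w, s), expected)
    assert np.array_equal(shift_matrix_strided(w, s), expected)
```

The strided trick works because in the flattened row-major array a step of `width + 1` moves one row down and one column right, so a single slice covers each wrapped diagonal band.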
```python
N = 2**self._num_qubits
return self._state[None, :, None] * np.eye(N, dtype=self._state.dtype)[:, None, :]
```
This allocates an extra temporary matrix of N^2 elements which gets multiplied N times. The overall complexity is increased from O(N^2) to O(N^3), with over 50% slowdown.

Here is a version which is comparable to, and in some cases faster than, the original, depending on `num_qubits`:

```python
N = 2**self._num_qubits
operator = np.zeros(shape=(N, N, N), dtype=self._state.dtype)
idx = np.arange(N)
operator[idx, :, idx] = self._state
return operator
```

Here are timings for the original, PR, and suggestion:
```
# num_qubits = 2, N = 4
2.17 μs ± 67.6 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)  # original
3.51 μs ± 96.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)  # PR
2.29 μs ± 43.2 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)  # suggestion

# num_qubits = 4, N = 16
6.95 μs ± 189 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)   # original
10.8 μs ± 114 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)   # PR
3.48 μs ± 206 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)   # suggestion
```
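A quick standalone check that the broadcast form and the fancy-index form build the same tensor (using a small `state` vector as a hypothetical stand-in for `self._state`): both produce a 3-D array where slice `[i, :, k]` equals `state` when `i == k` and is zero otherwise.

```python
import numpy as np

num_qubits = 2
N = 2**num_qubits
state = np.arange(N, dtype=np.complex128)  # hypothetical stand-in for self._state

# PR version: broadcasting against eye builds an N^2 temporary,
# and the elementwise multiply does O(N^3) work
op_broadcast = state[None, :, None] * np.eye(N, dtype=state.dtype)[:, None, :]

# suggested version: O(N^2) fancy-index assignment into a zeroed tensor;
# pairing idx on axes 0 and 2 writes state into each slice operator[i, :, i]
op_indexed = np.zeros(shape=(N, N, N), dtype=state.dtype)
idx = np.arange(N)
op_indexed[idx, :, idx] = state

assert op_broadcast.shape == (N, N, N)
assert np.array_equal(op_broadcast, op_indexed)
```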
```diff
  new_xs = np.zeros((2 * self.n + 1, self.n), dtype=bool)
- for i in range(self.n):
-     new_xs[i, i] = True
+ new_xs[: self.n, : self.n] = np.eye(self.n, dtype=bool)
```
This increases the number of assignments from N to N^2, but it still runs faster for a bit array. That said, `np.fill_diagonal` does the job with O(N) complexity:

```python
np.fill_diagonal(new_xs[: self.n, :], True)
```

Timings for 20 qubits for the original, PR, and suggestion:

```
4.01 μs ± 98.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
3.5 μs ± 29 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
2.57 μs ± 7.33 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
```
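The three forms are easy to verify as equivalent on a small array (`n = 4` here is an arbitrary illustrative size; `np.fill_diagonal` works in place because a basic slice of a NumPy array is a view):

```python
import numpy as np

n = 4  # hypothetical small qubit count

# original: Python loop, N single-bit assignments
xs_loop = np.zeros((2 * n + 1, n), dtype=bool)
for i in range(n):
    xs_loop[i, i] = True

# PR version: copies an N x N eye block, N^2 assignments
xs_eye = np.zeros((2 * n + 1, n), dtype=bool)
xs_eye[:n, :n] = np.eye(n, dtype=bool)

# suggested version: writes only the N diagonal entries of the view
xs_fill = np.zeros((2 * n + 1, n), dtype=bool)
np.fill_diagonal(xs_fill[:n, :], True)

assert np.array_equal(xs_loop, xs_eye)
assert np.array_equal(xs_loop, xs_fill)
```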
```diff
  new_zs = np.zeros((2 * self.n + 1, self.n), dtype=bool)
- for i in range(self.n):
-     new_zs[self.n + i, i] = True
+ new_zs[self.n : 2 * self.n, : self.n] = np.eye(self.n, dtype=bool)
```
```diff
- new_zs[self.n : 2 * self.n, : self.n] = np.eye(self.n, dtype=bool)
+ np.fill_diagonal(new_zs[self.n : 2 * self.n, :], True)
```
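The same check works for the offset block: filling the diagonal of the view `new_zs[n : 2n, :]` sets exactly the entries `(n + i, i)` that the original loop wrote (again with an arbitrary small `n` for illustration):

```python
import numpy as np

n = 4  # hypothetical small qubit count

# original: loop writes entries (n + i, i)
zs_loop = np.zeros((2 * n + 1, n), dtype=bool)
for i in range(n):
    zs_loop[n + i, i] = True

# suggested version: fill the diagonal of the offset view in place
zs_fill = np.zeros((2 * n + 1, n), dtype=bool)
np.fill_diagonal(zs_fill[n : 2 * n, :], True)

assert np.array_equal(zs_loop, zs_fill)
```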
pavoljuhas left a comment
LLMs seem very capable of making suggestions that look good at first sight but in reality make things worse. We should be careful to quantify any supposed code optimizations and only accept them if they are truly a change for the better.

Before doing such timings, we should also consider whether the modified code is speed-critical. If not, the changes are probably not worth the effort, unless they make the code clearly easier to read, which again is not much of a strength of LLMs.