• 为了保证你在浏览本网站时有着更好的体验，建议使用类似Chrome、Firefox之类的浏览器~~
• 如果你喜欢本站的内容何不Ctrl+D收藏一下呢，与大家一起分享各种编程知识~
• 本网站研究机器学习、计算机视觉、模式识别~当然不局限于此，生命在于折腾，何不年轻时多折腾一下

# scipy稀疏矩阵模块

3周前 (08-02) 46次浏览

## 1.sparse 模块初探

python 中 scipy 模块中，有一个模块叫 sparse 模块，就是专门为了解决稀疏矩阵而生。本文的大部分内容，其实就是基于 sparse 模块而来的。

<code class="hljs python has-numbering"><span class="hljs-prompt">>>> </span><span class="hljs-keyword">from</span> scipy <span class="hljs-keyword">import</span> sparse</code>

<code class="hljs bash has-numbering">>>> <span class="hljs-built_in">help</span>(sparse)</code>

<code class="hljs livecodeserver has-numbering">    Usage information
=================

There are <span class="hljs-constant">seven</span> available sparse matrix types:

<span class="hljs-number">1.</span> csc_matrix: Compressed Sparse Column <span class="hljs-built_in">format</span>
<span class="hljs-number">2.</span> csr_matrix: Compressed Sparse Row <span class="hljs-built_in">format</span>
<span class="hljs-number">3.</span> bsr_matrix: Block Sparse Row <span class="hljs-built_in">format</span>
<span class="hljs-number">4.</span> lil_matrix: List <span class="hljs-operator">of</span> Lists <span class="hljs-built_in">format</span>
<span class="hljs-number">5.</span> dok_matrix: Dictionary <span class="hljs-operator">of</span> Keys <span class="hljs-built_in">format</span>
<span class="hljs-number">6.</span> coo_matrix: COOrdinate <span class="hljs-built_in">format</span> (aka IJV, triplet <span class="hljs-built_in">format</span>)
<span class="hljs-number">7.</span> dia_matrix: DIAgonal <span class="hljs-built_in">format</span>

To construct <span class="hljs-operator">a</span> matrix efficiently, use either dok_matrix <span class="hljs-operator">or</span> lil_matrix.
The lil_matrix class supports basic slicing <span class="hljs-operator">and</span> fancy
indexing <span class="hljs-operator">with</span> <span class="hljs-operator">a</span> similar syntax <span class="hljs-built_in">to</span> NumPy arrays.  As illustrated below,
<span class="hljs-operator">the</span> COO <span class="hljs-built_in">format</span> may also be used <span class="hljs-built_in">to</span> efficiently construct matrices.

To perform manipulations such <span class="hljs-keyword">as</span> multiplication <span class="hljs-operator">or</span> inversion, <span class="hljs-keyword">first</span>
<span class="hljs-built_in">convert</span> <span class="hljs-operator">the</span> matrix <span class="hljs-built_in">to</span> either CSC <span class="hljs-operator">or</span> CSR <span class="hljs-built_in">format</span>. The lil_matrix <span class="hljs-built_in">format</span> is
row-based, so conversion <span class="hljs-built_in">to</span> CSR is efficient, whereas conversion <span class="hljs-built_in">to</span> CSC
is less so.

All conversions <span class="hljs-operator">among</span> <span class="hljs-operator">the</span> CSR, CSC, <span class="hljs-operator">and</span> COO formats are efficient,
linear-<span class="hljs-built_in">time</span> operations.</code>

## 2.coo_matrix

coo_matrix 是最简单的存储方式。采用三个数组 row、col 和 data 保存非零元素的信息。这三个数组的长度相同，row 保存元素的行，col 保存元素的列，data 保存元素的值。一般来说，coo_matrix 主要用来创建矩阵，因为 coo_matrix 无法对矩阵的元素进行增删改等操作，一旦矩阵创建成功以后，会转化为其他形式的矩阵。

<code class="hljs lua has-numbering">>>> row = [<span class="hljs-number">2</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>,<span class="hljs-number">2</span>]
>>> col = [<span class="hljs-number">3</span>,<span class="hljs-number">4</span>,<span class="hljs-number">2</span>,<span class="hljs-number">3</span>]
>>> c = sparse.coo_matrix((data,(row,col)),shape=(<span class="hljs-number">5</span>,<span class="hljs-number">6</span>))
>>> <span class="hljs-built_in">print</span> c.toarray()
<span class="hljs-string">[[0 0 0 0 0 0]
[0 0 0 0 0 0]
[0 0 0 5 2 0]
[0 0 3 0 0 0]
[0 0 0 0 0 0]]</span></code>

## 3.dok_matrix 与 lil_matrix

dok_matrix 和 lil_matrix 适用的场景是逐渐添加矩阵的元素。doc_matrix 的策略是采用字典来记录矩阵中不为 0 的元素。自然，字典的 key 存的是记录元素的位置信息的元祖，value 是记录元素的具体值。

<code class="hljs r has-numbering">>>> import numpy as np
>>> from scipy.sparse import dok_matrix
>>> S = dok_matrix((<span class="hljs-number">5</span>, <span class="hljs-number">5</span>), dtype=np.float32)
>>> <span class="hljs-keyword">for</span> i <span class="hljs-keyword">in</span> range(<span class="hljs-number">5</span>):
<span class="hljs-keyword">...</span>     <span class="hljs-keyword">for</span> j <span class="hljs-keyword">in</span> range(<span class="hljs-number">5</span>):
<span class="hljs-keyword">...</span>             S[i, j] = i + j
<span class="hljs-keyword">...</span>
>>> print S.toarray()
[[ <span class="hljs-number">0.</span>  <span class="hljs-number">1.</span>  <span class="hljs-number">2.</span>  <span class="hljs-number">3.</span>  <span class="hljs-number">4.</span>]
[ <span class="hljs-number">1.</span>  <span class="hljs-number">2.</span>  <span class="hljs-number">3.</span>  <span class="hljs-number">4.</span>  <span class="hljs-number">5.</span>]
[ <span class="hljs-number">2.</span>  <span class="hljs-number">3.</span>  <span class="hljs-number">4.</span>  <span class="hljs-number">5.</span>  <span class="hljs-number">6.</span>]
[ <span class="hljs-number">3.</span>  <span class="hljs-number">4.</span>  <span class="hljs-number">5.</span>  <span class="hljs-number">6.</span>  <span class="hljs-number">7.</span>]
[ <span class="hljs-number">4.</span>  <span class="hljs-number">5.</span>  <span class="hljs-number">6.</span>  <span class="hljs-number">7.</span>  <span class="hljs-number">8.</span>]]</code>

lil_matrix 则是使用两个列表存储非 0 元素。data 保存每行中的非零元素,rows 保存非零元素所在的列。这种格式也很适合逐个添加元素，并且能快速获取行相关的数据。

<code class="hljs lua has-numbering">>>> from scipy.sparse import lil_matrix
>>> l = lil_matrix((<span class="hljs-number">6</span>,<span class="hljs-number">5</span>))
>>> l[<span class="hljs-number">2</span>,<span class="hljs-number">3</span>] = <span class="hljs-number">1</span>
>>> l[<span class="hljs-number">3</span>,<span class="hljs-number">4</span>] = <span class="hljs-number">2</span>
>>> l[<span class="hljs-number">3</span>,<span class="hljs-number">2</span>] = <span class="hljs-number">3</span>
>>> <span class="hljs-built_in">print</span> l.toarray()
<span class="hljs-string">[[ 0.  0.  0.  0.  0.]
[ 0.  0.  0.  0.  0.]
[ 0.  0.  0.  1.  0.]
[ 0.  0.  3.  0.  2.]
[ 0.  0.  0.  0.  0.]
[ 0.  0.  0.  0.  0.]]</span>
>>> <span class="hljs-built_in">print</span> l.data
<span class="hljs-string">[[] [] [1.0] [3.0, 2.0] [] []]</span>
>>> <span class="hljs-built_in">print</span> l.rows
<span class="hljs-string">[[] [] [3] [2, 4] [] []]</span></code>

## 5.csr_matrix 与 csc_matrix

csr_matrix，全名为 Compressed Sparse Row，是按行对矩阵进行压缩的。CSR 需要三类数据：数值，列号，以及行偏移量。CSR 是一种编码的方式，其中，数值与列号的含义，与 coo 里是一致的。行偏移表示某一行的第一个元素在 values 里面的起始偏移位置。

<code class="hljs lua has-numbering">>>> from scipy.sparse import csr_matrix
>>> indptr = np.array([<span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">6</span>])
>>> indices = np.array([<span class="hljs-number">0</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>])
>>> data = np.array([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>, <span class="hljs-number">6</span>])
>>> csr_matrix((data, indices, indptr), shape=(<span class="hljs-number">3</span>, <span class="hljs-number">3</span>)).toarray()
array(<span class="hljs-string">[[1, 0, 2],
[0, 0, 3],
[4, 5, 6]]</span>)</code>

<code class="hljs 1c has-numbering"> Notes
<span class="hljs-string">|  -----</span>
<span class="hljs-string">|</span>
<span class="hljs-string">|  Sparse matrices can be used in arithmetic operations: they support</span>
<span class="hljs-string">|  addition, subtraction, multiplication, division, and matrix power.</span>
<span class="hljs-string">|</span>
<span class="hljs-string">|  Advantages of the CSR format</span>
<span class="hljs-string">|    - efficient arithmetic operations CSR + CSR, CSR * CSR, etc.</span>
<span class="hljs-string">|    - efficient row slicing</span>
<span class="hljs-string">|    - fast matrix vector products</span>
<span class="hljs-string">|</span>
<span class="hljs-string">|  Disadvantages of the CSR format</span>
<span class="hljs-string">|    - slow column slicing operations (consider CSC)</span>
<span class="hljs-string">|    - changes to the sparsity structure are expensive (consider LIL or DOK)</span></code>

## 6.bsr_matrix

block_size 为 2 时，分块表示的压缩矩阵 E：

BSR 的 zero-based 索引表示：

values  =  (1 02 1 6 7 8 2 1 4 5 1 4 3 0 0 7 2 0 0)

columns  = (0  1   1   1   2)

pointerB= (0   2  3)

pointerE= (2   3  5)

Deeplearn, 版权所有丨如未注明 , 均为原创丨本网站采用BY-NC-SA协议进行授权 , 转载请注明scipy 稀疏矩阵模块

• 版权声明

本站的文章和资源来自互联网或者站长
的原创，按照 CC BY -NC -SA 3.0 CN
协议发布和共享，转载或引用本站文章
应遵循相同协议。如果有侵犯版权的资
源请尽快联系站长，我们会在24h内删
除有争议的资源。
• 网站驱动

• 友情链接

• 支持主题

邮箱：service@deeplearn.me