Sparse vector and Dense vector

date
Dec 21, 2021
tags
Math.NET
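A dense vector is simply an ordinary array of doubles. For example, the vector (1, 0, 1, 3) is written in dense format as [1, 0, 1, 3].
A sparse vector consists of two parallel arrays, indices and values. For the vector above, the values array is (1, 1, 3), holding only the non-zero entries, and the indices array is (0, 2, 3), meaning that position 0 holds 1, position 2 holds 1, and position 3 holds 3, while every other position is 0. The sparse form is therefore {(0, 2, 3), (1, 1, 3)}.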

Conceptually, the two are the same: each is just a vector.
The data structure behind them is different, though. Being sparse means that the vector does not explicitly store every coordinate. I'll explain.
Consider a $d$-dimensional vector $u$ with components $u_i$.
Sometimes you know that your vector will have many components with $u_i = 0$. To avoid wasting memory, you can store only the non-zero values and treat every other value as zero. This is especially useful with one-hot encoding.
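To make the one-hot case concrete, here is a minimal Python sketch. The helper name `one_hot_sparse`, the dimension 10,000, and the position 42 are assumptions chosen for illustration only.

```python
def one_hot_sparse(index, dim):
    """Return a one-hot vector of length `dim` as (ids, values, dim).

    Hypothetical helper for illustration: only the single non-zero
    entry is stored, no matter how large `dim` is.
    """
    if not 0 <= index < dim:
        raise ValueError("index out of range")
    return [index], [1.0], dim


dim = 10_000
dense = [0.0] * dim   # dense form: stores every entry, zeros included
dense[42] = 1.0

ids, values, size = one_hot_sparse(42, dim)

print(len(dense))              # 10000 numbers stored in the dense form
print(len(ids) + len(values))  # 2 numbers stored in the sparse form
```

The dense list grows with the dimension, while the sparse form stays at one index and one value.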
Usually a sparse vector is represented by a tuple (id, values) such that:
$u_i = \text{values}[j]$ if $i = \text{id}[j]$;
$u_i = 0$ otherwise (if $i$ is not in id)
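The rule above can be read as a lookup. The following is a minimal sketch, assuming the sparse vector is held in a dictionary with "id" and "values" lists; the function name `get_component` is hypothetical.

```python
def get_component(sparse_vec, i):
    """Return u_i for a sparse vector stored as {"id": [...], "values": [...]}."""
    for j, idx in enumerate(sparse_vec["id"]):
        if idx == i:
            # i is in id: the value sits at the same position j
            return sparse_vec["values"][j]
    # i is not in id: the component is implicitly zero
    return 0


u = {"id": [0, 2, 3], "values": [1, 1, 3]}  # the vector (1, 0, 1, 3)

print(get_component(u, 2))  # 1
print(get_component(u, 1))  # 0
```

A linear scan keeps the sketch short; a real implementation would more likely keep the indices sorted and use binary search.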
From a developer's point of view, building a sparse vector from a dense vector looks like this:
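# Example dense vector to convert
dense_vec = [1, 2, 0, 0, 5, 0, 9, 0, 0]
sparse_vec = {"id": [], "values": []}
d = len(dense_vec)
for i in range(d):
    if dense_vec[i] != 0:
        # Keep only the non-zero entries and remember their positions
        sparse_vec["id"].append(i)
        sparse_vec["values"].append(dense_vec[i])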
For example, the dense vector (1, 2, 0, 0, 5, 0, 9, 0, 0) will be represented as {(0, 1, 4, 6), (1, 2, 5, 9)}.
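Going the other way is just as short. One caveat: the (id, values) pair alone does not record the length of the vector, so trailing zeros would be lost. This sketch therefore takes the dimension `d` as an extra argument; the function name `to_dense` is an assumption made for illustration.

```python
def to_dense(ids, values, d):
    """Rebuild a dense list of length d from parallel ids/values lists."""
    dense = [0] * d            # start with all zeros
    for i, v in zip(ids, values):
        dense[i] = v           # fill in the stored non-zero entries
    return dense


dense_vec = to_dense([0, 1, 4, 6], [1, 2, 5, 9], 9)
print(dense_vec)  # [1, 2, 0, 0, 5, 0, 9, 0, 0]
```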
 

© Wen Bo 2021 - 2022