# Path Hashing and SmartCuckoo

## A Write-Friendly and Cache-Optimized Hashing Scheme for Non-Volatile Memory Systems

### 0x01 基本思路

  Storage cells in the path hashing are logically organized as an inverted complete binary tree. ... All nodes in the remaining levels are non-addressable and considered as the shared standby positions of the leaf nodes to deal with hash collisions. When hash collisions occur in a leaf node, the empty standby positions of the leaf node are used to store the conflicting items. Thus insertion and deletion requests in path hashing only need to probe the leaf node and its standby positions for finding an empty position or the target item, resulting in no extra writes.


#### 算法

Path hashing的基本算法和普通的hash table没有区别，算法也简单直观，直接就直接给出paper中的描述就可以了: 插入，可以插入的位置都要考虑到。我们这里也发现了path hasing的一个缺点，没有rehash的逻辑。(个人认为这里有一个比较好的解决方法，不知道行不行得通).

## SmartCuckoo: A Fast and Cost-Efficient Hashing Index Scheme for Cloud Storage Systems

### 0x11 基本思路

Paper中讨论的是使用两个hash函数的情况。在一个Key新添加到一个Cuckoo Hash中的时候，SmartCuckoo根据在这些subgraph形成的pseudoforest里面新增的顶点的数量分为三种情况，

New item insertions can be classified into three cases, i.e.,v+2,v+1,and v+0. As each item is represented as an edge in the pseudoforest, different placements of the item will increase the graph’s vertex count differently (by two, one, or zero)


• v + 0，这里表示可选的两个都已经在pseudoforest中了，这里又细分为5中情况，如下面的图所示，

1. 如果都位于一个non-maxinal的subgraph中，直接使用驱逐策略。
2. 如果位于两个non-maxinal的subgraph中，使用驱逐策略之外还要将这两个subgraph合并。
3. 如果一个为non-maxinal，一个maxinal，选在一个non-maxinal的驱逐。
4. 其它的两种情况都需要进行rehash的操作。

• 对于v+1的情况，这里比v+0的情况简单很多，这样的操作会导致两个subgraph合并，然后直接进行添加操作即可，不用驱逐操作。

• 对于v+2的情况，表示这里会新添加一个subgraph，在赋予这个新添加的一个唯一的subgraph ID之后的操作和v+1的是一样的。

... Deleting an item from the hash tables is equivalent to removal of an edge in the corresponding subgraph, which causes the subgraph to be separated into two subgraphs. We assign each of the two subgraphs a new ID, and update the IDs of each member vertex of the two subgraph in their corresponding buckets


## 参考

1. A Write-Friendly and Cache-Optimized Hashing Scheme for Non-Volatile Memory Systems, TPDS 2018.
2. SmartCuckoo: A Fast and Cost-Efficient Hashing Index Scheme for Cloud Storage Systems, ATC ‘17.