Skip List | 跳表

Albert Wang / 2024-03-08 / 500 Words/has been Read Times

跳表是一种基于链表的动态数据结构，它允许快速地进行查找、插入和删除操作，且实现相对简单。跳表通过在链表的不同层级上建立索引，从而提高了查找效率。跳表的基本思想是在原始链表的基础上建立多层索引，每一层索引都是原始链表的子集，且越高层的索引节点越稀疏。

跳表的节点结构如下面代码中所示，每个节点包含一个键值对以及指向下一层的指针数组，其中 forward[ i ] 表示这个 node 指向跳表第 i 层的下一个节点，也就说说普通的链表 node 都是只有一个 next 节点，而跳表的 node 每一层都有一个 next 节点。跳表的层级是动态的，随着插入操作的进行而逐渐增加。在跳表中，每个节点都包含一个层级属性，表示它在跳表中所处的层数。

#define MAX_LEVEL 6

typedef struct SkipListNode {
    int key;
    int value;
    struct SkipListNode *forward[MAX_LEVEL + 1];
} SkipListNode;

typedef struct SkipList {
    int level;
    SkipListNode *header;
} SkipList;

SkipListNode *createNode(int key, int value, int level) {
    SkipListNode *newNode = (SkipListNode *)malloc(sizeof(SkipListNode));
    newNode->key = key;
    newNode->value = value;
    for (int i = 0; i <= level; i++) {
        newNode->forward[i] = NULL;
    }
    return newNode;
}

SkipList *createSkipList() {
    SkipList *skipList = (SkipList *)malloc(sizeof(SkipList));
    skipList->level = 0;
    skipList->header = createNode(-1, -1, MAX_LEVEL);
    return skipList;
}

跳表的插入操作通过随机生成一个层数，并将新节点插入到对应的层级链表中。

int randomLevel() {
    int level = 0;
    while (rand() < RAND_MAX / 2 && level < MAX_LEVEL) {
        level++;
    }
    return level;
}

下面的代码中 insert 函数首先从最高层开始往下查找，对于第 i 层来说，如果当前节点的 key 小于要插入节点的 key，就在这一层往下查找。这一层最后遍历到的节点的值要么是 NULL，要么 key 的值是第一个大于要插入的 key 的值，同时把这个 key 对应的 node 更新到 update 数组中。

    SkipListNode *update[MAX_LEVEL + 1];
    SkipListNode *current = skipList->header;

    for (int i = skipList->level; i >= 0; i--) {
        while (current->forward[i] != NULL && current->forward[i]->key < key) {
            current = current->forward[i];
        }
        update[i] = current;
    }

然后调用 randomLevel() 函数得到了新的层数去插入，完整的函数如下：

void insert(SkipList *skipList, int key, int value) {
    SkipListNode *update[MAX_LEVEL + 1];
    SkipListNode *current = skipList->header;

    for (int i = skipList->level; i >= 0; i--) {
        while (current->forward[i] != NULL && current->forward[i]->key < key) {
            current = current->forward[i];
        }
        update[i] = current;
    }
	
    // 这里是指向了第 0 层的数据，也就是真实的数据，其他层都是索引
    current = current->forward[0];

    if (current == NULL || current->key != key) {
        int newLevel = randomLevel();
        if (newLevel > skipList->level) {
            for (int i = skipList->level + 1; i <= newLevel; i++) {
                update[i] = skipList->header;
            }
            skipList->level = newLevel;
        }
        SkipListNode *newNode = createNode(key, value, newLevel);
        for (int i = 0; i <= newLevel; i++) {
            newNode->forward[i] = update[i]->forward[i];
            update[i]->forward[i] = newNode;
        }
    }
}

查找操作从顶层开始，逐层向下查找，直到找到目标节点或者到达最底层。跳表的时间复杂度为 O(log n)，其中 n 表示跳表中的节点数。

SkipListNode *search(SkipList *skipList, int key) {
    SkipListNode *current = skipList->header;
    for (int i = skipList->level; i >= 0; i--) {
        while (current->forward[i] != NULL && current->forward[i]->key < key) {
            current = current->forward[i];
        }
    }
    current = current->forward[0];
    if (current != NULL && current->key == key) {
        return current;
    } else {
        return NULL;
    }
}

为了方便验证，这里增加跳表的打印程序，因为初始化跳表的时候都加了一个值为 -1 的 dummy 节点，所以这里也特别打印出来，以便于理解。

void printSkipList(SkipList *skipList) {
    printf("Skip List:\n");
    for (int i = skipList->level; i >= 0; i--) {
        SkipListNode *current = skipList->header->forward[i];
        printf("Level %d: %-4d", i, -1);
        while (current != NULL) {
            printf("%-4d", current->key);
            current = current->forward[i];
        }
        printf("\n");
    }
}

int main() {
    SkipList *skipList = createSkipList();
    
    insert(skipList, 3, 3);
    insert(skipList, 6, 6);
    insert(skipList, 7, 7);
    insert(skipList, 9, 9);
    insert(skipList, 12, 12);
    insert(skipList, 19, 19);
    insert(skipList, 17, 17);
    insert(skipList, 26, 26);
    insert(skipList, 21, 21);
    insert(skipList, 25, 25);
    insert(skipList, 29, 29);
    insert(skipList, 37, 37);
    insert(skipList, 31, 31);
    insert(skipList, 42, 42);
    insert(skipList, 39, 39);
    insert(skipList, 49, 49);
    insert(skipList, 46, 46);
    insert(skipList, 38, 38);

    printSkipList(skipList);

    return 0;
}

然后构造测试用例，执行效果如下：

Last modified on 2024-03-08