Support Vector Classification with SKLearn#

Authored by Mark (Zixuan) Song

We use the SKLearn library to implement SVC in the following tutorial.

Overview#

The aim of this tutorial is to embed a quantum machine learning (QML) transformer into an SVC pipeline, and to serve as a general introduction to connecting tensorcircuit with scikit-learn.

Setup#

Install scikit-learn and requests. The dataset we are going to use is the German Credit Data from UCI.

pip install scikit-learn requests
[7]:
import tensorcircuit as tc
import tensorflow as tf
from sklearn.svm import SVC
from sklearn import metrics
from time import time
import requests

K = tc.set_backend("tensorflow")

Data Preprocessing#

The dataset has 20 variables per sample, and each is an integer value. In order for the model to use the data, we normalize every feature to the range between 0 and 1 by dividing each column by its maximum.
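For example, a feature column holding the raw values 4, 2, 1 becomes 1.0, 0.5, 0.25 after dividing by its maximum. A minimal sketch of this column-wise scaling (illustrative values, not the actual dataset):

# Illustrative values only: scale each feature column by its maximum so
# positive integer features land in (0, 1].
cols = tf.constant([[4.0, 10.0], [2.0, 5.0], [1.0, 20.0]])  # 3 samples, 2 features
print(cols / tf.reduce_max(cols, axis=0))  # [[1.0, 0.5], [0.5, 0.25], [0.25, 1.0]]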

[8]:
def load_GCN_data():
    link2gcn = "http://home.cse.ust.hk/~qyang/221/Assignments/German/GermanData.csv"
    data = requests.get(link2gcn)
    data = data.text
    data = data.split("\n")[:-1]  # drop the trailing empty line
    x = None
    y = None

    def destring(string):
        # Parse one CSV row: categorical codes such as "A11" have their
        # leading attribute tag stripped; numeric fields are parsed directly.
        string = string.split(",")
        return_array = []
        for i, v in enumerate(string):
            if v[0] == "A":
                return_array.append(int(v[1 + len(str(i)) :]))
            else:
                return_array.append(int(v))
        # The last column is the label (1 or 2), shifted to 0/1.
        return K.cast([return_array[:-1]], dtype="float32"), K.cast(
            [return_array[-1] - 1], dtype="int32"
        )

    for i in data:
        if x is None:
            temp_x, temp_y = destring(i)
            x = K.cast(temp_x, dtype="float32")
            y = K.cast(temp_y, dtype="int32")
        else:
            temp_x, temp_y = destring(i)
            x = K.concat([x, temp_x], axis=0)
            y = K.concat([y, temp_y], axis=0)
    # Scale each feature column by its maximum so all values lie in (0, 1].
    x = K.transpose(x)
    nx = None
    for i in x:
        max_i = K.cast(K.max(i), dtype="float32")
        temp_nx = [K.divide(i, max_i)]
        nx = K.concat([nx, temp_nx], axis=0) if nx is not None else temp_nx
    x = K.transpose(nx)
    # Use the first 800 samples for training and the remaining 200 for testing.
    return (x[:800], y[:800]), (x[800:], y[800:])


(x_train, y_train), (x_test, y_test) = load_GCN_data()
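As a quick sanity check (assuming the download above succeeded), the split should yield 800 training samples and 200 test samples with 20 features each:

print(x_train.shape, y_train.shape)  # (800, 20) (800,)
print(x_test.shape, y_test.shape)  # (200, 20) (200,)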

Quantum Model#

This quantum model takes a 1x20 feature vector as input and outputs the state of 5 qubits. The model is shown below:

[9]:
def quantumTran(inputs):
    # Encode the 20 features as rotation angles on a 5-qubit circuit:
    # even layers apply rx rotations, odd layers apply rz rotations
    # followed by a CNOT chain for entanglement.
    c = tc.Circuit(5)
    for i in range(4):
        if i % 2 == 0:
            for j in range(5):
                c.rx(j, theta=(0 if i * 5 + j >= 20 else inputs[i * 5 + j]))
        else:
            for j in range(5):
                c.rz(j, theta=(0 if i * 5 + j >= 20 else inputs[i * 5 + j]))
            for j in range(4):
                c.cnot(j, j + 1)
    return c.state()


# Wrap the circuit as a jit-compiled function usable from TensorFlow.
func_qt = tc.interfaces.tensorflow_interface(quantumTran, ydtype=tf.complex64, jit=True)
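As a small usage sketch, we can feed the wrapped transformer a single sample and inspect its output:

# A 5-qubit circuit yields a 2**5 = 32 dimensional complex state vector.
state = func_qt(x_train[0])
print(state.shape, state.dtype)  # (32,) complex64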

Wrapping Quantum Model into a SVC#

Wrap the quantum model in a custom kernel function so that an SVC can be trained with it.

[10]:
def quantum_kernel(quantumTran, data_x, data_y):
    def kernel(x, y):
        # Map every classical sample in both batches to its quantum state.
        x = K.convert_to_tensor(x)
        y = K.convert_to_tensor(y)
        x_qt = None
        for i, x1 in enumerate(x):
            if i == 0:
                x_qt = K.convert_to_tensor([quantumTran(x1)])
            else:
                x_qt = K.concat([x_qt, [quantumTran(x1)]], 0)
        y_qt = None
        for i, x1 in enumerate(y):
            if i == 0:
                y_qt = K.convert_to_tensor([quantumTran(x1)])
            else:
                y_qt = K.concat([y_qt, [quantumTran(x1)]], 0)
        # Fidelity kernel |<psi(x)|psi(y)>|^2: conjugate one side so the
        # overlap is a proper inner product, then cast to the real dtype
        # scikit-learn expects.
        data_ret = K.cast(
            K.power(K.abs(x_qt @ K.transpose(K.conj(y_qt))), 2), "float32"
        )
        return data_ret

    clf = SVC(kernel=kernel)
    clf.fit(data_x, data_y)
    return clf
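Each kernel entry is the state fidelity |<psi(x)|psi(y)>|^2; since circuit states are normalized, the diagonal of the kernel matrix should be close to 1. A quick illustrative check:

s = func_qt(x_train[0])
self_fidelity = tf.abs(tf.reduce_sum(tf.math.conj(s) * s)) ** 2
print(float(self_fidelity))  # ~1.0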

Create Traditional SVC#

[11]:
def standard_kernel(data_x, data_y, method):
    methods = ["linear", "poly", "rbf", "sigmoid"]
    if method not in methods:
        raise ValueError("method must be one of %r." % methods)
    # Train an SVC with one of scikit-learn's built-in kernels for comparison.
    clf = SVC(kernel=method)
    clf.fit(data_x, data_y)
    return clf

Test#

Test the accuracy of the quantum-kernel SVC on the test data and compare it with the traditional SVC kernels.

[12]:
methods = ["linear", "poly", "rbf", "sigmoid"]

for method in methods:
    print()
    t = time()

    k = standard_kernel(data_x=x_train, data_y=y_train, method=method)
    y_pred = k.predict(x_test)
    print("Accuracy:(%s as kernel)" % method, metrics.accuracy_score(y_test, y_pred))

    print("time:", time() - t, "seconds")

print()
t = time()

k = quantum_kernel(quantumTran=func_qt, data_x=x_train, data_y=y_train)
y_pred = k.predict(x_test)
print("Accuracy:(qml as kernel)", metrics.accuracy_score(y_test, y_pred))

print("time:", time() - t, "seconds")

Accuracy:(linear as kernel) 0.78
time: 0.007764101028442383 seconds

Accuracy:(poly as kernel) 0.75
time: 0.024492979049682617 seconds

Accuracy:(rbf as kernel) 0.765
time: 0.011505126953125 seconds

Accuracy:(sigmoid as kernel) 0.695
time: 0.010205984115600586 seconds

Accuracy:(qml as kernel) 0.66
time: 3.0243749618530273 seconds

Issue with SKLearn#

Due to a limitation of scikit-learn, its SVC is not fully compatible with quantum machine learning (QML) models.

A QML transformer outputs complex amplitudes (coordinates on the Bloch sphere), whereas scikit-learn only accepts real floats. The quantum output must therefore be reduced to a float before it can be used in SVC, leading to a potential loss of information.
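As an illustration of this reduction (a sketch using the functions defined above):

s0 = func_qt(x_train[0])
s1 = func_qt(x_train[1])
overlap = tf.reduce_sum(tf.math.conj(s0) * s1)  # complex64 amplitude overlap
kernel_entry = tf.cast(tf.abs(overlap) ** 2, tf.float32)  # real float scikit-learn can use
print(overlap.dtype, kernel_entry.dtype)  # complex64 float32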

Conclusion#

Under the present limitation of scikit-learn, quantum SVC performs worse than traditional SVC in both accuracy and speed on this dataset. If the limitation were removed, however, quantum SVC might be able to outperform traditional SVC in accuracy.