Support Vector Classification Combined with SKLearn#
This example combines the SVC class from the sklearn library to implement support vector classification.
Overview#
The purpose of this example is to embed a quantum machine learning (QML) transformer into an SVC pipeline, and to introduce one way of connecting tensorcircuit with scikit-learn.
Setup#
Install scikit-learn and requests. The dataset used in this example is the German Credit Data from UCI.
pip install scikit-learn requests
[1]:
import tensorcircuit as tc
import tensorflow as tf
from sklearn.svm import SVC
from sklearn import metrics
from time import time
import requests
K = tc.set_backend("tensorflow")
Data Processing#
The dataset contains 20 variables, each taking integer values. To make the data usable by the model, we normalize every feature to the range 0 to 1.
[2]:
def load_GCN_data():
    # Download the German Credit dataset as raw CSV text
    link2gcn = "http://home.cse.ust.hk/~qyang/221/Assignments/German/GermanData.csv"
    data = requests.get(link2gcn)
    data = data.text
    data = data.split("\n")[:-1]
    x = None
    y = None

    def destring(string):
        # Convert one CSV row into numeric features and a 0/1 label
        string = string.split(",")
        return_array = []
        for i, v in enumerate(string):
            if v[0] == "A":
                # Categorical codes like "A121": strip the "A" and column prefix
                return_array.append(int(v[1 + len(str(i)) :]))
            else:
                return_array.append(int(v))
        return K.cast([return_array[:-1]], dtype="float32"), K.cast(
            [return_array[-1] - 1], dtype="int32"
        )

    for i in data:
        if x is None:
            temp_x, temp_y = destring(i)
            x = K.cast(temp_x, dtype="float32")
            y = K.cast(temp_y, dtype="int32")
        else:
            temp_x, temp_y = destring(i)
            x = K.concat([x, temp_x], axis=0)
            y = K.concat([y, temp_y], axis=0)

    # Normalize each feature column to [0, 1] by dividing by its maximum
    x = K.transpose(x)
    nx = None
    for i in x:
        max_i = K.cast(K.max(i), dtype="float32")
        temp_nx = [K.divide(i, max_i)]
        nx = K.concat([nx, temp_nx], axis=0) if nx is not None else temp_nx
    x = K.transpose(nx)

    # First 800 samples for training, the remaining 200 for testing
    return (x[:800], y[:800]), (x[800:], y[800:])


(x_train, y_train), (x_test, y_test) = load_GCN_data()
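The column-wise max normalization used above can be sketched in plain Python, independent of the backend. The helper name and the toy data below are hypothetical, for illustration only:

```python
def normalize_columns(rows):
    """Divide every feature column by its maximum so values land in [0, 1]."""
    n_cols = len(rows[0])
    col_max = [max(row[c] for row in rows) for c in range(n_cols)]
    return [[row[c] / col_max[c] for c in range(n_cols)] for row in rows]


data = [[2, 10], [4, 5], [1, 20]]
print(normalize_columns(data))  # columns divided by 4 and 20 respectively
```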
(x_train, y_train), (x_test, y_test) = load_GCN_data()
Quantum Model#
The quantum model takes a 1x20 matrix as input and outputs a 5-qubit state. The model is defined as follows:
[3]:
def quantumTran(inputs):
    # Encode the 20 features into alternating RX/RZ layers on 5 qubits,
    # entangling neighboring qubits with CNOTs after each layer
    c = tc.Circuit(5)
    for i in range(4):
        if i % 2 == 0:
            for j in range(5):
                c.rx(j, theta=(0 if i * 5 + j >= 20 else inputs[i * 5 + j]))
        else:
            for j in range(5):
                c.rz(j, theta=(0 if i * 5 + j >= 20 else inputs[i * 5 + j]))
        for j in range(4):
            c.cnot(j, j + 1)
    return c.state()


# Wrap the circuit as a jitted TensorFlow-compatible function
func_qt = tc.interfaces.tensorflow_interface(quantumTran, ydtype=tf.complex64, jit=True)
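The rotation-gate layout above assigns feature `i * 5 + j` to qubit `j` in layer `i`, padding any index past the feature count with angle 0 (the `0 if i * 5 + j >= 20 else ...` guard). A small backend-free sketch (hypothetical helper, plain Python) makes the schedule explicit:

```python
def encoding_schedule(n_layers=4, n_qubits=5, n_features=20):
    """List (layer, qubit, gate, feature_index) tuples for the encoding circuit.

    Even layers use RX, odd layers use RZ; feature indices past n_features
    are padded with None, mirroring the angle-0 guard in quantumTran.
    """
    schedule = []
    for i in range(n_layers):
        gate = "rx" if i % 2 == 0 else "rz"
        for j in range(n_qubits):
            idx = i * n_qubits + j
            schedule.append((i, j, gate, idx if idx < n_features else None))
    return schedule


sched = encoding_schedule()
print(sched[0])   # (0, 0, 'rx', 0)
print(sched[-1])  # (3, 4, 'rz', 19)
```

With 4 layers of 5 qubits the circuit consumes exactly 20 features, so the padding branch is never taken here; it only matters if the layer count is increased.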
Wrapping the Quantum Model into an SVC#
Wrap the quantum model into an SVC model that SKLearn can use.
[4]:
def quantum_kernel(quantumTran, data_x, data_y):
    def kernel(x, y):
        # Map every sample to its quantum state, then score each pair by the
        # squared magnitude of the overlap between the two states
        x = K.convert_to_tensor(x)
        y = K.convert_to_tensor(y)
        x_qt = None
        for i, x1 in enumerate(x):
            if i == 0:
                x_qt = K.convert_to_tensor([quantumTran(x1)])
            else:
                x_qt = K.concat([x_qt, [quantumTran(x1)]], 0)
        y_qt = None
        for i, x1 in enumerate(y):
            if i == 0:
                y_qt = K.convert_to_tensor([quantumTran(x1)])
            else:
                y_qt = K.concat([y_qt, [quantumTran(x1)]], 0)
        data_ret = K.cast(K.power(K.abs(x_qt @ K.transpose(y_qt)), 2), "float32")
        return data_ret

    clf = SVC(kernel=kernel)
    clf.fit(data_x, data_y)
    return clf
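The kernel above scores a pair of samples by the squared magnitude of the overlap between their encoded states. A backend-free sketch of that quantity for a single pair, using the standard fidelity definition with the conjugated inner product (the helper name and example states below are hypothetical):

```python
def overlap_kernel(state_a, state_b):
    """|<a|b>|^2 for two normalized complex state vectors."""
    inner = sum(a.conjugate() * b for a, b in zip(state_a, state_b))
    return abs(inner) ** 2


s0 = [1 + 0j, 0j]                              # |0>
s1 = [0j, 1 + 0j]                              # |1>
plus = [(0.5 ** 0.5) + 0j, (0.5 ** 0.5) + 0j]  # |+>

print(overlap_kernel(s0, s0))  # identical states -> 1.0
print(overlap_kernel(s0, s1))  # orthogonal states -> 0.0
```

Identical states give kernel value 1 and orthogonal states give 0, which is the behavior a similarity kernel needs.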
Building Classical SVC Models#
[5]:
def standard_kernel(data_x, data_y, method):
    # Train a classical SVC with one of sklearn's built-in kernels
    methods = ["linear", "poly", "rbf", "sigmoid"]
    if method not in methods:
        raise ValueError("method must be one of %r." % methods)
    clf = SVC(kernel=method)
    clf.fit(data_x, data_y)
    return clf
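For reference, the built-in kernels are closed-form functions of the raw features. The RBF kernel, for instance, can be sketched in plain Python (the fixed `gamma=1.0` here is an illustrative assumption; sklearn's default `gamma` is derived from the data):

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


print(rbf_kernel([0.1, 0.2], [0.1, 0.2]))  # identical points -> 1.0
```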
Testing and Comparison#
Test the quantum SVC model and compare it with the classical SVC models.
[6]:
methods = ["linear", "poly", "rbf", "sigmoid"]
for method in methods:
    print()
    t = time()
    k = standard_kernel(data_x=x_train, data_y=y_train, method=method)
    y_pred = k.predict(x_test)
    print("Accuracy:(%s as kernel)" % method, metrics.accuracy_score(y_test, y_pred))
    print("time:", time() - t, "seconds")

print()
t = time()
k = quantum_kernel(quantumTran=func_qt, data_x=x_train, data_y=y_train)
y_pred = k.predict(x_test)
print("Accuracy:(qml as kernel)", metrics.accuracy_score(y_test, y_pred))
print("time:", time() - t, "seconds")
Accuracy:(linear as kernel) 0.78
time: 0.00810384750366211 seconds
Accuracy:(poly as kernel) 0.75
time: 0.024804115295410156 seconds
Accuracy:(rbf as kernel) 0.765
time: 0.011444091796875 seconds
Accuracy:(sigmoid as kernel) 0.695
time: 0.010396003723144531 seconds
Accuracy:(qml as kernel) 0.66
time: 6.472219228744507 seconds
Limitations of SKLearn#
Because of limitations in SKLearn, its SVC is not fully compatible with quantum machine learning (QML). QML outputs complex numbers (coordinates on the Bloch sphere), while SKLearn only accepts floats. As a result, the QML outputs must be converted to floats before being passed to the SVC, which may cause a loss of precision.
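One conversion consistent with the constraint above is to flatten a complex state vector into a real vector of its real and imaginary parts before handing it to scikit-learn. A minimal sketch (hypothetical helper, plain Python):

```python
def complex_state_to_floats(state):
    """Flatten a complex state vector into [re_0, ..., re_n, im_0, ..., im_n]."""
    return [z.real for z in state] + [z.imag for z in state]


state = [0.6 + 0.8j, 0j]
print(complex_state_to_floats(state))  # [0.6, 0.0, 0.8, 0.0]
```

This keeps all the amplitude information, but any quantity computed from the floats downstream no longer sees the complex structure directly.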
Conclusion#
Due to the limitations of SKLearn, the quantum SVC falls short of the classical SVCs in both accuracy and speed. However, if these limitations were removed, the quantum SVC might outperform the classical SVCs in accuracy.