Quantum Data Embedding Suite - Basic Workflow¶
This notebook demonstrates the basic workflow of the Quantum Data Embedding Suite, including:
- Loading and preparing data
- Creating quantum embeddings
- Computing quantum kernels
- Evaluating embedding quality
- Visualizing results
Prerequisites¶
Make sure you have the package installed:
pip install quantum-data-embedding-suite
1. Import Libraries and Load Data¶
In [ ]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris, make_blobs
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
# Import quantum data embedding suite
from quantum_data_embedding_suite import QuantumEmbeddingPipeline
from quantum_data_embedding_suite.embeddings import AngleEmbedding, IQPEmbedding
from quantum_data_embedding_suite.visualization import plot_kernel_comparison
from quantum_data_embedding_suite.utils import generate_random_data
# Set random seed for reproducibility
np.random.seed(42)
In [ ]:
# Load sample data - using Iris dataset
iris = load_iris()
X_iris, y_iris = iris.data, iris.target
# Use only first 2 classes and 2 features for simplicity
mask = y_iris < 2
X = X_iris[mask, :2] # sepal length and width
y = y_iris[mask]
# Standardize the data
scaler = StandardScaler()
X = scaler.fit_transform(X)
print(f"Dataset shape: {X.shape}")
print(f"Number of classes: {len(np.unique(y))}")
# Visualize the data
plt.figure(figsize=(8, 6))
scatter = plt.scatter(X[:, 0], X[:, 1], c=y, cmap='viridis', alpha=0.7)
plt.xlabel('Sepal Length (standardized)')
plt.ylabel('Sepal Width (standardized)')
plt.title('Iris Dataset (2 classes, 2 features)')
plt.colorbar(scatter, label='Class')
plt.grid(True, alpha=0.3)
plt.show()
2. Create Quantum Embedding Pipeline¶
In [ ]:
# Create an angle embedding pipeline
pipeline_angle = QuantumEmbeddingPipeline(
    embedding_type="angle",
    n_qubits=4,
    backend="qiskit",
    shots=1024,
    normalize=True
)
print("Angle Embedding Pipeline created")
print(f"Embedding info: {pipeline_angle.get_embedding_info()}")
3. Compute Quantum Kernel¶
In [ ]:
# Use a small subset for faster computation; the samples are ordered by
# class (the first 50 are class 0), so take 10 from each class
n_samples = 20
idx = np.concatenate([np.arange(10), np.arange(50, 60)])
X_subset = X[idx]
y_subset = y[idx]
print(f"Computing quantum kernel for {n_samples} samples...")
# Fit the pipeline and compute quantum kernel
K_quantum = pipeline_angle.fit_transform(X_subset)
print(f"Quantum kernel shape: {K_quantum.shape}")
print(f"Kernel trace: {np.trace(K_quantum):.3f}")
print(f"Mean off-diagonal element: {np.mean(K_quantum[~np.eye(len(K_quantum), dtype=bool)]):.3f}")
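Under the hood, each entry of a fidelity quantum kernel is K_ij = |⟨φ(x_i)|φ(x_j)⟩|². As a sanity check on what the pipeline returns, here is a minimal NumPy sketch of that computation for a simple product-state angle map (a hypothetical stand-in, not the package's actual implementation):

```python
import numpy as np

def angle_state(x):
    """Product state from applying RY(x_i)|0> on each qubit (one qubit per feature)."""
    state = np.array([1.0])
    for xi in x:
        state = np.kron(state, np.array([np.cos(xi / 2), np.sin(xi / 2)]))
    return state

def fidelity_kernel(X):
    """K[i, j] = |<phi(x_i)|phi(x_j)>|^2 over all sample pairs."""
    states = [angle_state(x) for x in np.asarray(X)]
    n = len(states)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = abs(np.vdot(states[i], states[j])) ** 2
    return K
```

Any valid fidelity kernel built this way has ones on the diagonal, is symmetric, and keeps all entries in [0, 1] — useful properties to verify against the pipeline's output.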
4. Compare with Classical Kernel¶
In [ ]:
# Compute classical RBF kernel for comparison
from sklearn.metrics.pairwise import rbf_kernel
K_classical = rbf_kernel(X_subset, gamma=1.0)
print(f"Classical kernel shape: {K_classical.shape}")
print(f"Classical kernel trace: {np.trace(K_classical):.3f}")
# Visualize both kernels
fig, axes = plt.subplots(1, 3, figsize=(15, 4))
# Quantum kernel
im1 = axes[0].imshow(K_quantum, cmap='viridis', vmin=0, vmax=1)
axes[0].set_title('Quantum Kernel (Angle Embedding)')
axes[0].set_xlabel('Sample Index')
axes[0].set_ylabel('Sample Index')
plt.colorbar(im1, ax=axes[0])
# Classical kernel
im2 = axes[1].imshow(K_classical, cmap='viridis', vmin=0, vmax=1)
axes[1].set_title('Classical RBF Kernel')
axes[1].set_xlabel('Sample Index')
axes[1].set_ylabel('Sample Index')
plt.colorbar(im2, ax=axes[1])
# Difference
diff = K_quantum - K_classical
im3 = axes[2].imshow(diff, cmap='RdBu', vmin=-1, vmax=1)
axes[2].set_title('Difference (Quantum - Classical)')
axes[2].set_xlabel('Sample Index')
axes[2].set_ylabel('Sample Index')
plt.colorbar(im3, ax=axes[2])
plt.tight_layout()
plt.show()
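Beyond eyeballing the heatmaps, kernel alignment summarizes how similar two kernel matrices are with a single number in [0, 1] for PSD kernels (1 means identical up to scale). A small helper you could apply to K_quantum and K_classical from the cells above:

```python
import numpy as np

def kernel_alignment(K1, K2):
    """Frobenius inner product of two kernels, normalized by their Frobenius norms."""
    return float(np.sum(K1 * K2) / (np.linalg.norm(K1) * np.linalg.norm(K2)))
```

For example, `print(f"Alignment: {kernel_alignment(K_quantum, K_classical):.3f}")` quantifies how close the quantum kernel is to the classical RBF kernel.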
5. Evaluate Embedding Quality¶
In [ ]:
# Evaluate the embedding quality
print("Evaluating embedding quality (this may take a moment...)")
metrics = pipeline_angle.evaluate_embedding(X_subset, n_samples=100) # Reduced for speed
print("\n=== Embedding Quality Metrics ===")
for metric_name, value in metrics.items():
    if isinstance(value, float):
        print(f"{metric_name}: {value:.4f}")
    else:
        print(f"{metric_name}: {value}")
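The exact metrics returned by evaluate_embedding depend on the package, but a common expressibility diagnostic compares the fidelities an embedding produces on random inputs against the Haar-random average of 1/2^n for n qubits. A self-contained sketch for a product-state RY angle map (an illustrative stand-in for the pipeline's embedding):

```python
import numpy as np

def mean_pairwise_fidelity(n_features, n_pairs=500, seed=0):
    """Average |<phi(x)|phi(x')>|^2 over random input pairs for a
    product-state RY angle map (one qubit per feature)."""
    rng = np.random.default_rng(seed)

    def state(x):
        # RY(x_i)|0> on each qubit, tensored into one statevector
        s = np.array([1.0])
        for xi in x:
            s = np.kron(s, np.array([np.cos(xi / 2), np.sin(xi / 2)]))
        return s

    fids = [
        abs(np.dot(state(rng.uniform(0, 2 * np.pi, n_features)),
                   state(rng.uniform(0, 2 * np.pi, n_features)))) ** 2
        for _ in range(n_pairs)
    ]
    return float(np.mean(fids))
```

The closer the mean sits to 1/2^n, the more Haar-like the embedding's first fidelity moment is; full expressibility metrics also compare higher moments of the fidelity distribution.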
6. Compare Different Embedding Types¶
In [ ]:
# Compare different embedding types
embedding_types = ['angle', 'iqp']
results = []
# Use even smaller subset for comparison
X_small = X_subset[:10]
y_small = y_subset[:10]
for embedding_type in embedding_types:
    print(f"\nTesting {embedding_type} embedding...")
    try:
        pipeline = QuantumEmbeddingPipeline(
            embedding_type=embedding_type,
            n_qubits=3,  # reduced for faster computation
            backend="qiskit",
            normalize=True
        )
        # Compute kernel
        K = pipeline.fit_transform(X_small)
        # Basic kernel statistics
        off_diag = K[~np.eye(len(K), dtype=bool)]
        result = {
            'embedding': embedding_type,
            'kernel_trace': np.trace(K),
            'mean_off_diagonal': np.mean(off_diag),
            'kernel_std': np.std(off_diag),
        }
        results.append(result)
        print(f"  Kernel trace: {result['kernel_trace']:.3f}")
        print(f"  Mean off-diagonal: {result['mean_off_diagonal']:.3f}")
    except Exception as e:
        print(f"  Error with {embedding_type}: {e}")
        continue
# Display comparison
if results:
    import pandas as pd
    df = pd.DataFrame(results)
    print("\n=== Embedding Comparison ===")
    print(df.round(4))
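For reference, an IQP-style embedding applies a Hadamard layer, diagonal phases with single-feature and pairwise-feature terms, and a second Hadamard layer. The package's IQPEmbedding may use a different convention, but a dense statevector sketch of that structure looks like this:

```python
import numpy as np
from itertools import combinations

def iqp_state(x):
    """IQP-style state: H^n, diagonal phases exp(i * (sum_i x_i z_i +
    sum_{i<j} x_i x_j z_i z_j)) over bitstrings z, then H^n again."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dim = 2 ** n
    # Build the n-qubit Hadamard transform
    H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
    Hn = np.array([[1.0]])
    for _ in range(n):
        Hn = np.kron(Hn, H)
    state = Hn[:, 0].astype(complex)  # H^n |0...0>
    # Apply the data-dependent diagonal phases
    phases = np.zeros(dim)
    for b in range(dim):
        bits = [(b >> k) & 1 for k in range(n)]
        phases[b] = sum(x[i] * bits[i] for i in range(n))
        phases[b] += sum(x[i] * x[j] * bits[i] * bits[j]
                         for i, j in combinations(range(n), 2))
    state = state * np.exp(1j * phases)
    return Hn @ state
```

The pairwise x_i * x_j phase terms are what give IQP embeddings entanglement between qubits, unlike the product-state angle encoding.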
7. Kernel Eigenspectrum Analysis¶
In [ ]:
# Analyze the eigenspectrum of the quantum kernel
# (the kernel matrix is symmetric, so eigvalsh returns real eigenvalues)
eigenvals = np.linalg.eigvalsh(K_quantum)[::-1]  # sorted in descending order
# Plot eigenspectrum
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
# Eigenvalues
ax1.plot(eigenvals, 'o-', markersize=6, linewidth=2)
ax1.set_xlabel('Eigenvalue Index')
ax1.set_ylabel('Eigenvalue')
ax1.set_title('Quantum Kernel Eigenspectrum')
ax1.grid(True, alpha=0.3)
# Cumulative variance explained
cumvar = np.cumsum(eigenvals) / np.sum(eigenvals)
ax2.plot(cumvar, 'o-', markersize=6, linewidth=2, color='orange')
ax2.axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='95% threshold')
ax2.set_xlabel('Eigenvalue Index')
ax2.set_ylabel('Cumulative Variance Explained')
ax2.set_title('Cumulative Variance')
ax2.grid(True, alpha=0.3)
ax2.legend()
plt.tight_layout()
plt.show()
# Calculate effective dimension
eff_dim = np.searchsorted(cumvar, 0.95) + 1
print(f"\nEffective dimension (95% variance): {eff_dim}")
print(f"Condition number: {eigenvals[0] / max(eigenvals[-1], 1e-12):.2f}")  # guard against near-zero eigenvalues
print(f"Spectral gap: {eigenvals[0] - eigenvals[1]:.4f}")
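On shot-based backends, sampling noise can leave the estimated kernel slightly non-positive-semidefinite (small negative eigenvalues), which some downstream solvers reject. A common remedy is to project the matrix back onto the PSD cone by clipping negative eigenvalues:

```python
import numpy as np

def project_psd(K, tol=0.0):
    """Symmetrize K and clip eigenvalues below tol to restore a valid kernel."""
    K_sym = (K + K.T) / 2
    w, V = np.linalg.eigh(K_sym)
    w = np.clip(w, tol, None)
    return V @ np.diag(w) @ V.T
```

Applying `K_quantum = project_psd(K_quantum)` before the SVM experiments below is optional for simulated kernels but often necessary for hardware-estimated ones.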
8. Machine Learning Performance Comparison¶
In [ ]:
# Compare quantum vs classical kernel performance on a classification task
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score
# Prepare data for ML comparison; the samples are ordered by class,
# so shuffle before taking a subset and splitting
rng = np.random.default_rng(42)
perm = rng.permutation(len(X))[:40]
X_ml, y_ml = X[perm], y[perm]
# Compute kernels
print("Computing kernels for ML comparison...")
pipeline_ml = QuantumEmbeddingPipeline(
embedding_type="angle",
n_qubits=3,
backend="qiskit",
normalize=True
)
K_quantum_ml = pipeline_ml.fit_transform(X_ml)
K_classical_ml = rbf_kernel(X_ml)
# Simple train-test split (first 30 for training, last 10 for testing)
train_idx = np.arange(30)
test_idx = np.arange(30, 40)
# Quantum kernel SVM
K_train_q = K_quantum_ml[np.ix_(train_idx, train_idx)]
K_test_q = K_quantum_ml[np.ix_(test_idx, train_idx)]
svm_q = SVC(kernel='precomputed')
svm_q.fit(K_train_q, y_ml[train_idx])
y_pred_q = svm_q.predict(K_test_q)
acc_quantum = accuracy_score(y_ml[test_idx], y_pred_q)
# Classical kernel SVM
K_train_c = K_classical_ml[np.ix_(train_idx, train_idx)]
K_test_c = K_classical_ml[np.ix_(test_idx, train_idx)]
svm_c = SVC(kernel='precomputed')
svm_c.fit(K_train_c, y_ml[train_idx])
y_pred_c = svm_c.predict(K_test_c)
acc_classical = accuracy_score(y_ml[test_idx], y_pred_c)
print("\n=== Classification Performance ===")
print(f"Quantum Kernel SVM Accuracy: {acc_quantum:.3f}")
print(f"Classical Kernel SVM Accuracy: {acc_classical:.3f}")
print(f"Accuracy difference (quantum - classical): {acc_quantum - acc_classical:+.3f}")
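A single 30/10 split is noisy at this sample size. scikit-learn's cross-validation also works with precomputed kernels, since pairwise estimators like SVC(kernel='precomputed') get the kernel matrix sliced on both axes per fold. A self-contained sketch using the classical RBF kernel as a stand-in (the same call applies to a precomputed quantum kernel matrix):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same two-class, two-feature Iris setup as above
iris = load_iris()
mask = iris.target < 2
X2 = StandardScaler().fit_transform(iris.data[mask, :2])
y2 = iris.target[mask]

# Precompute the full kernel once; CV slices it on both axes per fold
K = rbf_kernel(X2, gamma=1.0)
scores = cross_val_score(SVC(kernel="precomputed"), K, y2, cv=5)
print(f"CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Cross-validated scores give a much steadier estimate of the quantum-vs-classical gap than one fixed split.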
9. Summary and Insights¶
In [ ]:
print("=== Summary ===")
print(f"Dataset: Iris (2 classes, 2 features, {len(X)} samples)")
print(f"Quantum Embedding: Angle encoding with {pipeline_angle.n_qubits} qubits")
print(f"Backend: {pipeline_angle.backend}")
print("\nKey Findings:")
print("- Quantum kernel computation successful")
print("- Embedding metrics computed")
print("- Quantum vs classical accuracy comparison performed")
print("- Kernel eigenspectrum analyzed")
print("\nNext Steps:")
print("- Try different embedding types (IQP, data re-uploading, etc.)")
print("- Experiment with different numbers of qubits")
print("- Test on larger datasets")
print("- Optimize embedding parameters")
print("- Use real quantum devices")
Conclusion¶
This notebook demonstrated the basic workflow of the Quantum Data Embedding Suite:
- Data Preparation: We loaded and standardized the Iris dataset
- Quantum Embedding: Created angle-encoded quantum feature maps
- Kernel Computation: Computed quantum kernel matrices from pairwise state fidelities
- Quality Assessment: Evaluated embedding expressibility and trainability
- Comparison: Compared quantum vs classical kernel performance
- Analysis: Examined kernel eigenspectra and effective dimensions
The framework provides a comprehensive toolkit for exploring quantum machine learning with different embedding strategies, enabling researchers to:
- Systematically compare embedding approaches
- Assess quantum advantage potential
- Optimize quantum feature maps
- Scale to real quantum devices
For more advanced examples, see the other notebooks in the tutorials directory.